public class NgramExtractor extends Object
Modifier and Type | Method and Description |
---|---|
@NotNull Map<String,Integer> | extractCountedGrams(@NotNull CharSequence text) |
@NotNull List<String> | extractGrams(@NotNull CharSequence text) Creates the n-grams for a given text in the order they occur. |
NgramExtractor | filter(NgramFilter filter) |
List<Integer> | getGramLengths() |
static NgramExtractor | gramLength(int gramLength) |
static NgramExtractor | gramLengths(Integer... gramLength) |
NgramExtractor | textPadding(char textPadding) To ensure that border grams are created, this character is added to the left and right of the text. |
public static NgramExtractor gramLength(int gramLength)
public static NgramExtractor gramLengths(Integer... gramLength)
public NgramExtractor filter(NgramFilter filter)
public NgramExtractor textPadding(char textPadding)
Example: when textPadding is a space ' ' then a text input "foo" becomes " foo ", ensuring that n-grams like " f" are created.
If the text already has that character in that position (e.g. it already starts with it), it is not added there.
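The padding rule described above can be sketched in a few lines. This is a standalone illustration, not the library's implementation: the class `TextPaddingSketch` and the helper `pad` are hypothetical names, and leaving an empty text unchanged is an assumption, since the documentation does not say what happens in that case.

```java
// Illustrative sketch only; the names below are hypothetical and not part of NgramExtractor.
final class TextPaddingSketch {

    // Adds the padding character on each side of the text,
    // unless the text already has that character in that position.
    static String pad(CharSequence text, char padding) {
        if (text.length() == 0) {
            return ""; // assumption: empty input is left as-is
        }
        StringBuilder sb = new StringBuilder(text.length() + 2);
        if (text.charAt(0) != padding) {
            sb.append(padding); // left border
        }
        sb.append(text);
        if (text.charAt(text.length() - 1) != padding) {
            sb.append(padding); // right border
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Brackets make the added spaces visible.
        System.out.println("[" + pad("foo", ' ') + "]");
        System.out.println("[" + pad(" foo", ' ') + "]");
    }
}
```

Both calls in `main` produce the same padded text, because the second input already starts with the padding character and only the right border is added.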
textPadding - for example a space ' '.
@NotNull public List<String> extractGrams(@NotNull CharSequence text)
Example: NgramExtractor.gramLength(2).extractGrams("Foo bar") => [Fo,oo,o , b,ba,ar]
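To make the "Foo bar" example concrete, here is a minimal sliding-window sketch that yields grams in the order they occur. It is an independent illustration under stated assumptions, not the library's code: `OrderedGramsSketch` and `grams` are hypothetical names, only a single gram length is handled, and no filter or text padding is applied.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; the names below are hypothetical and not part of NgramExtractor.
final class OrderedGramsSketch {

    // Slides a window of gramLength characters over the text, left to right,
    // and collects each window as one gram.
    static List<String> grams(CharSequence text, int gramLength) {
        if (gramLength < 1) {
            throw new IllegalArgumentException("gramLength must be >= 1");
        }
        List<String> result = new ArrayList<>();
        for (int start = 0; start + gramLength <= text.length(); start++) {
            result.add(text.subSequence(start, start + gramLength).toString());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(grams("Foo bar", 2));
    }
}
```

For "Foo bar" and a gram length of 2 the sketch produces six grams: "Fo", "oo", "o ", " b", "ba" and "ar", matching the example above.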
text -
@NotNull public Map<String,Integer> extractCountedGrams(@NotNull CharSequence text)
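The page gives no description for extractCountedGrams, but its return type, Map<String,Integer>, suggests a mapping from each gram to its number of occurrences. The sketch below shows one way such a count could be built. It is an assumption-laden illustration, not the library's implementation: `CountedGramsSketch` and `countGrams` are hypothetical names, a single gram length is used, and the insertion-ordered map is a choice made here for readable output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; the names below are hypothetical and not part of NgramExtractor.
final class CountedGramsSketch {

    // Counts how often each gram of the given length occurs in the text.
    static Map<String, Integer> countGrams(CharSequence text, int gramLength) {
        if (gramLength < 1) {
            throw new IllegalArgumentException("gramLength must be >= 1");
        }
        // LinkedHashMap keeps first-occurrence order (a choice of this sketch).
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int start = 0; start + gramLength <= text.length(); start++) {
            String gram = text.subSequence(start, start + gramLength).toString();
            counts.merge(gram, 1, Integer::sum); // insert 1 or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countGrams("banana", 2)); // prints {ba=1, an=2, na=2}
    }
}
```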
Copyright © 2024. All rights reserved.