public abstract class VocabularyDetector extends RegexNamedEntityFactory

| Modifier and Type | Class and Description |
|---|---|
| static class | VocabularyDetector.CaseSensitivity |

Inherited nested classes: RegexNamedEntityFactory.NamedPattern

Inherited fields: log

| Constructor and Description |
|---|
| VocabularyDetector(String name, NerTag type, Locale lang, VocabularyDetector.CaseSensitivity caseSensitivity) |


| Modifier and Type | Method and Description |
|---|---|
| protected RegexNerProcessor.NamedEntity | createNamedEntity(String patternName, MatchResult match): Creates a token for the passed MatchResult originating from the RegexNamedEntityFactory.NamedPattern with the passed name |
| VocabularyDetector.CaseSensitivity | getCaseSensitivity() |
| Locale | getLanguage() |
| String | getName() |
| protected List<RegexNamedEntityFactory.NamedPattern> | getRegexes(SpanCollection section, String lang): Getter for the RegexNamedEntityFactory.NamedPattern instances to be used by the RegexNerProcessor |
| protected void | init() |
| protected abstract Collection<VocabularyEntry> | loadEntries() |
| protected String | normalize(String label): Normalizes a label with StringUtils.trimToNull(String) and, depending on the configured case sensitivity (#isCaseSensitive()), converts it to lower case using getLanguage()-specific rules |
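
The normalization described for normalize(String) can be illustrated with a minimal standalone sketch. It does not use the library: `NormalizeSketch`, its local `trimToNull` helper (standing in for StringUtils.trimToNull(String)) and the `caseSensitive` flag are hypothetical, and the sketch assumes that lower-casing is applied only when matching is not case sensitive.

```java
import java.util.Locale;

/** Standalone illustration only; not the VocabularyDetector implementation. */
final class NormalizeSketch {

    /** Hypothetical stand-in for StringUtils.trimToNull(String). */
    static String trimToNull(String s) {
        if (s == null) {
            return null;
        }
        String trimmed = s.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    /**
     * Trims the label and, under the assumption stated above, lower-cases it
     * with the rules of the given locale when matching is case-insensitive.
     * Returns null for labels that are null, empty or whitespace only.
     */
    static String normalize(String label, boolean caseSensitive, Locale lang) {
        String trimmed = trimToNull(label);
        if (trimmed == null) {
            return null; // invalid label
        }
        return caseSensitive ? trimmed : trimmed.toLowerCase(lang);
    }

    public static void main(String[] args) {
        System.out.println(normalize("  Vienna ", false, Locale.GERMAN)); // vienna
        System.out.println(normalize("  Vienna ", true, Locale.GERMAN));  // Vienna
        System.out.println(normalize("   ", false, Locale.GERMAN));       // null
        // Locale-specific rules matter: Turkish lower-cases 'I' to dotless 'ı'.
        System.out.println(normalize("TITLE", false, Locale.forLanguageTag("tr"))); // tıtle
    }
}
```

Passing the locale explicitly is what makes the lower-casing language specific; the Turkish case shows why a locale-independent `toLowerCase()` could produce labels that never match.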

Inherited methods: process

public VocabularyDetector(String name, NerTag type, Locale lang, VocabularyDetector.CaseSensitivity caseSensitivity)

public final String getName()

protected String normalize(String label)
Normalizes a label with StringUtils.trimToNull(String) and, depending on the configured case sensitivity (#isCaseSensitive()), converts it to lower case using getLanguage()-specific rules.
Parameters:
label - the label to normalize
Returns:
null if the label is invalid

public VocabularyDetector.CaseSensitivity getCaseSensitivity()

public Locale getLanguage()
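
To make the roles of getRegexes(SpanCollection, String) and createNamedEntity(String, MatchResult) more concrete, here is a rough standalone sketch of how vocabulary labels might be compiled into a named pattern and how each MatchResult might be turned into a result. Nothing here is taken from the library: `VocabularyPatternSketch`, its nested `NamedPattern` class, the `caseSensitive` flag and the `name:match@offset` output format are all hypothetical and use only java.util.regex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Standalone illustration only; not the VocabularyDetector implementation. */
final class VocabularyPatternSketch {

    /** Hypothetical stand-in for a named pattern: a name plus a compiled regex. */
    static final class NamedPattern {
        final String name;
        final Pattern pattern;

        NamedPattern(String name, Pattern pattern) {
            this.name = name;
            this.pattern = pattern;
        }
    }

    /** Builds one alternation pattern from the labels, quoting each label literally. */
    static NamedPattern toPattern(String name, List<String> labels, boolean caseSensitive) {
        if (labels.isEmpty()) {
            throw new IllegalArgumentException("at least one label is required");
        }
        StringBuilder alternation = new StringBuilder();
        for (String label : labels) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(label));
        }
        int flags = caseSensitive ? 0 : (Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        return new NamedPattern(name, Pattern.compile("\\b(?:" + alternation + ")\\b", flags));
    }

    /** Maps every MatchResult to a simple "name:match@offset" string. */
    static List<String> findAll(NamedPattern namedPattern, String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = namedPattern.pattern.matcher(text);
        while (matcher.find()) {
            MatchResult result = matcher.toMatchResult();
            found.add(namedPattern.name + ":" + result.group() + "@" + result.start());
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> labels = Arrays.asList("Vienna", "Salzburg");
        String text = "From vienna to Salzburg.";
        // Case-insensitive: both labels are found.
        System.out.println(findAll(toPattern("city", labels, false), text));
        // Case-sensitive: only the exactly cased label is found.
        System.out.println(findAll(toPattern("city", labels, true), text));
    }
}
```

Quoting each label with `Pattern.quote` keeps vocabulary entries that contain regex metacharacters from changing the pattern's meaning.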

@PostConstruct protected final void init() throws IOException
Throws:
IOException

protected abstract Collection<VocabularyEntry> loadEntries() throws IOException
Throws:
IOException

protected RegexNerProcessor.NamedEntity createNamedEntity(String patternName, MatchResult match)
Description copied from class: RegexNamedEntityFactory
Creates a token for the passed MatchResult originating from the RegexNamedEntityFactory.NamedPattern with the passed name.
Defined by: createNamedEntity in class RegexNamedEntityFactory
Parameters:
patternName - the name of the RegexNamedEntityFactory.NamedPattern
match - the MatchResult
Returns:
the RegexNerProcessor.NamedEntity, or null if no token was created

protected List<RegexNamedEntityFactory.NamedPattern> getRegexes(SpanCollection section, String lang)
Description copied from class: RegexNamedEntityFactory
Getter for the RegexNamedEntityFactory.NamedPattern instances to be used by the RegexNerProcessor.
Defined by: getRegexes in class RegexNamedEntityFactory
Parameters:
section - the section of an AnalyzedText to be analyzed with the returned patterns
lang - the language of the passed text section
Returns:
the RegexNamedEntityFactory.NamedPattern instances, or an empty list if none

Copyright © 2016–2017 Redlink GmbH. All rights reserved.