public abstract class RegexNerDetector extends RegexNamedEntityFactory
A RegexNamedEntityFactory for cases where

- NamedEntities use a single NerTag
- NamedEntities use MatchResult.group() as Token#getValue()

Subclasses implement the initPatterns() method, which is called once and is expected to provide the list of regex patterns.

The acceptMatch(String) method provides a callback so that unwanted matches can be filtered out. The default implementation filters out all blank matches.

Nested classes: RegexNamedEntityFactory.NamedPattern

Fields: log

| Constructor and Description |
|---|
| RegexNerDetector(String name, NerTag type) |
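
The contract described above (patterns supplied once by initPatterns(), matches filtered through acceptMatch(String)) can be illustrated with a minimal, self-contained sketch. DetectorSketch and YearDetectorSketch are hypothetical stand-ins, not the real RegexNerDetector. The detect method, the keying of the pattern map by language code, and the lazy call to initPatterns() are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the documented contract; NOT the real RegexNerDetector.
abstract class DetectorSketch {
    private Map<String, List<Pattern>> patterns;

    /** Supplies the patterns. Keying the map by language code is an assumption. */
    protected abstract Map<String, List<Pattern>> initPatterns();

    /** Mirrors the documented default: blank matches are filtered out. */
    protected boolean acceptMatch(String value) {
        return value != null && !value.trim().isEmpty();
    }

    /** Returns MatchResult.group() of every accepted match for the given language. */
    final List<String> detect(String text, String lang) {
        if (patterns == null) {
            patterns = initPatterns(); // called once, then cached
        }
        List<String> values = new ArrayList<>();
        for (Pattern p : patterns.getOrDefault(lang, Collections.emptyList())) {
            Matcher m = p.matcher(text);
            while (m.find()) {
                String value = m.group();
                if (acceptMatch(value)) {
                    values.add(value);
                }
            }
        }
        return values;
    }
}

// Hypothetical subclass: detects four-digit years in English text.
class YearDetectorSketch extends DetectorSketch {
    @Override
    protected Map<String, List<Pattern>> initPatterns() {
        Map<String, List<Pattern>> byLang = new HashMap<>();
        byLang.put("en", Arrays.asList(Pattern.compile("\\b\\d{4}\\b")));
        return byLang;
    }

    @Override
    protected boolean acceptMatch(String value) {
        // Keep the default blank check, then restrict to a plausible year range.
        if (!super.acceptMatch(value)) {
            return false;
        }
        int year = Integer.parseInt(value);
        return year >= 1900 && year <= 2100;
    }
}
```

With these stand-ins, new YearDetectorSketch().detect("Founded in 2016, room 0042, renewed 2017.", "en") keeps 2016 and 2017 and drops 0042, because the overridden acceptMatch rejects values outside the chosen range.
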
| Modifier and Type | Method and Description |
|---|---|
| protected boolean | acceptMatch(String value): Can be overridden to validate matches based on the Token#getValue(). |
| protected RegexNerProcessor.NamedEntity | createNamedEntity(String patternName, MatchResult match): Creates a token for the given MatchResult originating from the RegexNamedEntityFactory.NamedPattern with the given name. |
| String | getName() |
| protected List<RegexNamedEntityFactory.NamedPattern> | getRegexes(SpanCollection section, String lang): Getter for the RegexNamedEntityFactory.NamedPattern to be used by the RegexNerProcessor. |
| NerTag | getType() |
| protected void | init() |
| protected abstract Map<String,List<Pattern>> | initPatterns() |
Inherited methods: process

public String getName()

public NerTag getType()

@PostConstruct protected final void init() throws IOException
Throws: IOException

protected abstract Map<String,List<Pattern>> initPatterns() throws IOException
Throws: IOException

protected final List<RegexNamedEntityFactory.NamedPattern> getRegexes(SpanCollection section, String lang)
Description copied from class: RegexNamedEntityFactory
Getter for the RegexNamedEntityFactory.NamedPattern to be used by the RegexNerProcessor.
Declared by: getRegexes in class RegexNamedEntityFactory
Parameters:
section - the section of an AnalyzedText to be analyzed with the returned patterns
lang - the language of the parsed text section
Returns: the RegexNamedEntityFactory.NamedPattern or an empty list if none

protected final RegexNerProcessor.NamedEntity createNamedEntity(String patternName, MatchResult match)
Description copied from class: RegexNamedEntityFactory
Creates a token for the given MatchResult originating from the RegexNamedEntityFactory.NamedPattern with the given name.
Declared by: createNamedEntity in class RegexNamedEntityFactory
Parameters:
patternName - the name of the RegexNamedEntityFactory.NamedPattern
match - the MatchResult
Returns: the RegexNerProcessor.NamedEntity or null if no Token was created.

protected boolean acceptMatch(String value)
Can be overridden to validate matches based on the Token#getValue().
The default implementation accepts all non-blank values.
Parameters:
value - the value

Copyright © 2016–2017 Redlink GmbH. All rights reserved.