Analyzer
BaseTextAnalyzer class to analyze the text and extract the PII entities.
BaseTextAnalyzer
Bases: Component, ABC
Analyzer class to analyze the text and extract the PII entities.
analyze(text, language, entities=None, score_threshold=None, allow_list=None, allow_list_match='exact', regex_flags=re.DOTALL | re.MULTILINE | re.IGNORECASE, **kwargs)
abstractmethod
Analyze the text and extract the PII entities.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
text | str | The text to be analyzed. | required |
language | str | The language of the text. | required |
entities | list \| None | The list of entities to be extracted. | None |
score_threshold | float \| None | The threshold score for the extracted entities. | None |
allow_list | list \| None | List of words that the user defines as being allowed to keep in the text. | None |
allow_list_match | str \| None | The matching strategy for the allow list. | 'exact' |
regex_flags | int \| None | The regex flags for the text analysis. | DOTALL \| MULTILINE \| IGNORECASE |
**kwargs | Any | Additional keyword arguments that may be needed for the text analysis process. | {} |
Returns:
Type | Description |
---|---|
list[RecognizerResult] | The list of extracted entities. |
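
Because `analyze` is abstract, a concrete analyzer has to override it with the signature documented above. The following is a minimal sketch of what such a subclass might look like, not the library's own implementation. To stay self-contained it defines stand-ins for `BaseTextAnalyzer` and `RecognizerResult`: the stand-in base class omits the `Component` parent, the result fields (`entity_type`, `start`, `end`, `score`) are assumptions, and `EmailAnalyzer`, its regex, and its fixed score are hypothetical names introduced only for illustration.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class RecognizerResult:
    """Stand-in for the library's result type; these fields are assumed."""
    entity_type: str
    start: int
    end: int
    score: float


class BaseTextAnalyzer(ABC):
    """Stand-in that mirrors only the documented analyze() signature."""

    @abstractmethod
    def analyze(self, text, language, entities=None, score_threshold=None,
                allow_list=None, allow_list_match="exact",
                regex_flags=re.DOTALL | re.MULTILINE | re.IGNORECASE,
                **kwargs):
        ...


class EmailAnalyzer(BaseTextAnalyzer):
    """Hypothetical analyzer that finds e-mail addresses with one regex."""

    PATTERN = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"
    SCORE = 0.9  # fixed confidence assigned to every match (assumption)

    def analyze(self, text, language, entities=None, score_threshold=None,
                allow_list=None, allow_list_match="exact",
                regex_flags=re.DOTALL | re.MULTILINE | re.IGNORECASE,
                **kwargs):
        # Skip work if the caller asked for other entity types only.
        if entities is not None and "EMAIL" not in entities:
            return []
        # Drop all matches if the fixed score is below the threshold.
        if score_threshold is not None and self.SCORE < score_threshold:
            return []
        results = []
        for m in re.finditer(self.PATTERN, text, flags=regex_flags or 0):
            # Only the "exact" allow-list strategy is sketched here.
            if allow_list and allow_list_match == "exact" and m.group() in allow_list:
                continue
            results.append(RecognizerResult("EMAIL", m.start(), m.end(), self.SCORE))
        return results


sample = "Mail alice@example.com or bob@example.org"
found = EmailAnalyzer().analyze(sample, "en", allow_list=["bob@example.org"])
print([sample[r.start:r.end] for r in found])  # ['alice@example.com']
```

In this sketch the allow list suppresses a match only when the matched span equals an allow-list entry exactly; any other value of `allow_list_match` is ignored, and `language` is accepted but unused.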