
AO Rules Entity Extractor

Intended audience: Analysts, Developers, Administrators

AO Platform: 4.3

Overview

This topic contains the Parameters and Entity Identification Rules configuration sections of the AO Rules Entity Extractor Strategy.

Properties - Parameters

Labels

Descriptions

Model Name

Entities to be Identified

Properties - Entity Identification Rules

Labels

Descriptions

Full Scan

RULES

  • Rule name

A friendly name for the Rule being created. Enter the Rule name manually, or click the Search icon to search for and select a Rule template. Once a template is selected, the Rule name and Description are populated from it.

  • Description

A short description for the Rule being created. Enter the Description manually, or click the Search icon to search for and select a Rule template. Once a template is selected, the Rule name and Description are populated from it.

STEPS

  • Tag

Select the Tag radio button for the step in the Rule that will produce the extracted entity.

  • Key

From the dropdown, select the Key that will be used, possibly with other steps, to identify the entity to be extracted. See the Possible Keys table for descriptions and examples of the keys in the dropdown.

  • Value

Add the Value based on the selected Key. See the following tables for predefined or manually entered values:

Possible Keys

| Display Name | Description | Example |
|---|---|---|
| Word (WORD) | The word to search for in the text. Words must be provided in the Bag of Words bucket in the Linguistics section. | TAG-WORD-United States |
| Named Entity Recognition (NER) | The Named Entity Recognition type to search for. Fixed list of values; see Possible values for NER. | TAG-NER-GPE |
| Part-of-Speech (POS) | The Part-of-Speech type to search for. Fixed list of values; see Possible values for POS. | POS-ADJ |
| Regular Expression (REGEX) | A regular expression to search for in the text. | TAG-REGEX-[Uu]nited\s?[Ss]tates |
| Match Regular Expression (MATCH_REGEX) | Tells the engine to execute later steps only if the regex matches. | MATCH_REGEX-(organization.*?\\$[0-9\\.,]+ per share) |
| Filter Regular Expression (FILTER_REGEX) | Filters the text according to the regex provided and sends that "filtered" text to downstream steps. | FILTER_REGEX-(organization.*?\\$[0-9\\.,]+ per share) |
| Bag of Words (BOW) | A Bag of Words, which can come through a word provider. Add the Bag of Words Provider in a separate field. | BOW-BOW |
| Maximum Tokens (MAX_TOKENS) | Limits the search to N tokens. | MAX_TOKENS-2 |
| Maximum Characters (MAX_CHARS) | Limits the search to N characters. | MAX_CHARS-10 |
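The examples for the Possible Keys appear to share one notation: an optional `TAG-` marker, the key, an optional step number (as in `WORD1`), a hyphen, and the value. That reading is inferred from the examples only. The sketch below shows one hypothetical way to split such a string into its parts; `Step` and `parse_step` are illustrative names and not part of the AO Platform.

```python
import re
from typing import NamedTuple, Optional

# Keys taken from the Possible Keys list in this topic.
KEYS = ("MATCH_REGEX", "FILTER_REGEX", "MAX_TOKENS", "MAX_CHARS",
        "WORD", "NER", "POS", "REGEX", "BOW")

# Assumed notation: [TAG-]KEY[step number]-VALUE
STEP_PATTERN = re.compile(
    r"^(?P<tag>TAG-)?(?P<key>" + "|".join(KEYS) + r")(?P<index>\d*)-(?P<value>.+)$"
)


class Step(NamedTuple):
    is_tag: bool          # True when the step carries the TAG- marker
    key: str              # e.g. "NER", "REGEX", "MAX_TOKENS"
    index: Optional[int]  # step number such as the 1 in "WORD1", if present
    value: str            # everything after the first hyphen following the key


def parse_step(text: str) -> Step:
    """Split an example string such as 'TAG-NER-GPE' into its parts."""
    match = STEP_PATTERN.match(text)
    if match is None:
        raise ValueError(f"Unrecognized step notation: {text!r}")
    index = int(match.group("index")) if match.group("index") else None
    return Step(match.group("tag") is not None, match.group("key"),
                index, match.group("value"))


if __name__ == "__main__":
    for example in ("TAG-NER-GPE", "POS-ADJ", "MAX_TOKENS-2", "WORD1-commercial"):
        print(parse_step(example))
```

Because only the first hyphen after the key is treated as a separator, values that themselves contain hyphens or regex syntax are kept intact.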

Possible values for NER

| Display Name | Description | Examples |
|---|---|---|
| Person (PERSON) | People, including fictional | President Obama, Kim |
| Group (NORP) | Nationalities or religious or political groups | European, Christianity, The Democratic Party |
| Facility (FAC) | Buildings, airports, highways, bridges, etc. | Washington Monument |
| Organization (ORG) | Companies, agencies, institutions, etc. | WHO, Google, App Orchid |
| Geo-Political Entity (GPE) | Countries, cities, states | North West America, U.K. |
| Geo-Location (LOC) | Non-GPE locations, mountain ranges, bodies of water | Mount Everest, London, England, United Kingdom, California |
| Product (PRODUCT) | Objects, vehicles, foods, etc. (not services) | Ferrari, Mustang, banana, apple, Alexa |
| Event (EVENT) | Named hurricanes, battles, wars, sports events, etc. | WW2, UEFA Cup, Masters Series |
| Work of Art (WORK_OF_ART) | Titles of books, songs, etc. | The Hitchhiker's Guide to the Galaxy, Hey Jude, War and Peace |
| Law (LAW) | Named documents made into laws | |
| Language (LANGUAGE) | Any named language | Japanese, Danish, Hindi, Arabic |
| Date (DATE) | Absolute or relative dates or periods | 2020-07-10, Wednesday |
| Time (TIME) | Times smaller than a day | 12:50 P.M. |
| Percent (PERCENT) | Percentage, including "%" | 98.24% |
| Money (MONEY) | Monetary values, including unit | One Million Dollars, $5.1 billion |
| Quantity (QUANTITY) | Measurements, as of weight or distance | 10 km, 25 ft, 1 kg, 3 gallons |
| Ordinal (ORDINAL) | "first", "second", etc. | second |
| Cardinal (CARDINAL) | Numerals that do not fall under another type | 93,000, two |

Possible values for POS

| Display Name | Description | Examples |
|---|---|---|
| Adjective (ADJ) | Noun modifiers describing properties | red, young, awesome, big, old, green, incomprehensible, first |
| Adposition (ADP) | Marks a noun's spatial, temporal, or other relation | in, on, by, under, to, during |
| Adverb (ADV) | Verb modifiers of time, place, manner | very, slowly, home, yesterday, tomorrow, down, where, there |
| Auxiliary (AUX) | Helping verb marking tense, aspect, mood, etc. | can, may, are, is, has (done), will (do), should (do) |
| Conjunction (CONJ) | Joins two phrases/clauses (replaced by CCONJ and SCONJ) | and, or, but |
| Coordinating Conjunction (CCONJ) | Joins two phrases/clauses | and, or, but |
| Determiner (DET) | Marks noun phrase properties | a, an, the, this |
| Interjection (INTJ) | Exclamation, greeting, yes/no response, etc. | oh, um, yes, hello, psst, ouch, bravo |
| Noun (NOUN) | Words for persons, places, things, etc. | algorithm, cat, mango, beauty, girl, tree, air |
| Numeral (NUM) | Numeral | one, two, first, second, 1, 2017, seventy-seven, IV, MMXIV |
| Particle (PART) | A preposition-like form used together with a verb | up, down, on, off, in, out, at, by, ’s, not |
| Pronoun (PRON) | A shorthand for referring to an entity or event | who, I, others, you, he, she, myself, themselves, somebody |
| Proper Noun (PROPN) | Name of a person, organization, place, etc. | Regina, IBM, Colorado, Mary, John, London, NATO, HBO |
| Punctuation (PUNCT) | Punctuation | ., (, ), ? |
| Subordinating Conjunction (SCONJ) | Joins a main clause with a subordinate clause such as a sentential complement | that, which, if, while |
| Symbol (SYM) | Symbols like $ or emoji | $, %, §, ©, +, −, ×, ÷, =, :), 😝 |
| Verb (VERB) | Words for actions and processes | draw, provide, go, run, runs, running, eat, ate, eating |
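Because the NER and POS keys accept only the fixed codes listed above, a value can be checked against those lists before a rule is saved. The following is a minimal sketch of such a check, assuming the codes in parentheses are the accepted values; `is_valid_value` is a hypothetical helper, not an AO Platform function.

```python
# Codes copied from "Possible values for NER" and "Possible values for POS".
NER_VALUES = frozenset({
    "PERSON", "NORP", "FAC", "ORG", "GPE", "LOC", "PRODUCT", "EVENT",
    "WORK_OF_ART", "LAW", "LANGUAGE", "DATE", "TIME", "PERCENT", "MONEY",
    "QUANTITY", "ORDINAL", "CARDINAL",
})
POS_VALUES = frozenset({
    "ADJ", "ADP", "ADV", "AUX", "CONJ", "CCONJ", "DET", "INTJ", "NOUN",
    "NUM", "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB",
})

# Keys whose values come from a fixed list; other keys take free-form values.
FIXED_LISTS = {"NER": NER_VALUES, "POS": POS_VALUES}


def is_valid_value(key: str, value: str) -> bool:
    """Return False only when the key has a fixed list and the value is not in it."""
    allowed = FIXED_LISTS.get(key)
    return allowed is None or value in allowed


if __name__ == "__main__":
    print(is_valid_value("NER", "GPE"))   # a listed NER code
    print(is_valid_value("POS", "GPE"))   # GPE is not a POS code
```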

Possible values for BOW

| Display Name | Description | Example |
|---|---|---|
| Bag of Words (BOW) | All words must be added in the Linguistics section, in the bucket referred to by the selection in the Bag of Words Provider field. | BOW-BOW |

Possible values for WORD

Descriptions

Examples

Add the word(s) directly in the Value field.

  • WORD1-commercial

  • WORD2-umbrella|excess|excessive

  • TAG-WORD-commercial

  • TAG-WORD-umbrella|excess|excessive
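The `umbrella|excess|excessive` example suggests that a WORD value can list alternatives separated by `|`. The sketch below illustrates that reading with Python's `re` module. Whole-word, case-insensitive matching is an assumption made here for illustration, and `find_words` is a hypothetical name; the engine's actual matching behavior is not described in this topic.

```python
import re
from typing import List


def find_words(value: str, text: str) -> List[str]:
    """Return every occurrence in text of any alternative listed in value."""
    alternatives = [word for word in value.split("|") if word]
    # Escape each alternative so it is matched literally, then require
    # word boundaries so "excess" does not match inside "excessive".
    pattern = r"\b(?:" + "|".join(re.escape(word) for word in alternatives) + r")\b"
    return re.findall(pattern, text, flags=re.IGNORECASE)


if __name__ == "__main__":
    sentence = "Commercial umbrella and excess liability, not excessive."
    print(find_words("umbrella|excess|excessive", sentence))
    print(find_words("commercial", sentence))
```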

Possible values for REGEX, MATCH_REGEX, FILTER_REGEX

Descriptions

Examples

Add the regex directly in the Value field.

For interactive learning about Regular Expressions, visit these online tools: https://regex101.com/ and https://regexr.com/

  • REGEX1-\\d+

  • TAG-REGEX-[\\d,\\.]+
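To illustrate how the three regex keys differ, the sketch below models them with Python's `re` module. Several points are assumptions: that FILTER_REGEX passes the matched portion of the text downstream, that the doubled backslashes in the examples are string escaping (so a Python raw string uses a single backslash), and the function names, which are hypothetical and not part of the AO Platform.

```python
import re
from typing import List, Optional


def match_gate(pattern: str, text: str) -> bool:
    """MATCH_REGEX-style gate: True when later steps should run."""
    return re.search(pattern, text) is not None


def filter_text(pattern: str, text: str) -> Optional[str]:
    """FILTER_REGEX-style filter (assumed): the first matched portion, or None."""
    match = re.search(pattern, text)
    return match.group(0) if match else None


def extract(pattern: str, text: str) -> List[str]:
    """REGEX-style search: every full match of the pattern in the text."""
    return [match.group(0) for match in re.finditer(pattern, text)]


if __name__ == "__main__":
    text = "The organization paid $12.50 per share in 2020. Other news followed."
    share_pattern = r"(organization.*?\$[0-9\.,]+ per share)"
    number_pattern = r"[\d,\.]+"

    if match_gate(share_pattern, text):
        filtered = filter_text(share_pattern, text)
        # The number search now sees only the filtered span.
        print(filtered)
        print(extract(number_pattern, filtered))
```

Running the number pattern on the filtered span rather than the full sentence keeps unrelated numbers and punctuation, such as the year, out of the result.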

Possible values for MAX_CHARS

Descriptions

Examples

Add a number (or number range) directly in the Value field.

  • MAX_CHARS1-5

Possible values for MAX_TOKENS

Descriptions

Examples

Add a number (or number range) directly in the Value field.

  • MAX_TOKENS1-5
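As a rough illustration of what limiting a search to N tokens or N characters can mean, the sketch below truncates the text that a later step would see. It assumes whitespace tokenization and a window that starts at the beginning of the given text; the engine's tokenizer, its window position, and its handling of number ranges are not described here. `limit_tokens` and `limit_chars` are hypothetical names.

```python
def limit_tokens(text: str, max_tokens: int) -> str:
    """Keep only the first max_tokens whitespace-separated tokens."""
    return " ".join(text.split()[:max_tokens])


def limit_chars(text: str, max_chars: int) -> str:
    """Keep only the first max_chars characters."""
    return text[:max_chars]


if __name__ == "__main__":
    phrase = "United States of America"
    print(limit_tokens(phrase, 2))   # MAX_TOKENS-2
    print(limit_chars(phrase, 10))   # MAX_CHARS-10
```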

Also see Testing Strategies.

