
Babel Street Analytics API

Name Endpoints

Match

Babel Street Match uses machine learning and natural language processing (NLP) techniques to perform name matching, address matching, record matching, and name deduplication across a large set of languages and writing scripts. Match functionality is provided through four endpoints.

Names are hard to match because of the large number of variations that occur within a language and across languages. Match breaks each name into tokens and compares the corresponding tokens. It can identify variations between those tokens including, but not limited to, typographical errors, phonetic spelling variations, transliteration differences, initials, and nicknames.
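The tokenize-and-compare idea can be illustrated with a small, self-contained sketch. This is not Match's algorithm or API: the nickname table, the 0.8 credit for initials, the greedy alignment, and the function names (`tokenize`, `token_score`, `name_score`) are all assumptions made for illustration, and Python's `difflib.SequenceMatcher` stands in for whatever token comparison Match actually performs.

```python
from difflib import SequenceMatcher

# Hypothetical, tiny nickname table used only for this illustration.
NICKNAMES = {"bobby": "robert", "bob": "robert", "bill": "william"}


def tokenize(name):
    """Lowercase, drop periods, split on whitespace, and map nicknames to a canonical form."""
    tokens = name.lower().replace(".", "").split()
    return [NICKNAMES.get(t, t) for t in tokens]


def token_score(a, b):
    """Score two tokens in the range [0, 1]."""
    if a == b:
        return 1.0
    # Assumed rule: an initial that matches the first letter of the other token
    # earns partial credit.
    if (len(a) == 1 or len(b) == 1) and a[0] == b[0]:
        return 0.8
    # Fallback: character-level similarity, which tolerates small spelling differences.
    return SequenceMatcher(None, a, b).ratio()


def name_score(name1, name2):
    """Greedily pair each token of the shorter name with its best remaining
    counterpart, ignoring token order, and average over the longer name."""
    t1, t2 = tokenize(name1), tokenize(name2)
    if not t1 or not t2:
        return 0.0
    short, long_ = sorted((t1, t2), key=len)
    remaining = list(long_)
    total = 0.0
    for tok in short:
        best = max(remaining, key=lambda other: token_score(tok, other))
        total += token_score(tok, best)
        remaining.remove(best)
    # Unpaired tokens in the longer name lower the score (missing name components).
    return total / len(long_)
```

Under these assumptions, `name_score("Zedong Mao", "Mao Zedong")` treats the reordered tokens as a full match, while `name_score("Mohammad Salah", "Mohammad Abd El-Hamid Salah")` is penalized for the two unpaired tokens. A production matcher would also need to handle transliteration and variable segmentation, which this sketch ignores.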

Table 2. Examples of Name Variations

| Variation | Example(s) |
| --- | --- |
| Phonetic and/or spelling differences | Nayif Hawatmeh and Nayif Hawatma |
| Missing name components | Mohammad Salah and Mohammad Abd El-Hamid Salah |
| Rarity of a shared name component | Two English names that contain Ditters are more likely to match than two names that contain Smith |
| Initials | John F. Kennedy and John Fitzgerald Kennedy |
| Nicknames | Bobby Holguin and Robert Holguin |
| "Cousin" or cognate names | Pedro Calzon and Peter Calzon |
| Uppercase/Lowercase | Rosa Elena PACHECO and Rosa Elena Pacheco |
| Reordered name components | Zedong Mao and Mao Zedong |
| Variable segmentation | Henry Van Dick and Henri VanDick; Robert Smith and Robert JohnSmyth |
| Corresponding name fields | For [Katherine][Anne][Cox], the similarity with [Katherine][Ann][Cox] is higher than the similarity with [Katherine Ann][Cox] |
| Truncation of name elements | For Sawyer, the similarity with Sawy is higher than the similarity with Sawi |
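The "rarity of a shared name component" row can be made concrete with a weighting sketch. The example below uses an inverse-frequency (IDF-style) weight, which is a common general technique and not necessarily how Match scores rarity. The token counts, the total, and the function names (`rarity_weight`, `shared_token_evidence`) are hypothetical values chosen only to show the effect.

```python
import math

# Hypothetical token counts from an imagined reference list of names;
# the numbers are invented purely for illustration.
TOKEN_COUNTS = {"smith": 50000, "ditters": 12, "john": 80000, "karl": 9000}
TOTAL_NAMES = 1_000_000


def rarity_weight(token):
    """IDF-style weight: the rarer the token, the larger the weight.
    Tokens missing from the table are treated as if seen once."""
    count = TOKEN_COUNTS.get(token.lower(), 1)
    return math.log(TOTAL_NAMES / count)


def shared_token_evidence(name1, name2):
    """Sum the rarity weights of the tokens the two names have in common."""
    shared = set(name1.lower().split()) & set(name2.lower().split())
    return sum(rarity_weight(t) for t in shared)
```

With these made-up counts, `shared_token_evidence("Karl Ditters", "John Ditters")` is larger than `shared_token_evidence("Karl Smith", "John Smith")`, mirroring the claim that a shared rare component is stronger evidence of a match than a shared common one.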



Supported Language Matches

Name matching within a language

Match fully supports matches between names in the following languages. It also fully supports matching names between all languages and English.

Cross-language matches

This table identifies the range of cross-language matching that Match fully supports.