Name Endpoints
Match
Babel Street Match uses machine learning and NLP techniques to perform name matching, address matching, record matching, and name deduplication across a large set of languages and writing scripts. Match functionality is provided through four endpoints.
Names are difficult to match because of the large number of variations that occur both within a language and across languages. Match breaks each name into tokens and compares the corresponding tokens of the two names. It can identify variations between corresponding tokens including, but not limited to, typographical errors, phonetic spelling variations, transliteration differences, initials, and nicknames.
Variation | Example(s) |
---|---|
Phonetic and/or spelling differences | Nayif Hawatmeh and Nayif Hawatma |
Missing name components | Mohammad Salah and Mohammad Abd El-Hamid Salah |
Rarity of a shared name component | Two English names that contain Ditters are more likely to match than two names that contain Smith |
Initials | John F. Kennedy and John Fitzgerald Kennedy |
Nicknames | Bobby Holguin and Robert Holguin |
"Cousin" or cognate names | Pedro Calzon and Peter Calzon |
Uppercase/lowercase | Rosa Elena PACHECO and Rosa Elena Pacheco |
Reordered name components | Zedong Mao and Mao Zedong |
Variable segmentation | Henry Van Dick and Henri VanDick, Robert Smith and Robert JohnSmyth |
Corresponding name fields | For [Katherine][Anne][Cox], the similarity with [Katherine][Ann][Cox] is higher than the similarity with [Katherine Ann][Cox] |
Truncation of name elements | For Sawyer, the similarity with Sawy is higher than the similarity with Sawi |
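
To make the idea of token-based comparison concrete, the following is a minimal toy sketch in Python. It is not Match's algorithm and does not call the Match API: the function names (`tokenize`, `token_score`, `name_score`), the 0.8 weight for initials, and the use of the standard library's `difflib.SequenceMatcher` as a stand-in for spelling similarity are all assumptions made for illustration. It covers only a few rows of the table above (case differences, reordering, initials, and missing components) and knows nothing about nicknames, cognates, name rarity, or cross-script matching.

```python
from difflib import SequenceMatcher


def tokenize(name: str) -> list[str]:
    """Split a name on whitespace, lowercase it, and strip periods from token ends."""
    return [tok.strip(".").lower() for tok in name.split() if tok.strip(".")]


def token_score(a: str, b: str) -> float:
    """Score two tokens in [0, 1]; a single letter is treated as an initial."""
    if a == b:
        return 1.0
    if (len(a) == 1 or len(b) == 1) and a[0] == b[0]:
        return 0.8  # arbitrary illustrative weight for an initial match
    # Character-level similarity as a rough proxy for spelling variation.
    return SequenceMatcher(None, a, b).ratio()


def name_score(name1: str, name2: str) -> float:
    """Greedily pair each token of the shorter name with its best unused
    counterpart in the longer name, then divide by the longer name's token
    count so that missing components lower the score."""
    short, long_ = sorted((tokenize(name1), tokenize(name2)), key=len)
    if not long_:
        return 0.0
    unused = list(long_)
    total = 0.0
    for tok in short:
        best = max(unused, key=lambda cand: token_score(tok, cand))
        total += token_score(tok, best)
        unused.remove(best)
    return total / len(long_)
```

In this sketch, `name_score("Zedong Mao", "Mao Zedong")` and `name_score("Rosa Elena PACHECO", "Rosa Elena Pacheco")` both return 1.0, because tokens are lowercased and paired regardless of order. `name_score("Mohammad Salah", "Mohammad Abd El-Hamid Salah")` returns a lower score, because two of the four tokens in the longer name have no counterpart. A production matcher weighs these factors far more carefully than a simple average does.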
Supported Language Matches
Name matching within a language
Match fully supports matches between names in the following languages. It also fully supports matching names between all languages and English.
Cross-language matches
This table identifies the range of cross-language matching that Match fully supports.