Data Profiling Software for Data Discovery, Data Assessment, Data Analysis
Many applications require name matching functionality. Examples include customer account management, medical record searching, duplicate database record identification and, in general, any form of information retrieval.

There are a variety of approaches to name matching, such as exact matching, wildcarding, fuzzy matching, alphanumeric matching, Soundex or other forms of phonetic matching, string distance metrics, probabilistic record linkage, knowledge-based expertise and learning. Each of these approaches has advantages and disadvantages: some work only in specific cases, while others can meet more specialized application and business requirements. The most accurate name matching technique, of course, is to employ human data operators who sift through the records and use their own judgment to decide whether two records are duplicates. However, employing a large number of data operators to work through millions of records is very costly and time consuming, and the problem gets even worse if this has to be done weekly or monthly.
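
To make two of the approaches listed above concrete, the following is a minimal Python sketch of classic American Soundex (a phonetic code) and Levenshtein edit distance (a string distance metric). These are the generic textbook techniques, shown only for illustration; the function names are hypothetical and nothing here is taken from NameSearch®.

```python
def soundex(name: str) -> str:
    """Classic American Soundex: first letter followed by three digits."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = code
    return (result + "000")[:4]


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or keep
        prev = cur
    return prev[-1]
```

For example, soundex("Robert") and soundex("Rupert") produce the same code, so a phonetic match would treat them as candidates, while levenshtein("Smith", "Smyth") reports a single-character difference.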

Intelligent Search Technology's searching and matching software produces an intelligent numeric score that determines the likelihood of a match between two records. Our signature NameSearch® searching and matching software engine utilizes a combination of different matching techniques such as fuzzy matching, phonetic matching, knowledge-based expertise and advanced heuristic algorithms to arrive at the matching scores. The production of the scores mimics the way in which humans go about producing a numeric score that determines the likelihood of a match between two input strings such as personal names, company names, addresses, account numbers, etc.

NameSearch® Matching Algorithms

ALFACOMP - Heuristic Comparison Algorithm - Scores are calculated through the use of advanced heuristic pattern recognition. The Alfacomp scoring method is used to evaluate two multi-word alpha-numeric strings such as names and street addresses. The two input strings are parsed into their word components. Each word component in string one is systematically scored using an advanced heuristic pattern recognition routine against every word component in string two. The results are stored in a weighted neural net having nodes that represent every combination. The optimized results are returned in the result parameter. The Alfacomp routine is based solely on a heuristic algorithm and is not dependent on rulebase expertise.
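
The word-by-word structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual Alfacomp pattern recognition and its weighted net are proprietary and are not reproduced here. The sketch substitutes the standard library's difflib.SequenceMatcher as the word scorer and a simple greedy one-to-one pairing as the optimization step; the function names are hypothetical.

```python
from difflib import SequenceMatcher


def word_score(a: str, b: str) -> float:
    """Similarity of two words in [0, 1]. Assumption: difflib stands in
    for the vendor's heuristic pattern recognition routine."""
    return SequenceMatcher(None, a, b).ratio()


def multiword_score(s1: str, s2: str) -> int:
    """Score two multi-word strings from 0 to 100, ignoring word order."""
    words1 = s1.lower().split()
    words2 = s2.lower().split()
    if not words1 or not words2:
        return 0
    # Score every word of string one against every word of string two.
    pairs = sorted(
        ((word_score(w1, w2), i, j)
         for i, w1 in enumerate(words1)
         for j, w2 in enumerate(words2)),
        reverse=True,
    )
    # Greedy pairing: take the best remaining pair, use each word once.
    used1, used2, total = set(), set(), 0.0
    for score, i, j in pairs:
        if i not in used1 and j not in used2:
            used1.add(i)
            used2.add(j)
            total += score
    # Dividing by the longer word list penalizes unmatched words.
    return round(100 * total / max(len(words1), len(words2)))
```

With this sketch, multiword_score("John Smith", "Smith John") returns 100 because every word finds an identical partner, while misspelled or missing words pull the score down.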

COMP2 - Rule-based Comparison Algorithm - This is NameSearch®’s comparison routine used for scoring names and addresses. The Comp2 comparison routine is used to intelligently determine the likelihood that two entities are the same. The Comp2 routine is more precise than Alfacomp because it uses rule-based knowledge and phonetic awareness in addition to advanced heuristic pattern recognition. However, the Alfacomp routine executes approximately eight times faster than its more intelligent sibling – Comp2.

MULTICOMP - Multiple Field Comparison Algorithm - The purpose of MultiComp is to evaluate multiple fields and deliver a combined, weighted score. For example, users may want to compare personal names, corporate names and addresses, and receive a single score. This score is created based on a field definition string that contains the methodology used for deriving and weighting scores. MultiComp offers the ability to compare fields with different comparison routines, choose score thresholds and score types, and assign weights to each individual field comparison definition.
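
The idea of a combined, weighted score can be illustrated with a short Python sketch. The syntax of the MultiComp field definition string is not described here, so the sketch models each field definition as a small hypothetical FieldRule object instead; the comparison functions, the threshold behavior (scores below the threshold count as zero) and the weighted-average formula are all assumptions made for illustration.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Mapping, Sequence


def text_score(a: str, b: str) -> float:
    """Fuzzy text similarity from 0 to 100 (difflib used as a stand-in)."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def exact_score(a: str, b: str) -> float:
    """100 when the values are identical, otherwise 0."""
    return 100.0 if a == b else 0.0


@dataclass
class FieldRule:
    name: str                             # field to compare in both records
    compare: Callable[[str, str], float]  # comparison routine for this field
    weight: float                         # relative importance of the field
    threshold: float = 0.0                # assumed: lower scores count as 0


def combined_score(rec1: Mapping[str, str], rec2: Mapping[str, str],
                   rules: Sequence[FieldRule]) -> float:
    """Weighted average of the per-field scores, from 0 to 100."""
    total_weight = sum(rule.weight for rule in rules)
    if total_weight <= 0:
        raise ValueError("the rule weights must sum to a positive number")
    weighted = 0.0
    for rule in rules:
        score = rule.compare(rec1[rule.name], rec2[rule.name])
        if score < rule.threshold:
            score = 0.0
        weighted += rule.weight * score
    return weighted / total_weight
```

A caller might, for instance, weight a fuzzy personal name comparison twice as heavily as an exact postal code comparison by passing FieldRule("name", text_score, 2) and FieldRule("zip", exact_score, 1).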

DATESCR - Date Comparison Algorithm - The datescr routine is used to intelligently determine the similarity between two dates. The date comparison routine examines the input dates, parses them into separate components and standardizes these components. The month and day together represent one component, which accounts for half of the score; the year component accounts for the other half. Date parsing and standardization are accomplished through the use of NameSearch®’s sanitization and word recognition routines. Upon completion of date standardization, datescr calculates a score between 0 and 100.
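
The half-and-half weighting described above can be sketched in Python as follows. The only detail taken from the description is the split: month and day contribute up to 50 points and the year contributes up to 50. The accepted input formats, the function names and the partial credit rule (25 points each for a matching month and a matching day, all or nothing for the year) are assumptions made for illustration, and the standard library's strptime stands in for the sanitization and word recognition routines.

```python
from datetime import date, datetime

# Assumed input formats; the real routine recognizes dates differently.
FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def parse_date(text: str) -> date:
    """Standardize a date string into a datetime.date."""
    cleaned = text.strip()
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def date_score(a: str, b: str) -> int:
    """Score two dates from 0 to 100: month/day is half, year is half."""
    d1, d2 = parse_date(a), parse_date(b)
    # Assumed partial credit: 25 for the same month, 25 for the same day.
    month_day = 25 * (d1.month == d2.month) + 25 * (d1.day == d2.day)
    year = 50 if d1.year == d2.year else 0
    return month_day + year
```

Under these assumptions, the same date written in two formats scores 100, and two dates that differ only in the year score 50.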

NUMCOMP - Numeric Comparison Algorithm - The number comparison routine can be used to compare two alpha-numeric fields. Numcomp uses advanced character-by-character analysis to recognize matching patterns within the input strings. The function returns four scores, each of which represents a different way of calculating the number of characters that matched against the total number of input characters.
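
As a rough illustration of comparing matched characters against input length in more than one way, the Python sketch below counts the characters that two strings share in matching blocks and reports four percentages. The four definitions chosen here (relative to the first string, the second string, the longer string and the average length) are assumptions; the description above does not say which four calculations Numcomp actually uses, and difflib's block matching is a stand-in for its character-by-character analysis.

```python
from difflib import SequenceMatcher

SCORE_NAMES = ("vs_first", "vs_second", "vs_longer", "vs_average")


def char_match_scores(a: str, b: str) -> dict:
    """Return four hypothetical 0-100 scores for two alphanumeric strings."""
    if not a or not b:
        return dict.fromkeys(SCORE_NAMES, 0)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    # Total number of characters that fall inside matching blocks.
    matched = sum(block.size for block in matcher.get_matching_blocks())

    def pct(total: float) -> int:
        return round(100 * matched / total)

    return {
        "vs_first": pct(len(a)),                    # share of the first input
        "vs_second": pct(len(b)),                   # share of the second input
        "vs_longer": pct(max(len(a), len(b))),      # share of the longer input
        "vs_average": pct((len(a) + len(b)) / 2),   # share of the mean length
    }
```

Comparing "12345" with "123", for example, gives a full score relative to the shorter input but a lower score relative to the longer one, which is the kind of distinction multiple scores make visible.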
