Matching, Searching, Comparison
The MerlinMerge® SpeedPro product is a software tool that will enable systems
to find and match records using personal names, corporate names, addresses, social
security, phone and account numbers and other identifying information. This sophisticated
software will increase the quality of matching while minimizing I/O expense.
The first aspect of the matching process involves the creation of intelligent
keys and search ranges. This facility is used for the retrieval of records regardless
of variation caused by phonetics, transcription or keyboarding errors, nicknames,
short forms, missing words, extra words, noise and sequence variations.
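As a rough illustration of how a key can absorb the variations listed above, the following Python sketch builds an order-insensitive key from a name. It is not the MerlinMerge® SpeedPro key algorithm, which is not described here; the function names, the crude phonetic code and the small nickname and noise-word tables are all assumptions made for the example.

```python
import re

# Hypothetical lookup tables, for illustration only.
NICKNAMES = {"bill": "william", "bob": "robert", "liz": "elizabeth"}
NOISE_WORDS = {"mr", "mrs", "ms", "dr", "jr", "sr", "the", "inc", "ltd"}


def phonetic_code(word: str) -> str:
    """Crude phonetic code (an assumption, not a standard algorithm):
    keep the first letter, drop later vowels plus h, w and y,
    then collapse repeated letters."""
    if not word:
        return ""
    tail = re.sub(r"[aeiouhwy]", "", word[1:])
    return re.sub(r"(.)\1+", r"\1", word[0] + tail)


def match_key(name: str) -> str:
    """Build a key that ignores case, punctuation, noise words,
    known nicknames and word order."""
    tokens = re.findall(r"[a-z]+", name.lower())
    tokens = [NICKNAMES.get(t, t) for t in tokens if t not in NOISE_WORDS]
    return " ".join(sorted(phonetic_code(t) for t in tokens))


print(match_key("Mr. Bill Smith"))   # smt wlm
print(match_key("Smyth, William"))   # smt wlm
```

Records that share a key can then be retrieved together and passed to a more careful comparison step.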
Matching is achieved through advanced comparison functions that utilize neural
net technology, rule-based intelligence and advanced heuristic pattern recognition.
The matching functionality will deliver comparison scores that approximate values
generated by an individual with significant linguistic expertise. These comparison
routines enable systems to make decisions without human intervention.
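To make the idea of a comparison score concrete, here is a minimal Python sketch that returns a 0-100 score and applies a threshold to decide automatically. It substitutes the standard library's `difflib.SequenceMatcher` for the neural net and rule-based techniques mentioned above, so it only illustrates the shape of the interface; the function names and the threshold of 85 are assumptions.

```python
from difflib import SequenceMatcher


def comparison_score(a: str, b: str) -> int:
    """Return a 0-100 similarity score for two field values,
    ignoring case and repeated whitespace."""
    a_norm = " ".join(a.lower().split())
    b_norm = " ".join(b.lower().split())
    return round(100 * SequenceMatcher(None, a_norm, b_norm).ratio())


def is_match(a: str, b: str, threshold: int = 85) -> bool:
    """Decide without human intervention: match if the score
    reaches the (assumed) threshold."""
    return comparison_score(a, b) >= threshold


print(comparison_score("John Smith", "john   SMITH"))  # 100
print(is_match("Jon Smith", "John Smith"))             # True
print(is_match("Jon Smith", "Mary Jones"))             # False
```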
Deduplication (Deduping)
The process of identifying and removing duplicate records is called
deduplication or deduping. Deduplication is a key operation when dealing
with large volumes of data, and especially when integrating data from
multiple sources.
The truth is that nearly all databases contain duplicate records, for a
number of reasons: spelling and transposition errors, misheard names and
addresses, different people entering data (often from multiple locations),
merging external data, etc. This is a huge problem for an organization, but
often it does not emerge until a larger project is undertaken, the
organization expands or an unhappy customer complains.
As data is collected from a wide variety of sources, the number of duplicate
entries will keep growing. Most databases contain at least 3% duplicate
records and, in many cases, significantly more.
The main challenge in this task is identifying when a pair of
records refers to the same entity in spite of various data inconsistencies.
MerlinMerge® SpeedPro is specifically designed to provide a solution to this issue.
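The challenge of deciding when two inconsistent records refer to the same entity can be sketched in a few lines of Python. This is an illustration only, not the product's method: it assumes records are dictionaries with hypothetical `name` and `address` fields, uses `difflib.SequenceMatcher` as a stand-in similarity measure, and keeps the first record of each duplicate group.

```python
from difflib import SequenceMatcher


def similar(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Treat two records as the same entity when their combined
    name and address text is similar enough (threshold is assumed)."""
    text_a = f"{a['name']} {a['address']}".lower()
    text_b = f"{b['name']} {b['address']}".lower()
    return SequenceMatcher(None, text_a, text_b).ratio() >= threshold


def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first record of each group of near-duplicates.
    Compares every record with every survivor, so it only suits small lists."""
    kept: list[dict] = []
    for record in records:
        if not any(similar(record, survivor) for survivor in kept):
            kept.append(record)
    return kept


records = [
    {"name": "John Smith", "address": "12 Main Street"},
    {"name": "Jon Smith", "address": "12 Main Street"},   # keying error
    {"name": "Mary Jones", "address": "4 Oak Avenue"},
]
print([r["name"] for r in deduplicate(records)])  # ['John Smith', 'Mary Jones']
```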
Merge/Purge
Merging data refers to the process of integrating data from multiple sources.
When combining the data from two or more separate lists into
a single one, certain records often appear in more than one list. The
process of removing the repeated records is referred to as purging the
records.
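A minimal Python sketch of the merge and purge steps follows. It assumes each record is a dictionary with hypothetical `name` and `address` fields, and it purges only records that are identical after simple normalization; catching spelling variations would need a fuzzy comparison, which is outside this example.

```python
import re


def record_key(record: dict) -> tuple[str, str]:
    """Normalize name and address so that case and punctuation
    differences do not hide a repeated record."""
    def norm(value: str) -> str:
        return " ".join(re.findall(r"[a-z0-9]+", value.lower()))
    return norm(record["name"]), norm(record["address"])


def merge_purge(list_a: list[dict], list_b: list[dict]) -> list[dict]:
    """Merge two lists, keeping the first occurrence of each record."""
    merged: dict[tuple[str, str], dict] = {}
    for record in [*list_a, *list_b]:
        merged.setdefault(record_key(record), record)
    return list(merged.values())


list_a = [
    {"name": "John Smith", "address": "12 Main St."},
    {"name": "Mary Jones", "address": "4 Oak Ave"},
]
list_b = [
    {"name": "JOHN SMITH", "address": "12 Main St"},   # repeats a record in list_a
    {"name": "Ann Lee", "address": "9 Elm Rd"},
]
print([r["name"] for r in merge_purge(list_a, list_b)])
# ['John Smith', 'Mary Jones', 'Ann Lee']
```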
Whether we are talking about redundant data,
wrong data, missing data or miscoded data, every company
has some of each, probably
residing in several different departments. Companies often
focus on the business process and not on the form and congruity
of the resulting data.
MerlinMerge® SpeedPro provides an intuitive interface
designed specifically for merging two separate
lists into a single one.
Household Determination
A variation of deduplication is called household determination.
Household determination involves combining records for
different individuals living at the same address into a single
entry. For example, records for John Smith, Jane Smith and Junior Smith
containing the same address information become a single record for
“The Smith Household”.
The same can be done with people working in the same company.
A householding merge/purge considers last name and address, including
variations in the name and address. Records with different last names
and the same address are not considered a match.
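The householding rule described above can be sketched in Python as a grouping on last name plus address. The field names and helper functions are assumptions for the example, the last name is taken to be the final word of a non-empty name, and only exact matches after normalization are grouped; handling spelling variations would need a fuzzy comparison.

```python
import re
from collections import defaultdict


def household_key(record: dict) -> tuple[str, str]:
    """Key on last name and normalized address (simplifying assumptions)."""
    last_name = record["name"].split()[-1].lower()
    address = " ".join(re.findall(r"[a-z0-9]+", record["address"].lower()))
    return last_name, address


def households(records: list[dict]) -> list[dict]:
    """Combine individuals with the same last name and address into one entry."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for record in records:
        groups[household_key(record)].append(record["name"])
    return [
        {"household": f"The {last.title()} Household", "members": members}
        for (last, _address), members in groups.items()
    ]


records = [
    {"name": "John Smith", "address": "10 Elm St"},
    {"name": "Jane Smith", "address": "10 Elm St."},
    {"name": "Junior Smith", "address": "10 Elm St"},
    {"name": "Ann Lee", "address": "10 Elm St"},   # same address, different last name
]
for entry in households(records):
    print(entry["household"], entry["members"])
# The Smith Household ['John Smith', 'Jane Smith', 'Junior Smith']
# The Lee Household ['Ann Lee']
```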
MerlinMerge® SpeedPro is an excellent tool for
performing household determination.
Data Quality
Poor-quality data in information systems is a widespread problem
in industry, government and academia. Poor data quality
has a negative impact on the competitiveness of an organization
and can cause many other problems, especially when working
on bigger projects.
The MerlinMerge® SpeedPro software reduces the cost of
doing business by improving the accuracy and usability
of your data.
Performance
Computer technology always faces a trade-off between speed and
accuracy. This is especially relevant when dealing with large data
sets. Tasks such as comparing two strings take a lot of processing
time. In addition, reading through millions of records is very
I/O intensive. The more processing time spent, the better the results,
but sometimes fast performance is more important.
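The cost side of this trade-off can be put in numbers with a short Python sketch. Comparing every record with every other record takes n(n-1)/2 comparisons, while comparing only records that share a key takes far fewer, at the risk of missing true matches whose keys differ. The function names and the sample keys are assumptions for the example and say nothing about how the product itself limits comparisons.

```python
from collections import Counter
from math import comb


def all_pairs(n: int) -> int:
    """Comparisons needed to compare every record with every other record."""
    return comb(n, 2)


def blocked_pairs(keys: list[str]) -> int:
    """Comparisons needed when only records sharing a key are compared."""
    return sum(comb(size, 2) for size in Counter(keys).values())


print(all_pairs(1_000_000))   # 499999500000

keys = ["smt", "smt", "jns", "jns", "jns", "l"]
print(all_pairs(len(keys)))   # 15
print(blocked_pairs(keys))    # 4
```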
MerlinMerge® SpeedPro does a great job of finding the right balance
between speed and accuracy. Through many years of experience,
testing and optimization, our technology experts have fine-tuned
MerlinMerge® SpeedPro to achieve a very high degree of accuracy
while preserving fast performance. Additionally, you are able
to change the matching criteria to achieve a balance
of speed and accuracy that is custom fit to your organization.