LinkMatchLite frequently asked questions
What format does the input data need to be?
The current version requires a tab-delimited text file, one line per entry, with the full name in the first field. The program will assume that any words before the first tab constitute the full name. Anything after the optional first tab is amalgamated into one 'context' or 'details' field.
Optionally, if your data matches the columns of the gazetted list you are checking against, you can choose a custom mode where context is checked only against context in the corresponding column. This allows you to match DOB against DOB, address against address, etc., but your data's columns must match the gazette.
Optionally, the last column of your data can include a 'checked to date'. I am anticipating the gazetted lists will soon incorporate an 'amendment date' and have added support for this. This will allow you to mark your clients as checked up to an amendment date. When a new list appears, subsequent testing will only reveal new possible matches, not those you have previously checked and dismissed. This date field needs to be the last field in the data line.
The DFAT list now includes this as a 'Control Date' and initially these will all be set to 05/10/2004.
The UK list includes a 'Listed on' date and this can serve the same purpose.
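The 'checked to date' idea can be illustrated with a short Python sketch. This is not LinkMatchLite's own code: the function names `parse_date` and `needs_review` are hypothetical, and the day/month/year reading of dates such as 05/10/2004 is an assumption.

```python
from datetime import datetime

def parse_date(text):
    # Assumption: dates are written day/month/year, e.g. 05/10/2004.
    return datetime.strptime(text, "%d/%m/%Y").date()

def needs_review(checked_to, control_date):
    """Return True if a list entry still has to be compared with this client.

    checked_to   -- the client's 'checked to date', or None if never checked
    control_date -- the list entry's amendment ('Control' / 'Listed on') date
    """
    if checked_to is None:
        return True
    # Only entries amended after the last check can produce new matches.
    return control_date > checked_to
```

Under these assumptions, a client checked to 05/10/2004 would be compared only with entries whose control date is later than that.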
John Jacob Smith<tab>32 Flower St Redfern<tab>23 June 1934<tab>P76534673
John Jacob Smith<tab>32 Flower St Redfern DOB 23 June 1934 Passport P76534673
John Jacob Smith<tab>32 Flower St Redfern, 23 June 1934, P76534673
Smith, John Jacob<tab>32 Flower St Redfern, 23 June 1934, P76534673
John Jacob Smith<tab>
John Jacob Smith
are all OK
John<tab>Jacob<tab>Smith<tab>32 Flower St Redfern<tab>23 June 1934<tab>P76534673
Smith<tab>John<tab>Jacob<tab>32 Flower St Redfern 23 June 1934 P76534673
will not work.
Punctuation such as commas is stripped as the data file is read.
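The layout rules above can be sketched in Python. This is only an illustration of the described behaviour, not the program's actual parser: the function name `split_entry` is hypothetical, and the exact set of punctuation characters removed is an assumption.

```python
# Assumption: only these characters are stripped; the real set may differ.
PUNCTUATION = ",.;"

def split_entry(line):
    """Split one data line into (full name, context)."""
    # Everything before the first tab is treated as the full name.
    name, _, rest = line.rstrip("\r\n").partition("\t")
    # Any remaining tab-separated fields are merged into one context string.
    context = " ".join(field for field in rest.split("\t") if field)
    # Punctuation such as commas is removed as the line is read.
    strip = str.maketrans("", "", PUNCTUATION)
    return name.translate(strip).strip(), context.translate(strip).strip()
```

With this sketch, a line whose name elements are separated by tabs yields only the first element ("John" or "Smith") as the name, which is why the last two examples above will not work.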
A recent addition is the ability to search for any string as a regular expression.
A regular expression can be as simple as ‘SMITH’ or as complex as regular expression syntax allows (an internet search will give help on the syntax).
If you wish to search for any entry that contains ‘SMITH’ anywhere in the name field, put that in the name search text box. The software will first use its name searching algorithm, but this won’t find any matches because it was designed to match full account names such as those held by banks.
When no matches are found, it will then scan all the selected lists for ‘SMITH’ as a substring.
If a match is found with the usual method, this stage is skipped.
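The two-stage search described here might look like the following Python sketch. The function `search_lists` and the `name_matcher` callback are hypothetical stand-ins for the program's own name algorithm, and case-insensitive matching is an assumption.

```python
import re

def search_lists(query, entries, name_matcher):
    """Try the usual name matcher first; fall back to a regex scan.

    name_matcher(query, entry) is a placeholder for the normal
    name searching algorithm and returns True on a match.
    """
    matches = [entry for entry in entries if name_matcher(query, entry)]
    if matches:
        # A match by the usual method means the regex stage is skipped.
        return matches
    # Assumption: the regex scan ignores case.
    pattern = re.compile(query, re.IGNORECASE)
    return [entry for entry in entries if pattern.search(entry)]
```

Because `pattern.search` looks anywhere in the entry, a plain query such as SMITH behaves as a substring search, while a query such as ^MARY uses regular expression syntax.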
What else do I need? Is there anything else I need to download?
The software program 'LinkMatchLite', when released, contains the latest DFAT and RBA lists built in. It also contains the latest UK/UN/EU and US lists at the time of the DFAT list update.
The dates of the internal lists are specified on the main screen.
There are two text files (explained in the manual) that will help the software perform best. These are 'equivalent.txt' and 'reject.txt'.
There is also an initialisation file 'LinkMatchLite.ini' that ensures the program runs with your preferences.
Files such as 'DFAT.txt' and 'UNSCS.txt' are not required by LinkMatchLite.
For correct operation, your application directory should contain the following:
What about updating other lists?
'LinkMatchLite', when released, contains the latest DFAT and RBA lists and others as above. I don’t release a new version with changes to any list – only with changes to the DFAT list.
At any time, you can download sdall.zip (for the US Treasury OFAC list) or sanctionsconlist.txt (for the UN UK EU list) and convert these to the format suitable for LinkMatchLite. There is a Linux script ‘pro_lists’ to do this. It creates a number of product files, and amongst these will be the text files suitable for use as ‘external reference files’. Linux scripts can be run under Windows using ‘cygwin’ or ‘gnuish’ utilities.
Please see the manual for setting up external lists and if this doesn’t help, ask for support.
What are the best settings?
This depends on how you will use the program.
If you are searching a single name, you can afford to scan by eye perhaps even a hundred possibilities to be absolutely sure there is no match. In this case the settings can be lowered from the defaults.
If you are searching a large batch of names, it is impractical to scan more than the closer matches. In this case, the settings may need to be raised.
This also depends on the quality of your data (and the gazetted lists).
Data that contains initials rather than full names and has few details will generate more false matches. Batch testing will only be practical if the result set is manageable or is efficiently sorted.
The program will sort the results from highest to lowest so the best approach is to use lower settings and scan from the top downwards until the chance of a true match is insignificant.
Also, fixing the program settings would allow a person with access to the program to know how to obfuscate their name enough to be missed. It is better that such a case appears down in the result set rather than not at all.
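The effect of a lower setting combined with sorted results can be shown with a small Python sketch. The function `rank_candidates` and the sample scores are invented for illustration and do not reflect the program's real scoring.

```python
def rank_candidates(scored, threshold):
    """Keep candidates scoring at or above the threshold, best first.

    scored is a list of (name, score) pairs; both are illustrative.
    """
    kept = [(name, score) for name, score in scored if score >= threshold]
    # Highest scores first, so the reviewer scans from the top downwards.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

With a threshold of 60, a weak candidate scoring 40 is dropped entirely; with a threshold of 30 it still appears, just at the bottom of the list.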
What is the current version?
The current version is 3.16.177. The first number is the major version, the second is for minor program changes and the third changes with the inbuilt lists.
Make sure you are using the most current version as it will contain the most recent DFAT, RBA, OFAC and Bank of England (UK-UN-EU) lists.
If your version is out of date, you can add external entries temporarily. See the manual for details.
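If you need to compare version numbers in a script, the three-part scheme described above can be handled as in this Python sketch. The names `Version` and `parse_version` are hypothetical and not part of LinkMatchLite.

```python
from typing import NamedTuple

class Version(NamedTuple):
    major: int   # major version
    minor: int   # minor program changes
    lists: int   # changes with the inbuilt lists

def parse_version(text):
    """Parse a string such as '3.16.177' into a comparable Version."""
    major, minor, lists = (int(part) for part in text.split("."))
    return Version(major, minor, lists)
```

Because a named tuple compares field by field, `parse_version("3.16.177") < parse_version("3.16.180")` is true: same program, newer inbuilt lists.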
One client matches many aliases. Is there a way around this?
The gazetted lists contain many entries with aliases. Some of these are phonetic matches so the program may match a single client with several aliases when only one would be necessary.
Unfortunately many aliases also have different details such as address. Rejecting all but the first alias match (which, for example, is subsequently discounted due to address) may miss another alias (with a closer matching address).
This dilemma will be further explored but at the moment, all possible solutions have drawbacks.
Why did it stop with an error?
Earlier versions were susceptible to an unforeseen condition.
If a name was checked and there were tab characters between the name elements and the first element was on the reject list (DE, DEL, THE, etc.), a program exception resulted.
This has been the most common problem with running the program.
Later versions trap this error to keep the program running. They also log the situation as a data quality warning and calculate the average number of elements per name to warn if there is a general problem with the data.
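The kind of check described here can be sketched in Python. This is an assumption about the general approach, not the program's code: `check_names` is a hypothetical name, the reject set is a small sample rather than the contents of 'reject.txt', and the way the average is calculated is a guess.

```python
# Sample only; the real reject list is read from 'reject.txt'.
REJECT = {"DE", "DEL", "THE"}

def check_names(lines):
    """Return (warnings, average usable name elements per line)."""
    warnings = []
    element_counts = []
    for number, line in enumerate(lines, start=1):
        # Only the text before the first tab is treated as the name.
        name = line.split("\t", 1)[0]
        elements = [w for w in name.upper().split() if w not in REJECT]
        element_counts.append(len(elements))
        if not elements:
            # Record a data quality warning instead of raising an error.
            warnings.append(f"line {number}: no usable name elements")
    if element_counts:
        average = sum(element_counts) / len(element_counts)
    else:
        average = 0.0
    return warnings, average
```

An average close to one element per name would suggest that the name parts have been separated by tabs throughout the file.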
Where do I get support?
Direct any requests for support (or suggestions) to firstname.lastname@example.org.
Include the program name "LinkMatchLite" in the subject line to facilitate redirection to program support.
What do I report?
The default settings in the software are only for eliminating the bulk of non-matches. The remaining subset is presented in decreasing order of similarity for you to scan through. I can't advise at what point you can consider a cutoff or what to report.
I strongly recommend making contact with the AFP before embarking on name checking and reporting. The name of the current contact officer can be obtained from the DFAT legal branch. The AFP and DFAT can advise on your responsibilities under the regulations and also outline the reporting process.
What are the question marks in some name lists?
The latest DFAT list includes the original Arabic script version of some entity names. As the software doesn't support these scripts, they are translated to question marks.
There should always be an Anglicised version for searching.
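Question marks of this kind typically appear when text is forced into a character set that cannot represent it. The Python sketch below shows the general effect; it illustrates a common cause and is not a description of how LinkMatchLite itself converts the lists. The function name `to_ascii` is hypothetical.

```python
def to_ascii(text):
    """Replace every character outside ASCII with a question mark."""
    # errors="replace" substitutes "?" for each character that cannot be encoded.
    return text.encode("ascii", errors="replace").decode("ascii")
```

Here a four-letter Arabic name followed by its Anglicised form in brackets would come out as four question marks followed by the unchanged Anglicised text.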
Why am I only getting scores of 30 for exact matches?
The default settings in the software show the composite score - reference name against sample client plus sample client against reference - up to 100% in both directions for an exact match. If the scoring method is changed in the options, this will show a different range. For name element matching, an exact match could be 30 (for a two element name).
Select the scoring method that best suits your needs.
Why is almost every button disabled?
The option to "Listen to TCP requests" turns LinkMatchLite into a server. In this state, it won't allow any use from the main form, but will listen for requests from a web browser or custom interface. If this wasn't what you wanted to do, uncheck File | Listen to TCP requests.
If only some options are disabled, it's because you only have read access to the configuration file LinkMatchLite.ini. This is how an administrator can permanently set all options.