ACQ132
From ICISWiki
Checking Duplicates - Edit-distance algorithm
Overview
This user interface uses a popular algorithm, the Levenshtein edit-distance algorithm. It measures the similarity of two strings by the number of deletions, insertions, or substitutions needed to turn one string into the other. For example, two names of the same name type that belong to different Germplasm IDs can be checked against each other to obtain a numeric value showing how close or how different the strings are.
"Levenshtein distance (LD) is a measure of the similarity between two strings, which we will refer to as the source string (s) and the target string (t). The distance is the number of deletions, insertions, or substitutions required to transform s into t. For example,
- If s is "test" and t is "test", then LD(s,t) = 0, because no transformations are needed. The strings are already identical.
- If s is "test" and t is "tent", then LD(s,t) = 1, because one substitution (change "s" to "n") is sufficient to transform s into t.
The greater the Levenshtein distance, the more different the strings are." [1]
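The definition quoted above can be illustrated with a minimal Python sketch. It is a textbook dynamic-programming formulation of Levenshtein distance, not the code used by GRIMS; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(s: str, t: str) -> int:
    """Return the number of deletions, insertions, or substitutions
    required to transform the source string s into the target string t."""
    # previous[j] is the distance between the prefix of s processed so far
    # (initially empty) and the first j characters of t.
    previous = list(range(len(t) + 1))
    for i, s_char in enumerate(s, start=1):
        # Turning the first i characters of s into an empty string
        # takes i deletions.
        current = [i]
        for j, t_char in enumerate(t, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


print(levenshtein("test", "test"))  # 0
print(levenshtein("test", "tent"))  # 1
```

The two calls reproduce the examples in the quotation: identical strings give a distance of 0, and a single substitution gives a distance of 1.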
To use this UI efficiently, the user should set the maximum number of allowable differences in the Max Difference combobox.
User interface form
Interactive user interface elements
icon | action |
---|---|
Query record | |
Clear function | |
Save function | |
Previous record | |
Next record | |
Help function | |
Hot keys display | |
Close form | |
User input fields
label | description |
---|---|
Find | Name (value) to be checked with the edit-distance algorithm |
Max Difference | Maximum number of allowable differences between two name values |
Name Type | Search only within the name type specified by the user (e.g. Variety Name, Collector's Number, Donor Name) |
Compare To: | Status of germplasm to be selected for data analysis |
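How the Find, Max Difference, and Name Type fields could combine in a duplicate check is sketched below in Python. This is only an assumption about the general approach, not the GRIMS query itself: the `NameRecord` structure, the `find_similar` function, and the sample names and Germplasm IDs are all hypothetical, and the Compare To status filter is left out because its values are not described here.

```python
from typing import List, NamedTuple, Tuple


class NameRecord(NamedTuple):
    """Hypothetical stand-in for one stored germplasm name."""
    gid: int         # Germplasm ID (made-up values below)
    name_type: str   # e.g. "Variety Name", "Donor Name"
    value: str       # the name itself


def edit_distance(s: str, t: str) -> int:
    """Levenshtein distance between s and t (row-by-row dynamic programming)."""
    previous = list(range(len(t) + 1))
    for i, s_char in enumerate(s, start=1):
        current = [i]
        for j, t_char in enumerate(t, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def find_similar(find: str, records: List[NameRecord],
                 max_difference: int, name_type: str) -> List[Tuple[int, NameRecord]]:
    """Return (distance, record) pairs of the given name type whose value
    is within max_difference edits of `find`, closest first."""
    matches = []
    for record in records:
        if record.name_type != name_type:
            continue  # "Name Type": search only within the chosen type
        distance = edit_distance(find, record.value)
        if distance <= max_difference:  # "Max Difference" threshold
            matches.append((distance, record))
    return sorted(matches, key=lambda pair: pair[0])


# Made-up sample data for illustration only.
records = [
    NameRecord(1, "Variety Name", "Alpha 1"),
    NameRecord(2, "Variety Name", "Alpha1"),
    NameRecord(3, "Variety Name", "Alpha 22"),
    NameRecord(4, "Donor Name", "Alpha 1"),
]

for distance, record in find_similar("Alpha 1", records, 1, "Variety Name"):
    print(distance, record.gid, record.value)
```

With a Max Difference of 1, the sample search returns the exact match and the name that differs only by a missing space; the name that needs two edits and the name of a different type are excluded. A smaller Max Difference keeps the result list short, which is why setting it matters for using the form efficiently.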