Data Validation Tool 5.4
This version: 5.4.1 (ICIS Workshop 2007 Release)
See Version 5.4.2
Introduction
The ICIS Data Validation Tool is an application that searches an ICIS database for data errors that could render the data meaningless. It helps ensure that published data are of consistently high quality.
It is simple to use: choose the tests you want to execute and click the "Run" button.
The ICIS Data Validation Tool (version 5.3; unofficially released) was developed using Visual Basic 6. Version 5.4 was developed using Delphi.
ICIS - GMS (i) Queries
Checks the Genealogy Management System (GMS) central database (applicable to all ICIS implementations).
Invalid parent references [2 checks]
(from Version 5.3)
Error messages:
Error-0001: Unknown group source
Error-0002: Germplasm with non-generative group source
Circular references [3 checks]
(from Version 5.3)
If germplasm A has germplasm B as one of its parents and germplasm B has germplasm A as one of its parents, then we have a circular reference. This option also checks for second- and third-level circularity.
Error messages:
A references B and B references A:
Error-0003: 1st level circular reference
A references B and B references C and C references A:
Error-0004: 2nd level circular reference
A references B and B references C and C references D and D references A:
Error-0005: 3rd level circular reference
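The three levels above can be sketched as a bounded walk up the pedigree. This is only an illustration, not the tool's actual query: the `parents` mapping (GID to a tuple of known, non-zero parent GIDs) and the function name `circular_level` are assumptions made for the example.

```python
ERROR_BY_LEVEL = {1: "Error-0003", 2: "Error-0004", 3: "Error-0005"}

def circular_level(parents, gid, max_level=3):
    """Return the level (1..max_level) at which gid reappears among its
    own ancestors, or None if it does not reappear within max_level."""
    frontier = set(parents.get(gid, ()))
    for level in range(1, max_level + 1):
        # Move one generation further up the pedigree.
        frontier = {p for g in frontier for p in parents.get(g, ())}
        if gid in frontier:
            return level
    return None

# Hypothetical pedigree: GID -> tuple of known (non-zero) parent GIDs.
parents = {
    1: (2,), 2: (1,),                  # A references B, B references A
    10: (11,), 11: (12,), 12: (10,),   # A -> B -> C -> A
    20: (21, 22), 21: (), 22: (),      # no circularity
}

for gid in (1, 10, 20):
    level = circular_level(parents, gid)
    print(gid, ERROR_BY_LEVEL.get(level, "no circular reference found"))
```

Note that in this sketch a germplasm that lists itself as its own parent would also be reported at level 1.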
Invalid Method
(from Version 5.3)
Error message:
Error-0006: Invalid germplasm methods
Deleted parent references
(from Version 5.3)
Suppose germplasm A has been replaced with germplasm B, as recorded in the GERMPLSM.GRPLCE database column. All germplasm that reference germplasm A should be corrected to reference germplasm B.
Error message:
Error-0007: Germplasm with deleted parent references
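A minimal sketch of this check follows. It assumes that a non-zero GRPLCE value marks a replaced germplasm and holds the replacing GID, and that parents are stored in GPID1 and GPID2; the row layout and the function name are hypothetical.

```python
def deleted_parent_refs(rows):
    """rows: list of dicts with keys GID, GPID1, GPID2, GRPLCE.
    Returns (gid, column, old_parent, replacement) for every parent
    reference that still points at a replaced germplasm."""
    # Assumption: GRPLCE != 0 means "replaced by the GID in GRPLCE".
    replaced = {r["GID"]: r["GRPLCE"] for r in rows if r["GRPLCE"] != 0}
    findings = []
    for r in rows:
        for col in ("GPID1", "GPID2"):
            if r[col] in replaced:
                findings.append((r["GID"], col, r[col], replaced[r[col]]))
    return findings

# Hypothetical data: germplasm 1 (A) was replaced by germplasm 2 (B),
# but germplasm 3 still references A as a parent.
rows = [
    {"GID": 1, "GPID1": 0, "GPID2": 0, "GRPLCE": 2},
    {"GID": 2, "GPID1": 0, "GPID2": 0, "GRPLCE": 0},
    {"GID": 3, "GPID1": 1, "GPID2": 0, "GRPLCE": 0},
]
print(deleted_parent_refs(rows))
```

Each finding names the replacement GID, so the correction described above can be applied directly.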
Foreign Key References [4 checks]
(from Version 5.3)
The ICIS database has no foreign key constraints enabled, so violations of the foreign key definitions must be checked for explicitly.
Error messages:
Error-0008: Invalid LOCATION.LOCID references
Error-0009: Invalid GERMPLASM.GID references
Error-0010: Invalid METHODS.MID references
Error-0011: Invalid BIBREFS.REFID references
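One way to look for such orphaned references is an outer join against the referenced table. The sketch below uses an in-memory SQLite database purely for illustration. It assumes that GERMPLSM.GLOCN is one of the columns referencing LOCATION.LOCID and that 0 means "unknown"; the tool's actual queries and the full list of referencing columns are not given here.

```python
import sqlite3

def invalid_location_refs(conn):
    """Return (GID, GLOCN) rows whose GLOCN has no matching LOCATION.LOCID."""
    sql = """
        SELECT g.GID, g.GLOCN
        FROM GERMPLSM AS g
        LEFT JOIN LOCATION AS l ON l.LOCID = g.GLOCN
        WHERE g.GLOCN <> 0        -- assumption: 0 means 'unknown', not a reference
          AND l.LOCID IS NULL     -- no matching location row
        ORDER BY g.GID
    """
    return conn.execute(sql).fetchall()

# Hypothetical, heavily reduced tables for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LOCATION (LOCID INTEGER PRIMARY KEY);
    CREATE TABLE GERMPLSM (GID INTEGER PRIMARY KEY, GLOCN INTEGER);
    INSERT INTO LOCATION VALUES (10);
    INSERT INTO GERMPLSM VALUES (1, 10);
    INSERT INTO GERMPLSM VALUES (2, 99);
    INSERT INTO GERMPLSM VALUES (3, 0);
""")
print(invalid_location_refs(conn))  # [(2, 99)]
```

The same join pattern would apply to the GID, MID and REFID references by swapping the table and column names.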
ICIS - GMS (ii) Queries
Checks the Genealogy Management System (GMS) central database (applicable to all ICIS implementations).
Progenitor germplasm dates [3 checks]
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
The GDATE of a GID must not predate the GDATE of any of its progenitors. Beware of missing links in the chain of dates: if the test checks only the GDATE of the immediate progenitors (GPID1, GPID2 and MGID), it will miss the case where a non-zero GPID1, GPID2 or MGID has GDATE=0 but that progenitor's own GPID1, GPID2 or MGID is younger than the target GID. Therefore, whenever a GPID1, GPID2 or MGID has GDATE=0, the check should continue with that progenitor's GPID1, GPID2 and MGID.
Error messages:
Error-0012: Germplasm with germplasm date (GDATE) earlier than GDATE of GPID1
Error-0013: Germplasm with germplasm date (GDATE) earlier than GDATE of GPID2
Error-0014: Germplasm with germplasm date (GDATE) earlier than GDATE of MGID
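The iteration through undated progenitors can be sketched as follows. The sketch assumes GDATE is stored as an integer that sorts chronologically (for example YYYYMMDD) with 0 meaning "unknown", and that a target with GDATE=0 is skipped; the dictionary layout and function names are hypothetical.

```python
ERROR_CODES = {"GPID1": "Error-0012", "GPID2": "Error-0013", "MGID": "Error-0014"}

def effective_dates(germplasm, gid, seen):
    """Yield the known GDATEs reachable from gid, looking through
    ancestors whose own GDATE is 0 (unknown)."""
    if gid == 0 or gid in seen or gid not in germplasm:
        return
    seen.add(gid)  # guards against circular references
    rec = germplasm[gid]
    if rec["GDATE"] != 0:
        yield rec["GDATE"]
        return
    for col in ("GPID1", "GPID2", "MGID"):
        yield from effective_dates(germplasm, rec[col], seen)

def date_errors(germplasm, gid):
    """Return the error codes raised for gid by the progenitor date check."""
    rec = germplasm[gid]
    if rec["GDATE"] == 0:
        return []  # assumption: nothing to compare against
    errors = []
    for col, code in ERROR_CODES.items():
        dates = effective_dates(germplasm, rec[col], {gid})
        if any(rec["GDATE"] < d for d in dates):
            errors.append(code)
    return errors

# Hypothetical chain: 1 (dated 2000) <- 2 (undated) <- 3 (dated 2005).
germplasm = {
    1: {"GDATE": 20000101, "GPID1": 2, "GPID2": 0, "MGID": 0},
    2: {"GDATE": 0,        "GPID1": 3, "GPID2": 0, "MGID": 0},
    3: {"GDATE": 20050101, "GPID1": 0, "GPID2": 0, "MGID": 0},
}
print(date_errors(germplasm, 1))  # ['Error-0012']
```

A check that looked only at germplasm 2 would find GDATE=0 and report nothing; following the chain to germplasm 3 exposes the inconsistency.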
Progenitor ID1 unknown, Progenitor ID2 known
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
A GID can’t have GPID2>0 and GPID1=0
Error message:
Error-0015: Germplasm with Progenitor ID1 unknown ( GPID1=0 ), Progenitor ID2 known ( GPID2 > 0 )
Name inheritance from GPID2: check NDATE and NLOCN [2 checks]
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
If a GID inherits a name from its source (GPID2) then that name record must also inherit the name date (NDATE) and name location (NLOCN)
Error messages:
Error-0016: Germplasm with inherited name from GPID2 but NDATE not inherited
Error-0017: Germplasm with inherited name from GPID2 but NLOCN not inherited
IRIS - GMS Queries
Checks the Genealogy Management System (GMS) central database (applicable to International Rice Information System (IRIS)).
Preferred Names [2 checks]
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
No more than one name of a GID may have NSTAT=1 (preferred English name)
Error-0018: Germplasm with more than one preferred English name (NSTAT=1)
Names eligible to be the preferred name are: CRSNM, RELNM, DRVNM, CVNAM, ELITE
Error-0019: Germplasm with invalid name type as preferred name
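Both preferred-name checks can be sketched over a flat list of name records. The tuple layout is an assumption, as is the idea that name types have already been resolved to their codes (such as CRSNM) before the check runs.

```python
from collections import Counter

ELIGIBLE_PREFERRED = {"CRSNM", "RELNM", "DRVNM", "CVNAM", "ELITE"}

def preferred_name_errors(names):
    """names: iterable of (gid, name_type_code, nstat) tuples.
    Returns a sorted list of (gid, error_code) pairs."""
    preferred = [(gid, ntype) for gid, ntype, nstat in names if nstat == 1]
    counts = Counter(gid for gid, _ in preferred)
    errors = set()
    for gid, ntype in preferred:
        if counts[gid] > 1:
            errors.add((gid, "Error-0018"))   # more than one NSTAT=1 name
        if ntype not in ELIGIBLE_PREFERRED:
            errors.add((gid, "Error-0019"))   # ineligible name type
    return sorted(errors)

# Hypothetical name records.
names = [
    (1, "CRSNM", 1), (1, "RELNM", 1),   # two preferred names
    (2, "ACCNO", 1),                    # ineligible type
    (3, "CVNAM", 1), (3, "ACCNO", 0),   # fine
]
print(preferred_name_errors(names))
```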
Preferred IDs [3 checks]
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
No more than one name of a GID can have a "preferred ID" status (NSTAT =8)
Error-0020: Germplasm with more than one preferred ID (NSTAT=8)
The following name types are eligible to be preferred ID: DRVNM, COLNO, ACCNO (if present), GACC (if present), ITEST (if present), CIATGB (if present)
Error-0021: Germplasm with invalid name type as preferred ID
The preferred ID must be unique for the given name type; that is, two accessions must not share the same name of the same type if one or both of those names are preferred IDs.
Error-0022: Germplasm with preferred ID not unique for the given name type
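The uniqueness rule (Error-0022) is the least obvious of the three, so here is a hedged sketch of it. It groups name records by name type and name value and flags preferred IDs that are shared with another GID. The record layout, the use of NVAL for the name value, and the choice to report only the GIDs holding the preferred ID are all assumptions.

```python
from collections import defaultdict

def non_unique_preferred_ids(names):
    """names: list of dicts with keys GID, NTYPE, NVAL, NSTAT.
    Returns the sorted GIDs whose preferred ID (NSTAT=8) is also used,
    with the same name type, by a different GID."""
    groups = defaultdict(list)
    for n in names:
        groups[(n["NTYPE"], n["NVAL"])].append(n)
    flagged = set()
    for group in groups.values():
        gids = {n["GID"] for n in group}
        if len(gids) > 1:
            # Shared name: a problem if any holder uses it as preferred ID.
            flagged.update(n["GID"] for n in group if n["NSTAT"] == 8)
    return sorted(flagged)

# Hypothetical records: GIDs 1 and 2 share "X-100" as ACCNO.
names = [
    {"GID": 1, "NTYPE": "ACCNO", "NVAL": "X-100", "NSTAT": 8},
    {"GID": 2, "NTYPE": "ACCNO", "NVAL": "X-100", "NSTAT": 0},
    {"GID": 3, "NTYPE": "ACCNO", "NVAL": "X-200", "NSTAT": 8},
]
print(non_unique_preferred_ids(names))  # [1]
```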
Location (GLOCN) of germplasm
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
For all germplasm whose creation method is not Method ID 62 (Import), the germplasm location (GLOCN) of a GID should be the same as the GLOCN of its GPID2
Error-0023: Germplasm with GLOCN different from GLOCN of GPID2
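As a rough sketch, this rule reduces to a lookup of the GPID2 record. The column name METHN for the creation method, the dictionary layout, and the decision to skip germplasm whose GPID2 is 0 or missing are assumptions for illustration.

```python
IMPORT_METHOD_ID = 62  # Method ID 62 (Import), exempt from the check

def glocn_mismatches(germplasm):
    """germplasm: dict GID -> dict with keys METHN, GPID2, GLOCN.
    Returns the sorted GIDs that would raise Error-0023."""
    flagged = []
    for gid, rec in sorted(germplasm.items()):
        source = germplasm.get(rec["GPID2"])
        if rec["METHN"] == IMPORT_METHOD_ID or source is None:
            continue  # imported, or no known GPID2 to compare with
        if rec["GLOCN"] != source["GLOCN"]:
            flagged.append(gid)
    return flagged

# Hypothetical records; method ID 31 is an arbitrary non-import value.
germplasm = {
    1: {"METHN": 31, "GPID2": 0, "GLOCN": 100},
    2: {"METHN": 31, "GPID2": 1, "GLOCN": 100},   # same location: fine
    3: {"METHN": 31, "GPID2": 1, "GLOCN": 200},   # different: flagged
    4: {"METHN": 62, "GPID2": 1, "GLOCN": 200},   # import: exempt
}
print(glocn_mismatches(germplasm))  # [3]
```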
Method - name type combinations [3 checks]
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
Only certain combinations of name type and germplasm creation method are acceptable. Namely:
Method type GEN: valid name types are: CRSNM, UNCRS, UNRES
Error-0024: Invalid method-name type combination for method GEN
Method type DER: valid name types are: RELNM, DRVNM, CVNAM, CVABR, NTEST, LNAME, ADVNM, ACVNM, AABBR, OLDMUT1, OLDMUT2, ELITE, UNRES
Error-0025: Invalid method-name type combination for method DER
Method type MAN: valid name types are: ACCNO, RELNM, CVNAM, CVABR, COLNO, FACCN, ITEST, NTEST, LNAME, TACC, ADVNM, ACVNM, ELITE, GACC, DACCN, LCNAM, CIATGB
Error-0026: Invalid method-name type combination for method MAN
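The three lists above amount to a lookup table. The sketch below encodes them as sets and returns the matching error code. The function name is hypothetical, and it assumes method types and name types are available as their codes.

```python
VALID_NAME_TYPES = {
    "GEN": {"CRSNM", "UNCRS", "UNRES"},
    "DER": {"RELNM", "DRVNM", "CVNAM", "CVABR", "NTEST", "LNAME", "ADVNM",
            "ACVNM", "AABBR", "OLDMUT1", "OLDMUT2", "ELITE", "UNRES"},
    "MAN": {"ACCNO", "RELNM", "CVNAM", "CVABR", "COLNO", "FACCN", "ITEST",
            "NTEST", "LNAME", "TACC", "ADVNM", "ACVNM", "ELITE", "GACC",
            "DACCN", "LCNAM", "CIATGB"},
}
ERROR_CODES = {"GEN": "Error-0024", "DER": "Error-0025", "MAN": "Error-0026"}

def combination_error(method_type, name_type):
    """Return the error code for an invalid method/name-type pair,
    or None if the pair is acceptable (or the method type is not listed)."""
    valid = VALID_NAME_TYPES.get(method_type)
    if valid is None or name_type in valid:
        return None
    return ERROR_CODES[method_type]

print(combination_error("GEN", "CRSNM"))  # None
print(combination_error("GEN", "RELNM"))  # Error-0024
print(combination_error("MAN", "CRSNM"))  # Error-0026
```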
Name type occurrence [3 checks]
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
Some name types must not occur more than once for a single GID. These are ACCNO, CRSNM, UNCRS, COLNO, ITEST, GACC, CIATGB, RELNM, DRVNM, CVNAM
Error-0027: Germplasm with certain name types occurring more than once
Cross-names (CRSNM) and Line-names (LNAME) cannot occur together
Error-0028: Germplasm with both CRSNM and LNAME name types
Release names (RELNM) and Collector's Numbers (COLNO) cannot occur together
Error-0029: Germplasm with both RELNM and COLNO name types
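For a single GID, the three occurrence checks can be sketched by counting its name types. The input (a list of name-type codes belonging to one GID) and the function name are assumptions.

```python
from collections import Counter

ONCE_ONLY = {"ACCNO", "CRSNM", "UNCRS", "COLNO", "ITEST",
             "GACC", "CIATGB", "RELNM", "DRVNM", "CVNAM"}
EXCLUSIVE_PAIRS = {
    ("CRSNM", "LNAME"): "Error-0028",
    ("RELNM", "COLNO"): "Error-0029",
}

def occurrence_errors(name_types):
    """name_types: list of name-type codes attached to ONE GID.
    Returns the error codes raised for that GID."""
    counts = Counter(name_types)
    errors = []
    if any(counts[t] > 1 for t in ONCE_ONLY):
        errors.append("Error-0027")
    for (first, second), code in EXCLUSIVE_PAIRS.items():
        if counts[first] and counts[second]:
            errors.append(code)
    return errors

print(occurrence_errors(["ACCNO", "ACCNO"]))    # ['Error-0027']
print(occurrence_errors(["CRSNM", "LNAME"]))    # ['Error-0028']
print(occurrence_errors(["LNAME", "LNAME"]))    # []
```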
Name type RELNM (Release Name)
CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki
A Release Name (RELNM) cannot occur more than once for a GID-GLOCN-COUNTRY combination. Two GIDs can share the same RELNM only if their GLOCNs are in different countries
Error-0030: Germplasm sharing a RELNM (release name) with another germplasm in the SAME country
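A sketch of this rule groups release names by name value and country. The two lookup dictionaries (GID to GLOCN, and location ID to a country identifier) stand in for whatever joins the tool performs and are assumptions, as is skipping names whose country cannot be resolved.

```python
from collections import defaultdict

def shared_release_names(release_names, glocn_of, country_of):
    """release_names: iterable of (gid, name_value) for RELNM records.
    glocn_of: dict GID -> GLOCN; country_of: dict location ID -> country ID.
    Returns {(name_value, country): [gids]} for names shared within a country."""
    gids_by_key = defaultdict(set)
    for gid, nval in release_names:
        country = country_of.get(glocn_of.get(gid))
        if country is None:
            continue  # assumption: unresolved countries are not compared
        gids_by_key[(nval, country)].add(gid)
    return {key: sorted(gids)
            for key, gids in gids_by_key.items() if len(gids) > 1}

# Hypothetical data: GIDs 1 and 2 are in country 1, GID 3 is in country 2.
release_names = [(1, "NAME-A"), (2, "NAME-A"), (3, "NAME-A")]
glocn_of = {1: 100, 2: 101, 3: 200}
country_of = {100: 1, 101: 1, 200: 2}
print(shared_release_names(release_names, glocn_of, country_of))
# {('NAME-A', 1): [1, 2]}
```

GID 3 is not reported because it shares the name only with germplasm located in a different country.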
WHAT'S NEW in Version 5.4.1
Data Checks
- IRIS-specific data validation.
- The following are new error codes:
- Error-0012: Germplasm with germplasm date (GDATE) earlier than GDATE of GPID1
- Error-0013: Germplasm with germplasm date (GDATE) earlier than GDATE of GPID2
- Error-0014: Germplasm with germplasm date (GDATE) earlier than GDATE of MGID
- Error-0015: Germplasm with Progenitor ID1 unknown ( GPID1=0 ), Progenitor ID2 known ( GPID2 > 0 )
- Error-0016: Germplasm with inherited name from GPID2 but NDATE not inherited
- Error-0017: Germplasm with inherited name from GPID2 but NLOCN not inherited
- Error-0018: Germplasm with more than one preferred English name (NSTAT=1)
- Error-0019: Germplasm with name type not eligible to be preferred name
- Error-0020: Germplasm with more than one preferred ID (NSTAT=8)
- Error-0021: Germplasm with name type not eligible to be preferred ID
- Error-0022: Germplasm with preferred ID not unique for the given name type
- Error-0023: Germplasm with GLOCN different from GLOCN of GPID2
- Error-0024: Invalid method-name type combination for method GEN
- Error-0025: Invalid method-name type combination for method DER
- Error-0026: Invalid method-name type combination for method MAN
- Error-0027: Germplasm with certain name types occurring more than once
- Error-0028: Germplasm with both CRSNM and LNAME name types
- Error-0029: Germplasm with both RELNM and COLNO name types
- Error-0030: Germplasm sharing a RELNM (release name) with another germplasm in the SAME country
Miscellaneous
- Option to output query results to MS Excel files, usually one file for each error code.
- Removed "Local Database Queries" tabsheet
- Removed "IRRI-GRC Queries" tabsheet
- "Central Database" tabsheets (from Version 5.3) renamed to "ICIS - GMS" (checks are applicable to all ICIS implementations).
- Improved "About" form containing more information, plus the GNU General Public License. CRIL and IRRI logos also included.
- Introduction of application icon