Data Validation Tool 5.4

This version: 5.4.1 (ICIS Workshop 2007 Release)

See Version 5.4.2


Introduction

The ICIS Data Validation Tool is an application that searches an ICIS database for errors that could make the data inconsistent or meaningless. Running it before publication helps ensure that published data are of good quality.

Image:Icis-validate5.4.JPG

The tool is simple to use: select the tests you want to execute and click the "Run" button.

Version 5.3 of the ICIS Data Validation Tool, which was released unofficially, was developed in Visual Basic 6. Version 5.4 was developed in Delphi.


ICIS - GMS (i) Queries

Checks the Genealogy Management System (GMS) central database (applicable to all ICIS implementations).

Image:icisgms1.JPG



Invalid parent references [2 checks]

(from Version 5.3)


Error messages:

Error-0001: Unknown group source

Error-0002: Germplasm with non-generative group source



Circular references [3 checks]

(from Version 5.3)


If germplasm A has germplasm B as one of its parents and germplasm B has germplasm A as one of its parents, the two records form a circular reference. This option also checks for second- and third-level circularity, where the chain of references passes through one or two further germplasm before returning to A.

Error messages:

A references B and B references A:

Error-0003: 1st level circular reference 


A references B and B references C and C references A:

Error-0004: 2nd level circular reference


A references B and B references C and C references D and D references A:

Error-0005: 3rd level circular reference
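
The three levels above can be illustrated with a short sketch. It is not the tool's own implementation (Version 5.4 was written in Delphi); it assumes a hypothetical in-memory mapping from each GID to the GIDs it references, and the function name is invented for illustration.

```python
def find_circular_references(parents, max_level=3):
    """Return {gid: level} for germplasm lying on a short reference cycle.

    `parents` is a hypothetical mapping {gid: [referenced gids]}.
    Level 1 is A -> B -> A, level 2 is A -> B -> C -> A, and so on.
    """
    found = {}
    for start in parents:
        # GIDs exactly one reference away from the starting germplasm.
        frontier = set(parents.get(start, ()))
        for level in range(1, max_level + 1):
            # Follow one more reference along every path.
            frontier = {p for gid in frontier for p in parents.get(gid, ())}
            if start in frontier:
                found[start] = level
                break
    return found
```

Unknown parents (a value of 0) simply have no entry in the mapping, so they end the walk.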



Invalid Method

(from Version 5.3)


Error message:

Error-0006: Invalid germplasm methods



Deleted parent references

(from Version 5.3)


Suppose germplasm A has been replaced by germplasm B, as recorded in the GERMPLSM.GRPLCE database column. Every germplasm that still references germplasm A as a parent should be corrected to reference germplasm B instead.

Error message:

Error-0007: Germplasm with deleted parent references
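
A minimal sketch of this check follows. It assumes germplasm rows are available as dictionaries keyed by column name and that GRPLCE = 0 means "not replaced"; both are assumptions made for illustration, as is the function name.

```python
def find_deleted_parent_references(germplasm):
    """Yield (gid, column, old_parent, replacement) for stale parent references.

    `germplasm` is a list of dicts with the keys GID, GPID1, GPID2 and GRPLCE.
    Assumes GRPLCE == 0 marks a record that has not been replaced.
    """
    replaced = {g["GID"]: g["GRPLCE"] for g in germplasm if g["GRPLCE"] != 0}
    for g in germplasm:
        for column in ("GPID1", "GPID2"):
            parent = g[column]
            if parent in replaced:
                yield g["GID"], column, parent, replaced[parent]
```

Each result names the column to correct and the GID that should be used instead.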



Foreign Key References [4 checks]

(from Version 5.3)


The ICIS database has no foreign key constraints enabled, so the tool checks explicitly whether any foreign key definition is violated.

Error messages:

Error-0008: Invalid LOCATION.LOCID references

Error-0009: Invalid GERMPLASM.GID references
Error-0010: Invalid METHODS.MID references
Error-0011: Invalid BIBREFS.REFID references
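
Each of these checks amounts to looking for "orphan" rows whose reference has no match in the parent table. The sketch below shows the idea on in-memory rows; the helper name, the row layout and the use of 0 as the "no reference" value are assumptions, not details taken from the tool.

```python
def find_orphans(child_rows, column, valid_keys, null_value=0):
    """Return rows whose `column` value does not appear in `valid_keys`.

    `child_rows` is a list of dicts; `valid_keys` holds the key values of
    the referenced table (for example every LOCATION.LOCID).
    Rows holding `null_value` are treated as "no reference" and skipped.
    """
    valid = set(valid_keys)
    return [
        row for row in child_rows
        if row[column] != null_value and row[column] not in valid
    ]
```

For example, `find_orphans(germplasm_rows, "GLOCN", location_ids)` would list germplasm whose location does not exist, assuming GLOCN is the column that references LOCATION.LOCID.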



ICIS - GMS (ii) Queries

Checks the Genealogy Management System (GMS) central database (applicable to all ICIS implementations).

Image:icisgms2.JPG


Progenitor germplasm dates [3 checks]

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki


The GDATE of a GID must not predate the GDATE of any of its progenitors. Beware of missing links in the chain of dates: if the test checks only the GDATE of the immediate progenitors (GPID1, GPID2 and MGID), it will miss the case where a non-zero GPID1, GPID2 or MGID has GDATE=0 but that record's own GPID1, GPID2 or MGID is younger than the target GID. Therefore, whenever a GPID1, GPID2 or MGID has GDATE=0, the check should iterate through that record's GPID1, GPID2 and MGID in turn.

Error messages:

Error-0012: Germplasm with germplasm date (GDATE) earlier than GDATE of GPID1
Error-0013: Germplasm with germplasm date (GDATE) earlier than GDATE of GPID2
Error-0014: Germplasm with germplasm date (GDATE) earlier than GDATE of MGID
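
The iteration through undated progenitors can be sketched as follows. This is an illustration only: it assumes germplasm records are held in a dictionary keyed by GID, that GDATE is a comparable integer (such as YYYYMMDD) with 0 meaning unknown, and the function names are hypothetical.

```python
PROGENITOR_COLUMNS = ("GPID1", "GPID2", "MGID")

def nearest_dated_ancestors(gid, germplasm, seen=None):
    """Return {ancestor_gid: GDATE}, looking through records with GDATE == 0."""
    seen = set() if seen is None else seen
    if gid == 0 or gid not in germplasm or gid in seen:
        return {}
    seen.add(gid)  # guards against circular references
    record = germplasm[gid]
    if record["GDATE"] != 0:
        return {gid: record["GDATE"]}
    dated = {}
    for column in PROGENITOR_COLUMNS:
        dated.update(nearest_dated_ancestors(record[column], germplasm, seen))
    return dated

def find_date_violations(germplasm):
    """Yield (gid, column, ancestor_gid) where the ancestor is younger than gid."""
    for gid, record in germplasm.items():
        if record["GDATE"] == 0:
            continue  # the target itself has no date to compare
        for column in PROGENITOR_COLUMNS:
            ancestors = nearest_dated_ancestors(record[column], germplasm)
            for ancestor, gdate in ancestors.items():
                if gdate > record["GDATE"]:
                    yield gid, column, ancestor
```

The `column` in each result indicates which of Error-0012, Error-0013 or Error-0014 applies.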



Progenitor ID1 unknown, Progenitor ID2 known

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki


A GID cannot have GPID2 > 0 while GPID1 = 0.

Error message:

Error-0015: Germplasm with Progenitor ID1 unknown ( GPID1=0 ), Progenitor ID2 known ( GPID2 > 0 )




Name inheritance from GPID2: check NDATE and NLOCN [2 checks]

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki


If a GID inherits a name from its source (GPID2), then that name record must also inherit the name date (NDATE) and name location (NLOCN).

Error messages:

Error-0016: Germplasm with inherited name from GPID2 but NDATE not inherited
Error-0017: Germplasm with inherited name from GPID2 but NLOCN not inherited
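
One way to express this rule is sketched below. It treats a name as "inherited" when the source germplasm has a name of the same type and value; that interpretation, the NTYPE and NVAL keys, and the function name are assumptions made for illustration.

```python
def find_uninherited_name_fields(germplasm, names):
    """Yield (gid, nval, field) where an inherited name differs in NDATE or NLOCN.

    `germplasm` maps GID -> {"GPID2": ...}; `names` is a list of dicts with
    the keys GID, NTYPE, NVAL, NDATE and NLOCN (hypothetical layout).
    """
    by_gid = {}
    for n in names:
        by_gid.setdefault(n["GID"], {})[(n["NTYPE"], n["NVAL"])] = n
    for gid, record in germplasm.items():
        source_names = by_gid.get(record["GPID2"], {})
        for key, name in by_gid.get(gid, {}).items():
            source = source_names.get(key)
            if source is None:
                continue  # the source has no such name, so it was not inherited
            for field in ("NDATE", "NLOCN"):
                if name[field] != source[field]:
                    yield gid, name["NVAL"], field
```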


IRIS - GMS Queries

Checks the Genealogy Management System (GMS) central database (applicable to International Rice Information System (IRIS)).

Image:irisgms.JPG



Preferred Names [2 checks]

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki


At most one name of a GID may have NSTAT=1 (preferred English name).

Error-0018: Germplasm with more than one preferred English name (NSTAT=1)


Name types eligible to be the preferred name are: CRSNM, RELNM, DRVNM, CVNAM, ELITE

Error-0019: Germplasm with invalid name type as preferred name
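
Both preferred-name checks can be run in a single pass over the name records. The sketch below assumes each record is a dictionary with GID, NTYPE and NSTAT keys and that NTYPE holds the name type code as text; these are illustrative assumptions, and the function name is hypothetical.

```python
from collections import Counter

PREFERRED_NAME_TYPES = {"CRSNM", "RELNM", "DRVNM", "CVNAM", "ELITE"}

def check_preferred(names, nstat=1, eligible=PREFERRED_NAME_TYPES):
    """Return (gids_with_multiple, records_with_ineligible_type) for one NSTAT."""
    preferred = [n for n in names if n["NSTAT"] == nstat]
    counts = Counter(n["GID"] for n in preferred)
    multiple = sorted(gid for gid, count in counts.items() if count > 1)
    ineligible = [n for n in preferred if n["NTYPE"] not in eligible]
    return multiple, ineligible
```

The first list corresponds to Error-0018 and the second to Error-0019.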

Preferred IDs [3 checks]

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki


At most one name of a GID can have "preferred ID" status (NSTAT=8).

Error-0020: Germplasm with more than one preferred ID (NSTAT=8) 


The following name types are eligible to be the preferred ID: DRVNM, COLNO, ACCNO (if present), GACC (if present), ITEST (if present), CIATGB (if present)

Error-0021: Germplasm with invalid name type as preferred ID


The preferred ID must be unique for the given name type. That is, two accessions must not share the same name of the same type if either or both of them use it as a preferred ID.

Error-0022: Germplasm with preferred ID not unique for the given name type
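
The uniqueness rule is the least obvious of the three, so a sketch may help. It groups name records by type and value and reports any group that contains a preferred ID and spans more than one GID. The record layout (GID, NTYPE, NVAL, NSTAT keys) and the function name are assumptions.

```python
from collections import defaultdict

def find_shared_preferred_ids(names, preferred_nstat=8):
    """Return {(ntype, nval): sorted gids} for names shared with a preferred ID."""
    holders = defaultdict(set)  # (NTYPE, NVAL) -> GIDs carrying that name
    preferred_keys = set()      # keys used as a preferred ID at least once
    for n in names:
        key = (n["NTYPE"], n["NVAL"])
        holders[key].add(n["GID"])
        if n["NSTAT"] == preferred_nstat:
            preferred_keys.add(key)
    return {k: sorted(holders[k]) for k in preferred_keys if len(holders[k]) > 1}
```

A name shared by two GIDs is reported even if only one of them marks it as its preferred ID, matching the "one or both" wording of the rule.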


Location (GLOCN) of germplasm

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki

For all germplasm whose creation method is not Method ID 62 (Import), the germplasm location (GLOCN) of a GID should be the same as the GLOCN of its GPID2.

Error-0023: Germplasm with GLOCN different from GLOCN of GPID2
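
As a rough sketch, the rule can be checked as below. The METHN key for the creation method and the dictionary layout are assumptions made for illustration; only the Method ID 62 exemption comes from the rule itself.

```python
IMPORT_METHOD = 62  # Method ID for "Import", exempt from this check

def find_glocn_mismatches(germplasm):
    """Yield GIDs whose GLOCN differs from the GLOCN of their GPID2.

    `germplasm` maps GID -> {"METHN": ..., "GPID2": ..., "GLOCN": ...}.
    """
    for gid, record in germplasm.items():
        if record["METHN"] == IMPORT_METHOD:
            continue
        source = germplasm.get(record["GPID2"])
        if source is not None and record["GLOCN"] != source["GLOCN"]:
            yield gid
```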



Method - name type combinations [3 checks]

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki


Only certain combinations of name type and germplasm creation method are acceptable. Namely:


Method type GEN: valid name types are: CRSNM, UNCRS, UNRES

Error-0024: Invalid method-name type combination for method GEN 


Method type DER: valid name types are: RELNM, DRVNM, CVNAM, CVABR, NTEST, LNAME, ADVNM, ACVNM, AABBR, OLDMUT1, OLDMUT2, ELITE, UNRES

Error-0025: Invalid method-name type combination for method DER 


Method type MAN: valid name types are: ACCNO, RELNM, CVNAM, CVABR, COLNO, FACCN, ITEST, NTEST, LNAME, TACC, ADVNM, ACVNM, ELITE, GACC, DACCN, LCNAM, CIATGB

Error-0026: Invalid method-name type combination for method MAN
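
The three lists above lend themselves to a simple lookup table. The sketch below assumes each record has already been reduced to a (GID, method type, name type) tuple; how those values are obtained from the database is left out, and the function name is hypothetical.

```python
VALID_NAME_TYPES = {
    "GEN": {"CRSNM", "UNCRS", "UNRES"},
    "DER": {"RELNM", "DRVNM", "CVNAM", "CVABR", "NTEST", "LNAME", "ADVNM",
            "ACVNM", "AABBR", "OLDMUT1", "OLDMUT2", "ELITE", "UNRES"},
    "MAN": {"ACCNO", "RELNM", "CVNAM", "CVABR", "COLNO", "FACCN", "ITEST",
            "NTEST", "LNAME", "TACC", "ADVNM", "ACVNM", "ELITE", "GACC",
            "DACCN", "LCNAM", "CIATGB"},
}

def invalid_combinations(records):
    """Return the (gid, method_type, name_type) tuples that break the rules."""
    return [
        (gid, mtype, ntype)
        for gid, mtype, ntype in records
        if mtype in VALID_NAME_TYPES and ntype not in VALID_NAME_TYPES[mtype]
    ]
```

Records with a method type other than GEN, DER or MAN are ignored rather than flagged, since the rules above do not cover them.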



Name type occurrence [3 checks]

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki


Some name types must not occur more than once for a single GID. These are ACCNO, CRSNM, UNCRS, COLNO, ITEST, GACC, CIATGB, RELNM, DRVNM, CVNAM

Error-0027: Germplasm with certain name types occurring more than once 


Cross-names (CRSNM) and Line-names (LNAME) cannot occur together

Error-0028: Germplasm with both CRSNM and LNAME name types 


Release names (RELNM) and Collector's Numbers (COLNO) cannot occur together

Error-0029: Germplasm with both RELNM and COLNO name types
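
All three occurrence rules depend on how often each name type appears per GID, so one counting pass is enough. The sketch assumes name records are dictionaries with GID and NTYPE keys, with NTYPE holding the type code as text; this layout and the function name are assumptions.

```python
from collections import Counter

SINGLE_USE = {"ACCNO", "CRSNM", "UNCRS", "COLNO", "ITEST",
              "GACC", "CIATGB", "RELNM", "DRVNM", "CVNAM"}
EXCLUSIVE_PAIRS = (("CRSNM", "LNAME"), ("RELNM", "COLNO"))

def check_name_type_occurrence(names):
    """Return (repeated, clashes) lists for the name type occurrence rules."""
    per_gid = {}
    for n in names:
        per_gid.setdefault(n["GID"], Counter())[n["NTYPE"]] += 1
    repeated = [
        (gid, ntype)
        for gid, counts in per_gid.items()
        for ntype, count in counts.items()
        if count > 1 and ntype in SINGLE_USE
    ]
    clashes = [
        (gid, a, b)
        for gid, counts in per_gid.items()
        for a, b in EXCLUSIVE_PAIRS
        if counts[a] and counts[b]
    ]
    return repeated, clashes
```

Entries in `repeated` correspond to Error-0027; entries in `clashes` correspond to Error-0028 or Error-0029 depending on the pair.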



Name type RELNM (Release Name)

CropForge Feature Request # 160 and/or Version 5.3 Discussion Article on ICISWiki

A Release Name (RELNM) cannot occur more than once for a GID-GLOCN-COUNTRY combination. Two GIDs can share the same RELNM only if their GLOCNs are in different countries.

Error-0030: Germplasm sharing a RELNM (release name) with another germplasm in the SAME country
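
A sketch of the country comparison follows. It assumes a lookup from location ID to country is available (called `country_of` here) and that name records carry NTYPE and NVAL keys; the names and layout are hypothetical.

```python
from collections import defaultdict

def find_shared_release_names(names, germplasm, country_of):
    """Return {(release_name, country): sorted gids} for shared release names.

    `germplasm` maps GID -> {"GLOCN": ...}; `country_of` maps a location ID
    to its country.
    """
    groups = defaultdict(set)
    for n in names:
        if n["NTYPE"] != "RELNM":
            continue
        country = country_of[germplasm[n["GID"]]["GLOCN"]]
        groups[(n["NVAL"], country)].add(n["GID"])
    return {key: sorted(gids) for key, gids in groups.items() if len(gids) > 1}
```

Two GIDs at different locations are still reported if those locations belong to the same country.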



WHAT'S NEW in Version 5.4.1

Data Checks

  • IRIS-specific data validation.


  • The following are new error codes:
    • Error-0012: Germplasm with germplasm date (GDATE) earlier than GDATE of GPID1
    • Error-0013: Germplasm with germplasm date (GDATE) earlier than GDATE of GPID2
    • Error-0014: Germplasm with germplasm date (GDATE) earlier than GDATE of MGID
    • Error-0015: Germplasm with Progenitor ID1 unknown ( GPID1=0 ), Progenitor ID2 known ( GPID2 > 0 )
    • Error-0016: Germplasm with inherited name from GPID2 but NDATE not inherited
    • Error-0017: Germplasm with inherited name from GPID2 but NLOCN not inherited
    • Error-0018: Germplasm with more than one preferred English name (NSTAT=1)
    • Error-0019: Germplasm with name type not eligible to be preferred name
    • Error-0020: Germplasm with more than one preferred ID (NSTAT=8)
    • Error-0021: Germplasm with name type not eligible to be preferred ID
    • Error-0022: Germplasm with preferred ID not unique for the given name type
    • Error-0023: Germplasm with GLOCN different from GLOCN of GPID2
    • Error-0024: Invalid method-name type combination for method GEN
    • Error-0025: Invalid method-name type combination for method DER
    • Error-0026: Invalid method-name type combination for method MAN
    • Error-0027: Germplasm with certain name types occurring more than once
    • Error-0028: Germplasm with both CRSNM and LNAME name types
    • Error-0029: Germplasm with both RELNM and COLNO name types
    • Error-0030: Germplasm sharing a RELNM (release name) with another germplasm in the SAME country



Miscellaneous

  • Option to output query results to MS Excel files, usually one file per error code.


  • Removed "Local Database Queries" tabsheet


  • Removed "IRRI-GRC Queries" tabsheet


  • "Central Database" tabsheets (from Version 5.3) renamed to "ICIS - GMS" (checks are applicable to all ICIS implementations).


  • Improved "About" form containing more information, plus the GNU General Public License. CRIL and IRRI logos also included.

Image:AboutIcisValidate.JPG


  • Introduction of application icon

Image:Icon_validate5.4.JPG


Powerpoint presentation

Data Validation Tool (PDF)
