Quality Assessment (QA)

R2R will undertake routine/automated Quality Assessment (QA) on select data types, and in some cases will produce Quality Controlled data products. QA is defined as documenting the quality of the data as originally delivered from vessels. It is important to note that R2R QA routines are intended to identify suspicious data, and do not indicate the scientific quality or value of the data. During QA procedures, a series of quality tests is run on the data and a summary of the results is provided. QA tests include dataset-level assessments, such as checking whether appropriate metadata exist and whether the file formatting contains errors, and can also include summaries of record-level testing of the data. R2R QA routines do not alter the original data files.

The primary purpose of QA is to provide timely feedback to shipboard operators so that high-quality data are consistently acquired; the results also provide background information for end users of the data. An R2R Quality Assessment Certificate (QAC) will be generated programmatically post-cruise for most standard data types. Data that have not been quality assessed are documented as such. The results of QA will be made publicly accessible on the R2R website in a QA Dashboard, via the Cruise Catalog, and through Web services. QA output will ultimately accompany data submitted to the long-term archives. Initially, the QAC will provide essential file details including name, size, checksum and date; as tools mature, additional content will be added.
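The essential file details listed above (name, size, checksum and date) can be gathered with a few lines of code. The following is a minimal sketch, not the R2R implementation: the function name file_details, the dictionary keys, the choice of SHA-256 as the checksum algorithm, and the use of the file modification time as the "date" are all assumptions made for illustration.

```python
import hashlib
import os
from datetime import datetime, timezone


def file_details(path, algorithm="sha256", chunk_size=65536):
    """Return name, size, checksum and modification date for one file.

    Hypothetical helper: the checksum algorithm and field names are
    assumptions, not the actual QAC format.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so large data files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    info = os.stat(path)
    modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "checksum": f"{algorithm}:{digest.hexdigest()}",
        "modified_utc": modified.isoformat(),
    }
```

The file is opened read-only, in keeping with the principle that QA routines do not alter the original data files.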

For some data types, Quality Controlled data products are also created. These include meteorology station (MET) and thermosalinograph (TSG) data (delivered in real time), as well as navigation and trackline geophysics. In Quality Control (QC), each value within a dataset is tested, and one or more sets of QC flags are incorporated into the data product to indicate whether each record or value passed or failed the quality test(s). Failing points are not removed, just flagged.
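The flag-rather-than-remove approach can be illustrated with a simple range test. This is a sketch under stated assumptions: the function name flag_range, the flag labels "pass", "fail" and "missing", and the temperature bounds are hypothetical and are not the flag scheme or thresholds used in R2R data products.

```python
def flag_range(values, lower, upper):
    """Return one QC flag per value; the values themselves are left untouched."""
    flags = []
    for value in values:
        if value is None:
            flags.append("missing")   # no observation to test
        elif lower <= value <= upper:
            flags.append("pass")
        else:
            flags.append("fail")      # flagged, but kept in the dataset
    return flags


# Illustrative sea surface temperatures in degrees Celsius (made-up values).
temperatures_c = [18.2, 18.3, 99.9, None, 18.1]
flags = flag_range(temperatures_c, lower=-2.5, upper=40.0)

# Each record keeps its original value alongside its flag.
records = list(zip(temperatures_c, flags))
```

The output has the same number of records as the input, so an end user can decide how to treat flagged values instead of having that choice made for them.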

The exact quality tests vary by data type. Tests can include ensuring physically plausible relationships between observations, comparisons to historical values for a region (e.g., climatologies), identifying time reversals and missing data, and other tests that evaluate the scientific use of the data.
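Two of the tests named above, time reversals and missing data, can be sketched as a single pass over a record's timestamps. The function name check_timestamps, the max_gap parameter, and the rule that a gap longer than max_gap indicates missing data are assumptions for illustration only; the source does not specify how R2R defines either test.

```python
from datetime import datetime, timedelta


def check_timestamps(times, max_gap):
    """Report indices where time runs backwards or a gap exceeds max_gap."""
    reversals = []
    gaps = []
    for i in range(1, len(times)):
        step = times[i] - times[i - 1]
        if step < timedelta(0):
            reversals.append(i)   # this record is earlier than the previous one
        elif step > max_gap:
            gaps.append(i)        # records appear to be missing before this one
    return {"reversals": reversals, "gaps": gaps}


# Illustrative timestamps, given as seconds after an arbitrary start time.
start = datetime(2020, 1, 1)
times = [start + timedelta(seconds=s) for s in (0, 1, 2, 1, 3, 10)]
result = check_timestamps(times, max_gap=timedelta(seconds=5))
```

Here the fourth record (index 3) steps backwards in time and the last record (index 5) follows a seven-second gap, so both indices are reported while the input list is left unchanged.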

The R2R QA and QC procedures are being developed over time, with specialists engaged and community feedback requested for each data type. As the software tools mature and become sufficiently stable, they will be made available for members of the community to use independently of R2R.