This DQe-c report is generated from completeness testing of OMOPV5_0 data from University of Washington on 2019-03-20.
The table below lists the CDM tables provided (and not provided) in the data load.
The source data for this table and the following graphics in this section is tablelist_OMOPV5_0_University of Washington_20-03-2019.csv
This figure shows which of the CDM tables are loaded and/or available.
The figure below shows a network visualization of the CDM data model and highlights the tables that are available in this load (legend is the same as in Figure 1).
The table below provides the results of the completeness test at the value/cell level.
TabNam = OMOPV5_0 table name
ColNam = Column name
DQLVL = Level of importance for completeness test (X: Extremely Important, H: Highly Important, L: Low Importance)
FRQ = Frequency of rows
UNIQFRQ = Frequency of unique values in each column
MS1_FRQ = Frequency of cells with NULL/NA values or empty strings in each column
MS2_FRQ = Frequency of cells with characters in each column that don't represent meaningful data, including '+', '-', '_', '#', '$', '*', '\', '?', '.', '&', '^', '%', '!', '@', and 'NI'
MSs_PERC = Percentage of overall missing data in each column
Data for this table is generated from DQ_Master_Table_OMOPV5_0_University of Washington_20-03-2019.csv, saved under the report directory.
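The column-level metrics defined above can be illustrated with a minimal Python sketch. It is not the DQe-c implementation. The function name `column_completeness` is hypothetical, and the sketch assumes that MSs_PERC is (MS1_FRQ + MS2_FRQ) divided by FRQ, that UNIQFRQ counts distinct values over all cells, and that the backslash belongs in the list of placeholder characters.

```python
# Placeholder tokens counted as "nonsense" (MS2_FRQ), following the list above.
# The backslash entry is an assumption.
NONSENSE = {"+", "-", "_", "#", "$", "*", "\\", "?", ".", "&", "^", "%", "!", "@", "NI"}


def column_completeness(values):
    """Compute FRQ, UNIQFRQ, MS1_FRQ, MS2_FRQ and MSs_PERC for one column."""
    frq = len(values)
    # MS1: NULL/NA (None here) or empty string
    ms1 = sum(1 for v in values if v is None or v == "")
    # MS2: cell holds only a placeholder token
    ms2 = sum(1 for v in values if v in NONSENSE)
    # Assumption: distinct values are counted over all cells, missing ones included
    uniq = len(set(values))
    # Assumption: overall missingness combines MS1 and MS2
    perc = round(100.0 * (ms1 + ms2) / frq, 2) if frq else 0.0
    return {
        "FRQ": frq,
        "UNIQFRQ": uniq,
        "MS1_FRQ": ms1,
        "MS2_FRQ": ms2,
        "MSs_PERC": perc,
    }


# Made-up example column, for illustration only
example = ["a", None, "", "?", "NI", "b", "a", "x"]
result = column_completeness(example)
print(result)
```

Keeping MS1_FRQ and MS2_FRQ as separate counts preserves the distinction the report draws between absent values and placeholder values, while MSs_PERC summarizes both.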
The figure below profiles changes in primary keys across loads as a measure of change in patient/record counts over time.
Data for the figure is stored in FRQ_comp_trberg_20-03-2019.csv
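One way to read change across loads is as the difference in unique primary-key counts between consecutive loads. The sketch below illustrates that idea only. The function name `changes_across_loads`, the load labels, and the counts are hypothetical, and the layout of the FRQ_comp file is not assumed.

```python
def changes_across_loads(counts_by_load):
    """Given (load_label, unique_key_count) pairs in chronological order,
    return the absolute and percent change between consecutive loads."""
    changes = []
    for (prev_label, prev), (label, cur) in zip(counts_by_load, counts_by_load[1:]):
        delta = cur - prev
        # Percent change is undefined when the previous load had no keys
        pct = round(100.0 * delta / prev, 2) if prev else None
        changes.append(
            {"from": prev_label, "to": label, "delta": delta, "pct_change": pct}
        )
    return changes


# Hypothetical counts of unique person_id values in three loads
loads = [("load_1", 1000), ("load_2", 1100), ("load_3", 1045)]
for row in changes_across_loads(loads):
    print(row)
```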
The figures below show the proportion of missing cells/values in each column of each table loaded. The figures are generated from Table 2.
MS1_FRQ = Frequency of cells with NULL/NA values and empty strings in each column (presence of absence)
MS2_FRQ = Frequency of cells with characters in each column that don't represent meaningful data (presence of nonsense)
The figures below visualize the number of unique key variables that are common to multiple OMOPV5_0 tables.
The Reference column on the right comes from the table in which the variable is a primary key, and therefore is a reference for all other tables.
Count_Out shows the number of unique key values that are not present in the reference table, e.g., a person id from the observation table that does not exist in the person table.
Count_In shows the number of unique key values that are present in the reference table, e.g., a person id from the observation table that also exists in the person table.
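Count_In and Count_Out amount to a set comparison between the unique keys in a table and the keys in its reference table. The following is a minimal sketch of that comparison, not the tool's own code. The function name `key_overlap` and the id values are hypothetical.

```python
def key_overlap(reference_keys, other_keys):
    """Return (count_in, count_out) for the unique keys of a non-reference table.

    count_in:  unique keys that also exist in the reference table
    count_out: unique keys that are absent from the reference table
    """
    reference = set(reference_keys)
    unique_keys = set(other_keys)
    count_in = len(unique_keys & reference)
    count_out = len(unique_keys - reference)
    return count_in, count_out


# Hypothetical ids: the person table is the reference for person_id
person_ids = [1, 2, 3, 4]
observation_person_ids = [1, 1, 2, 5, 6, 6]

count_in, count_out = key_overlap(person_ids, observation_person_ids)
print(count_in, count_out)  # prints "2 2"
```

A nonzero Count_Out points to orphan records: rows whose key has no matching row in the reference table.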
Figure 5 shows the percentage of patients missing specific key clinical indicators.
This report is from DQe-c version 3.2
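The percentage reported for each indicator can be understood as the share of patients with no record of that indicator. The sketch below illustrates this under stated assumptions. The function name `percent_missing`, the ids, and the idea of one id list per indicator are hypothetical and not taken from the tool.

```python
def percent_missing(patient_ids, ids_with_indicator):
    """Percentage of patients that have no record for a given indicator."""
    patients = set(patient_ids)
    if not patients:
        return 0.0
    # Patients never seen among the indicator's records
    missing = patients - set(ids_with_indicator)
    return round(100.0 * len(missing) / len(patients), 2)


# Hypothetical data: five patients, two of whom have the indicator recorded.
# Id 9 is not a known patient, so it is ignored.
all_patients = [1, 2, 3, 4, 5]
patients_with_indicator = [1, 2, 2, 9]

pct = percent_missing(all_patients, patients_with_indicator)
print(pct)  # prints 60.0
```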
Ask questions or report issues: trberg@uw.edu or kstephen@uw.edu
This tool was funded by ITHS and CD2H. For citation, see https://www.ncbi.nlm.nih.gov/pubmed/29069394