MIDRC Open-R1 Clinical Data

Please note: This notebook uses open access data

Created By: J Montgomery Maxwell

In this notebook we visualize the distribution of subjects across a variety of demographics, along with their COVID-19 status, in the Open-R1 dataset from the Medical Imaging and Data Resource Center (MIDRC, https://data.midrc.org/).

The Open-R1 dataset has 1,169 subjects. This notebook compares the distribution of COVID-19 positive and negative patients across multiple demographic classes. In particular, we focus on the subjects' age groups (-20, 21-30, ..., 90+), sex (Male or Female), race (Black or African American, White, Asian, Pacific Islander, American Indian, Other, or Not Reported), and whether the subject is Hispanic or Latino. Below is a subset of the dataset.

Subjects' COVID-19 Status

Approximately 22% of the subjects in the Open-R1 dataset were COVID-19 positive at the time the dataset was indexed.
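A share like this is the count of positive subjects divided by the total number of subjects. The snippet below is a minimal sketch of that calculation using only the Python standard library. The field name `covid19_positive`, the labels `"Yes"` and `"No"`, and the five sample records are illustrative assumptions, not values taken from Open-R1.

```python
from collections import Counter

# Synthetic placeholder records, NOT Open-R1 data.
# The field name "covid19_positive" and its labels are assumptions.
subjects = [
    {"covid19_positive": "No"},
    {"covid19_positive": "Yes"},
    {"covid19_positive": "No"},
    {"covid19_positive": "No"},
    {"covid19_positive": "No"},
]


def positive_fraction(records, field="covid19_positive", positive_label="Yes"):
    """Return the fraction of records whose status equals positive_label."""
    counts = Counter(record[field] for record in records)
    total = sum(counts.values())
    # Guard against an empty input so we never divide by zero.
    return counts[positive_label] / total if total else 0.0


fraction = positive_fraction(subjects)
print(f"{fraction:.0%} of the sample subjects are COVID-19 positive")
```

Applied to the full table of subjects, the same function would yield the dataset-wide share, provided the status field uses a single consistent positive label.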

Subject Distribution

Users can examine the ratio of negative to positive COVID-19 cases among various demographics. At many points throughout the first two years of the pandemic, disparities in COVID-19 positivity ratios were noted. Additionally, since subjects can decline to report their race (Not Reported), differences in positivity ratios for that group can be observed if present.
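Comparing groups of different sizes is easier with a per-group positivity ratio (positives divided by group total) than with raw counts. The following is a rough sketch of that computation. The field names `race` and `covid19_positive` and the sample records are hypothetical and do not come from Open-R1.

```python
from collections import defaultdict


def positivity_by_group(records, group_field,
                        status_field="covid19_positive",
                        positive_label="Yes"):
    """Return {group: positives / total} for each value of group_field."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        group = record[group_field]
        totals[group] += 1
        if record[status_field] == positive_label:
            positives[group] += 1
    # Every group in totals has at least one record, so no division by zero.
    return {group: positives[group] / totals[group] for group in totals}


# Synthetic placeholder records, NOT Open-R1 data.
sample = [
    {"race": "White", "covid19_positive": "Yes"},
    {"race": "White", "covid19_positive": "No"},
    {"race": "White", "covid19_positive": "No"},
    {"race": "White", "covid19_positive": "No"},
    {"race": "Not Reported", "covid19_positive": "Yes"},
    {"race": "Not Reported", "covid19_positive": "No"},
    {"race": "Asian", "covid19_positive": "No"},
]

ratios = positivity_by_group(sample, "race")
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.0%} positive")
```

Passing a different `group_field` (for example a sex or ethnicity field, if the table has one) reuses the same logic for the other demographic comparisons in this notebook.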

When subjects are reduced to only two groups (Not Hispanic or Latino versus Hispanic or Latino), differences in COVID-19 positivity can be observed if present.

If present, a disparity of COVID positivity can be noted between sexes.

The effect age has on the prevalence of COVID-19 positivity is displayed above. Note that this chart is not normalized by the age distribution of the general population. Typically, though, individuals under 20 years old represent a significant portion of most general populations.
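The age groups used throughout (-20, 21-30, ..., 90+) can be produced by binning each subject's age. Here is one possible sketch of that binning, followed by a raw (unnormalized) count per group. It assumes integer ages, places age 90 in the 81-90 group, and reserves 90+ for ages above 90; the source does not state these boundary rules, and the sample ages are synthetic.

```python
from collections import Counter


def age_group(age):
    """Map an integer age to a label such as '-20', '21-30', or '90+'.

    Boundary handling is an assumption: 20 falls in '-20', 90 falls in
    '81-90', and anything above 90 falls in '90+'.
    """
    if age <= 20:
        return "-20"
    if age > 90:
        return "90+"
    lower = ((age - 1) // 10) * 10 + 1  # 21, 31, ..., 81
    return f"{lower}-{lower + 9}"


# Synthetic ages of hypothetical positive subjects, NOT Open-R1 data.
ages_of_positive_subjects = [12, 25, 30, 47, 68, 90, 93]

# Raw counts per age group; these are not normalized by population size.
counts = Counter(age_group(age) for age in ages_of_positive_subjects)
for label, count in counts.items():
    print(f"{label}: {count}")
```

Normalizing these counts would require an external reference age distribution, which this notebook does not use.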