What other kinds of specific data do we need to fight COVID-19?

Although the Kaiser Family Foundation outlines a number of key SDOH data sources, CODE also worked with stakeholders to identify a variety of other data sources that are considered high-value for policymakers.

Transportation and Infrastructure Data

Having data on access to transportation and mode of transportation is essential for mitigating the effects of COVID-19. Those who rely on public transportation and use it to commute to work face higher risks of exposure to the virus. Also, those who don’t have access to reliable transportation will have greater difficulty receiving proper healthcare for COVID-19 or other conditions during the pandemic. 

The distribution of a vaccine and other key materials is also vital during a pandemic. Transportation and infrastructure data enables planners and policymakers to effectively distribute the vaccine to those populations most in need. 


Housing data (housing insecurity, homelessness, urban housing units) 

Racial and ethnic minorities are more likely to live in densely populated areas and to experience homelessness. Housing data collection should be significantly ramped up in order to mitigate disparities in the most severely affected neighborhoods and to better predict COVID-19 impact.

A June survey from Pew found that 3% of Americans have moved since the pandemic began, and 6% have had one or more new people join their household. Government data on household density and stability would be highly valuable. The private sector should also be explored as a source of more granular housing and neighborhood data, including data on homelessness.



Employment data

The U.S. is facing record levels of unemployment, and unemployment is a major factor in how people are being affected by the pandemic.[11] The economic stress of unemployment can increase an individual’s overall risk of illness, due to factors like the loss of employer-provided insurance and the inability to afford quality healthcare. Although basic employment status data is being collected, there needs to be a push for better collection of data on paid sick leave, employee insurance, and essential worker status.


Racial, ethnic, and language data

Data collection on language is sparse, but it is essential if the medical community wants to administer better care. Many Americans who don’t speak English as their first language can be deterred from seeking care and getting tested for COVID-19 when they have difficulty understanding English or cannot find information in their own languages. Data on race and ethnicity is also not collected in a standard manner and is missing from many key datasets dealing with COVID-19. Data standards should be developed according to OMB guidelines for the collection of race and ethnicity data, and race and ethnicity should be mandatory fields in COVID-19 case data.


  • Race and ethnicity — U.S. Census Bureau
  • Language spoken at home — U.S. Census Bureau

Health Insurance status 

Most workers receive health insurance through their jobs, but due to increasing rates of unemployment, many people are losing their coverage. In addition, even those who have insurance may have difficulty meeting their deductibles, covering their copays, or accessing quality health care. Greater measures should be taken to collect data on individuals’ insurance status, which would support efforts to reduce healthcare disparities and improve access to high-quality healthcare.


Internet access

Internet access affects whether individuals can find important information about COVID-19, including testing facility locations, proximity to health care, and updates on the state of the pandemic. It also determines whether they can use telemedicine. Vulnerable communities have lower rates of internet access and, in turn, are bearing more of the consequences of the pandemic.


County and urban density and hospital bed occupancy

Government data on household density and stability, and on urban density at a granular geographic level, would be extremely valuable. Data on hospital resource use, including hospital and ICU bed capacity and invasive ventilator availability, is also vital. This data can help predict pandemic hotspots and identify hospitals at risk of exceeding capacity. Proximity and population density are key. Sources of this sort of data may include the U.S. Census and the U.S. Department of Housing and Urban Development.


Food insecurity data 

The number of people facing food insecurity in the U.S. is rising due to the pandemic, particularly in already vulnerable communities. Improving data collection efforts for food insecurity, SNAP/WIC enrollment, and food access is needed to combat this issue. 

The New York Times reported that nearly 1 in 8 households don’t have enough food to eat during the ongoing pandemic. Food insecurity data is an important part of understanding this landscape and responding accordingly. This data could be aggregated by regional food banks or other community organizations.


Air quality

Air pollution has been linked to more severe cases and higher mortality rates for COVID-19, making air quality a critical factor to analyze during the pandemic. Studies have concluded that increased long-term exposure to air pollution has resulted in larger increases in COVID-19 death rates, and low-income and minority communities are more likely to experience poor air quality.


  • Air quality —  U.S. Environmental Protection Agency, AirNow API

Up-to-date Medicaid claims data and health status

Medicaid claims data helps identify at-risk populations and understand what comorbidities might exist among poor and at-risk communities. Data on individuals’ health status is also critical to identifying and assessing at-risk populations. With this data, county health officials and others can get a snapshot of the procedures and conditions in their Medicaid populations and use it to allocate greater resources, care, and recovery support.


  • Quarterly Medicaid enrollment — Centers for Medicare and Medicaid Services,  Medicaid.gov
  • Health status (fair or poor health) — CDC, Behavioral Risk Factor Surveillance System  (BRFSS)

COVID-19 tests, cases, and deaths 

Geographically granular testing and case data is essential in managing all aspects of this pandemic. For instance, this data serves as the foundation for most COVID-19 forecasting models, which predict future case surges and demand for emergency room services, hospital beds, ventilator equipment, and other forms of care. Without adequate testing data, forecasters are forced to rely on flawed data and their own assumptions. 


Access to care and testing facilities and basic health data (including deaths) from states and localities

Data on access to healthcare and COVID-19 testing facilities, along with other basic health data, is essential to determine the constraints, challenges, and needs of different communities during COVID-19. Standard data on access to healthcare is scarce, as the meaning of access has yet to be clearly defined. Data on access to testing facilities is similarly scarce, since very few entities are collecting it. The National Committee on Vital and Health Statistics (NCVHS) works with the states to collect this sort of data, but acquiring timely and accurate data has been a challenge.