Needed to fight COVID-19: A program for data on social determinants of health

While recent pandemic data models have been valuable, most of them have left out a critical component: the impact of social context on illness and its relevance to COVID-19.

The COVID-19 crisis has led to an explosion of data analysis and data-driven debate, perhaps more than any other event in recent history. Government agencies, nonprofits, news organizations and ordinary citizens have collected data, published it widely and done myriad analyses to help guide decision-making about the pandemic. While these analyses and the models built on them have been valuable, most of them have left out a critical component: the impact of social context on illness and its relevance to COVID-19.

A growing body of evidence shows that the social determinants of health (SDOH) have a major influence on an individual’s health status — perhaps as much as, or even more than, the standard epidemiological measures that have been the basis of most COVID-19 models. A much-quoted study by the Kaiser Family Foundation identified six categories of social determinants: economic stability, neighborhood and physical environment, education, food, community and social context, and the health care system.

These social factors can vary widely from community to community. As researchers have said, your ZIP code may be as important as your genetic code in determining your health. In the case of COVID-19, social determinants can be critical in predicting how many people will suffer symptoms severe enough to require hospitalization, and where they will risk overwhelming their local hospitals’ capacity.

We are now seeing the importance of social factors in COVID-19 through an alarming set of statistics: the unusually high risk that African Americans infected with COVID-19 will be hospitalized or die of the disease. A likely explanation is that race in America is associated with social factors, including structural inequalities in access to medical care and economic and employment instability, that lead to a poor prognosis with COVID-19. Other Americans with high-risk SDOH profiles, not only African Americans, may also be especially vulnerable to the virus.


State health departments, health care companies and academics are beginning to use SDOH data to predict COVID-19 risk in the populations they serve. The University of California-San Francisco has published a Health Atlas for the state of California showing social factors that impact health by geographical location. The population health management company ZeOmega is developing artificial intelligence models for risk prediction using SDOH data and flu infection records, and will apply those models to COVID-19 as more cases occur in the medical claims it manages for tens of millions of Americans. The nonprofit Center for Open Data Enterprise (CODE), which is a consultant to ZeOmega on the use of public SDOH data, recently co-published with the company a white paper on applying SDOH data. The analytics firm Socially Determined, which analyzes SDOH data at a highly localized level, is using its data to help health plan clients and the State of Maryland plan for COVID-19 care.

Organizations like these face a dual challenge: They need better data on the social determinants of health, and they need better data on the individuals who contract COVID-19, in order to develop AI models that use SDOH data for its predictive power. A combination of public and private efforts may provide the best solutions on both fronts.

The Centers for Disease Control and Prevention’s data on COVID-19 cases is an initial basis for tracking and analyzing the spread of COVID-19, but it is not enough. AcademyHealth and the Robert Wood Johnson Foundation have just launched a project to fill gaps in the CDC’s “missing or incomplete data” on “serious or underlying health conditions, hospitalization status, ICU admission, death, and age [or] any data stratified by race and ethnicity.” Their new collaborative initiative is bringing together researchers and health systems to “create an open COVID-19 patient data registry network.”

There are similar gaps in the availability of good SDOH data, and the federal government is well-positioned to lead the way to improve that data for public use. CODE recently partnered with the U.S. Department of Health and Human Services (HHS) Office of the CTO to explore ways to improve SDOH data, and published several recommendations for doing so. CODE’s report recommended that HHS create a national SDOH Data Strategy to improve the methodology for collecting SDOH data, support state and local data-gathering efforts, and establish data standards and mechanisms for data governance. Through the Office of the National Coordinator, HHS could also work with local decision-makers who can use their access to SDOH data at the state, county, or city level.

At the same time, healthcare companies, foundations, and nonprofits have a major opportunity to work with government sources, and with each other, to improve SDOH data and its use. The newly launched COVID-19 Healthcare Coalition is a highly collaborative national model. To identify vulnerable populations, the Coalition has used government data from the CDC, the U.S. Department of Housing and Urban Development, and the U.S. Census Bureau, among other sources. Its analysis captures social risk (along with medical and health care resource risks) through data on poverty levels, access to insurance, migration, and homelessness in American communities.


These strong beginnings can lead the way to new collaborations between government, industry, academia, and the nonprofit community to accelerate the use of SDOH data for public health. The Center for Open Data Enterprise is committed to promoting the use of SDOH data as a public resource. We welcome ideas for collaboration. This critically important data can make an immediate difference in the fight against COVID-19, and have a long-term impact on prevention, treatment, and diagnosis across American healthcare.

Joel Gurin is the President and Founder of the Center for Open Data Enterprise (CODE), a nonprofit based in Washington, DC, whose mission is to maximize the value of open government data for the public good. He can be reached at
