How we track Covid-19, with wastewater and other data sources

In the last few weeks, I (Betsy) have received a few questions about the national Covid-19 updates that we’ve been publishing every week at The Sick Times. Readers have been particularly interested in how I reference wastewater surveillance, a newer technology for tracking viruses and other health indicators that became prominent during the pandemic.

So, here’s my explanation of the data sources I use and how I approach writing these updates. I’ve focused on wastewater surveillance, but not exclusively — there are quite a few other sources out there! I hope this will be helpful for readers continuing to follow Covid-19 trends in their own communities as well.

First, a brief introduction for readers who might not be familiar with my work pre-Sick Times. I’ve been closely following Covid-19 data since early 2020. When the pandemic started, I was a staff data journalist at a company called Stacker. At the time, Stacker had a rather small team and I was the resident science/health journalist, so I quickly started managing the publication’s data-driven Covid-19 coverage. I also joined the COVID Tracking Project, the volunteer data-collection organization started by The Atlantic, and then became a volunteer leader of data compilation efforts for the core Testing & Outcomes dataset and Racial Data Tracker.

Through both of those positions, I immersed myself in Covid-19 data, learning about how health agencies were tracking this disease as scientific understanding and data systems quickly evolved. To share what I learned, I started the COVID-19 Data Dispatch newsletter and blog in summer 2020. When I left Stacker to freelance in spring 2021, editors who had read my newsletter asked me to write more articles about Covid-19 data, leading this topic to become my main beat for a couple of years.

In short, I’ve done a lot of research and reporting about Covid-19 data, which I draw on when writing weekly updates. I wrote these updates at the COVID-19 Data Dispatch for three years; as readers told me that they found the information helpful, I carried them over (with a couple of formatting tweaks) to The Sick Times.

Here are the main sources I use these days:

Wastewater surveillance data

Early in the pandemic, scientists learned that they could monitor SARS-CoV-2 by testing samples from public sewer systems. This technology predates Covid-19 — for example, some research teams were using it to track opioids — but it became hugely popular thanks to the coronavirus. In particular, wastewater data’s popularity picked up in the last two years, as case counts became less reliable and then vanished entirely.

Extensive research has shown that the levels of coronavirus measured in sewer systems (often presented as copies of the virus per milliliter of sewage) tend to rise and fall in time with disease spread in the community. Monitoring wastewater also has some key advantages over counting people who’ve tested positive: scientists can use just one sample to estimate disease levels for an entire city or county, meaning everybody in the water system is included even if they don’t have symptoms or access to healthcare.

However, wastewater data can’t be treated as a direct replacement for case counts. This is a common interpretation mistake that I see, particularly on social media: we all got used to knowing, with a lot of specificity, how many people were sick with Covid-19 in our communities at a given time, and we want to replicate those numbers using wastewater data. But first of all, case numbers were never perfectly accurate (since access to testing was always a challenge), and second of all, wastewater samples can be biased in totally different ways from cases. For example:

  • Not all public sewersheds are getting tested for SARS-CoV-2, and testing patterns vary a lot by region. In some states, every county has a testing site, while other states just have one or two. And millions of people aren’t plugged into public sewer networks at all!
  • Sewersheds include a lot of waste that doesn’t come from humans. Depending on the community, they could have animal waste, agricultural waste, runoff from rain and snow, and more. These factors can interfere with virus measurements.
  • Different people infected with SARS-CoV-2 likely shed different amounts of virus, depending on factors like their symptoms, how long they are sick, and which variant they have. Scientists are working to understand these factors better.
  • Given differences between communities, scientists like to compare viral trends from a specific testing site to that same site over time. But testing can be inconsistent, as health departments have changed the companies or research teams they work with, and research teams themselves update how they process and analyze samples.

Scientists who work with wastewater data understand these complexities and are usually cautious in how they describe wastewater trends. True infection rates could be lower than wastewater data make them appear, or they could be higher. This is why I typically reference other sources in addition to wastewater data, rather than relying on just one type of tracking.

Some research teams have developed frameworks to help communicate trends to the public, such as the CDC’s viral activity levels, while others are working on models that will estimate infection rates based on coronavirus levels in sewage. It’s important to understand that this analysis is a work in progress, as public health officials are still getting used to interpreting and using this newer type of data. I personally haven’t seen a public wastewater-to-infections model that I consider rigorously peer-reviewed enough to reference on a weekly basis, but I hope that this will change in the coming months.

As for where to find wastewater data, these are my primary sources:

  • The CDC’s National Wastewater Surveillance System: This is the most comprehensive wastewater surveillance network, with about 1,100 sites covering more than one-third of the U.S. population. The CDC compiles data from state and local health departments that have their own wastewater monitoring programs, as well as some sites tested by a private contractor. Since every health department and testing company has different techniques for processing and analyzing sewage samples, the CDC came up with a new metric, “viral activity levels,” to standardize results across sites. The agency essentially compares recent SARS-CoV-2 measurements from a given site to past measurements, then averages those comparisons up to state, regional, and national levels. This standardizing is a work in progress, which is why we often see the CDC retroactively updating past activity levels.
  • Biobot Analytics: Biobot is a startup focused on wastewater surveillance with its own network of Covid-19 testing sites. For about a year and a half, this company was also testing at several hundred sites in the CDC network as a federal contractor, and it included data from the CDC sites on its national dashboard. While that contract ended in fall 2023, Biobot is still testing at hundreds of sites in its own network. Biobot’s dashboard offers more consistency and longevity than the CDC’s, since it has standard testing protocols for all its sites going back to spring 2020.
  • WastewaterSCAN: This testing project started at Stanford and Emory Universities in early 2020, then expanded to a larger network (now including about 200 testing sites). Similarly to Biobot, WastewaterSCAN offers more consistent data, as it is testing at every site in the same way. This project also tests for several other common viruses in addition to SARS-CoV-2.
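
To make the "compare a site to its own past, then average" idea concrete, here is a minimal Python sketch. It is not the CDC's actual method: the function names, the use of log-scaled measurements, the z-score, and the plain average are all my own simplifying assumptions, and the numbers are made up for illustration.

```python
import math
from statistics import mean, stdev


def site_activity_score(history, recent):
    """Score one site's latest reading against that same site's past readings.

    history: past virus concentrations for ONE site (must be positive numbers)
    recent:  the latest concentration for that same site
    Returns how many standard deviations the recent reading sits above
    (positive) or below (negative) the site's historical average, on a
    log10 scale. Hypothetical helper, not an official formula.
    """
    if len(history) < 2:
        raise ValueError("need at least two past readings")
    logs = [math.log10(x) for x in history]
    spread = stdev(logs)
    if spread == 0:
        raise ValueError("past readings are all identical; cannot standardize")
    return (math.log10(recent) - mean(logs)) / spread


def aggregate_score(site_scores):
    """Average site-level scores up to a state, regional, or national level."""
    return mean(site_scores)


# Illustrative numbers only, not real measurements.
score_a = site_activity_score([100, 1000, 10000], recent=10000)  # above its own baseline
score_b = site_activity_score([10, 100, 1000], recent=10)        # below its own baseline
combined = aggregate_score([score_a, score_b])
print(score_a, score_b, combined)
```

Notice that site A's raw numbers are far higher than site B's, but each site is only ever judged against itself. That is why this kind of standardizing lets you combine sites that use different labs and methods, and also why the scores shift when a site's historical record is revised.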

While I primarily use these three national dashboards for my Covid-19 trends updates, I would also recommend looking for state or local wastewater dashboards covering your community. Local dashboards are often updated more frequently and tailored to their communities more specifically. You can find a list of some state and regional dashboards on the COVID-19 Data Dispatch site, and the COVIDPoops19 global dashboard is another good place to look for options near you.

Healthcare system data

After looking at trends from wastewater surveillance, I also look at data from the U.S. healthcare system. The CDC continues to track hospitalizations (with about 6,000 hospitals reporting to the agency) and some limited testing data for Covid-19, now using similar systems to its tracking of flu, RSV, and other common viruses.

I specifically look at:

  • Hospital admissions, or how many people with Covid-19 have been admitted to hospitals;
  • Emergency department visits, or what share of people going to hospital emergency rooms have been diagnosed with Covid-19;
  • Test positivity, or the share of Covid-19 tests returning positive results at laboratories in the CDC’s National Respiratory and Enteric Virus Surveillance System.
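
Two of these metrics are simple shares, which is worth spelling out since they are easy to confuse with raw counts. Here is a rough Python sketch; the helper name `share` and all of the numbers are hypothetical, chosen only to show the arithmetic.

```python
def share(part, whole):
    """Return part/whole as a percentage, or None when there is no denominator."""
    if whole == 0:
        return None
    return 100 * part / whole


# Illustrative numbers only, not real data.
# Test positivity: positive results out of all tests reported.
test_positivity = share(part=120, whole=1500)

# Emergency department share: visits with a Covid-19 diagnosis out of all visits.
ed_visit_share = share(part=45, whole=3000)

print(f"Test positivity: {test_positivity}%")
print(f"ED visits with Covid-19: {ed_visit_share}%")
```

Because both metrics depend on a denominator, they can move even when the number of sick people doesn't. For example, test positivity can rise simply because fewer people with mild or no symptoms are getting tested.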

Healthcare system data are a delayed indicator compared to wastewater, since it can take a few days or weeks after someone gets infected for them to become hospitalized. But these metrics are useful as they show how Covid-19 is leading to serious illness in communities and adding patients to hospitals and health clinics.

Along with Covid-19 metrics, I check the CDC’s flu data during the flu season, as this disease can also cause severe symptoms and burden the healthcare system. The CDC provides weekly flu surveillance updates, which include hospitalizations, testing data, and doctors’ visits for influenza-like illness (e.g., cough, fever, and other respiratory symptoms). Since Covid-19 and the flu can have similar symptoms in the acute phase, some influenza-like illness visits may be from Covid-19.

Modeling data

In addition to tracking Covid-19 and other diseases through directly counting sick people, the CDC also has scientific teams who estimate illness burden through modeling. One of these teams is housed at a new center, the Center for Forecasting and Outbreak Analytics, which I wrote about when it launched in 2022. The center is using data from hospitals to estimate future hospitalizations, as well as infection trends for Covid-19 and the flu; it’s also working on models based on wastewater data.

Another CDC modeling team focuses on tracking coronavirus variants. This team estimates how many infections are driven by specific variants, based on a select number of PCR test samples that laboratories analyze to identify their full genetic sequence. The CDC also tracks variants through wastewater monitoring and by testing international travelers returning to U.S. airports. (I don’t reference the travel data very often, but this source can be a helpful way to see what’s circulating internationally.)

No real-time Long Covid data

Many people who follow Covid-19 data tend to focus on hospitalizations and deaths as the “severe outcomes” that can come after an infection. But those metrics ignore another severe outcome, which is arguably the most common at this point in the pandemic: Long Covid.

Unfortunately, there is no current real-time data source for Long Covid cases in the U.S. Our best source is the CDC and Census Bureau’s Household Pulse Survey, which surveys U.S. adults about their experiences with Long Covid, but this survey is only conducted every couple of weeks and the data are reported with a significant lag. (As of today, the most recent data are from October 2023.)

The lack of up-to-date data on Long Covid makes this disease easy to minimize for many outlets. At The Sick Times, we don’t have better numbers to reference than any other reporters — but we can at least remind you, frequently and with urgency, that we know the current data do not capture the full scope of this problem.

All articles by The Sick Times are available for other outlets to republish free of charge. We request that you credit us and link back to our website.
