FAQ for COVID-19 Maps

What did you learn from this work?

I wrote a blog post on that.

Why do the case lines on the graph get cut off, i.e. why are the scales so small?

It's a tough balance. I wanted, most of all, for the scale to be easy to interpret. That meant the same scale for all the graphs, and not log-scale (because most people have trouble interpreting log-scale graphs).

That meant that if I scaled to show the peak of the worst hit places (New York), then you wouldn't be able to see detail in most other places. ¯\_(ツ)_/¯

Why are the death scales on the graphs 10x bigger than the cases?

Because if I used the same scale for deaths as I did for cases, then you wouldn't be able to see detail in the deaths at all.

Why is the death scale on the maps so low? It's hard to see much.

Because if I made the scale so that you could see detail in the deaths, it would look like New York had more deaths than cases in April.

To some extent, it's also a reflection of the fact that we got much better at treating cases pretty quickly, so there are not many deaths per capita now. ¯\_(ツ)_/¯

If there aren't many deaths per capita, shouldn't we just open everything up and let the disease rip?

No.

First, even if the death rate per capita is low-ish, there would still be a huge number of people who would die because there are so many people. With a 1% fatality rate and roughly 330M people in the US, that's 3.3M people who could die.
Second, COVID-19 is not at all pleasant and can have long-term effects even if you don't die.
Third, there is some evidence that people are dying less because of the distancing measures: that because of masks and distancing, people get lower doses of virus, so they don't get as sick. If we stop those measures, people will get sicker and die more.

What happened to deaths in New York City around 18 May 2020?

I don't remember exactly, but it was something to do with reporting. I think it might have been that before that day, they had not been counting deaths in nursing homes, so all the nursing home deaths until that day got dumped into the "18 May" bucket.

What happened to cases in Iowa around 19 Feb 2021?

They changed how they report. Instead of reporting cases, they are now reporting positive tests, so if one person gets tested three times, that would look like three cases. I am guessing that on Feb 19, they reported the cumulative number of positive tests, and that was hugely higher than the number of positive cases.

Note that because the graphs show the rolling seven day average, it looked like it was high for a week.
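As a rough illustration of why a one-day dump looks like a week-long bump, here is a minimal sketch of a trailing seven-day average. The daily counts are made up, and `rolling_mean` is a hypothetical helper, not the code behind the graphs.

```python
def rolling_mean(daily, window=7):
    """Trailing mean over the last `window` days (fewer days at the start)."""
    out = []
    for i in range(len(daily)):
        chunk = daily[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

# Hypothetical data: a steady 100 cases/day, plus a one-day dump of 7,100.
daily = [100] * 10 + [7100] + [100] * 10
smoothed = rolling_mean(daily)

# Days on which the smoothed value sits above the true baseline of 100.
elevated_days = [i for i, value in enumerate(smoothed) if value > 100]
print(elevated_days)
```

The single dump on day 10 falls inside seven consecutive windows, so the smoothed line stays raised for seven days before dropping back.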

What happened to cases in Missouri around 11 Mar 2021?

They added positive rapid antigen tests all at once. Before, they had only reported positive PCR cases.

What happened to cases in Alabama around 13 May 2021?

Alabama dumped backlog cases into the record on the 13th, 14th, and 15th of May 2021.

What happened in Nebraska around 30 June 2021?

Nebraska stopped publishing its numbers on its dashboard.

What happened in Nebraska around 25 September 2021?

Nebraska dumped a ton of data all at once. For example, Douglas County reported almost 11K cases for a population of 500K.

What happened to Florida data around June 6, 2021?

Their deaths data stopped getting into the Johns Hopkins database. People are still dying, A LOT, but they are not getting reported to whatever data source Johns Hopkins is using.

What happened to Florida data around November 26, 2021?

Florida is now reporting their numbers weekly instead of daily, and I suspect that they didn't report at all over the Thanksgiving holiday.

What happened to Florida data around January 14, 2022?

I don't know yet, but I suspect they dumped all their death data since June 6 into one day.

What happened to Maryland data around December 5, 2021?

Apparently that part of Maryland's IT got locked up by ransomware.

What happened to Vermont on 3 March 2022?

I don't know yet why, but a lot of cases throughout the state got dumped onto that day.

What happened to Nebraska on 28 Feb 2022?

I don't know yet why, but lots of cases got dumped onto that day all over the state.

What happened to Ada County, ID in Feb/March 2022?

As far as I can tell, genuinely high cases. They started going high in February and just kept going.

What happened to Polk County, IA on 2 March 2022?

As far as I can tell, genuinely high cases. (It's a little hard to tell because Iowa switched to weekly reporting around 7 July 2021, so I can't tell if a bunch of cases got dropped onto one day.)

What happened to Canyon County, ID on 2 March 2022?

As far as I can tell, genuinely high cases. (It's a little hard to tell because Iowa switched to weekly reporting on around 7 July 2021, so I can't tell if a bunch of cases got dropped onto one day.)

What happened to Pierce County, WA in late Feb 2022?

As far as I can tell, genuinely high cases. (It's a little hard to tell because Pierce seems to only report on Monday/Wednesday/Friday, so it's harder to tell if a large number of cases got dropped onto one day.)

Why are there a few counties in Utah with no data?

Some counties are so small that the Utah government was concerned about privacy, so it combined some counties into pseudocounties. This means the data for the actual counties registers as zero.

What happened to Nevada on 16 March 2022?

I don't know, but some counties had a huge positive spike, while other counties (like Lincoln and Douglas) had significant negative spikes. My best guess is that they changed how they were designating which county a case was in, e.g. changing from the county where the person lived to the county where they got tested.
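To see how a reassignment like that could produce paired spikes, here is a small sketch. It assumes the daily figures are derived by differencing cumulative totals, which the answer above does not state, and the two counties and their numbers are invented.

```python
def daily_from_cumulative(cumulative):
    """Day-over-day change in a cumulative series."""
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]

# Hypothetical cumulative totals, 10 genuinely new cases per county per day.
# On the fourth day, 50 earlier cases are reassigned from county A to county B.
county_a = [200, 210, 220, 180, 190]
county_b = [100, 110, 120, 180, 190]

daily_a = daily_from_cumulative(county_a)
daily_b = daily_from_cumulative(county_b)
statewide = [a + b for a, b in zip(daily_a, daily_b)]

print(daily_a)    # negative spike on the reassignment day
print(daily_b)    # matching positive spike
print(statewide)  # unchanged, since no cases were added or removed overall
```

In this toy case the county-level series spike in opposite directions while the statewide total stays flat, which is the pattern a change in county designation would be expected to leave.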

Why is this FAQ so ugly?

Because I spent my time and effort on making the graphs look nice and work well.

Why does that marker/graph start on Cook County?

It had to start somewhere. Chicago is relatively central, big, and I have some affection for the city.

Where did you get the COVID-19 data?

The COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University

Where did you get the polygons?

I probably got the Mercator polygons from the US Census Bureau (2008) version. (Sorry, I don't have the URL of where I got it, but county boundary files are currently available here.)

I made the cartograms using ScapeToad.

Why don't the cartogram polygons change as the population grows?

The population doesn't change that fast, and the relative population changes even more slowly. Furthermore, people are only counted once every ten years with the decennial census. For intermediate years, the Census Bureau just does a linear interpolation (last time I checked).

I used population estimates from 2012.
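Taking the answer's description at face value (a straight line between two census counts), an intermediate-year estimate could be sketched as below. The function name, the county, and its populations are hypothetical, and the Census Bureau's actual method may differ.

```python
def interpolate_population(year, year0, pop0, year1, pop1):
    """Straight-line estimate of population in `year` between two census counts."""
    fraction = (year - year0) / (year1 - year0)
    return pop0 + fraction * (pop1 - pop0)

# Hypothetical county: 500,000 people in one census, 550,000 ten years later.
estimate = interpolate_population(2012, 2010, 500_000, 2020, 550_000)
print(round(estimate))
```

Under this assumption a county growing 10% over a decade changes by only about 1% per year, which is why redrawing the cartogram polygons every year would make little visible difference.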

You are a Canadian, where's Canada?

I would love to have a cartogram of the US and Canada, but that's difficult. To make a cartogram out of a shapefile requires:

A shapefile. To make a cartogram of the US+Canada, I would need to merge a US shapefile with a Canadian shapefile. Also, the data for Canada is distributed at different granularities across Canada: some provinces report by province, some by health authority, some by health region. I would need to aggregate either specific shapes or specific case/deaths data.
Populations of all the jurisdictions, inserted into the shapefile for each jurisdiction. Because of the problem with non-uniform jurisdictions, this means a manual process and a lot of looking about for data.
Code to make the cartogram. I could write code to cartogrammize myself, but it would take a long time. Better to use what exists (a program called ScapeToad), but!
A long time. ScapeToad takes a very long time to run, and it uses a UI, so it isn't easy to run it on some rental server with beefy hardware (e.g. AWS). (All cartogrammizers take a long time to run; it's an iterative process.)
Furthermore, ScapeToad has problems with points near the poles, and essentially gets a divide-by-zero error with some of Canada's Arctic points in a Mercator projection. I think I would EITHER need to start with Lambert Conical projection, do a few iterations for it to collapse the territories into nothingness, then convert it to Mercator and then run that OR shift all of the US and Canada down by 40 degrees, do a few iterations to collapse the territories, then move it 40 degrees north and carry on.
Alternatively, I could just ignore the territories, since they have so few people and COVID-19 cases right now. However, I feel that it is disrespectful to ignore the territories, and while they don't have any COVID-19 cases right now, they might in the future.
The bottom line is that it's not straightforward to make a Canadian version.
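The trouble with Arctic points in a Mercator projection can be seen from the standard spherical Mercator formula, y = ln(tan(π/4 + φ/2)), which grows without bound as the latitude φ approaches 90 degrees. The sketch below only illustrates that formula; it is an assumption that this stretching is what trips up ScapeToad, and `mercator_y` is a hypothetical helper, not part of ScapeToad.

```python
import math

def mercator_y(lat_deg):
    """Spherical Mercator northing for a unit-radius globe."""
    lat = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + lat / 2))

# Mid-latitudes behave; values near the pole stretch out rapidly.
for lat in (45, 60, 75, 83, 89, 89.99):
    print(lat, round(mercator_y(lat), 2))

# Shifting everything 40 degrees south moves a point at 83 N down to 43 N,
# where the projection is far less stretched.
print(mercator_y(83) / mercator_y(43))
```

A point at about 83°N ends up more than three times as far from the equator on the map as the same point shifted down to 43°N, which is the motivation for the shift-by-40-degrees workaround described above.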

How did you make the web page?

I made the Mapzarf mapping framework many years ago for a similar project. I dusted it off a bit, augmented it to deal with time-series data, and adapted it for the COVID-19 mapping project. The Mapzarf framework uses PHP and MySQL.

I generate the tiles myself, using other code which I had developed a long time ago.

Why did you do this?

Because I kept being annoyed at graphs/charts which were only cumulative, and/or weren't normalized to population, and/or weren't smoothed, and/or weren't cartograms, and kept thinking, "Someone should do better maps!" Welp, I guess I am somebody.

Who did this?

Ducky Sherwood