FAQ for COVID-19 Maps
What did you learn from this work?
I wrote a blog post on that.
Why do the case lines on the graph get cut off, i.e. why are the scales so small?
It's a tough balance. I wanted, most of all, for the scale to be easy to interpret.
That meant the same scale for all the graphs, and not log-scale (because most people
have trouble interpreting log-scale graphs).
That meant that if I scaled to show the peak of the worst-hit places (New York), then you
wouldn't be able to see detail in most other places. ¯\_(ツ)_/¯
Why are the death scales on the graphs 10x bigger than the cases?
Because if I used the same scale for deaths as I did for cases, then you wouldn't
be able to see detail in the deaths at all.
Why is the death scale on the maps so low? It's hard to see much.
Because if I made the scale so that you could see detail in the deaths, it
would look like New York had more deaths than cases in April.
To some extent, it's also a reflection of the fact that we got much better at
treating cases pretty quickly, so there are not many deaths per capita now.
¯\_(ツ)_/¯
If there aren't many deaths per capita, shouldn't we just open everything up
and let the disease rip?
No.
- First, even if the death rate per capita is low-ish, there would
still be a huge number of people who would die because there
are so many people. With a 1% fatality rate and a US population of
about 330 million, that's 3.3M people who could die.
- Second, COVID-19 is not at all pleasant and can have long-term
effects even if you don't die.
- Third, there is some evidence that people are dying less because
of the distancing measures: that because of masks and distancing,
people get lower doses of virus, so they don't get as sick. If
we stop those measures, people will get sicker and die more.
What happened to deaths in New York City around 18 May 2020?
I don't remember exactly, but it was something to do with reporting.
I think it might have been that before that day, they had not been counting deaths
in nursing homes, so all the nursing home deaths until that day got dumped into
the "18 May" bucket.
What happened to cases in Iowa around 19 Feb 2021?
They changed how they report. Instead of reporting cases, they are now reporting positive tests,
so if one person gets tested three times, that would look like three cases. I am guessing that
on Feb 19, they reported the cumulative number of positive tests, and that was hugely
higher than the number of positive cases.
Note that because the graphs show the rolling seven-day average, the one-day jump looked like
a week of high cases.
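To see why a one-day dump shows up as a week-long plateau, here is a minimal sketch in Python. The numbers are made up, and the trailing window is an assumption; the site's actual smoothing code may differ (for example, it could use a centered window).

```python
def rolling_mean(values, window=7):
    """Trailing mean over the last `window` days (fewer days at the start)."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical daily new-case counts: a steady 100 per day, plus a
# one-day dump of 7,000 extra records on day index 10.
daily = [100] * 20
daily[10] = 7100

smoothed = rolling_mean(daily)

# Days on which the smoothed line sits above the true baseline of 100.
elevated_days = [i for i, v in enumerate(smoothed) if v > 100]
print(elevated_days)
```

The single bad day stays inside the seven-day window for seven consecutive days, so the smoothed line is raised for that whole stretch and then drops back to the baseline.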
What happened to cases in Missouri around 11 Mar 2021?
They added positive rapid antigen tests all at once.
Before, they had only reported positive PCR cases.
What happened to cases in Alabama around 13 May 2021?
Alabama dumped backlog cases into the record on the 13th, 14th, and 15th of May 2021.
What happened in Nebraska around 30 June 2021?
Nebraska stopped publishing its numbers on its dashboard.
What happened in Nebraska around 25 September 2021?
Nebraska dumped a ton of data all at once. For example, Douglas County reported almost 11K cases
for a population of 500K.
What happened to Florida data around June 6, 2021?
Their deaths data stopped getting into the Johns Hopkins database. People are still dying,
A LOT, but they are not getting reported to whatever data source Johns Hopkins is using.
What happened to Florida data around November 26, 2021?
Florida is now reporting their numbers weekly instead of daily, and I suspect that
they didn't report at all over the Thanksgiving holiday.
What happened to Florida data around January 14, 2022?
I don't know yet, but I suspect they dumped all their death data since June 6 into one day.
What happened to Maryland data around December 5, 2021?
Apparently that part of Maryland's IT got locked up by ransomware.
What happened to Vermont on 3 March 2022?
I don't know yet why, but a lot of cases throughout the state
got dumped onto that day.
What happened to Nebraska on 28 Feb 2022?
I don't know yet why, but lots of cases got dumped onto that day all over the state.
What happened to Ada County, ID in Feb/March 2022?
As far as I can tell, genuinely high cases. They started going high in
February and just kept going.
What happened to Polk County, IA on 2 March 2022?
As far as I can tell, genuinely high cases. (It's a little hard to tell because
Iowa switched to weekly reporting around 7 July 2021, so I can't tell if a bunch
of cases got dropped onto one day.)
What happened to Canyon County, ID on 2 March 2022?
As far as I can tell, genuinely high cases. (It's a little hard to tell because
Iowa switched to weekly reporting around 7 July 2021, so I can't tell if a bunch
of cases got dropped onto one day.)
What happened to Pierce County, WA in late Feb 2022?
As far as I can tell, genuinely high cases. (It's a little hard to tell because
Pierce seems to only report on Monday/Wednesday/Friday, so it's harder to tell if
a large number of cases got dropped onto one day.)
Why are there a few counties in Utah with no data?
Some counties are so small that the Utah government was concerned about privacy,
so it combined some counties together into pseudocounties. This means the data for the
actual counties registers as zero.
What happened to Nevada on 16 March 2022?
I don't know, but some counties had a huge positive spike, while other counties
(like Lincoln and Douglas) had significant negative spikes. My best guess
is that they changed how they were designating which county a case was in, e.g.
changing from the county where the person lived to the county where they got tested.
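Here is a rough sketch of how that kind of reassignment could produce matching positive and negative spikes. It assumes the daily numbers are computed as day-to-day differences of cumulative totals; the county names and counts are invented for illustration.

```python
def daily_new(cumulative):
    """Day-to-day differences of a cumulative series."""
    return [today - yesterday
            for yesterday, today in zip(cumulative, cumulative[1:])]


# Hypothetical cumulative totals for two made-up counties. On the last
# day, 50 existing cases are reassigned from county A to county B, while
# each county also gains 5 genuinely new cases.
county_a = [1000, 1005, 1010, 965]
county_b = [2000, 2005, 2010, 2065]

daily_a = daily_new(county_a)
daily_b = daily_new(county_b)
combined = [a + b for a, b in zip(daily_a, daily_b)]

print(daily_a, daily_b, combined)
```

County A shows a negative day, county B shows a big positive day, and the combined total looks perfectly ordinary, which is the pattern you would expect if cases only moved between counties.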
Why is this FAQ so ugly?
Because I spent my time and effort on making the graphs look nice and work well.
Why does that marker/graph start on Cook County?
It had to start somewhere. Chicago is relatively central and big, and I have
some affection for the city.
Where did you get the COVID-19 data?
The COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University.
Where did you get the polygons?
I probably got the Mercator polygons from the US Census Bureau (2008) version.
(Sorry, I don't have the URL of where I got it, but county boundary files are
currently available here.)
I made the cartograms using ScapeToad.
Why don't the cartogram polygons change as the population grows?
The population doesn't change that fast, and the relative population
changes even more slowly. Furthermore, people are only counted once every ten
years with the decennial census. For intermediate years, the Census Bureau just
does a linear interpolation (last time I checked).
I used population estimates from 2012.
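For illustration only, linear interpolation between two census counts looks like the sketch below. The county and its populations are hypothetical, and this is not the Census Bureau's actual estimation code.

```python
def interpolate_population(year, year0, pop0, year1, pop1):
    """Straight-line estimate of population in `year`, given counts
    for two census years `year0` and `year1`."""
    fraction = (year - year0) / (year1 - year0)
    return pop0 + fraction * (pop1 - pop0)


# Hypothetical county: 500,000 people in the 2010 census and 520,000
# in the 2020 census. Estimate the population for 2012.
estimate_2012 = interpolate_population(2012, 2010, 500_000, 2020, 520_000)
print(round(estimate_2012))
```

In this made-up case the county grows 4% over the decade, so the 2012 estimate is within a percent of the 2010 count, which is why stale population figures barely change the shapes in a cartogram.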
You are a Canadian, where's Canada?
I would love to have a cartogram of the US and Canada, but that's difficult. To make a cartogram out of a shapefile requires:
- A shapefile. To make a cartogram of the US+Canada, I would need
to merge a US shapefile with a Canadian shapefile. Also, the
data for Canada is distributed at different granularities
across Canada: some provinces report by province, some by
health authority, some by health region. I would need to
aggregate either specific shapes or specific case/deaths data.
- Populations of all the jurisdictions, inserted into the shapefile for each
jurisdiction. Because of the problem with non-uniform
jurisdictions, this means a manual process and a lot of looking
about for data.
- Code to make the cartogram. I could write code to cartogrammize myself,
but it would take a long time. Better to use what exists (a program called
ScapeToad), but that has its own problems.
ScapeToad takes a very long time to run, and it uses a UI, so it
isn't easy to run it on some rental server with beefy hardware
(e.g. AWS). (All cartogrammizers take a long time to run; it's
an iterative process.)
Furthermore, ScapeToad has problems with points near the poles,
and essentially gets a divide-by-zero error with some of
Canada's Arctic points in a Mercator projection. I think I
would EITHER need to start with Lambert Conical projection, do
a few iterations for it to collapse the territories into nothingness,
then convert it to Mercator and then run that OR shift all of
the US and Canada down by 40 degrees, do a few iterations to
collapse the territories, then move it 40 degrees north and
carry on.
Alternatively, I could just ignore the territories, since they
have so few people and COVID-19 cases right now. I feel that it
is disrespectful to ignore the territories, and while they don't
have any COVID-19 cases right now, they might in the
future.
The bottom line is that it's not straightforward to make
a Canadian version.
How did you make the web page?
I made the Mapzarf mapping framework many years ago for a similar project.
I dusted it off a bit, augmented it to deal with time-series data,
and adapted it for the COVID-19 mapping project.
The Mapzarf framework uses PHP and MySQL.
I generate the tiles myself, using other code which I had developed a long time ago.
Why did you do this?
Because I kept being annoyed at graphs/charts which were only cumulative, and/or weren't
normalized to population, and/or weren't smoothed, and/or weren't cartograms, and kept
thinking, "Someone should do better maps!" Welp, I guess I am somebody.
Who did this?
Ducky Sherwood