Covid (v3) – data visualizations

WARNING. I am neither a medical doctor nor an epidemiologist. The analysis I am sharing here is only for curious data geeks. Please follow the advice of your national authorities and health system.

Let me publish this short update on the COVID pandemic to share some of the most interesting visualizations I found online and the latest update of the Matlab code for those of you who would like to play with the data. For some of us, keeping an eye on the data is a way to defuse anxiety about the pandemic, and it is in this spirit that I keep posting, now and then, comments on COVID. If data, trajectories and comments on the pandemic are stressing you and you are experiencing a COVID overload, perhaps you would prefer to read Prof. Aisha Ahmad’s article, particularly if you work in higher education, and skip this far less useful post of mine.

First, NextStrain is a tool that lets you monitor the spread of various infectious diseases, including SARS-CoV-2 (thanks to Quentin for sharing the link). You can appreciate how the virus is mutating and how different clusters/mutations are spread geographically. I should note that, to my knowledge, there is no evidence for differences in the severity of the disease depending on the mutations identified so far. These differences are currently used simply as fingerprints, or ‘paternity tests’, to trace the evolution and the spread of the virus. I avoid interpreting this data, but you can find reports on NextStrain. I assume that these maps are significantly biased by the amount of sequencing that each country is doing. You might appreciate the central role of China, of course, but also of the UK as a hub of transmission. This might be because London is an important aviation hub and because of the large amount of sequencing done in the UK. Sample numbers aside, if the timing and trajectory of the spread are confirmed in due time, it will be very interesting to understand the implications.

Before showing a bit of new Matlab code I worked on in the little spare time I had lately, I would also like to introduce the work that inspired my new analysis: a very nice tool by Aatish Bhatia and a great introductory YouTube clip by the same author.

Like most non-experts, I am interested in understanding the efficacy of the public health measures taken in different countries. As confirmed cases are rather unreliable, and are the likely cause of the big differences in mortality rates across countries (see previous posts), I currently focus only on reported fatalities. This data is more reliable but, again, it depends on the reporting criteria of each individual country, and these might change. Hence, the true impact on public health will be clear only retrospectively, in a year or so. For the time being, we can look at trends to get an impression of whether containment measures are working.

First, I would like to start with the raw data. Here I present groups of countries selected according to my own interests, but with the Matlab code available on GitHub you can generate the same graphs for the countries of your choice using the Johns Hopkins CSSE dataset. To analyse just the raw data, Bhatia’s tool is easier to use, but I show the results here to introduce my next analyses. Please note that the data is averaged over a sliding window of three days and that we usually get data with one day of delay. Therefore, the curves are representative of the situation as it was 2-4 days ago.
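For readers who want to see what the three-day averaging does, here is a minimal sketch in Python (not the Matlab code on GitHub). It assumes a trailing window applied to daily increments; the actual code may centre the window or smooth the cumulative series instead. The function names and the numbers are made up for illustration.

```python
def daily_from_cumulative(cumulative):
    """Convert a cumulative series into daily increments."""
    daily = [cumulative[0]]
    for previous, current in zip(cumulative, cumulative[1:]):
        daily.append(current - previous)
    return daily


def sliding_mean(values, window=3):
    """Trailing moving average (assumed window type).

    The first points are averaged over however many days are available.
    """
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up cumulative fatality counts, for illustration only
cumulative = [0, 1, 3, 6, 12, 20, 35]
daily = daily_from_cumulative(cumulative)
smoothed = sliding_mean(daily, window=3)

for day, (raw, avg) in enumerate(zip(daily, smoothed)):
    print(f"day {day}: raw {raw:3d}, 3-day mean {avg:6.2f}")
```

With a trailing window, each smoothed point summarises the last three days, which is one reason the curves lag slightly behind the latest reports.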

These graphs are difficult to interpret, as countries are not only experiencing different types of epidemics (for example, more or less localised) but also have rather different population sizes. However, these trends allow us to understand, broadly speaking, whether the policies enacted by individual governments are providing the expected results.

You might notice that I present two entries for China: one for China overall, and another just for the Hubei province. Here, the two graphs are almost identical because the Chinese outbreak started in Wuhan, within the Hubei province, and caused many more fatalities there. The black line is shown just as a reference to guide the eye. When the traces deviate significantly from this line, eventually tracing a horizontal line, confirmed deaths stop increasing exponentially. In that phase, the epidemic is still causing fatalities but it becomes more manageable and predictable. Most importantly, for those countries in lockdown, it is a clear sign that the strategy is effective. For those countries that are not in lockdown, for example the Republic of Korea (aka South Korea), such a trend implies that the epidemic is not resolved but controlled, striking a balance between socioeconomic sustainability and control of the epidemic.
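One way to put a number on "no longer exponential" is the doubling time: under exponential growth the logarithm of the counts grows linearly with time, so a straight-line fit gives a constant doubling time, which gets longer as a curve flattens. The sketch below is in Python, with made-up series and hypothetical function names; it is not how the graphs above were produced, and it assumes all counts are positive.

```python
import math


def log_slope(counts):
    """Least-squares slope of ln(count) against the day index."""
    days = list(range(len(counts)))
    logs = [math.log(c) for c in counts]  # counts must be > 0
    n = len(days)
    mean_day = sum(days) / n
    mean_log = sum(logs) / n
    num = sum((d - mean_day) * (v - mean_log) for d, v in zip(days, logs))
    den = sum((d - mean_day) ** 2 for d in days)
    return num / den


def doubling_time(counts):
    """Days needed to double; infinite if the series is not growing."""
    slope = log_slope(counts)
    return math.inf if slope <= 0 else math.log(2) / slope


# Made-up series, for illustration only
exponential = [10 * 2 ** (day / 3) for day in range(10)]  # doubles every 3 days
flattening = [100, 110, 115, 118, 120, 121, 121, 122]

print(f"exponential series: {doubling_time(exponential):.1f} days")
print(f"flattening series:  {doubling_time(flattening):.1f} days")
```

Applied to a short moving window of real data, a doubling time that keeps increasing would correspond to a trace bending away from the reference line.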

The second gallery is a collection of the very same plots, but normalized to the total population of each country. You might appreciate here why I report data for both Hubei alone and mainland China overall. I heard, amazingly even from virologists, that the data from China cannot be correct because such a large country cannot possibly have so few fatalities. At this stage, we can’t trust data from any country as fully reliable, but it should be clear that the epidemic was initially localized in Wuhan, Hubei. There it went out of control, but in the rest of China a combination of lockdown and tracking of patients made it possible to avoid an uncontrolled epidemic. This is why Hubei, with its 60M inhabitants, should be considered as the reference rather than all of mainland China, and why I am reporting the two curves.
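The effect of the choice of denominator is easy to see with a little arithmetic. In this Python sketch the death count is made up, the Hubei population is the ~60M quoted above, and the ~1.4 billion for mainland China is my own round figure (an assumption, not taken from the dataset).

```python
def per_million(deaths, population):
    """Fatalities per million inhabitants."""
    return deaths / population * 1_000_000


HUBEI_POPULATION = 60_000_000      # ~60M, as quoted in the text
CHINA_POPULATION = 1_400_000_000   # assumption: rounded figure for mainland China

example_deaths = 3_000             # made-up count, for illustration only

hubei_rate = per_million(example_deaths, HUBEI_POPULATION)
china_rate = per_million(example_deaths, CHINA_POPULATION)

print(f"per million, Hubei as denominator: {hubei_rate:.1f}")
print(f"per million, China as denominator: {china_rate:.1f}")
```

The same number of fatalities looks more than twenty times smaller when divided by the population of the whole country, which is why a localized outbreak is better compared against the region where it happened.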

Normalizing fatalities by population size, we can now appreciate how some countries in Europe are in a rather similar state at the moment, with, of course, plenty of exceptions. Another note on the reliability of data. All of us, even non-specialists like me, have learnt that ‘confirmed cases’ of COVID are a rather unreliable indicator because each country has a different capacity to test, particularly in different stages of the epidemic. Reported fatalities are a more robust indicator. However, different countries might adopt different methodologies to report ‘confirmed’ fatalities. Some countries make a distinction between people who died with COVID and people who died of COVID, but others do not. Some countries are faster than others in reporting, and deaths outside hospitals, anywhere, are likely to be counted with a significant delay, or sometimes not reported at all because of a lack of testing. Therefore, keep in mind that data from any country is not rock solid, and we will discover the real impact of COVID only at a later stage.

Most importantly, this is not a race between countries. Each of these numbers is a human life cut short and a complex network of relations broken. Therefore, my comments are offered as an attempt to understand what is happening from the standpoint of the layman I am in this context, with the deepest respect for what ‘fatalities’ means.

This noted, I wish to present the last gallery. Different countries have different demographics and the risks of COVID, as we know, are age-dependent. Therefore, I used the age-dependent mortalities inferred from mainland China (Hubei excluded, so this represents a best-case scenario) to provide a rough estimate of the population at risk in each country. For example, once adjusted for the different demographics, the average fatality rate would amount to about 1% in China, but about 2% in Italy and 1.7% in the UK. Notably, the mortality rate in Hubei was higher (~4%), and using this value for Italy we would expect ~9% mortality. However, many have noted that these values are unrealistic and depend heavily on testing capacity, which is constrained when a health system is overwhelmed. Most studies estimate fatality rates of around 0.5% to 1%. If 1% is the real mortality rate, to evaluate the population at risk in Hubei we can simply take 1% of the ~60M inhabitants. For all other countries, this proportion is computed taking into account the different number of inhabitants in each age band.
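The demographic adjustment boils down to a weighted sum: multiply the number of inhabitants in each age band by the fatality rate of that band, add everything up, and divide by the total population. The Python sketch below uses three made-up age bands, made-up rates and two made-up countries; none of these numbers come from the Chinese data or from the Matlab code, and they only show why an older population yields a higher average rate.

```python
def population_at_risk(population_by_band, rate_by_band):
    """Expected fatalities if everyone were exposed: sum of band size x band rate."""
    return sum(population_by_band[band] * rate_by_band[band]
               for band in population_by_band)


def average_fatality_rate(population_by_band, rate_by_band):
    """Demographics-weighted average fatality rate."""
    total = sum(population_by_band.values())
    return population_at_risk(population_by_band, rate_by_band) / total


# Made-up age-dependent fatality rates, for illustration only
rates = {"0-39": 0.001, "40-69": 0.01, "70+": 0.08}

# Two made-up countries with the same size but different age structures
younger_country = {"0-39": 60_000_000, "40-69": 30_000_000, "70+": 10_000_000}
older_country = {"0-39": 40_000_000, "40-69": 40_000_000, "70+": 20_000_000}

for name, population in [("younger", younger_country), ("older", older_country)]:
    rate = average_fatality_rate(population, rates)
    at_risk = population_at_risk(population, rates)
    print(f"{name} country: average rate {rate:.2%}, at risk {at_risk:,.0f}")
```

With identical age-specific rates, the country with more inhabitants in the oldest band ends up with a markedly higher average rate, which is the same mechanism behind the gap between the adjusted values for China and Italy.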

The take-home message is that the actions that governments decided to take are having the desired effects. While I might disagree with one policy or another, I thus invite people to follow the guidance provided by each country. However, it will be interesting to keep comparing the Netherlands and Sweden with other European countries, and China with other countries in South East Asia as, by choice or necessity, different strategies have been employed. Data from the UK and Sweden (and others) is rather noisy, with some temporal variation that might depend on the way data is reported. Therefore, it is too early to tell how the situation is developing in these two countries, but over the next week the picture should become clearer. We now have to watch out for the US and also the many countries that I have not studied so far but that will play an equally important role in the evolution of this disease.

Personally, I supported a fast initial response. While trying to shrink the epidemic, I hope that countries will now cooperate to share resources and save as many lives as possible. At the same time, I hope that countries will also cooperate in rebooting the world economy and productivity as soon as possible. We should not rush, so as not to waste all the work done, but we should have clear plans to re-emerge from lockdowns.

We are still adapting to this new reality. However, while supporting our societies through this public health crisis, as soon as we see those trajectories dropping (or before, if you can), we will have to quench the pandemic of hate that might break out between countries. We can do that only by resisting the populist trends we were already experiencing, dark energies that might be getting stronger than ever.

I hope, instead, that we will feel closer to each other: in our streets, in our nations, with our neighboring countries, but also with those far, far away who suffered or will suffer like any of us. Together, we can build a brighter future. Against each other, more and more lives will be lost.

Author: Alessandro

Please visit my website to know more about me and my research
