Corona virus – data mining (v2)

WARNING. I am neither a medical doctor nor an epidemiologist. The analysis I am sharing here is only for the curious data geeks out there. Please follow the advice of your national authorities and health system.

NOTE: For a more comprehensive blog post, you might be interested in Tomas Pueyo’s website. A good discussion about mortality rates can be found on the CEBM website.

The trends

I have run a comparison between selected countries. I use Hubei (the Chinese province where Wuhan is located, in red) and Italy (in green) as references. The top plot in each figure shows confirmed cases and the middle plot shows covid-related deaths. Check the appendix for a discussion of the bottom plots (mortality rates). The Johns Hopkins (JH) data starts with Hubei on the day the province went into lockdown. I will often comment on the comparison between the UK and Italy as I am a British-Italian dual national and I started to follow the data to understand how to adapt to the situation.

Day 1 in these plots is relative to the start of Hubei tracking, which coincides with the day of Wuhan lockdown (red vertical line). The vertical green lines represent two key moments of the Italian response. The first is the local lockdown of towns where outbreaks in Northern Italy occurred, the second is the national lockdown. The vertical blue lines are key moments of the British response. The first one marks the day when the PM announced abandoning the containment strategy to pursue ‘herd immunity’, the second line marks the day when a UK nation-wide lockdown was implemented.

The comparison between Hubei, Italy and the UK is interesting as these are regions with similar total populations (~60M). Imported cases were identified at a similar time in both the UK and Italy, but the two countries then followed quite different trajectories. Until day 35, the UK seems able to track and isolate cases, after which local outbreaks are evident from the start of a steep exponential growth (linear on this log scale). Italy, instead, experienced a sudden (apparent) outbreak at day 30. The difference between Italy and the UK at this stage is that Italy shut flights from China (contrary to WHO indications) and kept testing only people with a declared travel history from China. This allowed the coronavirus to spread undetected, particularly in hospitals that did not trigger emergency procedures in time. It seems likely that the coronavirus was imported by people returning from China through Germany. However, it also seems likely that the spread was boosted by contacts between Germans and Italians, between strongly productive regions, completely bypassing the Italian monitoring strategies, which were too focused on China. From the trends of detected cases in Italy, one could extrapolate that the local epidemic started immediately and undetected as soon as the first imported cases were identified. The good British monitoring of the epidemic gave the UK two weeks of breathing time to prepare for the epidemic.

In my previous post, I mentioned the puzzle of Germany exhibiting low mortality and some reports about Germany counting covid-related deaths differently from other countries. Differences in counting are probably real, but I am now convinced that placing too much emphasis on this point is inadvisable. My impression is that much of the talk about this spreads fake news aimed at constructing a conspiracy theory about some countries hiding covid cases to protect their economies. Personally, I wish someone would clarify how deaths are counted, but I am inclined to interpret the data, at this point in time, as more realistic than some people might consider. Probably, Germany simply did a better job than the UK and Italy and delayed the local epidemic by three weeks compared to the latter.

We can browse the plots of different western European countries and notice similar trends, with different delays caused by the probability of importing cases and the capability of each country to track and contain imported cases. Amongst the countries I checked, Spain is the outlier, with a steep rise of covid-related cases that has been broadly described in the media and not yet fully explained.

In East and South East Asia (I checked China, Singapore, South Korea and Japan), the trends are very different from Europe. We can argue that different political and societal systems permitted a better response. In general, we can state that these countries responded fast and proactively. You might notice second waves of infections. Japan, Singapore and South Korea did not enforce national lockdowns but are containing the epidemic with careful tracking and strong mitigation measures. In a way, this is what the UK wanted to do. However, the UK, in my opinion, acted too late: by the time it announced the change from containment to mitigation, it was too late to adopt a strategy balanced between public health and the economy (the Asian model described above) and too late to fully contain the disease with low casualties. Therefore, western European countries will likely follow the Chinese trend: full containment followed, in due time, by the Asian-style response, unless a cure or vaccine is ready early.

Singapore is the outlier, however. We should consider that Singapore has only about 10% of the inhabitants of Italy or Hubei (Japan has twice as many, South Korea a similar number); it is, in fact, a high-density city-state. The death rate is very low… I wonder how the situation will evolve and how the data is collected. Back to Europe, we should keep an eye on Sweden as, if I understood correctly, the Swedish government did what the UK wanted to do, i.e. mild mitigation measures.

Let’s conclude this overview with the USA. We should keep in mind that in any country, the statistics are the sum of multiple local outbreaks. The USA has >300M inhabitants split across 50 states. For the time being, I just check the overall trends. And the trends are simply alarming. Of course, I neglected many countries, but you can freely use/adapt my Matlab code on GitHub to draw your own conclusions.

Next… let’s synchronize the curves.

Contrary to my previous blog post, I am now synchronizing the curves on the death statistics, as they are more realistic, while confirmed covid cases heavily depend on the capability of health systems to run tests across the non-hospitalized population. I used an arbitrary number of deaths, 40, sufficiently large to provide a robust synchronization but sufficiently low to precede containment actions by governments that would, of course, change the shape of the curves. Most East and South East Asian countries are not shown as they are containing the epidemic at the moment. With this data, you can now appreciate that the epidemic across European countries is very similar, with the caveat that any type of synchronization of data is wrong, certainly if not done with proper models and a study of local outbreaks. However, the general conclusions might not change.
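For the curious, here is a minimal Python sketch of this kind of synchronization. It is not my Matlab code, just an illustration of the idea: each cumulative death curve is shifted so that its day 0 is the first day at or above the threshold of 40 deaths. The function names and the toy numbers are made up for the example.

```python
from typing import Dict, List, Optional

DEATH_THRESHOLD = 40  # the arbitrary threshold quoted in the text


def first_day_at_or_above(cumulative_deaths: List[int], threshold: int) -> Optional[int]:
    """Return the index of the first day with cumulative deaths >= threshold, or None."""
    for day, deaths in enumerate(cumulative_deaths):
        if deaths >= threshold:
            return day
    return None


def synchronize(series: Dict[str, List[int]],
                threshold: int = DEATH_THRESHOLD) -> Dict[str, List[int]]:
    """Shift each curve so that index 0 is the first day at or above the threshold.

    Regions that never reach the threshold are dropped, much like the
    countries that are not shown in the synchronized plots.
    """
    aligned = {}
    for region, deaths in series.items():
        start = first_day_at_or_above(deaths, threshold)
        if start is not None:
            aligned[region] = deaths[start:]
    return aligned


# Made-up numbers, for illustration only (NOT real data).
toy = {
    "A": [0, 5, 20, 45, 90, 160],
    "B": [0, 0, 0, 10, 41, 80],
    "C": [0, 1, 2, 3, 4, 5],  # never reaches the threshold
}
aligned = synchronize(toy)
```

With the toy numbers above, region C is dropped and the curves of A and B both start from their first value of 40 or more, so they can be overlaid on a common relative time axis.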

Rolling back to the UK, Italy and Hubei, you can see that the British government announced it was abandoning the containment phase exactly when Hubei did the opposite (in relative time). After initial errors, Italy acted fast but had to play a chasing game with the outbreak. Its strategy was not working, and Italy had to enforce a nation-wide lockdown that started about 10 days later than Hubei’s. A nation-wide lockdown might not have been necessary, but the lack of compliance of the general population, some doctors insisting that coronavirus was no worse than the flu and some political parties trying to score points rather than supporting a common strategy eventually required a national lockdown that is now finally having the desired effects. We can still be concerned, however, about possible second waves of the epidemic, even in the short term, because of waves of internal migration from North to South, with people running away from the epicentre of the Italian outbreak, although the current lockdown might be able to quench new outbreaks. We can just wait and see for now. However, that delay and the initial lack of cohesion will result in several thousand preventable deaths.

In the UK, the first stage of the epidemic is still unfolding. It is interesting to observe that about a week after the British government decided not to take action, in order to reach ‘herd immunity’, the increase in deaths deviated from the original trend, slowing down. On the ground, we have noticed how a large majority of people, seeing what was happening in Italy, took action independently of government advice. Sports organizations cancelled events, universities started to close, people increasingly worked from home and several families started to withdraw children from schools. In effect, the British population started social distancing measures ahead of the Government, which eventually triggered a national lockdown at approximately the same (relative) time as Italy.

One issue that has been publicized in Italy but not in the UK is what I referred to as internal migration. The reason is simple. Paradoxically, in both countries this happened because people did not follow advice. In Italy, the Government asked people not to travel from the afflicted regions. People did not follow the advice, and when the government was preparing the regional lockdown, an opposition party leaked the measure to the press. People panicked and travelled to their holiday homes, and university students went back to their families, many travelling from North to South. In the UK, the Government did not ask universities to close, but universities did close and students were invited to go back to their families. This advice was sensible as we do not want to have students trapped in student accommodation. However, this happened without control and without a good sense of the status of the local epidemics. Therefore, while not broadly reported by the media because of how this happened, this large movement of people might contribute to the future dynamics of the UK epidemic, and possibly also in other countries.

It is just my opinion, not a scientific fact for now, that the UK wasted an incredible amount of time. While the NHS and the Government might have done everything right initially, all those efforts have been squandered. What I argue here is that there is no good strategy in these circumstances. Coronavirus will kill a lot of people and cause damage to the economy. Adapting a strategy is also important: new information might require a new strategy. However, contrary to the story depicted by the Government, there was no change in the science of the epidemic. A U-turn in policy can be costly as it might result in none of the positive outcomes desired with either strategy.

The only hope is that those countries that were able to delay the onset of local epidemics compared to Italy and Iran, either by chance (low exposure) or by good management, were able to scale up contingency plans and the availability of specialist ICU beds to keep the final number of deaths as low as possible. Why this was not arranged at the onset of the epidemic is something we will have to analyse in the future.

Soon, we will speak a lot about the USA. China has four times the population of the USA. China was able to contain the disease (for now) and to avoid widespread diffusion across the whole country. Outbreak after outbreak, the trends from the USA are accelerating and already overshooting both Hubei and China overall, with no sign of the epidemic slowing down.

Concluding remarks

In the next sections, I describe the methods I used and I provide a discussion about mortality rates. However, I wish to conclude my post here by stating that this might be the last time I post graphs, as by now there is so much data around, much of it from people with a background in epidemiology. However critical I might be of certain political decisions, I would like to be very clear on the following. The individual risk to people is comparatively low. The large majority of the population has more to fear from the socioeconomic repercussions of the pandemic than from the disease itself. The socioeconomic effects of nationwide lockdowns will impact people’s health. Mental health will deteriorate. Health systems will be weakened and therefore more people will suffer even if not infected. Governments all over the world are trying to guess what is best to do. My advice is to follow guidance from your national medical and governmental authorities. It might seem I am contradicting myself. I am not. I simply acknowledge there is no good solution to the problem. However, governments should speak the truth and clarify why they take certain decisions, at least in democracies. Governments should also work together, not against each other. Some of us feared a big war was brewing after the 2008 financial crisis, but we were not expecting a pandemic. Now that we have the pandemic, I hope we do not get both, and that this situation brings all peoples of all nations together. At the moment there are both good and bad signs. In Italy, we have a saying: “La speranza è l’ultima a morire”. Hope is the last to die. And we hope, we hope for more rainbows at the windows and fewer clouds on the horizon.

Take care, my friends.

The data | The data repository for the 2019 Novel Coronavirus Visual Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering is available on GitHub. To adjust mortality rates by local demographics, I have downloaded the population pyramid data from www.populationpyramid.net. There are several estimates for age-dependent mortality. I was able to find only the following pre-print for mortality in Hubei compared to the rest of China. The dataset analysed was small; let me know if you find something better.

The software | In my spare time, I prepared a bit of Matlab code that can import the JH data and does just two simple things: compare trends between different countries and compare age-adjusted mortalities. The code is available on GitHub. Keep in mind, sorry to repeat, this is just for the curious geeks.

Appendix – Mortality rates.

Let’s now discuss, briefly, mortality rates. This is a very complicated issue, not just for non-experts like me. We will have better estimates only much later. At the moment, there is no evidence for the existence of multiple strains of SARS-CoV-2 exhibiting different aggressiveness. There are of course different strains, as viruses constantly mutate, but the idea that some countries are more affected than others because of different strains seems, at the moment, to be just a way to justify their own shortcomings. Of course, we will understand this later.

Let’s consider Washington state (not shown here), which showed a very high mortality. This was the result of very little testing of the general population and a spike of deaths of elderly people arising from outbreaks in retirement homes. New York is at the opposite end of the scale as it was a more ‘standard’ outbreak. We noticed mortality rates going from very high to very low and bouncing back. These are all artefacts of sampling.

I re-propose here some graphs comparing demographics. First, the age-adjusted mortality. The blue bar for China at ~1% is the mortality rate in China outside Hubei. This, in my opinion, is a good measure, backed up by epidemiological studies. The WHO is setting this number at ~3.5% because at the moment they are just dividing confirmed deaths by confirmed cases, which is likely to overestimate the true rate.
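To see why dividing confirmed deaths by confirmed cases tends to overestimate, here is a small Python sketch with made-up numbers (they are not real data, and the function name is just for this example). The deaths stay the same, but the apparent rate drops once milder, untested infections are included in the denominator.

```python
def crude_fatality_rate(deaths: int, cases: int) -> float:
    """Crude rate: deaths divided by cases, with no correction of any kind."""
    if cases <= 0:
        raise ValueError("cases must be a positive number")
    return deaths / cases


# Hypothetical outbreak: 35 deaths, 1000 confirmed cases, and (ASSUMPTION)
# 3500 actual infections, most of them mild and never tested.
deaths = 35
confirmed_cases = 1000
actual_infections = 3500  # assumed for illustration, unknowable in practice

apparent_rate = crude_fatality_rate(deaths, confirmed_cases)      # 3.5%
underlying_rate = crude_fatality_rate(deaths, actual_infections)  # 1.0%
```

The gap between the two numbers depends entirely on how much of the non-hospitalized population gets tested, which is why the apparent rate varies so much between countries.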

The blue bars for other countries are adjusted to the different demographics of individual regions. Italy has an older population compared to most other countries and, therefore, higher mortality is to be expected. There are some reports suggesting that men are also more likely to die than women. Therefore, in the middle graph, I compare the demographics of Italy, the UK and Germany to China. The red bars in the first graph are the mortality rates inferred from Hubei, thus in a situation where the health system is overwhelmed. As you can see, in Italy we can expect apparent mortality rates of almost 10%. Indeed, at the moment this value has been exceeded. However, here the keyword is ‘apparent’.
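One simple way to adjust a mortality rate to local demographics is to weight age-specific fatality rates by the share of the population in each age band. The Python sketch below shows this; it is not necessarily what my Matlab code does, it assumes that every age band is equally likely to be infected, and the age bands, rates and populations are placeholders invented for the example.

```python
from typing import Dict


def age_adjusted_rate(population_by_age: Dict[str, float],
                      fatality_by_age: Dict[str, float]) -> float:
    """Population-weighted average of age-specific fatality rates.

    ASSUMPTION: infection risk is the same in every age band, so the
    population shares can be used directly as weights.
    """
    total = sum(population_by_age.values())
    weighted = sum(population_by_age[band] * fatality_by_age[band]
                   for band in population_by_age)
    return weighted / total


# Placeholder age-specific fatality rates (NOT real estimates).
fatality = {"0-39": 0.001, "40-69": 0.01, "70+": 0.08}

# Two hypothetical population pyramids (in millions, made up).
younger_country = {"0-39": 60, "40-69": 30, "70+": 10}
older_country = {"0-39": 40, "40-69": 40, "70+": 20}

rate_younger = age_adjusted_rate(younger_country, fatality)
rate_older = age_adjusted_rate(older_country, fatality)
```

With the same age-specific rates, the older population ends up with a noticeably higher overall rate, which is the effect behind the differences between the blue bars.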

Therefore, while I might still discuss COVID in this blog, I will probably stop speaking about mortality rates. Eventually, these numbers could be very low. Perhaps we might have ‘just’ 0.5% of the population at risk of death, maybe 1% in countries with older demographics. However, keep in mind that, as the WHO has always said, the issue is the overwhelming of health systems. The UK, Hubei and Italy have around 60M inhabitants each. This means that a ‘do-nothing’ policy would result in 300-600k deaths in each of them (40M world-wide). Or half of that if some sort of herd immunity were to protect us. To put this in perspective, ~500-600k people die in the UK every year (60M world-wide).
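The back-of-the-envelope arithmetic behind these figures can be written out in a few lines of Python. The national population and the 0.5% to 1% range come from the text; the world population of about 7.8 billion is my rough assumption for 2020 and gives roughly the 40M quoted above.

```python
POPULATION = 60_000_000            # UK, Italy or Hubei, approximately (from the text)
RISK_LOW, RISK_HIGH = 0.005, 0.01  # 0.5% and 1% at risk of death (from the text)
WORLD_POPULATION = 7_800_000_000   # ASSUMPTION: rough 2020 world population


def expected_deaths(population: int, fraction_at_risk: float) -> int:
    """Deaths if the whole at-risk fraction of the population were to die."""
    return round(population * fraction_at_risk)


national_low = expected_deaths(POPULATION, RISK_LOW)     # 300,000
national_high = expected_deaths(POPULATION, RISK_HIGH)   # 600,000
world_low = expected_deaths(WORLD_POPULATION, RISK_LOW)  # 39,000,000

# "Half of that" if some sort of herd immunity protected us.
national_low_halved = national_low // 2    # 150,000
national_high_halved = national_high // 2  # 300,000
```

These are crude upper-level estimates, not predictions: they ignore the time course of the epidemic and any effect of overwhelmed health systems on the rates themselves.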

Author: Alessandro

Please visit my website to learn more about me and my research: http://www.quantitative-microscopy.org
