COVID | data analysis (new CAAT 4.3 release)

WARNING. I am neither a medical doctor nor an epidemiologist. The analysis I am sharing here is only for curious data geeks. Please follow the advice of your national authorities and health system.


I have just published a new release of CAAT, Matlab code to analyse the Johns Hopkins dataset on the SARS-CoV-2 pandemic. The usual caveat is that the reported figures are likely to be underestimates. Underestimation does not occur only because of a lack of transparency; most of the time it stems from differences across countries in the definition of COVID-related fatalities and in the efficiency of reporting systems. For example, we know from excess mortality statistics that about 30-50% under-reporting is to be expected, because deaths occur outside hospital settings or because people might die positive to COVID but not of COVID. This also accounts for the significant adjustment of statistics we saw earlier in Hubei. However, changes over time within a country might be more reliable and, therefore, there is still something to learn from this data.

I started analyzing data when the UK government decided to drop the policies set up for containment of SARS-CoV-2. Now that the data has been extensively discussed in the UK, I maintain CAAT only for others who would like to explore the data. For this release, however, it is worth mentioning just the main observation: while the first acute phase of the pandemic is subsiding in Europe and North America, it is now flaring up in South America at a worrying pace.

Relative fatalities in the population at risk for each country. This is a new visualization I provide. Both bar graphs are ordered according to the values shown on the left, i.e. the fraction of the population at risk who have already died. On the right is the weekly rate of fatalities during the current week and the preceding two weeks. The countries that have so far experienced high casualties show a significant reduction in weekly fatalities, which points to the positive effects of policies aimed at containing the virus.
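For readers who want to reproduce the two quantities in this figure outside Matlab, here is a minimal Python sketch. It is not the CAAT code: the function name, the input layout (a daily cumulative death count, oldest value first) and the example numbers are all assumptions made for illustration.

```python
def weekly_fatality_rates(cumulative_deaths, population_at_risk, n_weeks=3):
    """Return (fraction_died, weekly_rates) for one country.

    cumulative_deaths: daily cumulative death counts, oldest first.
    population_at_risk: expected deaths if everyone were infected.
    weekly_rates: new deaths in each of the last n_weeks weeks divided
    by the population at risk, oldest week first.
    """
    if len(cumulative_deaths) < 7 * n_weeks + 1:
        raise ValueError("need at least 7 * n_weeks + 1 daily values")
    fraction_died = cumulative_deaths[-1] / population_at_risk
    weekly_rates = []
    for k in range(n_weeks, 0, -1):
        end = len(cumulative_deaths) - 1 - 7 * (k - 1)  # last day of week k
        start = end - 7                                 # last day of the week before
        new_deaths = cumulative_deaths[end] - cumulative_deaths[start]
        weekly_rates.append(new_deaths / population_at_risk)
    return fraction_died, weekly_rates


# Made-up series: 10, then 20, then 30 deaths per day over three weeks.
example = [0]
for daily in [10] * 7 + [20] * 7 + [30] * 7:
    example.append(example[-1] + daily)

fraction, rates = weekly_fatality_rates(example, population_at_risk=42000)
print(fraction, rates)
```

Both numbers share the same denominator, so the cumulative bar and the weekly bars can be read on the same scale.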

By commenting out line 362, ‘d_ord = p_ord;’, you can order the weekly rate of fatalities according to the last week. This might reveal which countries are most at risk now. Several South American countries top this list. In Brazil, Ecuador and Peru, about 0.5% of the population at risk died during the last week and, sadly, in several countries this rate is accelerating.
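The two orderings, and a simple check for an accelerating rate, can be sketched in Python as follows. This is only an illustration and not how CAAT does it: the function names, the data layout and the three placeholder countries with their numbers are invented, and "accelerating" is taken here to mean that the weekly rate rose in each successive week.

```python
def order_countries(stats, by="cumulative"):
    """Return country names sorted from highest to lowest.

    stats maps a country name to (fraction_died, weekly_rates), where
    weekly_rates lists the weekly rates oldest first.
    by="cumulative" sorts on the fraction already dead;
    by="last_week" sorts on the most recent weekly rate.
    """
    if by == "cumulative":
        key = lambda name: stats[name][0]
    elif by == "last_week":
        key = lambda name: stats[name][1][-1]
    else:
        raise ValueError("by must be 'cumulative' or 'last_week'")
    return sorted(stats, key=key, reverse=True)


def is_accelerating(weekly_rates):
    """True if every week is strictly higher than the week before."""
    return all(b > a for a, b in zip(weekly_rates, weekly_rates[1:]))


# Placeholder countries with made-up values, for illustration only.
stats = {
    "A": (0.05, [0.004, 0.003, 0.001]),
    "B": (0.01, [0.002, 0.004, 0.005]),
    "C": (0.02, [0.002, 0.002, 0.002]),
}

by_cumulative = order_countries(stats, by="cumulative")
by_last_week = order_countries(stats, by="last_week")
accelerating = [name for name in stats if is_accelerating(stats[name][1])]
print(by_cumulative, by_last_week, accelerating)
```

Ordering by the last week pushes a country like the placeholder "B" to the top even though its cumulative toll is still the lowest.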
Let’s also focus on some good news. This is one of the several plots hard-coded in CAAT, comparing Italy, the UK, Germany, Denmark and Sweden. As is well established by now, new fatalities are dropping significantly and lock-down measures are gradually being abandoned in favor of social distancing measures. The individual outbreaks started at different times and grew at different rates. Comparisons between countries are therefore difficult, but the effectiveness of policies within each country can be evaluated. If you are interested in different graphs or comments, let me know, but I will not elaborate further at this stage.
This graph is a bit of a mess but I present it for completeness. In North America lock-down measures are having a clear effect. In South America we see alarming increases. As I do not follow South American politics and specific news, I can’t draw conclusions, but it is evident that, assuming no change in reporting occurred, Argentina has slowed down the outbreak but has not yet brought it under control.

Although I have explained this before, I should probably clarify how I evaluate the population at risk. To estimate how many people might die in each country in the (unrealistic) scenario where everyone gets ill, I took the age-dependent fatality rates published in The Lancet by Ferguson’s group and multiplied them by the number of people in each age group of each country, as reported by the UN, summing over all age groups.
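In other words, the population at risk is the sum, over age bands, of the number of people in the band times the fatality rate for that band. A minimal Python sketch of that arithmetic follows. The age bands, population counts and fatality rates are placeholders invented for the example; they are not the published estimates or real UN figures, and the function name is hypothetical.

```python
def population_at_risk(population_by_age, fatality_rate_by_age):
    """Expected deaths if everyone in the country were infected.

    Both arguments are dicts keyed by the same age-band labels:
    population_by_age holds head counts, fatality_rate_by_age holds
    the fraction of infected people in that band expected to die.
    """
    missing = set(population_by_age) - set(fatality_rate_by_age)
    if missing:
        raise KeyError(f"no fatality rate for age bands: {sorted(missing)}")
    return sum(
        population_by_age[band] * fatality_rate_by_age[band]
        for band in population_by_age
    )


# Placeholder values for illustration only (not real data).
population = {"0-39": 5_000_000, "40-69": 3_000_000, "70+": 1_000_000}
fatality_rate = {"0-39": 0.0005, "40-69": 0.005, "70+": 0.05}

at_risk = population_at_risk(population, fatality_rate)
print(round(at_risk))
```

Because the rates rise steeply with age, a country with an older population gets a larger population at risk than a younger country of the same size, which is why the figures are normalised this way instead of by total population.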

Author: Alessandro

Please visit my website to know more about me and my research
