Jupyter Notebook Example to Elaborate COVID Italian Data

Posted by MaX on January 25, 2021

Elaborate Covid-19 Italian Data

A very simple notebook example that shows how to load data from a CSV file, use pandas DataFrames to calculate a few extra columns with COVID-19 pandemic KPIs, and then display the results using Plotly.

# Import Global Libraries
import pandas as pd
import numpy as np
import io
import requests


Load Data

The data are downloaded from the Italian “Protezione Civile” COVID-19 data repository on GitHub: https://github.com/pcm-dpc/COVID-19

url='https://raw.githubusercontent.com/pcm-dpc/COVID-19/master/dati-andamento-nazionale/dpc-covid19-ita-andamento-nazionale.csv'
s = requests.get(url).text
df = pd.read_csv(io.StringIO(s))
df.tail()

data stato ricoverati_con_sintomi terapia_intensiva totale_ospedalizzati isolamento_domiciliare totale_positivi variazione_totale_positivi nuovi_positivi dimessi_guariti ... tamponi casi_testati note ingressi_terapia_intensiva note_test note_casi totale_positivi_test_molecolare totale_positivi_test_antigenico_rapido tamponi_test_molecolare tamponi_test_antigenico_rapido
330 2021-01-19T17:00:00 ITA 22699 2487 25186 510338 535524 -11535 10497 1781917 ... 29619436 16009790.0 NaN 176.0 NaN NaN 2397121.0 3477.0 29132944.0 486492.0
331 2021-01-20T17:00:00 ITA 22469 2461 24930 498623 523553 -11971 13571 1806932 ... 29899198 16102034.0 NaN 152.0 NaN NaN 2409616.0 4550.0 29296422.0 602776.0
332 2021-01-21T17:00:00 ITA 22045 2418 24463 492105 516568 -6985 14078 1827451 ... 30166765 16197394.0 NaN 155.0 NaN NaN 2422728.0 5493.0 29458875.0 707890.0
333 2021-01-22T17:00:00 ITA 21691 2390 24081 477972 502053 -14515 13633 1855127 ... 30431493 16279588.0 NaN 144.0 NaN NaN 2435519.0 6335.0 29608567.0 822926.0
334 2021-01-23T17:00:00 ITA 21403 2386 23789 475045 498834 -3219 13331 1871189 ... 30717824 16367107.0 NaN 174.0 NaN NaN 2447861.0 7324.0 29759716.0 958108.0

5 rows × 24 columns
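
Only a handful of the 24 columns are used in the following steps. As a minimal sketch, assuming you want to keep the DataFrame small, `pd.read_csv` can be told to load just those columns through its `usecols` parameter. The helper name `read_needed_columns` and the inline CSV text are hypothetical stand-ins for the downloaded file; the values are placeholders, not real data.

```python
import io
import pandas as pd

# Columns the notebook uses in later steps
NEEDED = ['data', 'terapia_intensiva', 'nuovi_positivi', 'deceduti', 'tamponi']


def read_needed_columns(csv_text):
    """Parse CSV text, keeping only the columns listed in NEEDED."""
    return pd.read_csv(io.StringIO(csv_text), usecols=NEEDED)


# Stand-in for requests.get(url).text; placeholder values, not real data
csv_text = (
    "data,stato,terapia_intensiva,nuovi_positivi,deceduti,tamponi\n"
    "2021-01-01T17:00:00,ITA,10,100,1,1000\n"
    "2021-01-02T17:00:00,ITA,12,120,3,1500\n"
)
small = read_needed_columns(csv_text)
print(small.columns.tolist())
```

With the real download, the same function could be called on `s` from the cell above.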

Build additional columns to calculate pandemic KPIs

The data are rescaled so that all series fit on a comparable scale and can be shown in the same chart.

Additional columns:

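\[infection.rate = 100 \times nuovi.positivi/nuovi.tamponi\]
\[infectedX1000 = nuovi.positivi/1000\]
\[deadX100 = nuovi.deceduti/100\]
\[criticalX100 = terapia.intensiva/100\]

Here nuovi.tamponi and nuovi.deceduti are the day-over-day differences of the cumulative tamponi and deceduti columns, and infection.rate is expressed as a percentage.
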
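# Replace missing values with 0 (fillna returns a new DataFrame, so assign it back)
df = df.fillna(0)
# Daily swabs and daily deaths, derived from the cumulative totals
df['nuovi_tamponi'] = df['tamponi'].diff()
df['nuovi_deceduti'] = df['deceduti'].diff()
# Share of daily swabs that were positive, as a percentage
df['infection_rate'] = 100 * df['nuovi_positivi'] / df['nuovi_tamponi']
# Rescale the remaining series so they fit on the same axis
df['infectedX1000'] = df['nuovi_positivi'] / 1000
df['deadX100'] = df['nuovi_deceduti'] / 100
df['criticalX100'] = df['terapia_intensiva'] / 100
# Override the infection rate for 2020-12-17 with 0
df.loc[df['data'] == '2020-12-17T17:00:00', 'infection_rate'] = 0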
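
One caveat with the infection rate: on a day when the cumulative swab count does not change, `nuovi_tamponi` is 0 and the division yields `inf`. A minimal sketch of one way to guard against that, using a hypothetical helper `add_kpis` and a tiny made-up frame (not real data), could look like this:

```python
import numpy as np
import pandas as pd


def add_kpis(df):
    """Return a copy of df with the KPI columns added (hypothetical helper)."""
    out = df.copy()
    out['nuovi_tamponi'] = out['tamponi'].diff()
    out['nuovi_deceduti'] = out['deceduti'].diff()
    # Treat days with zero new swabs as "no data" so the ratio is NaN, not inf
    swabs = out['nuovi_tamponi'].replace(0, np.nan)
    out['infection_rate'] = 100 * out['nuovi_positivi'] / swabs
    out['infectedX1000'] = out['nuovi_positivi'] / 1000
    out['deadX100'] = out['nuovi_deceduti'] / 100
    out['criticalX100'] = out['terapia_intensiva'] / 100
    kpis = ['infection_rate', 'infectedX1000', 'deadX100', 'criticalX100']
    # The first row has no previous day, so its differences are NaN; plot them as 0
    out[kpis] = out[kpis].fillna(0)
    return out


# Made-up numbers, only to illustrate the calculation
sample = pd.DataFrame({
    'tamponi': [100, 300, 300],
    'nuovi_positivi': [5, 20, 10],
    'deceduti': [0, 2, 5],
    'terapia_intensiva': [10, 20, 30],
})
result = add_kpis(sample)
print(result[['infection_rate', 'deadX100']])
```

Working on a copy keeps the original frame untouched, and filling only the KPI columns avoids overwriting missing values in unrelated columns such as `note`.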

# Display the dataframe types with the added columns at the end
df.dtypes
data                                       object
stato                                      object
ricoverati_con_sintomi                      int64
terapia_intensiva                           int64
totale_ospedalizzati                        int64
isolamento_domiciliare                      int64
totale_positivi                             int64
variazione_totale_positivi                  int64
nuovi_positivi                              int64
dimessi_guariti                             int64
deceduti                                    int64
casi_da_sospetto_diagnostico              float64
casi_da_screening                         float64
totale_casi                                 int64
tamponi                                     int64
casi_testati                              float64
note                                       object
ingressi_terapia_intensiva                float64
note_test                                 float64
note_casi                                 float64
totale_positivi_test_molecolare           float64
totale_positivi_test_antigenico_rapido    float64
tamponi_test_molecolare                   float64
tamponi_test_antigenico_rapido            float64
nuovi_tamponi                             float64
nuovi_deceduti                            float64
infection_rate                            float64
infectedX1000                             float64
deadX100                                  float64
criticalX100                              float64
dtype: object
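
In this listing the `data` column has dtype `object`, which means the timestamps are still plain strings. A small sketch, assuming you prefer real datetimes for filtering and for the x axis of the chart, converts the column with `pd.to_datetime`. The helper name `with_parsed_dates` is hypothetical; the two sample rows reuse values from the `df.tail()` output.

```python
import pandas as pd


def with_parsed_dates(df, column='data'):
    """Return a copy of df with the given column converted to datetime64."""
    out = df.copy()
    out[column] = pd.to_datetime(out[column])
    return out


sample = pd.DataFrame({
    'data': ['2021-01-22T17:00:00', '2021-01-23T17:00:00'],
    'nuovi_positivi': [13633, 13331],
})
parsed = with_parsed_dates(sample)
# Date-based filtering now works with comparison operators
latest = parsed[parsed['data'] >= '2021-01-23']
print(parsed.dtypes)
```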

Display Results

# import graphic libraries
import plotly.express as px
import plotly.io as pio


# Draw the graph
fig = px.line(df, x='data', y=['infection_rate', 'deadX100', 'infectedX1000', 'criticalX100'], title='Italy [Rate of infection, deaths/100, infected/1000, critical/100]')
# Write the chart to a PNG file (static image export needs an export engine such as kaleido)
pio.write_image(fig, file='covid_graph.png', format='png')

OPTIONAL: Save the graph as HTML

fig.write_html('Infection_Ratio_Chart.html', include_plotlyjs="cdn", auto_open=False)
fig.show()

COVID Situation Graph