Finding the parameters of a SIR model
Basic elements of a SIR model
In our model, each agent transitions among 3 states: Susceptible, Infectious and Recovered. At each epoch, an Infectious agent has a probability β of infecting its Susceptible neighbours, and a probability γ to transition to the Recovered state. From that moment on, it will no longer participate in the spreading of the disease. |
The connectivity between agents is modeled as a Watts-Strogatz small world random graph, a regular ring lattice of mean degree K where each node has a probability r of being randomly rewired. In our model, these parameters will be input-calibrated (i.e., fixed). |
# preparatory imports
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
from black_it.calibrator import Calibrator
from black_it.loss_functions.minkowski import MinkowskiLoss
from black_it.plot.plot_results import (
plot_convergence,
plot_losses,
plot_losses_interact,
plot_sampling,
plot_sampling_interact,
)
from black_it.samplers.best_batch import BestBatchSampler
from black_it.samplers.halton import HaltonSampler
from black_it.samplers.random_forest import RandomForestSampler
cmap = matplotlib.cm.get_cmap("tab10").colors
def plotSeries(title, SIR1, SIR2=None):
fig, ax1 = plt.subplots()
ax1.plot(SIR1[:, 0], "-", label="susceptible", color=cmap[0])
ax1.plot(SIR1[:, 1], "-", label="infected", color=cmap[1])
ax1.plot(SIR1[:, 2], "-", label="recovered", color=cmap[2])
ax1.tick_params(axis="y", labelcolor=cmap[0])
ax1.set_ylabel("susceptible per 1000 inhabitants", color=cmap[0])
ax1.set_xlabel("weeks")
ax1.legend(loc=5)
if SIR2 is not None:
ax1.plot(SIR2[:, 0], "--", label="susceptible", color=cmap[0])
ax1.plot(SIR2[:, 1], "--", label="infected", color=cmap[1])
ax1.plot(SIR2[:, 2], "--", label="recovered", color=cmap[2])
ax1.title.set_text(title)
fig.tight_layout()
plt.show()
def printBestParams(params):
with np.printoptions(suppress=True, formatter={"float_kind": "{:.3f} ".format}):
print(f" brktime beta1 beta2 gamma")
print(f"Best parameters: {params[0]}")
Simple calibration over synthetic data
Let's start by using black-it to recover the \beta and \gamma parameter of a SIR model from synthetic data.
Initialize a calibrator object
To set up a calibration one needs to define the following components first:
- a model to be calibrated
- a loss function to measure the distance between the real time series and the simulated time series
- a set of samplers that iteratively suggest a set of parameter values to explore
- the parameter space that should be explored
1. Model simulator
In order to use the SIR simulator, we need to download its docker image:
- Install Docker following the instructions given here
- Open Docker on your computer
- Run the following command in your terminal:
docker pull bancaditalia/abmsimulator
We can then proceed to generate the data we are going to try to reproduce via calibration.
from models.sir.sir_docker import SIR
true_params = [0.1, 0.1] # in general not known!
synth_data = SIR(true_params, 50, 0)
plotSeries("Synthetic data", synth_data)
2. Loss function
# import a quadratic loss, a simple squared difference bewteen the two series
from black_it.loss_functions.minkowski import MinkowskiLoss
loss = MinkowskiLoss()
3. Samplers
from black_it.samplers.best_batch import BestBatchSampler
from black_it.samplers.halton import HaltonSampler
from black_it.samplers.random_forest import RandomForestSampler
batch_size = 4
halton_sampler = HaltonSampler(batch_size=batch_size)
random_forest_sampler = RandomForestSampler(batch_size=batch_size)
best_batch_sampler = BestBatchSampler(batch_size=batch_size)
samplers = [halton_sampler, random_forest_sampler, best_batch_sampler]
4. Parameter space (bounds and precision)
bounds = [[0.001, 0.001], [1.00, 1.00]]
precisions = [0.001, 0.001]
Finally, initialize the Calibrator
from black_it.calibrator import Calibrator
saving_folder='sir-test'
# initialize a Calibrator object
cal = Calibrator(
samplers=samplers,
real_data=synth_data,
model=SIR,
parameters_bounds=bounds,
parameters_precision=precisions,
ensemble_size=1,
loss_function=loss,
saving_folder=saving_folder,
)
Calibration
Calibrate the model for five batches
params, losses = cal.calibrate(5)
# best parameters obtained so far
params[0]
# index of the best fitting series
idxmin = np.argmin(cal.losses_samp)
Compare the synthetic and calibrated series
plotSeries("Synthetic vs calibrated series", synth_data, cal.series_samp[idxmin, 0])
Calibration of a more advanced SIR model against realistic data
In this part of the tutorial we will use black-it to find the parameters of a SIR model fitted on the italian Covid-19 epidemiological data.
We will see that a proper modelling of the first wave of the epidemic requires the introduction of a structural break in the SIR simulator i.e., a specific point in time in which an abrupt change in the parameters occurs.
This is useful to model the effect of the lockdown over the spreading of the epidemic.
Load reference data
For didactic puposes, let's load a previously prepared dataset containing a very rough estimate of the SIR data for the first 20 weeks of the italian Covid-19 epidemic.
As the official data underestimates the number of cases, Susceptible and Recovered were rescaled by a constant factor.
real_data = np.loadtxt("data/italy_20_weeks.txt")
# plot the real time series we want to reproduce
plotSeries("Real data", real_data)
Initialize a calibrator object
1. Model simulator
from models.sir.sir_docker import SIR_w_breaks
2. Loss function
# we'll use a quadratic loss, a simple squared difference bewteen the two series
loss = MinkowskiLoss()
3. Samplers
sampler_batch_size = 16
samplers = [
HaltonSampler(batch_size=sampler_batch_size),
RandomForestSampler(batch_size=sampler_batch_size),
BestBatchSampler(batch_size=sampler_batch_size),
]
4. Parameter space (bounds and precision)
# brktime, beta1, beta2, gamma
bounds_w_breaks = [
[2, 0.1, 0, 0.1],
[7, 0.2, 0.1, 0.3],
]
precisions_w_breaks = [1, 0.0005, 0.0005, 0.0005]
Initialize the Calibrator
saving_folder = "output"
# initialize the Calibrator
cal = Calibrator(
samplers=samplers,
real_data=real_data,
model=SIR_w_breaks,
parameters_bounds=bounds_w_breaks,
parameters_precision=precisions_w_breaks,
ensemble_size=1,
loss_function=loss,
saving_folder=saving_folder,
)
Calibration
Perform 5 calibration rounds.
Note that, with these parameters, you would be able to achieve a much lower loss in 30 epochs.
params, losses = cal.calibrate(5)
# best parameters obtained so far
printBestParams(params)
Compare the original and the calibrated time series
idxmin = np.argmin(cal.losses_samp)
plotSeries("real vs calibrated", real_data, cal.series_samp[idxmin, 0])
Plots
plot_sampling(saving_folder)
plot_sampling_interact(saving_folder)
plot_losses(saving_folder)
plot_losses_interact(saving_folder)
plot_convergence(saving_folder)