Episode Transcript
0:00
Let me show you how to be a good Bayesian... Today I am honored to host Adam Kucharski, a leading expert in infectious disease modeling and epidemic forecasting. Adam is a professor of infectious disease epidemiology and co-director of the Centre for Epidemic Preparedness and Response at the London School of Hygiene and Tropical Medicine. His research focuses on harnessing data and analysis to improve epidemic preparedness, and he has contributed real-time analysis to governments and health agencies during major outbreaks, including Ebola, Zika and COVID-19. In this episode, Adam takes us inside the world of epidemiological modeling, discussing how these methods help refine predictions and inform public health decisions. We explore the challenges of modeling infectious diseases, from data uncertainty to real-time forecasting, and the importance of communicating findings effectively to policymakers and the public. Adam also highlights common misconceptions about epidemiological data and dives into the world of automation and AI in epidemic response. This is Learning Bayesian Statistics, episode 130, recorded November 26, 2024.

1:18
Let me show you how to be a good Bayesian, change your predictions after taking information in. And if you're thinking I'll be less than amazing, let's adjust those expectations. What's a Bayesian? It's someone who cares about evidence. Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the projects, and the people who make it possible. I'm your host, Alex Andorra, like the country. For any info about the show, learnbayesstats.com is Laplace to be: show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on Patreon, everything is in there. That's learnbayesstats.com. If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io/alex_andorra. See you around, folks, and best Bayesian wishes to you all. And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can help bring them to life. Check us out at pymc-labs.com.
2:30
Hello my dear Bayesians, hope you're doing well. Two main announcements for today. First and foremost, thank you so much to all of you who sent testimonials about Learning Bayesian Statistics and Intuitive Bayes to support my green card application. I was genuinely touched, surprised, and moved by the number, kindness, and generosity of the messages. I am so happy and grateful to be surrounded by like-minded, kind, and helpful people like you. I received more than 40 testimonials. So thank you so much to all of you who've taken the time out of your busy day to write about a very nerdy endeavor and how it changed the field of statistics according to you. I will of course let you know what happens with my application, and I'm already planning a very special in-person surprise for all of you for February 2026, but more info will come in good time. So again, thank you so much, and in the meantime, live long and prosper, or shall we say, live Bayes and prosper.

3:38
And talking about prospering: if you are interested in baseball and Bayesian statistics and working in an MLB team, like, let's say, the Miami Marlins, so that would mean working with me, if you'd love that, getting to meet me and maybe working together in the baseball research and baseball solutions teams, well, now is the time. Our teams are growing fast, and this is your chance to get in on something that's very special. Honestly, I absolutely love what's going on there. And we've just opened two brand new roles: one baseball analyst for the Solutions Group, that's more geared to junior applicants, and a senior one, which is senior data scientist in the Research Group, that's my group. So if you are passionate about baseball research, love working on collaborative teams, and, well, bonus points, you know your way around advanced modeling approaches, like, for instance, Bayesian methods with PyMC and Stan, neural networks, time series, all that stuff, well, we definitely want to hear from you. I think what we're doing in Miami is some of the most exciting work in MLB right now. Let me know if you have any questions; if you're a patron of the show, you're on the Discord with me, so feel free to send me any of your comments, questions, recommendations of people. Send that to your friends, and maybe see you soon in Miami. On that note, let's go on with the show. Adam Kucharski.
5:15
Welcome to Learning Bayesian Statistics. Thank you. Yeah, thank you for taking the time and bearing with me. We've had a few technological problems to record that episode, folks, that was difficult, but, you know, as they say, the obstacle is the way, so we are here. And a big thank you to Chris Wymant, too, for putting us in contact. Chris and I met at StanCon 2024 in Oxford. Chris was actually in a panel discussion that I recorded with him and Liza Semenova; that was episode 120, for people who are curious and want to hear more about how cool epidemiological science and computational biology are. So feel free to check that out. With Adam today, we're actually going to tackle some similar topics, but you have a very wide and very broad research portfolio. So that's why I'm super happy to have you on the show today. You do so many things, and you are also a great science communicator. You've written several books, as I've said in the introduction. So yeah, all of these books will be in the show notes, so people, feel free to check them out. I definitely recommend them, they are absolutely fascinating. But before we touch on all of that, can you tell us what you are doing nowadays, Adam, and how you ended up working on this?

6:44
Yeah, sure. So, I mean, my work kind of bridges really understanding epidemics and getting better at predicting and responding to them. So there's a mix of aspects to that. Some of it is understanding the drivers of what we see in terms of how things spread. So for something like dengue fever, that might be climate influences or accumulation of immunity in the population; for something like, say, flu or COVID, the implications of vaccination over time, how those viruses evolve. Alongside that, we're also doing a lot of work to build up the methods and tools that we need to respond very quickly. During COVID, a lot of these things were developed very quickly, often just working over weekends, and can we actually do a lot better, particularly for the predictable questions that we know we're going to have to answer? In terms of how I got into it, my background's originally in maths; I did applied maths as my PhD. But even that was starting to go more in this direction of epidemiology and questions around how we can understand the process of infectious disease. And for me, as a field, it kind of sits quite nicely between something where having some mathematical and statistical understanding can give you a lot of value very quickly, but also there's enough unknowns about the underlying rules and processes. It's not something like physics where we have a lot more mechanistic understanding. So it means sort of squeezing a lot more out of the data you have available, particularly under pressure in a situation like an epidemic.

8:30
I see, okay, that's very interesting. So it's something that, yeah, was a long time coming, right? You've always kind of been interested in these topics, at least since your post-high-school studies, right?

8:44
Yeah, I think there's a lot of those questions where, yeah, obviously it has an enormous impact on people, but also just from a kind of curiosity point of view, these are quite hard questions, and often the methods that are in the textbooks don't quite work for what you need to do. So it's an ongoing, interesting research area to be in, because almost every epidemic you work in, what you thought you knew and what you thought you had solved suddenly doesn't quite hold. And so, yeah, it's kept me busy and kept me interested, along with many of my colleagues.

9:22
Yeah. And what about Bayesian stats? Do you remember when you were first introduced to them?

9:26
So I think it was particularly during my PhD. My kind of undergrad was a lot more traditional maths, so relatively little statistics, a lot more kind of theory; the sort of thing where you'd learn measure theory rather than coming at it from a data point of view. During my PhD, I started working a lot more with data, and especially if you start to look at processes involved in diseases and immunity, then the next question is, well, how could we estimate things meaningfully within that? And then particularly if you have patchy bits of data, or different studies that you want to combine in some way, or if you have some analysis you've put together and then you want to make statements about additional data coming in, Bayesian statistics are a natural framework for doing that. And so really, in the mix of my work, in some cases you just get simple probability problems and Bayes' formula is a nice get-out-of-jail way of solving that problem. And in other cases, it's more of a framework for thinking about the entire analysis: you want to be able to come to the best possible explanation, or the best possible evidence, at a given point in time, but then you want to update that as you get more data coming in, and again it gives you a very nice toolkit for doing that.

10:53
Yeah, yeah, that makes sense. That's very fascinating to hear, and it brings me a lot more questions for you. One of these questions that I have for you is: especially during COVID, Bayesian modeling became a crucial tool. I remember I did some work at my very low level at that time, but you did a bunch of work on that, and as you were saying, I imagine there were a lot of very short nights and very long weeks. Can you share specific examples of how these models were able to inform public health decisions?

11:31
Yes, I think there are quite a few instances, particularly around scenarios, where models can be a very useful kind of decision support tool. Because essentially, if you're going to make a decision about what to do in an epidemic, everyone has a model in their head: if you think, let's do this, let's try this, you're making some assumptions about what you think is going to happen, you're making some assumptions about how you think epidemics work. So models are a very nice way of allowing us to really write down and formalize what we think those processes are and what we think the interventions are going to do. And then we can debate whether that's reasonable or not, and we can see if that generates any counterintuitive effects. But in doing that, you really want to capture the extent of information you have about what you're dealing with. So even, for example, the question of how much effort is it going to take to get transmission down? One of the key things that is going to influence that is how much transmission there really is. So in a lot of the analysis that we did, being able to pass around uncertainty was really important. Very early in the pandemic, for instance, because we were relying on quite uncertain data from China, quite uncertain data about exported cases, we had some sense of what transmission would do in a country without any control measures, but there was quite a large amount of uncertainty on that reproduction number, probably somewhere between two and four. And so when we generated UK scenarios to present them, we didn't want to give one number; we wanted to say, look, we don't know whether it's actually more the upper end or the lower end of this, but we want to carry that uncertainty through into what we simulate.
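To make that concrete, here is a minimal sketch, with made-up numbers rather than the actual model Adam's group used, of how a range of plausible reproduction numbers (roughly two to four) can be carried through a simple SIR simulation so that the scenario output is a range rather than a single curve. The population size, recovery rate, and seeding below are all illustrative assumptions.

```python
# Sketch: propagate uncertainty in R0 into scenario outputs (toy SIR, toy numbers).
import numpy as np

rng = np.random.default_rng(1)

def sir_peak_infected(r0, gamma=1 / 5, n_days=365, pop=67e6, i0=100):
    """Daily-step SIR; returns the peak number of infectious people."""
    beta = r0 * gamma
    s, i, r = pop - i0, float(i0), 0.0
    peak = i
    for _ in range(n_days):
        new_inf = beta * s * i / pop
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        peak = max(peak, i)
    return peak

r0_draws = rng.uniform(2.0, 4.0, size=1000)   # crude prior over R0, "somewhere between two and four"
peaks = np.array([sir_peak_infected(r0) for r0 in r0_draws])
print(np.percentile(peaks, [5, 50, 95]))      # report a scenario range, not one number
```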
13:15
And then after that wave came through, you know, we had a lockdown like many other countries, there was this push to reopen, and then the question is, okay, so we've had that wave, what's the reopening going to look like? And again, that's where these kind of Bayesian methods were very helpful, because we can take the uncertainty, and perhaps we've got more confidence now about what we're dealing with, but then we want to pass that level of confidence into what's going on in the future. And then, once variants emerged, that became even more important, because often, for Alpha, we would have some degree of confidence about how much more transmissible it was. And in any analysis, for example, we did a lot of work early in 2021 on how quickly you're going to have to vaccinate. And I think there was a lot of pressure to lift lockdown and to try to understand that trade-off: if you're vaccinating at this rate and you're lifting lockdown this quickly, what's that going to look like? And again, none of those values you had very precisely, and you had more epidemic data coming in all the time. And particularly in that process of reopening in 2021, there was a very tight relationship between models and policy, because in the UK it was one of the few times where the policy was actually very informed by what was useful from a technical point of view. So what they did was they had this series of steps in the roadmap. It was designed to give enough signal in the data post each step that, if something was going very wrong, they wouldn't have implemented another step before you had that signal. So they deliberately spaced it out so the modellers and the epidemiologists would have enough time to work out what that relaxation had done. So again, from a Bayesian point of view, that's really nice, because it gives you enough time to update your posterior sensibly before you do the next step and work out what effect that's going to have.

15:08
Right, yeah, okay. I didn't know there was that level of coordination, where you could actually do that, you know, go there.

15:16
There were many examples where, yeah, it didn't work in such a coordinated fashion, but I think that roadmap reopening is one where there was a much tighter relationship between the questions coming down from policy and what, you know, models needed to say something sensible about: the implications.

15:38
Yeah, yeah. And I guess, I mean, I was also wondering, you know, how much do you think the pendulum has swung back from then? You know, like, do you think we'll be faster to implement these workflows and improve them next time there is a pandemic? Or will we have to work from a blank slate, because politics is so short-sighted, with very short, you know, cycles?

16:08
Yeah, I think that's a really good question. I think there are some things that have been positive in terms of progress. So in some of the work we're doing, and others are doing, in consolidating a lot of the tools that are available, there are some things now that we can look at as quick questions, say for H5N1: questions that I just wouldn't have bothered doing previously, just as a curiosity, because it would have been three or four hours just for a maybe question, and now in 20 minutes you can get a rough answer. So I think that's quite nice, it's bringing things into reach. But in terms of just staff capacity and people to do this work, I think there were a lot of people who put a huge amount of time in, often around the world, getting pulled off different roles and off other projects. And so in that sense we probably don't have the same workforce who could undergo that amount of pressure for that amount of time. So in a way that creates a necessity: we need better tools, because we don't have, I think, that volume of people; but also, for all our projects, we basically took people off a lot of other funded projects, and I think we wouldn't have the resources to do that in the same way now. So for me, what we're particularly interested in at the moment is: if you imagine the set of tasks you might have to do in an epidemic, some are tasks that a lot of other people have to do, and we can really predict ahead of time that you're going to need to do them. So stuff like working out the severity of infection or working out the transmission: we know we're going to need that done, we know lots of people are going to do that. So that's a really good task for automation, and for getting really leading-edge Bayesian methods standardized, so people who want to do that don't all have to duplicate effort. There are other things that will be really specific: maybe within a country there's a kind of subgroup that's particularly important, or there's a certain variant or something you might want to deal with, and that's going to require more domain knowledge. And I think at the moment, if you talk to people at different outbreak organizations around the world, they probably spend 60 or 70% of their time on quite low-level, predictable data-wrangling type questions, rather than the 30% of expertise-led questions they'd want. I don't necessarily think in a pandemic people are working less hard, because everyone wants to contribute as much as they can, but what I'd like to see is a lot more time on that 30%, and going much deeper into what we can understand, rather than spending a lot of time, you know, even just trying to get a handle on the data and the basic tasks to answer very simple questions.

18:55
Not sure I'm really reassured by that answer, but yeah, that's absolutely a realistic answer. Something I'm also really curious about, because I see that a lot as a modeler myself, is: what are the common misconceptions that you've seen the public, or elected officials, or even professionals have about epidemiological data, and what can the scientific community do to clarify these misunderstandings?

19:20
Yeah, I think that's a really good question. I mean, I think one common misconception is even just what data are as a thing. You know, I think often when people talk about raw data, they're talking about inferred estimates. So one example of this is excess deaths, which I think the media often treat as a measured thing rather than actually a counterfactual. And I remember talking to journalists about the comparison with flu; I think a lot of them discovered that the flu mortality statistics they'd used every year actually involved some quite big modeling assumptions about seasonal patterns behind them, and there isn't just this magic number that we measure. Right. I think similarly with a lot of the estimates: for example, in the UK they ran randomized testing surveys, randomly tested people in the community and reported it. And that number wasn't a raw proportion, because of course you needed to do some standardization across groups and weighting, and in some cases, because it was quite noisy across multiple panels, they would infer a kind of smoothed underlying proportion. So it was sensible modeling, but there was modeling behind that, and I think the misconception often is that these things are raw data. But often, yeah, even a very simple thing: if you run a survey and calculate a proportion, you're making some modeling assumptions. Or, you know, are we going to adjust for age? Why are you adjusting for age, why are you not adjusting for something else? You're making assumptions about what you think are important drivers of that thing you're measuring. So I think that's probably one thing we can get better at communicating: that these are actually model estimates. And actually a lot of the things that sometimes people treat as super complex models are actually just quite simple, straightforward steps for thinking about data. So even, you know, if you've got a bunch of hospitalizations over time and you want to know when the infections happened, you can use quite a simple deconvolution-type model just to take that and work out when the infections were.
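As a toy version of that step, my own illustration rather than the specific model Adam has in mind: given daily hospital admissions and an assumed infection-to-admission delay distribution, a naive back-projection spreads each day's admissions backwards over the delay to give a rough infection curve.

```python
# Sketch: naive back-projection from admissions to infection times (assumed delay).
import numpy as np

hosp = np.array([0, 1, 2, 5, 9, 14, 18, 20, 19, 15, 10, 6, 3, 1], float)
delay_pmf = np.array([0.0, 0.05, 0.15, 0.30, 0.30, 0.15, 0.05])  # assumed delay in days

infections = np.zeros(len(hosp))
for t, h in enumerate(hosp):
    for d, p in enumerate(delay_pmf):
        if t - d >= 0:
            infections[t - d] += h * p   # push this day's admissions back by d days

print(infections.round(1))               # rough estimate of when infections occurred
```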
21:34
And I think often, in public perception, when people talk about models, what they mean is very complicated scenario models that people assume are used to make forecasts. I think it's often seen as a kind of crystal ball: you use these big complex models to say what's going to happen. And I think in reality what happens is that often the more complex models are used for scenarios, for "what ifs", because we often can't make forecasts in pandemics, as well, because you'd be forecasting what policymakers are going to do. And if your models are used as tools to support their decisions, it becomes quite odd to use them as a forecast. I mean, the example I sometimes give is, like: at the point in time you want to understand the implications of your decisions, what you don't really want is someone on your shoulder saying, I bet you're going to fold later this round. What you want is someone who can say, look, this is likely the situation you're going to face, this is the risk you're taking on, if you fold, this is potentially what the outcomes are going to be. So that's really how a lot of these models are used, in kind of decision support. But I think in public consciousness, it's often "this is going to happen in the future". And I think part of it is communication, and particularly, I think, getting people to be able to play with the simple models can be very helpful, so they realize it's not this super complex thing. It's probably what they're doing in their head, but just written down in a bit more structured way.

23:20
Yeah, yeah. I find also, indeed, you know, walking through scenarios is something that's really helpful to people, because, well, they can imagine the scenarios, and that's much more, you know, tangible and concrete than numbers and posterior distributions. I can see that a lot. I mean, sports is a lot like that, especially baseball: lots of different discrete scenarios.

23:41
Yeah, and I think, you know, often people, particularly in epidemics, are kind of doing it in their head all the time: the epidemic's going up, it's going to be a problem, and then you get a few data points that seem to be tailing off a bit. And basically everyone's doing that updating in their head of where they think it's going. But I would have loved to have seen more politicians and journalists actually write down their predictions: you could get them to actually just write down their prediction and their kind of distribution of what they think is going to happen, and then see how it updates, and then you could actually give them a better understanding of what's going on in their head relative to what's actually possible given the data that was coming in.

24:32
Yeah, yeah. So I completely agree with that, and I really love that, but I think here the incentives are really bad for politicians, maybe, if you do that publicly. Then privately, I think...

24:47
Yeah, I think that's a really good point. I think getting politicians to write down what's going to happen, in public, is very difficult. But even privately, I think there were a lot of people probably not doing that who would have found it useful. Yeah, so, I mean, amongst colleagues and stuff, we sometimes just had little prediction games on things: when you think this study comes out, what do you think it's going to show? And I think it's sometimes quite revealing how much we like to kid ourselves. So even if it's just writing it down.

25:25
I'm completely on board with helping people develop more probabilistic thinking. You know, I think the work you're doing, and also the communication work you're doing, is very important, all your books. That's also why I have this podcast. People like Nate Silver writing books is very important too; I don't know if you read his last book, On the Edge, but that's very important too for that, right? And I think it'd be awesome if we were gearing towards that direction. But yeah, the problem is, like, the incentives in the public sphere and in politics are so bad when it comes to that, that you actually have a much better standing if you just, you know, say anything and everything and are not held accountable for it, than actually, you know, betting on something that would happen and then changing course if what you said would happen did not.

26:23
And then, you know, I think, partly from the communication side, there's also the question of what are the things that we can do meaningfully, and what are the things that you can encourage people to do versus what is just not going to be feasible. And I think also, just from a modeling point of view, yeah, sometimes there's this idea that for political decisions, you know, you should have this big model with absolutely everything in it, and all the kind of weights of how you do everything. And I can't see any political party wanting to do that, because ultimately those things are going to be weighed, not in a kind of written-down, you know, we're going to put 10% on this and 15% on this. And so I think it's working out, yeah, where the science can be really informative and where actually there's more of that kind of human, political element, and, you know, that's not the best battle to be fighting at this point.

27:21
Yeah, yeah, yeah, it's a great point. And actually, talking about your books: in The Rules of Contagion, you explore why things spread, and I really love that. So can you tell us a bit more about that? And how does Bayesian thinking help in understanding these patterns, especially for diseases?

27:42
Yeah. So I think what's striking: what I set out in the book thinking was that there would be these analogies in other fields, and I wasn't sure necessarily how strong they would be. But the more I dug into it, in many cases they were very explicit and very genuinely informative. For example, after the 2008 financial crisis, there was very clearly described epidemic thinking that drove a lot of the intervention responses: things like ring-fencing, things like capital requirements for banks that were risky in the network. It's really about thinking about it like a contagion problem. Similarly, if you dig into the history of companies like BuzzFeed that were very good at generating viral content, they were actually writing research reports on how you evaluate the reproduction number of marketing campaigns. And actually, what we discovered is that BuzzFeed journalists would have a measure equivalent to the reproduction number as a metric for their articles. So it wasn't just this quite fuzzy comparison; actually, it was the same bits of theory that were appearing between these two fields.

28:48
I think one thing I find quite interesting on the Bayesian angle, in how things spread, is particularly some of the debates around how you convince people, and how people adopt beliefs and they take off. Because there was quite a popular idea for a while known as the backfire effect, which is where, if you try and convince someone, you can end up basically just strengthening their existing belief, and this idea that attempts to change people's beliefs can kind of backfire also doesn't bode well for any kind of social progress, because it's this idea that if you try and convince someone that marriage equality or something is a good idea, it's just going to lead them to be more entrenched. But what subsequently happened, both on the applied side, of people actually getting support for this kind of progress, and also on some of the scientific side of people studying it, suggested it's actually much closer to a Bayesian problem: it's not that you're leading people to entrench their beliefs; rather, if you give people weak evidence, you're not going to shift their distribution much. I thought that was really interesting, and I hadn't actually thought about it quite that way. And it kind of makes sense that even if you've got quite a strong prior for something, if someone gives you evidence that agrees with that prior, it's going to look pretty similar afterwards, especially because we're not doing all those calculations in our head; we're just sort of seeing the feeling we've come away with. And it really struck me that actually, in those situations, we're probably much better at evaluating the effects of evidence that we disagree with, because, just mathematically, you'd expect the posterior to move more. And so perhaps the situations where what people thought were backfire effects are more just that we're better at critiquing evidence we disagree with than the evidence that kind of lines up, because if it agrees with us, we're going to walk away with the same opinion afterwards anyway.
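A toy Beta-Binomial update, my numbers rather than anything from the episode, shows the asymmetry Adam describes: starting from a strong prior, a handful of agreeing observations barely moves the posterior mean, while the same number of disagreeing observations moves it noticeably.

```python
# Sketch: strong prior, weak agreeing evidence vs. weak disagreeing evidence.
from scipy import stats

prior_a, prior_b = 20, 5                      # strong prior belief, mean 0.80
evidence = {"agreeing": (4, 0), "disagreeing": (0, 4)}

for label, (successes, failures) in evidence.items():
    post = stats.beta(prior_a + successes, prior_b + failures)
    print(label, round(post.mean(), 3))       # ~0.83 vs ~0.69, prior mean was 0.80
```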
30:57
So yeah, I think that was... it's obviously a much harder problem to study in terms of the spread of beliefs, and there are so many factors that can play into that, but yeah, there's still a lot of ongoing debate on the extent to which people's adoption of beliefs and behaviors is Bayesian versus, you know, some other factors that kind of explain how those are updated over time.

31:20
Yeah, I see. That's really amazing, I love that. I didn't know that example about... what's the website you were saying? Ah, BuzzFeed, yeah, yes, thank you.

31:33
It's remarkable, yeah. And how they came to it: they did, you know, campaigns about Hurricane Katrina and so on. I mean, none of these were viral in the strict sense, and that was one of the key findings: this wasn't like COVID where it just spreads and spreads and spreads, but, you know, for 10 shares they might get an extra seven or eight. So if you spark lots of little clusters of sharing, you might actually get quite considerable additional uptake as a result. I think there was one that was a marketing campaign for detergent, and it wasn't contagious at all, basically. So there was quite a nice quantification that there are certain things that people obviously want to tell people about, and others where, even if you've got the biggest marketing budget in the world, you're going to struggle to make washing-up liquid contagious.
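A back-of-the-envelope calculation, using a reproduction number of roughly 0.75 inferred from the "extra seven or eight per ten shares" figure, shows why those subcritical clusters still add up:

```python
# Sketch: expected downstream shares from subcritical sharing (geometric series).
R = 0.75                      # each share triggers ~0.75 further shares on average (assumed)
seeds = 10
extra = seeds * R / (1 - R)   # R + R**2 + ... per seed
print(extra)                  # ~30 additional shares downstream of 10 seeded shares
```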
32:28
You know, something I'm curious about, and I asked this question to Chris too at StanCon, is: concretely, what does it look like for you to work on an epidemiological model? Like, who are you talking to, and what's your workflow and technical stack?

32:44
Yeah, that's a really good question. Generally, anything we build starts with the problem, and often that problem comes from someone. If it's a policy question, it comes from the policy side, so maybe it's scientific advisors to certain agencies; in the case of an applied organization like MSF or WHO, it will come from a representative we're working with on maybe that outbreak or that situation. And there'll often be quite specific things that people are interested in. It might be a forecast of, actually, what are we dealing with? It might be that there's a plan to implement vaccination or a control measure. A lot of the work during Ebola, for example: different control measures were being proposed, and people wanted some idea of the relative impact that they would have. There's also, of course, just the scientific side. So for the work we've done around effects of immunity, it might be that you've got lab colleagues who've noticed some interesting features and want to work out, how can I get sufficient estimates out of my data? I mean, this is, I think, another example where Bayesian thinking is very helpful: you might have lots and lots of antibody responses, and you don't want to analyze them all as individual data sets, because there are going to be some commonalities in just how the biology works between individuals. So having these kind of hierarchical models can be very powerful, because you can have shared information on the underlying dynamics across the population, but you can also have individual-level features, based on what people have been exposed to previously and this sort of thing.
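Here is a minimal PyMC sketch of that hierarchical idea, an assumed toy setup with simulated log antibody titres rather than Adam's actual analysis: individual decay rates are partially pooled through a shared population-level distribution.

```python
# Sketch: partial pooling of antibody decay rates across individuals (toy data).
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_people, n_obs = 8, 6
t = np.tile(np.arange(n_obs) * 30.0, n_people)                  # days since infection
person = np.repeat(np.arange(n_people), n_obs)
true_decay = rng.normal(0.01, 0.003, n_people)
y = 8.0 - true_decay[person] * t + rng.normal(0, 0.3, t.size)   # simulated log titres

with pm.Model() as model:
    mu_decay = pm.Normal("mu_decay", 0.01, 0.01)                # population-level mean decay
    sd_decay = pm.HalfNormal("sd_decay", 0.01)                  # between-person variation
    decay = pm.Normal("decay", mu_decay, sd_decay, shape=n_people)
    peak = pm.Normal("peak", 8.0, 2.0)
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("obs", peak - decay[person] * t, sigma, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2)
```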
34:17
In terms of the kind of model, it depends a bit on whether it's similar to another problem you've seen. So in the case of a policy question, often what we'll do is, among the many, many models and things we've dealt with previously, find the thing that's most similar. And I think increasingly we're seeing progress in libraries. So a lot of the work we do is in R, or in some cases with a kind of Stan back end. And so you'll take something, maybe from the library, that you think is most appropriate, and then a model that's templated up that's closest to that, and then ideally you'll just plug in and it works straight away. Often you might either adapt some features of the model process, in terms of, say, how transmission happens or what groups are affected, or you might bring in different data: you might have a model set up for UK contact structure and you'll just import data from somewhere else so you can adapt it. In other cases, you might get something that's a very specific, quite neat question. So, for example, like the ones during COVID on testing people at certain types of gathering or something; that's not necessarily something you have a model for off the shelf, but in some cases the equation is quite simple to write down, you know, if you've got this many people and you're testing this many at this point in time, and in that case we might just build something more bespoke. The challenge always, like with any kind of software development problem, is to what extent do you do something quick for a very specific problem versus take on coding debt if you've got to do that problem repeatedly. And so I think we're getting a better sense now: there are certain things that are sufficiently complex, with large enough scope for bugs, and useful for enough people, that it actually makes sense to package them up as a more consistent library; and there are other things that are simple enough and transparent enough and kind of bespoke enough that you don't want to build a software tool for every single one of those. Some of those you can just do quickly as the problem arises.

36:28
Yeah, yeah, that makes tons of sense. And I'm wondering also about the size of the teams in these cases, because obviously each time I talk to a modeler in your field, it sounds like the models are really big and huge and take a lot of time to work on, because they are so complex. So yeah, I'm wondering, how many people does it take to work on a model, and how do you actually do that? Because, like, yeah, is everybody working on the model at the same time? Do you have some team for that part of the model, another team for the other part? How does that work?

37:09
Yeah, so, to give you an example of some of the big COVID scenario models that were used by our group in the UK: if you look at the git history of who's worked on it, it's probably, you know, at least 10 to 50 people who made substantial contributions to that codebase at various points in time. And in some cases... I mean, ideally we'd make these things as modular as possible. So early on, for instance, I had some colleagues who were focusing very much on the transmission dynamics, and kind of how people interact and what interventions were going to do, and I worked a lot more on the sort of disease burden module: once you have transmission and infections, you can convert those infections into an estimate of how many people are going to be hospitalized. So then you have that kind of basic model structure, but then over time that got expanded, because variants came in, vaccines came in, and you ended up with multiple versions of those models. And there's always this kind of challenge of, do you make one core model that has sufficient flexibility to do all those problems, or do you kind of fork a model and use it for a special case, and you're not going to use it again? That model has actually been used for multiple countries; we adapted it to a whole range of different settings, with different patterns of immunity and infection. And in that case, it didn't make sense just to build that into the original model, because it was just such a specific example. But I think that's one thing where, as well, if we had to go back now, because there are so many versions of the model and so many applications, because it was real time and we had to deliver that very quickly, it's obviously harder to now say how we would do that for a flu epidemic. And so I think what we're trying to move more towards is these very modular examples where you can plug in all the bits you need, but also have that capacity to adapt it. And I think that's kind of an ongoing challenge: you can have things that are very stable and structured but very hard to adapt, or things that are maybe very flexible and easy to adapt but not necessarily as efficiently structured as you'd like. I mean, there are examples as well in modeling where it might be one or two people developing something quite quickly, or just making use of a library. Some of the popular methods, for example, for estimating reproduction numbers: that tool will have been used by a huge number of people, but obviously the active contributors to the development might be a lot smaller.

39:52
Okay, yeah, I see. Yeah, so a big diversity in the size of the projects. And do you have a favorite type of model, actually, that you'd like to share with us?

40:04
So I think one of the models that has delivered a lot of value, in various things that we've worked on over the years, actually comes from some of the H7N9 analysis we did about 10 years ago for the outbreaks in China, where you had infections coming from poultry and potential human transmission as well. There was a big question of, looking at that human data, how much was coming from human-to-human infections. And we actually found you've got an analogous version of the problem with some of the COVID variants. So for Delta, how much of this was imported cases from India versus transmission establishing in countries? And in both those situations... so first of all, it's a bit of a modeling headache, because a lot of traditional models are structured in a way that basically says, if you've got your cases over time, the new cases that appear have to have been infected by one of those past ones. So if you look at a lot of the common calculations for how you do reproduction numbers, it's known as a generative model; in other words, your equation is a kind of self-generating version, where the new infections are the product of the infections that came before, and someone in there has caused that infection. If you've got importations or spillover, that's no longer the case: you've actually got this additional term coming into your equation. So it makes it a trickier inference problem, because you've got two groups that could be infecting someone. But it can also be more powerful, because in the case of avian flu we knew when the live poultry markets were closed, and in the case of Delta we knew when the travel ban against India was implemented. So what you've got, as a kind of estimation problem, is two things that influence your infections, but you know the shape of one of them, because you know when the markets closed, you know when the flight bans came in. And suddenly that gives you a lot more estimation power on the thing you care about, which is human-to-human transmission. So for me, that's just a really nice example of a model that's not super common, it's relatively easy to explain, it's just that there are two things that can infect someone, and that's it, but actually you can squeeze a remarkable amount out of your data as soon as you know the shape of one of those processes.
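A stripped-down sketch of that structure, my simplification rather than Adam's actual code: a renewal-type equation with an extra importation term, where local infections are generated from past infections through generation-interval weights, plus a known schedule of imported cases (here cut off abruptly, standing in for a travel ban or market closure) whose known shape is what gives the extra identifying power.

```python
# Sketch: renewal-style infection process with a known importation term (toy numbers).
import numpy as np

w = np.array([0.1, 0.3, 0.35, 0.2, 0.05])   # generation-interval weights for lags 1..5 days
imports = np.array([5, 5, 5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0], float)  # imports stop on day 6
R_local = 1.3                                # local human-to-human reproduction number (assumed)

infections = np.zeros(len(imports))
for t in range(len(imports)):
    past = infections[max(0, t - len(w)):t][::-1]         # most recent infections first
    local = R_local * np.sum(past * w[:len(past)])        # renewal term
    infections[t] = local + imports[t]                    # plus importation term

print(infections.round(1))
```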
42:31
Okay. And does that... I mean, it obviously comes with significant challenges, using these models. Can you detail some of the challenges that you face when creating such models?

42:40
Yeah, I think there's a whole mix. In some cases there's just understanding how you define a model: for a lot of outbreaks, there are a lot of things you'd like to know the role of in a process. So, for example, how different age groups are interacting or affected by certain things. But in some cases you might not have the data that you need to actually pick that apart. So a good example in COVID was: the epidemic would come down, and a lot of people would argue about why that was. And actually, if you just look at case data, or just look at deaths, it could be immunity, it could be changes in behavior, it could be something to do with the climate; it could be a whole range of different things that could influence that transmission, and just looking at one time series of outcomes, you can't distinguish what those are. You really need some data on antibodies, you need some data on social mixing, to tell you which of those is the most likely explanation. I think that's often a big challenge: where you have lots of potential explanations and you can't actually untangle them. And, to use H5N1 as a current example, unlike H7N9 and Delta, where we knew how the shape of the introductions was changing, we don't know that for H5. So now you get cases popping up, and they say they haven't had contact with poultry; maybe it's a wild bird, maybe it's a human, we've got no idea. And I think that's a kind of big challenge: almost, a model can't really tell you anything at this point, because the data is so uninformative about the process. There are lots of questions at the moment, and the available data is just too limited to say much. We can design some, you know, what-if hypothetical models, but I think that's much harder.

44:47
I think there's also just the technical challenge, in terms of making sure that the models you build are without bugs, and then also edge cases. Another classic one is that there can sometimes be subtleties, particularly if you're dealing with the kind of delay processes. So one example, which I think was more of a communication one: even if you have an epidemic that's going up and you suddenly stop it, and you have a delayed outcome like deaths, the peak in when the epidemic stops isn't the same as the peak in deaths, because you're doing a kind of delayed convolution; you're smoothing it out over a delay distribution. With a feature like that, you need to make sure that you've actually got that relationship defined properly. And again, this is, I think, why the move to a lot more established libraries matters, rather than trying to make sure these things are sort of bug-free in real time.
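A small numerical sketch of that convolution point, with invented numbers: infections grow and are then cut off abruptly, but the deaths curve, which is the infection curve convolved with an assumed infection-to-death delay, peaks later and more smoothly.

```python
# Sketch: a delayed outcome (deaths) peaks later than the abrupt stop in infections.
import numpy as np

infections = np.concatenate([np.geomspace(1, 1000, 20), np.zeros(40)])  # growth, then abrupt stop
delay = np.exp(-0.5 * ((np.arange(30) - 18) / 5) ** 2)                  # assumed ~18-day delay
delay /= delay.sum()

deaths = np.convolve(infections, delay)[: len(infections)] * 0.01       # assumed 1% fatality
print("infections peak on day", int(np.argmax(infections)))
print("deaths peak on day", int(np.argmax(deaths)))                     # noticeably later
```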
45:41
Okay, yeah, yeah. And a question I often get, you know, personally, and I'm guessing you are getting it a lot too, is: okay, cool, I get uncertainty estimation, you know, with the Bayesian models. Why do I care?

45:55
So I think a lot of what you want to do, particularly decision making, comes down to confidence and evidence. Particularly in the middle of an epidemic, there are a lot of things we don't know with any confidence. So I guess on the one hand you could just ignore it and go for, you know, the point estimate: is it going up, is it going down? But particularly if you want to ask, say, is the epidemic under control, it's not very helpful to have a yes-no answer; you might want to say, yes, it's under control, but how confident are you that it is? And again, having that uncertainty in a reproduction number, if you're like, well, 100% of our density is below one... And that's what we had post-lockdown, once social mixing went down, for example, in the UK: there was uncertainty in that distribution, but all of that density was below one. So the conclusion was, we're very confident that transmission is coming down.
46:58
we found that we found that
47:00
some of the variants we found
47:02
that you'd get uncertainty in the
47:04
estimates. So maybe it's 30% maybe
47:06
it. be very confident it is
47:08
more transmissible and I think it's
47:11
a difficult one because policy makers
47:13
sometimes love you know single answers
47:15
they don't want a kind of
47:17
vague but I think particularly if
47:19
your uncertainty lands either side of
47:21
a particular threshold that matters that
47:23
can in a way give you
47:25
more confident communicating of saying yeah
47:27
look this is definitely going up
47:29
or this is definitely almost definitely
47:31
on a different crop. Just it's
47:33
similar, you know, the clinical problem,
47:36
you look at the conference into
47:38
book, I think it's the equivalent
47:40
of that, and you want to
47:42
know, you know, how much can
47:44
you rely on this estimate? And
47:46
I think that's where the uncertainty
47:48
really comes in. Yeah, yeah, so
47:50
basically making sure that you're not
47:52
fooled by the variance of the
47:54
processes. Yeah, and I think especially
47:56
when you're dealing with exponential processes,
47:59
that you know, exponential processes, that
48:01
you know, exponential processes, that you
48:03
know, that you know, I was
48:05
going to accumulate. So there's something
48:07
that might feel quite small of
48:09
the transmission rates this. And actually
48:11
if it's slightly higher or slightly
48:13
lower, you get a very different
48:15
outcome. Even something, you know, say
48:17
it's a very simple example, each
48:19
person affects two others and you
48:22
cut transmission in half. So, you
48:24
know, each person affects one other,
48:26
just on one other, a tiny
48:28
amount of uncertainty there after, you
48:30
know, two months could be the
48:32
difference between a massive epidemic and
48:34
a handful of cases. you know
48:36
if you don't communicate that people
48:38
be asking well was a massive
48:40
epidemic when you said it would
48:42
it would be allowed based on
48:44
the best estimate. Yeah yeah yeah
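The arithmetic behind that is worth seeing once; a toy calculation of mine, assuming roughly five-day generations over two months:

```python
# Sketch: cumulative cases after ~two months for nearby values of R.
import numpy as np

def total_cases(R, generations=12, seed_cases=10):      # 12 x 5-day generations ~ 2 months
    return seed_cases * np.sum(R ** np.arange(generations + 1))

for R in [2.0, 1.1, 1.0, 0.9]:
    print(f"R = {R}: {total_cases(R):,.0f} cumulative cases")
```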
48:47
that's definitely a good point in
48:49
I'm curious also to have your
48:51
to hear your thoughts about you
48:53
know the latest advancement step we've
48:55
seen in the last year in
48:57
artificial intelligence and large language models
48:59
because I'm guessing this is going
49:01
to have also an impact hopefully
49:03
for the best on epidemiology. So
49:05
yeah with these advancements how do
49:07
you see the future of the
49:10
field of epidemiology and and the
49:12
role of patient stance in it?
49:14
Yeah I think it's a huge
49:16
amount of They're really promising development.
49:18
I think a lot of it
49:20
at the moment that we're working
49:22
on in most others is trying
49:24
to find where do these solutions
49:26
work. I think essentially we've been
49:28
given this increasingly amazing toolkit, but
49:30
in many cases it wasn't necessarily
49:32
developed exact problems we're working on.
49:35
And so finding where the applications
49:37
work is going to be very
49:39
powerful, where are the ones that
49:41
is going to struggle more. And
49:43
so even... It's a bit of
49:45
AI or some more traditional machine
49:47
learning approaches that there might be
49:49
situations where if we're very interested
49:51
in the mechanism and we've got
49:53
some process we define our model,
49:55
perhaps that estimation or that prediction
49:58
approach. isn't optimal for actually what
50:00
we're trying to solve. But I
50:02
think equally, the pandemic re-showed, there
50:04
was a huge amount of data
50:06
out there that in many cases
50:08
we weren't able to interpret in
50:10
a meaningful way. So you might
50:12
have something like a social contact
50:14
and if it got by a
50:16
certain amount, but you put that
50:18
in a model and a very
50:21
meaningful change. But you might have
50:23
quite a lot of very noisy
50:25
data, which is giving some indication
50:27
about how people are behaving. But
50:29
you can't extract those features and
50:31
weigh them in a useful way
50:33
in the same sense that an
50:35
AI model could do. So I
50:37
think it's in a way to
50:39
find the range of problems. I
50:41
think broadly the challenge we have
50:43
for our works often that they're
50:46
quite rare events. So even what
50:48
do you what do you validate
50:50
against and what and what's your
50:52
input and what you kind of
50:54
outcome you kind of predict. But
50:56
I think within a epidemic especially
50:58
you get larger clinical data sets,
51:00
larger behavioral data sets of a
51:02
lot of value there. I think
51:04
I think also some of the methods like universal differential equations, and other things coming through, where you take that sort of transmission-model structure but then incorporate things like neural networks to allow for more complexity in understanding the patterns going on alongside it; I think there's a lot of really interesting progress there.
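For readers curious what "transmission-model structure plus a neural network" can look like, here is a minimal sketch. It is not from the episode and not a real universal-differential-equation library; it just forward-simulates an SIR model in which a tiny hand-initialized network (untrained, with made-up weights) modulates the transmission rate. In a real universal differential equation workflow, the network's weights would be fitted to data alongside the mechanistic parameters.

```python
# Illustrative sketch only: an SIR model whose transmission rate is modulated by a
# small neural network. Weights here are arbitrary; in a real UDE they are learned.
import numpy as np

def nn_modifier(t, weights):
    """Tiny one-hidden-layer network mapping time -> positive factor on beta."""
    w1, b1, w2, b2 = weights
    hidden = np.tanh(w1 * t + b1)            # shape (n_hidden,)
    return float(np.exp(w2 @ hidden + b2))   # always positive

def simulate_sir_ude(beta0=0.4, gamma=0.2, days=120, dt=0.1, N=1.0):
    # Arbitrary, untrained weights for illustration
    weights = (np.array([0.05, -0.02]), np.array([0.0, 1.0]),
               np.array([-0.3, 0.1]), 0.0)
    S, I, R = 0.99 * N, 0.01 * N, 0.0
    trajectory = []
    for step in range(int(days / dt)):
        t = step * dt
        beta_t = beta0 * nn_modifier(t, weights)   # mechanistic rate * learned correction
        new_inf = beta_t * S * I / N * dt
        new_rec = gamma * I * dt
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        trajectory.append((t, I))
    return trajectory

peak_t, peak_I = max(simulate_sir_ude(), key=lambda x: x[1])
print(f"Peak prevalence {peak_I:.3f} at day {peak_t:.0f}")
```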
51:25
I think also, more generally in the field, there are some of these AI models that are essentially learning features; weather forecasting is one example, where it's many, many times faster and less energy-intensive than running the simulations. So again, you can find ways of approximating more complex models. And then just in terms of the data that comes in: a lot of outbreaks are often described in narrative reports, where it's a few paragraphs of so-and-so did this, went here, and did that. One of my colleagues did some nice prototyping on this, with small local models that can take those quite difficult-to-interpret narratives and convert them into structured data, which you can then...
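As a toy illustration of the "narrative report to structured data" idea (this is not the colleague's prototype, no LLM is called, and all names, fields, and values are made up), the sketch below only shows the shape of the task: a free-text outbreak note, a target record structure, and a parser for the JSON an instruction-tuned model would be asked to return.

```python
# Toy sketch of turning a narrative outbreak report into structured data. No LLM is
# called here; the "model_response" string stands in for what a local instruction-tuned
# model would return when prompted to extract these fields as JSON.
import json
from dataclasses import dataclass
from typing import Optional

@dataclass
class CaseRecord:
    case_id: str
    symptom_onset: Optional[str]    # ISO date if mentioned
    locations_visited: list[str]
    contacts_named: int

narrative = ("Case A12 developed a fever on 3 March, travelled to the market in "
             "Kambia on 5 March, then visited two relatives before being admitted.")

prompt = (
    "Extract case_id, symptom_onset (ISO date), locations_visited and contacts_named "
    f"from the report below. Answer with JSON only.\n\nReport: {narrative}"
)

# Stand-in for the model's reply; a real pipeline would send `prompt` to a local model.
model_response = ('{"case_id": "A12", "symptom_onset": "2024-03-03", '
                  '"locations_visited": ["Kambia market"], "contacts_named": 2}')

record = CaseRecord(**json.loads(model_response))
print(record)   # a structured line-list entry, ready for analysis
```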
52:13
Yeah, basically, like in a lot of other fields, you would like to see better communication between these kinds of models, the LLMs, and the humans in the loop, and I think there's a lot of work to do there, really across the spectrum. But clearly you don't see these kinds of models being built entirely by artificial intelligence?
52:40
No, but we've tried it, actually. Even things like Copilot Workspace: you try them out on a simple problem and they look really, really cool, the idea that you can just give it a code base and tell it to do things. But often, for quite specific things, if you say "build me a model with these features to do this", the training set just doesn't include anything, or includes very little, with that kind of problem. So if you want an LLM to build you some JavaScript, it's pretty good, because there's just so much to train on. If you want quite a niche compartmental model of an epidemic, it struggles in some ways. So I think it's about finding the shape of where it works: it's great for "I've got all these functions, package them up and add the documentation", that kind of stuff. It makes a lot of tasks faster. But the idea that it's going to do science is, I think, maybe put forward by people who've only worked on a narrow set of scientific problems. There's probably some science it's going to do well, and some it's going to really struggle with, hopefully not just the boring bits. But I think we need to map out where those gaps are.
54:02
You know, it's the same in my field, whether that's sports modeling or just statistical modeling in general. It's really funny, because if you ask an LLM what a hierarchical model is, it can explain it extremely well; that's exactly what I would explain to my students. But then if you ask for PyMC code for that, the model in the code is not at all hierarchical, it's just a classic model, you know? And the LLM will say, well, this is a hierarchical model, but it's not.
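To make the distinction concrete, here is a minimal sketch (toy data, not from the episode) of a genuinely hierarchical model in PyMC, with group effects partially pooled through shared hyperpriors; the non-hierarchical "classic" version an LLM often returns instead is noted in the final comment.

```python
# Minimal sketch (not from the episode): a genuinely hierarchical model in PyMC.
import numpy as np
import pymc as pm

rng = np.random.default_rng(42)
n_groups = 8
group_idx = rng.integers(0, n_groups, size=200)            # group of each observation
y = rng.normal(loc=group_idx * 0.3, scale=1.0, size=200)   # toy data with group differences

with pm.Model() as hierarchical_model:
    # Hyperpriors: the population-level distribution the group effects are drawn from
    mu = pm.Normal("mu", 0.0, 5.0)
    sigma_group = pm.HalfNormal("sigma_group", 2.0)

    # Group effects are partially pooled through the hyperpriors;
    # this shared structure is what makes the model hierarchical
    group_effect = pm.Normal("group_effect", mu=mu, sigma=sigma_group, shape=n_groups)

    sigma_obs = pm.HalfNormal("sigma_obs", 2.0)
    pm.Normal("y", mu=group_effect[group_idx], sigma=sigma_obs, observed=y)

    idata = pm.sample()   # NUTS by default

# A "classic" (non-hierarchical) version would instead give every group an independent
# fixed prior, e.g. pm.Normal("group_effect", 0.0, 5.0, shape=n_groups), with no shared
# hyperpriors, which is the kind of code an LLM often returns while calling it hierarchical.
```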
54:35
So I definitely have to go back and forth with it, but it makes you more efficient. It helps me, for instance, to kill my darlings maybe a bit faster, which is very important in modeling. In my experience it's always better to do that with a human, but sometimes you don't have someone at the same stats level as you in your organization or project, so you have to do it on your own, and that can be quite hard.
I think there's something there; we're actually using it for things like reviewing training materials. Sometimes you want a thorough human review, but sometimes you want a first pass of: if an applied field ecologist is reading this, is there anything in there that just isn't going to make sense, or really obvious things you want to fix? It can help accelerate a lot of that kind of production process.
55:34
Okay, yeah, that actually sounds very useful. So, you've already been very generous with your time, so I'm going to play us out here, but I'm also curious to hear your thoughts from a more educational perspective, because, as we've heard, you do a lot of public communication. Given your experience, what educational initiatives would you recommend to better prepare the next generations of epidemiologists, yes, but also policymakers and citizens in general?
56:01
Yeah, I think there are probably some specific angles for those audiences, and then some more general ones. I mean, I don't think it's realistic to try and get people to have a deep understanding of epidemics, any more than it's realistic for everyone to have a very deep understanding of other kinds of threats. But there are certain features of epidemics which are very important and very often misunderstood. One aspect is exponential growth, a concept that people find very difficult: you look at who makes and loses money in finance and there's a definite divide in terms of who understands that concept. So having more intuition for that across more people is very helpful. Similarly, things like lagged outcomes, where you have an event you care about and then an effect that happens later; that's something that was widely misinterpreted during the pandemic.
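Both points, exponential growth and lagged outcomes, fit in a few lines of arithmetic. The sketch below uses illustrative assumptions only (not estimates from the episode): cases doubling every week, and a severe outcome that lags infection by three weeks.

```python
# Illustrative assumptions only: exponential growth combined with a lagged outcome.
doubling_time_weeks = 1      # assumed: cases double every week
lag_weeks = 3                # assumed: severe outcomes follow infection by ~3 weeks
severe_fraction = 0.01       # assumed: 1% of infections become severe

cases = [100 * 2 ** (week / doubling_time_weeks) for week in range(12)]

for week, c in enumerate(cases):
    if week >= lag_weeks:
        # Severe outcomes seen this week reflect infections from lag_weeks ago
        severe = severe_fraction * cases[week - lag_weeks]
        print(f"week {week:2d}: cases = {c:8.0f}, severe outcomes = {severe:6.1f}")

# When the lagged outcome is first seen rising, infections are already
# 2 ** lag_weeks = 8 times higher than when those patients were infected,
# so reacting to the lagged signal means reacting three doublings late.
```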
56:57
And increasingly, as COVID went on, it wasn't just people who held academic posts in epidemiology who were making a lot of useful contributions. You had a lot of people in adjacent industries, people actually doing finance or other bits of academic work, or even just maths teachers, doing quite useful stuff, because they understood those concepts and could get a feel for some of those data problems that just needed more eyes on them. So for me that's where a lot of the value is: how do we have more people who just don't get very basic things wrong, and who can be useful eyes on the problem, even if they don't know the ins and outs of exactly how you calculate that specific parameter.
57:47
I think more generally, though, we're also seeing, and this is a challenge for a lot of fields, that with the emergence of AI and of more complex models of epidemics, we're moving into a world where it's much harder to teach yourself everything from scratch. If you think about even statistics as a field, the sort of statistics practitioners do, relative to what's taught in schools, is now very different; you know, the kind of regression lines and things that I did at A-level. And we're seeing that in a lot of fields now, where the cutting edge is so far detached. That's partly a challenge of communication, because even if you're very interested in climate, I can't go and run a climate model in the way that you might be able to work through a simple mathematical proof or a simple statistical problem. So there's that question of how we build trust, and enough understanding of how those fields work that people can engage with them, even though the cutting edge is now so computationally intensive and, in some cases, so difficult to explain in terms of the algorithms. AI is a prime example, but there are a lot of situations where it feels like there's a bit of a growing gap that we need to bridge.
59:24
Yeah, yeah, definitely. I completely agree with everything you just said. That's a topic that's dear to me, obviously, and we've talked about it several times on the podcast. I think one of the best episodes we've done on that was episode 50, with Sir David Spiegelhalter, the only "Sir" we've had on the podcast. That was a great episode, and David is an awesome communicator, so I'll put that episode in the show notes for people who want to dig deeper; it's one of the episodes I recommend a lot. And then the next one, episode 51, with Aubrey Clayton about his book Bernoulli's Fallacy and the crisis of modern science; I really recommend the book, and the episode too, because these two together make a great combination if you're interested in epistemology.
Okay, well, Adam, that was awesome. Really, thank you so much for taking so much time. I knew it was going to be fascinating, and it was. So, thanks a lot.
1:00:28
Thanks a lot to Chris Wyment again. But before letting you go, of course, I'm going to ask you the last two questions I ask every guest at the end of the show. One: if you had unlimited time and resources, which problem would you try to solve?
1:00:46
That's a big one. One of the things coming out of COVID that really struck me is that with a lot of infections we're actually quite unambitious, I think. We put up with a lot of disease in daily life, and even pre-COVID, if you look at a lot of adverts for medicine, when you're ill it's, you know, just keep going, keep going. And I think we saw a lot of indications during COVID of some of the sorts of technologies and approaches that weren't lockdowns but were actually much more efficient, even just the work that I've done on things like digital tools. So we've got these little signals that we could be much, much better in how we tackle these problems, and that's something that would require quite a lot of resource and time to do effectively. Even, like, I've got young children, and there are a lot of kids hospitalised with infections that we could understand better and could test for, and instead we rely a lot on waiting for a vaccine to be developed. I've always just wondered: could we do something a bit cleverer? We've got all this technology. There's a US colleague who, during COVID, said, you know, this is our Apollo mission: can we actually do something extremely innovative and ambitious about COVID? I think the vaccines were amazing, and some of the treatments coming through, but we didn't, I think, solve the problem of what to do around infection much earlier. So yeah, I'd love us to live in a different world where we actually can.
Yeah, yeah, I definitely share that passion and objective. And if you could have dinner with any great scientific mind, dead, alive, or fictional, who would it be?
1:02:48
So recently, for a project, I've been reading up a lot about William Gosset, a.k.a. "Student", who developed the t-test and worked at Guinness. Digging more into his work, he was a really interesting character, because there was a lot of conflict with Fisher in terms of outlook. Fisher was very much coming from an academic focus: you accumulate knowledge, you want very high confidence in that knowledge, and that's why you have the sorts of thresholds and experimental standards he pushed for. Gosset was much more of a pragmatist; he was working for a big business. There's one situation where he had a p-value of, um, 0.13, and he said, you know, that's very good evidence: if you're a business and it doesn't cost much and you can explore further, it's worth going forward. Whereas, as you know, Fisher would have said: if it's not hitting the five percent, we're not interested.
1:03:56
That's maybe a part of statistics that got suppressed for a lot of the 20th century, I think. Fisher and co. probably won out in many ways, in terms of imposing those criteria and, in some cases, throwing away a lot of evidence. And although it wasn't Bayesian as such, it was basically very anti-Bayesian, that idea of making use of limited information is something Gosset was very adamant about in some cases. I think at one point he only had the budget to get two data points or something, and it was still, "I want to do something with that." So I think he would be a very interesting person to dig into that with. It also just sounds like he was a very nice guy, relative to Fisher and some of the others at the time. So yeah, it's that kind of different outlook on where you set the bar, and on what you're actually trying to do with statistics: are you trying to get perfect knowledge, or are you actually just trying to make better decisions?
1:05:08
It wasn't that hard to be a nicer guy than Fisher, apparently! But yeah, I definitely resonate with what you were saying, because most of my models I build not for academic use but for companies. And there are a lot of things you have to be able to prioritize; I think that's one of the most important skills to have as a modeler, especially if you're in a small team, right? If you're on a big project with a lot of modelers, you can explore a lot of paths at the same time. But if you're in a small team, then you have to really be able to set priorities and decide which path you want to explore first.
1:05:58
That's often a metaphor used to explain what the modeling process is like: you're lost in a desert and trying to find your way out, and the most efficient way is usually to explore a lot of paths and see which ones are successful. There can be several of them, there can be just one, and sometimes there can be zero. But exploring a path that ends up not being successful is actually very informative too, because it means the people behind you won't make the same mistake and go down that path. To explore these paths, you can do it alone if you don't have more people; that's just going to take you more time, but you have to do it. Or you can do it with several people simultaneously. And that's where the idea of priorities is also very important, because your priorities are going to dictate which path you go down first.
1:07:00
Yeah, and I think it's that prioritization and focus, that kind of angle, where maybe, in the pursuit of perfection, we don't always get the balance right, particularly when it's academic research interacting with very fast-moving problems.
1:07:26
Right, yeah. Awesome. Well, Adam, let's call it a show. This was an absolute pleasure, to have you here on the show as usual. I will link to your website, your socials, and your books in the show notes for people who want to dig deeper. Thank you again, Adam, for taking the time and being on this show.
Yeah, thanks so much, it's been a really good chat.
1:07:54
Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes to help you reach true Bayesian state of mind. That's learnbayesstats.com. Our theme music is "Good Bayesian" by Baba Brinkman, featuring MC Lars and Mega Ran. Check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/LearnBayesStats. Thank you so much for listening and for your support. You're truly a good Bayesian. Change your predictions after taking information in. And if you're thinking I'll be less than amazing, let's adjust those expectations. Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those predictions that your brain is making? Let's get them on a solid foundation.