WEBVTT
Kind: captions
Language: en
00:00:00.030 --> 00:00:02.010
this is a footnote to the main video on
00:00:02.010 --> 00:00:04.799
Bayes' theorem if your goal is simply to
00:00:04.799 --> 00:00:06.930
understand why it's true from a
00:00:06.930 --> 00:00:08.880
mathematical standpoint there's actually
00:00:08.880 --> 00:00:10.440
a very quick way to see it based on
00:00:10.440 --> 00:00:12.660
breaking down how the word "and" works in
00:00:12.660 --> 00:00:15.299
probability let's say there are two
00:00:15.299 --> 00:00:18.480
events A and B what's the probability
00:00:18.480 --> 00:00:20.730
that both of them happen on the one hand
00:00:20.730 --> 00:00:22.109
you could start by thinking of the
00:00:22.109 --> 00:00:24.600
probability of A the proportion of all
00:00:24.600 --> 00:00:26.609
possibilities where A is true then
00:00:26.609 --> 00:00:29.369
multiply it by the proportion of those
00:00:29.369 --> 00:00:32.099
events where B is also true which is
00:00:32.099 --> 00:00:34.489
known as the probability of B given A
00:00:34.489 --> 00:00:36.989
but it's strange for the formula to look
00:00:36.989 --> 00:00:39.750
asymmetric in A and B presumably we
00:00:39.750 --> 00:00:41.250
should also be able to think of it as
00:00:41.250 --> 00:00:43.469
the proportion of cases where B is true
00:00:43.469 --> 00:00:46.020
among all possibilities times the
00:00:46.020 --> 00:00:48.809
proportion of those where A is also true
00:00:48.809 --> 00:00:52.590
the probability of A given B these are
00:00:52.590 --> 00:00:54.840
both the same and the fact that they're
00:00:54.840 --> 00:00:56.760
both the same gives us a way to express
00:00:56.760 --> 00:01:00.180
P of A given B in terms of P of B given
00:01:00.180 --> 00:01:06.180
A or the other way around so when one of
00:01:06.180 --> 00:01:07.710
these conditions is easier to put
00:01:07.710 --> 00:01:10.140
numbers to than the other say when it's
00:01:10.140 --> 00:01:11.700
easier to think about the probability of
00:01:11.700 --> 00:01:13.979
seeing some evidence given a hypothesis
00:01:13.979 --> 00:01:15.720
rather than the other way around
00:01:15.720 --> 00:01:18.420
this simple identity becomes a useful
00:01:18.420 --> 00:01:21.150
tool nevertheless even if this is
00:01:21.150 --> 00:01:23.220
somehow a more pure or quick way to
00:01:23.220 --> 00:01:25.229
understand the formula the reason I
00:01:25.229 --> 00:01:26.759
chose to frame everything in terms of
00:01:26.759 --> 00:01:28.740
updating beliefs with evidence in the
00:01:28.740 --> 00:01:30.720
main video is to help with that third
00:01:30.720 --> 00:01:32.820
level of understanding being able to
00:01:32.820 --> 00:01:35.280
recognize when this formula among the
00:01:35.280 --> 00:01:37.259
wide landscape of available tools in
00:01:37.259 --> 00:01:39.710
math happens to be the right one to use
00:01:39.710 --> 00:01:42.030
otherwise it's kind of easy to just look
00:01:42.030 --> 00:01:44.729
at it nod along and promptly forget and
00:01:44.729 --> 00:01:47.159
you know while we're here it's worth
00:01:47.159 --> 00:01:48.869
highlighting a common misconception that
00:01:48.869 --> 00:01:52.200
the probability of A and B is P of A
00:01:52.200 --> 00:01:55.049
times P of B for example if you hear
00:01:55.049 --> 00:01:56.790
that one in four people die of heart
00:01:56.790 --> 00:01:58.740
disease it's really tempting to think
00:01:58.740 --> 00:02:00.090
that that means the probability that
00:02:00.090 --> 00:02:02.670
both you and your brother die of heart
00:02:02.670 --> 00:02:04.740
disease is one in four times one in
00:02:04.740 --> 00:02:07.710
four or one in sixteen after all the
00:02:07.710 --> 00:02:09.569
probability of two successive coin flips
00:02:09.569 --> 00:02:12.599
yielding tails is 1/2 times 1/2 and the
00:02:12.599 --> 00:02:13.770
probability of rolling two
00:02:13.770 --> 00:02:17.100
ones on a pair of dice is 1/6 times 1/6
00:02:17.100 --> 00:02:22.230
right the issue is correlation if your
00:02:22.230 --> 00:02:23.610
brother dies of heart disease and
00:02:23.610 --> 00:02:25.320
considering certain genetic and
00:02:25.320 --> 00:02:26.520
lifestyle links that are at play here
00:02:26.520 --> 00:02:28.890
your chances of dying from a similar
00:02:28.890 --> 00:02:31.950
condition are higher a formula like this
00:02:31.950 --> 00:02:34.020
as tempting and clean as it looks is
00:02:34.020 --> 00:02:38.120
just flat-out wrong
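To put numbers on this, here is a minimal sketch in Python. It assumes a made-up joint distribution for two correlated events A and B, chosen so that each event on its own has probability one in four. The values are invented for illustration and are not real heart disease statistics, and names such as `joint`, `naive_product`, and `bayes_a_given_b` are hypothetical.

```python
from fractions import Fraction

# Hypothetical joint distribution over two events A and B.
# Keys are (A happens, B happens); the values are invented for illustration only.
joint = {
    (True, True): Fraction(1, 8),
    (True, False): Fraction(1, 8),
    (False, True): Fraction(1, 8),
    (False, False): Fraction(5, 8),
}

# Marginal probabilities: add up every case where the event happens.
p_a = sum(p for (a, _), p in joint.items() if a)
p_b = sum(p for (_, b), p in joint.items() if b)
p_a_and_b = joint[(True, True)]

# Conditional probabilities: among the cases where one event happens,
# the proportion where the other happens too.
p_b_given_a = p_a_and_b / p_a
p_a_given_b = p_a_and_b / p_b

# Both orderings of the product rule recover P(A and B).
product_rule_ab = p_a * p_b_given_a
product_rule_ba = p_b * p_a_given_b

# The tempting shortcut ignores the correlation between A and B.
naive_product = p_a * p_b

# Rearranging the product rule gives P(A given B) from P(B given A).
bayes_a_given_b = p_b_given_a * p_a / p_b

print("P(A and B)      =", p_a_and_b)        # 1/8
print("P(A) P(B|A)     =", product_rule_ab)  # 1/8
print("P(B) P(A|B)     =", product_rule_ba)  # 1/8
print("P(A) P(B) naive =", naive_product)    # 1/16
print("P(A|B) by Bayes =", bayes_a_given_b)  # 1/2
```

With these assumed numbers the shortcut gives one in sixteen while the true joint probability is one in eight, because knowing that A happened raises the probability of B from one in four to one in two.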
00:02:38.120 --> 00:02:40.650
what's going on with cases like flipping
00:02:40.650 --> 00:02:42.780
coins or rolling two dice is that each
00:02:42.780 --> 00:02:46.050
event is independent of the last so the
00:02:46.050 --> 00:02:48.840
probability of B given A is the same as
00:02:48.840 --> 00:02:51.030
the probability of B what happens to A
00:02:51.030 --> 00:02:53.460
does not affect B this is the definition
00:02:53.460 --> 00:02:56.340
of independence keep in mind many
00:02:56.340 --> 00:02:58.440
introductory probability examples are
00:02:58.440 --> 00:03:01.230
given in very gamified contexts things
00:03:01.230 --> 00:03:03.480
with dice and coins where genuine
00:03:03.480 --> 00:03:05.490
independence holds but all those
00:03:05.490 --> 00:03:07.980
examples can skew your intuitions the
00:03:07.980 --> 00:03:09.210
irony is that some of the most
00:03:09.210 --> 00:03:11.220
interesting applications of probability
00:03:11.220 --> 00:03:13.380
presumably the whole motivation for the
00:03:13.380 --> 00:03:15.120
kind of courses using these gamified
00:03:15.120 --> 00:03:17.790
examples are only substantive when events
00:03:17.790 --> 00:03:20.280
aren't independent Bayes' theorem which
00:03:20.280 --> 00:03:22.380
measures exactly how much one variable
00:03:22.380 --> 00:03:24.630
depends on another is a perfect example
00:03:24.630 --> 00:03:25.610
of this
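Written out in symbols, the argument in this footnote can be summarized as follows. This is a sketch that assumes P(A) and P(B) are both nonzero so that the conditional probabilities are defined. The first line is the two ways of computing the probability of A and B, the second is the rearrangement that gives Bayes' theorem, and the third is the special case of independence, the only case where the simple product holds.

```latex
\begin{align*}
P(A \text{ and } B) &= P(A)\,P(B \mid A) = P(B)\,P(A \mid B) \\
P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)} \\
P(B \mid A) = P(B) &\implies P(A \text{ and } B) = P(A)\,P(B)
\end{align*}
```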
00:03:25.610 --> 00:03:47.389
[Music]