WEBVTT
Kind: captions
Language: en
00:00:03.780 --> 00:00:14.590
[Music]
00:00:14.590 --> 00:00:17.360
here I want to discuss one common type
00:00:17.360 --> 00:00:19.210
of problem where integration comes up
00:00:19.210 --> 00:00:21.950
finding the average of a continuous
00:00:21.950 --> 00:00:24.529
variable this is a perfectly useful
00:00:24.529 --> 00:00:26.300
thing to know in its own right but
00:00:26.300 --> 00:00:27.710
what's really neat is that it can give
00:00:27.710 --> 00:00:29.390
us a completely different perspective
00:00:29.390 --> 00:00:31.489
for why integrals and derivatives are
00:00:31.489 --> 00:00:34.280
inverses of each other to start take a
00:00:34.280 --> 00:00:36.620
look at the graph of sine of X between 0
00:00:36.620 --> 00:00:40.550
and pi which is half of its period what
00:00:40.550 --> 00:00:42.710
is the average height of this graph on
00:00:42.710 --> 00:00:45.590
that interval it's not a useless
00:00:45.590 --> 00:00:48.050
question all sorts of cyclic phenomena
00:00:48.050 --> 00:00:49.940
in the world are modeled using sine
00:00:49.940 --> 00:00:52.670
waves for example the number of hours
00:00:52.670 --> 00:00:55.130
that the Sun is up per day as a function
00:00:55.130 --> 00:00:57.140
of what day of the year it is follows a
00:00:57.140 --> 00:00:59.360
sine wave pattern so if you wanted to
00:00:59.360 --> 00:01:01.940
predict say the average effectiveness of
00:01:01.940 --> 00:01:03.920
solar panels in summer months versus
00:01:03.920 --> 00:01:06.020
winter months you'd want to be able to
00:01:06.020 --> 00:01:08.540
answer a question like this what is the
00:01:08.540 --> 00:01:11.659
average value of that sine function over
00:01:11.659 --> 00:01:14.540
half of its period whereas a case like
00:01:14.540 --> 00:01:16.369
this is going to have all sorts of
00:01:16.369 --> 00:01:18.830
constants mucking up the function you
00:01:18.830 --> 00:01:20.479
and I are just going to focus on a pure
00:01:20.479 --> 00:01:23.330
unencumbered sine of X function but the
00:01:23.330 --> 00:01:24.650
substance of the approach would be
00:01:24.650 --> 00:01:26.300
totally the same in any other
00:01:26.300 --> 00:01:29.030
application it's kind of a weird
00:01:29.030 --> 00:01:30.890
question to think about though isn't it
00:01:30.890 --> 00:01:33.140
the average of a continuous variable
00:01:33.140 --> 00:01:35.690
usually with averages we think of a
00:01:35.690 --> 00:01:37.790
finite number of variables where you can
00:01:37.790 --> 00:01:40.310
add them all up and divide that sum by
00:01:40.310 --> 00:01:43.920
how many there are
00:01:43.920 --> 00:01:46.740
but there are infinitely many values of
00:01:46.740 --> 00:01:49.830
sine of X between 0 and pi and it's not
00:01:49.830 --> 00:01:51.479
like we can just add up all of those
00:01:51.479 --> 00:01:54.570
numbers and divide by infinity now this
00:01:54.570 --> 00:01:56.310
sensation actually comes up a lot in
00:01:56.310 --> 00:01:58.409
math and it's worth remembering where
00:01:58.409 --> 00:02:00.360
you have this vague sense that what you
00:02:00.360 --> 00:02:02.430
want to do is add together infinitely
00:02:02.430 --> 00:02:04.920
many values associated with a continuum
00:02:04.920 --> 00:02:06.780
even though that doesn't really make
00:02:06.780 --> 00:02:09.660
sense and almost always when you get
00:02:09.660 --> 00:02:12.000
that sense the key is going to be to use
00:02:12.000 --> 00:02:14.400
an integral somehow and to think through
00:02:14.400 --> 00:02:17.220
exactly how a good first step is usually
00:02:17.220 --> 00:02:19.470
to just approximate your situation with
00:02:19.470 --> 00:02:22.050
some kind of finite sum in this case
00:02:22.050 --> 00:02:24.420
imagine sampling a finite number of
00:02:24.420 --> 00:02:26.989
points evenly spaced along this range
00:02:26.989 --> 00:02:29.910
since it's a finite sample you can find
00:02:29.910 --> 00:02:31.980
the average by just adding up all of the
00:02:31.980 --> 00:02:33.930
heights sine of X at each one of these
00:02:33.930 --> 00:02:36.360
and then dividing that sum by the number
00:02:36.360 --> 00:02:39.260
of points that you sampled right and
00:02:39.260 --> 00:02:41.910
presumably if the idea of an average
00:02:41.910 --> 00:02:44.040
height among all infinitely many points
00:02:44.040 --> 00:02:46.530
is going to make any sense at all the
00:02:46.530 --> 00:02:48.360
more points we sample which would
00:02:48.360 --> 00:02:49.890
involve adding up more and more heights
00:02:49.890 --> 00:02:52.650
the closer the average of that sample
00:02:52.650 --> 00:02:54.959
should be to the actual average of the
00:02:54.959 --> 00:02:57.959
continuous variable and this should feel
00:02:57.959 --> 00:03:00.180
at least somewhat related to taking an
00:03:00.180 --> 00:03:02.430
integral of sine of X between 0 and pi
00:03:02.430 --> 00:03:05.040
even if it might not be exactly clear
00:03:05.040 --> 00:03:07.799
how the two ideas match up for that
00:03:07.799 --> 00:03:10.079
integral remember you also think of a
00:03:10.079 --> 00:03:12.810
sample of inputs on this continuum but
00:03:12.810 --> 00:03:14.730
instead of adding the height sine of X
00:03:14.730 --> 00:03:16.620
at each one and dividing by how many
00:03:16.620 --> 00:03:20.100
there are you add up sine of x times DX
00:03:20.100 --> 00:03:22.920
where DX is the spacing between the
00:03:22.920 --> 00:03:25.650
samples that is you're adding up little
00:03:25.650 --> 00:03:29.190
areas not heights and technically the
00:03:29.190 --> 00:03:30.930
integral is not quite this sum it's
00:03:30.930 --> 00:03:33.750
whatever that sum approaches as DX
00:03:33.750 --> 00:03:36.510
approaches 0 but it is actually quite
00:03:36.510 --> 00:03:38.610
helpful to reason with respect to one of
00:03:38.610 --> 00:03:40.769
these finite iterations where we're
00:03:40.769 --> 00:03:43.200
looking at a concrete size for DX and
00:03:43.200 --> 00:03:46.440
some specific number of rectangles so
00:03:46.440 --> 00:03:48.720
what you want to do here is reframe this
00:03:48.720 --> 00:03:51.269
expression for the average this sum of
00:03:51.269 --> 00:03:53.010
the heights divided by the number of
00:03:53.010 --> 00:03:56.489
sample points in terms of DX the spacing
00:03:56.489 --> 00:03:57.090
between samples
00:03:57.090 --> 00:04:00.540
and now if I tell you that the
00:04:00.540 --> 00:04:03.030
spacing between these points is say zero
00:04:03.030 --> 00:04:05.519
point one and you know that they range
00:04:05.519 --> 00:04:08.310
from zero to pi can you tell me how many
00:04:08.310 --> 00:04:10.580
there are
00:04:10.580 --> 00:04:13.379
well you can take the length of that
00:04:13.379 --> 00:04:16.169
interval pi and divide it by the length
00:04:16.169 --> 00:04:19.560
of the space between each sample if it
00:04:19.560 --> 00:04:21.419
doesn't go in perfectly evenly you would
00:04:21.419 --> 00:04:22.590
have to round down to the nearest
00:04:22.590 --> 00:04:25.320
integer but as an approximation this is
00:04:25.320 --> 00:04:28.139
completely fine so if we write that
00:04:28.139 --> 00:04:31.650
spacing between samples as DX the number
00:04:31.650 --> 00:04:35.400
of samples is pi divided by DX and when
00:04:35.400 --> 00:04:37.350
we substitute that into our expression
00:04:37.350 --> 00:04:39.450
up here you can rearrange it putting
00:04:39.450 --> 00:04:42.060
that DX up top and distributing it into
00:04:42.060 --> 00:04:45.300
the sum but think about what it means to
00:04:45.300 --> 00:04:48.630
distribute that DX up top it means that
00:04:48.630 --> 00:04:50.280
the terms you're adding up will look
00:04:50.280 --> 00:04:52.710
like sine of X times DX
00:04:52.710 --> 00:04:54.840
for the various inputs X that you're
00:04:54.840 --> 00:04:57.630
sampling so that numerator looks exactly
00:04:57.630 --> 00:05:00.570
like an integral expression and so for
00:05:00.570 --> 00:05:03.000
larger and larger samples of points this
00:05:03.000 --> 00:05:04.680
average will approach the actual
00:05:04.680 --> 00:05:07.320
integral of sine of X between 0 and pi
00:05:07.320 --> 00:05:09.630
all divided by the length of that
00:05:09.630 --> 00:05:13.169
interval pi in other words the average
00:05:13.169 --> 00:05:15.539
height of this graph is this area
00:05:15.539 --> 00:05:18.599
divided by its width on an intuitive
00:05:18.599 --> 00:05:20.400
level and just thinking in terms of
00:05:20.400 --> 00:05:22.380
units that feels pretty reasonable
00:05:22.380 --> 00:05:25.020
doesn't it area divided by width gives
00:05:25.020 --> 00:05:28.139
you an average height so with this
00:05:28.139 --> 00:05:30.060
expression in hand let's actually solve
00:05:30.060 --> 00:05:33.210
it as we saw last video to compute an
00:05:33.210 --> 00:05:34.620
integral you need to find an
00:05:34.620 --> 00:05:37.080
antiderivative of the function inside
00:05:37.080 --> 00:05:39.450
the integral some other function whose
00:05:39.450 --> 00:05:42.599
derivative is sine of X and if you're
00:05:42.599 --> 00:05:44.400
comfortable with derivatives of trig
00:05:44.400 --> 00:05:46.680
functions you know that the derivative
00:05:46.680 --> 00:05:49.710
of cosine is negative sine so if you
00:05:49.710 --> 00:05:52.380
just negate that negative cosine is the
00:05:52.380 --> 00:05:54.539
function we want the antiderivative of
00:05:54.539 --> 00:05:56.849
sine and to gut check yourself on that
00:05:56.849 --> 00:06:00.030
look at this graph of negative cosine at
00:06:00.030 --> 00:06:03.330
0 the slope is 0 and then it increases
00:06:03.330 --> 00:06:06.639
up to some maximum slope at pi halves
00:06:06.639 --> 00:06:08.889
and then goes back down to zero at pi
00:06:08.889 --> 00:06:12.639
and in general its slope does indeed
00:06:12.639 --> 00:06:14.590
seem to match the height of the sine
00:06:14.590 --> 00:06:17.620
graph at every point so what do we have
00:06:17.620 --> 00:06:19.689
to do to evaluate the integral of sine
00:06:19.689 --> 00:06:21.249
between 0 and pi
00:06:21.249 --> 00:06:24.550
well we evaluate this antiderivative at
00:06:24.550 --> 00:06:27.129
the upper bound and subtract off its
00:06:27.129 --> 00:06:30.039
value at the lower bound more visually
00:06:30.039 --> 00:06:31.840
that is the difference in the height of
00:06:31.840 --> 00:06:34.539
this negative cosine graph above pi and
00:06:34.539 --> 00:06:38.349
its height at 0 and as you can see that
00:06:38.349 --> 00:06:42.009
change in height is exactly 2 that's
00:06:42.009 --> 00:06:43.779
kind of interesting isn't it that the
00:06:43.779 --> 00:06:46.240
area under this sine graph turns out to
00:06:46.240 --> 00:06:49.810
be exactly 2 so the answer to our
00:06:49.810 --> 00:06:52.419
average-height problem this integral
00:06:52.419 --> 00:06:54.029
divided by the width of the region
00:06:54.029 --> 00:06:56.770
evidently turns out to be 2 divided by
00:06:56.770 --> 00:07:00.730
pi which is around 0.64
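A quick numerical check of this result, as a minimal sketch rather than anything shown in the video: it samples sine at evenly spaced points between 0 and pi, takes the ordinary finite average, and compares that with 2 divided by pi. The helper name `sample_average`, the choice of midpoint sampling, and the sample counts are all illustrative assumptions.

```python
import math

def sample_average(f, a, b, n):
    """Ordinary finite average of f at n evenly spaced points in [a, b].

    Hypothetical helper for illustration. It samples the midpoints of n
    equal subintervals, so the spacing dx is exactly (b - a) / n.
    """
    dx = (b - a) / n
    heights = [f(a + (i + 0.5) * dx) for i in range(n)]
    # Sum of heights divided by how many there are.
    return sum(heights) / n

# Area under sine from 0 to pi (which is 2) divided by the width pi.
exact = 2 / math.pi

for n in (10, 100, 1000):
    approx = sample_average(math.sin, 0.0, math.pi, n)
    # The gap between the sample average and 2/pi shrinks as n grows.
    print(n, approx, abs(approx - exact))
```

With more sample points the printed gap should shrink, which is the sense in which the finite average approaches the integral divided by the width.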
00:07:00.730 --> 00:07:02.800
I promised at the start that this
00:07:02.800 --> 00:07:04.930
question of finding the average of a
00:07:04.930 --> 00:07:07.390
function offers an alternate perspective
00:07:07.390 --> 00:07:09.340
on why integrals and derivatives are
00:07:09.340 --> 00:07:11.830
inverses of each other why the area
00:07:11.830 --> 00:07:14.440
under one graph has anything to do with
00:07:14.440 --> 00:07:17.620
the slope of another graph notice how
00:07:17.620 --> 00:07:20.110
finding this average value two divided
00:07:20.110 --> 00:07:22.660
by pi came down to looking at the change
00:07:22.660 --> 00:07:25.180
in the antiderivative negative cosine x
00:07:25.180 --> 00:07:27.940
over the input range divided by the
00:07:27.940 --> 00:07:31.150
length of that range and another way to
00:07:31.150 --> 00:07:33.610
think about that fraction is as the rise
00:07:33.610 --> 00:07:36.520
over run slope between the point of the
00:07:36.520 --> 00:07:39.430
antiderivative graph below 0 and the
00:07:39.430 --> 00:07:42.550
point of that graph above pi and now
00:07:42.550 --> 00:07:44.620
think about why it might make sense that
00:07:44.620 --> 00:07:46.540
this slope would represent an average
00:07:46.540 --> 00:07:50.350
value of sine of X on that region well
00:07:50.350 --> 00:07:52.450
by definition sine of X is the
00:07:52.450 --> 00:07:54.820
derivative of this antiderivative graph
00:07:54.820 --> 00:07:57.430
it gives us the slope of negative cosine
00:07:57.430 --> 00:08:00.520
at every point so another way to think
00:08:00.520 --> 00:08:02.650
about the average value of sine of X is
00:08:02.650 --> 00:08:05.620
as the average slope over all tangent
00:08:05.620 --> 00:08:09.610
lines here between 0 and pi and when you
00:08:09.610 --> 00:08:11.590
view things like that doesn't it make a
00:08:11.590 --> 00:08:13.660
lot of sense that the average slope of a
00:08:13.660 --> 00:08:15.790
graph over all of its points in a
00:08:15.790 --> 00:08:18.040
certain range should equal the total
00:08:18.040 --> 00:08:22.680
slope between the start and end points
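Written out as a worked equation, the rise over run reading of the sine example looks roughly like this (the notation is chosen here for illustration, using negative cosine as the antiderivative):

```latex
\frac{1}{\pi - 0}\int_0^{\pi} \sin(x)\,dx
  = \frac{\bigl(-\cos(\pi)\bigr) - \bigl(-\cos(0)\bigr)}{\pi - 0}
  = \frac{1 - (-1)}{\pi}
  = \frac{2}{\pi} \approx 0.64
```

The middle expression is the change in height of the negative cosine graph divided by the change in x, which is the slope between its two endpoints.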
00:08:22.680 --> 00:08:25.420
to digest this idea it helps to think
00:08:25.420 --> 00:08:27.070
about what it looks like for a general
00:08:27.070 --> 00:08:30.400
function for any function f of X if you
00:08:30.400 --> 00:08:32.440
want to find its average value on some
00:08:32.440 --> 00:08:36.070
interval say between a and b what you do
00:08:36.070 --> 00:08:38.260
is take the integral of f on that
00:08:38.260 --> 00:08:40.540
interval divided by the width of that
00:08:40.540 --> 00:08:43.720
interval b minus a you can think of this
00:08:43.720 --> 00:08:46.000
as the area under the graph divided by
00:08:46.000 --> 00:08:48.730
its width or more accurately it is the
00:08:48.730 --> 00:08:51.640
signed area of that graph since any area
00:08:51.640 --> 00:08:53.890
below the x-axis is counted as negative
00:08:53.890 --> 00:08:56.740
and it's worth taking a moment to
00:08:56.740 --> 00:08:58.810
remember what this area has to do with
00:08:58.810 --> 00:09:00.940
the usual notion of a finite average
00:09:00.940 --> 00:09:03.130
where you add up many numbers and divide
00:09:03.130 --> 00:09:05.800
by how many there are when you take some
00:09:05.800 --> 00:09:08.680
sample of points spaced out by DX the
00:09:08.680 --> 00:09:10.960
number of samples is about equal to the
00:09:10.960 --> 00:09:14.320
length of the interval divided by DX so
00:09:14.320 --> 00:09:16.600
if you add up the values of f of X at
00:09:16.600 --> 00:09:18.820
each sample and divide by the total
00:09:18.820 --> 00:09:21.250
number of samples it's the same as
00:09:21.250 --> 00:09:24.070
adding up the product f of x times DX
00:09:24.070 --> 00:09:26.740
and dividing by the width of the entire
00:09:26.740 --> 00:09:29.020
interval the only difference between
00:09:29.020 --> 00:09:31.300
that and the integral is that the
00:09:31.300 --> 00:09:33.460
integral asks what happens as DX
00:09:33.460 --> 00:09:35.950
approaches 0 but that just corresponds
00:09:35.950 --> 00:09:37.810
with samples of more and more points
00:09:37.810 --> 00:09:39.220
that approximate the true average
00:09:39.220 --> 00:09:43.500
increasingly well now for any integral
00:09:43.500 --> 00:09:45.820
evaluating it comes down to finding an
00:09:45.820 --> 00:09:48.790
antiderivative of f of X commonly
00:09:48.790 --> 00:09:51.590
denoted capital F of X
00:09:51.590 --> 00:09:53.900
what we want is the change to this
00:09:53.900 --> 00:09:57.410
antiderivative between a and b capital F
00:09:57.410 --> 00:10:00.710
of b minus capital F of a which you can
00:10:00.710 --> 00:10:03.110
think of as the change in height of this
00:10:03.110 --> 00:10:05.390
new graph between the two bounds
00:10:05.390 --> 00:10:07.910
I've conveniently chosen an
00:10:07.910 --> 00:10:10.070
antiderivative that passes through zero
00:10:10.070 --> 00:10:12.260
at the lower bound here but keep in mind
00:10:12.260 --> 00:10:14.600
you can freely shift this up and down
00:10:14.600 --> 00:10:16.970
adding whatever constant you want to it
00:10:16.970 --> 00:10:18.860
and it would still be a valid
00:10:18.860 --> 00:10:22.310
antiderivative so the solution to the
00:10:22.310 --> 00:10:24.860
average problem is the change in the
00:10:24.860 --> 00:10:27.650
height of this new graph divided by the
00:10:27.650 --> 00:10:31.460
change to the x-value between a and b in
00:10:31.460 --> 00:10:33.410
other words it is the slope of the
00:10:33.410 --> 00:10:35.510
antiderivative graph between the two
00:10:35.510 --> 00:10:38.330
endpoints and again when you stop to
00:10:38.330 --> 00:10:40.070
think about it that should make a lot of
00:10:40.070 --> 00:10:43.339
sense because little f of X gives us the
00:10:43.339 --> 00:10:45.500
slope of the tangent line to this graph
00:10:45.500 --> 00:10:48.170
at each point after all it is by
00:10:48.170 --> 00:10:52.450
definition the derivative of capital F
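The same comparison can be sketched for a general function. As an assumed example that is not from the video, take f(x) = x squared on the interval from 1 to 4, with antiderivative F(x) = x cubed over 3. The names `f`, `F`, `average_by_sampling`, and `average_by_endpoints` are hypothetical.

```python
def f(x):
    """Example function (an assumption for illustration): f(x) = x^2."""
    return x * x

def F(x):
    """One antiderivative of f: F(x) = x^3 / 3, so F'(x) = f(x)."""
    return x ** 3 / 3

def average_by_sampling(f, a, b, n=100_000):
    """Add up f(x) * dx over n midpoint samples, then divide by the width."""
    dx = (b - a) / n
    total = sum(f(a + (i + 0.5) * dx) * dx for i in range(n))
    return total / (b - a)

def average_by_endpoints(F, a, b):
    """Slope of the antiderivative graph between the two endpoints."""
    return (F(b) - F(a)) / (b - a)

a, b = 1.0, 4.0
print(average_by_sampling(f, a, b))   # tallies up many points in between
print(average_by_endpoints(F, a, b))  # only compares the two endpoints
```

Both printed values should come out close to 7, since (64/3 - 1/3) / 3 = 7. Adding a constant to `F` would leave the second result unchanged, because the constant cancels in the difference.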
00:10:52.450 --> 00:10:55.630
so why are antiderivatives the key to
00:10:55.630 --> 00:10:58.300
solving integrals well my favorite
00:10:58.300 --> 00:10:59.889
intuition is still the one that I showed
00:10:59.889 --> 00:11:02.800
last video but a second perspective is
00:11:02.800 --> 00:11:04.990
that when you reframe the question of
00:11:04.990 --> 00:11:07.329
finding an average of a continuous value
00:11:07.329 --> 00:11:10.510
as instead finding the average slope of
00:11:10.510 --> 00:11:12.970
a bunch of tangent lines it lets you see
00:11:12.970 --> 00:11:15.370
the answer just by comparing endpoints
00:11:15.370 --> 00:11:17.829
rather than having to actually tally up
00:11:17.829 --> 00:11:23.649
all of the points in between in the last
00:11:23.649 --> 00:11:25.449
video I described a sensation that
00:11:25.449 --> 00:11:26.860
should bring integrals to your mind
00:11:26.860 --> 00:11:29.170
namely if you feel like the problem
00:11:29.170 --> 00:11:31.149
you're solving could be approximated by
00:11:31.149 --> 00:11:33.699
breaking it up somehow and adding up a
00:11:33.699 --> 00:11:36.910
large number of small things and here I
00:11:36.910 --> 00:11:38.709
want you to come away recognizing a
00:11:38.709 --> 00:11:40.690
second sensation that should also bring
00:11:40.690 --> 00:11:43.300
integrals to your mind if ever there's
00:11:43.300 --> 00:11:45.519
some idea that you understand in a
00:11:45.519 --> 00:11:48.070
finite context and which involves adding
00:11:48.070 --> 00:11:50.500
up multiple values like taking the
00:11:50.500 --> 00:11:53.019
average of a bunch of numbers and if you
00:11:53.019 --> 00:11:55.209
want to generalize that idea to apply to
00:11:55.209 --> 00:11:57.870
an infinite continuous range of values
00:11:57.870 --> 00:12:00.339
try seeing if you can phrase things in
00:12:00.339 --> 00:12:02.980
terms of an integral it's a feeling that
00:12:02.980 --> 00:12:05.019
comes up all the time especially in
00:12:05.019 --> 00:12:07.180
probability and it's definitely worth
00:12:07.180 --> 00:12:10.390
remembering my thanks as always go to
00:12:10.390 --> 00:12:16.160
those making these videos possible
00:12:16.160 --> 00:12:22.920
[Music]
00:12:23.320 --> 00:12:39.879
[Music]