WEBVTT

00:00.080 --> 00:01.120
Welcome back.

00:01.200 --> 00:04.680
We finished creating the model.

00:04.680 --> 00:06.360
Now let's compile it.

00:06.400 --> 00:11.520
We use model, the name of our model, dot compile.

00:11.520 --> 00:19.520
And here we have to pass three parameters: the optimizer, the loss, and the metrics.

00:19.560 --> 00:24.320
Let me explain model compilation step by step.

00:24.440 --> 00:34.120
What does model compilation do? Compilation configures the model for training by specifying how to learn,

00:34.200 --> 00:43.360
the optimizer; what to minimize, the loss function; and how to measure progress, which is the metrics.

00:43.400 --> 00:52.000
Okay, so compiling the model means configuring it for training by specifying the optimizer, the loss

00:52.000 --> 00:54.200
function, and the metrics.

00:54.200 --> 00:58.320
How to learn, what to minimize, how to measure progress.

00:58.320 --> 01:03.600
So we're going to specify them using the compile function.

01:03.640 --> 01:12.480
Here we'll start with the optimizer: optimizer equals tf, TensorFlow, dot optimizers dot SGD.

01:12.760 --> 01:21.120
SGD is the algorithm that updates the model's weights during training to reduce errors, and the learning rate

01:21.120 --> 01:24.600
is the parameter we pass to this function.

01:24.600 --> 01:30.560
Learning rate controls how big each weight update step is.

01:30.840 --> 01:41.160
If I specify a learning rate of 0.1, which is too high, training might overshoot the optimal weights,

01:41.160 --> 01:49.640
and if I specify 0.001, which is too low, training becomes very slow.

01:49.840 --> 01:56.000
So to balance these two problems, we use 0.01.

01:56.040 --> 01:58.600
Not very fast and not very slow.

01:58.840 --> 02:05.370
Training would be accurate and not very slow.
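
NOTE
A minimal sketch of the three learning rates discussed above, not taken from the lesson's notebook. The toy loss 10 * (w - 3)**2, the starting weight 0.0, and the 50-step budget are assumptions picked so that 0.1 visibly overshoots; a real model may behave differently.
```python
def distance_after_descent(learning_rate, steps=50):
    """Run plain gradient descent on loss(w) = 10 * (w - 3)**2 and return |w - 3|."""
    w = 0.0  # hypothetical starting weight
    for _ in range(steps):
        gradient = 20.0 * (w - 3.0)  # derivative of the toy loss
        w = w - learning_rate * gradient  # one weight update step
    return abs(w - 3.0)
too_high = distance_after_descent(0.1)   # jumps back and forth across the minimum
balanced = distance_after_descent(0.01)  # ends very close to the minimum
too_low = distance_after_descent(0.001)  # still far from the minimum after 50 steps
print(too_high, balanced, too_low)
```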

02:05.570 --> 02:08.930
Then we need to specify the loss.

02:09.130 --> 02:09.850
Loss.

02:09.850 --> 02:19.090
Here we have MSE, mean squared error, which measures how wrong the predictions are.

02:19.130 --> 02:29.130
It uses this formula: MSE equals the average of true price minus predicted price, squared.
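
NOTE
A minimal sketch of the MSE formula just described, in plain Python. The function name and the four price pairs are made up for illustration and are not the lesson's data.
```python
def mean_squared_error(true_prices, predicted_prices):
    """Average of (true price - predicted price) squared."""
    squared_errors = [(t - p) ** 2 for t, p in zip(true_prices, predicted_prices)]
    return sum(squared_errors) / len(squared_errors)
true_prices = [100.0, 200.0, 300.0, 400.0]       # made-up example values
predicted_prices = [110.0, 190.0, 320.0, 380.0]  # made-up example values
mse = mean_squared_error(true_prices, predicted_prices)
print(mse)  # 250.0, from squared errors 100, 100, 400, 400
```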

02:29.170 --> 02:29.690
Okay.

02:29.890 --> 02:39.370
So we are calculating this using the training and the testing data.

02:39.370 --> 02:41.410
So we are comparing those.

02:41.450 --> 02:47.570
And with this formula we get the average that we use to track our loss.

02:47.610 --> 02:49.450
Why MSE for regression?

02:49.450 --> 02:53.690
Because it penalizes large errors more heavily.

02:53.850 --> 03:00.490
Its smooth curve helps gradient descent, and it is the standard for regression problems.

03:00.490 --> 03:04.450
So we use it with standard regression problems.

03:04.450 --> 03:05.890
Exactly like this problem.

03:05.930 --> 03:10.650
Next, MAE, mean absolute error.

03:10.690 --> 03:13.130
It is an alternative way to measure errors.

03:13.330 --> 03:20.930
The formula is average of the absolute value of true price minus predicted price.
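
NOTE
A minimal sketch of the MAE formula, set next to MSE to illustrate the earlier point that MSE penalizes large errors more heavily. The function names and prices are made up for illustration.
```python
def mean_absolute_error(true_prices, predicted_prices):
    """Average of |true price - predicted price|."""
    absolute_errors = [abs(t - p) for t, p in zip(true_prices, predicted_prices)]
    return sum(absolute_errors) / len(absolute_errors)
def mean_squared_error(true_prices, predicted_prices):
    """Average of (true price - predicted price) squared."""
    squared_errors = [(t - p) ** 2 for t, p in zip(true_prices, predicted_prices)]
    return sum(squared_errors) / len(squared_errors)
true_prices = [100.0, 100.0, 100.0, 100.0]   # made-up example values
even_misses = [110.0, 110.0, 110.0, 110.0]   # every prediction is off by 10
one_big_miss = [100.0, 100.0, 100.0, 140.0]  # a single prediction is off by 40
# MAE is the same in both cases: 10.0 and 10.0
print(mean_absolute_error(true_prices, even_misses), mean_absolute_error(true_prices, one_big_miss))
# MSE is four times larger for the single big miss: 100.0 and 400.0
print(mean_squared_error(true_prices, even_misses), mean_squared_error(true_prices, one_big_miss))
```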

03:21.210 --> 03:31.450
Okay, the loss, MSE, is used for training, and MAE is used for monitoring and human interpretation.

03:31.570 --> 03:37.770
As a quick recap, we need to pass three parameters in order to compile the model.

03:37.770 --> 03:41.970
For the compile function, we have the optimizer.

03:42.330 --> 03:55.130
The SGD optimizer is simple and is used for simple problems that don't need complex optimizers.

03:55.170 --> 03:57.810
Learning rate 0.01.

03:57.850 --> 03:59.970
Fast enough to learn quickly.

04:00.010 --> 04:02.100
Slow enough to be stable.

04:02.140 --> 04:09.420
MSE loss: the standard for regression, and it matches the squared error in your data generation.

04:09.580 --> 04:12.660
MAE: easy to interpret.

04:12.900 --> 04:14.540
Average dollar error.

04:14.580 --> 04:15.500
For example.

04:15.500 --> 04:18.540
In summary, compilation tells the model:

04:18.580 --> 04:20.980
Learn using gradient descent.

04:21.020 --> 04:27.020
Measure errors with MSE and show me MAE as a progress report.
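
NOTE
A minimal sketch of how the compile call summarized here might look with the Keras API in TensorFlow 2. The one-feature, one-layer model is a hypothetical stand-in, since the lesson's own model was built in an earlier video and may differ.
```python
import tensorflow as tf
# Hypothetical stand-in model: one input feature, one output (the price).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(1),
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),  # how to learn
    loss="mse",       # what to minimize
    metrics=["mae"],  # how to measure progress
)
```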

04:27.020 --> 04:32.860
I want you to write all these notes down because they are very important.

04:32.860 --> 04:42.100
By the way, don't worry, I'm going to provide you with all of those notes, all of those notebooks,

04:42.100 --> 04:44.580
with detailed explanations.

04:44.580 --> 04:54.820
So as a quick recap, compilation tells the model: learn using gradient descent, measure errors with

04:54.860 --> 04:59.780
MSE, and show me MAE as a progress report.
