WEBVTT

00:00.080 --> 00:01.080
Welcome back.

00:01.120 --> 00:04.000
We get an out-of-range,

00:04.320 --> 00:05.400
impossible

00:05.400 --> 00:06.680
predicted mileage,

00:06.840 --> 00:08.040
and this is wrong.

00:08.320 --> 00:15.280
This is due to a wrong feature order, wrong scaling constants, or a data entry bug.

00:15.520 --> 00:19.440
Let me check the feature order and constants.

00:19.600 --> 00:22.720
So let me go back to the car features.

00:22.840 --> 00:27.160
Split the screen and here we have the order.

00:27.160 --> 00:30.440
Let me check the order against the scaling features.

00:30.640 --> 00:32.920
Cylinders, displacement,

00:33.080 --> 00:35.320
horsepower, weight:

00:35.360 --> 00:38.280
they are the same. Acceleration,

00:38.280 --> 00:41.000
then model year, then origin one, two, and three.

00:41.280 --> 00:44.080
Acceleration, model year, origin one, two, and three.

00:44.440 --> 00:48.960
Those are for the standard deviation and the mean.

00:48.960 --> 00:50.560
So here: cylinders,

00:50.560 --> 00:51.120
displacement,

00:51.160 --> 00:51.760
horsepower,

00:51.760 --> 00:52.080
weight,

00:52.080 --> 00:55.400
acceleration, model year, origin one, two, and three.

00:55.440 --> 00:57.560
And those are the same.

00:57.600 --> 00:58.080
Okay.

00:58.480 --> 01:02.560
So the order is not the problem.

01:02.560 --> 01:05.040
Let's see the input data.

01:05.240 --> 01:10.340
So let's scroll down to here for the statistical summary.

01:10.340 --> 01:12.900
And let me put it like this.

01:13.140 --> 01:19.140
The mean for cylinders is 5.47.

01:19.260 --> 01:23.260
Displacement is 194, horsepower is 104,

01:23.300 --> 01:27.860
and weight is 2977.

01:28.060 --> 01:29.100
And acceleration.

01:29.700 --> 01:32.980
The acceleration is the error.

01:33.180 --> 01:36.940
So here we get the error of 15.5.

01:36.980 --> 01:40.300
The model year is 75.

01:40.500 --> 01:43.380
The origin is 1.57.

01:43.380 --> 01:45.820
Let me check the feature standard deviations.

01:45.820 --> 01:54.820
So you see, guys, that with any error your data will be messed up, and the model will generate random and

01:54.820 --> 01:56.460
wrong predictions.

01:56.500 --> 01:57.900
The standard deviations, starting from the first:

01:57.900 --> 02:08.580
1.7, 104, 38, 849, 2.7, 3.6, and 0.8.

02:08.860 --> 02:12.420
So this is good for our application.

02:12.460 --> 02:15.180
Now let's run our application again.

02:15.180 --> 02:18.730
And by the way you can debug the data.

02:18.730 --> 02:22.490
So scroll down to here in the Predict mileage.

02:22.490 --> 02:26.610
Let me debug the data, the scaled data.

02:26.610 --> 02:33.650
So, to debug the scaled features, let me log the scaled cylinders value from the features

02:33.650 --> 02:36.890
with scaling. You can put it like this,

02:37.050 --> 02:40.330
Or you can print all the features.

02:40.570 --> 02:51.970
So println "scaled features", then features with scaling, dot forEachIndexed; go and set the index and the

02:51.970 --> 02:52.730
value.

02:52.770 --> 02:55.010
Print the feature and the index.

02:55.330 --> 02:57.170
Let's run our application.

02:57.170 --> 02:57.930
Here we go.

02:57.970 --> 02:59.930
This is our application. Open

02:59.930 --> 03:01.330
Logcat, clear

03:01.330 --> 03:02.210
Logcat.

03:02.370 --> 03:08.490
And here we have the data entered. Predict MPG.

03:08.690 --> 03:13.730
And here we get a new value and it's acceptable.

03:13.730 --> 03:16.810
So now it's 34 mpg.

03:16.970 --> 03:20.330
Let's select European and predict the mileage.

03:20.330 --> 03:21.970
It's 25.

03:22.010 --> 03:29.190
Let's select Japanese and predict the MPG, and it's working fine.

03:29.190 --> 03:30.950
Let's go back to American.

03:31.230 --> 03:33.190
Let's predict the features.

03:33.190 --> 03:39.110
And here we get the scaled features: index zero is -0.83.

03:39.310 --> 03:44.030
Let's compare them with our Google Colab.
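
NOTE
A minimal Python sketch of the Colab side of this comparison, assuming the usual z-score scaling (value minus mean, divided by standard deviation). The function name, the feature list, and every number below are illustrative placeholders, not the values from the lesson's notebook.
```python
# Hypothetical feature order; it must match the order used at training time.
FEATURES = ["cylinders", "displacement", "horsepower"]
def standardize(sample, means, stds):
    """Return (value - mean) / std for each feature, keeping the input order."""
    if not (len(sample) == len(means) == len(stds)):
        raise ValueError("sample, means and stds must have the same length")
    return [(x - m) / s for x, m, s in zip(sample, means, stds)]
# Placeholder statistics and sample, chosen only to keep the arithmetic easy.
means = [5.0, 200.0, 100.0]
stds = [2.0, 100.0, 40.0]
sample = [4.0, 150.0, 100.0]
scaled = standardize(sample, means, stds)
# Print index and value so each line can be matched against the app's log.
for index, value in enumerate(scaled):
    print(f"feature {index} ({FEATURES[index]}): {value:.4f}")
```
If a line here differs from the same index in the app's log, the mean or standard deviation for that feature is the first thing to check.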

03:44.230 --> 03:55.110
So here, if we take the same sample, a new car: four cylinders, 150, 100 horsepower, 3000, 15 acceleration, and

03:55.110 --> 03:58.430
model year 76, origin American. Run.

03:58.430 --> 04:02.230
It will give me 20.54.

04:02.470 --> 04:05.790
And this is a big difference from our application.

04:05.830 --> 04:09.030
Those are similar to what we've done.

04:09.230 --> 04:12.630
Select American, predict, and it's 34.

04:12.630 --> 04:14.270
And this is a big problem.

04:14.310 --> 04:23.110
Maybe the problem is here, in the standard deviations and the means: we need to modify those origin values, because

04:23.270 --> 04:26.070
the other features are the same, guys.

04:26.230 --> 04:29.390
So here we have a lot of features.

04:29.390 --> 04:33.070
The first six features are the same as before.

04:33.070 --> 04:35.900
We need to configure the origin.

04:36.140 --> 04:40.220
So let me test again with this code.

04:40.260 --> 04:48.620
So here, under this, go and start a new cell and paste: X train origin one, origin two, and origin three.

04:48.660 --> 04:51.540
Get the mean and the standard deviation of them.

04:51.700 --> 04:53.180
And here we go.

04:53.220 --> 04:57.740
We get the standard deviation and the mean of them.

04:57.780 --> 05:00.780
So those are for the standard deviation.

05:00.780 --> 05:02.580
Let me correct them.

05:02.580 --> 05:05.500
So here the standard deviation.

05:05.500 --> 05:07.020
Replace it by this.

05:07.220 --> 05:10.780
Then this and this.

05:10.940 --> 05:14.060
Also we need to correct the mean.

05:14.100 --> 05:27.940
Now our mission is to get the feature means. Here we see, guys, that the problem is that origin is an integer.

05:28.060 --> 05:33.060
And we split the origin into three columns.

05:33.060 --> 05:39.060
So we need to get the statistics from the three columns, not from the single column.
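
NOTE
A minimal Python sketch of the point above, using only the standard library: once origin is split into three one-hot columns, each column has its own mean and standard deviation, and none of them equals the statistics of the raw integer column. The origin values are made-up placeholders, and pstdev (population standard deviation) is an assumption; use statistics.stdev instead if the training code used the sample standard deviation.
```python
from statistics import fmean, pstdev
def one_hot(origin, categories=(1, 2, 3)):
    """Encode an integer origin code as three 0/1 indicator values."""
    return [1.0 if origin == c else 0.0 for c in categories]
# Placeholder origin codes (1, 2, 3), not the real training data.
origins = [1, 1, 2, 3, 1, 2, 1, 3]
rows = [one_hot(o) for o in origins]
columns = list(zip(*rows))  # transpose rows into one tuple per one-hot column
col_means = [fmean(col) for col in columns]
col_stds = [pstdev(col) for col in columns]
raw_mean = fmean(origins)  # statistic of the single integer column
print("raw origin mean:", raw_mean)
print("one-hot means:", col_means)
print("one-hot stds:", col_stds)
```
The three per-column means and standard deviations are the constants that would replace the single origin entry in the scaling arrays.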
