WEBVTT

00:00.080 --> 00:00.760
Welcome back.

00:00.760 --> 00:03.800
We finished the first step, which is preparing the data.

00:03.840 --> 00:07.320
Now let's train the linear regression model.

00:07.320 --> 00:10.200
We'll use linear regression to fit a line.

00:10.240 --> 00:13.720
Price equals w times size plus b.

00:13.760 --> 00:17.640
This is the linear regression relationship.
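
NOTE A minimal sketch of the relationship price = w * size + b as a plain Python function.
The name predict_price and the values passed to it are hypothetical, chosen only to illustrate the formula.
```python
def predict_price(size, w, b):
    """Linear model: price = w * size + b."""
    return w * size + b
# Hypothetical parameters for illustration only, not the trained values.
example = predict_price(size=2.0, w=3.0, b=2.0)
print(example)
```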

00:17.680 --> 00:21.080
LinearRegression is a model from sklearn,

00:21.320 --> 00:22.560
specifically sklearn.linear_model.

00:22.600 --> 00:30.080
We use linear regression here because it tries to predict a numeric value, like price, based on one

00:30.080 --> 00:32.600
or more input features like size.

00:32.880 --> 00:38.080
It assumes that the relationship between the features and the target is linear.

00:38.080 --> 00:47.040
That is, it can be approximated by a straight line in 2D, or by a flat plane (a hyperplane) in higher dimensions.

00:47.040 --> 00:50.360
So think of it as a straight line.

00:50.360 --> 00:54.800
So if we go up here, we have a straight line.

00:54.800 --> 00:58.720
As you can see, it's a straight line.

00:58.720 --> 01:00.320
So this is linear.

01:00.360 --> 01:04.720
This is the linear relationship between size and price.

01:04.760 --> 01:08.570
So this is the basic linear regression model.

01:08.610 --> 01:15.250
First, we need to create the model and tell the Python interpreter

01:15.410 --> 01:19.330
that we want to use linear regression.

01:19.330 --> 01:22.650
So model = LinearRegression().

01:22.690 --> 01:26.890
Here we create an instance of the LinearRegression class.

01:27.050 --> 01:32.010
At this point the model exists but hasn't learned anything yet.

01:32.210 --> 01:35.650
It's like a blank sheet waiting to be trained.
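
NOTE A small sketch of the "blank sheet" idea, assuming scikit-learn is installed.
Before fit is called, the instance has no learned coef_ attribute yet.
```python
from sklearn.linear_model import LinearRegression
# Create an untrained model instance.
model = LinearRegression()
# The learned parameters only appear after calling fit.
is_trained = hasattr(model, "coef_")
print(is_trained)
```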

01:35.930 --> 01:36.490
Okay.

01:36.730 --> 01:41.170
And this is from the library that we imported before.

01:41.370 --> 01:46.090
The second step is to train this model.

01:46.250 --> 01:49.850
So here we are training the model.

01:49.970 --> 01:52.530
We use model.fit.

01:52.530 --> 02:00.370
So I created an instance of the LinearRegression class, and I use this instance to train the model.

02:00.570 --> 02:02.090
model.fit.

02:02.370 --> 02:06.690
The fit method is used to train the model.

02:06.690 --> 02:08.010
So fit.

02:08.050 --> 02:10.930
I love to write notes down.

02:10.930 --> 02:15.290
Please pay attention to those notes because they are very important.

02:15.290 --> 02:23.170
Please write them down, because they serve as revision and help you understand the code better.

02:23.330 --> 02:26.370
It looks very simple.

02:26.410 --> 02:27.130
Yes.

02:27.130 --> 02:30.250
So model.fit is the function.

02:30.250 --> 02:33.090
This is the method that trains the model.

02:33.290 --> 02:40.930
It looks at the training data, X_train for sizes and y_train for prices, and calculates the best

02:40.970 --> 02:49.650
slope w and intercept b that minimize the difference between predicted and actual prices.

02:49.650 --> 02:53.210
So here training the model involves many steps.

02:53.210 --> 03:01.610
So this function gets X_train and y_train, which are the sizes and prices respectively, and calculates

03:01.610 --> 03:10.530
the best slope w and intercept b that minimize the differences between predicted and actual prices.

03:10.730 --> 03:17.420
After this step, the model knows the relationship between house size and price.
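
NOTE A hand-written sketch of what "best slope and intercept" means for a single feature: ordinary least squares.
This is the textbook closed-form formula, not a copy of scikit-learn's internals.
The data is made up and follows price = 3 * size + 2 exactly; fit_line is a hypothetical helper name.
```python
def fit_line(sizes, prices):
    """Return (w, b) minimizing the sum of squared errors for price = w * size + b."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(prices) / n
    # Slope: covariance of x and y divided by variance of x.
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices))
    den = sum((x - mean_x) ** 2 for x in sizes)
    w = num / den
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - w * mean_x
    return w, b
# Made-up training data (an assumption for illustration).
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [5.0, 8.0, 11.0, 14.0, 17.0]
w, b = fit_line(sizes, prices)
print(w, b)
```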

03:17.620 --> 03:21.140
Again, guys, I want you to focus with me here.

03:21.140 --> 03:25.580
We have a linear relationship between size and price.

03:25.580 --> 03:26.900
We have this straight line.

03:26.900 --> 03:34.140
And as you know, we have this relationship: price equals 3 times size plus 2.

03:34.180 --> 03:38.180
This is a typical linear relationship between size and price.

03:38.220 --> 03:41.660
It's in the form y = ax + b.

03:42.060 --> 03:45.020
So a is the slope, or we can call the slope w.

03:45.020 --> 03:55.540
You can write the linear relationship as y = wx + b.

03:55.700 --> 04:00.300
So the slope is called w and the intercept is b.

04:00.700 --> 04:07.340
The model chooses these values to minimize the difference between predicted and actual prices.

04:07.380 --> 04:13.020
After this step the model knows the relationship between house size and price.

04:13.060 --> 04:13.500
Okay.

04:13.740 --> 04:17.820
The third step is to extract the learned parameters.

04:17.870 --> 04:21.550
Here we have the learned parameters, w and b.

04:21.830 --> 04:27.950
model.coef_ contains the weights, or slopes, learned for the features.

04:27.950 --> 04:34.670
Since we have only one feature, which is the size, we take the first element, at index 0.

04:34.670 --> 04:36.750
So model.coef_.

04:37.110 --> 04:43.510
Since we have only one feature, we take only element 0 from this array.

04:43.710 --> 04:54.990
Then b contains the bias, or intercept: the value of price when size equals

04:55.030 --> 04:56.150
zero.
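
NOTE A sketch of the steps described so far, assuming scikit-learn and NumPy are installed.
The training data is made up and follows price = 3 * size + 2 exactly, so it is not the lesson's dataset.
The names X_train and y_train are assumed spellings of the variables mentioned in the lesson.
```python
import numpy as np
from sklearn.linear_model import LinearRegression
# Made-up data (an assumption): one feature column, shape (n_samples, 1).
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y_train = np.array([5.0, 8.0, 11.0, 14.0, 17.0])
model = LinearRegression()
model.fit(X_train, y_train)
# coef_ holds one weight per feature; with one feature we take index 0.
w = model.coef_[0]
# intercept_ is the predicted price when size equals zero.
b = model.intercept_
print(f"price = {w:.2f} * size + {b:.2f}")
```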

04:56.190 --> 04:59.830
Let's run this cell and here we go.

04:59.870 --> 05:08.110
Model trained. Learned equation: price equals 2.9 times size plus 2.7.

05:08.270 --> 05:19.030
So after training, the model has learned the equation: price equals 2.9 times

05:19.070 --> 05:21.750
size plus 2.7.

05:21.790 --> 05:24.110
So this is the magic.

05:24.150 --> 05:27.670
This is how the model gets trained,

05:27.790 --> 05:29.310
how the model

05:29.510 --> 05:40.230
arrives at this equation: y = 2.9x + 2.7.

05:40.430 --> 05:45.230
So price equals 2.9 times size plus 2.7.

05:45.390 --> 06:03.830
So each 1,000 ft² increase in house size adds roughly $2,900 to the price, starting from a base of

06:04.150 --> 06:07.670
$2,170.
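
NOTE A short arithmetic sketch of how to read the slope.
It assumes size is measured in units of 1,000 ft² and price in thousands of dollars, which is implied by the lesson but not stated outright.
The helper name price_increase_dollars is hypothetical.
```python
# Slope reported in the lesson: thousands of dollars per 1,000 ft² (assumed units).
w = 2.9
def price_increase_dollars(extra_size_units, slope=w):
    """Dollar increase in predicted price for a given increase in size units."""
    return slope * extra_size_units * 1000
increase = price_increase_dollars(1)
print(round(increase))
```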

06:07.910 --> 06:16.670
So in general, what we've done here: we created a linear regression model, trained it on the training

06:16.670 --> 06:24.390
data, extracted the slope w and intercept b, and printed the learned linear equation.
