WEBVTT

00:00.080 --> 00:01.000
Welcome back.

00:01.040 --> 00:04.720
In this video we're going to talk again about StandardScaler.

00:04.720 --> 00:11.120
This is a very important concept that you should learn so you can use it in upcoming

00:11.120 --> 00:15.880
applications and models, because it is crucial.

00:16.280 --> 00:23.320
StandardScaler is a data preprocessing tool that transforms your features to have a mean of zero and

00:23.320 --> 00:25.120
a standard deviation of one.

00:25.320 --> 00:31.480
This is also called standardization, or z-score normalization.

00:31.520 --> 00:43.280
Again, the formula is z = (x - mean) / std: the feature value x minus the mean of the feature,

00:43.400 --> 00:47.240
all over the standard deviation of the feature.
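
NOTE
A minimal sketch of the formula above, using only the Python standard library.
The numbers are illustrative, not taken from the course dataset, and the helper
name standardize is hypothetical. It assumes the population standard deviation,
which is also what scikit-learn's StandardScaler uses.
```python
from statistics import fmean, pstdev
def standardize(values):
    """Return z-scores: (x - mean) / standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(x - mu) / sigma for x in values]
# Illustrative values only (made-up weights in kg and cylinder counts).
weights_kg = [2000.0, 3000.0, 4000.0]
cylinders = [4.0, 6.0, 8.0]
weights_z = standardize(weights_kg)
cylinders_z = standardize(cylinders)
print(weights_z)
print(cylinders_z)
```
Both lists standardize to roughly [-1.22, 0.0, 1.22], so after scaling the
weight values no longer dwarf the cylinder counts.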

00:47.280 --> 00:54.800
Again, there is a very important note: without StandardScaler, the weight feature,

00:54.800 --> 01:04.040
for example 3000 kg, would completely dominate the cylinders feature, which ranges

01:04.040 --> 01:09.800
between 4 and 8 cylinders, making your neural network learn poorly.

01:10.080 --> 01:14.200
So we don't want any feature to dominate just because its numbers are larger.

01:14.400 --> 01:22.560
We need to apply StandardScaler so that the features can be compared.

01:22.560 --> 01:29.640
We have nine features, each with a different scale and a different unit.

01:29.640 --> 01:35.000
So we need to normalize all those features to one scale.

01:35.280 --> 01:40.400
And this is the main purpose behind using StandardScaler.

01:40.440 --> 01:48.040
Don't worry, we're going to look at StandardScaler in the next videos and lectures, because

01:48.040 --> 01:54.120
it's a very important tool and a very important concept.

01:54.160 --> 01:58.720
It's not only about importing it from sklearn.preprocessing.

01:58.920 --> 02:07.120
It's a very important concept you should use in machine learning, in AI, or anywhere you are dealing

02:07.120 --> 02:11.640
with features that have different units and different scales.

02:11.760 --> 02:20.160
StandardScaler makes features comparable by giving them the same scale, which is critical for neural networks.

02:20.160 --> 02:26.280
It helps training converge faster. To prevent data leakage:

02:26.320 --> 02:27.200
Fit on train.

02:27.200 --> 02:28.560
Transform on test.
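
NOTE
A minimal sketch of "fit on train, transform on test" using only the Python
standard library. The helper names fit and transform are hypothetical and the
numbers are made up. scikit-learn's StandardScaler exposes the same two steps
as its fit and transform methods.
```python
from statistics import fmean, pstdev
def fit(train_values):
    """Learn the mean and standard deviation from the training data only."""
    return fmean(train_values), pstdev(train_values)
def transform(values, mu, sigma):
    """Apply already-learned statistics; nothing is re-estimated here."""
    return [(x - mu) / sigma for x in values]
# Illustrative weights in kg (made-up numbers).
train_weights = [2000.0, 3000.0, 4000.0]
test_weights = [2500.0, 3500.0]
mu, sigma = fit(train_weights)
train_scaled = transform(train_weights, mu, sigma)
test_scaled = transform(test_weights, mu, sigma)  # reuses the training mean and std
print(test_scaled)
```
The test values are scaled with the training mean and standard deviation, so
no information from the test set leaks into the statistics the model sees.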

02:28.640 --> 02:29.880
It's a simple concept:

02:29.920 --> 02:32.840
the value minus the mean, all over the standard deviation.

02:32.840 --> 02:35.160
This is the main concept behind it.

02:35.200 --> 02:44.360
Without StandardScaler, the weight feature, at 3000 or 4000 kg, would completely dominate the cylinders

02:44.360 --> 02:50.960
feature, the displacement feature, or the horsepower feature,

02:51.080 --> 02:54.640
making your neural network learn poorly.
