WEBVTT

00:00.080 --> 00:01.000
Welcome back.

00:01.040 --> 00:05.640
In this video, we're gonna learn about the types of ML algorithms.

00:05.640 --> 00:09.200
And we're going to take a deep dive into supervised learning.

00:09.360 --> 00:17.440
Supervised learning is a type of machine learning where a model learns from labeled data.

00:17.640 --> 00:24.160
Meaning every input has a corresponding correct output.

00:24.160 --> 00:32.720
The model makes predictions and compares them with the true outputs, adjusting itself to reduce errors

00:32.720 --> 00:35.080
and improve accuracy over time.

00:35.200 --> 00:41.360
The goal is to make accurate predictions on new, unseen data.

00:41.600 --> 00:49.040
For example, a model trained on images of handwritten digits can recognize new digits

00:49.160 --> 00:52.080
it has never seen before.

00:52.120 --> 01:00.960
Under supervised learning we have classification, regression, and forecasting. Classification teaches a

01:00.960 --> 01:04.520
machine to sort things into categories.

01:04.800 --> 01:12.080
It learns by looking at examples with labels like emails marked spam or not spam.

01:12.400 --> 01:20.960
After learning, it can decide which category new items belong to, like identifying if a new email

01:20.960 --> 01:22.640
is spam or not.
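
NOTE
Editor's sketch, not part of the spoken lesson: one very small way to picture the spam example is a 1-nearest-neighbor classifier, which gives a new email the label of the most similar labeled email. The two features (number of links, number of exclamation marks), the toy values, and the names TRAINING_EMAILS, squared_distance, and classify are all assumptions made up for illustration.
```python
# Toy labeled data: (num_links, num_exclamation_marks) for each email, plus its label.
# Features and values are hypothetical, chosen only to illustrate the idea.
TRAINING_EMAILS = [
    ((8, 5), "spam"),
    ((6, 7), "spam"),
    ((7, 4), "spam"),
    ((0, 1), "not spam"),
    ((1, 0), "not spam"),
    ((2, 1), "not spam"),
]
#
def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))
#
def classify(features, training_data=TRAINING_EMAILS):
    """Return the label of the closest labeled example (1-nearest neighbor)."""
    _, closest_label = min(
        training_data,
        key=lambda example: squared_distance(example[0], features),
    )
    return closest_label
#
print(classify((9, 6)))  # many links and exclamation marks: spam
print(classify((1, 1)))  # few of either: not spam
```
Real spam filters use far richer features and models; the point here is only that labels on past examples decide the category of a new item.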

01:22.920 --> 01:30.080
For example, a classification model might be trained on a data set of images labeled as either dogs or

01:30.120 --> 01:39.080
cats, and it can then be used to predict the class of new, unseen images as dogs or cats based on their

01:39.080 --> 01:45.320
features such as color, texture, shape, tail, and others.

01:45.440 --> 01:51.320
Okay, so classification means sorting things into categories.

01:51.360 --> 01:52.360
Regression.

01:52.520 --> 02:01.720
In regression tasks, the learning machine must estimate and understand the relationships between variables

02:01.720 --> 02:10.650
in a system by analyzing only one dependent variable, as well as a few other variables that are constantly

02:10.650 --> 02:11.370
changing.

02:11.690 --> 02:18.170
Regression analysis is particularly useful for forecasting and prediction.

02:18.170 --> 02:26.410
So regression is where the goal is to predict a continuous numerical value based on one or more independent

02:26.410 --> 02:27.210
features.

02:27.370 --> 02:33.330
It finds relationships between variables so that predictions can be made.

02:33.450 --> 02:36.170
We have two types of variables present in

02:36.170 --> 02:37.170
regression:

02:37.210 --> 02:42.050
the dependent variable, which is the target variable we are trying to predict,

02:42.050 --> 02:50.450
for example house price, and the independent variables or features, which are the input variables that influence

02:50.490 --> 02:54.010
the prediction, for example locality or number of rooms.

02:54.290 --> 03:02.850
So a regression problem is one where the output variable is a real or continuous value, such as

03:02.850 --> 03:04.850
salary or weight.
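
NOTE
Editor's sketch, not part of the spoken lesson: the house-price example can be pictured as fitting a straight line by ordinary least squares, with number of rooms as the independent variable and price as the dependent variable. The data values and the names fit_line and predict_price are assumptions made up for illustration.
```python
# Toy data (hypothetical): number of rooms and the matching house price.
rooms = [1, 2, 3, 4, 5]
prices = [100.0, 150.0, 200.0, 250.0, 300.0]
#
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept
#
slope, intercept = fit_line(rooms, prices)
#
def predict_price(num_rooms):
    """Predict a continuous price from the fitted line."""
    return intercept + slope * num_rooms
#
print(slope, intercept)   # 50.0 50.0 for this toy data
print(predict_price(6))   # 350.0
```
The output is a continuous number rather than a category, which is what separates regression from classification.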

03:04.890 --> 03:06.050
Forecasting.

03:06.250 --> 03:14.570
Forecasting involves analyzing past and present data to make predictions about the future.
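
NOTE
Editor's sketch, not part of the spoken lesson: the lesson does not name a forecasting method, so this uses a simple moving average purely as an assumed stand-in. It predicts the next value as the mean of the most recent observations. The monthly_sales values and the name moving_average_forecast are made up for illustration.
```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` past observations")
    recent = history[-window:]
    return sum(recent) / window
#
# Hypothetical past data, oldest first.
monthly_sales = [10.0, 12.0, 11.0, 13.0, 15.0]
next_month = moving_average_forecast(monthly_sales)
print(next_month)  # mean of 11.0, 13.0, 15.0 is 13.0
```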

03:14.570 --> 03:21.170
As a quick recap of supervised learning and its steps, the workflow of supervised machine learning

03:21.170 --> 03:23.730
follows these key steps.

03:24.010 --> 03:26.170
Training data set.

03:26.290 --> 03:27.810
Collecting labeled data.

03:27.850 --> 03:33.410
Gather a data set where each input has a known correct output.

03:33.570 --> 03:38.010
For example, images of handwritten digits like three.

03:38.290 --> 03:39.410
Like this.

03:39.610 --> 03:41.410
Like for example this.

03:41.610 --> 03:48.450
Those are the images of handwritten digits with their actual numbers as labels.

03:48.450 --> 03:57.090
So the label would be number three, the label here would be number three and so on.

03:57.330 --> 03:57.810
Okay.

03:57.970 --> 04:02.290
So every image has its correct label.

04:02.410 --> 04:11.130
Then, splitting the data set: divide the data set into training data, about 80%, and testing data, about

04:11.170 --> 04:11.850
20%.

04:11.850 --> 04:14.850
And this is what we're going to do in the next videos.
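
NOTE
Editor's sketch, not part of the spoken lesson: a minimal way to do the 80/20 split described above using only the Python standard library. The function name split_dataset, the seed, and the toy examples are assumptions made up for illustration; later videos may well use a library helper instead.
```python
import random
#
def split_dataset(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples, then hold out `test_fraction` for testing."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
#
# Hypothetical labeled data: (input id, digit label) pairs.
examples = [(i, i % 10) for i in range(50)]
train_data, test_data = split_dataset(examples)
print(len(train_data), len(test_data))  # 40 10
```
Shuffling before splitting matters when the data is stored in some order, for example all images of one digit first.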

04:15.090 --> 04:17.530
The model will learn from training data.

04:17.530 --> 04:28.100
And we evaluate it on the testing data. Then, training the model: feed the training data, the inputs

04:28.100 --> 04:36.980
and their labels, to a suitable supervised learning algorithm like decision trees, SVM, or linear

04:37.020 --> 04:37.940
regression.

04:37.940 --> 04:44.740
So in the next videos, we're going to learn about supervised learning algorithms.

04:44.780 --> 04:50.660
The model tries to find patterns that map inputs to correct outputs.

04:50.660 --> 04:55.420
Then the fourth step is to validate and test the model.

04:55.460 --> 04:58.180
Evaluate the model using testing data

04:58.180 --> 05:03.100
it has never seen before. The model predicts outputs.

05:03.220 --> 05:10.780
And those predictions are compared with the actual labels to calculate accuracy or error.
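
NOTE
Editor's sketch, not part of the spoken lesson: comparing predictions with the actual labels can be as simple as counting matches. Accuracy is the fraction predicted correctly, and the error rate is one minus that. The name accuracy and the toy label lists are assumptions made up for illustration.
```python
def accuracy(predictions, labels):
    """Fraction of predictions that equal the true labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
#
# Hypothetical test-set results for handwritten digits.
true_labels = [3, 7, 1, 3, 9]
predicted = [3, 7, 7, 3, 9]
acc = accuracy(predicted, true_labels)
error_rate = 1 - acc
print(acc)  # 4 of 5 correct, so 0.8
```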

05:10.900 --> 05:13.740
The last step is the prediction.

05:13.740 --> 05:23.020
Once the model performs well, it can be used to predict outputs for completely new, unseen data.

05:23.020 --> 05:25.380
Let me show you an example.

05:25.380 --> 05:29.060
Here we have the raw input data.

05:29.060 --> 05:31.700
And here we have the labels.

05:31.740 --> 05:38.700
Every element, every image here, has its own corresponding label.

05:38.700 --> 05:42.700
So, for example, this is an image of an elephant, elephant, elephant.

05:42.900 --> 05:45.180
And those are the elephants.

05:45.180 --> 05:50.740
And the label class contains the elephant label.

05:50.780 --> 05:54.580
Also this is true for the cow and camel.

05:54.580 --> 06:05.700
So every element here, every input, has a label. We gather a data set where each input has a known

06:05.700 --> 06:07.100
correct output:

06:07.260 --> 06:12.860
images of animals with their actual names as labels.

06:12.860 --> 06:15.300
Then splitting the data.

06:15.580 --> 06:25.900
Divide the data into training data, about 80%, and testing data, about 20%.

06:25.900 --> 06:28.430
For training the model,

06:28.470 --> 06:32.150
here we have the algorithm, as I told you.

06:32.190 --> 06:40.910
We have SVM, we have decision trees, we have linear regression, and so on.

06:40.990 --> 06:50.990
Based on this algorithm and training data set and the desired output, we are going to use and validate

06:51.150 --> 06:52.550
the model.

06:52.630 --> 06:59.270
So, evaluating the model using testing data: the model then predicts the output.

06:59.270 --> 07:06.990
And those predictions are compared with the actual labels to calculate accuracy or error.

07:07.150 --> 07:12.110
This is the level of processing the data.

07:12.150 --> 07:15.870
Then we get the output: the machine

07:15.910 --> 07:20.230
now recognizes that this image is an elephant,

07:20.270 --> 07:24.510
this image is a camel, and this image is a cow.

07:24.750 --> 07:32.190
Once the model performs well, it can be used to predict outputs for completely new, unseen data.
