WEBVTT

00:00.120 --> 00:05.200
Another type of machine learning is semi-supervised machine learning.

00:05.400 --> 00:14.800
It's similar to supervised machine learning, but it trains on both labeled and unlabeled data

00:14.800 --> 00:16.680
rather than on labeled data alone.

00:16.720 --> 00:24.360
The term labeled data refers to information that has a meaningful tag that allows the algorithm to understand

00:24.360 --> 00:32.680
the data, whereas unlabeled data does not have such a tag. A machine learning algorithm

00:32.680 --> 00:37.760
can be taught to label data that has not yet been labeled.

00:37.880 --> 00:46.440
Several unlabeled data sets are provided to the classifier after it has been trained on the labeled

00:46.440 --> 00:47.000
data.

00:47.040 --> 00:54.480
After classifying the unlabeled data, the model is retrained on those predicted labels together with the originally

00:54.480 --> 00:58.400
available labeled data to increase its accuracy.

00:58.640 --> 01:07.560
The goal is to learn a function that accurately predicts outputs based on inputs, as in supervised

01:07.560 --> 01:11.600
learning, but with much less labeled data.

01:11.640 --> 01:13.520
Let's take an example.

01:13.720 --> 01:19.280
Here we have an input of animals: an elephant, a cow, and a camel.

01:19.280 --> 01:23.050
And those are the pictures of those animals.

01:23.210 --> 01:27.330
But the problem here is that we don't have all the labels.

01:27.330 --> 01:36.770
We have partial labels, like camel and cow, but we don't have the label for the elephant. In semi-supervised

01:36.770 --> 01:44.850
learning, we train the model using a small amount of labeled data combined

01:44.850 --> 01:49.210
with a large amount of unlabeled data.

01:49.210 --> 01:54.450
The goal is to learn a function that accurately predicts outputs based on inputs.

01:54.450 --> 02:00.130
So here we are predicting that this animal would be an elephant.

02:00.130 --> 02:03.250
So the prediction is that it's an elephant.

02:03.290 --> 02:12.170
This is the goal of using this type of machine learning. It's a mix, where the teacher provides some concepts

02:12.170 --> 02:20.490
in a class, and the student practices with homework assignments based on those concepts.

02:20.490 --> 02:28.090
So this is similar to a student being taught some concepts in class and then trying to figure out

02:28.330 --> 02:32.210
and solve the homework assignment based on those concepts.

02:32.250 --> 02:35.170
Okay, so we are not training on fully labeled data.

02:35.170 --> 02:37.730
Not every input comes with a label.

02:37.730 --> 02:39.970
We don't have all the labels.

02:39.970 --> 02:41.850
We have partial labels.
