WEBVTT

00:00.120 --> 00:01.040
Welcome back.

00:01.160 --> 00:04.720
We used NumPy to generate random numbers.

00:04.720 --> 00:07.960
We also created two variables, size and price.

00:07.960 --> 00:13.520
Size depends on random numbers and price depends on size.

00:13.520 --> 00:17.040
So there is a linear relationship between size and price.

00:17.080 --> 00:24.080
We also converted the NumPy arrays into a pandas DataFrame, a table-like structure.

00:24.120 --> 00:33.520
We did that using pd.DataFrame, and we concatenated the size and the price to create

00:33.560 --> 00:34.560
this table.

00:34.840 --> 00:36.360
The size and the price.

00:36.360 --> 00:39.960
Those are randomly generated numbers.
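
NOTE
A minimal sketch of the recap above, not the exact code from the earlier video.
The seed, the coefficients, the noise level, and the column names "Size" and
"Price" are assumptions made only for illustration.
```python
import numpy as np
import pandas as pd
# Assumed values: the seed, the coefficients and the noise are illustrative only.
rng = np.random.default_rng(0)
size = 1000 + 2000 * rng.random((100, 1))               # size depends on random numbers
price = 50 + 0.2 * size + rng.normal(0, 20, (100, 1))   # price depends linearly on size, plus noise
# Concatenate the two columns side by side and wrap them in a table-like DataFrame.
data = pd.DataFrame(np.concatenate([size, price], axis=1), columns=["Size", "Price"])
print(data.head())
```
Each row of data holds one size and the price generated from it.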

00:40.000 --> 00:44.400
Now let's visualize the relationship here.

00:44.400 --> 00:48.120
We're going to use matplotlib.

00:48.360 --> 00:51.440
Start with plt.figure.

00:51.640 --> 00:58.760
And here, set the figure size, figsize, equal to eight by five.

00:58.960 --> 01:03.340
This creates a new figure.

01:03.380 --> 01:06.100
A new figure is the plot window.

01:06.300 --> 01:12.980
The figure size sets the width to eight inches and height to five inches.

01:13.220 --> 01:16.540
So here the width is eight inches,

01:16.700 --> 01:17.860
and the height is five inches.

01:17.900 --> 01:18.460
Okay.

01:18.580 --> 01:26.420
If we run the code, the output says figure size 800 by 500 with zero axes.

01:26.500 --> 01:29.020
So we need to plot something to get axes.
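
NOTE
A small sketch of the step above, assuming matplotlib is imported as plt.
It creates the empty figure and inspects it, which shows why the notebook
reports zero axes at this point.
```python
import matplotlib.pyplot as plt
# figsize is (width, height) in inches.
fig = plt.figure(figsize=(8, 5))
print(fig.get_size_inches())   # width and height in inches
print(len(fig.axes))           # no axes exist until something is plotted
```
The 800 by 500 in the notebook output is the size in pixels, that is, inches
multiplied by the figure's dots per inch.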

01:29.140 --> 01:32.220
Clear the input and output.

01:32.260 --> 01:32.820
Sorry.

01:32.860 --> 01:37.020
Now we're going to draw a scatter plot, a plot of dots.

01:37.180 --> 01:42.580
So for plotting, use plt.scatter.

01:42.820 --> 01:44.580
What data do we need to use?

01:44.820 --> 01:48.740
We need to use the data size.

01:48.860 --> 01:54.300
So from data, we need to get the size, which is this.

01:54.300 --> 01:58.380
So we are going to get this column, the size.

01:58.380 --> 02:03.180
And remember we stored all the data inside the data variable.

02:03.180 --> 02:05.590
So we need to access the data variable.

02:05.830 --> 02:07.030
How to access it?

02:07.070 --> 02:15.310
We specify its name, and inside the square brackets we specify the column that we need to display.

02:15.550 --> 02:17.750
This is the first one.

02:17.790 --> 02:19.630
This is the first axis.

02:19.910 --> 02:22.190
The second axis is the y.

02:22.390 --> 02:27.350
So the first variable is the x axis.

02:27.390 --> 02:30.110
The second one is the y axis.

02:30.110 --> 02:39.190
Then comes data, the price. Set the color to blue and the label to data points. Okay, this plots a scatter

02:39.190 --> 02:45.030
plot, dots, with the size on the x axis and the price on the y axis.

02:45.110 --> 02:50.790
The dots are blue, and the data points label is used in the legend.

02:50.830 --> 02:55.590
Okay, now if we run it, we have those dots.
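
NOTE
A hedged sketch of the scatter call described above. The stand-in data, the
column names "Size" and "Price", and the exact label text "Data points" are
assumptions; the video may spell them differently.
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Stand-in data so the sketch runs on its own; the values are illustrative only.
rng = np.random.default_rng(0)
size = 1000 + 2000 * rng.random(100)
data = pd.DataFrame({"Size": size, "Price": 50 + 0.2 * size + rng.normal(0, 20, 100)})
plt.figure(figsize=(8, 5))
# The first argument is the x axis, the second is the y axis.
points = plt.scatter(data["Size"], data["Price"], color="blue", label="Data points")
```
The label only becomes visible once a legend is added to the plot.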

02:55.630 --> 02:56.110
Okay.

02:56.150 --> 03:00.350
So here we have this figure eight by five inches.

03:00.470 --> 03:09.690
We have around 100 dots because we are not limiting the data to the first ten rows. In order to get the first

03:09.690 --> 03:10.970
ten rows,

03:11.090 --> 03:14.170
we need to ask for them inside the scatter call.

03:14.330 --> 03:16.370
So let me make it like this.

03:16.410 --> 03:19.650
In order to get them, use head.

03:19.690 --> 03:22.490
head(10) gets the first ten.

03:22.530 --> 03:23.890
head(10).

03:23.930 --> 03:25.090
Run it again.

03:25.090 --> 03:26.010
And here we go.

03:26.210 --> 03:31.730
We have only the first ten rows from the data set.
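
NOTE
A minimal sketch of limiting the plot to the first ten rows with head(10).
The stand-in data and the column names "Size" and "Price" are assumptions.
```python
import pandas as pd
import matplotlib.pyplot as plt
# Stand-in data with 100 rows; the values are illustrative only.
data = pd.DataFrame({"Size": range(1000, 1100), "Price": range(200, 300)})
first_ten = data.head(10)   # the first ten rows of the table
plt.figure(figsize=(8, 5))
# head(10) is applied to each column, so only ten dots are drawn.
points = plt.scatter(data["Size"].head(10), data["Price"].head(10), color="blue", label="Data points")
print(len(first_ten))
```
Calling head() with no argument would return the first five rows instead.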

03:31.730 --> 03:40.290
So since we are getting the first ten sizes and prices, those are the points plotted inside this figure.

03:40.330 --> 03:44.650
Now we can add a title at the top of the plot.

03:44.650 --> 03:52.010
So plt.title, house size versus price.

03:52.010 --> 03:59.370
Also, we can set the label for the x axis to tell the user that this is the size in square feet, and

03:59.370 --> 04:00.290
set the y label

04:00.290 --> 04:02.770
to say the price is in thousands of dollars.

04:02.770 --> 04:07.780
Also, we can add a legend and a grid, and then show the plot.

04:07.780 --> 04:17.700
So here we have this grid. In order to add it to the plot, cut this code and paste it here.

04:17.900 --> 04:20.180
Cut the code, paste it here.

04:20.340 --> 04:24.980
Use cut selection, not copy, and paste it here. Okay.

04:25.220 --> 04:26.660
Delete this cell.

04:26.660 --> 04:27.980
And here we go.

04:28.340 --> 04:35.140
Now we're going to display the same output, but with a grid and a legend.

04:35.140 --> 04:36.140
Run it again.

04:36.340 --> 04:37.860
You see the grid.

04:38.020 --> 04:41.220
You see the labels, the size and the price.

04:41.220 --> 04:42.940
You see the legend here.

04:43.180 --> 04:46.340
And the dots are inside this grid.
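
NOTE
A hedged sketch of the finished cell with title, axis labels, legend and grid.
The stand-in data, the column names, and the exact title and label strings are
assumptions based on what is said in the video.
```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Stand-in data so the sketch runs on its own; the values are illustrative only.
rng = np.random.default_rng(0)
size = 1000 + 2000 * rng.random(100)
data = pd.DataFrame({"Size": size, "Price": 50 + 0.2 * size + rng.normal(0, 20, 100)})
plt.figure(figsize=(8, 5))
plt.scatter(data["Size"].head(10), data["Price"].head(10), color="blue", label="Data points")
plt.title("House Size vs Price")   # title at the top of the plot
plt.xlabel("Size (sq ft)")         # x axis label
plt.ylabel("Price ($1000s)")       # y axis label
plt.legend()                       # shows the "Data points" label
plt.grid(True)                     # draws the grid behind the dots
ax = plt.gca()                     # handle on the current axes, handy for inspection
```
In a notebook, end the cell with plt.show() to display the figure.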

04:46.340 --> 04:53.100
So it is very important to visualize the data by plotting it on a chart like this.

04:53.100 --> 05:00.980
So in this video we learned how to draw a scatter plot, how to create this diagram showing

05:01.020 --> 05:07.940
the points for the first ten randomly generated rows.
