WEBVTT

00:00.960 --> 00:08.800
And to clearly explain what machine code and assembly language are, how they relate to processor architecture,

00:08.800 --> 00:15.120
and how programmers and hardware designers use binary instruction encoding to perform operations like

00:15.360 --> 00:20.600
addition, subtraction, and memory access.

00:22.040 --> 00:28.320
Now, this will be done in a way that is engaging, hands-on, and suitable for learning.

00:29.920 --> 00:32.400
Now let's begin with the fundamental idea here.

00:32.520 --> 00:36.080
Computers can encode logic as data.

00:36.480 --> 00:43.400
Now that might sound simple, but in reality it's actually what makes modern computing so powerful.

00:44.400 --> 00:54.520
Early mechanical calculators could add and subtract numbers, but they couldn't change what they were

00:54.640 --> 00:58.280
doing unless you physically rewired them.

00:58.280 --> 01:09.710
Now, computers can store instructions in memory, just like they store numbers, and then read and execute

01:09.910 --> 01:11.630
those instructions.

01:11.630 --> 01:14.630
And this is called machine code.

01:18.350 --> 01:24.310
Now it's the set of binary instructions that the processor understands natively.

01:24.710 --> 01:30.830
Now, for example, you can write this in binary.

01:35.070 --> 01:40.990
Now, 0001110010000

01:41.590 --> 01:42.110
and 001.

01:42.350 --> 01:53.870
Yes, these are just bits. With these zeros and ones, we are basically telling the CPU to add

01:53.910 --> 02:01.780
two to the value in R0 and store the result in R1.

02:02.420 --> 02:08.900
So instead of being locked into one task, a computer can change its behavior instantly.

02:08.940 --> 02:15.900
Now that's how we can install new software, update an application, or upgrade our operating system.

02:15.900 --> 02:22.100
All by changing the code that the CPU runs.

02:24.420 --> 02:25.820
Now think about it.

02:25.860 --> 02:31.940
How does a CPU know what to do when it sees a stream of ones and zeros?

02:32.220 --> 02:36.100
Now this is where instruction encoding comes in.

02:36.260 --> 02:41.900
So every processor has something called the instruction.

02:46.380 --> 02:46.980
Set.

02:57.180 --> 03:01.700
Instruction set architecture, or ISA.

03:07.540 --> 03:10.380
So this is like a language manual for the

03:10.580 --> 03:17.820
CPU, defining how every combination of bits should be interpreted.

03:18.060 --> 03:21.260
Now let's say we are designing our simple CPU.

03:21.620 --> 03:26.740
We will need to define what instructions exist.

03:26.780 --> 03:34.100
In this case, we will have add, subtract,

03:37.700 --> 03:38.660
And load.

03:41.500 --> 03:47.940
And we need to assign a unique binary pattern, called an opcode, to each instruction.

03:52.060 --> 03:59.890
And we need to decide how many bits to use and how to divide those bits among the parts of the instruction,

04:00.210 --> 04:01.450
which is what?

04:03.810 --> 04:04.890
Operation.

04:09.570 --> 04:12.050
Then it will be, we'll say inputs.

04:14.770 --> 04:16.010
And outputs.

04:24.010 --> 04:31.890
So here, what we will basically do is give ADD an opcode.

04:34.810 --> 04:39.970
Let's say we want to give it a bit pattern.

04:42.210 --> 04:45.730
Let's say, yes, it could be an arbitrary number.

04:45.730 --> 04:46.410
It doesn't have to be a whole byte.

04:46.450 --> 04:55.480
We are not obligated to use whole bytes or signed characters here, so we can use seven bits:

04:56.400 --> 05:00.440
0001110.

05:00.600 --> 05:01.960
So this is for add.

05:02.480 --> 05:05.680
And let's also use the sub which is subtraction.

05:07.800 --> 05:12.400
0001111.

05:13.040 --> 05:17.720
Now, these binary patterns are called machine instructions.

05:17.720 --> 05:22.640
And when you write a program in binary that is what you are writing.

05:23.760 --> 05:29.720
And writing binary by hand is tedious and error-prone.

05:29.760 --> 05:34.800
So instead, programmers use something called assembly language.

05:42.640 --> 05:45.120
This is a symbolic version of machine code.

05:45.280 --> 05:48.040
So each instruction has a mnemonic.

05:48.360 --> 05:51.550
This is a short code word that represents it.

05:51.550 --> 05:57.230
For example, ADD for addition and SUB for subtraction.

05:57.550 --> 06:06.390
So here's how our earlier instructions would look with mnemonics.

06:06.390 --> 06:08.750
So we have the operation.

06:08.750 --> 06:11.150
What it does is, let's say, addition.

06:14.110 --> 06:21.870
The opcode, in this case, is 0001110.

06:22.110 --> 06:24.790
And its mnemonic is ADD.

06:25.510 --> 06:31.070
And so we are basically creating a dictionary for a CPU here.

06:31.270 --> 06:33.350
So this one is for us, right?

06:33.390 --> 06:36.870
The operation is for us: the operation

06:36.870 --> 06:37.510
it performs.

06:39.710 --> 06:47.710
This is for the programmer who will use the architecture that we are developing here.

06:47.950 --> 06:48.940
And yeah.

06:48.940 --> 06:53.260
So this is the opcode in bits.

06:53.540 --> 06:55.980
And here this is a mnemonic.

06:58.580 --> 07:02.380
It is how we want to represent the instruction in assembly language.

07:02.820 --> 07:05.700
So we will also have the subtraction.

07:11.180 --> 07:19.300
So for subtraction, the opcode is 0001111.

07:19.940 --> 07:21.540
And its mnemonic is SUB.

07:24.580 --> 07:30.660
Assembly code is human readable but maps directly to machine instructions.

07:30.660 --> 07:34.380
So assemblers convert this into binary

07:34.580 --> 07:37.220
that the CPU can run.
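
NOTE
The "dictionary for a CPU" idea can be sketched as a lookup table. This is only
an illustration in Python, assuming the two 7-bit opcodes chosen in this lecture;
the names OPCODES and opcode_for are hypothetical and not part of any real assembler.
```python
# Hypothetical opcode table for the lecture's toy ISA (7-bit opcodes as bit strings).
OPCODES = {
    "ADD": "0001110",  # addition
    "SUB": "0001111",  # subtraction
}
def opcode_for(mnemonic):
    """Return the 7-bit opcode string for a mnemonic, ignoring letter case."""
    return OPCODES[mnemonic.upper()]
print(opcode_for("add"))  # 0001110
print(opcode_for("SUB"))  # 0001111
```
A real assembler does the same kind of lookup, only with a much larger table.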

07:37.820 --> 07:45.100
Now think of machine code as music written in binary and assembly as sheet music.

07:45.220 --> 07:50.490
So you play the same tune, but one is easier for humans to read.

07:53.130 --> 07:56.410
So we can say that assembly is basically machine code in a readable form.

07:58.490 --> 08:04.250
So you may also ask here, where do numbers come from?

08:05.330 --> 08:05.650
Huh?

08:06.010 --> 08:09.210
So yes, we are using numbers

08:11.770 --> 08:13.370
to refer to registers.

08:13.370 --> 08:19.530
So when we add numbers we have to specify what to add and where to put the result.

08:19.850 --> 08:24.210
So CPUs use registers to store temporary data.

08:24.210 --> 08:31.290
And registers are small storage locations built directly into the CPU for fast access.

08:31.450 --> 08:37.050
And in this example, let's write this code here.

08:38.450 --> 08:39.090
Add.

08:41.370 --> 08:42.370
Register.

08:46.930 --> 08:48.250
Register one.

08:51.370 --> 08:52.890
Register zero.

08:55.650 --> 08:56.090
Two.

08:56.930 --> 09:02.650
So Register zero is the source.

09:07.370 --> 09:07.810
The.

09:11.610 --> 09:15.530
This two here is an immediate value.

09:18.530 --> 09:20.090
Let's actually use a different color.

09:27.610 --> 09:31.970
And R1 here is the destination register.

09:31.970 --> 09:34.930
So this is where the result is stored.

09:41.010 --> 09:45.920
So registers are crucial because reading from memory is slow.

09:46.360 --> 09:49.680
So using registers keeps things fast and efficient.

09:51.520 --> 09:52.080
And.

09:54.360 --> 09:59.200
Another thing you need to learn here is the instruction format.

09:59.640 --> 10:01.560
So we will put it all together here.

10:01.720 --> 10:06.840
So let's assume we are building a 16 bit instruction.

10:07.200 --> 10:13.320
That means every instruction is exactly 16 binary digits long.

10:13.360 --> 10:16.680
So we'll break the binary down.

10:17.440 --> 10:23.480
But first, I will open a new

10:26.280 --> 10:27.560
Blue board here.

10:36.320 --> 10:36.800
Okay.

10:37.240 --> 10:46.430
So what we will do here is allocate seven bits for the opcode.

10:49.790 --> 10:58.510
So the opcode basically means which operation to do, as you remember from the previous lecture.

11:01.590 --> 11:07.470
And we will set aside three bits for the immediate value.

11:08.670 --> 11:10.630
This is a constant number let's say.

11:17.670 --> 11:25.470
And we will also set aside three bits for the source register.

11:31.230 --> 11:39.310
So we will write this as Rn. And we will set aside three bits,

11:39.310 --> 11:42.780
or, to speak more correctly,

11:43.140 --> 11:55.260
we will use three bits for the destination register, and in code we will write this as Rd, d

11:55.540 --> 11:58.700
for the destination here.

12:02.860 --> 12:08.100
So what we will do here is.

12:10.420 --> 12:11.380
In OP

12:13.780 --> 12:18.940
we will have seven bits.

12:23.580 --> 12:25.180
Seven bits.

12:26.340 --> 12:30.580
This is basically the OP, and in the

12:32.860 --> 12:37.020
constant number we will have three bits.

12:42.090 --> 12:52.770
In the source register, which is Rn, we will have again three bits, and in the destination register

12:53.370 --> 12:56.570
we will have again three bits.

12:59.170 --> 13:04.250
So let's manually encode this instruction.

13:04.930 --> 13:07.130
So the operation is ADD.

13:14.650 --> 13:15.130
Two.

13:28.090 --> 13:29.930
So let's break it down.

13:30.570 --> 13:36.010
So remember from the previous lecture the add is.

13:39.090 --> 13:44.090
0001110.

13:45.890 --> 13:52.130
This two in binary is 010, and R0,

13:52.170 --> 13:58.930
let's say hypothetically, is 000 and R1 is 001.

14:00.370 --> 14:03.810
So the full binary instruction is

14:05.010 --> 14:11.890
0001110010000

14:11.890 --> 14:12.970
and 001.

14:14.050 --> 14:21.010
And if we convert that into hexadecimal it is going to be

14:22.730 --> 14:28.170
0x1C81.

14:28.610 --> 14:29.010
Yeah.

14:30.170 --> 14:33.810
So this is basically how it is in hexadecimal.
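
NOTE
The manual encoding above can be checked with a few lines of Python. This is a
minimal sketch that assumes the field order used in this lecture (7-bit opcode,
3-bit immediate, 3-bit Rn, 3-bit Rd); the function name encode is hypothetical.
```python
def encode(opcode, imm, rn, rd):
    """Concatenate the fields: 7-bit opcode string, then imm, Rn and Rd as 3 bits each."""
    for value in (imm, rn, rd):
        if value not in range(8):
            raise ValueError("3-bit fields must be in the range 0..7")
    return opcode + format(imm, "03b") + format(rn, "03b") + format(rd, "03b")
# ADD R1, R0, #2 with the ADD opcode 0001110 chosen in the lecture
word = encode("0001110", imm=2, rn=0, rd=1)
print(word)                                # 0001110010000001
print("0x" + format(int(word, 2), "04X"))  # 0x1C81
```
Grouping the 16 bits into nibbles (0001 1100 1000 0001) gives the same hex digits by hand.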

14:35.160 --> 14:37.440
Now let's do another example.

14:40.200 --> 14:42.840
Let's say sub.

14:45.280 --> 14:51.080
By the way, you can swap these fields with each other.

14:51.080 --> 14:52.440
So it doesn't matter, the

14:54.680 --> 14:56.960
arrangement of these fields here.

14:58.000 --> 15:01.760
Basically we are creating a new architecture hypothetically.

15:02.520 --> 15:10.520
So again, let's say 2, R1, R0, or...

15:10.640 --> 15:14.240
Yes, let's do the same with R1.

15:14.800 --> 15:23.480
So here, remember from the previous lecture that we assigned SUB 0001111.

15:24.200 --> 15:27.520
So in this case, 000

15:30.840 --> 15:32.630
and four ones.

15:32.630 --> 15:32.910
Here.

15:32.950 --> 15:33.350
Yeah.

15:33.750 --> 15:34.350
One.

15:34.350 --> 15:34.910
One.

15:34.910 --> 15:35.390
One.

15:35.830 --> 15:38.550
And now it is time for this.

15:38.590 --> 15:39.070
Yeah.

15:39.350 --> 15:40.310
This is going to be again.

15:40.350 --> 15:41.070
Same.

15:41.790 --> 15:42.990
000.

15:43.390 --> 15:46.870
And again here it's going to be 001.

15:47.590 --> 15:53.110
And this is the encoding for the second instruction.

15:54.910 --> 15:58.990
So that's how it basically works.

15:59.790 --> 16:05.550
Now, of course, writing binary manually like this is impractical.

16:05.990 --> 16:15.310
So assemblers convert this readable format into the final machine code.

16:15.310 --> 16:21.950
So you can write a simple program that does this conversion automatically

16:22.150 --> 16:27.190
from assembly language to machine code.
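
NOTE
Such a conversion program could look roughly like the sketch below. It is a toy
assembler for the lecture's hypothetical ISA only. The source syntax
"ADD Rd, Rn, #imm" and the names OPCODES and assemble are assumptions made for
this illustration.
```python
OPCODES = {"ADD": "0001110", "SUB": "0001111"}  # opcodes chosen in the lecture
def assemble(line):
    """Assemble one line such as 'ADD R1, R0, #2' into a 16-bit string."""
    mnemonic, operands = line.split(maxsplit=1)
    rd, rn, imm = [part.strip() for part in operands.split(",")]
    # Strip the 'R' prefix from registers and the '#' prefix from the immediate.
    fields = [int(rd[1:]), int(rn[1:]), int(imm.lstrip("#"))]
    if any(value not in range(8) for value in fields):
        raise ValueError("register numbers and immediates must fit in 3 bits")
    rd_n, rn_n, imm_n = fields
    # Field order: opcode (7 bits), immediate (3), Rn (3), Rd (3).
    return (OPCODES[mnemonic.upper()] + format(imm_n, "03b")
            + format(rn_n, "03b") + format(rd_n, "03b"))
for source in ["ADD R1, R0, #2", "SUB R1, R0, #2"]:
    word = assemble(source)
    print(source, "=", word, "=", "0x" + format(int(word, 2), "04X"))
```
With the opcodes above, the two lines assemble to 0x1C81 and 0x1E81.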

16:27.190 --> 16:30.270
So you don't need to worry about bits and bytes here.

16:31.660 --> 16:32.380
Mostly.

16:33.180 --> 16:37.940
Not mostly, but I would say about 20% of the time.

16:38.220 --> 16:40.740
And you will see why I'm saying that.

16:42.820 --> 16:49.100
So let's also add the LDR instruction.

16:49.100 --> 16:53.380
You remember the load instruction.

16:54.220 --> 16:57.220
So we need to access the memory as well.

16:57.540 --> 16:58.100
Right.

17:00.180 --> 17:04.300
So what we will basically do is...

17:08.220 --> 17:17.020
So far we have covered the arithmetic instructions, like ADD and SUB, which operate

17:17.020 --> 17:19.700
on data that's already inside the CPU's registers.

17:19.700 --> 17:26.100
But what happens when we need to use a value that's stored in memory instead of in a register?

17:26.140 --> 17:34.290
Now this is where a new kind of instruction comes into play, which is the load instruction

17:34.490 --> 17:43.090
in our own, so-called, instruction set architecture by typhoon.

17:43.570 --> 17:44.050
Yeah.

17:44.690 --> 17:48.970
Now, let's assign the opcode for this instruction.

17:49.690 --> 17:55.010
So let's also write this here. And why are we using so much red here?

17:55.010 --> 17:56.290
I don't like the red.

17:56.650 --> 17:57.050
Yeah.

17:58.050 --> 18:00.210
So we will use the load.

18:01.970 --> 18:09.930
So for the load instruction, the opcode is going to be, again, seven bits.

18:10.090 --> 18:14.610
So 000, one, and 000.

18:15.130 --> 18:18.970
And the mnemonic is going to be LOAD.

18:31.720 --> 18:36.520
So what we will basically do here is.

18:40.680 --> 18:47.360
We also give the opcode an operation name.

18:47.360 --> 18:54.480
And the operation name is just for us to see what this instruction does.

18:56.080 --> 19:00.760
So we will say our load instruction takes two operands.

19:06.200 --> 19:11.040
It takes two operands: a register that will receive the data,

19:11.600 --> 19:13.320
so, the destination,

19:14.120 --> 19:20.400
and the register that contains the memory address to load from.

19:20.400 --> 19:20.960
Right.

19:21.000 --> 19:23.000
So it is.

19:23.040 --> 19:24.560
So it's called the source.

19:27.520 --> 19:29.040
Source reg.

19:32.240 --> 19:36.680
So now let's define our registers.

19:36.840 --> 19:38.960
Just like before.

19:39.280 --> 19:42.480
So we have a register, let's say R0.

19:42.520 --> 19:44.480
Let's actually use a different color.

19:45.160 --> 19:59.320
So we have registers: R0 is 000, R1 is 001, R2 is 010.

19:59.760 --> 20:02.840
And R... oops.

20:02.880 --> 20:09.440
Yes, R2 is 010 and R3 is 011.

20:10.720 --> 20:11.840
Now that's good.

20:12.160 --> 20:16.640
So if you want to write the load.

20:18.680 --> 20:22.520
R3, R2.

20:23.280 --> 20:30.790
That means: load the value from the memory address stored in R2

20:33.630 --> 20:37.030
and put it into R3.

20:41.630 --> 20:44.950
So let's encode this instruction.

20:45.630 --> 20:48.070
You won't need to do this in real life,

20:48.190 --> 20:53.230
turning instructions into bits and bytes by hand.

20:53.550 --> 20:59.470
But this will be very helpful for you in your learning process.

21:00.550 --> 21:00.990
Yeah.

21:01.350 --> 21:03.350
Now, let's get the opcode.

21:03.390 --> 21:11.910
You remember the opcode was three zeros, a one, and three zeros again, from here.

21:11.950 --> 21:12.350
Oops.

21:12.910 --> 21:13.830
From here.

21:14.550 --> 21:16.910
And um, again.

21:17.270 --> 21:23.460
So here we have 010 for R2: 010.

21:24.620 --> 21:40.620
And we also have 011 for R3, and three bits are unused, reserved for future use.

21:40.620 --> 21:43.140
So we will say just 000.

21:43.300 --> 21:45.540
So this is let's say reserved.

21:49.500 --> 21:52.780
Remember in our previous yeah it is deleted here.

21:52.780 --> 21:53.180
Yeah.

21:54.340 --> 21:57.660
In our subtraction and addition, we use these three bits,

21:57.820 --> 22:02.620
but in our load we don't need those three bits.

22:02.620 --> 22:07.900
So we will just write them as zeros, saying that they are reserved for now.

22:11.300 --> 22:16.580
Also, it is not a good practice to use reserved bits in registers.

22:18.140 --> 22:23.250
But you see, we have instructions for these places.

22:23.290 --> 22:31.970
CPU will not see this as a me thing, but yeah, it is good to keep in mind on this.

22:33.650 --> 22:34.170
Bits.

22:34.570 --> 22:39.250
They should be unique for each instruction here.

22:39.250 --> 22:41.130
So you can't use these bits

22:41.130 --> 22:43.410
for this one and the same bits for this one, right?

22:45.130 --> 22:45.450
Yeah.

22:45.450 --> 22:51.610
And you will learn the nitty-gritty of assembly language as you

22:54.010 --> 22:55.690
advance further in these lectures.

22:56.290 --> 22:57.130
So, yeah.

22:57.410 --> 23:04.010
So the final 16-bit instruction basically becomes this. You can write this

23:04.010 --> 23:11.130
in binary and in hex on your own to reinforce it.
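
NOTE
For checking your own answer, here is a minimal Python sketch of the LOAD R3, R2
encoding. It assumes that the three reserved bits sit where ADD and SUB keep the
immediate value, followed by Rn and then Rd; the lecture does not pin the order
down, so treat that layout and the name encode_load as assumptions.
```python
LOAD_OPCODE = "0001000"  # opcode chosen for LOAD in the lecture
RESERVED = "000"         # assumption: unused bits occupy the immediate field
def encode_load(rd, rn):
    """Encode LOAD Rd, Rn as opcode + reserved + Rn + Rd (16 bits in total)."""
    if rd not in range(8) or rn not in range(8):
        raise ValueError("register numbers must fit in 3 bits")
    return LOAD_OPCODE + RESERVED + format(rn, "03b") + format(rd, "03b")
word = encode_load(rd=3, rn=2)             # LOAD R3, R2
print(word)                                # 0001000000010011
print("0x" + format(int(word, 2), "04X"))  # 0x1013
```
A different field order would give a different bit pattern, which is fine as long
as the same order is used for every instruction.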

23:11.490 --> 23:11.930
Yeah.

23:16.370 --> 23:21.600
So we also have another question you may ask here.

23:22.240 --> 23:22.960
What happens?

23:23.480 --> 23:24.880
What happens during the load?

23:25.560 --> 23:34.680
Basically, first it reads the value in register R2, which is

23:35.360 --> 23:39.800
0100.

23:40.120 --> 23:45.320
Now it treats that value as a memory address and it goes to the memory location.

23:46.200 --> 23:47.880
It reads the value stored there,

23:48.240 --> 23:54.200
for example here, and it stores the value into register R3.

23:54.960 --> 23:59.480
And this is called indirect addressing.

24:00.080 --> 24:03.800
So we are using the contents of a register to point to a memory location.

24:04.560 --> 24:11.360
So basically R2 goes to 010.

24:11.800 --> 24:17.710
So, basically, say the memory, hypothetically speaking,

24:17.750 --> 24:18.750
Like here.

24:18.750 --> 24:20.070
It doesn't say anything.

24:20.470 --> 24:22.910
42, but so close to here.

24:22.910 --> 24:25.750
And say the memory has the value

24:28.030 --> 24:29.830
42. So

24:29.990 --> 24:30.950
it is

24:31.110 --> 24:31.750
loaded,

24:33.950 --> 24:37.550
loaded into R3.
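
NOTE
The load steps can be simulated in a few lines of Python. This is only a sketch:
the address 0x0100 held in R2 and the dictionary standing in for memory are
assumptions for illustration; the value 42 is the one used in the lecture.
```python
registers = [0] * 8     # R0..R7, since register fields are 3 bits wide
memory = {0x0100: 42}   # hypothetical memory: address 0x0100 holds 42
registers[2] = 0x0100   # R2 holds the address (assumed value)
def load(rd, rn):
    """LOAD Rd, Rn: read memory at the address held in Rn and write it to Rd."""
    address = registers[rn]          # step 1: read the address from Rn
    registers[rd] = memory[address]  # steps 2 and 3: fetch the value, store it in Rd
load(rd=3, rn=2)        # LOAD R3, R2
print(registers[3])     # 42
```
Note that R2 itself is unchanged; only its contents are used as a pointer.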

24:39.590 --> 24:48.190
And we can also write the opposite of the load, which is the store.

24:50.830 --> 24:57.990
We often also want to write values from a register back into memory.

24:58.030 --> 25:01.990
Now this is done using the store instruction.

25:02.270 --> 25:04.030
I'm just saying that hypothetically.

25:04.030 --> 25:08.310
We are just creating our own instruction set architecture

25:08.590 --> 25:13.470
to be prepared for the next lectures.

25:14.550 --> 25:20.950
Now this will store the value in R3 at the memory address pointed to by R2.
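
NOTE
The store is the mirror image of the load and can be sketched the same way in
Python. As before, the address 0x0100 and the dictionary used as memory are
assumptions for illustration, and the function name store is hypothetical.
```python
registers = [0] * 8     # R0..R7
memory = {}             # hypothetical memory, empty at the start
registers[2] = 0x0100   # R2 holds the target address (assumed value)
registers[3] = 42       # R3 holds the value to write
def store(rs, rn):
    """STORE Rs, Rn: write the value in Rs to the memory address held in Rn."""
    memory[registers[rn]] = registers[rs]
store(rs=3, rn=2)       # STORE R3, R2
print(memory)           # {256: 42}
```
The printed key 256 is simply 0x0100 shown in decimal.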

25:21.910 --> 25:25.630
And of course it will have a different opcode.

25:25.670 --> 25:31.990
Let's say, for the opcode, it will have 0001001.

25:33.350 --> 25:35.230
And as a mnemonic:

25:38.070 --> 25:40.510
STORE.

25:41.670 --> 25:43.350
Store Reg.

25:44.590 --> 25:45.190
And.

25:45.230 --> 25:45.670
Yeah.

25:47.350 --> 25:54.550
So you may ask, what is the equivalent of the load in the real ARM instruction

25:54.550 --> 25:55.430
set architecture?

25:55.430 --> 26:02.670
It is LDR, which you will learn in the next lectures.

26:03.310 --> 26:09.150
Now, let's review what we have learned today in this almost half-an-hour lecture.

26:09.470 --> 26:17.940
So machine code is the binary language that the CPU runs. Assembly is a readable version using mnemonics

26:18.020 --> 26:20.780
like ADD, SUB, and LOAD.

26:20.820 --> 26:23.380
And there are many more mnemonics.

26:23.580 --> 26:29.060
And every instruction is encoded as a series of bits that follow specific rules.

26:29.300 --> 26:38.820
And remember, registers are fast local storage units that keep temporary values, and memory is accessed using

26:38.860 --> 26:46.740
indirect addressing via instructions like LDR; in our own instruction set we use LOAD.

26:47.340 --> 26:48.020
And.

26:51.820 --> 26:52.500
Uh, yeah.

26:52.540 --> 26:54.980
In a real CPU like ARM,

26:55.220 --> 27:01.100
there are hundreds of instructions, but understanding a small subset like

27:01.140 --> 27:05.860
ADD, SUB, and LDR gives you a solid foundation.

27:06.100 --> 27:07.340
Now on the phone.

27:07.500 --> 27:09.340
Uh, thank you for watching.

27:09.620 --> 27:11.940
And I'll be waiting for you in the next lecture.
