WEBVTT

00:00.800 --> 00:01.480
Hello everyone.

00:01.480 --> 00:07.080
Again, I'm Tayfun and today we're starting a fresh lecture where we break down the basic structure

00:07.080 --> 00:11.400
of assembly instructions, the building blocks of any low-level program.

00:12.680 --> 00:18.520
Now, if you've ever looked at an assembly listing and thought it looked like pure chaos, this lecture

00:18.520 --> 00:20.840
will give you a clear map to understand it.

00:21.440 --> 00:25.200
So let's understand the structure of an assembly instruction.

00:25.520 --> 00:33.240
Assembly language is made up of straightforward lines of code, and most follow a standard format like

00:33.560 --> 00:33.920
this.

00:33.960 --> 00:39.760
At the far left we have a label or address.

00:41.560 --> 00:46.680
Then we have the mnemonic, and after that we have the operands.

00:48.800 --> 00:51.040
And then we have the comment.

00:52.760 --> 00:53.920
This is just for us.

00:53.920 --> 00:55.600
If we add a comment, there will be one.

00:55.640 --> 00:57.560
If we don't, there won't.

00:58.520 --> 01:03.240
Now, the first field is a label or address.

01:03.240 --> 01:05.720
For example, let's write some assembly code here

01:05.750 --> 01:13.990
00A92AB7

01:15.110 --> 01:15.750
mov.

01:17.470 --> 01:27.070
And for the operands we have eax, dword ptr, and

01:29.150 --> 01:36.190
we'll move it from another address, 0A9B for example.

01:37.150 --> 01:40.590
Yeah, just FA.

01:42.070 --> 01:42.590
That's it.

01:42.750 --> 01:45.750
And after that we have the comment.

01:46.470 --> 01:48.150
This comment just describes the move.

01:48.190 --> 01:58.390
Let's say we write a comment here that says it moves the default value at, uh, 0A9B, for

01:59.830 --> 02:00.030
now.

02:00.030 --> 02:01.870
The default value is the one at this address here.

02:03.870 --> 02:10.790
And yeah, let's start now with the label or address.

02:10.990 --> 02:21.620
Now, this part defines the memory location where the instruction resides. In debugging tools or disassemblers,

02:21.740 --> 02:24.180
this is often a raw memory address.

02:25.020 --> 02:35.140
And in this case it is 00A92AB7, just a random address.

02:35.420 --> 02:41.540
And during development, labels can be used instead, like start, loop, or read_value.

02:41.660 --> 02:48.900
Now these labels make code more readable and help during branching and function calls.

02:49.780 --> 02:53.140
And then we have the mnemonic.

02:53.180 --> 02:54.300
In this case it's mov.

02:54.820 --> 03:00.460
So this is the actual instruction, the operation you want the CPU to perform.

03:01.620 --> 03:13.900
Now, common mnemonics include mov, add, sub, and jmp: add adds values, sub subtracts

03:13.900 --> 03:17.370
values, and jmp jumps to another instruction.

03:17.410 --> 03:22.450
And we also have call which calls a function or subroutine.

03:23.250 --> 03:29.250
We had an introductory lecture to mnemonics and instructions in previous lectures.

03:29.690 --> 03:34.250
If you skipped that lecture, I strongly suggest you go back and watch it.

03:35.450 --> 03:41.850
Now, the mnemonic is a human-readable representation of a binary opcode, which the CPU will execute.

03:42.890 --> 03:49.890
And here we have the operands. The operands are basically the destination and the source.

03:50.490 --> 03:54.810
In our example: eax, dword ptr, and some hexadecimal address here.

03:55.490 --> 04:06.210
Now this means: move the 32-bit value (a dword is a 32-bit value) from that address up here

04:06.850 --> 04:15.770
into the EAX register. The CPU reads the value from memory and stores it in the register.

04:15.810 --> 04:26.470
Some operands are immediate values, like, for example, mov eax, 5.

04:27.350 --> 04:36.710
Some are registers, like EBX or ECX, and some are memory references, like this

04:36.710 --> 04:37.150
one here.

04:37.150 --> 04:37.550
Right.
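
NOTE
Editor's sketch, not part of the spoken lecture. The three operand kinds just
described (immediate, register, memory reference) can be told apart roughly like
this, assuming 32-bit x86 in Intel syntax. classify_operand and REGISTERS are
hypothetical names, the register set is deliberately incomplete, and the memory
address in the sample is made up for illustration.
```python
# Sketch: classify one operand as register, immediate, or memory reference.
# Assumption: Intel syntax, where memory references use square brackets.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
def classify_operand(operand):
    text = operand.strip().lower()
    # Memory references such as "dword ptr [address]" end with a bracketed part.
    if "[" in text and text.endswith("]"):
        return "memory"
    if text in REGISTERS:
        return "register"
    # Immediates: decimal, 0x-prefixed hex, or h-suffixed hex starting with a digit.
    if text.endswith("h") and text[0].isdigit():
        digits, base = text[:-1], 16
    elif text.startswith("0x"):
        digits, base = text[2:], 16
    else:
        digits, base = text, 10
    try:
        int(digits, base)
        return "immediate"
    except ValueError:
        return "unknown"
# The address in the last sample is hypothetical.
for op in ["eax", "5", "dword ptr [0x00A9B000]"]:
    print(op, "is", classify_operand(op))
```
Anything the sketch does not recognize is reported as "unknown" rather than guessed.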

04:38.630 --> 04:42.230
And yeah, the last one is the comment here.

04:43.630 --> 04:49.430
Anything after the semicolon in assembly is a comment.

04:49.550 --> 04:51.230
Comments are not executed,

04:51.390 --> 04:54.310
just like in any other programming language.

04:54.310 --> 04:58.350
They're just there to help humans understand what code is doing.

04:59.230 --> 05:06.430
And for example, in this case, we can add a comment that tells other programmers and

05:06.430 --> 05:09.030
us that this loads the counter value.

05:09.750 --> 05:13.110
Now, commenting is crucial in reverse engineering and debugging.

05:13.470 --> 05:18.470
Because when you're trying to understand someone else's code, especially without the source,

05:18.790 --> 05:22.310
these annotations become your friend.

05:23.790 --> 05:28.780
And to summarize: every line in an assembly language program, no matter whether

05:28.780 --> 05:36.900
it's ARM or x86, usually follows this clean structure: an address or label to tell where it lives,

05:37.220 --> 05:45.380
a mnemonic to say what it does, some operands to show what it does it to, and optionally a comment

05:45.980 --> 05:48.020
to explain why it is there.
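
NOTE
Editor's sketch, not part of the spoken lecture. The four-field structure just
summarized (address or label, mnemonic, operands, optional comment) can be
illustrated by splitting one disassembly-style line. This assumes Intel syntax,
a semicolon as the comment marker, and that the first token is an address or a
label ending in a colon. parse_line is a hypothetical helper name, and the
operand address in the sample line is made up for illustration.
```python
# Sketch: split one disassembly-style line into the four fields from the lecture.
def parse_line(line):
    # Everything after the first ';' is the comment and is never executed.
    code, _, comment = line.partition(";")
    # First token: address or label. Second: mnemonic. The rest: operands.
    tokens = code.split(None, 2)
    location = tokens[0].rstrip(":") if tokens else ""
    mnemonic = tokens[1].lower() if len(tokens) > 1 else ""
    operand_text = tokens[2] if len(tokens) > 2 else ""
    # Operands are comma-separated: destination first, then source.
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    return {
        "location": location,
        "mnemonic": mnemonic,
        "operands": operands,
        "comment": comment.strip(),
    }
# The instruction address is the one from the lecture; the operand address is hypothetical.
fields = parse_line("00A92AB7  mov eax, dword ptr [0x00A9B000] ; loads the counter value")
print(fields)
```
A real disassembler decodes bytes rather than splitting text, so this only mirrors the layout of a listing.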

05:49.740 --> 05:55.060
Now, learning this format helps make sense of disassembled binaries and lets you communicate with the

05:55.060 --> 05:57.260
CPU in its native language.

05:58.380 --> 06:05.900
Now in the next lecture, we will start analyzing common instruction types and patterns used in real

06:06.060 --> 06:09.180
world binaries, especially the opcodes.

06:09.180 --> 06:10.340
And we'll also learn

06:10.380 --> 06:11.780
about copying,

06:11.820 --> 06:12.980
how to copy data.

06:13.780 --> 06:20.260
And after that we will start analyzing other reverse engineering and malware tools.

06:21.540 --> 06:23.540
Now, that's it for this lecture.

06:23.620 --> 06:24.420
Thank you for watching.

06:24.460 --> 06:26.540
Stay tuned.

06:26.540 --> 06:32.540
And we'll turn this cryptic code into something you can read like a book.
