WEBVTT

00:00.400 --> 00:01.360
Hello everyone.

00:01.400 --> 00:05.480
I'm Tayfun, and welcome back to another awesome lecture.

00:09.680 --> 00:15.160
So today we are going to start our journey into the world of low-level programming by understanding

00:15.160 --> 00:21.400
something called ARM assembly language.

00:21.640 --> 00:27.400
Now if you have heard of it, you might already know it is critical for analyzing how software works

00:27.400 --> 00:34.240
on ARM-based systems like your smartphone, tablet, or even many embedded systems.

00:34.520 --> 00:39.120
But what exactly is assembly language and why should we care about it?

00:39.840 --> 00:42.520
So if we group programming languages,

00:48.120 --> 00:49.720
we have high-level

00:54.400 --> 00:55.400
and low-level.

00:55.400 --> 00:55.800
Right.

00:56.200 --> 00:57.560
Low level.

00:58.200 --> 01:03.690
So most programmers today write code in high-level languages like Python,

01:07.450 --> 01:11.170
C++, and even C here.

01:11.930 --> 01:14.650
And these are easier to read and understand.

01:14.690 --> 01:20.010
They look a bit like English and are designed for humans to work with.

01:21.170 --> 01:24.810
And even Java or JavaScript.

01:28.490 --> 01:33.130
Not exactly JavaScript, but like C#.

01:36.410 --> 01:37.770
And so on and so forth.

01:39.770 --> 01:44.610
However, your computer's brain, the CPU,

01:47.170 --> 01:49.850
doesn't understand this high-level code.

01:51.050 --> 01:59.730
So it only understands machine code, which is made up of long sequences of zeros and ones.

02:00.010 --> 02:07.500
And to bridge this gap, high-level code is compiled into binary machine code, which is actually

02:07.540 --> 02:09.420
what runs on the hardware.

02:09.860 --> 02:12.180
Now here's where assembly comes in.

02:12.220 --> 02:16.260
So assembly language is a human-readable version of machine code.

02:23.540 --> 02:27.660
Which is made up of long sequences of zeros and ones.

02:27.700 --> 02:32.780
Now, to bridge this gap, high-level code is compiled into binary

02:37.020 --> 02:40.340
machine code, which is what actually runs on the hardware.

02:40.340 --> 02:47.300
But in this phase we have lots of other steps, like linking libraries, which you will learn

02:47.300 --> 02:49.180
about later. But basically, for high-level code,

02:49.420 --> 02:53.380
the final destination is binary, which is low-level code.

02:54.660 --> 02:57.420
Now here's where assembly comes in.

02:58.140 --> 03:02.220
So assembly language is a human-readable version of machine code.

03:02.340 --> 03:09.790
And each assembly instruction corresponds to a single operation that the CPU can execute.

03:10.030 --> 03:17.910
So while the CPU doesn't run assembly directly, it's a critical step between high level code and this

03:17.950 --> 03:19.030
binary here.

03:20.030 --> 03:23.870
So why learn assembly for reverse engineering?

03:24.990 --> 03:31.750
So imagine trying to understand a book written in a strange language without any spaces or punctuation.

03:31.790 --> 03:34.470
That's what raw machine code looks like.

03:34.790 --> 03:38.150
And now assembly gives us structure and readability.

03:38.270 --> 03:47.310
And in reverse engineering, we often work with binaries or programs where the source code is unavailable.

03:47.350 --> 03:56.870
Now, by learning assembly, especially for ARM or x86-64,

03:59.630 --> 04:06.920
we can analyze how these programs work and even find bugs or security vulnerabilities.

04:07.720 --> 04:14.080
Now, to better appreciate this, let's rewind and talk about how computers started here.

04:14.760 --> 04:19.560
So at the most basic level, computers don't understand language like we do.

04:19.600 --> 04:23.600
Instead they deal with electrical signals.

04:24.640 --> 04:31.960
Now these signals are either on or off, and on is actually a one as the computer understands it.

04:32.120 --> 04:35.640
But you have learned that in previous lectures in computer architecture.

04:35.840 --> 04:40.960
So it is either 1 or 0, which is on or off.

04:41.600 --> 04:47.840
On corresponds to one, and off corresponds to zero.

04:50.640 --> 04:53.080
Now these are represented by the voltage levels.

04:53.240 --> 04:59.280
Now, to represent this in a way we can store and manipulate, we use the binary system.

05:01.840 --> 05:06.320
Now, each of these is a bit, which is short for binary digit.

05:06.320 --> 05:13.490
So bit actually means what? Binary digit.

05:22.570 --> 05:26.330
So one bit can only hold two values.

05:26.330 --> 05:27.010
Not much.

05:28.010 --> 05:33.890
One bit can only hold either 1 or 0.

05:36.050 --> 05:38.370
So let's take an example here.

05:39.930 --> 05:44.570
Let me write some random number here:

05:44.810 --> 05:52.250
84334537.

05:53.930 --> 05:58.530
This decimal number, written in binary, looks like

05:58.570 --> 06:05.930
100010, and so on.

06:05.930 --> 06:08.010
So this is just a random binary number right.

06:08.490 --> 06:11.260
So that is basically

06:11.820 --> 06:15.900
35 bits long.

06:17.580 --> 06:23.300
So you need 35 bits in binary to represent this large number.
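
NOTE
Editor's sketch, not part of the lecture: one way to see how many bits a number needs is to divide by two repeatedly and collect the remainders. The function name to_binary and the value 37 are hypothetical and chosen only for illustration; this is not the number written on screen.
```python
def to_binary(n: int) -> str:
    """Return the binary digits of a non-negative integer, most significant bit first."""
    if n == 0:
        return "0"
    bits = []
    while n > 0:
        bits.append(str(n % 2))  # the remainder is the next bit, lowest bit first
        n //= 2
    return "".join(reversed(bits))
example = 37  # hypothetical value, not the number from the lecture
digits = to_binary(example)
print(digits, len(digits))  # 100101 6
```
Python's built-in bin(37) gives the same digits with a 0b prefix, so it can be used to check the result.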

06:23.340 --> 06:25.180
Now this is pretty powerful right.

06:25.740 --> 06:27.780
But there's a catch here.

06:28.620 --> 06:29.620
How do

06:30.020 --> 06:30.660
we know,

06:32.980 --> 06:37.660
how do we know where one number ends and the next begins?

06:38.380 --> 06:44.980
And this is where the concept of a byte comes into play.

06:49.860 --> 06:53.260
So early computer designers faced this problem.

06:53.260 --> 06:59.260
Their solution was to create fixed-size groups of bits.

06:59.700 --> 07:04.060
And these groups are called what? Bytes.

07:08.180 --> 07:12.120
So how many bits should a byte have?

07:12.120 --> 07:16.200
Now, surprisingly, it wasn't always eight bits

07:18.480 --> 07:25.200
like we are used to. Early systems had either four-bit or

07:27.320 --> 07:31.160
six-bit bytes, for instance.

07:31.480 --> 07:32.960
So one byte,

07:33.160 --> 07:37.200
in old computers, one byte was four bits or six bits.

07:37.240 --> 07:47.120
For instance, IBM's early computers used a six-bit system called

07:49.480 --> 07:53.920
BCDIC in the 1950s.

07:55.680 --> 07:57.640
But eventually, with the IBM System/360

07:59.640 --> 08:12.850
and the EBCDIC format in the 1960s,

08:13.610 --> 08:22.490
the eight-bit byte became the standard.

08:22.490 --> 08:23.810
So today.

08:26.650 --> 08:29.650
Today basically one byte.

08:32.770 --> 08:38.210
One byte equals eight bits.

08:42.050 --> 08:47.330
So you may ask here: why eight bits?

08:50.490 --> 08:55.970
For this question we have three reasons.

08:56.010 --> 08:57.330
Three basic reasons.

08:57.330 --> 09:05.010
The first reason: you can represent 256 unique values.

09:13.300 --> 09:17.020
From 0 to 255.

09:17.220 --> 09:21.500
So basically 256. Second, it is large enough.

09:23.340 --> 09:32.500
It is large enough to represent every standard letter, number, or symbol in a single byte.

09:33.460 --> 09:35.580
And third, it is efficient.

09:41.620 --> 09:45.660
It is efficient for memory and storage.
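
NOTE
Editor's sketch, not part of the lecture: the 256 figure is simply 2 to the power of 8, and a standard ASCII letter fits in one byte. The variable names and the letter "A" are hypothetical choices for illustration.
```python
BITS_PER_BYTE = 8
values = 2 ** BITS_PER_BYTE          # number of distinct bit patterns in one byte
print(values, values - 1)            # 256 255
letter = "A"
code = ord(letter)                   # the character's code point, 65 for "A"
print(format(code, "08b"))           # 01000001
print(len(letter.encode("ascii")))   # 1
```
The last line shows that the ASCII encoding of this letter occupies exactly one byte.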

09:46.900 --> 09:51.620
And here's the cool part: a byte is just a series of bits,

09:53.220 --> 09:56.420
but how you interpret those bits

09:56.420 --> 09:58.220
depends on the software.

09:58.220 --> 10:05.180
So if we treat it as an unsigned number, we get values from zero...

10:05.300 --> 10:07.580
Let's actually write it down.

10:08.260 --> 10:13.140
So we have two types of bytes,

10:16.910 --> 10:18.710
or numbers:

10:20.870 --> 10:21.350
Yeah.

10:22.630 --> 10:23.190
Unsigned.

10:23.990 --> 10:26.670
Unsigned and signed.

10:33.750 --> 10:36.230
Let's draw the eight boxes here.

10:36.630 --> 10:37.110
Two.

10:37.630 --> 10:38.190
Three.

10:38.750 --> 10:39.310
Four.

10:39.670 --> 10:40.350
Five.

10:40.710 --> 10:41.350
Six.

10:41.710 --> 10:42.190
Eight.

10:43.110 --> 10:45.510
One, two, three.

10:45.950 --> 10:46.390
Four.

10:46.990 --> 10:47.670
Five.

10:48.070 --> 10:48.750
Six.

10:49.270 --> 10:49.790
Eight.

10:50.510 --> 10:52.470
Now remember, these are bits.

10:53.030 --> 10:57.470
Eight bits. Both unsigned and signed numbers

10:57.790 --> 10:59.350
have eight bits, of course.

10:59.790 --> 11:04.790
Basically, a byte means eight bits.

11:05.190 --> 11:05.590
So.

11:07.830 --> 11:17.880
If we treat a byte as an unsigned number, we get values from 0 to 255.

11:18.560 --> 11:26.280
If we use two's complement, we can get signed values from -128 to

11:31.000 --> 11:34.240
+127.
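
NOTE
Editor's sketch, not part of the lecture: the same eight bits can be read as an unsigned value or as a two's complement signed value. The helper names as_unsigned and as_signed are hypothetical, and this is one simple way to do the conversion, assuming 8-bit bytes.
```python
def as_unsigned(byte: int) -> int:
    """Interpret the low 8 bits as an unsigned number (0 to 255)."""
    return byte & 0xFF
def as_signed(byte: int) -> int:
    """Interpret the low 8 bits as a two's complement number (-128 to 127)."""
    value = byte & 0xFF
    return value - 256 if value >= 128 else value
for pattern in (0b00000000, 0b01111111, 0b10000000, 0b11111111):
    print(format(pattern, "08b"), as_unsigned(pattern), as_signed(pattern))
# 00000000 0 0
# 01111111 127 127
# 10000000 128 -128
# 11111111 255 -1
```
Patterns whose top bit is 0 read the same both ways; only patterns whose top bit is 1 differ between the two interpretations.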

11:35.400 --> 11:38.320
So we will go into more detail

11:38.360 --> 11:41.680
on binary arithmetic and two's complement later.

11:42.080 --> 11:51.040
But the key takeaway is that everything in computers starts from binary, which is ones and zeros,

11:51.360 --> 12:04.880
and how we organize and interpret them. While explaining this lecture, I remembered that

12:04.920 --> 12:14.880
in our previous lectures, probably around ten lectures ago, in a section named

12:15.080 --> 12:16.080
Boolean,

12:20.250 --> 12:22.130
Boolean algebra,

12:28.810 --> 12:35.530
we have a lecture named Boolean Algebra for Low-Level Computing.

12:54.050 --> 12:57.810
So I highly recommend you watch that lecture.

12:58.650 --> 13:05.530
In that lecture I explained how unsigned and signed bits and bytes work,

13:05.530 --> 13:13.410
so how we represent values from 0 to 255 and how we represent negative numbers using these signed

13:14.890 --> 13:15.570
numbers.

13:15.730 --> 13:17.090
So thank you for watching.

13:17.090 --> 13:19.210
I'm Tayfun, and I'll see you in the next lecture.
