WEBVTT

00:01.070 --> 00:02.300
Hello, my name is Stefan.

00:02.840 --> 00:08.690
Now that we have explored the inner workings of an object file, it's time to venture into the disassembly

00:08.690 --> 00:10.460
of a complete binary.

00:10.490 --> 00:14.660
Now, let's begin with an example binary that contains symbols.

00:14.660 --> 00:21.350
And then we will proceed to examine its stripped counterpart to observe the contrasting disassembly

00:21.350 --> 00:22.000
output.

00:22.010 --> 00:30.530
It's important to note that disassembling an object file differs significantly from

00:30.530 --> 00:32.760
disassembling a binary executable.

00:32.780 --> 00:39.410
So when disassembling an object file, we have the luxury of working with symbols that provide valuable

00:39.410 --> 00:41.100
contextual information.

00:41.120 --> 00:49.040
Symbols act as guideposts, enabling us to navigate through the code with greater ease.

00:49.070 --> 00:55.370
However, disassembling a stripped binary executable presents a unique challenge.

00:55.370 --> 01:03.600
Without the presence of symbols or other symbolic information, we must rely solely on the structure

01:03.600 --> 01:06.000
and patterns within the binary itself.

01:06.000 --> 01:11.730
So this requires a deeper understanding of assembly language and the ability to decipher the code's

01:11.740 --> 01:15.240
logic based on its instructions and data.

01:15.930 --> 01:22.520
Disassembling a binary executable without symbols demands a more meticulous and intricate approach.

01:22.530 --> 01:29.760
We must carefully analyze the code, identify known functions or recognizable patterns, and reconstruct

01:29.760 --> 01:33.450
the program's logic through careful observation.

01:33.750 --> 01:39.570
It requires a keen eye and a solid grasp of assembly language concepts. While disassembling a stripped

01:39.570 --> 01:48.630
binary may be more arduous, it serves as an excellent exercise in honing our reverse engineering skills

01:48.630 --> 01:52.230
and gaining a deeper understanding of the code's inner workings.

01:53.660 --> 02:02.090
Before explaining further, we will use objdump to disassemble an executable.

02:02.090 --> 02:08.990
So: objdump -M intel -d a.out.

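NOTE
A minimal sketch of the command just spoken, wrapped in Python. The flags -M intel and -d come from the lecture. The helper names objdump_command and disassemble, and the a.out default, are illustrative assumptions.
```python
import shutil
import subprocess
def objdump_command(binary_path="a.out"):
    # -M intel selects Intel syntax; -d disassembles the executable sections.
    return ["objdump", "-M", "intel", "-d", binary_path]
def disassemble(binary_path="a.out"):
    # Returns the listing as text, or None when objdump is not installed.
    if shutil.which("objdump") is None:
        return None
    result = subprocess.run(objdump_command(binary_path), capture_output=True, text=True, check=False)
    return result.stdout
```
Calling disassemble("a.out") should give the same text that the terminal shows, assuming objdump is on the PATH and the file exists.
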
02:09.440 --> 02:11.960
And this is our output.

02:13.060 --> 02:13.800
First of all,

02:13.800 --> 02:15.730
we have

02:16.250 --> 02:21.350
the disassembly of section .init.

02:21.350 --> 02:22.010
Right.

02:22.340 --> 02:23.240
So.

02:24.140 --> 02:25.010
Here.

02:26.750 --> 02:31.720
You can see that the binary has a lot more code than the object file.

02:31.730 --> 02:35.600
So it's no longer just the main function or even just a single code section.

02:35.630 --> 02:38.860
There are multiple sections now.

02:38.870 --> 02:41.150
They have names like .init,

02:41.800 --> 02:43.640
.plt,

02:44.940 --> 02:45.390
Here.

02:45.420 --> 02:46.790
This is .init,

02:46.880 --> 02:47.700
.plt,

02:49.340 --> 02:51.260
and .text.

02:53.320 --> 02:57.730
Here we have the three main sections that we are interested in.

02:58.000 --> 03:04.900
These sections all contain code serving different functions, such as program initialization

03:04.900 --> 03:06.820
or stubs for calling shared libraries.

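NOTE
A minimal sketch of how the section headers and function labels in such a listing could be grouped. SAMPLE_LISTING is a hypothetical, shortened excerpt with made-up addresses, and functions_by_section is an illustrative helper, not part of objdump.
```python
import re
# Hypothetical excerpt of objdump output; addresses and bytes are made up.
SAMPLE_LISTING = """\
a.out:     file format elf64-x86-64
Disassembly of section .init:
0000000000401000 <_init>:
  401000: 48 83 ec 08           sub    rsp,0x8
Disassembly of section .plt:
0000000000401030 <puts@plt>:
  401030: ff 25 e2 2f 00 00     jmp    QWORD PTR [rip+0x2fe2]
Disassembly of section .text:
0000000000401040 <_start>:
  401040: 31 ed                 xor    ebp,ebp
0000000000401126 <main>:
  401126: 55                    push   rbp
"""
SECTION_RE = re.compile(r"^Disassembly of section (\S+):$")
LABEL_RE = re.compile(r"^[0-9a-f]+ <([^>]+)>:$")
def functions_by_section(listing):
    # Map each section name to the symbol labels that appear under it.
    sections = {}
    current = None
    for line in listing.splitlines():
        header = SECTION_RE.match(line)
        if header:
            current = header.group(1)
            sections[current] = []
            continue
        label = LABEL_RE.match(line)
        if label and current is not None:
            sections[current].append(label.group(1))
    return sections
```
On the sample, functions_by_section(SAMPLE_LISTING) groups _init under .init, puts@plt under .plt, and _start and main under .text.
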
03:06.820 --> 03:12.100
So the .text section here... let's actually copy this.

03:18.430 --> 03:23.020
Let's actually copy this into Notepad and we'll see that better.

03:29.780 --> 03:30.950
And here.

03:32.460 --> 03:33.870
What we are going to do is

03:35.430 --> 03:37.050
Let's go to the next section.

03:42.130 --> 03:44.280
Disassembly of section .text.

03:44.280 --> 03:47.220
The .text section is the main code section,

03:47.220 --> 03:51.120
and it contains the main function, as you saw here.

03:51.120 --> 03:57.930
As you can see, this is the main function, and the section also contains a number of other functions, such as

03:57.930 --> 04:05.970
_start, that are responsible for tasks such as setting up the command-line arguments and runtime environment

04:05.970 --> 04:10.680
for main and cleaning up after main.

04:12.680 --> 04:19.520
These extra functions are standard functions present in any ELF binary produced by GCC.

04:19.550 --> 04:28.190
You can also see that the previously incomplete code and data references have now been resolved by the

04:28.190 --> 04:28.690
linker.

04:28.700 --> 04:32.000
For instance, the call to puts here.

04:33.040 --> 04:36.270
Let me find that call.

04:36.440 --> 04:37.210
Here.

04:38.570 --> 04:39.500
But this one.

04:42.150 --> 04:43.050
This one here?

04:43.050 --> 04:43.620
Yeah.

04:44.250 --> 04:48.360
The call here in our main function.

04:49.780 --> 04:54.130
Now points to the proper stub in the PLT.

04:54.310 --> 04:57.040
The stub in the PLT here,

04:57.040 --> 04:58.570
puts@plt.

04:58.600 --> 04:59.230
Here.

05:00.250 --> 05:01.600
And for this.

05:02.630 --> 05:08.030
It points to the proper stub for the shared library that contains puts itself.

05:08.030 --> 05:11.480
I will explain the workings of PLT stubs in the next lectures.

05:11.630 --> 05:12.140
Again.

05:12.140 --> 05:12.710
So.

05:12.710 --> 05:20.930
So the full binary executable contains significantly more code, and data that I haven't shown

05:21.590 --> 05:22.430
for now,

05:22.580 --> 05:24.440
than the corresponding object file.

05:24.440 --> 05:24.740
So.

05:24.740 --> 05:28.220
But so far the output isn't much more difficult to interpret, right?

05:28.220 --> 05:37.520
That changes when the binary is stripped, so let's use objdump to disassemble the stripped version

05:37.520 --> 05:38.990
of the example binary.

05:38.990 --> 05:42.800
We're going to use that now. Let's clear that.

05:43.600 --> 05:45.790
Let's close the backgrounds here.

05:49.460 --> 05:49.730
Come.

05:55.730 --> 05:56.780
And here.

05:57.170 --> 05:58.760
Objdump.

05:59.540 --> 06:00.440
Objdump.

06:02.180 --> 06:06.230
And Intel here and a or.

06:08.050 --> 06:08.880
My app.

06:08.890 --> 06:10.780
My app dot.

06:13.750 --> 06:14.320
Actually,

06:14.560 --> 06:16.090
I think we had this.

06:20.850 --> 06:22.800
My apt out of.

06:29.570 --> 06:30.410
a.out.

06:31.960 --> 06:35.320
And here again, have this.

06:36.780 --> 06:37.080
By.

06:37.470 --> 06:38.460
Copy this again.

06:40.830 --> 06:42.000
The new mousepad.

06:46.440 --> 06:53.580
And here the main takeaway of this output is that while the different sections are still clearly distinguishable

06:53.580 --> 06:58.170
like .init, the

07:00.260 --> 07:00.580
.plt,

07:02.010 --> 07:04.140
and also .text here,

07:06.590 --> 07:08.960
the functions are not.

07:09.320 --> 07:15.350
Instead, all functions have been merged into one big blob of code.

07:15.350 --> 07:18.860
So the _start function here...

07:19.950 --> 07:23.400
The _start function begins here.

07:24.020 --> 07:29.030
And deregister_tm_clones here.

07:30.360 --> 07:32.520
The deregister function,

07:32.520 --> 07:36.120
deregister_tm_clones, begins after _start.

07:36.850 --> 07:37.030
Those.

07:37.070 --> 07:42.430
So the main function starts somewhere at the bottom here.

07:42.460 --> 07:46.270
The main function starts here

07:47.140 --> 07:49.360
and ends here.

07:50.080 --> 07:57.160
But in all of these cases, there's nothing special to indicate that the instructions at these markers

07:57.160 --> 07:58.480
represent function starts.

07:58.480 --> 08:06.640
The only exceptions are the functions in the .plt section here, the disassembly of section .plt.

08:09.630 --> 08:10.260
And

08:14.950 --> 08:21.280
these functions still have their names, as we saw

08:22.190 --> 08:23.600
in the earlier output.

08:24.510 --> 08:30.690
Other than that, you are on your own to try to make sense of this disassembly output.

08:30.720 --> 08:34.500
Even in this simple example, things are really confusing.

08:34.500 --> 08:40.260
Imagine trying to make sense of a larger binary containing hundreds of different functions all fused

08:40.260 --> 08:40.800
together.

08:40.800 --> 08:49.650
This is exactly why accurate automated function detection is so important in many areas of reverse engineering,

08:49.650 --> 08:55.950
malware analysis, and binary analysis, which you will learn about in the next lectures.

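NOTE
A minimal sketch of one simple function-detection heuristic: treat every direct call target in a stripped listing as a likely function start. This is an illustrative assumption, not the method of any particular tool. STRIPPED_SAMPLE is a hypothetical, simplified listing, and call_targets is a made-up helper name.
```python
import re
# Hypothetical stripped listing: one .text blob with no function labels.
STRIPPED_SAMPLE = """\
Disassembly of section .text:
0000000000401040 <.text>:
  401040: 31 ed                 xor    ebp,ebp
  401042: e8 df 00 00 00        call   401126 <.text+0xe6>
  401047: f4                    hlt
  401126: 55                    push   rbp
  401127: 48 89 e5              mov    rbp,rsp
  40112a: e8 01 ff ff ff        call   401030 <puts@plt>
  40112f: 5d                    pop    rbp
  401130: c3                    ret
"""
# Matches only direct calls with a hexadecimal target address.
CALL_RE = re.compile(r"\bcall\s+(?:0x)?([0-9a-f]+)\b")
def call_targets(listing):
    # Collect the unique target addresses of direct call instructions.
    targets = set()
    for line in listing.splitlines():
        match = CALL_RE.search(line)
        if match:
            targets.add(int(match.group(1), 16))
    return sorted(targets)
```
On the sample this yields the addresses 0x401030 and 0x401126. The heuristic misses any function that is reached only indirectly, for example through a register or a function pointer, which is one reason accurate detection is hard.
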