WEBVTT

00:00.820 --> 00:01.960
Hello, my name is Ivan.

00:01.960 --> 00:07.660
And in this lecture, let's shift our focus to the fascinating world of the program header table, which

00:07.690 --> 00:14.860
offers a segment view of the binary in contrast to the section header table which we discussed earlier,

00:14.860 --> 00:20.050
which provides a section view primarily for static linking purposes.

00:20.080 --> 00:25.510
The program header table, which you will learn in this lecture, serves a different purpose.

00:25.780 --> 00:33.010
So it is utilized by the operating system and dynamic linker during the loading process of an ELF binary

00:33.010 --> 00:35.650
into a process for execution.

00:35.890 --> 00:43.000
The program header table enables them to locate the relevant code and data and make informed decisions about

00:43.000 --> 00:46.420
what to load into the virtual memory.

00:46.420 --> 00:51.760
And in an ELF binary, a segment encapsulates zero or more sections.

00:51.760 --> 00:56.230
So let's go back to the Kali machine here and.

00:57.520 --> 00:59.350
Here we have.

00:59.530 --> 01:03.550
Let's open the include/elf.h header file.

01:03.550 --> 01:10.930
And here, as you can see here, a segment encapsulates zero or more sections, essentially bundling

01:10.930 --> 01:13.660
them together into a cohesive unit.

01:13.660 --> 01:22.360
So the segments provide an execution view, making them essential for executable ELF files, but

01:22.360 --> 01:26.410
non-executable files such as relocatable objects do not require them.

01:26.410 --> 01:34.210
So to represent this segment view, the program header table employs program headers of type

01:34.210 --> 01:36.670
struct Elf64_Phdr here.

01:40.340 --> 01:46.910
As you can see here, Elf64, and specifically Elf64_Phdr.

01:49.100 --> 01:49.520
Here.

01:51.330 --> 01:59.790
So, as you can see here, this header file also has a comment

01:59.790 --> 02:03.600
that says that this is the program segment header.

02:03.600 --> 02:10.740
And each of these headers contains various fields that provide essential information.

02:10.740 --> 02:16.920
Here, as you can see, are the segment type, flags, file offset, virtual address, physical address,

02:16.920 --> 02:21.630
size in file, size in memory, and segment alignment.

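NOTE
A minimal sketch of the Elf64_Phdr layout listed above, written in Python with the standard struct module rather than C. The field order follows the Elf64_Phdr definition in elf.h. Little-endian byte order and every sample value below are assumptions made only for illustration.
```python
import struct
from collections import namedtuple
# Elf64_Phdr: two 32-bit words (p_type, p_flags) followed by six 64-bit fields.
# "<" assumes a little-endian ELF file such as an x86-64 binary.
PHDR_FORMAT = "<IIQQQQQQ"
Phdr = namedtuple("Phdr", "p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align")
def parse_phdr(raw):
    """Unpack one 56-byte Elf64_Phdr entry into a named tuple."""
    return Phdr(*struct.unpack(PHDR_FORMAT, raw))
# Hypothetical entry: a PT_LOAD (1) segment with flags R+X (4 | 1 = 5).
sample = struct.pack(PHDR_FORMAT, 1, 5, 0x1000, 0x401000, 0x401000, 0x1A5, 0x1A5, 0x1000)
phdr = parse_phdr(sample)
print(struct.calcsize(PHDR_FORMAT), phdr)
```
In a real file, the table's offset, entry size, and entry count would come from the e_phoff, e_phentsize, and e_phnum fields of the executable header.
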
02:21.630 --> 02:28.950
So understanding the program header table is crucial as it grants us insights into how the operating

02:28.950 --> 02:34.380
system and dynamic linker organize and load binaries into memory.

02:35.380 --> 02:41.500
By examining the program headers, we can decipher the layout and composition of the binary, enabling

02:41.500 --> 02:48.040
us to comprehend the crucial components that contribute to the binary's functionality.

02:48.040 --> 02:54.160
In the upcoming sections, we will delve deeper into the inner workings of program headers and their

02:54.160 --> 02:59.050
role in the loading and execution of ELF binaries.

02:59.100 --> 03:04.840
And as we continue our exploration of the ELF format, we gain a comprehensive understanding of its

03:04.840 --> 03:06.880
intricate structure.

03:07.060 --> 03:16.090
This knowledge equips us with the necessary tools to dissect and analyze binaries effectively, or reverse

03:16.090 --> 03:24.070
engineer malware, unraveling its secrets and uncovering the fascinating world of binary analysis.

03:24.100 --> 03:25.390
Now let's.

03:26.360 --> 03:31.490
Analyze this file's information more deeply here.

03:31.490 --> 03:35.870
So I will describe each of these fields next,

03:37.060 --> 03:41.050
some of them in this lecture, some of them in the next lecture.

03:41.050 --> 03:42.820
And now we will.

03:44.630 --> 03:45.290
Again.

03:45.560 --> 03:46.910
Use readelf here.

03:46.910 --> 03:47.720
We can actually.

03:47.720 --> 03:54.830
No, let's actually open the new tab and let's go to desktop where our Hello World program exists.

03:55.670 --> 03:59.090
And here we have that a.out.

03:59.100 --> 04:01.040
Let's actually run this.

04:01.960 --> 04:04.960
And you will see that we have Hello, world.

04:06.390 --> 04:16.710
So here we will again clear the screen and run readelf --wide --segments a.out.

04:16.950 --> 04:20.010
And this is our compiled application.

04:20.340 --> 04:22.800
Just a regular Hello world application.

04:26.960 --> 04:28.640
And here we have this.

04:29.440 --> 04:30.460
Output here.

04:31.030 --> 04:37.600
You can see that we have the program headers and the section to segment mapping.

04:39.290 --> 04:43.660
In this section, we are interested in this part of the output.

04:43.670 --> 04:52.190
So also keep in mind the section to segment mapping here in the readelf output, which clearly

04:52.220 --> 04:58.970
illustrates that segments are simply a bunch of sections bundled together.

04:59.300 --> 05:00.560
As you can see here.

05:00.560 --> 05:07.160
And this specific section to segment mapping is typical for most ELF binaries that you will encounter.

05:07.160 --> 05:08.660
And in

05:10.100 --> 05:10.550
the

05:10.590 --> 05:16.460
rest of this section, we will learn the program header fields.

05:17.120 --> 05:23.150
As you can see here, specifically p_type, which we will start with.

05:23.300 --> 05:26.750
And we will also learn p_flags and

05:27.850 --> 05:28.660
So on.

05:28.660 --> 05:33.190
So let's go back to the header file.

05:35.080 --> 05:36.520
p_type, p_flags.

05:36.520 --> 05:38.170
We have these fields, right?

05:38.290 --> 05:46.060
So p_type, p_flags, p_offset, p_vaddr, which is the segment virtual address, the segment physical address,

05:46.060 --> 05:49.720
segment size in file, segment size in memory, and segment alignment.

05:49.720 --> 05:53.140
So let's start with the segment type here and we will also need that.

05:56.130 --> 05:59.250
The here and perfect.

05:59.280 --> 06:02.790
We also have this marker, which is not great, but.

06:04.350 --> 06:05.600
Acceptable, I think.

06:05.610 --> 06:11.700
And here let's also increase the font size a little bit so you can see better.

06:11.700 --> 06:17.880
And this p_type field, as you can see, we have two of these.

06:18.960 --> 06:24.840
One is for Elf64 and one is for Elf32.

06:25.050 --> 06:31.680
As you can see here, this has several additional methods, variables.

06:31.680 --> 06:34.860
And as you can see here, in Elf64 we have

06:36.490 --> 06:38.110
Elf64_Word here.

06:41.080 --> 06:43.150
As you can see, Elf64_Phdr.

06:43.150 --> 06:47.290
And here in Elf32 we have just the regular Elf32_Word.

06:47.290 --> 06:50.980
And here in Elf64, we have Elf64_Word.

06:52.210 --> 06:57.130
So their names may vary depending on the structure, but

07:00.730 --> 07:06.910
If you look at this here, they all have the same functionality and the same description.

07:07.030 --> 07:15.310
And p_type, the segment type field, identifies the type of the segment, and important values

07:15.310 --> 07:22.810
for this field include PT_LOAD, PT_DYNAMIC, and PT_INTERP.

07:23.820 --> 07:28.020
And as you can see here, for INTERP, we have offset, virtual

07:29.760 --> 07:30.630
Address.

07:30.850 --> 07:31.280
Yes.

07:31.320 --> 07:35.880
Virtual address, physical address, file size, mem size, flags and so on.

07:38.470 --> 07:47.320
And segments of type PT_LOAD, as the name implies, are intended to be loaded into memory when

07:47.320 --> 07:55.510
setting up the process and the size of the loadable chunk and the address to load it at are described

07:55.720 --> 07:58.240
in the rest of the program header.

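NOTE
A small sketch of how p_type values map to the names that readelf prints. The numeric constants are the ones defined in elf.h. The helper names and the sample table are hypothetical and exist only for illustration.
```python
# Segment type values as defined in elf.h (standard and GNU-specific entries).
PT_NAMES = {
    0: "NULL",
    1: "LOAD",
    2: "DYNAMIC",
    3: "INTERP",
    4: "NOTE",
    6: "PHDR",
    7: "TLS",
    0x6474E550: "GNU_EH_FRAME",
    0x6474E551: "GNU_STACK",
    0x6474E552: "GNU_RELRO",
    0x6474E553: "GNU_PROPERTY",
}
def segment_type_name(p_type):
    """Return the readelf-style name for a p_type value, or its hex form if unknown."""
    return PT_NAMES.get(p_type, hex(p_type))
def loadable_indexes(p_types):
    """Return the positions of PT_LOAD entries, the segments the loader maps into memory."""
    return [i for i, t in enumerate(p_types) if t == 1]
# Hypothetical program header table: PHDR, INTERP, two LOADs, DYNAMIC, GNU_STACK.
table = [6, 3, 1, 1, 2, 0x6474E551]
print([segment_type_name(t) for t in table], loadable_indexes(table))
```
Unknown values fall back to their hexadecimal form, since processor-specific and OS-specific segment types also exist.
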
07:58.270 --> 08:07.120
As you can see in this output, there are usually at least two LOAD segments, here and here.

08:07.120 --> 08:07.720
Let's actually.

08:08.790 --> 08:12.060
We have INTERP here, DYNAMIC.

08:13.910 --> 08:20.810
And we have the LOADs, a lot of LOADs here, DYNAMIC, NOTE, NOTE, NOTE, GNU_PROPERTY and so

08:20.810 --> 08:22.340
on, which you will learn here.

08:22.340 --> 08:23.690
And we also.

08:24.960 --> 08:32.250
Have the flags field, the second field. So this flags field,

08:33.640 --> 08:34.810
p_flags.

08:35.110 --> 08:36.010
The flags.

08:36.010 --> 08:41.890
So the flags specify the runtime access permissions for the segment.

08:41.890 --> 08:51.070
And three important types of flags exist: PF_X, PF_W, and PF_R here.

08:52.470 --> 08:53.940
The PF_X flag.

08:54.480 --> 08:56.250
Let's actually write it down here.

08:57.540 --> 08:58.200
PF

09:00.950 --> 09:09.350
underscore X indicates that the segment is executable and is set for code

09:09.350 --> 09:16.040
segments. Note that readelf displays it as an E rather than an X in the Flg column here.

09:18.070 --> 09:18.640
It's actually.

09:21.530 --> 09:24.710
You can see the Flg column here.

09:24.710 --> 09:27.950
Sometimes here we have R, W and E here.

09:27.950 --> 09:30.710
So you can read this as X, so.

09:32.130 --> 09:39.930
They are the same in reality, but readelf writes it as E because it's the first letter,

09:40.170 --> 09:42.270
the first character of the word executable.

09:42.270 --> 09:45.060
So it makes sense.

09:45.060 --> 09:55.890
But X would be just as acceptable here. And also the R here means readable,

09:55.890 --> 09:56.670
and.

09:57.900 --> 10:04.710
We have PF_W, which means that the segment is writable, and it's normally set only for writable data

10:04.710 --> 10:07.590
segments and never for code segments.

10:07.590 --> 10:11.160
And we have this R here, PF_R, obviously.

10:12.620 --> 10:22.070
This means that the segment is readable, as is normally the case for both code and data segments.

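NOTE
The flag bits described above can be sketched as a small decoder. PF_X, PF_W, and PF_R use the values from elf.h. The function name is hypothetical, and the three-character layout only imitates the flags column that readelf prints.
```python
# Permission bits for p_flags, as defined in elf.h.
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4
def flags_to_string(p_flags):
    """Render p_flags in the order R, W, E, using a space for each unset bit."""
    pairs = (("R", PF_R), ("W", PF_W), ("E", PF_X))
    return "".join(letter if p_flags & bit else " " for letter, bit in pairs)
# A typical code segment is readable and executable, a data segment readable and writable.
print(repr(flags_to_string(PF_R | PF_X)))
print(repr(flags_to_string(PF_R | PF_W)))
```
The executable bit is shown as E here only to match the readelf display discussed in the lecture.
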
10:22.070 --> 10:25.340
And the later we have this.

10:26.210 --> 10:30.530
After p_flags we have p_offset, p_vaddr,

10:30.560 --> 10:31.430
p_paddr,

10:31.460 --> 10:38.360
p_filesz, and p_memsz here.

10:38.450 --> 10:46.160
These fields are analogous to fields such as sh_offset in the section header, which you saw previously.

10:47.620 --> 10:52.140
We have this here: section file offset, section size in bytes, and so on.

10:52.150 --> 10:53.080
So.

10:55.090 --> 10:58.710
We need to go back to p_filesz here.

10:59.920 --> 11:00.400
Yes.

11:00.520 --> 11:07.630
So they specify the file offset at which the segment starts, the virtual address at which it is to

11:07.630 --> 11:10.090
be loaded, and the file size

11:11.230 --> 11:13.900
of the segment, respectively. For loadable segments,

11:13.930 --> 11:14.290
the

11:15.010 --> 11:16.270
p_vaddr here,

11:16.300 --> 11:16.720
the

11:16.720 --> 11:17.080
virtual

11:17.110 --> 11:17.800
address,

11:18.600 --> 11:28.740
must be equal to p_offset modulo the page size, which is typically 4096 bytes.

11:28.740 --> 11:36.030
And on some systems it's possible to use the p_paddr field to specify at which address in physical memory

11:36.030 --> 11:37.650
to load the segment.

11:37.680 --> 11:45.540
On modern operating systems such as Linux, this field is ignored and its contents are unspecified, since they execute

11:45.540 --> 11:47.790
all binaries in virtual memory.

11:47.790 --> 11:54.600
So at first glance it may not be obvious why there are distinct fields for the file size of the

11:54.600 --> 11:55.080
segment,

11:55.080 --> 11:56.520
that is,

11:57.350 --> 12:01.930
p_filesz, and the size in memory, p_memsz.

12:02.930 --> 12:09.800
To understand this, recall that some sections only indicate the need to allocate some bytes

12:09.800 --> 12:13.880
in memory, but don't actually occupy these bytes in the binary file.

12:13.880 --> 12:17.210
So for instance, the .bss section,

12:18.080 --> 12:22.700
which you can see in the readelf output here, contains zero-initialized data.

12:22.700 --> 12:26.870
Since all data in this section is known to be zero anyway,

12:27.670 --> 12:34.330
there is no need to actually include all these zeros in the binary.

12:34.370 --> 12:34.750
Right.

12:34.750 --> 12:44.440
However, when loading the segment containing .bss into virtual memory, all the bytes in .bss

12:44.560 --> 12:46.300
should be allocated.

12:46.300 --> 12:52.150
So it is possible for p_memsz to be larger than p_filesz.

12:52.180 --> 12:59.110
When this happens, the loader adds the extra bytes at the end of the segment when loading the binary

12:59.110 --> 13:02.710
and initializes them to zero.

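NOTE
A hedged sketch of the zero-fill behaviour just described. It only imitates what a loader does with p_filesz and p_memsz. The function name and the sample bytes are hypothetical, and a real loader maps pages rather than copying bytes like this.
```python
def build_segment_image(file_bytes, p_offset, p_filesz, p_memsz):
    """Return the in-memory contents of a segment: file data plus zero padding."""
    if p_memsz < p_filesz:
        raise ValueError("p_memsz must not be smaller than p_filesz")
    data = file_bytes[p_offset:p_offset + p_filesz]
    if len(data) != p_filesz:
        raise ValueError("segment extends past the end of the file")
    # The extra p_memsz - p_filesz bytes (for example .bss) are zero-initialized.
    return data + bytes(p_memsz - p_filesz)
# Hypothetical file: 6 header bytes, then 4 bytes of initialized data.
image = build_segment_image(b"HEADERdata", 6, 4, 10)
print(len(image), image)
```
The file stores only the 4 initialized bytes, while the segment occupies 10 bytes once loaded.
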
13:02.710 --> 13:07.120
And lastly before ending this.

13:08.040 --> 13:08.880
Section here.

13:08.910 --> 13:10.200
This field.

13:11.560 --> 13:16.950
Here we have just one field left to explain.

13:16.960 --> 13:26.620
So the p_align field is analogous to the sh_addralign field in the section header.

13:26.620 --> 13:35.020
It indicates the required memory alignment in bytes for the segment. Just as with sh_addralign,

13:35.290 --> 13:42.400
an alignment value of 0 or 1 indicates that no particular alignment is required. If p_align is not set

13:42.400 --> 13:52.750
to 0 or 1, then its value must be a power of two, and p_vaddr must be equal to p_offset modulo p_align.

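NOTE
The alignment rule above can be written as a short check. This is a sketch of the stated rule only. The function name and the sample addresses are hypothetical.
```python
def alignment_ok(p_vaddr, p_offset, p_align):
    """Check the p_align rule: 0 or 1 means no constraint, otherwise a power of two
    with p_vaddr and p_offset congruent modulo p_align."""
    if p_align in (0, 1):
        return True
    is_power_of_two = (p_align & (p_align - 1)) == 0
    return is_power_of_two and (p_vaddr % p_align) == (p_offset % p_align)
# Hypothetical segment aligned to a 4096-byte page.
print(alignment_ok(0x401000, 0x1000, 0x1000))
print(alignment_ok(0x401234, 0x1000, 0x1000))
```
The second call fails the check because the address and the file offset differ by 0x234 modulo the alignment.
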
13:52.750 --> 13:58.030
And in this lecture you learned all the intricacies of the ELF format.

13:58.890 --> 14:02.460
And we have covered the format of the executable header,

14:02.490 --> 14:07.140
the section header and program header tables, and the contents of sections.

14:07.140 --> 14:14.490
So that was quite an endeavor and it was worth it because now that you are familiar with the innards

14:14.490 --> 14:22.770
of ELF binaries, you have a great foundation for learning more about binary analysis and reverse engineering.

14:23.680 --> 14:31.870
And stay tuned for more exciting insights into the ELF format and reverse engineering and its impact on binary

14:31.870 --> 14:32.560
analysis.

14:32.560 --> 14:34.270
I'll be waiting for you in the next lecture.
