WEBVTT

00:00.410 --> 00:06.050
All right, now let's dive into the fascinating world of the linking phase, which marks the final stage

00:06.050 --> 00:08.450
of the compilation process.

00:08.480 --> 00:15.500
As the name suggests, this phase brings together all the object files and seamlessly merges them into

00:15.500 --> 00:17.600
a unified binary executable.

00:17.630 --> 00:24.200
In modern systems, this phase may also incorporate an additional optimization pass known as link-time

00:24.200 --> 00:30.500
optimization, or LTO, enhancing the overall performance of the resulting executable.

00:30.710 --> 00:37.970
And unsurprisingly, the program that performs the linking phase is called a linker, or link editor. It is

00:37.970 --> 00:43.370
typically separate from the compiler, which usually implements all the preceding phases.

00:43.520 --> 00:51.230
As I've already mentioned, object files are relocatable because they are compiled independently from

00:51.230 --> 00:52.760
each other.

00:52.790 --> 00:55.410
As you can see here, that independence matters,

00:56.000 --> 01:03.200
because it prevents the compiler from assuming that an object will end up at any

01:03.200 --> 01:04.890
particular base address.

01:04.910 --> 01:13.460
So these are relocatable. Moreover, object files may reference functions or variables in other object

01:13.460 --> 01:17.270
files or in libraries that are external to the program.

01:17.270 --> 01:23.750
So before the linking phase, the addresses at which the referenced code and data will be placed are

01:23.750 --> 01:24.740
not yet known.

01:24.740 --> 01:33.140
So the object files only contain relocation symbols that specify how function and variable references

01:33.140 --> 01:35.190
should eventually be resolved.

01:35.210 --> 01:43.470
So in the context of linking, references that rely on a relocation symbol are called symbolic references.

01:43.490 --> 01:50.690
When an object file references one of its own functions or variables by absolute address, the reference

01:50.690 --> 01:53.090
will also be symbolic.

01:53.090 --> 01:59.840
So the linker's job is to take all the object files belonging to a program and merge them into a single

01:59.840 --> 02:06.390
coherent executable, typically intended to be loaded at a particular memory address.

02:06.420 --> 02:11.670
Now that the arrangement of all modules in the executable is known,

02:11.700 --> 02:15.720
the linker can also resolve most symbolic references.

02:15.840 --> 02:23.160
References to the libraries may or may not be completely resolved depending on the type of library.

02:23.990 --> 02:34.310
Static libraries, which on Linux typically have the extension .a, as shown here,

02:36.140 --> 02:38.240
are merged into the binary executable,

02:38.250 --> 02:40.500
allowing any references to them

02:40.860 --> 02:42.600
to be resolved entirely.

02:42.600 --> 02:46.260
There are also dynamic libraries, also called

02:46.350 --> 02:53.250
shared libraries, which are shared in memory among all programs that run on a system.

02:53.250 --> 02:58.890
In other words, rather than copying the library into every binary that uses it, the dynamic libraries

02:58.890 --> 03:07.770
are loaded into memory only once, and any binary that wants to use the library needs to use this shared

03:07.770 --> 03:08.220
copy.

03:08.220 --> 03:16.260
So during the linking phase, the addresses at which dynamic libraries will reside are not yet known,

03:16.260 --> 03:18.610
so references to them cannot be resolved.

03:18.630 --> 03:24.780
Instead, the linker leaves symbolic references to these libraries even in the final executable, and

03:24.780 --> 03:31.110
these references are not resolved until the binary is actually loaded into memory to be executed.

03:31.110 --> 03:38.230
So most compilers, including GCC, automatically call the linker at the end of the compilation process

03:38.350 --> 03:47.590
to produce a complete binary executable, so you can simply call GCC without any special switches

03:47.860 --> 03:50.050
and compile your application,

03:50.050 --> 03:53.380
myapp.c, and that's it.

03:53.380 --> 03:58.570
And here we will now use the file utility on the output,

03:59.520 --> 04:01.770
which is a.out.

04:02.100 --> 04:02.730
It's a.out

04:02.730 --> 04:04.740
because this is our output,

04:05.850 --> 04:09.510
the final file, and that's it.

04:09.510 --> 04:11.160
And here,

04:12.330 --> 04:12.990
first,

04:12.990 --> 04:16.830
let's actually understand this.

04:17.010 --> 04:21.470
So by default, the executable is called a.out.

04:21.480 --> 04:30.060
But you can override this by passing the -o switch to GCC, followed by a name

04:30.060 --> 04:31.080
for the output file.

04:31.080 --> 04:38.460
So the file utility now tells us that we are dealing with an ELF 64-bit

04:39.620 --> 04:40.310
LSB

04:41.880 --> 04:45.930
executable, rather than a relocatable file.
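
NOTE
A minimal sketch, not part of the lecture: one way to see how a tool like the file utility can tell a relocatable object from a linked binary.
It reads three fields of the ELF header (class, byte order, e_type) as laid out in the ELF specification.
The function name describe_elf and the a.out path in the comment are illustrative assumptions, and the real file utility inspects much more than this.
```python
import struct
# e_type values defined by the ELF specification.
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object / PIE executable", 4: "core"}
def describe_elf(header: bytes) -> str:
    """Summarize the first 18 bytes of an ELF file (hypothetical helper)."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (word size); e_ident[5] is the byte order.
    bits = {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown class")
    order = {1: "LSB", 2: "MSB"}.get(header[5], "unknown byte order")
    # e_type is a 16-bit field at offset 16, stored in the file's own byte order.
    fmt = ">H" if header[5] == 2 else "<H"
    (e_type,) = struct.unpack_from(fmt, header, 16)
    return f"ELF {bits} {order} {ELF_TYPES.get(e_type, 'unknown type')}"
# Hypothetical usage on the linked binary from this lecture:
# with open("a.out", "rb") as f:
#     print(describe_elf(f.read(18)))
```
Running the same helper on an object file and on the linked output should show the e_type changing from relocatable to one of the executable types.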

04:45.960 --> 04:50.070
As you can see here, we had this relocatable file,

04:50.850 --> 04:57.480
which you also saw in the previous lecture. Another important piece of information is that the file is dynamically

04:57.480 --> 04:59.280
linked here.

05:01.220 --> 05:04.040
Dynamically linked.

05:07.120 --> 05:15.700
So this dynamically linked here means that it uses some libraries that are not merged into

05:15.700 --> 05:21.460
the executable but are instead shared among all programs running on the same system.

05:21.520 --> 05:25.630
And finally, we have this interpreter.

05:25.660 --> 05:26.740
This here.

05:28.530 --> 05:29.280
Here.

05:31.020 --> 05:31.770
So.

05:33.570 --> 05:47.670
Here, the interpreter /lib64/ld-linux-x86-64.so.2 in the file output tells you which dynamic

05:47.670 --> 05:55.050
linker will be used to resolve the final dependencies on dynamic libraries when the executable is loaded

05:55.050 --> 05:57.060
into memory to be executed.
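
NOTE
A rough sketch, not part of the lecture: the interpreter path reported by the file utility is stored in the binary itself, in a program header of type PT_INTERP.
The code below walks the program headers of a 64-bit little-endian ELF image and returns that path.
The name find_interpreter and the a.out path in the comment are illustrative assumptions, and other ELF classes or byte orders are deliberately rejected to keep the sketch short.
```python
import struct
PT_INTERP = 3  # program header type that names the dynamic linker
def find_interpreter(data: bytes):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    # ELF64 header: e_phoff at offset 32; e_phentsize and e_phnum at offsets 54 and 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        # An ELF64 program header starts with p_type, p_flags, p_offset; p_filesz is at +32.
        p_type, _flags, p_offset = struct.unpack_from("<IIQ", data, base)
        (p_filesz,) = struct.unpack_from("<Q", data, base + 32)
        if p_type == PT_INTERP:
            # The segment holds a NUL-terminated path string.
            return data[p_offset:p_offset + p_filesz].rstrip(b"\x00").decode()
    return None
# Hypothetical usage on the linked binary from this lecture:
# with open("a.out", "rb") as f:
#     print(find_interpreter(f.read()))
```
A statically linked binary has no PT_INTERP entry, so the helper returns None for it.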

05:57.060 --> 06:06.660
So when you run the binary using the ./a.out command, as we are doing here,

06:06.660 --> 06:15.510
you can see that it produces the expected output, printing Hello world to us, which confirms that

06:15.510 --> 06:18.360
you have produced a working binary.

06:18.570 --> 06:24.870
But here we also have this not stripped here.

06:26.370 --> 06:32.950
But what is this about the binary being stripped or not stripped?

06:32.950 --> 06:36.910
You will learn that in the next lecture.
