WEBVTT

00:03.560 --> 00:08.800
Hello everyone, and welcome back to Mastering Software Exploitation.

00:09.160 --> 00:14.480
In this section we are going to dive deep into stack overflows.

00:14.480 --> 00:22.720
We will craft a simple exploit and explore the most common security mitigations

00:22.760 --> 00:26.600
existing against memory vulnerabilities.

00:27.480 --> 00:29.400
So let's jump right in.

00:30.120 --> 00:31.680
What is the stack memory?

00:31.800 --> 00:34.400
As always, we start with the definitions.

00:34.480 --> 00:40.880
So the stack, as you may already know, is a special region in memory used for temporary storage

00:40.920 --> 00:42.480
during function calls.

00:42.920 --> 00:46.520
It operates in a last-in, first-out manner.

00:46.880 --> 00:54.360
And the structure of the stack frame, while it varies between different architectures, is pretty much

00:54.400 --> 01:01.350
around this layout that you see on the screen where we have the local variables saved in the stack,

01:01.710 --> 01:05.510
which are the variables you actually define in your function.

01:05.510 --> 01:11.870
If you have a buffer or a variable defined in your function, it goes into the stack frame.

01:12.150 --> 01:21.150
Then we have the return address, which tells the function where

01:21.270 --> 01:24.470
it should return when it's finished.

01:24.710 --> 01:30.630
So I call function A, and function A, when it ends,

01:30.670 --> 01:33.750
needs to know who called it.

01:33.910 --> 01:36.070
Where should it go back to?

01:36.470 --> 01:43.750
And this is the return address that helps us continue the execution after we jumped into a function.
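
NOTE
To make the last-in, first-out idea concrete, here is a minimal sketch in Python.
It is only a toy model, not how a real CPU stores frames: a list stands in for
the stack, and the addresses 0x1000 and 0x2000 are made-up values.

```python
# Toy model: a Python list stands in for the stack (last in, first out).
stack = []


def call(return_address, local_variables):
    """Entering a function pushes a frame: the return address plus the locals."""
    stack.append({"return_address": return_address, "locals": local_variables})


def ret():
    """Leaving a function pops the newest frame and says where to continue."""
    frame = stack.pop()
    return frame["return_address"]


call(0x1000, {"x": 1})           # main calls function A; A must return to 0x1000
call(0x2000, {"buf": bytes(8)})  # A calls function B; B must return to 0x2000

first = ret()    # B finishes first, because its frame was pushed last
second = ret()   # then A finishes
print(hex(first), hex(second))
```

The frame that was pushed last is the first one to be popped, which is why the
innermost function always returns first.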

01:45.830 --> 01:52.150
So let's dive into a specific example about a specific architecture.

01:52.670 --> 01:59.780
So in ARM, we first have some definitions.

01:59.820 --> 02:01.900
PC is the program counter.

02:01.900 --> 02:07.260
It's the processor register that tells the processor

02:07.300 --> 02:10.780
the next address to execute code from.

02:11.660 --> 02:17.100
LR is the link register, the counterpart of the return address in x86.

02:17.100 --> 02:24.540
This is the register that saves the address the function should return to. The function

02:24.540 --> 02:27.300
returns by restoring PC from LR.

02:27.460 --> 02:27.860
Right.

02:27.900 --> 02:36.900
If I finish the function now, I take what's within LR, put it into PC and then the execution continues

02:37.260 --> 02:41.260
from the address saved in LR.
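
NOTE
The PC and LR relationship can be sketched with a toy register file in Python.
This is an illustration under assumptions: the addresses are made up, and every
instruction is assumed to be 4 bytes long, as in classic 32-bit ARM code.

```python
# Toy register file; the addresses are made-up values for illustration.
regs = {"pc": 0x8000, "lr": 0x0}
INSTRUCTION_SIZE = 4  # assumption: fixed 4-byte instructions


def branch_with_link(target):
    """Model of a call: remember where to come back to in LR, then jump."""
    regs["lr"] = regs["pc"] + INSTRUCTION_SIZE
    regs["pc"] = target


def function_return():
    """Model of a return: PC is restored from LR."""
    regs["pc"] = regs["lr"]


branch_with_link(0x9000)
pc_inside_function = regs["pc"]
function_return()
pc_after_return = regs["pc"]
print(hex(pc_inside_function), hex(pc_after_return))
```

Whatever value sits in LR at return time is where execution continues, which is
the property the rest of this section relies on.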

02:42.420 --> 02:46.460
And this is what we really need to know for this section.

02:48.300 --> 02:53.980
Now let's talk about the oldest vulnerability out there, the stack overflow.

02:54.020 --> 02:57.610
We even have a website with that name.

02:57.970 --> 03:03.690
So you can imagine how common and well known that vulnerability is.

03:04.330 --> 03:09.570
The reason for a stack overflow is improper boundary checks.

03:09.570 --> 03:10.890
So we have a buffer.

03:11.330 --> 03:14.850
We copy data into this buffer.

03:15.130 --> 03:19.970
And we are not verifying that this data is not bigger than the buffer size.

03:20.530 --> 03:25.890
The buffer is allocated on the stack, which is why it's called stack overflow.

03:26.250 --> 03:35.010
If the buffer were dynamically allocated, it would be a heap overflow, because the buffer would be allocated

03:35.010 --> 03:35.930
on the heap.

03:36.330 --> 03:44.810
But now we are talking about stack overflows. Pay attention that those sample functions are functions

03:44.810 --> 03:51.050
where we have to check the size before calling them, because those functions do not

03:51.490 --> 03:54.640
verify the input size on their own.

03:55.560 --> 03:57.520
So, a naive example.

03:57.520 --> 04:00.560
I'm sure you've seen those in the past.

04:01.040 --> 04:05.840
We have a function that defines a buffer of size 64.

04:06.240 --> 04:13.800
And then we use strcpy, copying user input into the buffer without checking the size.

04:14.400 --> 04:20.720
In our main function we receive user input and we call this vulnerable function.

04:20.720 --> 04:29.960
So basically if the user input is bigger than 64 we are going to call vulnerable function with a buffer

04:29.960 --> 04:32.480
or string bigger than 64.

04:32.840 --> 04:41.200
And we're just going to copy it into our 64-byte buffer, which will cause an overflow.
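
NOTE
The example on the slide is a C function using strcpy. As a rough stand-in, here
is a minimal Python sketch that models memory as a bytearray. The helper names
unchecked_copy and checked_copy are hypothetical, and the 4 bytes marked NEXT
simply represent whatever sits in memory right after the buffer.

```python
BUFFER_SIZE = 64


def unchecked_copy(memory, offset, data):
    """Like strcpy: writes every byte of data, with no idea how big the buffer is."""
    memory[offset:offset + len(data)] = data


def checked_copy(memory, offset, data, buffer_size):
    """The fix: refuse input that does not fit in the buffer."""
    if len(data) > buffer_size:
        raise ValueError("input is larger than the buffer")
    memory[offset:offset + len(data)] = data


# 64 bytes of buffer followed by 4 bytes that belong to something else.
memory = bytearray(BUFFER_SIZE) + bytearray(b"NEXT")
user_input = b"A" * 68

unchecked_copy(memory, 0, user_input)
neighbor_after_overflow = bytes(memory[BUFFER_SIZE:])

try:
    checked_copy(bytearray(BUFFER_SIZE + 4), 0, user_input, BUFFER_SIZE)
    rejected = False
except ValueError:
    rejected = True

print(neighbor_after_overflow, rejected)
```

The unchecked copy silently replaces the neighboring bytes, while the checked
version rejects the oversized input before anything is written.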

04:42.720 --> 04:53.640
So if we look at this in the stack frame, what we will have is the 64-byte

04:53.640 --> 04:59.960
buffer, and after the buffer we will have LR, the link register, or the return address.

05:00.400 --> 05:09.160
So if we overflow the buffer after 64 bytes, what would actually happen is that we will start overwriting

05:09.640 --> 05:10.880
the return address.

05:11.280 --> 05:22.760
Since PC is restored from the saved LR value, we can actually control

05:22.800 --> 05:23.400
PC.

05:23.920 --> 05:32.160
Controlling PC means that we control where the system is going to execute code from, and that means

05:32.160 --> 05:34.360
we control the program.

05:34.720 --> 05:42.360
So overwriting the return address lets us take full control over the system's execution.
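
NOTE
Here is a minimal sketch of that chain of events, again with a bytearray standing
in for the stack frame. It assumes the layout described here (64-byte buffer, then
a 4-byte little-endian return address), and both addresses are made-up values.

```python
import struct

BUFFER_SIZE = 64
ORIGINAL_RETURN = 0x00010430    # made-up address of the legitimate caller
ATTACKER_TARGET = 0x41424344    # made-up address chosen by the attacker

# Frame layout: 64-byte buffer, then the 4-byte saved return address.
frame = bytearray(BUFFER_SIZE) + bytearray(struct.pack("<I", ORIGINAL_RETURN))


def vulnerable_function(frame, user_input):
    """Copies without a size check, then 'returns' by reading the saved address."""
    frame[0:len(user_input)] = user_input
    (pc,) = struct.unpack_from("<I", frame, BUFFER_SIZE)
    return pc


pc_normal = vulnerable_function(bytearray(frame), b"hello")
pc_hijacked = vulnerable_function(
    bytearray(frame), b"A" * BUFFER_SIZE + struct.pack("<I", ATTACKER_TARGET)
)
print(hex(pc_normal), hex(pc_hijacked))
```

With short input the function returns to the original caller; with 68 bytes of
input the value loaded into PC is whatever the attacker put in bytes 64 to 67.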

05:43.840 --> 05:47.440
Now let's actually exploit this same vulnerability.

05:47.680 --> 05:49.520
So we have some definitions here.

05:49.520 --> 05:51.230
Buffer size is 64.

05:51.270 --> 05:52.430
The buffer address:

05:52.990 --> 05:57.910
we can extract it and know the address, which is a constant.

05:58.390 --> 06:02.590
For the purpose of this example, it is this address.

06:02.870 --> 06:05.190
The return address size is four bytes.

06:05.990 --> 06:09.390
So we will now create our shellcode.

06:09.430 --> 06:13.070
The shellcode starts with a NOP sled,

06:13.550 --> 06:18.990
basically just to make sure that we can slide forward and land on our code.

06:18.990 --> 06:22.150
So we have some safe space, which is the NOP sled,

06:22.430 --> 06:25.510
and then our malicious code comes afterwards.

06:26.270 --> 06:28.710
Then we start constructing the payload.

06:28.710 --> 06:33.350
First we need to fill out the 64 bytes of the buffer.

06:33.750 --> 06:37.270
So we take buffer size and we fill it with ones.

06:38.190 --> 06:46.070
Now we need to overwrite the return address with an address that actually points into our shellcode.

06:46.390 --> 06:49.740
So walk with me to the image on the right side.

06:49.740 --> 06:50.940
And let's take a look.

06:51.380 --> 06:54.140
The beginning of the buffer is the buffer address.

06:54.500 --> 06:58.860
After 64 bytes we have the return address.

06:59.220 --> 07:05.340
And after these four bytes is where the shellcode is going to be placed.

07:05.700 --> 07:11.220
So the address after the return address is where the shellcode goes.

07:11.620 --> 07:20.820
So what we really want to do is have the return address overwritten with the address here, which is

07:20.820 --> 07:21.780
the shellcode.

07:21.820 --> 07:22.180
Right.

07:22.180 --> 07:24.740
Because then PC will get this address.

07:25.060 --> 07:31.780
This address will point to the shellcode and PC will start executing our shellcode.

07:32.300 --> 07:39.700
So now going back to the payload you can see that the shellcode address is being calculated as the buffer

07:39.700 --> 07:44.660
address plus the buffer size plus the return address size.

07:45.380 --> 07:53.010
Which brings us exactly after the return address, and then we added some offset into the NOP sled

07:53.050 --> 07:57.170
to make sure that we land in our NOP sled and then the shellcode.

07:57.610 --> 08:04.850
So again, the shellcode address is the buffer address plus the buffer size, 64, plus the return address

08:04.850 --> 08:06.730
size which is another four bytes.

08:07.330 --> 08:09.450
And then we added some safety.

08:09.450 --> 08:14.370
So we fall here, where it's written "shellcode goes here".

08:15.050 --> 08:19.730
So now the shellcode address variable contains the real shellcode address.

08:20.570 --> 08:23.210
And now we just need to pack it all together.

08:23.610 --> 08:30.690
So we take the payload which we filled up with 64 ones to fill up the buffer.

08:30.690 --> 08:34.410
And we add the shellcode address right afterwards.

08:34.890 --> 08:43.370
That means that the LR that comes right after the buffer is going to be overwritten with the shellcode

08:43.410 --> 08:44.210
address.

08:45.130 --> 08:52.200
Now we actually take the payload and add the shellcode, which is the malicious code.

08:52.680 --> 09:00.160
So now we have the payload crafted from ones to fill the buffer, the shellcode address to overwrite the return

09:00.200 --> 09:04.560
address, and then the actual shellcode.
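
NOTE
The payload steps just described can be sketched in Python roughly as follows.
This is a sketch under assumptions, not the exact script on the slide:
BUFFER_ADDRESS is a placeholder (the real value is only shown on screen), the
sled length and offset are assumed, and malicious_code is filler bytes rather
than real instructions. The NOP used is the classic ARM encoding of mov r0, r0.

```python
import struct

BUFFER_SIZE = 64
RETURN_ADDRESS_SIZE = 4
BUFFER_ADDRESS = 0x7EFFF000   # placeholder; the real address is shown on the slide
NOP_SLED_LENGTH = 16          # assumed number of NOP instructions
SLED_OFFSET = 8               # assumed safety offset into the sled, word aligned

NOP = struct.pack("<I", 0xE1A00000)   # ARM "mov r0, r0", a common no-op encoding
malicious_code = b"\xCC" * 8          # filler bytes, not real instructions
shellcode = NOP * NOP_SLED_LENGTH + malicious_code

# Step 1: fill the 64-byte buffer.
payload = b"1" * BUFFER_SIZE

# Step 2: compute an address just past the return address, inside the NOP sled.
shellcode_address = BUFFER_ADDRESS + BUFFER_SIZE + RETURN_ADDRESS_SIZE + SLED_OFFSET

# Step 3: overwrite the return address, then append the shellcode itself.
payload += struct.pack("<I", shellcode_address)
payload += shellcode

print(len(payload), hex(shellcode_address))
```

The resulting layout is 64 filler bytes, 4 address bytes, and then the sled
followed by the code, so the overwritten return address points into the sled.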

09:05.680 --> 09:13.920
So when we actually use this payload, what will happen is that we're going to overflow the buffer, overwrite

09:14.080 --> 09:18.080
LR with the address right after LR,

09:18.560 --> 09:22.080
and then place our shellcode right after LR.

09:23.160 --> 09:32.760
When we finish this call, the stack will look like this: LR pointing to an address that contains our shellcode.

09:32.800 --> 09:41.920
So now when the function returns, PC is restored from LR and executes code from the address where our shellcode

09:42.280 --> 09:43.080
went to.

09:43.680 --> 09:51.790
And this is how we basically finalize the exploitation, executing code directly from the stack: code

09:51.790 --> 09:57.390
that we placed there that performs our own desired operations.

10:00.910 --> 10:03.070
So this is a very simple exploitation.

10:03.070 --> 10:08.550
And there is a very good security solution to prevent this type of exploitation.

10:09.390 --> 10:16.710
Please note that by preventing exploitation, we are not preventing the presence of the vulnerability.

10:16.750 --> 10:20.070
Stack overflow vulnerabilities continue to exist out there.

10:20.630 --> 10:27.950
But what we are attempting to do in security mitigations is block the attempt to exploit the vulnerability.

10:28.310 --> 10:32.390
We can't control very well

10:34.590 --> 10:36.350
the existence of vulnerabilities.

10:36.350 --> 10:40.350
It's hard for us to eliminate all types of vulnerabilities in our code.

10:40.350 --> 10:45.420
What's easier to do is to prevent the attacker from taking these exploitation steps.

10:45.900 --> 10:53.020
So stack canaries are a security mitigation to prevent stack overflows from being exploited.

10:53.700 --> 10:55.700
And the way it works is very simple.

10:56.140 --> 11:03.300
When we enter a function, the first stage is to add a cookie and we are placing a cookie at each function

11:03.300 --> 11:04.260
entry point.

11:04.620 --> 11:09.700
And this code that places the cookie is being inserted at compile time.

11:10.300 --> 11:14.860
What that means is that every function frame will now look like this.

11:14.900 --> 11:22.340
We will have a buffer, and after the buffer, right before the return address, we always have the canary.

11:23.300 --> 11:26.980
It will always be the last thing before the return address.

11:27.700 --> 11:34.660
Now, before the function actually returns and exits, it validates that the canary value remained the

11:34.660 --> 11:38.620
same value that was placed there at the beginning of the function.

11:39.060 --> 11:40.420
This is really important.

11:40.420 --> 11:43.770
So when we enter the function, we put some magic value.

11:43.810 --> 11:46.730
Let's say it's 1234.

11:47.290 --> 11:54.410
Now, when we come to exit the function before we actually return, we verify that the canary value

11:54.410 --> 11:57.810
is still 1234.

11:58.450 --> 12:03.210
If it's not, then we are not allowing the function to return.

12:03.250 --> 12:09.450
We are assuming that the return address is corrupted, because if our canary is corrupted, the return

12:09.450 --> 12:11.370
address might be corrupted as well.

12:11.850 --> 12:15.850
So we are not returning and we are preventing the exploitation.

12:16.530 --> 12:23.490
Only if the canary value is intact and is identical to the value that was placed there at the beginning

12:23.490 --> 12:31.890
of the function do we allow execution to continue and restore PC from the return address.
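
NOTE
The entry and exit checks can be sketched in Python as follows. In reality the
compiler inserts this logic; here it is modeled by hand, assuming a frame laid
out as buffer, then a 4-byte canary, then the return address. The function name
and the return address are made up, and the optional canary argument exists only
so the example can be repeated with a known value.

```python
import secrets
import struct

BUFFER_SIZE = 64
CANARY_SIZE = 4


def run_protected_function(user_input, canary=None, saved_return=0x00010430):
    """Frame layout: buffer | canary | return address (the address is made up)."""
    if canary is None:
        canary = secrets.token_bytes(CANARY_SIZE)   # entry: place a fresh secret
    frame = (
        bytearray(BUFFER_SIZE)
        + bytearray(canary)
        + bytearray(struct.pack("<I", saved_return))
    )

    frame[0:len(user_input)] = user_input           # the unchecked copy

    # Exit: validate the canary before the return address is used.
    stored = bytes(frame[BUFFER_SIZE:BUFFER_SIZE + CANARY_SIZE])
    if stored != canary:
        raise RuntimeError("stack smashing detected")
    return struct.unpack_from("<I", frame, BUFFER_SIZE + CANARY_SIZE)[0]


print(hex(run_protected_function(b"hello")))
try:
    run_protected_function(b"A" * 72, canary=b"\x12\x34\x56\x78")
except RuntimeError as error:
    print(error)
```

An input that fits in the buffer returns normally, while the 72-byte input
tramples the canary on its way to the return address and is stopped.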

12:33.410 --> 12:40.490
Pay attention now that if we have a vulnerability and a vulnerable buffer as an attacker, I need to

12:40.530 --> 12:44.290
overflow the buffer and I want to get into the return address.

12:44.730 --> 12:51.690
There is no way for me as an attacker to actually overwrite the return address without overwriting

12:51.690 --> 12:58.770
the canary value first, because the canary is placed between the local variables and the return address.

12:59.010 --> 12:59.530
Excuse me,

12:59.530 --> 13:04.530
between the vulnerable buffer and the return address. I cannot overflow the buffer,

13:05.450 --> 13:09.170
keep the canary intact, and then overwrite the return address.

13:09.570 --> 13:14.610
So our previous exploitation will not work because we will corrupt the canary value.

13:15.130 --> 13:17.570
We will also corrupt the return address.

13:17.570 --> 13:25.010
But since we will corrupt the canary value, the validation will fail and the function will not allow

13:25.010 --> 13:31.810
us to return to our corrupted return address, which is how the exploitation will be prevented.

13:33.330 --> 13:36.330
There are some bypass techniques to stack canaries.

13:36.610 --> 13:39.760
The first one is an information leak vulnerability.

13:39.800 --> 13:47.080
Remember the Heartbleed vulnerability we talked about earlier, which gave the attacker a way to leak

13:47.080 --> 13:48.400
data from the memory?

13:48.800 --> 13:55.280
So imagine that I, as an attacker, have an information leak that could first tell me what's in the canary,

13:55.640 --> 13:59.240
what the value is. If I know it's 1234,

13:59.400 --> 14:06.840
then I can overflow the buffer, make sure I put 1234 here

14:06.840 --> 14:13.360
in the canary so it remains the same, and only then continue to overwrite the return address.

14:14.120 --> 14:20.600
That way the validation of the cookie will be successful and the function will return to my corrupted

14:20.600 --> 14:21.640
return address.

14:22.000 --> 14:30.000
So by chaining two vulnerabilities, information leak plus stack overflow, I'm able to restore the canary

14:30.040 --> 14:39.830
value and pass the validation to continue the execution from my malicious address.
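
NOTE
A minimal sketch of that chain in Python, under assumptions: the frame is laid
out as buffer, 4-byte canary, return address; the canary value 1234 is taken
from the example above; the leak itself is not modeled and is simply assumed to
have revealed the value; and the target address is made up.

```python
import struct

BUFFER_SIZE = 64
CANARY_SIZE = 4


def check_and_return(frame, canary):
    """Exit path of the protected function: validate the canary, then load PC."""
    if bytes(frame[BUFFER_SIZE:BUFFER_SIZE + CANARY_SIZE]) != canary:
        raise RuntimeError("stack smashing detected")
    return struct.unpack_from("<I", frame, BUFFER_SIZE + CANARY_SIZE)[0]


secret_canary = b"1234"         # the value placed at function entry
leaked_canary = secret_canary   # assumption: a separate info leak revealed it
target = 0x41424344             # made-up attacker-chosen address

# Fill the buffer, write the canary back unchanged, then overwrite the return address.
payload = b"A" * BUFFER_SIZE + leaked_canary + struct.pack("<I", target)
frame = bytearray(BUFFER_SIZE + CANARY_SIZE + 4)
frame[0:len(payload)] = payload

pc = check_and_return(frame, secret_canary)
print(hex(pc))
```

Because the overflow rewrites the canary with its own value, the check passes
and the corrupted return address is used.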

14:40.790 --> 14:43.270
Another example is brute force.

14:43.710 --> 14:52.390
Let's say that the system is not high in entropy, and the canary value is really limited to a small

14:52.390 --> 14:58.830
number of values.

14:59.030 --> 15:06.390
Then I can just continue and try to overflow with different values that I place in the canary until

15:06.390 --> 15:15.390
I hit the right value and successfully pass the validation, which is kind of the same as the info leak

15:15.430 --> 15:16.070
example,

15:16.110 --> 15:18.510
only this time I'm guessing again and again

15:18.550 --> 15:21.230
what the canary value is until I'm successful.
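
NOTE
To see why low entropy matters, here is a small Python sketch of the guessing
loop. It assumes a toy 8-bit canary, so only 256 possible values, and a
hypothetical oracle function that stands in for "did the process survive the
overflow". Real canaries are much wider than this.

```python
def guess_canary(oracle, candidates):
    """Try each candidate until the oracle reports that validation passed."""
    attempts = 0
    for candidate in candidates:
        attempts += 1
        if oracle(candidate):
            return candidate, attempts
    return None, attempts


SECRET = 0xA7   # toy 8-bit canary: only 256 possible values


def oracle(guess):
    """Stands in for 'the overflow with this guess did not trip the check'."""
    return guess == SECRET


found, attempts = guess_canary(oracle, range(256))
print(hex(found), attempts)
```

With 8 bits the search ends after at most 256 tries; every additional bit of
entropy doubles the worst case.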

15:21.990 --> 15:24.030
A few limitations to stack canaries.

15:24.750 --> 15:27.230
First, it can only be applied when you compile from source code.

15:27.230 --> 15:33.790
So if you have a third party binary that is not compiled with stack canaries, then this binary is still

15:33.790 --> 15:35.820
very vulnerable to stack overflows.

15:36.540 --> 15:41.780
Effectiveness depends on the entropy of the system, which we discussed when we talked about the brute

15:41.780 --> 15:42.260
force

15:42.420 --> 15:52.540
bypass. And local function variables are not protected, because everything that goes here before

15:52.540 --> 15:55.740
the canary is placed can still be corrupted.

15:56.220 --> 16:03.420
So if there is a local variable that is important or influences the system behavior, I can still control

16:03.420 --> 16:04.500
its value.

16:04.900 --> 16:08.060
And that can also cause unintended behavior.

16:08.100 --> 16:15.020
While it's not straightforward like the return address, there are still exploitations that can be successful

16:15.060 --> 16:17.500
only by corrupting the local variables.
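
NOTE
Here is a minimal Python sketch of that limitation. It assumes a hypothetical
local variable called is_admin that happens to sit between the buffer and the
canary; whether a real compiler places a variable there depends on the compiler
and its settings. The canary value is a fixed toy value.

```python
import struct

BUFFER_SIZE = 64
CANARY = b"\x00\x9a\x3c\x5e"   # fixed toy value for the example

# Assumed layout: buffer | is_admin (hypothetical 4-byte local) | canary
frame = bytearray(BUFFER_SIZE) + bytearray(struct.pack("<I", 0)) + bytearray(CANARY)

# Overflow by exactly 4 bytes: enough to reach the local, not the canary.
user_input = b"A" * BUFFER_SIZE + struct.pack("<I", 1)
frame[0:len(user_input)] = user_input

is_admin = struct.unpack_from("<I", frame, BUFFER_SIZE)[0]
canary_intact = bytes(frame[BUFFER_SIZE + 4:BUFFER_SIZE + 8]) == CANARY
print(is_admin, canary_intact)
```

The canary check would pass here, yet the value that drives the program's
behavior has already been changed by the attacker.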

16:19.660 --> 16:27.700
Another very cool security mitigation is ASLR, address space layout randomization, which basically means

16:27.700 --> 16:34.970
randomizing the memory addresses each time the process loads or the system boots. That actually creates

16:34.970 --> 16:37.730
an unpredictable memory layout.

16:38.730 --> 16:40.930
Why does that matter?

16:41.690 --> 16:47.930
As we noted before, attackers rely on known memory addresses to exploit vulnerabilities.

16:48.090 --> 16:50.250
ASLR makes that very difficult.

16:50.650 --> 16:59.650
So in the example that we went through earlier, without ASLR, the vulnerable buffer

16:59.690 --> 17:03.970
was always at a certain address.

17:04.370 --> 17:08.330
So it was easy to craft a reliable exploit payload as we did.

17:08.330 --> 17:10.490
We just need to jump to a known address.

17:11.050 --> 17:16.090
But with ASLR, the buffer location changes randomly with each execution.

17:16.090 --> 17:22.890
So in the first run it could be at this address, in the second run it could be at this address, and so on

17:22.890 --> 17:23.890
and so forth.
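
NOTE
A toy Python model of the difference, under assumptions: the base address, the
page size, and the 8 bits of entropy are made-up illustration values (real
systems use more entropy), and the random generator is seeded only so the
example is repeatable.

```python
import random

PAGE_SIZE = 0x1000
STACK_BASE = 0x7E000000   # made-up base address
ENTROPY_BITS = 8          # assumed for the example; real systems use more


def buffer_address(aslr_enabled, rng):
    """Where the vulnerable buffer lands on one run of the program."""
    if not aslr_enabled:
        return STACK_BASE
    return STACK_BASE + rng.randrange(2 ** ENTROPY_BITS) * PAGE_SIZE


rng = random.Random(1234)   # seeded only to make the sketch repeatable
without_aslr = {buffer_address(False, rng) for _ in range(100)}
with_aslr = {buffer_address(True, rng) for _ in range(100)}

print(len(without_aslr), len(with_aslr))
```

Across 100 simulated runs there is a single buffer address without ASLR and many
different ones with it, so a hardcoded address in the payload is only right by
chance.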

17:24.330 --> 17:29.610
So when I'm writing my exploit,

17:30.690 --> 17:32.760
when I'm writing my exploit,

17:32.960 --> 17:38.800
this buffer address that I put here and used is actually unknown to me.

17:39.400 --> 17:40.320
And this is

17:40.360 --> 17:47.960
what makes ASLR a strong security mitigation against exploit crafting.

17:48.760 --> 17:58.840
So the advantages are significantly raising the bar for attackers by eliminating address predictability.

17:59.280 --> 18:04.800
It forces attackers to find memory disclosure vulnerabilities first, like the information

18:04.800 --> 18:13.640
leaks we talked about, that will indicate to us what the buffer address is at this moment, in real time, so that we can use

18:13.640 --> 18:15.080
it in our exploitation.

18:15.520 --> 18:22.080
It breaks hardcoded exploit addresses, as we've seen, and it provides protection without really any

18:22.080 --> 18:23.280
performance impact.

18:23.280 --> 18:31.470
I'm not adding any code or spending any CPU cycles, I'm just loading at different addresses every time, so

18:31.470 --> 18:32.670
no code additions.

18:34.030 --> 18:37.670
The limitations are as follows.

18:37.870 --> 18:39.190
First, entropy.

18:39.470 --> 18:44.670
If I don't have a lot of entropy, then addresses could become pretty much predictable.

18:45.670 --> 18:49.390
Information leaks completely bypass this protection.

18:49.390 --> 18:52.670
So many times we will see two vulnerabilities that are being used.

18:52.670 --> 18:57.430
The first one is info leak and the second one is a buffer overflow.

18:57.670 --> 19:02.830
So if you ever thought to yourself, well info leak, it's not that important.

19:03.150 --> 19:10.910
Well, it is, because it gives attackers a way to exploit other vulnerabilities that you have in

19:10.910 --> 19:11.550
the system.

19:12.830 --> 19:15.430
Resource constraints limit implementation.

19:15.430 --> 19:23.190
So for example, in IoT devices, you will see less use of ASLR, although this is

19:23.230 --> 19:27.470
now getting better because they have more resources.

19:27.940 --> 19:37.500
And most importantly, I think, or at least worth noting, is that even if you have ASLR and it's working

19:37.500 --> 19:41.020
properly, the system will still crash

19:41.020 --> 19:44.100
if I try to exploit it.

19:44.540 --> 19:51.620
So while the attacker is not able to gain remote code execution, he is still able to crash the system,

19:51.620 --> 19:57.020
which is a form of denial of service and not the desired behavior.

19:57.060 --> 20:06.940
Anyhow, the reason is that if I overflow a buffer and am able to overwrite other elements in memory,

20:07.420 --> 20:16.780
even though I cannot predict the buffer address and correctly jump to my malicious code, I am

20:16.780 --> 20:19.780
still corrupting important pieces in the software.

20:20.380 --> 20:27.460
If I corrupt important pieces in the software, many times I will cause unexpected behavior.

20:27.460 --> 20:37.220
So while ASLR prevented the code execution, it didn't prevent the overflow itself, which means the

20:37.220 --> 20:38.540
system might crash.

20:40.300 --> 20:45.540
Another mitigation is the NX bit, the no-execute bit.

20:45.820 --> 20:54.100
This means restricting the execution to only designated code segments, so it will be harder for attackers

20:54.100 --> 20:55.980
to execute arbitrary code.

20:57.100 --> 20:59.900
Even if vulnerabilities exist in the code.

21:00.500 --> 21:08.220
So the most straightforward implementation of it is to make the stack and heap memory non-executable.

21:08.580 --> 21:15.380
If you remember from the example before we placed our shellcode right on the stack, we put it after

21:15.380 --> 21:18.100
the return address and then we jumped to it.

21:18.660 --> 21:26.690
But if we have the NX bit, that means that the CPU cannot execute code coming from stack memory, which

21:26.690 --> 21:31.810
means that if you place your shellcode on the stack, it will not be executed.

21:31.810 --> 21:38.770
Even if you did everything correctly and corrupted the return address pointing to your shellcode, it

21:38.770 --> 21:41.050
will be blocked at the CPU level.
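
NOTE
The idea can be sketched in Python as a permission check on every instruction
fetch. This is a toy model: the Region class, the region boundaries, and the
addresses are all made up for illustration, and on real hardware the check is
done by the CPU and memory management unit, not by software like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    start: int
    end: int          # exclusive
    executable: bool


# Made-up memory map: only the code segment is marked executable.
REGIONS = [
    Region("code", 0x00010000, 0x00020000, executable=True),
    Region("heap", 0x00100000, 0x00200000, executable=False),
    Region("stack", 0x7EFF0000, 0x7F000000, executable=False),
]


def fetch_instruction(pc):
    """Allow the fetch only if PC points into an executable region."""
    for region in REGIONS:
        if region.start <= pc < region.end:
            if not region.executable:
                raise PermissionError(f"execute fault in {region.name} at {pc:#x}")
            return region.name
    raise MemoryError(f"unmapped address {pc:#x}")


print(fetch_instruction(0x00010430))
try:
    fetch_instruction(0x7EFFF04C)   # PC pointing at shellcode on the stack
except PermissionError as error:
    print(error)
```

Even with a perfectly overwritten return address, the fetch from the stack
region is refused, which is the behavior described above.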

21:42.690 --> 21:51.610
So this is a very cool technique that is actually very effective, because before the NX bit, or DEP

21:51.810 --> 21:57.850
in other systems, it was very easy for attackers to place their shellcode in the heap memory or the

21:57.850 --> 22:06.610
stack, just use the memory, throw the shellcode in there and point the PC there in various ways.

22:07.050 --> 22:09.810
Now it's actually harder.

22:09.810 --> 22:11.610
Where can we execute code from?

22:11.610 --> 22:16.370
If I can't place my code in the memory because the memory is not executable?

22:17.290 --> 22:21.810
We will talk more about it when we get to return-oriented programming.

22:23.440 --> 22:24.600
Outliers.

22:24.600 --> 22:30.360
It's important to know that there are some systems that need to enable code running from the heap.

22:30.920 --> 22:36.920
In that case, protections can be configured to allow execution of code only from certain segments to

22:36.960 --> 22:44.160
allow this system to execute code from the heap, but still prevent easy code execution from

22:44.160 --> 22:45.560
other heap segments.

22:46.360 --> 22:52.120
So we have a lot of memory protections that allow us to configure different regions in memory.

22:52.600 --> 22:59.600
So if you ever wondered why we configure what's allowed to be executed and what's not,

22:59.600 --> 23:04.360
this is part of the reason. Some real-world data:

23:04.480 --> 23:16.320
So both ARM and x86 environments are widely using ASLR, more than 95% actually for modern desktop

23:16.320 --> 23:19.280
and mobile operating systems. Stack canaries:

23:19.840 --> 23:23.390
nearly universal implementation across systems.

23:23.670 --> 23:31.910
IoT is a bit lagging and embedded systems are a bit lagging in general, but those security mitigations

23:31.910 --> 23:34.950
are very strong and are widely adopted.

23:35.510 --> 23:43.750
So when we come to exploit advanced systems, the simple exploitation we learned about will probably

23:43.750 --> 23:44.830
not work for us.

23:45.510 --> 23:51.830
And this is why in the next chapters, we are going to learn about advanced techniques like return oriented

23:51.830 --> 23:52.550
programming.

23:53.070 --> 23:59.110
That gives attackers a way to respond to those security mitigations.

23:59.950 --> 24:07.070
And I hope you will enjoy your first hands-on lab, writing your first exploit and running your

24:07.070 --> 24:10.270
own code on a vulnerable program.

24:10.750 --> 24:14.070
Thank you and see you in the next chapter.
