WEBVTT

00:03.320 --> 00:04.560
Hello everyone.

00:04.840 --> 00:11.560
Welcome back to Software Exploitation, section three: return-oriented programming.

00:12.320 --> 00:18.160
In the last session we learned about security mitigations like ASLR.

00:18.280 --> 00:28.920
And if you remember, the last piece was the NX bit, which prevents us as attackers from executing code from

00:28.960 --> 00:35.480
stack memory, heap memory, or, generally speaking, from disallowed locations.

00:36.240 --> 00:45.080
Now we are going to talk about an attacker's technique to bypass these mitigations, and you probably

00:45.080 --> 00:50.640
heard about it as ROP, return-oriented programming, or return-to-libc.

00:52.000 --> 00:54.040
So a bit of introduction.

00:54.040 --> 01:00.420
What is ROP? Return-oriented programming is an exploitation technique that bypasses security protections

01:00.420 --> 01:01.460
like the NX bit.

01:02.300 --> 01:08.100
But what it actually means, in fact, is that instead of injecting malicious code, attackers actually

01:08.100 --> 01:17.500
reuse code that already exists on the device, chaining together pieces of it that we call gadgets,

01:17.540 --> 01:24.300
to build a complex exploit payload that performs the behavior they intend.

01:24.780 --> 01:30.300
So that was a very long explanation of something that is pretty simple.

01:30.300 --> 01:38.180
Instead of placing new code into the system, we are actually using already executable code regions

01:38.540 --> 01:47.660
only in a different order, to make this code work for us and perform unintended behavior. In real life,

01:47.660 --> 01:49.940
we also have types of ROP.

01:49.940 --> 01:56.980
So let's imagine a politician giving an interview that is being recorded and she is saying something

01:56.980 --> 02:00.680
like, I've never supported policies that harm the environment.

02:01.160 --> 02:03.680
And you can read the sentence on the slide.

02:04.120 --> 02:13.280
And now an editor takes that recording and edits it to make it sound like she

02:13.280 --> 02:16.680
said, I support policies that harm the environment.

02:17.320 --> 02:19.920
What you've seen here is actually pretty simple.

02:20.160 --> 02:28.960
It's about selectively taking pieces of what she actually said, only in a different order, or

02:28.960 --> 02:33.800
eliminating certain pieces of it to construct a new meaning.

02:34.800 --> 02:42.560
This is exactly how it works in software too, where instead of words we have code segments.

02:43.320 --> 02:49.360
So going step by step, what we do in ROP is start with existing code.

02:49.800 --> 02:51.600
We add no new code.

02:51.600 --> 02:59.900
Just like in our example, where we didn't add any new words to what the politician said. Then we find reusable

02:59.900 --> 03:07.900
fragments, meaning we identify executable code snippets, or gadgets, that are already present in the program

03:07.900 --> 03:10.420
and can support what we are trying to achieve.

03:11.060 --> 03:17.900
Then we chain them together to create this new meaning or malicious functionality from permitted code,

03:17.940 --> 03:20.340
only executed in a different way.

03:21.260 --> 03:27.420
This code, or chain of gadgets, is usually referred to as the shellcode.

03:27.740 --> 03:35.260
The shellcode is just the initial malicious code running on the system that then can legitimately load

03:35.260 --> 03:37.060
and run the full malware.

03:37.660 --> 03:45.900
So what I'm trying to say here is that usually we are not going to execute a full blown malware just

03:45.900 --> 03:47.340
from ROP gadgets.

03:47.340 --> 03:49.620
That would be pretty much a nightmare.

03:50.020 --> 03:58.850
But what ROP allows us to do is execute a very small piece of software, called shellcode, to actually

03:58.850 --> 04:07.410
open a legitimate reverse shell, or do something that will then allow us to legitimately use existing

04:07.410 --> 04:10.330
system functionalities to install the malware.

04:13.090 --> 04:14.450
What is it technically?

04:15.010 --> 04:20.810
So you need to remember that the stack pointer usually becomes the new PC.

04:21.210 --> 04:29.570
And by treating the stack pointer as such, we can understand how the chaining of gadgets actually

04:29.770 --> 04:30.450
works.

04:31.010 --> 04:38.690
Don't worry, we are going to take this step by step with a real example on an ARM architecture.

04:40.050 --> 04:47.610
So in this example, let's assume that we as attackers are trying to execute system("/bin/sh").

04:48.010 --> 04:49.690
This is what we really want to do.

04:50.170 --> 04:55.530
But we cannot introduce new code into memory and execute it from there.

04:55.530 --> 04:56.850
So what are we going to do?

04:57.670 --> 05:03.750
First, let's assume we are getting all of the gadgets we need and all the relevant addresses using

05:03.750 --> 05:06.270
some gadget finder tool like Ropper.

05:07.230 --> 05:13.350
You will actually use such tools in your hands-on exercises during this course.

05:14.670 --> 05:20.670
Then, assuming we have all we need, we have all these addresses and all these gadgets, we overflow

05:20.710 --> 05:23.990
the buffer with the ROP chain that we have built.

05:25.070 --> 05:28.870
Let's assume this is the ROP chain you see here.

05:28.910 --> 05:31.070
The buffer is filled with B's.

05:31.590 --> 05:34.270
And then this is the ROP chain we created.

05:34.710 --> 05:36.510
This is an address within libc.

05:36.910 --> 05:42.790
Another address within libc, another address, and then the string /bin/sh.

05:43.390 --> 05:53.430
Now let's assume that at the first address, libc plus 18798, we have this binary code: pop r3,

05:53.470 --> 05:54.070
PC.

05:54.590 --> 05:57.650
So this is what is actually at that address.

05:58.410 --> 06:09.450
Then at libc plus 2c4 we have this chain of commands: mov r0, sp, and a branch to the r3

06:09.490 --> 06:10.370
register.

06:10.650 --> 06:14.130
So as you can see those are addresses within the code segment.

06:14.450 --> 06:19.570
And this is the code that is actually present at those addresses.

06:19.970 --> 06:27.730
These addresses are allowed to be executed even if you have the NX bit, because they are in the code segment.

06:28.170 --> 06:36.490
So let's assume the attacker built this ROP chain and put all of this in stack memory.
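
NOTE
Editor's sketch, not part of the recording: one way the stack layout described above could be packed in Python.
It assumes a 32-bit little-endian ARM target. LIBC_BASE, SYSTEM and PADDING are hypothetical placeholder values,
and the second gadget offset (2c4) is taken from the audio as heard, so treat it as uncertain.
```python
import struct
# Every number here is a placeholder for illustration, not a real libc value.
LIBC_BASE = 0x76E00000                 # hypothetical load address of libc
POP_R3_PC = LIBC_BASE + 0x18798        # gadget 1: pop r3, pc
MOV_R0_SP_BR_R3 = LIBC_BASE + 0x2C4    # gadget 2: mov r0, sp then branch to r3 (offset uncertain)
SYSTEM = LIBC_BASE + 0x3A920           # hypothetical address of system()
PADDING = 64                           # hypothetical distance to the saved return address
def build_chain():
    """Return the overflow bytes: padding, three 32-bit words, then the string."""
    chain = b"B" * PADDING
    chain += struct.pack("<I", POP_R3_PC)        # overwrites the return address
    chain += struct.pack("<I", SYSTEM)           # first pop: goes into r3
    chain += struct.pack("<I", MOV_R0_SP_BR_R3)  # second pop: goes into pc
    chain += b"/bin/sh\x00"                      # sp points here when gadget 2 runs
    return chain
```
The order of the words mirrors the order in which the first gadget pops them off the stack.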

06:37.530 --> 06:39.530
And now let's see what actually happens.

06:41.770 --> 06:46.050
So on the right side you have the list of registers.

06:46.050 --> 06:52.130
And we will follow that list as we go step by step through the execution.

06:52.690 --> 07:00.350
So the first thing is where PC starts after we have successfully overwritten the return address.

07:00.910 --> 07:05.230
PC now receives this address.

07:05.270 --> 07:06.710
Let's see this.

07:06.750 --> 07:12.510
You can imagine this as the return address, and we have overwritten the return address with libc

07:12.550 --> 07:14.670
plus 18798.

07:15.150 --> 07:18.590
So now PC starts executing code from this address.

07:19.190 --> 07:24.790
As I previously mentioned, this address contains this code: pop r3, pc.

07:25.310 --> 07:33.110
For those of you who don't know, pop is an assembly command that pops values

07:33.110 --> 07:37.390
from the stack and puts them into the registers named in the command.

07:37.710 --> 07:46.470
So pop r3, pc will take the value that the stack pointer is now pointing to and put it in r3,

07:46.910 --> 07:50.950
and then take the next value and put it in PC.

07:51.310 --> 07:54.030
So let's go through that: pop

07:54.070 --> 07:58.970
r3 will take this address and put it inside r3.

07:58.970 --> 08:07.050
So you can see here, within our array of registers, that r3 received this address, which points to

08:07.090 --> 08:07.730
system.

08:09.090 --> 08:17.370
Then we perform the second pop, which extracts the next address in line from the stack into PC.

08:17.770 --> 08:22.090
So now PC has received the libc address.

08:23.170 --> 08:25.810
Now, if you remember,

08:26.010 --> 08:30.210
and of course I assume you remember, PC is the program counter.

08:30.210 --> 08:36.370
So now the CPU starts executing code from the address that we put in PC.

08:37.530 --> 08:43.250
And at this address are these assembly commands.

08:43.650 --> 08:46.570
So PC starts executing from here.

08:46.570 --> 08:57.670
So we jumped from here, by controlling PC via popping an address from the stack, and we moved to this

08:57.910 --> 08:58.790
command.

08:58.830 --> 09:04.870
So now we are going to perform mov r0, sp.

09:04.990 --> 09:11.910
So r0 is going to receive the address that is now present in the stack pointer.

09:12.230 --> 09:16.070
The stack pointer, as you can see, is now pointing to /bin/sh.

09:16.470 --> 09:20.990
So r0 now points to the string /bin/sh.

09:22.230 --> 09:30.150
Now PC progresses to the next command, which branches to register r3.

09:30.550 --> 09:31.430
As for r3,

09:32.070 --> 09:37.550
we took care of making sure it holds the address of system.

09:37.550 --> 09:39.910
We took care of it in our previous command.

09:40.310 --> 09:45.390
So now we are actually making PC jump to the address of system.

09:46.750 --> 09:48.190
So what happened here?

09:48.190 --> 09:51.870
We are now executing code from the function system.

09:52.670 --> 09:55.980
Functions get their first parameter from r0.

09:56.140 --> 10:02.140
We also took care that r0 is pointing to the /bin/sh string.

10:02.700 --> 10:07.940
So what's really being executed here is system("/bin/sh").
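
NOTE
Editor's sketch, not part of the recording: a toy Python model of the walkthrough above.
It is not an ARM emulator. The addresses are hypothetical stand-ins, the stack is a plain list,
and sp is an index into that list rather than a real address.
```python
SYSTEM = 0x5000        # hypothetical address of system()
GADGET_POP = 0x1000    # stands in for the pop r3, pc gadget
GADGET_MOV = 0x2000    # stands in for mov r0, sp followed by a branch to r3
def run_chain():
    """Step through the two gadgets and report the final registers."""
    # Words sitting on the stack right after the overwritten return address.
    stack = [SYSTEM, GADGET_MOV, "/bin/sh"]
    regs = {"r0": None, "r3": None, "sp": 0, "pc": GADGET_POP}
    trace = []
    while regs["pc"] != SYSTEM:
        trace.append(regs["pc"])
        if regs["pc"] == GADGET_POP:
            regs["r3"] = stack[regs["sp"]]   # first pop fills r3
            regs["sp"] += 1
            regs["pc"] = stack[regs["sp"]]   # second pop fills pc
            regs["sp"] += 1
        elif regs["pc"] == GADGET_MOV:
            regs["r0"] = regs["sp"]          # r0 now points where sp points
            regs["pc"] = regs["r3"]          # branch to r3
        else:
            raise RuntimeError("pc left the modeled gadgets")
    argument = stack[regs["r0"]]             # what system() would read through r0
    return regs, argument, trace
```
Running run_chain() ends with pc equal to SYSTEM and r0 pointing at the /bin/sh entry, matching the register list on the slide.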

10:10.620 --> 10:12.260
And that's it.

10:12.300 --> 10:13.100
We won.

10:13.340 --> 10:16.180
We wanted to execute this piece of code.

10:16.620 --> 10:19.140
We didn't introduce new code to the system.

10:19.140 --> 10:26.580
We just controlled the PC through the stack pointer and through overwriting the return address with

10:26.580 --> 10:29.340
the first address, which points to this command.

10:29.980 --> 10:34.580
And usually, this is what you will look for when you start chaining your gadgets.

10:34.580 --> 10:43.020
You will look for pop commands so that you can start controlling PC by popping elements from the stack.

10:44.740 --> 10:46.060
Pretty amazing, right?

10:46.260 --> 10:52.460
It sounds complicated, but we have automated tools today to help us build those things out of the

10:52.460 --> 10:57.320
existing binary of the system that we want to exploit.

10:57.320 --> 11:05.040
So you feed in the binary, the code segments of the system that you want to exploit, and the tools

11:05.040 --> 11:07.440
help you to find relevant gadgets.

11:09.760 --> 11:10.320
Okay.

11:10.360 --> 11:13.800
So now, going back to thinking

11:16.480 --> 11:21.200
as a defender, could you think of a way to prevent this?

11:25.480 --> 11:30.280
So the defender's response to ROP is control flow integrity.

11:30.840 --> 11:32.000
What is CFI?

11:32.120 --> 11:36.160
CFI can actually help prevent ROP and JOP attacks,

11:36.200 --> 11:41.440
JOP being jump-oriented programming, by restricting indirect jumps and calls.

11:41.760 --> 11:43.480
It's not really restricting.

11:43.480 --> 11:50.320
It's more like inspecting that every jump is going to be into allowed locations.

11:50.360 --> 12:00.060
Allowed locations are, for example, not in the middle of functions, and jumps or branches must actually follow

12:00.540 --> 12:05.540
the execution paths that we recognized in the software.

12:06.100 --> 12:09.420
So again, it sounds complicated, but let's break it down piece by piece.

12:10.060 --> 12:15.940
The first step of CFI, at a high level, is to instrument the code and create a control flow graph.

12:15.980 --> 12:22.260
A control flow graph at a high level is a graph that shows all the execution paths within the program.

12:22.660 --> 12:29.900
So for example, if function A calls functions B and C and that's it, then the control flow graph

12:29.900 --> 12:34.420
will show that from function A you can only get to function B and function C.

12:36.020 --> 12:39.020
Then there is path validation during execution.

12:39.020 --> 12:49.460
The instrumentation that was inserted during the first stage validates every control

12:49.500 --> 12:54.920
transfer against the permitted, legitimate paths.

12:55.880 --> 13:01.400
That way, we can actually prevent attacks where there is an unauthorized branch. Like we did before: we

13:01.400 --> 13:05.160
branched and started executing code from the middle of a function.

13:05.160 --> 13:13.280
It was really just some pop instruction that we jumped to, which is probably part of a function,

13:13.680 --> 13:14.920
the middle of a function.

13:15.240 --> 13:16.800
It's going to be blocked.

13:16.800 --> 13:25.920
By blocking this, we can actually prevent ROP chains from causing unintended behaviors.

13:26.960 --> 13:30.240
Now let's stop for a second and remember section one.

13:30.800 --> 13:37.560
If you remember, exploitation is defined by causing unintended behavior. With CFI,

13:37.760 --> 13:46.320
even if I have a vulnerability, and even if I place a ROP chain that actually works, because

13:46.320 --> 13:52.660
CFI is limiting us to certain allowed execution paths,

13:53.220 --> 13:54.540
then my ROP chain,

13:54.940 --> 13:59.300
if it violates the legitimate execution paths, is going to be blocked.

13:59.700 --> 14:04.660
But if I build a ROP chain so that it does not violate

14:04.660 --> 14:08.500
the legitimate execution path, then

14:08.860 --> 14:17.020
on one hand, my ROP chain works and runs, but on the other hand, the behavior is pretty much

14:17.060 --> 14:19.540
expected because it's legitimate.

14:20.140 --> 14:23.140
And this is really the key element to understand here.

14:24.180 --> 14:33.740
CFI makes sure that all the branches, jumps, and return addresses within the code lead to legitimate,

14:33.740 --> 14:36.060
expected code flows.

14:37.820 --> 14:41.300
So how does it prevent ROP attacks? Exactly as we discussed before.

14:41.900 --> 14:43.620
And let's talk about an example.

14:43.620 --> 14:48.980
Without CFI, buffer overflows can redirect execution to gadgets. With CFI,

14:49.020 --> 14:54.560
the system detects the invalid return address and terminates, stopping the attack.

14:56.240 --> 15:00.040
Now, step by step, how we calculate what is legitimate.

15:00.360 --> 15:04.720
So the first stage of CFI is static analysis before execution.

15:05.080 --> 15:11.920
This is where the compiler constructs a control flow graph representing all valid control transfers.

15:11.960 --> 15:17.800
This sounds complicated, but since software is deterministic, we can actually create that graph pretty

15:17.800 --> 15:18.560
reliably.

15:19.280 --> 15:25.240
Then CFI instruments the code, inserting checks before each indirect control transfer, like function

15:25.240 --> 15:31.800
pointers, vtables, indirect jumps like the branch we saw before, and so on.

15:33.120 --> 15:36.920
Then CFI assigns labels to indirect calls and returns.

15:36.920 --> 15:42.520
It means that valid function entry points are tagged with metadata or labels.

15:43.160 --> 15:51.510
Then at runtime, whenever we jump or try to do a control transfer, we validate that we are

15:51.510 --> 15:55.630
going to jump to something that is labeled, that is allowed.

15:56.870 --> 15:59.430
Then there is actually runtime enforcement.

15:59.470 --> 16:04.190
CFI verifies each indirect branch against the predefined valid targets.

16:04.710 --> 16:12.270
If the redirect is to an unexpected destination, the program is immediately terminated, thus stopping

16:12.470 --> 16:13.590
the exploitation.
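
NOTE
Editor's sketch, not part of the recording: the idea of checking each indirect transfer against
a precomputed set of valid targets, modeled in plain Python. Real CFI is emitted by the compiler
as machine-level checks. All names here (ALLOWED_TARGETS, CfiViolation, checked_indirect_call)
are hypothetical and only illustrate the function A calls B and C example.
```python
# Hypothetical control flow graph: each call site maps to its permitted targets.
ALLOWED_TARGETS = {"function_a": {"function_b", "function_c"}}
# Hypothetical code in memory, including a fragment an attacker would like to reach.
FUNCTIONS = {
    "function_b": lambda: "b ran",
    "function_c": lambda: "c ran",
    "gadget_mid_function": lambda: "gadget ran",
}
class CfiViolation(Exception):
    """Raised in place of terminating the program."""
def checked_indirect_call(call_site, target):
    """Run target only if the graph permits this call site to reach it."""
    if target not in ALLOWED_TARGETS.get(call_site, set()):
        raise CfiViolation(call_site + " may not transfer control to " + target)
    return FUNCTIONS[target]()
```
For example, checked_indirect_call("function_a", "function_b") runs normally, while asking for "gadget_mid_function" raises CfiViolation before any of that code executes.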

16:15.910 --> 16:24.030
So in the wild, CFI is very much used in OS-level security: Windows, Linux, Android.

16:24.070 --> 16:29.150
It's implemented in modern compilers, so you can choose to enable it for your systems or code.

16:29.790 --> 16:31.230
There are some limitations though.

16:31.270 --> 16:34.150
Performance overhead is one of the biggest ones.

16:34.950 --> 16:39.870
It can slow down execution, and sometimes that's not acceptable.

16:40.670 --> 16:48.990
There are some bypass techniques, like using memory leaks or exploiting trusted execution flows,

16:49.090 --> 16:53.730
still in a way that is unintended. And it requires developer adoption.

16:53.730 --> 16:56.170
So we must enable it at compile time.

16:56.570 --> 17:01.450
And it's hard to enable on third-party code, especially binary-only third-party code,

17:02.650 --> 17:09.330
since CFI calculates the CFG based on source code.

17:11.170 --> 17:19.770
Now, just a quick example of what an indirect branch looks like on ARM when CFI is actually enabled.

17:20.130 --> 17:27.930
So let's take this branch, for example, to x0, which contains the address of the target function.

17:28.690 --> 17:31.290
There are no checks here in the branch itself.

17:31.290 --> 17:36.810
But when we go to the function and branch to that address, you see this BTI here.

17:37.210 --> 17:39.250
This is actually the protection tag.

17:39.250 --> 17:45.050
And this shows the system that it's okay to branch to this location.

17:45.490 --> 17:50.230
But let's say we overwrite x0 with the address of a JOP gadget.

17:50.230 --> 17:53.350
Let's say we put in the address of this command, mov

17:53.630 --> 18:02.790
x29, sp. Then, since there is no BTI here, CFI will stop and say no, you cannot jump here, because

18:02.790 --> 18:05.590
it's not an allowed location.

18:06.350 --> 18:11.910
So this is what it looks like after all the steps we discussed are applied.
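
NOTE
Editor's sketch, not part of the recording: a simplified Python model of the landing-pad idea.
Real BTI is enforced by the CPU on guarded pages and distinguishes variants such as bti c and bti j.
This model only checks whether the target instruction is a bti at all. The addresses and the
instruction listing are hypothetical.
```python
# Toy model of a guarded code page: address mapped to instruction text.
CODE = {
    0x4000: "bti c",                      # landing pad at the entry of the target function
    0x4004: "stp x29, x30, [sp, #-16]!",
    0x4008: "mov x29, sp",                # mid-function instruction, no landing pad
    0x400C: "ret",
}
class BranchTargetFault(Exception):
    """Stands in for the fault raised on a bad indirect branch."""
def indirect_branch(target):
    """Model a branch through x0: allowed only if it lands on a bti instruction."""
    instruction = CODE.get(target, "")
    if not instruction.startswith("bti"):
        raise BranchTargetFault("no BTI landing pad at " + hex(target))
    return target
```
Branching to 0x4000 succeeds, while pointing x0 at the mov x29, sp instruction at 0x4008 raises BranchTargetFault.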

18:13.830 --> 18:14.910
What's next?

18:14.910 --> 18:18.030
So you're going to write your first ROP exploit.

18:18.550 --> 18:20.150
And that will be exciting.

18:20.150 --> 18:25.470
It looks like magic at first, but then it works and it's pretty cool.

18:26.270 --> 18:32.510
And next in line we're going to learn about the art of heap feng shui or heap shaping.

18:33.310 --> 18:34.270
Stay tuned.

18:34.310 --> 18:37.630
It's going to be interesting and practical.

18:38.190 --> 18:43.030
Thank you so much and see you in section four.