WEBVTT

00:04.920 --> 00:09.660
So, guys, now in this section of the course, we will start with the thread synchronization.

00:11.100 --> 00:18.240
So thread synchronization is the hardest and the most important aspect of multi threading thread Synchronization

00:18.240 --> 00:26.370
is required in multi threaded programs where multi threads competes to perform conflicting operations

00:26.370 --> 00:28.020
on a shared resource.

00:28.050 --> 00:32.640
Now a conflicting operations is a read write operation or write write operation.

00:32.640 --> 00:33.390
Right?

00:33.390 --> 00:40.200
So consider you have a process P and this process P has two threads the thread T1 and the thread T2

00:40.200 --> 00:40.920
right.

00:40.920 --> 00:47.400
And both the threads, T1 and T2 are running concurrently and trying to perform some operation on a

00:47.400 --> 00:50.880
shared resource represented by this letter D Right.

00:52.020 --> 00:57.720
Now, this letter D is a shared resource, which could be a heap data structure, which could be a global

00:57.720 --> 01:03.690
variable, which could be some file descriptor, let's say open file, or it can represent socket,

01:03.690 --> 01:09.900
or it could be some data which has been received from external sources via multiple inlets.

01:10.470 --> 01:16.920
So if thread T1, for example, let's say the thread T1 is performing read operation on a shared resource

01:16.950 --> 01:22.320
T and the thread T2 is trying to perform a write operation on a shared resource D.

01:22.680 --> 01:28.140
Then these two operations will set to be a conflicting operations, right?

01:30.450 --> 01:34.410
Similarly write write operation are also conflicting operations.

01:34.410 --> 01:40.590
But if the two threads or more than two threads try to perform only read operations on a shared resource,

01:41.100 --> 01:45.090
then those read operations are not conflicting operations.

01:45.210 --> 01:50.400
So only read operation is not a conflicting operation, but read, write or write.

01:50.400 --> 01:50.610
Write.

01:50.610 --> 01:52.910
Operations are conflicting operations.

01:52.920 --> 01:58.650
So let us first try to understand the problems we would have if we don't have a thread synchronization,

01:58.650 --> 01:59.310
right?

01:59.310 --> 02:02.820
So let us try to discuss some sort of problems which will have.

02:02.970 --> 02:10.080
So consider a multi threaded process P which have two threads the thread T1 and thread T to also consider

02:10.080 --> 02:13.860
that a process P maintains a linked list of integers.

02:13.860 --> 02:19.140
So let us say there is a process P and one of the internal data structure of the process P is a linked

02:19.140 --> 02:20.550
list of some integers.

02:20.550 --> 02:21.090
Right?

02:21.750 --> 02:28.320
So now the two threads t one and T two can be scheduled on the CPUs of the machine in any order.

02:28.350 --> 02:33.870
You cannot assume any determinism or pattern of scheduling of threads on the CPU.

02:33.930 --> 02:40.320
Your operating system schedule the threads on the available CPUs of your machine as per their scheduling

02:40.320 --> 02:42.690
algorithm while writing a program.

02:42.690 --> 02:48.840
You just cannot assume or make any assumption that the threads of your program will going to be scheduled

02:48.840 --> 02:50.020
in so and so order.

02:50.040 --> 02:52.110
You just cannot assume anything.

02:52.320 --> 03:00.060
So now let us consider that the thread t one is executing this function on a link list and the thread

03:00.090 --> 03:03.450
t two is executing this particular function on a link list.

03:03.450 --> 03:05.010
That is the same link list.

03:05.040 --> 03:11.360
Now the first thread is trying to look up a node from the link list based on some integer input value.

03:11.370 --> 03:12.080
Right.

03:12.090 --> 03:19.080
So the function is get node from the list and the thread t one simply invokes some internal functions.

03:19.080 --> 03:25.360
Let us say search node which walks over this link list and find the matching node and return that node

03:25.360 --> 03:26.080
right.

03:26.110 --> 03:31.330
Similarly, thread T two is performing a delete operation on the same link list.

03:31.360 --> 03:37.540
That is, it searches the node in the link list and once the node is found it removes the node from

03:37.540 --> 03:40.240
the link list and free the node memory.

03:40.240 --> 03:40.990
Right.

03:41.020 --> 03:47.980
So you can assume that the thread t1 and thread t two are executing concurrently on different CPUs of

03:47.980 --> 03:48.790
the machine.

03:48.790 --> 03:49.540
Right.

03:49.900 --> 03:56.410
Now let us try to see what could go wrong if the thread t one and t two are allowed to execute concurrently

03:56.410 --> 04:01.420
on different CPUs of the machine without having deployed any thread synchronization.

04:01.900 --> 04:08.080
So let us suppose that the thread t one has successfully executed the instruction I1 and i2, but it

04:08.080 --> 04:10.150
has not yet executed the instruction.

04:10.150 --> 04:11.710
I4 Right.

04:12.070 --> 04:18.490
So it means that the thread T1 has executed an instruction i1 i2 or let us say i3, but instruction

04:18.490 --> 04:24.760
I4 is still pending and then the context switching happens and the thread t2 is loaded on the CPU.

04:24.790 --> 04:25.570
Right?

04:26.350 --> 04:33.070
So now when the threat T2 is loaded on the CPU, let us assume that threat T2 has executed all the instruction

04:33.070 --> 04:37.600
that is instruction i6 i7 i8 I9 and I10.

04:37.720 --> 04:38.410
Right.

04:38.440 --> 04:46.060
Let us also assume that the threat T1 invoked its function on the input value, say five right and the

04:46.060 --> 04:52.660
threat T2 was also invoked its function delete node from the list with the input value say five.

04:52.900 --> 04:59.980
It simply means that while threat T1 is trying to look up a node from the link list whose value is five,

04:59.980 --> 05:05.860
whereas threat T2 is trying to delete a same node from the link list whose value is five.

05:05.860 --> 05:06.610
Right.

05:08.570 --> 05:15.410
Now you can see that the thread two when executed it deleted the node from the link list and eventually

05:15.410 --> 05:16.680
freed the node memory.

05:16.700 --> 05:17.430
Right.

05:17.450 --> 05:24.830
Now let us suppose that once the thread T2 has completed its execution, the thread T1 is loaded again

05:24.830 --> 05:25.880
on the CPU.

05:25.910 --> 05:31.460
Now the thread T1 will resume its execution from the same point where it had left.

05:31.490 --> 05:38.480
So now the thread T1 will try to execute the instruction I4 and it will try to return a pointer to the

05:38.480 --> 05:39.020
node.

05:39.050 --> 05:43.880
But now remember the thread T2 has already freed that particular memory.

05:43.910 --> 05:50.120
It means that thread T1 is now returning an address of the memory which is already freed.

05:50.150 --> 05:50.900
Right.

05:50.900 --> 05:57.740
So the caller function which invoked this API was going to crash because it will go into deal with the

05:57.740 --> 05:59.740
address which is already freed.

05:59.750 --> 06:00.470
Right.

06:01.700 --> 06:07.310
So you can see that the thread to perform the right operation on the data structure link list, whereas

06:07.310 --> 06:12.470
the thread T1 is trying to perform read operation on the same data structure that is same link list.

06:12.470 --> 06:16.790
So this is the example where read and write operations are conflicting with each other.

06:17.730 --> 06:19.620
Any program which deploys the threat.

06:19.650 --> 06:27.180
T1 and T2, which performs the conflicting read and write operations could show any undefined behavior

06:27.180 --> 06:29.040
and can crash any time.

06:29.430 --> 06:35.280
So concurrent access of the shared data structures between multiple threads opens a window during which

06:35.280 --> 06:36.930
data is inconsistent.

06:36.930 --> 06:43.440
And the root cause is concurrency because you have allowed the two threads T1 and thread T2 to execute

06:43.440 --> 06:49.350
in a concurrent manner on different CPUs without thread synchronization techniques.

06:49.350 --> 06:56.220
Your program has been exposed to concurrency problems and this is one of many examples of data inconsistency

06:56.250 --> 07:01.890
due to allowing the execution of multiple threads without thread synchronization techniques.

07:01.890 --> 07:07.890
So the reason in code where shared data is accessed by multiple threads are called critical sections.

07:07.890 --> 07:13.530
So in this slide, the yellow color region which you are seeing on the slide are the critical section.

07:13.530 --> 07:17.830
This is the critical section and this is also the critical section.

07:17.830 --> 07:18.280
Why?

07:18.280 --> 07:23.920
It is a critical section because your threads are trying to access the data structure which is shared

07:23.920 --> 07:25.690
between two threads here.

07:25.690 --> 07:28.480
That data structure is this link list, right?

07:30.030 --> 07:36.600
So our goal is that that we should still allow the threads to execute in a concurrent manner, while

07:36.600 --> 07:41.880
at the same time we need to ensure that multiple threads access the shared resources of the process

07:41.880 --> 07:43.380
in a consistent manner.

07:43.650 --> 07:49.470
This is one example where Thread P1 returns the bogus node which was already freed by the thread t2

07:49.500 --> 07:50.220
Right.
