Introduction to Linux Threads

A thread of execution is often regarded as the smallest unit of processing that a scheduler works on.

A process can have multiple threads of execution which are executed asynchronously.

This asynchronous execution brings in the capability of each thread handling a particular work or service independently. Hence multiple threads running in a process handle their services which overall constitutes the complete capability of the process.

In this article we will touch base on the fundamentals of threads and build the basic understanding required to learn the practical aspects of Linux threads.

Linux Threads Series: part 1 (this article), part 2part 3.

Why Threads are Required?

Now, one would ask why do we need multiple threads in a process?? Why can’t a process with only one (default) main thread be used in every situation.

Well, to answer this lets consider an example :

Suppose there is a process, that receiving real time inputs and corresponding to each input it has to produce a certain output. Now, if the process is not multi-threaded ie if the process does not involve multiple threads, then the whole processing in the process becomes synchronous. This means that the process takes an input processes it and produces an output.

 

The limitation in the above design is that the process cannot accept an input until its done processing the earlier one and in case processing an input takes longer than expected then accepting further inputs goes on hold.

To consider the impact of the above limitation, if we map the generic example above with a  socket server process that can accept input connection, process them and provide the socket client with output. Now, if in processing any input if the server process takes more than expected time and in the meantime another input (connection request) comes to the socket server then the server process would not be able to accept the new input connection as its already stuck in processing the old input connection. This may lead to a connection time out at the socket client which is not at all desired.

This shows that synchronous model of execution cannot be applied everywhere and hence was the requirement of asynchronous model of execution felt which is implemented by using threads.

Difference Between threads and processes

Following are some of the major differences between the thread and the processes :

  • Processes do not share their address space while threads executing under same process share the address space.
  • From the above point its clear that processes execute independent of each other and the synchronization between processes is taken care by kernel only while on the other hand the thread synchronization has to be taken care by the process under which the threads are executing
  • Context switching between threads is fast as compared to context switching between processes
  • The interaction between two processes is achieved only through the standard inter process communication while threads executing under the same process can communicate easily as they share most of the resources like memory, text segment etc

User threads Vs Kernel Threads

Threads can exist in user space as well as in kernel space.

user space threads are created, controlled and destroyed using user space thread libraries. These threads are not known to kernel and hence kernel is nowhere involved in their processing. These threads follow co-operative multitasking where-in a thread releases CPU on its own wish ie the scheduler cannot preempt the thread. Th advantages of user space threads is that the switching between two threads does not involve much overhead and is generally very fast while on the negative side since these threads follow co-operative multitasking so if one thread gets block the whole process gets blocked.

kernel space thread is created, controlled and destroyed by the kernel. For every thread that exists in user space there is a corresponding kernel thread. Since these threads are managed by kernel so they follow preemptive multitasking where-in the scheduler can preempt a thread in execution with a higher priority thread which is ready for execution. The major advantage of kernel threads is that even if one of the thread gets blocked the whole process is not blocked as kernel threads follow preemptive scheduling while on the negative side the context switch is not very fast as compared to user space threads.

If we talk of Linux then kernel threads are optimized to such an extent that they are considered better than user space threads and mostly used in all scenarios except where prime requirement is that of cooperative multitasking.

Problem with Threads

There are some major problems that arise while using threads :

  • Many operating system does not implement threads as processes rather they see threads as part of parent process. In this case, what would happen if a thread calls fork() or even worse what if a thread execs a new binary?? These scenarios may have dangerous consequences for example in the later problem the whole parent process could get replaced with the address space of the newly exec’d binary. This is not at all desired.  Linux which is POSIX complaint makes sure that calling a fork() duplicates only the thread that has called the fork() function while an exec from any of the thread would stop all the threads in the parent process.
  • Another problem that may arise is the concurrency problems. Since threads share all the segments (except the stack segment) and can be preempted at any stage by the scheduler than any global variable or data structure that can be left in inconsistent state by preemption of one thread could cause severe problems when the next high priority thread executes the same function and uses the same variables or data structures.

For the problem 1 mentioned above, all we can say is that its a design issue and design for applications should be done in a way that least problems of this kind arise.

For the problem 2 mentioned above, using locking mechanisms programmer can lock a chunk of code inside a function so that even if a context switch happens (when the function global variable and data structures were in inconsistent state) then also next thread is not able to execute the same code until the locked code block inside the function is unlocked by the previous thread (or the thread that acquired it).

How to Create Threads in Linux (With a C Example Program)

In the part I of the Linux Threads series, we discussed various aspects related to threads in Linux.

In this article we will focus on how a thread is created and identified. We will also present a working C program example that will explain how to do basic threaded programming.

Linux Threads Series: part 1, part 2 (this article), part 3.

Thread Identification

Just as a process is identified through a process ID, a thread is identified by a thread ID. But interestingly, the similarity between the two ends here.

  • A process ID is unique across the system where as a thread ID is unique only in context of a single process.
  • A process ID is an integer value but the thread ID is not necessarily an integer value. It could well be a structure
  • A process ID can be printed very easily while a thread ID is not easy to print.

The above points give an idea about the difference between a process ID and thread ID.

Thread ID is represented by the type ‘pthread_t’.  As we already discussed that in most of the cases this type is a structure, so there has to be a function that can compare two thread IDs.

#include <pthread.h>
int pthread_equal(pthread_t tid1, pthread_t tid2);

So as you can see that the above function takes two thread IDs and returns nonzero value if both the thread IDs are equal or else it returns zero.

Another case may arise when a thread would want to know its own thread ID. For this case the following function provides the desired service.

 
#include <pthread.h>
pthread_t pthread_self(void);

So we see that the function ‘pthread_self()’ is used by a thread for printing its own thread ID.

Now, one would ask about the case where the above two function would be required. Suppose there is a case where a link list contains data for different threads. Every node in the list contains a thread ID and the corresponding data. Now whenever a thread tries to fetch its data from linked list, it first gets its own ID by calling ‘pthread_self()’ and then it calls the ‘pthread_equal()’ on every node to see if the node contains data for it or not.

An example of the generic case discussed above would be the one in which a master thread gets the jobs to be processed and then it pushes them into a link list. Now individual worker threads parse the linked list and extract the job assigned to them.

Thread Creation

Normally when a program starts up and becomes a process, it starts with a default thread. So we can say that every process has at least one thread of control.  A process can create extra threads using the following function :

#include <pthread.h>
int pthread_create(pthread_t *restrict tidp, const pthread_attr_t *restrict attr, void *(*start_rtn)(void), void *restrict arg)

The above function requires four arguments, lets first discuss a bit on them :

  • The first argument is a pthread_t type address. Once the function is called successfully, the variable whose address is passed as first argument will hold the thread ID of the newly created thread.
  • The second argument may contain certain attributes which we want the new thread to contain.  It could be priority etc.
  • The third argument is a function pointer. This is something to keep in mind that each thread starts with a function and that functions address is passed here as the third argument so that the kernel knows which function to start the thread from.
  • As the function (whose address is passed in the third argument above) may accept some arguments also so we can pass these arguments in form of a pointer to a void type. Now, why a void type was chosen? This was because if a function accepts more than one argument then this pointer could be a pointer to a structure that may contain these arguments.

A Practical Thread Example

Following is the example code where we tried to use all the three functions discussed above.

#include<stdio.h>
#include<string.h>
#include<pthread.h>
#include<stdlib.h>
#include<unistd.h> pthread_t tid[2]; void* doSomeThing(void *arg)
{
unsigned long i = 0;
pthread_t id = pthread_self(); if(pthread_equal(id,tid[0]))
{
printf("\n First thread processing\n");
}
else
{
printf("\n Second thread processing\n");
} for(i=0; i<(0xFFFFFFFF);i++); return NULL;
} int main(void)
{
int i = 0;
int err; while(i < 2)
{
err = pthread_create(&(tid[i]), NULL, &doSomeThing, NULL);
if (err != 0)
printf("\ncan't create thread :[%s]", strerror(err));
else
printf("\n Thread created successfully\n"); i++;
} sleep(5);
return 0;
}

So what this code does is :

  • It uses the pthread_create() function to create two threads
  • The starting function for both the threads is kept same.
  • Inside the function ‘doSomeThing()’, the thread uses pthread_self() and pthread_equal() functions to identify whether the executing thread is the first one or the second one as created.
  • Also, Inside the same function ‘doSomeThing()’ a for loop is run so as to simulate some time consuming work.

Now, when the above code is run, following was the output :

$ ./threads
Thread created successfully
First thread processing
Thread created successfully
Second thread processing

As seen in the output, first thread is created and it starts processing, then the second thread is created and then it starts processing. Well one point to be noted here is that the order of execution of threads is not always fixed. It depends on the OS scheduling algorithm.

Note: The whole explanation in this article is done on Posix threads. As can be comprehended from the type, the pthread_t type stands for POSIX threads. If an application wants to test whether POSIX threads are supported or not, then the application can use the macro _POSIX_THREADS for compile time test. To compile a code containing calls to posix APIs, please use the compile option ‘-pthread’.

How to Terminate a Thread in C Program ( pthread_exit Example )

In the part-II (Thread creation and Identification) of the Linux Thread series, we discussed about thread IDs, how to compare two thread IDs and how to create a thread.

In this article we will mainly focus on how a thread is terminated.

Linux Threads Series: part 1part 2, part 3 (this article).

C Thread Example Program

If we take the same example as discussed in part-II of this series :

#include<stdio.h>
#include<string.h>
#include<pthread.h>
#include<stdlib.h>
#include<unistd.h> pthread_t tid[2]; void* doSomeThing(void *arg)
{
unsigned long i = 0;
pthread_t id = pthread_self(); if(pthread_equal(id,tid[0]))
{
printf("\n First thread processing\n");
}
else
{
printf("\n Second thread processing\n");
} for(i=0; i<(0xFFFFFFFF);i++); return NULL;
} int main(void)
{
int i = 0;
int err; while(i < 2)
{
err = pthread_create(&(tid[i]), NULL, &doSomeThing, NULL);
if (err != 0)
printf("\ncan't create thread :[%s]", strerror(err));
else
printf("\n Thread created successfully\n"); i++;
} sleep(5);
return 0;
}

Did you observe the ‘sleep()’ function being used? Did you get a question about why sleep() is being used? Well if you did then you are at the correct place to get the answer and if you did not then also its going to be a good read ahead.

If I remove the sleep() function from the code above and then try to compile and run it, I see the following output :

$ ./threads
Thread created successfully
First thread processing
Thread created successfully

But if I run it with sleep() enabled then I see the output as  :

$ ./threads
Thread created successfully
First thread processing
Thread created successfully
Second thread processing

So we see that the log ‘Second thread processing’ is missing in case we remove the sleep() function.

 

So, why does this happen? Well, this happened because just before the second thread is about to be scheduled, the parent thread (from which the two threads were created) completed its execution. This means that the default thread in which the main() function was running got completed and hence the process terminated as main() returned.

Thread Termination

As already discussed above that each program starts with at least one thread which is the thread in which main() function is executed. So maximum lifetime of every thread executing in the program is that of the main thread. So, if we want that the main thread should wait until all the other threads are finished then there is a function pthread_join().

#include <pthread.h>
int pthread_join(pthread_t thread, void **rval_ptr);

The function above makes sure that its parent thread does not terminate until it is done. This function is called from within the parent thread and the first argument is the thread ID of the thread to wait on and the second argument is the return value of the thread on which we want to the parent thread to wait. If we are not interested in the return value then we can set this pointer to be NULL.

If we classify on a broader level, then we see that a thread can terminate in three ways :

  1. If the thread returns from its start routine.
  2. If it is canceled by some other thread. The function used here is pthread_cancel().
  3. If its calls pthread_exit() function from within itself.

The focus here would be on pthread_exit(). Its prototype is as follows :

#include <pthread.h>
void pthread_exit(void *rval_ptr);

So we see that this function accepts only one argument, which is the return from the thread that calls this function. This return value is accessed by the parent thread which is waiting for this thread to terminate. The return value of the thread terminated by pthread_exit() function is accessible in the second argument of the pthread_join which just explained above.

C Thread Termination Example

Lets take an example where we use the above discussed functions :

#include<stdio.h>
#include<string.h>
#include<pthread.h>
#include<stdlib.h>
#include<unistd.h> pthread_t tid[2];
int ret1,ret2; void* doSomeThing(void *arg)
{
unsigned long i = 0;
pthread_t id = pthread_self(); for(i=0; i<(0xFFFFFFFF);i++); if(pthread_equal(id,tid[0]))
{
printf("\n First thread processing done\n");
ret1  = 100;
pthread_exit(&ret1);
}
else
{
printf("\n Second thread processing done\n");
ret2  = 200;
pthread_exit(&ret2);
} return NULL;
} int main(void)
{
int i = 0;
int err;
int *ptr[2]; while(i < 2)
{
err = pthread_create(&(tid[i]), NULL, &doSomeThing, NULL);
if (err != 0)
printf("\ncan't create thread :[%s]", strerror(err));
else
printf("\n Thread created successfully\n"); i++;
} pthread_join(tid[0], (void**)&(ptr[0]));
pthread_join(tid[1], (void**)&(ptr[1])); printf("\n return value from first thread is [%d]\n", *ptr[0]);
printf("\n return value from second thread is [%d]\n", *ptr[1]); return 0;
}

In the code above :

  • We created two threads using pthread_create()
  • The start function for both the threads is same ie doSomeThing()
  • The threads exit from the start function using the pthread_exit() function with a return value.
  • In the main function after the threads are created, the pthread_join() functions are called to wait for the two threads to complete.
  • Once both the threads are complete, their return value is accessed by the second argument in the pthread_join() call.

The output of the above code comes out as  :

$ ./threads
Thread created successfully
Thread created successfully
First thread processing done
Second thread processing done
return value from first thread is [100]
return value from second thread is [200]

So we see that both the threads execute completely and their return value is accessed in the main function.

本篇教程摘抄自:https://www.thegeekstuff.com/2012/03/linux-threads-intro/

Introduction to Linux Threads的更多相关文章

  1. A Quick Introduction to Linux Policy Routing

    A Quick Introduction to Linux Policy Routing 29 May 2013 In this post, I’m going to introduce you to ...

  2. [转帖]Introduction to Linux monitoring and alerting

    Introduction to Linux monitoring and alerting https://www.redhat.com/sysadmin/linux-monitoring-and-a ...

  3. Linux LSM(Linux Security Modules) Hook Technology

    目录 . 引言 . Linux Security Module Framework Introduction . LSM Sourcecode Analysis . LSMs Hook Engine: ...

  4. Linux posix线程库总结

    由于历史原因,2.5.x以前的linux对pthreads没有提供内核级的支持,所以在linux上的pthreads实现只能采用n:1的方式,也称为库实现. 线程的实现,经历了如下发展阶段: Linu ...

  5. 【转载】Linux启动过程

    转自:http://cizixs.com/2015/01/18/linux-boot-process 简介 我们都知道:操作系统运行的代码是在硬盘上的,最终要跑到内存和 CPU 上,才能被我们使用. ...

  6. 《linux内核设计与实现》读书笔记第三章

    第3章 进程管理 3.1 进程 1.进程 进程就是处于执行期的程序. 进程包括: 可执行程序代码 打开的文件 挂起的信号 内核内部数据 处理器状态 一个或多个具有内存映射的内存地址空间 一个或多个执行 ...

  7. Docker基础技术:Linux Namespace(下)

    在 Docker基础技术:Linux Namespace(上篇)中我们了解了,UTD.IPC.PID.Mount 四个namespace,我们模仿Docker做了一个相当相当山寨的镜像.在这一篇中,主 ...

  8. NDK(5) Android JNI官方综合教程[JavaVM and JNIEnv,Threads ,jclass, jmethodID, and jfieldID,UTF-8 and UTF-16 Strings,Exceptions,Native Libraries等等]

    JNI Tips In this document JavaVM and JNIEnv Threads jclass, jmethodID, and jfieldID Local and Global ...

  9. Linux多线程(三)(同步互斥)

    1. 线程的同步与互斥 1.1. 线程的互斥 在Posix Thread中定义了一套专门用于线程互斥的mutex函数.mutex是一种简单的加锁的方法来控制对共享资源的存取,这个互斥锁只有两种状态(上 ...

随机推荐

  1. HTTP、HTTP2.0、SPDY、HTTPS 你应该知道的一些事

    参考: https://www.cnblogs.com/wujiaolong/p/5172e1f7e9924644172b64cb2c41fc58.html

  2. DDD(Domain Driven Design) 架构设计

    一.为什么要分层 分层架构是所有架构的鼻祖,分层的作用就是隔离,不过,我们有时候有个误解,就是把层和程序集对应起来,就比如简单三层架构中,在你的解决方案中,一般会有三个程序集项目:XXUI.dll.X ...

  3. Spring Boot启动命令参数详解及源码分析

    使用过Spring Boot,我们都知道通过java -jar可以快速启动Spring Boot项目.同时,也可以通过在执行jar -jar时传递参数来进行配置.本文带大家系统的了解一下Spring ...

  4. TaskTimer

    什么是调度 任务:就是事情 调度:在不同的时间点或者在指定的时间点或者间隔多长时间去运行这个任务.就是生活中的闹钟 相关的类Timer 类:位于 java.util 包中 案例 实现时间的动态刷新 任 ...

  5. 可靠性、幂等性和事务 Kafka

    Kafka笔记—可靠性.幂等性和事务   分类: 消息队列 标签: kafka 这几天很忙,但是我现在给我的要求是一周至少要出一篇文章,所以先拿这篇笔记来做开胃菜,源码分析估计明后两天应该能写一篇.给 ...

  6. PG数据库CPU和内存满负荷运转优化案

    1.问题描述 某客户系统采用三层架构:数据库—应用服务—前端应用.其中数据库使用PostgreSQL 10.0作为数据库软件.自周四起,服务器的CPU与内存使用率持续处于过饱合状态,并因此导致了数次宕 ...

  7. CentOS7安装RabbitMQ,并设置远程访问

      如果网速慢 可以直接到百度云分享中下载,然后拉到centerOS中,进行第二步即可    两个人安装包地址,提取码:z1oz 1.安装erlang环境 wget http://www.rabbit ...

  8. SQL系列(四)—— 唯一值(distinct)

    有时需要查询某列上的不重复的数据,如: SELECT name FROM student; 结果: name lxy lxy lxy lxy 这样的结果显然不符合我们的需求.如何对列数据进行去重,查询 ...

  9. bootstrap table--面相配置、hook、适配的表格框架

    bootstrap table--面相配置.hook.适配的表格框架

  10. VS代码调试出现:当前不会命中断点。还没有为该文档加载任何符号。

    第一步:一定要检查最顶部自己设置的是 Release模式还是Debug模式!!!下面这个图就是在我搜了好多解决方式之后,突然发现自己开的是Release模式!!!吐血. 第二步:如果你已经确定了自己是 ...