ref:http://www.ibm.com/developerworks/linux/library/l-linux-kernel/?S_TACT=105AGX52&S_CMP=cn-a-l

Given that the goal of this article is to introduce you to the Linux kernel and explore its architecture and major components, let's start with a short tour of Linux kernel history, then look at the Linux kernel architecture from 30,000 feet, and, finally, examine its major subsystems. The Linux kernel is over six million lines of code, so this introduction is not exhaustive. Use the pointers to more content to dig in further.

A short tour of Linux history

Linux or GNU/Linux?

You've probably noticed that Linux as an operating system is referred to in some cases as "Linux" and in others as "GNU/Linux." The reason behind this is that Linux is the kernel of an operating system. The wide range of applications that make the operating system useful are the GNU software. For example, the windowing system, compiler, variety of shells, development tools, editors, utilities, and other applications exist outside of the kernel, many of which are GNU software. For this reason, many consider "GNU/Linux" a more appropriate name for the operating system, while "Linux" is appropriate when referring to just the kernel.

While Linux is arguably the most popular open source operating system, its history is actually quite short considering the timeline of operating systems. In the early days of computing, programmers developed on the bare hardware in the hardware's language. The lack of an operating system meant that only one application (and one user) could use the large and expensive device at a time. Early operating systems were developed in the 1950s to provide a simpler development experience. Examples include the General Motors Operating System (GMOS) developed for the IBM 701 and the FORTRAN Monitor System (FMS) developed by North American Aviation for the IBM 709.

In the 1960s, Massachusetts Institute of Technology (MIT) and a host of companies developed an experimental operating system called Multics (or Multiplexed Information and Computing Service) for the GE-645. One of the developers of this operating system, AT&T, dropped out of Multics and developed their own operating system in 1970 called Unics. Along with this operating system was the C language, for which C was developed and then rewritten to make operating system development portable.

Twenty years later, Andrew Tanenbaum created a microkernel version of UNIX®, called MINIX (for minimal UNIX), that ran on small personal computers. This open source operating system inspired Linus Torvalds' initial development of Linux in the early 1990s (see Figure 1).

Figure 1. Short history of major Linux kernel releases

Linux quickly evolved from a single-person project to a world-wide development project involving thousands of developers. One of the most important decisions for Linux was its adoption of the GNU General Public License (GPL). Under the GPL, the Linux kernel was protected from commercial exploitation, and it also benefited from the user-space development of the GNU project (of Richard Stallman, whose source dwarfs that of the Linux kernel). This allowed useful applications such as the GNU Compiler Collection (GCC) and various shell support.

 

Back to top

Introduction to the Linux kernel

Now on to a high-altitude look at the GNU/Linux operating system architecture. You can think about an operating system from two levels, as shown in Figure 2.

Figure 2. The fundamental architecture of the GNU/Linux operating system

Methods for system call interface (SCI)

In reality, the architecture is not as clean as what is shown in Figure 2. For example, the mechanism by which system calls are handled (transitioning from the user space to the kernel space) can differ by architecture. Newer x86 central processing units (CPUs) that provide support for virtualization instructions are more efficient in this process than older x86 processors that use the traditional int 80h method.

At the top is the user, or application, space. This is where the user applications are executed. Below the user space is the kernel space. Here, the Linux kernel exists.

There is also the GNU C Library (glibc). This provides the system call interface that connects to the kernel and provides the mechanism to transition between the user-space application and the kernel. This is important because the kernel and user application occupy different protected address spaces. And while each user-space process occupies its own virtual address space, the kernel occupies a single address space. For more information, see the links in the Resources section.

The Linux kernel can be further divided into three gross levels. At the top is the system call interface, which implements the basic functions such as read and write. Below the system call interface is the kernel code, which can be more accurately defined as the architecture-independent kernel code. This code is common to all of the processor architectures supported by Linux. Below this is the architecture-dependent code, which forms what is more commonly called a BSP (Board Support Package). This code serves as the processor and platform-specific code for the given architecture.

 

Back to top

Properties of the Linux kernel

When discussing architecture of a large and complex system, you can view the system from many perspectives. One goal of an architectural decomposition is to provide a way to better understand the source, and that's what we'll do here.

The Linux kernel implements a number of important architectural attributes. At a high level, and at lower levels, the kernel is layered into a number of distinct subsystems. Linux can also be considered monolithic because it lumps all of the basic services into the kernel. This differs from a microkernel architecture where the kernel provides basic services such as communication, I/O, and memory and process management, and more specific services are plugged in to the microkernel layer. Each has its own advantages, but I'll steer clear of that debate.

Over time, the Linux kernel has become efficient in terms of both memory and CPU usage, as well as extremely stable. But the most interesting aspect of Linux, given its size and complexity, is its portability. Linux can be compiled to run on a huge number of processors and platforms with different architectural constraints and needs. One example is the ability for Linux to run on a process with a memory management unit (MMU), as well as those that provide no MMU. The uClinux port of the Linux kernel provides for non-MMU support. See the Resources section for more details.

 

Back to top

Major subsystems of the Linux kernel

Now let's look at some of the major components of the Linux kernel using the breakdown shown in Figure 3 as a guide.

Figure 3. One architectural perspective of the Linux kernel

System call interface

The SCI is a thin layer that provides the means to perform function calls from user space into the kernel. As discussed previously, this interface can be architecture dependent, even within the same processor family. The SCI is actually an interesting function-call multiplexing and demultiplexing service. You can find the SCI implementation in ./linux/kernel, as well as architecture-dependent portions in ./linux/arch. More details for this component are available in the Resources section.

Process management

What is a kernel?

As shown in Figure 3, a kernel is really nothing more than a resource manager. Whether the resource being managed is a process, memory, or hardware device, the kernel manages and arbitrates access to the resource between multiple competing users (both in the kernel and in user space).

Process management is focused on the execution of processes. In the kernel, these are called threads and represent an individual virtualization of the processor (thread code, data, stack, and CPU registers). In user space, the term process is typically used, though the Linux implementation does not separate the two concepts (processes and threads). The kernel provides an application program interface (API) through the SCI to create a new process (fork, exec, or Portable Operating System Interface [POSIX] functions), stop a process (kill, exit), and communicate and synchronize between them (signal, or POSIX mechanisms).

Also in process management is the need to share the CPU between the active threads. The kernel implements a novel scheduling algorithm that operates in constant time, regardless of the number of threads vying for the CPU. This is called the O(1) scheduler, denoting that the same amount of time is taken to schedule one thread as it is to schedule many. The O(1) scheduler also supports multiple processors (called Symmetric MultiProcessing, or SMP). You can find the process management sources in ./linux/kernel and architecture-dependent sources in ./linux/arch). You can learn more about this algorithm in the Resources section.

Memory management

Another important resource that's managed by the kernel is memory. For efficiency, given the way that the hardware manages virtual memory, memory is managed in what are called pages (4KB in size for most architectures). Linux includes the means to manage the available memory, as well as the hardware mechanisms for physical and virtual mappings.

But memory management is much more than managing 4KB buffers. Linux provides abstractions over 4KB buffers, such as the slab allocator. This memory management scheme uses 4KB buffers as its base, but then allocates structures from within, keeping track of which pages are full, partially used, and empty. This allows the scheme to dynamically grow and shrink based on the needs of the greater system.

Supporting multiple users of memory, there are times when the available memory can be exhausted. For this reason, pages can be moved out of memory and onto the disk. This process is called swapping because the pages are swapped from memory onto the hard disk. You can find the memory management sources in ./linux/mm.

Virtual file system

The virtual file system (VFS) is an interesting aspect of the Linux kernel because it provides a common interface abstraction for file systems. The VFS provides a switching layer between the SCI and the file systems supported by the kernel (see Figure 4).

Figure 4. The VFS provides a switching fabric between users and file systems

At the top of the VFS is a common API abstraction of functions such as open, close, read, and write. At the bottom of the VFS are the file system abstractions that define how the upper-layer functions are implemented. These are plug-ins for the given file system (of which over 50 exist). You can find the file system sources in ./linux/fs.

Below the file system layer is the buffer cache, which provides a common set of functions to the file system layer (independent of any particular file system). This caching layer optimizes access to the physical devices by keeping data around for a short time (or speculatively read ahead so that the data is available when needed). Below the buffer cache are the device drivers, which implement the interface for the particular physical device.

Network stack

The network stack, by design, follows a layered architecture modeled after the protocols themselves. Recall that the Internet Protocol (IP) is the core network layer protocol that sits below the transport protocol (most commonly the Transmission Control Protocol, or TCP). Above TCP is the sockets layer, which is invoked through the SCI.

The sockets layer is the standard API to the networking subsystem and provides a user interface to a variety of networking protocols. From raw frame access to IP protocol data units (PDUs) and up to TCP and the User Datagram Protocol (UDP), the sockets layer provides a standardized way to manage connections and move data between endpoints. You can find the networking sources in the kernel at ./linux/net.

Device drivers

The vast majority of the source code in the Linux kernel exists in device drivers that make a particular hardware device usable. The Linux source tree provides a drivers subdirectory that is further divided by the various devices that are supported, such as Bluetooth, I2C, serial, and so on. You can find the device driver sources in ./linux/drivers.

Architecture-dependent code

While much of Linux is independent of the architecture on which it runs, there are elements that must consider the architecture for normal operation and for efficiency. The ./linux/arch subdirectory defines the architecture-dependent portion of the kernel source contained in a number of subdirectories that are specific to the architecture (collectively forming the BSP). For a typical desktop, the i386 directory is used. Each architecture subdirectory contains a number of other subdirectories that focus on a particular aspect of the kernel, such as boot, kernel, memory management, and others. You can find the architecture-dependent code in ./linux/arch.

 

Back to top

Interesting features of the Linux kernel

If the portability and efficiency of the Linux kernel weren't enough, it provides some other features that could not be classified in the previous decomposition.

Linux, being a production operating system and open source, is a great test bed for new protocols and advancements of those protocols. Linux supports a large number of networking protocols, including the typical TCP/IP, and also extension for high-speed networking (greater than 1 Gigabit Ethernet [GbE] and 10 GbE). Linux also supports protocols such as the Stream Control Transmission Protocol (SCTP), which provides many advanced features above TCP (as a replacement transport level protocol).

Linux is also a dynamic kernel, supporting the addition and removal of software components on the fly. These are called dynamically loadable kernel modules, and they can be inserted at boot when they're needed (when a particular device is found requiring the module) or at any time by the user.

A recent advancement of Linux is its use as an operating system for other operating systems (called a hypervisor). Recently, a modification to the kernel was made called the Kernel-based Virtual Machine (KVM). This modification enabled a new interface to user space that allows other operating systems to run above the KVM-enabled kernel. In addition to running another instance of Linux, Microsoft® Windows® can also be virtualized. The only constraint is that the underlying processor must support the new virtualization instructions. See the Resources section for more information.

 

Back to top

Going further

This article just scratched the surface of the Linux kernel architecture and its features and capabilities. You can check out the Documentation directory that's provided in every Linux distribution for detailed information about the contents of the kernel. Be sure to check out the Resourcessection at the end of this article for more detailed information about many of the topics discussed here.

Anatomy of the Linux kernel--转的更多相关文章

  1. Linux kernel make 常用选项介绍

    Linux kernel 编译方法大全记录 一.这是一个我自己写的自动make脚本: #!/bin/sh export ARCH=arm export CROSS_COMPILE=arm-linux- ...

  2. Linux Kernel代码艺术——系统调用宏定义

    我们习惯在SI(Source Insight)中阅读Linux内核,SI会建立符号表数据库,能非常方便地跳转到变量.宏.函数等的定义处.但在处理系统调用的函数时,却会遇到一些麻烦:我们知道系统调用函数 ...

  3. Linux Kernel 代码艺术——编译时断言

    本系列文章主要写我在阅读Linux内核过程中,关注的比较难以理解但又设计巧妙的代码片段(不关注OS的各个模块的设计思想,此部分我准备写在“深入理解Linux Kernel” 系列文章中),一来通过内核 ...

  4. linux kernel的中断子系统 softirq

    linux kernel的中断子系统之(八):softirq http://www.wowotech.net/irq_subsystem/soft-irq.html http://www.ibm.co ...

  5. karottc A Simple linux-virus Analysis、Linux Kernel <= 2.6.37 - Local Privilege Escalation、CVE-2010-4258、CVE-2010-3849、CVE-2010-3850

    catalog . 程序功能概述 . 感染文件 . 前置知识 . 获取ROOT权限: Linux Kernel <= - Local Privilege Escalation 1. 程序功能概述 ...

  6. linux kernel 字符设备详解

    有关Linux kernel 字符设备分析: 参考:http://blog.jobbole.com/86531/ 一.linux kernel 将设备分为3大类,字符设备,块设备,网络设备. 字符设备 ...

  7. linux kernel 杂谈

    首先介绍一下背景吧,工作三个星期了.复习了一波u-boot,跟了一下事件上报,搞了下平台设备,扣了一个内存检查代码. 想想生活是不是有点无聊.对啊,真的很无聊!!!! 无聊也没有办法啊,所以找点方法去 ...

  8. 与您共享Linux Kernel 4.8分支首个维护版本

    导读 Linux Kernel 4.8正式版于10月2日由Linus Torvalds发布,带来了包括AMDGPU OverDrive支持.NVIDIA Pascal支持.AMDGPU PowerPl ...

  9. Linux Kernel Version Numbering

    Because there are numerous revisions and releases of the Linux kernel and new ones are developed at ...

  10. SourceInsight 精确导入Linux kernel源码的方法

    相信有很多人用 SourceInsight 查看 Linux Kernel 源码,但导入源码时会遇到一些问题.1.如果把整个源码树都导入进去,查看符号定义的时候,会发现有大量重复定义,很难找到正确的位 ...

随机推荐

  1. Hadoop第三天---分布式文件系统HDFS(大数据存储实战)

    1.开机启动Hadoop,输入命令:  检查相关进程的启动情况: 2.对Hadoop集群做一个测试:   可以看到新建的test1.txt和test2.txt已经成功地拷贝到节点上(伪分布式只有一个节 ...

  2. geeksforgeeks@ Find sum of different corresponding bits for all pairs (Bit manipulation)

    http://www.practice.geeksforgeeks.org/problem-page.php?pid=387 Find sum of different corresponding b ...

  3. WebRTC源码分析:音频模块结构分析

    一.概要介绍WebRTC的音频处理流程,见下图: webRTC将音频会话抽象为一个通道Channel,譬如A与B进行音频通话,则A需要建立一个Channel与B进行音频数据传输.上图中有三个Chann ...

  4. Model&Animation

    [Model&Animation] 1.FBX文件是一个完整的模型,通常内含Mesh,Material,Texture,Animation,即内含构成一个完成GameObject所需要的一切组 ...

  5. Mongo技巧-连接数据库与修改表结构

    1. 连接非本机数据库 mongo.exe之后直接输入ip地址即可 mongo.exe 192.168.163.203 2. 修改表结构 mongo里面没有表结构这个概念,现在采用类似关系型数据库的形 ...

  6. STL学习系列九:Map和multimap容器

    1.map/multimap的简介 map是标准的关联式容器,一个map是一个键值对序列,即(key,value)对.它提供基于key的快速检索能力. map中key值是唯一的.集合中的元素按一定的顺 ...

  7. C#POP3协议实现SSL验证登陆GMAIL

    最近在折腾POP3协议,登陆pop.qq.com和pop.163.com没有什么问题,于是就想着登陆pop.gmail.com,结果失败了.经查,发现gmail的pop3端口不是110,而是995.于 ...

  8. Jquery基础:append、prepend、after、before、appendTo的区别

    append() 是在被选元素的结束标签前面(即改被选元素的内部)插入指定内容. after() 是在被选元素的结束标签后面(即该被选元素的外部)插入指定的内容. appendTo() 仍然是在被选元 ...

  9. 关于div的居中的问题

    (从已经死了一次又一次终于挂掉的百度空间人工抢救出来的,发表日期2014-01-11) div水平和垂直居中,text-align和vertical-align不起作用,因为标签div没有这两个属性, ...

  10. jquery 新建的元素事件绑定问题

    js的事件监听跟css不一样,css只要设定好了样式,不论是原来就有的还是新添加的,都有一样的表现.而事件监听不是,你必须给每一个元素单独绑定事件. 常见的例子是处理表格的时候.每行行末有个删除按钮, ...