I/O between memory and disk, and between disk and disk, in flashcache

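Disk-related reads and writes in flashcache fall into two categories:

1) I/O between a disk and memory

2) I/O between one disk and another

A read miss, for example, reads directly from the disk and belongs to case 1. A read hit also belongs to case 1, except that the data is read from the SSD. Disk-to-disk I/O is used to write back dirty data: dirty cache blocks on the SSD are copied to the disk. The interface functions used in each case are introduced here, so that when they show up later in the read and write paths they are already familiar, and it is clear where the data flows from and to.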
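
The classification above can be restated as a small userspace sketch. The enum and function names below are hypothetical and do not appear in flashcache; they only encode the three data paths just described.

```c
#include <stdbool.h>

/* Hypothetical labels for the data paths described in the text. */
enum io_path {
    PATH_DISK_TO_MEMORY,  /* read miss: data comes from the disk */
    PATH_SSD_TO_MEMORY,   /* read hit: data comes from the SSD */
    PATH_SSD_TO_DISK      /* writeback of a dirty cache block */
};

/* Classify a read by whether it hit in the cache (case 1 either way). */
static enum io_path classify_read(bool cache_hit)
{
    return cache_hit ? PATH_SSD_TO_MEMORY : PATH_DISK_TO_MEMORY;
}

/* Only the SSD-to-disk writeback is a device-to-device copy (case 2). */
static bool is_device_to_device(enum io_path path)
{
    return path == PATH_SSD_TO_DISK;
}
```

Both read paths land in case 1; only the writeback path is case 2.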

For case 1, the two main functions are dm_io_async_bvec and flashcache_dm_io_async_vm.

int dm_io_async_bvec(unsigned int num_regions,
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,26)
                struct dm_io_region *where,
#else
                struct io_region *where,
#endif
                int rw,
                struct bio_vec *bvec, io_notify_fn fn,
                void *context)
{
    struct dm_io_request iorq;

    iorq.bi_rw = rw;
    iorq.mem.type = DM_IO_BVEC;         /* data buffer is a bio_vec array */
    iorq.mem.ptr.bvec = bvec;
    iorq.notify.fn = fn;                /* non-NULL callback: asynchronous I/O */
    iorq.notify.context = context;
    iorq.client = flashcache_io_client;
    return dm_io(&iorq, num_regions, where, NULL);
}
#endif /* closes an enclosing kernel-version #if that this excerpt omits */

int
flashcache_dm_io_async_vm(struct cache_c *dmc, unsigned int num_regions,
#if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,26)
              struct io_region *where,
#else
              struct dm_io_region *where,
#endif
              int rw,
              void *data, io_notify_fn fn, void *context)
{
    unsigned long error_bits = 0;
    int error;
    struct dm_io_request io_req = {
        .bi_rw = rw,
        .mem.type = DM_IO_VMA,          /* data buffer is a vmalloc'd area */
        .mem.ptr.vma = data,
        .mem.offset = 0,
        .notify.fn = fn,                /* non-NULL callback: asynchronous I/O */
        .notify.context = context,
        .client = flashcache_io_client,
    };

    error = dm_io(&io_req, 1, where, &error_bits);
    if (error)
        return error;
    if (error_bits)
        return error_bits;
    return 0;
}
#endif /* closes an enclosing kernel-version #if that this excerpt omits */

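Both functions wrap the request in a struct dm_io_request. The only difference between them is the memory type of the request: the first function uses DM_IO_BVEC and the second uses DM_IO_VMA.

At first I could not see why two separate functions were needed for I/O between disk and memory. After reading dm_io, I found that it offers several I/O service types, and each of these two functions corresponds to a different one. The data structures related to dm_io come first.

dm_io

dm-io provides synchronous and asynchronous I/O services for the device mapper.

To use dm-io you must fill in a dm_io_region structure (called io_region before kernel 2.6.26), which defines the region the I/O operates on. A read generally targets a single dm_io_region, whereas a write can target a set of dm_io_region structures.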

struct dm_io_region {
    struct block_device *bdev;
    sector_t sector;
    sector_t count;         /* If this is zero the region is ignored. */
};
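
To make the "count of zero means the region is ignored" rule concrete, here is a minimal userspace sketch. The struct region type and both helper functions are hypothetical stand-ins (an integer dev_id replaces the block_device pointer); they are not kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct dm_io_region; dev_id replaces bdev. */
struct region {
    int      dev_id;
    uint64_t sector;  /* starting sector on the device */
    uint64_t count;   /* length in sectors; zero means "ignore this region" */
};

/* Count the regions that would actually receive I/O. */
static size_t active_regions(const struct region *where, size_t num_regions)
{
    size_t n = 0;

    for (size_t i = 0; i < num_regions; i++)
        if (where[i].count != 0)
            n++;
    return n;
}

/* Total number of sectors covered by the non-empty regions. */
static uint64_t total_sectors(const struct region *where, size_t num_regions)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < num_regions; i++)
        sum += where[i].count;  /* ignored regions contribute zero */
    return sum;
}
```

A write to several destinations would pass an array of such regions, and any entry whose count is zero is simply skipped.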

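On older kernels the user must set up an io_region structure to describe where the I/O should go. Each io_region identifies a block device together with the starting sector and the size of the region.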

struct io_region {
    struct block_device *bdev;
    sector_t sector;
    sector_t count;
};

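dm-io can read from one io_region, or write to one or more io_regions. A write to multiple regions is specified by an array of io_region structures.

dm-io defines four dm_io_mem_type values (older kernels have only the first three; flashcache mainly uses DM_IO_BVEC):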

enum dm_io_mem_type {
    DM_IO_PAGE_LIST,/* Page list */
    DM_IO_BVEC,     /* Bio vector */
    DM_IO_VMA,      /* Virtual memory area */
    DM_IO_KMEM,     /* Kernel memory */
};

struct dm_io_memory {
    enum dm_io_mem_type type;
    union {
            struct page_list *pl;
            struct bio_vec *bvec;
            void *vma;
            void *addr;
    } ptr;
    unsigned offset;
};
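
The type field acts as a tag that says which member of the ptr union is valid. The following userspace sketch models that idea with simplified, hypothetical names (enum mem_type, struct io_memory and the helpers are illustrative only, and every pointer is reduced to void *).

```c
#include <stddef.h>

/* Userspace model of dm_io_mem_type / dm_io_memory with simplified names. */
enum mem_type { MEM_PAGE_LIST, MEM_BVEC, MEM_VMA, MEM_KMEM };

struct io_memory {
    enum mem_type type;  /* tag: which union member is valid */
    union {
        void *pl;
        void *bvec;
        void *vma;
        void *addr;
    } ptr;
    unsigned offset;
};

/* Describe a bio-vector buffer, as dm_io_async_bvec does. */
static struct io_memory mem_from_bvec(void *bvec)
{
    struct io_memory m = { .type = MEM_BVEC, .ptr.bvec = bvec, .offset = 0 };
    return m;
}

/* Describe a vmalloc-style buffer, as flashcache_dm_io_async_vm does. */
static struct io_memory mem_from_vma(void *data)
{
    struct io_memory m = { .type = MEM_VMA, .ptr.vma = data, .offset = 0 };
    return m;
}

/* Return the buffer pointer that matches the tag. */
static void *mem_buffer(const struct io_memory *m)
{
    switch (m->type) {
    case MEM_PAGE_LIST: return m->ptr.pl;
    case MEM_BVEC:      return m->ptr.bvec;
    case MEM_VMA:       return m->ptr.vma;
    case MEM_KMEM:      return m->ptr.addr;
    }
    return NULL;
}
```

The two flashcache wrappers differ only in which tag and union member they fill in.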

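dm-io provides synchronous and asynchronous I/O services. On older kernels it offers three types of I/O service, each with a synchronous and an asynchronous version.

DM_IO_PAGE_LIST

The first I/O service type takes a list of memory pages as the data buffer, along with an offset into the first page.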

struct page_list {
    struct page_list *next;
    struct page *page;
};

int dm_io_sync(unsigned int num_regions, struct io_region *where, int rw,
               struct page_list *pl, unsigned int offset,
               unsigned long *error_bits);

int dm_io_async(unsigned int num_regions, struct io_region *where, int rw,
                struct page_list *pl, unsigned int offset,
                io_notify_fn fn, void *context);

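DM_IO_BVEC

The second I/O service type takes an array of bio vectors as the data buffer. This is handy when the caller already has an assembled bio but wants to direct different portions of it to different devices.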

int dm_io_sync_bvec(unsigned int num_regions, struct io_region *where,
                    int rw, struct bio_vec *bvec,
                    unsigned long *error_bits);

int dm_io_async_bvec(unsigned int num_regions, struct io_region *where,
                     int rw, struct bio_vec *bvec,
                     io_notify_fn fn, void *context);

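DM_IO_VMA

The third I/O service type takes a pointer to a vmalloc'd memory buffer as the data buffer. This is handy when the caller needs to do I/O to a large region but does not want to allocate a large number of individual memory pages.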

int dm_io_sync_vm(unsigned int num_regions, struct io_region *where, int rw,
                  void *data, unsigned long *error_bits);

int dm_io_async_vm(unsigned int num_regions, struct io_region *where, int rw,
                   void *data, io_notify_fn fn, void *context);

Callers of the asynchronous I/O services must supply a completion callback and a pointer to some context data for the I/O.

typedef void (*io_notify_fn)(unsigned long error, void *context);

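The "error" parameter of this callback, like the "*error_bits" parameter of the synchronous versions, is a bitset rather than a simple error value.

For a write to multiple destination regions, the bitset lets dm-io report success or failure for each individual region.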
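
A caller can decode such a bitset one region at a time. The sketch below is plain userspace C and assumes that bit i of the bitset corresponds to region i in the array passed to dm-io; the helper names are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* True if the bit for this region is set (assumption: bit i <-> region i). */
static bool region_failed(unsigned long error_bits, unsigned int region)
{
    if (region >= sizeof(error_bits) * CHAR_BIT)
        return false;  /* no such bit; avoids an out-of-range shift */
    return (error_bits >> region) & 1UL;
}

/* Number of regions, among the first num_regions, that reported an error. */
static unsigned int count_failed(unsigned long error_bits,
                                 unsigned int num_regions)
{
    unsigned int failed = 0;

    for (unsigned int i = 0; i < num_regions; i++)
        if (region_failed(error_bits, i))
            failed++;
    return failed;
}
```

With a bitset of 0x5 and three regions, for instance, regions 0 and 2 are reported as failed and region 1 as successful.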

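With the older interface, before using any dm-io service the user must call dm_io_get() and specify the number of pages they expect to perform I/O on concurrently.

dm-io will try to resize its mempool so that enough pages are always available and I/O does not have to wait unnecessarily.
When the user has finished with the I/O services, they call dm_io_put() with the same number of pages that was passed to dm_io_get().

dm-io uses the dm_io_request structure to describe a request. If notify.fn is set the I/O is asynchronous; otherwise it is synchronous.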

struct dm_io_request {
    int bi_rw;                      /* READ|WRITE - not READA */
    struct dm_io_memory mem;        /* Memory to use for io */
    struct dm_io_notify notify;     /* Synchronous if notify.fn is NULL */
    struct dm_io_client *client;    /* Client memory handler */
};

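Before using the dm_io service you must create a dm_io_client with dm_io_client_create (dm_io_get before kernel 2.6.22), which allocates the memory pool that dm-io uses while executing requests. When you are done with dm-io, call dm_io_client_destroy (dm_io_put before 2.6.22) to release the pool.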

struct dm_io_client {
    mempool_t *pool;
    struct bio_set *bios;
};

The dm_io function carries out the actual I/O request.

int dm_io(struct dm_io_request *io_req, unsigned num_regions,
          struct dm_io_region *where, unsigned long *sync_error_bits)
{
    int r;
    struct dpages dp;

    r = dp_init(io_req, &dp);
    if (r)
        return r;

    if (!io_req->notify.fn)
        return sync_io(io_req->client, num_regions, where,
                       io_req->bi_rw, &dp, sync_error_bits);

    return async_io(io_req->client, num_regions, where, io_req->bi_rw,
                    &dp, io_req->notify.fn, io_req->notify.context);
}
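
The branch on notify.fn in dm_io can be modelled in userspace as follows. This is only a sketch: struct request, submit and mark_done are hypothetical names, no real I/O happens, and the "asynchronous" callback is invoked immediately rather than on later completion as in the kernel.

```c
#include <stddef.h>

/* Same shape as io_notify_fn in the text. */
typedef void (*notify_fn)(unsigned long error, void *context);

/* Hypothetical reduction of dm_io_request to the fields that pick the path. */
struct request {
    notify_fn fn;      /* NULL selects the synchronous path */
    void *context;
};

enum dispatch { DISPATCH_SYNC, DISPATCH_ASYNC };

/* Mirrors the branch in dm_io(): no callback means the caller waits. */
static enum dispatch submit(const struct request *req,
                            unsigned long *sync_error_bits)
{
    unsigned long error_bits = 0;  /* pretend every region succeeded */

    if (req->fn == NULL) {
        if (sync_error_bits)
            *sync_error_bits = error_bits;  /* result returned directly */
        return DISPATCH_SYNC;
    }
    /* Simplification: the kernel calls this later, on I/O completion. */
    req->fn(error_bits, req->context);
    return DISPATCH_ASYNC;
}

/* Example callback: record through the context pointer that it ran. */
static void mark_done(unsigned long error, void *context)
{
    (void)error;
    *(int *)context = 1;
}
```

This is why both flashcache wrappers, which always pass a callback, end up on the asynchronous path.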

The second case is I/O between one disk and another. It is used only to write dirty blocks from the SSD to the disk.

int dm_kcopyd_copy(struct dm_kcopyd_client *kc, struct dm_io_region *from,
                   unsigned int num_dests, struct dm_io_region *dests,
                   unsigned int flags, dm_kcopyd_notify_fn fn, void *context);

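The first parameter is a dm_kcopyd_client. Before using kcopyd's asynchronous copy service you must create the corresponding client, that is, allocate the "kcopyd client" structure, with the following call: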

kcopyd_client_create(FLASHCACHE_COPY_PAGES, &flashcache_kcp_client);

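This creates the dm_kcopyd_client structure.

The second parameter, a dm_io_region, is the source. The fourth parameter, dests, holds the num_dests destination regions. The structure is defined as follows: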

struct dm_io_region {
     struct block_device *bdev;
     sector_t sector;
     sector_t count;          /* If this is zero the region is ignored. */
};

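dm_kcopyd_notify_fn fn is the callback invoked once kcopyd has finished the request.

context is the argument passed to that callback; in flashcache it is always set to the corresponding kcached_job.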
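
The roles of from, dests, fn and context can be illustrated with a userspace model in which each "device" is just a byte buffer. Everything here is a hypothetical stand-in (struct mem_region, copy_regions, struct job, job_done); only the callback's parameter list is modelled on dm_kcopyd_notify_fn, and the copy runs synchronously instead of through a kcopyd thread.

```c
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Userspace stand-in for dm_io_region: the device is a byte buffer. */
struct mem_region {
    unsigned char *dev;  /* base of the simulated device */
    size_t sector;       /* starting sector */
    size_t count;        /* length in sectors; zero means skip */
};

/* Callback with the same parameter list as dm_kcopyd_notify_fn. */
typedef void (*copy_notify_fn)(int read_err, unsigned long write_err,
                               void *context);

/* Copy from->count sectors to every non-empty destination, then notify. */
static void copy_regions(const struct mem_region *from, unsigned int num_dests,
                         const struct mem_region *dests,
                         copy_notify_fn fn, void *context)
{
    for (unsigned int i = 0; i < num_dests; i++) {
        if (dests[i].count == 0)
            continue;
        memcpy(dests[i].dev + dests[i].sector * SECTOR_SIZE,
               from->dev + from->sector * SECTOR_SIZE,
               from->count * SECTOR_SIZE);
    }
    fn(0, 0, context);  /* this model cannot fail, so report no errors */
}

/* Hypothetical stand-in for flashcache's kcached_job. */
struct job { int completed; };

/* Completion callback: mark the job passed as context as finished. */
static void job_done(int read_err, unsigned long write_err, void *context)
{
    struct job *job = context;

    job->completed = (read_err == 0 && write_err == 0);
}
```

In this model the SSD buffer plays the source, the disk buffer plays the single destination, and the job passed as context is what the callback updates, just as flashcache hands its kcached_job to kcopyd.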
