Boost: From Coroutine2 to Fiber

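A Coroutine Primer

At first I could never quite figure out what a coroutine was. From what I found online (especially about Golang's goroutines), the concept sounded a lot like a thread pool, something along the lines of Java's ExecutorService: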

package helloworld;

import java.util.Calendar;
import java.util.Date;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CallMe {
	// The task run on the pool: returns the current time as a string.
	static class Call implements Callable<String> {
		@Override
		public String call() throws Exception {
			Date d = Calendar.getInstance().getTime();
			return d.toString();
		}
	}

	public static void main(String[] args) throws Exception {
		ExecutorService pool = Executors.newSingleThreadExecutor();
		Future<String> str = pool.submit(new Call());
		pool.shutdown(); // stop accepting new tasks; the submitted one still runs
		String ret = str.get(); // block until the task has finished
		System.out.println(ret);
	}
}

// The same idea in Go, with a goroutine and a channel.
package main

import (
	"fmt"
	"time"
)

func CallMe(pipe chan string) {
	t := time.Now()
	pipe <- t.String()
}

func main() {
	pipe := make(chan string, 1)
	defer close(pipe)
	go CallMe(pipe)
	// Block until the goroutine delivers its result.
	v, ok := <-pipe
	if !ok {
		fmt.Println("Read Error")
		return
	}
	fmt.Println(v)
}

So yes: apart from the problem it sets out to solve, a coroutine is essentially a thread.

What problem do coroutines solve, then?

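That goes back to why coroutines became popular. Thread pools work well, but threads are scheduled by the operating system, and a thread switch is expensive, often costing thousands of CPU cycles.

Under the synchronous, blocking programming model, when concurrency is high and the workload is I/O-bound, a task often blocks on I/O as soon as it enters the thread pool, which may force a thread switch (thread switching is not under the program's control). At that point the cost of switching threads can no longer be ignored.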

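People then found that an asynchronous, non-blocking model solves this: where the code would block on I/O, it calls a non-blocking interface and registers a callback, so the current thread keeps running and no thread switch is needed. The ideal is appealing but the reality is harsh: deeply nested asynchronous callbacks make programmers' lives miserable.

That is where coroutines come in.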

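A coroutine is a "micro-thread" that runs inside a thread and is controlled by the programmer. The programmer can schedule it, switching between coroutines is cheap (tens to a few hundred CPU cycles, depending on the implementation), and creating one uses few resources. This makes coroutines a good fit for I/O-bound workloads.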
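To get a feel for how cheap such lightweight tasks are to create, here is a minimal sketch in Go, the language this post compares against. Note that goroutines are scheduled by the Go runtime rather than by the programmer, so this only illustrates the "cheap to create" part of the claim. The function name runMany is made up for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// runMany starts n goroutines, each adding its own index to a shared sum,
// and waits for all of them to finish.
func runMany(n int) int {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		sum int
	)
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(v int) {
			defer wg.Done()
			mu.Lock()
			sum += v
			mu.Unlock()
		}(i)
	}
	wg.Wait()
	return sum
}

func main() {
	// Ten thousand goroutines is an unremarkable number for a Go program.
	fmt.Println(runMany(10000)) // prints 50005000
}
```

Starting the same number of operating-system threads would typically be far more expensive, which is the point the paragraph above makes.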

Boost::Coroutine2

Boost's Coroutine2 differs from goroutines: goroutines are scheduled by the Go runtime, whereas boost::coroutine2 coroutines have to be scheduled by hand.

#include <boost/coroutine2/all.hpp>
#include <iostream>

using namespace boost;
using namespace std;

class X {
public:
	X() {
		cout << "X()\n";
	}
	~X() {
		cout << "~X()\n";
	}
};

void foo(boost::coroutines2::coroutine<void>::pull_type& pull) {
	X x;
	cout << "a\n";
	pull(); // switch back to main
	cout << "b\n";
	pull();
	cout << "c\n";
}

int main() {
	coroutines2::coroutine<void>::push_type push(foo);
	cout << "1\n";
	push(); // enter (or resume) foo
	cout << "2\n";
	push();
	cout << "3\n";
	push();
	return 0;
}

Calling operator() on a push_type or pull_type hands the flow of execution over to the corresponding coroutine. A push_type can pass a value to the pull_type, which retrieves it by calling get().

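You can also write it like this:

boost::coroutines2::coroutine<void>::pull_type pull([](boost::coroutines2::coroutine<void>::push_type& push){...});

The difference from the version above is that a newly constructed pull_type immediately enters the bound function (whichever side is able to call push() is the side that runs first).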
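The value passing described above (one side pushes a value, the other side pulls it) can be loosely mimicked in Go with an unbuffered channel. This is only an analogy, not how Boost.Coroutine2 works internally: goroutines are scheduled by the runtime and may run in parallel, whereas push_type and pull_type transfer control explicitly. The names generate and collect are hypothetical.

```go
package main

import "fmt"

// generate plays the role of the coroutine body. Each send on the
// unbuffered channel hands one value to the consumer and waits until it
// has been taken, loosely like calling push(value) on a push_type.
func generate(n int) <-chan int {
	out := make(chan int) // unbuffered: producer and consumer take turns
	go func() {
		defer close(out)
		for i := 1; i <= n; i++ {
			out <- i * i
		}
	}()
	return out
}

// collect drains the channel, loosely like resuming a pull_type and
// reading each value with get() until the coroutine is finished.
func collect(ch <-chan int) []int {
	var got []int
	for v := range ch {
		got = append(got, v)
	}
	return got
}

func main() {
	fmt.Println(collect(generate(4))) // prints [1 4 9 16]
}
```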

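What if foo has not run to completion by the time main ends: is the X inside foo still destroyed?

According to the Boost documentation, yes; this is called stack unwinding.

To check, remove the last push(); from main so that execution never switches back into foo's context. Although the "c" in foo is never printed, X is still destroyed.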

Fiber

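In real production code, fibers are usually a better fit. Boost.Fiber comes with a scheduler, is simple to use, and does not require controlling the flow of execution by hand.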

#include <boost/fiber/all.hpp>
#include <ctime>
#include <iostream>
#include <string>

using namespace std;
using namespace boost;

void callMe(fibers::buffered_channel<string>& pipe) {
	std::time_t result = std::time(nullptr);
	string timestr = std::asctime(std::localtime(&result));
	pipe.push(timestr);
}

int main() {
	fibers::buffered_channel<string> pipe(2);
	fibers::fiber f([&]() {callMe(pipe); });
	f.detach();
	string str;
	pipe.pop(str); // suspends main until the fiber has pushed a value
	cout << str << "\n";
	return 0;
}

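boost::fibers is a coroutine library with a scheduler. At first glance a fiber looks just like a goroutine, but inside a fiber you must not call any interface that blocks the thread: once the current fiber blocks, every fiber on that thread is blocked too. Every blocking operation therefore needs its own fiber-aware wrapper, such as this_fiber::sleep_for(). It also means that things like database operations cannot be used directly inside a fiber. Fortunately, Boost.Fiber offers several ways around this.

Using non-blocking I/O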

int read_chunk( NonblockingAPI & api, std::string & data, std::size_t desired) {
    int error;
    while ( EWOULDBLOCK == ( error = api.read( data, desired) ) ) {
        boost::this_fiber::yield();
    }
    return error;
}

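The main idea is that the current fiber polls a non-blocking API. As soon as it finds that the call would block, it calls boost::this_fiber::yield() to hand execution to other fibers, and checks again the next time it gets to run.

Asynchronous I/O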
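The same poll-then-yield loop can be sketched in Go, where runtime.Gosched() stands in for boost::this_fiber::yield(). The non-blocking API here is a hypothetical tryRead built on a channel with a select/default; it is an assumption made for illustration and not part of any library discussed in this post.

```go
package main

import (
	"fmt"
	"runtime"
)

// tryRead is a hypothetical non-blocking read: it returns immediately,
// with ok == false when no data is available yet.
func tryRead(ch <-chan string) (data string, ok bool) {
	select {
	case data = <-ch:
		return data, true
	default:
		return "", false
	}
}

// readChunk polls tryRead and yields the processor between attempts,
// returning the data together with the number of polls it took.
func readChunk(ch <-chan string) (string, int) {
	polls := 0
	for {
		polls++
		if data, ok := tryRead(ch); ok {
			return data, polls
		}
		runtime.Gosched() // let other goroutines run, then poll again
	}
}

func main() {
	ch := make(chan string, 1)
	go func() { ch <- "chunk" }()
	data, polls := readChunk(ch)
	fmt.Println(data, "after", polls, "poll(s)")
}
```

The number of polls depends on scheduling, which is also the weakness of this approach: the caller keeps burning CPU time checking for data instead of being woken up when it arrives.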

std::pair< AsyncAPI::errorcode, std::string > read_ec( AsyncAPI & api) {
    typedef std::pair< AsyncAPI::errorcode, std::string > result_pair;
    boost::fibers::promise< result_pair > promise;
    boost::fibers::future< result_pair > future( promise.get_future() );
    // We promise that both 'promise' and 'future' will survive until our lambda has been called.
    // Need C++14
    api.init_read([promise=std::move( promise)]( AsyncAPI::errorcode ec, std::string const& data) mutable {
                            promise.set_value( result_pair( ec, data) );
                  });
    return future.get();
}

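This approach relies on the fact that asynchronous I/O does not block the current fiber; the asynchronous callback sets the value of the fibers::promise. If the result is needed before the asynchronous operation has completed, the call to future.get() hands execution to other fibers.

Synchronous I/O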
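As a rough Go counterpart of the promise/future pattern above, a channel can carry the callback's result back to the caller. The callback-style API initRead below is hypothetical and simply simulates an asynchronous operation with a goroutine; the result type and the read wrapper are likewise made up for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// result bundles what the callback delivers, like result_pair above.
type result struct {
	data string
	err  error
}

// initRead is a hypothetical callback-style API: it returns immediately
// and invokes cb later from another goroutine.
func initRead(key string, cb func(data string, err error)) {
	go func() {
		if key == "" {
			cb("", errors.New("empty key"))
			return
		}
		cb("value of "+key, nil)
	}()
}

// read turns the callback API into a call that looks synchronous.
func read(key string) (string, error) {
	done := make(chan result, 1) // buffered so the callback never blocks
	initRead(key, func(data string, err error) {
		done <- result{data, err} // plays the role of promise.set_value
	})
	r := <-done // plays the role of future.get(): parks only this goroutine
	return r.data, r.err
}

func main() {
	data, err := read("answer")
	fmt.Println(data, err) // prints: value of answer <nil>
}
```

The caller writes straight-line code, while the waiting happens on the channel receive rather than on the thread.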

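Synchronous I/O cannot be used directly in a fiber, because it blocks the current thread and with it every fiber on that thread.

If an interface only has a synchronous mode, such as the official MySQL Connector, the only option is to first emulate an asynchronous interface with multiple threads and then treat it as asynchronous I/O.

How do you wrap a synchronous interface into an asynchronous one using threads?

As shown below; once it is wrapped, the asynchronous-I/O technique above can be used to build an I/O interface that fibers can call.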

#include <boost/asio.hpp>
#include <cstdio>
#include <functional>
#include <string>

using namespace std;
using namespace boost;

class AsyncWrapper
{
public:
	// Runs the blocking file read on the thread pool and passes the
	// contents to callback (an empty string if the file cannot be opened).
	void async_read(const string& fileName, std::function<void (const string &)> callback) {
		auto fun = [=]() {
			string tmp;
			FILE* fp = fopen(fileName.c_str(), "r");
			if (fp != nullptr) {
				char buff[1024];
				while (nullptr != fgets(buff, sizeof(buff), fp)) {
					tmp += buff;
				}
				fclose(fp);
			}
			callback(tmp);
		};
		asio::post(pool, fun);
	}
	// Blocks until every posted task has finished.
	static void wait() {
		pool.join();
	}
protected:
	static asio::thread_pool pool;
};

asio::thread_pool AsyncWrapper::pool(2);

int main()
{
	AsyncWrapper wrap;
	string file = "./temp.txt";
	wrap.async_read(file, [](const string& result) {printf("%s\n", result.c_str()); });
	AsyncWrapper::wait();
	return 0;
}
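For comparison, the same "run the blocking call on a worker pool and report back through a callback" idea might look like this in Go. It is a sketch under assumptions: the type AsyncReader and its methods are invented for this illustration, the pool size and queue length are arbitrary, and os.ReadFile requires Go 1.16 or later.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// job is one queued read request.
type job struct {
	path string
	cb   func(data []byte, err error)
}

// AsyncReader runs blocking file reads on a fixed set of worker goroutines.
type AsyncReader struct {
	jobs chan job
	wg   sync.WaitGroup
}

// NewAsyncReader starts the given number of workers.
func NewAsyncReader(workers int) *AsyncReader {
	r := &AsyncReader{jobs: make(chan job, 16)}
	r.wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer r.wg.Done()
			for j := range r.jobs {
				j.cb(os.ReadFile(j.path)) // the blocking call happens on the worker
			}
		}()
	}
	return r
}

// AsyncRead queues the read and returns without waiting for the result
// (it only waits if the queue is full).
func (r *AsyncReader) AsyncRead(path string, cb func([]byte, error)) {
	r.jobs <- job{path, cb}
}

// Wait stops accepting work and blocks until all queued reads are done.
func (r *AsyncReader) Wait() {
	close(r.jobs)
	r.wg.Wait()
}

func main() {
	r := NewAsyncReader(2)
	r.AsyncRead("./temp.txt", func(data []byte, err error) {
		if err != nil {
			fmt.Println("read failed:", err)
			return
		}
		fmt.Printf("%s\n", data)
	})
	r.Wait()
}
```

Unlike the C++ version, the callback here also receives the error, so a missing file is reported rather than silently turned into an empty result.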

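Where Go makes things easier

The nice thing about Go is that it has already done the wrapping described above for you: every I/O operation is presented as a blocking, synchronous call, implemented with essentially the two techniques above. Programmers feel as if they are writing synchronous code while still enjoying the benefits of asynchronous, non-blocking I/O.

For example: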

package main

import (
	"log"
	"os"
	"sync"
	"time"
)

var wg sync.WaitGroup

func blockSleep() {
	defer wg.Done()
	log.Printf("blockSleep Before bock----------")
	time.Sleep(1 * time.Second)
	log.Printf("blockSleep After bock----------")
}

func writeFile() {
	defer wg.Done()
	log.Printf("writeFile Before bock+++++++++++")
	f, err := os.Create("./temp.txt")
	if err != nil {
		log.Printf("writeFile: %v", err)
		return
	}
	defer f.Close()
	log.Printf("writeFile Before Write+++++++++++")
	f.WriteString("Hello World")
	log.Printf("writeFile After bock+++++++++++")
}

func main() {
	// Call Add before starting each goroutine so that Wait cannot miss one.
	wg.Add(1)
	go writeFile()
	for i := 1; i < 5; i++ {
		wg.Add(1)
		go blockSleep()
	}
	wg.Wait()
}

In this code, both time.Sleep and os.Create cause the current goroutine to give up the CPU. The output is:

2018/05/31 13:53:19 blockSleep Before bock----------
2018/05/31 13:53:19 blockSleep Before bock----------
2018/05/31 13:53:19 writeFile Before bock+++++++++++
2018/05/31 13:53:19 blockSleep Before bock----------
2018/05/31 13:53:19 writeFile Before Write+++++++++++
2018/05/31 13:53:19 writeFile After bock+++++++++++
2018/05/31 13:53:19 blockSleep Before bock----------
2018/05/31 13:53:20 blockSleep After bock----------
2018/05/31 13:53:20 blockSleep After bock----------
2018/05/31 13:53:20 blockSleep After bock----------
2018/05/31 13:53:20 blockSleep After bock----------