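This article covers the main work I have done during my first year at Hulu. Combining two popular open-source projects, Docker and YARN, it provides a flexible programming model. The DAG programming model is supported today, and a long-running service programming model will be supported in the future.

With Voidbox, developers can easily write a distributed framework, with Docker as the execution engine and YARN as the cluster resource management system.

This article is also published on Hulu's official tech blog: http://tech.hulu.com/blog/2015/08/06/voidbox-docker-on-yarn/

CSDN online: http://huiyi.csdn.net/activity/closed?project_id=2332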

1. Voidbox Motivation

YARN is the distributed resource management system in Hadoop 2.0. It schedules cluster resources for diverse high-level applications such as MapReduce and Spark. However, all existing frameworks on top of YARN are designed with the assumption of a specific system environment. How to support user applications with arbitrarily complex environment dependencies is still an open question. Docker offers an answer.

Docker is a very popular container virtualization technology. It provides a way to run almost any application isolated in a container. Docker is an open platform for developing, shipping, and running applications. Docker automates the deployment of any application as a lightweight, portable, self-sufficient container that will run virtually anywhere.

To combine the unique advantages of Docker and YARN, the Hulu engineering team developed Voidbox. Voidbox enables any application encapsulated in a Docker image to run on a YARN cluster alongside MapReduce and Spark. Voidbox brings the following benefits:

  • Easier creation of distributed applications
    • Voidbox handles the most common issues in distributed computing systems, such as cluster discovery, elastic resource allocation, task coordination and disaster recovery. With its well-designed interface, implementing a distributed application is easy.
  • Simpler deployment
    • Without Voidbox, we need to create and maintain a dedicated VM for each application with a complex environment, even though the VM image is huge and hard to deploy. With Voidbox, we can get resources allocated and run the app right when we need it, which eliminates the additional maintenance work.
  • Better cluster efficiency
    • Because Spark/MR jobs and all kinds of Voidbox applications from different departments can be deployed together, cluster utilization can be maximized.

Thus, YARN's role as a big data operating platform is further consolidated and enhanced.

Voidbox supports executing Docker container-based DAG (Directed Acyclic Graph) tasks. Moreover, it provides several ways to submit applications, to meet the demands of both the production environment and the debugging environment. In addition, Voidbox can cooperate with Jenkins, GitLab and a private Docker Registry to set up a complete development, testing and automatic release process.

2. Voidbox Architecture

2.1 YARN Architecture Overview

YARN enables multiple applications to share resources dynamically in a cluster. Here is the architecture of applications running in a YARN cluster:

Figure 1. YARN Architecture

As shown in figure 1, a client submits a job to the Resource Manager, which schedules it according to the application's resource requirements. The Application Master is responsible for scheduling the application's tasks and managing the application's lifecycle.

Functionality of each module:

  • Resource Manager: Responsible for resource management and scheduling in the cluster.
  • NodeManager: Runs on the compute nodes in the cluster, takes care of task execution on the individual machine, collects information and keeps a heartbeat with the Resource Manager.
  • Application Master: Requests resources from YARN, then allocates those resources to run tasks in Containers.
  • Container: An abstract notion that incorporates resources such as memory, CPU, disk and network.
  • HDFS: The distributed file system in the YARN cluster.

2.2 Voidbox Architecture Design

In the Voidbox architecture, YARN is responsible for the cluster's resource management. Docker acts as the task execution engine on top of the operating system, cooperating with Docker Registry. Voidbox translates user code into Docker container-based DAG tasks, applies for resources according to their requirements and drives the execution of the DAG.

Figure 2. Voidbox Architecture

As shown in figure 2, each box stands for one machine with several modules running inside. To make the architecture clearer, we divide the modules into three parts. The YARN modules were described above; the functionality of the Voidbox modules and Docker modules is as follows:

  • Voidbox Modules:
    • Voidbox Client: The client program. Through Voidbox Client, users can submit a Voidbox application, stop it, and so on. Note that a Voidbox application contains several Docker jobs, and a Docker job contains one or more Docker tasks.
    • Voidbox Master: An application master in YARN. It requests resources from YARN, then allocates those resources to Docker tasks.
    • Voidbox Driver: Responsible for task scheduling within a single Voidbox application. Voidbox supports Docker container-based DAG task scheduling, and user code can be inserted between tasks. Voidbox Driver therefore handles the ordering imposed by the DAG task dependencies and executes the user's code.
    • Voidbox Proxy: The bridge between YARN and Docker engine, responsible for relaying commands from YARN to Docker engine, such as starting or killing a Docker container.
    • State Server: Maintains the health status of the Docker engines and provides the list of machines that can run Docker containers, so that Voidbox Master can apply for resources more efficiently.
  • Docker Modules:
    • Docker Registry: Docker image storage, acting as an internal version control tool for Docker images.
    • Docker Engine: The Docker container execution engine, which obtains the specified Docker image from Docker Registry and launches the Docker container.
    • Jenkins: Cooperates with GitLab. When application code is updated, Jenkins takes care of automated testing, packaging, generating the Docker image and uploading it to Docker Registry, completing the automatic release process of the application.

2.3 Running Mode

Voidbox provides two application running modes: yarn-cluster mode and yarn-client mode.

In yarn-cluster mode, the control component and the resource management component both run in the YARN cluster. After we submit the Voidbox application, Voidbox Client can quit at any time without affecting the running application. This mode is intended for the production environment.

In yarn-client mode, the control component runs in Voidbox Client, and the other components run in the cluster. Users can see much more detailed logs about the application's status. When Voidbox Client quits, the application in the cluster exits too. This mode is more convenient for debugging.

Here we briefly introduce the implementation architecture of the two modes:

  • yarn-cluster mode

Figure 3. yarn-cluster mode

As shown in figure 3, Voidbox Master and Voidbox Driver both run in the cluster. Voidbox Driver is responsible for the control logic and Voidbox Master takes care of application resource management.

  • yarn-client mode

Figure 4. yarn-client mode

As shown in figure 4, Voidbox Master runs in the cluster, and Voidbox Driver runs in Voidbox Client. Users can submit a Voidbox application from an IDE for debugging.

2.4 Running Procedure

Here is the procedure for submitting a Voidbox application, together with its lifecycle:

  1. Users write a Voidbox application with the Voidbox SDK and build a Java archive, then submit it to the YARN cluster through Voidbox Client.
  2. After receiving the Voidbox application, Resource Manager allocates resources for Voidbox Master, then launches it.
  3. Voidbox Master starts Voidbox Driver, which decomposes the Voidbox application into several Docker jobs (a job contains one or more Docker tasks). Voidbox Driver calls the Voidbox Master interface to launch the Docker tasks on compute nodes.
  4. Voidbox Master requests resources from Resource Manager, and Resource Manager allocates YARN containers according to the YARN cluster status. Voidbox Master launches Voidbox Proxy in each YARN container, and Voidbox Proxy communicates with Docker engine to start the Docker container.
  5. The user's Docker task runs in the Docker container, and its log is written to a local file. Users can see real-time application logs through the YARN Web Portal.
  6. After all Docker tasks are done, the logs are aggregated to HDFS, so users can still get the application logs through the history server.

2.5 Docker integrating with YARN in resource management

YARN acts as a uniform resource manager in the cluster and is responsible for resource management on all machines. Docker, as a container engine, also has resource management functions. How to integrate the two is therefore particularly important.

In YARN, a user task can only run in a YARN container, while a Docker container can only be handled by Docker engine. Left alone, the Docker container would escape YARN's management and break YARN's principle of unified management and scheduling, creating a risk of resource leaks. To enable YARN to manage and schedule Docker containers, we need a proxy layer between YARN and Docker engine. This is why Voidbox Proxy is introduced. Through Voidbox Proxy, YARN can manage the container lifecycle, including start, stop, and so on.

To understand Voidbox Proxy more clearly, take stopping a Voidbox application as an example. When a user kills a Voidbox application, YARN reclaims all the resources of the application by sending a kill signal to the related machines. The corresponding Voidbox Proxy catches the kill signal, then stops the Docker container in Docker engine so that its resources are reclaimed too. With the help of Voidbox Proxy, stopping the YARN container also stops the Docker container, which avoids resource leaks (a problem that exists in the open source version, see YARN-1964).
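The kill path described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Voidbox's actual proxy: the class name `ProxySketch`, its methods and the injectable `run` parameter are hypothetical, and we assume the kill signal arrives as SIGTERM. The only external commands used are the standard `docker run` and `docker stop`.

```python
import signal
import subprocess


class ProxySketch:
    """Hypothetical stand-in for Voidbox Proxy: owns exactly one Docker container."""

    def __init__(self, name, image, command, run=subprocess.call):
        self.name = name        # container name, reused later by `docker stop`
        self.image = image
        self.command = command  # command to run inside the container, as a list
        self.run = run          # injectable, so the sketch can be exercised without Docker

    def start_args(self):
        # Foreground `docker run`: the proxy process lives as long as the container.
        return ["docker", "run", "--rm", "--name", self.name, self.image] + self.command

    def stop_args(self):
        return ["docker", "stop", self.name]

    def on_kill(self, signum, frame):
        # YARN is reclaiming the YARN container: stop the Docker container as well,
        # so that it does not keep running outside YARN's control.
        self.run(self.stop_args())

    def main(self):
        # Assumption: the kill signal is delivered as SIGTERM.
        signal.signal(signal.SIGTERM, self.on_kill)
        return self.run(self.start_args())
```

Calling `ProxySketch("demo", "centos", ["echo", "hi"]).main()` inside a YARN container would start the Docker container and keep the handler armed. Passing a fake `run` function makes it possible to check the generated commands on a machine without Docker.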

3. Fault Tolerance

Although Docker has stable releases, an enterprise production environment runs a variety of operating system and kernel versions, which introduces instability. The Voidbox fault-tolerant design works at multiple levels to ensure high availability.

  • Voidbox Master fault tolerance
    • If Resource Manager finds that Voidbox Master has crashed, it notifies NodeManager to reclaim all the YARN containers belonging to this Voidbox application, then restarts Voidbox Master.
  • Voidbox Proxy fault tolerance
    • If Voidbox Master finds that Voidbox Proxy has crashed, it reclaims the Docker containers on behalf of Voidbox Proxy.
  • Docker container fault tolerance
    • Each Voidbox application can configure the maximum number of retries on failure. When a Docker container crashes, Voidbox Master decides what to do according to the exit code of the Docker container.
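The container-level retry rule can be sketched as follows. The article does not say which exit codes Voidbox retries, so `fatal_exit_codes` is a hypothetical parameter, and `launch` stands in for whatever actually starts the Docker container and returns its exit code.

```python
def run_with_retries(launch, max_retries, fatal_exit_codes=frozenset()):
    """Call launch() until it succeeds, hits a fatal exit code, or runs out of retries.

    launch: zero-argument callable returning the container's exit code.
    max_retries: number of retries allowed after the first attempt.
    fatal_exit_codes: hypothetical set of exit codes that are never retried.
    Returns (final_exit_code, attempts_made).
    """
    attempts = 0
    while True:
        attempts += 1
        code = launch()
        if code == 0 or code in fatal_exit_codes or attempts > max_retries:
            return code, attempts
```

With `max_retries=2`, a task that fails twice and then succeeds is attempted three times in total; a task whose exit code is listed as fatal is given up on after the first attempt.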

4. Programming model

4.1 DAG Programming model

Voidbox provides a Docker container-based DAG programming model. A sample looks like this:

Figure 5. Docker container-based DAG programming model

As shown in figure 5, there are four jobs in this Voidbox application, and each job can configure its own requirements for CPU, memory, Docker image, parallelism and so on. Job3 starts when job1 and job2 have both completed. Job1, job2 and job3 form a stage, so users can insert their own code to run after this stage is done, before job4 finally starts.
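The ordering described for figure 5 can be reproduced with a small dependency table. The sketch below is not the Voidbox SDK (whose API is not shown in this article); it uses Python's standard `graphlib` module (Python 3.9+), and the node name `user_code` is a placeholder for the code a user inserts after the first stage.

```python
from graphlib import TopologicalSorter

# job -> set of jobs it depends on, following the description of figure 5
deps = {
    "job1": set(),
    "job2": set(),
    "job3": {"job1", "job2"},
    "user_code": {"job3"},   # user code inserted after the job1/job2/job3 stage
    "job4": {"user_code"},
}


def waves(deps):
    """Group nodes into waves; all nodes in one wave have their dependencies met
    and could be launched in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        result.append(ready)
        sorter.done(*ready)
    return result


schedule = waves(deps)
for wave in schedule:
    print(wave)
```

Job1 and job2 land in the same wave because neither depends on the other, while job3, the user code and job4 each wait for the previous wave to finish.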

4.2 Shell mode to submit one task

In most cases, we would like to run a single Docker container-based task without programming. Voidbox therefore supports a shell mode to describe and submit such a task; under the hood, it is an implementation based on the DAG programming model. An example of using Voidbox in shell mode:

docker-submit.sh \
  -docker_image centos \
  -shell_command "echo Hello Voidbox" \
  -container_memory 1000 \
  -cpu_shares 2

The shell script above submits a task that runs "echo Hello Voidbox" in a Docker image named 'centos', with a resource requirement of 1000 MB of memory and 2 virtual CPU cores.
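To illustrate how a shell-mode submission can be seen as a one-job DAG, here is a minimal Python sketch that parses the four flags shown above into a task description. Only the flag names come from the example; `TaskSpec`, `parse_submit_args`, `as_single_job_dag` and the dictionary layout are hypothetical and do not reflect how docker-submit.sh is actually implemented.

```python
import argparse
from dataclasses import dataclass


@dataclass
class TaskSpec:
    image: str
    command: str
    memory_mb: int
    cpu_shares: int


def parse_submit_args(argv):
    """Parse the flags from the shell-mode example into a TaskSpec."""
    parser = argparse.ArgumentParser(prog="docker-submit.sh")
    parser.add_argument("-docker_image", required=True)
    parser.add_argument("-shell_command", required=True)
    parser.add_argument("-container_memory", type=int, required=True)
    parser.add_argument("-cpu_shares", type=int, required=True)
    ns = parser.parse_args(argv)
    return TaskSpec(ns.docker_image, ns.shell_command, ns.container_memory, ns.cpu_shares)


def as_single_job_dag(spec):
    # A shell-mode submission is a DAG with one job, one task and no dependencies.
    return {"job1": {"spec": spec, "depends_on": set(), "parallelism": 1}}


spec = parse_submit_args([
    "-docker_image", "centos",
    "-shell_command", "echo Hello Voidbox",
    "-container_memory", "1000",
    "-cpu_shares", "2",
])
dag = as_single_job_dag(spec)
```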

5. Voidbox in Action

At present we can run Docker, MapReduce, Spark and other applications in the same YARN cluster. Many short tasks at Hulu already run on Voidbox.

  • Automated testing process
    • Cooperating with Jenkins, GitLab and a private Docker Registry: when application code is updated, Jenkins runs the automated tests, packages the program, regenerates the Docker image and pushes it to the private Docker Registry. This forms a process of development, testing and automatic release.
  • Complex tasks in parallel
    • Test Framework runs tests that check the availability of certain components. The project is implemented in Ruby/Java and has complex dependencies, so we maintain two layers of Docker images: the first layer is the system software as a base image, and the second layer is the business level. We publish a Test Framework Docker image and use scheduling software to start the Voidbox application at regular intervals. Thanks to Voidbox, issues such as complex dependencies and multitask parallelism are solved.
    • Facematch (link: http://tech.hulu.com/blog/2014/05/03/face-match-system-overview/) is a video analysis application. It is implemented in C and depends on many graphics libraries. Voidbox fits it well: first we package the whole face match program into a Docker image, then write a Voidbox application to process multiple videos. Voidbox solves both the complex machine environment problem and the parallelism control problem.
  • Building complex workflows
    • Some tasks depend on each other. For example, user behavior data must be loaded before it can be analyzed, so the two steps have a sequential dependency. The Voidbox container-based DAG programming model handles this case easily.

6. Differences from DockerContainerExecutor in YARN 2.6.0

  • DockerContainerExecutor (link: https://issues.apache.org/jira/browse/YARN-1964) was released in YARN 2.6.0 as an alpha version. It is not mature enough, and it is only an encapsulation layer above the default executor.
  • DockerContainerExecutor is difficult to run alongside other ContainerExecutors in one YARN cluster.
  • Voidbox features
    • DAG programming model
    • Configurable container-level fault tolerance
    • A variety of running modes, covering both the development environment and the production environment
    • Shares YARN cluster resources with other Hadoop jobs
    • Graphical log viewing tool

7. Future work

  • Support more versions of YARN
    • Voidbox would like to support more versions in the future besides YARN 2.6.0.
  • Voidbox Master fault tolerance: persist metadata to reduce the cost of a retry
    • Currently, if a Voidbox Master crashes, YARN reclaims the resources belonging to this Voidbox application and restarts Voidbox Master, which reruns the tasks from the very beginning. Tasks that are already done or still running should not need to be affected. We might keep some metadata in the State Server to reduce the cost of a Voidbox Master failure.
  • Voidbox Master as a permanent service
    • Voidbox will support a long-running Voidbox Master that receives streaming tasks.
  • Support long-running services
    • Voidbox will support long-running services, provided that Voidbox Master downtime does not affect running tasks.
