hadoop Yarn 编程API

客户端编程库：

所在jar包： org.apache.hadoop.yarn.client.YarnClient

使用方法：

1 定义一个YarnClient实例：

   private YarnClient client；

2 构造一个Yarn客户端句柄并初始化
   this.client = YarnClient.createYarnClient();
   client.ini(conf)
3 启动Yarn
  yarnClient.start()
4 获取一个新的application id
  YarnClientApplication app=yarnClient.createApplication();   注解：application id 封装在YarnCLientApplication里面了。
5 构造ApplicationSubmissionContext， 用以提交作业
  ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
###################注解：一下会有很多set 属性的东东“程序名称，优先级，所在的队列啊，###################################
  ApplicationId appId = appContext.getApplicationId();
  appContext.setApplicationName(appName)
  ......
  yarnClient.submitApplication(appContext);//将应用程序提交到ResouceManager上    这里通过步骤 2 中的conf 读取yarn-site.xml

ApplicationMaster编程酷

设计思路：
为用户暴露‘回调函数’用户只需要实现这些回调函数，当某种事情发生时，会调用对应的（用户实现的）回调函数
回调机制：
1 定义一个回调函数
2 提供函数实现的一方在初始化的时候，将回调函数的函数指针注册给调用者
3 当特殊条件或事件发生的时候，调用者使用函数指针调用回调函数对事件进行处理
回调机制好处：
可以把调用者和被调用者分开，调用者不关心谁是调用者。它只需知道存在一个具有特定原型和限制条件的被调用函数。

/**

* Licensed to the Apache Software Foundation (ASF) under one

* or more contributor license agreements.  See the NOTICE file

* distributed with this work for additional information

* regarding copyright ownership.  The ASF licenses this file

* to you under the Apache License, Version 2.0 (the

* "License"); you may not use this file except in compliance

* with the License.  You may obtain a copy of the License at

*

*     http://www.apache.org/licenses/LICENSE-2.0

*

* Unless required by applicable law or agreed to in writing, software

* distributed under the License is distributed on an "AS IS" BASIS,

* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

* See the License for the specific language governing permissions and

* limitations under the License.

*/

package org.apache.hadoop.yarn.client.api.async;

import java.io.IOException;

import java.util.Collection;

import java.util.List;

import java.util.concurrent.atomic.AtomicInteger;

import org.apache.hadoop.classification.InterfaceAudience.Private;

import org.apache.hadoop.classification.InterfaceAudience.Public;

import org.apache.hadoop.classification.InterfaceStability.Stable;

import org.apache.hadoop.service.AbstractService;

import org.apache.hadoop.yarn.api.protocolrecords.RegisterApplicationMasterResponse;

import org.apache.hadoop.yarn.api.records.Container;

import org.apache.hadoop.yarn.api.records.ContainerId;

import org.apache.hadoop.yarn.api.records.ContainerStatus;

import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;

import org.apache.hadoop.yarn.api.records.NodeReport;

import org.apache.hadoop.yarn.api.records.Priority;

import org.apache.hadoop.yarn.api.records.Resource;

import org.apache.hadoop.yarn.client.api.AMRMClient;

import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

import org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl;

import org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl;

import org.apache.hadoop.yarn.exceptions.YarnException;

import com.google.common.annotations.VisibleForTesting;

/**

 * <code>AMRMClientAsync</code> handles communication with the ResourceManager

 * and provides asynchronous updates on events such as container allocations and

 * completions.  It contains a thread that sends periodic heartbeats to the

 * ResourceManager.

 *

 * It should be used by implementing a CallbackHandler:

 * <pre>

 * {@code

 * class MyCallbackHandler implements AMRMClientAsync.CallbackHandler {

 *   public void onContainersAllocated(List<Container> containers) {

 *     [run tasks on the containers]

 *   }

 *

 *   public void onContainersCompleted(List<ContainerStatus> statuses) {

 *     [update progress, check whether app is done]

 *   }

 *

 *   public void onNodesUpdated(List<NodeReport> updated) {}

 *

 *   public void onReboot() {}

 * }

 * }

 * </pre>

 *

 * The client's lifecycle should be managed similarly to the following:

 *

 * <pre>

 * {@code

 * AMRMClientAsync asyncClient =

 *     createAMRMClientAsync(appAttId, 1000, new MyCallbackhandler());

 * asyncClient.init(conf);

 * asyncClient.start();

 * RegisterApplicationMasterResponse response = asyncClient

 *    .registerApplicationMaster(appMasterHostname, appMasterRpcPort,

 *       appMasterTrackingUrl);

 * asyncClient.addContainerRequest(containerRequest);

 * [... wait for application to complete]

 * asyncClient.unregisterApplicationMaster(status, appMsg, trackingUrl);

 * asyncClient.stop();

 * }

 * </pre>

 */

@Public

@Stable

public abstract class AMRMClientAsync<T extends ContainerRequest>

extends AbstractService {

  protected final AMRMClient<T> client;

  protected final CallbackHandler handler;

  protected final AtomicInteger heartbeatIntervalMs = new AtomicInteger();

  public static <T extends ContainerRequest> AMRMClientAsync<T>

      createAMRMClientAsync(int intervalMs, CallbackHandler callbackHandler) {

    return new AMRMClientAsyncImpl<T>(intervalMs, callbackHandler);

  }

  public static <T extends ContainerRequest> AMRMClientAsync<T>

      createAMRMClientAsync(AMRMClient<T> client, int intervalMs,

          CallbackHandler callbackHandler) {

    return new AMRMClientAsyncImpl<T>(client, intervalMs, callbackHandler);

  }

  protected AMRMClientAsync(int intervalMs, CallbackHandler callbackHandler) {

    this(new AMRMClientImpl<T>(), intervalMs, callbackHandler);

  }

  @Private

  @VisibleForTesting

  protected AMRMClientAsync(AMRMClient<T> client, int intervalMs,

      CallbackHandler callbackHandler) {

    super(AMRMClientAsync.class.getName());

    this.client = client;

    this.heartbeatIntervalMs.set(intervalMs);

    this.handler = callbackHandler;

  }

  public void setHeartbeatInterval(int interval) {

    heartbeatIntervalMs.set(interval);

  }

  public abstract List<? extends Collection<T>> getMatchingRequests(

                                                   Priority priority,

                                                   String resourceName,

                                                   Resource capability);

  /**

   * Registers this application master with the resource manager. On successful

   * registration, starts the heartbeating thread.

   * @throws YarnException

   * @throws IOException

   */

  public abstract RegisterApplicationMasterResponse registerApplicationMaster(

      String appHostName, int appHostPort, String appTrackingUrl)

      throws YarnException, IOException;

  /**

   * Unregister the application master. This must be called in the end.

   * @param appStatus Success/Failure status of the master

   * @param appMessage Diagnostics message on failure

   * @param appTrackingUrl New URL to get master info

   * @throws YarnException

   * @throws IOException

   */

  public abstract void unregisterApplicationMaster(

      FinalApplicationStatus appStatus, String appMessage, String appTrackingUrl)

  throws YarnException, IOException;

  /**

   * Request containers for resources before calling <code>allocate</code>

   * @param req Resource request

   */

  public abstract void addContainerRequest(T req);

  /**

   * Remove previous container request. The previous container request may have

   * already been sent to the ResourceManager. So even after the remove request

   * the app must be prepared to receive an allocation for the previous request

   * even after the remove request

   * @param req Resource request

   */

  public abstract void removeContainerRequest(T req);

  /**

   * Release containers assigned by the Resource Manager. If the app cannot use

   * the container or wants to give up the container then it can release them.

   * The app needs to make new requests for the released resource capability if

   * it still needs it. eg. it released non-local resources

   * @param containerId

   */

  public abstract void releaseAssignedContainer(ContainerId containerId);

  /**

   * Get the currently available resources in the cluster.

   * A valid value is available after a call to allocate has been made

   * @return Currently available resources

   */

  public abstract Resource getAvailableResources();

  /**

   * Get the current number of nodes in the cluster.

   * A valid values is available after a call to allocate has been made

   * @return Current number of nodes in the cluster

   */

  public abstract int getClusterNodeCount();

  public interface CallbackHandler {   注视：这里有一系列的回调函数

    /**

     * Called when the ResourceManager responds to a heartbeat with completed

     * containers. If the response contains both completed containers and

     * allocated containers, this will be called before containersAllocated.

     */

    public void onContainersCompleted(List<ContainerStatus> statuses);

    /**

     * Called when the ResourceManager responds to a heartbeat with allocated

     * containers. If the response containers both completed containers and

     * allocated containers, this will be called after containersCompleted.

     */

    public void onContainersAllocated(List<Container> containers);

    /**

     * Called when the ResourceManager wants the ApplicationMaster to shutdown

     * for being out of sync etc. The ApplicationMaster should not unregister

     * with the RM unless the ApplicationMaster wants to be the last attempt.

     */

    public void onShutdownRequest();

    /**

     * Called when nodes tracked by the ResourceManager have changed in health,

     * availability etc.

     */

    public void onNodesUpdated(List<NodeReport> updatedNodes);

    public float getProgress();

    /**

     * Called when error comes from RM communications as well as from errors in

     * the callback itself from the app. Calling

     * stop() is the recommended action.

     *

     * @param e

     */

    public void onError(Throwable e);

  }

}

用户实现一个MyCallbackHandler，实现AMRMClientAsync.CallbackHandler接口：

class MyCallbackHandler implements AMRMClientAsync.CallbackHandler{

.....................

}

ApplicationMaster编程---AM 与 RM交互

引入jar包， org.apche.hadoop.yarn.client.api.async.AMRMClientAsync;

流程：
1 构造一个MyCallbackHandler对象
  AMRMClientAsync.CallbackHandler allocListener = new MyCallbackHandler();
2 构造一个AMRMClientAsync句柄
  asyncClient = AMRMClientAsync.createAMRMClientAsync(1000, allcoListener);
3 初始化并启动AMRMClientAsync
  asyncClient.init(conf);//通过传入一个YarnConfiguration对象并进行初始化
  asyncClient.start(); //启动asyncClient

4 ApplicationMaster向ResourceManager注册
  RegisterApplicationMasterResponse reponse = asyncClient.registerApplicationMaster(appMasterHostname, appMasterRpcPort, appMasterTrackingUrl);
5 添加Container请求
  asyncClient.addContainerRequest(containerRequest)
6 等待应用程序运行结束
  asyncClient.unregisterApplicationMaster(status, appMsg, null);      [反注册]
  asyncCLient.stop()

/**

 * Licensed to the Apache Software Foundation (ASF) under one

 * or more contributor license agreements.  See the NOTICE file

 * distributed with this work for additional information

 * regarding copyright ownership.  The ASF licenses this file

 * to you under the Apache License, Version 2.0 (the

 * "License"); you may not use this file except in compliance

 * with the License.  You may obtain a copy of the License at

 *

 *     http://www.apache.org/licenses/LICENSE-2.0

 *

 * Unless required by applicable law or agreed to in writing, software

 * distributed under the License is distributed on an "AS IS" BASIS,

 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

 * See the License for the specific language governing permissions and

 * limitations under the License.

 */

package org.apache.hadoop.yarn.client.api.async;

import java.nio.ByteBuffer;

import java.util.Map;

import java.util.concurrent.ConcurrentMap;

import org.apache.hadoop.classification.InterfaceAudience.Private;

import org.apache.hadoop.classification.InterfaceAudience.Public;

import org.apache.hadoop.classification.InterfaceStability.Stable;

import org.apache.hadoop.service.AbstractService;

import org.apache.hadoop.yarn.api.records.Container;

import org.apache.hadoop.yarn.api.records.ContainerId;

import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;

import org.apache.hadoop.yarn.api.records.ContainerStatus;

import org.apache.hadoop.yarn.api.records.NodeId;

import org.apache.hadoop.yarn.api.records.Token;

import org.apache.hadoop.yarn.client.api.NMClient;

import org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl;

import org.apache.hadoop.yarn.client.api.impl.NMClientImpl;

import org.apache.hadoop.yarn.conf.YarnConfiguration;

import com.google.common.annotations.VisibleForTesting;

/**

 * <code>NMClientAsync</code> handles communication with all the NodeManagers

 * and provides asynchronous updates on getting responses from them. It

 * maintains a thread pool to communicate with individual NMs where a number of

 * worker threads process requests to NMs by using {@link NMClientImpl}. The max

 * size of the thread pool is configurable through

 * {@link YarnConfiguration#NM_CLIENT_ASYNC_THREAD_POOL_MAX_SIZE}.

 *

 * It should be used in conjunction with a CallbackHandler. For example

 *

 * <pre>

 * {@code

 * class MyCallbackHandler implements NMClientAsync.CallbackHandler {

 *   public void onContainerStarted(ContainerId containerId,

 *       Map<String, ByteBuffer> allServiceResponse) {

 *     [post process after the container is started, process the response]

 *   }

 *

 *   public void onContainerStatusReceived(ContainerId containerId,

 *       ContainerStatus containerStatus) {

 *     [make use of the status of the container]

 *   }

 *

 *   public void onContainerStopped(ContainerId containerId) {

 *     [post process after the container is stopped]

 *   }

 *

 *   public void onStartContainerError(

 *       ContainerId containerId, Throwable t) {

 *     [handle the raised exception]

 *   }

 *

 *   public void onGetContainerStatusError(

 *       ContainerId containerId, Throwable t) {

 *     [handle the raised exception]

 *   }

 *

 *   public void onStopContainerError(

 *       ContainerId containerId, Throwable t) {

 *     [handle the raised exception]

 *   }

 * }

 * }

 * </pre>

 *

 * The client's life-cycle should be managed like the following:

 *

 * <pre>

 * {@code

 * NMClientAsync asyncClient =

 *     NMClientAsync.createNMClientAsync(new MyCallbackhandler());

 * asyncClient.init(conf);

 * asyncClient.start();

 * asyncClient.startContainer(container, containerLaunchContext);

 * [... wait for container being started]

 * asyncClient.getContainerStatus(container.getId(), container.getNodeId(),

 *     container.getContainerToken());

 * [... handle the status in the callback instance]

 * asyncClient.stopContainer(container.getId(), container.getNodeId(),

 *     container.getContainerToken());

 * [... wait for container being stopped]

 * asyncClient.stop();

 * }

 * </pre>

 */

@Public

@Stable

public abstract class NMClientAsync extends AbstractService {

  protected NMClient client;

  protected CallbackHandler callbackHandler;

  public static NMClientAsync createNMClientAsync(

      CallbackHandler callbackHandler) {

    return new NMClientAsyncImpl(callbackHandler);

  }

  protected NMClientAsync(CallbackHandler callbackHandler) {

    this (NMClientAsync.class.getName(), callbackHandler);

  }

  protected NMClientAsync(String name, CallbackHandler callbackHandler) {

    this (name, new NMClientImpl(), callbackHandler);

  }

  @Private

  @VisibleForTesting

  protected NMClientAsync(String name, NMClient client,

      CallbackHandler callbackHandler) {

    super(name);

    this.setClient(client);

    this.setCallbackHandler(callbackHandler);

  }

  public abstract void startContainerAsync(

      Container container, ContainerLaunchContext containerLaunchContext);

  public abstract void stopContainerAsync(

      ContainerId containerId, NodeId nodeId);

  public abstract void getContainerStatusAsync(

      ContainerId containerId, NodeId nodeId);

  public NMClient getClient() {

    return client;

  }

  public void setClient(NMClient client) {

    this.client = client;

  }

  public CallbackHandler getCallbackHandler() {

    return callbackHandler;

  }

  public void setCallbackHandler(CallbackHandler callbackHandler) {

    this.callbackHandler = callbackHandler;

  }

  /**

   * <p>

   * The callback interface needs to be implemented by {@link NMClientAsync}

   * users. The APIs are called when responses from <code>NodeManager</code> are

   * available.

   * </p>

   *

   * <p>

   * Once a callback happens, the users can chose to act on it in blocking or

   * non-blocking manner. If the action on callback is done in a blocking

   * manner, some of the threads performing requests on NodeManagers may get

   * blocked depending on how many threads in the pool are busy.

   * </p>

   *

   * <p>

   * The implementation of the callback function should not throw the

   * unexpected exception. Otherwise, {@link NMClientAsync} will just

   * catch, log and then ignore it.

   * </p>

   */

  public static interface CallbackHandler {

    /**

     * The API is called when <code>NodeManager</code> responds to indicate its

     * acceptance of the starting container request

     * @param containerId the Id of the container

     * @param allServiceResponse a Map between the auxiliary service names and

     *                           their outputs

     */

    void onContainerStarted(ContainerId containerId,

        Map<String, ByteBuffer> allServiceResponse);

    /**

     * The API is called when <code>NodeManager</code> responds with the status

     * of the container

     * @param containerId the Id of the container

     * @param containerStatus the status of the container

     */

    void onContainerStatusReceived(ContainerId containerId,

        ContainerStatus containerStatus);

    /**

     * The API is called when <code>NodeManager</code> responds to indicate the

     * container is stopped.

     * @param containerId the Id of the container

     */

    void onContainerStopped(ContainerId containerId);

    /**

     * The API is called when an exception is raised in the process of

     * starting a container

     *

     * @param containerId the Id of the container

     * @param t the raised exception

     */

    void onStartContainerError(ContainerId containerId, Throwable t);

    /**

     * The API is called when an exception is raised in the process of

     * querying the status of a container

     *

     * @param containerId the Id of the container

     * @param t the raised exception

     */

    void onGetContainerStatusError(ContainerId containerId, Throwable t);

    /**

     * The API is called when an exception is raised in the process of

     * stopping a container

     *

     * @param containerId the Id of the container

     * @param t the raised exception

     */

    void onStopContainerError(ContainerId containerId, Throwable t);

  }

}

ApplicationMaster编程酷，AM与NM交互酷
用户实现一个MyCallbackHandler，实现NMClientAsync.CallbackHandler接口

class MyCallbackHandler implements NMClientAsync.CallbackHandler{
............
}

引入jar包：
org.apache.hadoop.yarn.client.api.async.NMClientAsync;
org.apache.hadoop.yarn.client.api.async.impl.NMClientAsyncImpl

流程：
1 构造一个NMClientAsync句柄
  NMClientAsync asyncClient = new NMClientAsyncImpl(new MyCallbackhandler())
2 初始化并启动 NMClientAsync
  asyncClient.init(conf);//初始化ansyClient
  asyncClient.start(); //启动asyncClient
3 构造Container
  ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
  ...//设置ctx变量
4 启动Container
  asyncClient.startContainerAsync(container, ctx);
5 获取container状态
  asyncClient.getContainerStatusAsync(container.getId(), container.getNodeId(), container.getContainerToken());
6 停止Container
  asyncClient.stopContainerAsync(container.getId(), container.getNodeId(), container.getContainerToken());
  asyncClient.stop()

hadoop Yarn 编程API的更多相关文章

Hadoop MapReduce编程 API入门系列之挖掘气象数据版本3（九）
不多说,直接上干货! 下面,是版本1. Hadoop MapReduce编程 API入门系列之挖掘气象数据版本1(一) 下面是版本2. Hadoop MapReduce编程 API入门系列之挖掘气象数 ...
Hadoop MapReduce编程 API入门系列之压缩和计数器（三十）
不多说,直接上代码. Hadoop MapReduce编程 API入门系列之小文件合并(二十九) 生成的结果,作为输入源. 代码 package zhouls.bigdata.myMapReduce. ...
Hadoop MapReduce编程 API入门系列之挖掘气象数据版本2（十）
下面,是版本1. Hadoop MapReduce编程 API入门系列之挖掘气象数据版本1(一) 这篇博文,包括了,实际生产开发非常重要的,单元测试和调试代码.这里不多赘述,直接送上代码. MRUni ...
Hadoop Yarn REST API未授权漏洞利用
Hadoop Yarn REST API未授权漏洞利用 Hadoop是一个由Apache基金会所开发的分布式系统基础架构,YARN是hadoop系统上的资源统一管理平台,其主要作用是实现集群资源的统一 ...
Hadoop Yarn REST API未授权漏洞利用挖矿分析
欢迎大家前往腾讯云+社区,获取更多腾讯海量技术实践干货哦~ 一.背景情况 5月5日腾讯云安全曾针对攻击者利用Hadoop Yarn资源管理系统REST API未授权漏洞对服务器进行攻击,攻击者可以在未 ...
Hadoop MapReduce编程 API入门系列之网页排序（二十八）
不多说,直接上代码. Map output bytes=247 Map output materialized bytes=275 Input split bytes=139 Combine inpu ...
Hadoop MapReduce编程 API入门系列之倒排索引（二十四）
不多说,直接上代码. 2016-12-12 21:54:04,509 INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JV ...
Hadoop MapReduce编程 API入门系列之二次排序（十六）
不多说,直接上代码. -- ::, INFO [org.apache.hadoop.metrics.jvm.JvmMetrics] - Initializing JVM Metrics with pr ...
Hadoop MapReduce编程 API入门系列之最短路径（十五）
不多说,直接上代码. ======================================= Iteration: 1= Input path: out/shortestpath/input. ...

随机推荐

保存网页MHT
uses ADODB_TLB, CDO_TLB, ComObj,MSHTML;{$R *.dfm}{能把网页如 WWW.QQ.COM保存为一个单文件 .MHT但不能把一个 A.HTM 保存为一个单文件 ...
android86 监听SD卡状态，勒索软件，监听应用的安装、卸载、更新，无序广播有序广播
* 添加权限 <uses-permission android:name="android.permission.RECEIVE_SMS"/> * 4.0以后广播接收者 ...
vi 快捷键
屏幕翻滚类命令Ctrl+u:向文件首翻半屏Ctrl+d:向文件尾翻半屏Ctrl+f:向文件尾翻一屏Ctrl+b:向文件首翻一屏 shift+g:到尾部 H:跳到第一行.M:跳到中间.L:跳到最后一行. ...
RedHat7搭建MongoDB集群
下载RPM安装包# wget -c -r -N -np -nd -L -nH https://repo.mongodb.org/yum/redhat/7/mongodb-org/stable/x86_ ...
ubuntu自定义服务模板
根据他人代码修改: #!/bin/sh ### BEGIN INIT INFO # Provides: <pragram name> # Required-Start: $local_fs ...
MVC Ajax 提交是防止SCRF攻击
//在View中 <script type="text/javascript"> @functions{ public string ToKenHeaderValue( ...
node安装教程 + git初步
我的系统是win8.1 64位这个是对应的安装包:http://files.cnblogs.com/files/zxyun/node-v0.12.5-x64.zip 安装中有不懂可以参考下面的两 ...
3.SQL*Plus命令
3.1SQL*Plus与数据库的交互主要用来数据库查询和数据处理的工具. 3.2SQL*Plus运行环境设置 3.2.1SET命令概述用户可以使用SET命令设置SQL*Plus的运行环境,SET命 ...
iOS消息推送机制
iOS消息推送的工作机制可以简单的用下图来概括: Provider是指某个iPhone软件的Push服务器,APNS是Apple Push Notification Service的缩写,是苹果的服务 ...
Win32 GDI 非矩形区域剪裁，双缓冲技术
传统的Win32通过GDI提供图形显示的功能,包括了基本的绘图功能,如画线.方块.椭圆等等,高级功能包括了多边形和Bezier的绘制.这样app就不用关心那些图形学的细节了,有点类似于UNIX上的X- ...

hadoop Yarn 编程API

hadoop Yarn 编程API的更多相关文章

随机推荐

热门专题