Big Data Tools: Flume 1.4 Installation and Deployment Guide

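1. Introduction

flume-ng is a distributed, highly reliable, and efficient log collection system. The name refers to the new version of Flume: "ng" stands for "next generation". At the time of writing, flume-ng 1.4 is the latest release. flume-ng differs substantially from the original Flume. I had stayed on Flume 0.9 and never upgraded to flume-ng; a recent project required the upgrade, and I ran into a few problems along the way, which I record here to share.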

2. Version

flume-ng 1.4.0

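3. Installation Steps

Downloading and extracting the package, installing the JDK, and setting environment variables are already covered by many introductory articles, so they are not described here. One point worth noting: flume-ng does not require ZooKeeper, so there is nothing to configure for it.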
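For readers who want a starting point anyway, the environment setup can be sketched as below. The install path is an assumption made for illustration (adjust it to wherever the tarball was extracted); FLUME_HOME is the variable the flume-ng launcher script itself consults.

```shell
# Assumed install location, for illustration only; adjust to your system.
FLUME_HOME="${FLUME_HOME:-/opt/apache-flume-1.4.0-bin}"
export FLUME_HOME

# Put the flume-ng launcher on PATH, but only once.
case ":$PATH:" in
  *":$FLUME_HOME/bin:"*) ;;
  *) PATH="$FLUME_HOME/bin:$PATH" ;;
esac
export PATH
```

Placing these lines in the login profile of the user that runs Flume is one common choice; JAVA_HOME can be exported the same way.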

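4. The flume-ng Bug

After installation, running flume-ng produces an error message. The cause is mainly a problem in the shell script. The complete modified flume-ng script is given below; the line following each #zhangzl comment is one that needs to be changed. The full script: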

#!/bin/bash
#
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.
#

################################
# constants
################################

FLUME_AGENT_CLASS="org.apache.flume.node.Application"
FLUME_AVRO_CLIENT_CLASS="org.apache.flume.client.avro.AvroCLIClient"
FLUME_VERSION_CLASS="org.apache.flume.tools.VersionInfo"
FLUME_TOOLS_CLASS="org.apache.flume.tools.FlumeToolsMain"

# 1 = print info/warn messages, 0 = stay quiet (used by "version")
CLEAN_FLAG=1

################################
# functions
################################

info() {
  if [ ${CLEAN_FLAG} -ne 0 ]; then
    local msg=$1
    echo "Info: $msg" >&2
  fi
}

warn() {
  if [ ${CLEAN_FLAG} -ne 0 ]; then
    local msg=$1
    echo "Warning: $msg" >&2
  fi
}

error() {
  local msg=$1
  local exit_code=$2

  echo "Error: $msg" >&2

  if [ -n "$exit_code" ] ; then
    exit $exit_code
  fi
}

# If avail, add Hadoop paths to the FLUME_CLASSPATH and to the
# FLUME_JAVA_LIBRARY_PATH env vars.
# Requires Flume jars to already be on FLUME_CLASSPATH.
add_hadoop_paths() {
  local HADOOP_IN_PATH=$(PATH="${HADOOP_HOME:-${HADOOP_PREFIX}}/bin:$PATH" \
      which hadoop 2>/dev/null)

  if [ -f "${HADOOP_IN_PATH}" ]; then
    info "Including Hadoop libraries found via ($HADOOP_IN_PATH) for HDFS access"

    # determine hadoop java.library.path and use that for flume
    local HADOOP_CLASSPATH=""
    local HADOOP_JAVA_LIBRARY_PATH=$(HADOOP_CLASSPATH="$FLUME_CLASSPATH" \
        ${HADOOP_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path)

    # look for the line that has the desired property value
    # (considering extraneous output from some GC options that write to stdout)
    # IFS = InternalFieldSeparator (set to recognize only newline char as delimiter)
    IFS=$'\n'
    for line in $HADOOP_JAVA_LIBRARY_PATH; do
      #if [[ $line =~ ^java\.library\.path=(.*)$ ]]; then
      if [[ "$line" =~ "^java\.library\.path=(.*)$" ]]; then
        HADOOP_JAVA_LIBRARY_PATH=${BASH_REMATCH[1]}
        break
      fi
    done
    unset IFS

    if [ -n "${HADOOP_JAVA_LIBRARY_PATH}" ]; then
      FLUME_JAVA_LIBRARY_PATH="$FLUME_JAVA_LIBRARY_PATH:$HADOOP_JAVA_LIBRARY_PATH"
    fi

    # determine hadoop classpath
    HADOOP_CLASSPATH=$($HADOOP_IN_PATH classpath)

    # hack up and filter hadoop classpath
    local ELEMENTS=$(sed -e 's/:/ /g' <<<${HADOOP_CLASSPATH})
    local ELEMENT
    for ELEMENT in $ELEMENTS; do
      local PIECE
      for PIECE in $(echo $ELEMENT); do
        #zhangzl
        if [[ $PIECE =~ "slf4j-(api|log4j12).*\.jar" ]]; then
          info "Excluding $PIECE from classpath"
          continue
        else
          FLUME_CLASSPATH="$FLUME_CLASSPATH:$PIECE"
        fi
      done
    done
  fi
}

add_HBASE_paths() {
  local HBASE_IN_PATH=$(PATH="${HBASE_HOME}/bin:$PATH" \
      which hbase 2>/dev/null)

  if [ -f "${HBASE_IN_PATH}" ]; then
    info "Including HBASE libraries found via ($HBASE_IN_PATH) for HBASE access"

    # determine HBASE java.library.path and use that for flume
    local HBASE_CLASSPATH=""
    local HBASE_JAVA_LIBRARY_PATH=$(HBASE_CLASSPATH="$FLUME_CLASSPATH" \
        ${HBASE_IN_PATH} org.apache.flume.tools.GetJavaProperty \
        java.library.path)

    # look for the line that has the desired property value
    # (considering extraneous output from some GC options that write to stdout)
    # IFS = InternalFieldSeparator (set to recognize only newline char as delimiter)
    IFS=$'\n'
    for line in $HBASE_JAVA_LIBRARY_PATH; do
      #zhangzl
      if [[ $line =~ "^java\.library\.path=(.*)$" ]]; then
        HBASE_JAVA_LIBRARY_PATH=${BASH_REMATCH[1]}
        break
      fi
    done
    unset IFS

    if [ -n "${HBASE_JAVA_LIBRARY_PATH}" ]; then
      FLUME_JAVA_LIBRARY_PATH="$FLUME_JAVA_LIBRARY_PATH:$HBASE_JAVA_LIBRARY_PATH"
    fi

    # determine HBASE classpath
    HBASE_CLASSPATH=$($HBASE_IN_PATH classpath)

    # hack up and filter HBASE classpath
    local ELEMENTS=$(sed -e 's/:/ /g' <<<${HBASE_CLASSPATH})
    local ELEMENT
    for ELEMENT in $ELEMENTS; do
      local PIECE
      for PIECE in $(echo $ELEMENT); do
        #zhangzl
        if [[ $PIECE =~ "slf4j-(api|log4j12).*\.jar" ]]; then
          info "Excluding $PIECE from classpath"
          continue
        else
          FLUME_CLASSPATH="$FLUME_CLASSPATH:$PIECE"
        fi
      done
    done
    FLUME_CLASSPATH="$FLUME_CLASSPATH:$HBASE_HOME/conf"
  fi
}

set_LD_LIBRARY_PATH(){
#Append the FLUME_JAVA_LIBRARY_PATH to whatever the user may have specified in
#flume-env.sh
  if [ -n "${FLUME_JAVA_LIBRARY_PATH}" ]; then
    export LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:${FLUME_JAVA_LIBRARY_PATH}"
  fi
}

display_help() {
  cat <<EOF
Usage: $0 <command> [options]...

commands:
  help                  display this help text
  agent                 run a Flume agent
  avro-client           run an avro Flume client
  version               show Flume version info

global options:
  --conf,-c <conf>      use configs in <conf> directory
  --classpath,-C <cp>   append to the classpath
  --dryrun,-d           do not actually start Flume, just print the command
  --plugins-path <dirs> colon-separated list of plugins.d directories. See the
                        plugins.d section in the user guide for more details.
                        Default: \$FLUME_HOME/plugins.d
  -Dproperty=value      sets a Java system property value
  -Xproperty=value      sets a Java -X option

agent options:
  --conf-file,-f <file> specify a config file (required)
  --name,-n <name>      the name of this agent (required)
  --help,-h             display help text

avro-client options:
  --rpcProps,-P <file>   RPC client properties file with server connection params
  --host,-H <host>       hostname to which events will be sent
  --port,-p <port>       port of the avro source
  --dirname <dir>        directory to stream to avro source
  --filename,-F <file>   text file to stream to avro source (default: std input)
  --headerFile,-R <file> File containing event headers as key/value pairs on each new line
  --help,-h              display help text

  Either --rpcProps or both --host and --port must be specified.

Note that if <conf> directory is specified, then it is always included first
in the classpath.

EOF
}

run_flume() {
  local FLUME_APPLICATION_CLASS

  if [ "$#" -gt 0 ]; then
    FLUME_APPLICATION_CLASS=$1
    shift
  else
    error "Must specify flume application class" 1
  fi

  # trace the final java command unless running in quiet mode
  if [ ${CLEAN_FLAG} -ne 0 ]; then
    set -x
  fi
  $EXEC $JAVA_HOME/bin/java $JAVA_OPTS -cp "$FLUME_CLASSPATH" \
      -Djava.library.path=$FLUME_JAVA_LIBRARY_PATH "$FLUME_APPLICATION_CLASS" $*
}

################################
# main
################################

# set default params
FLUME_CLASSPATH=""
FLUME_JAVA_LIBRARY_PATH=""
JAVA_OPTS="-Xmx20m"
LD_LIBRARY_PATH=""

opt_conf=""
opt_classpath=""
opt_plugins_dirs=""
opt_java_props=""
opt_dryrun=""

mode=$1
shift

case "$mode" in
  help)
    display_help
    exit 0
    ;;
  agent)
    opt_agent=1
    ;;
  node)
    opt_agent=1
    warn "The \"node\" command is deprecated. Please use \"agent\" instead."
    ;;
  avro-client)
    opt_avro_client=1
    ;;
  tool)
    opt_tool=1
    ;;
  version)
    opt_version=1
    CLEAN_FLAG=0
    ;;
  *)
    error "Unknown or unspecified command '$mode'"
    echo
    display_help
    exit 1
    ;;
esac

args=""
while [ -n "$*" ] ; do
  arg=$1
  shift

  case "$arg" in
    --conf|-c)
      [ -n "$1" ] || error "Option --conf requires an argument" 1
      opt_conf=$1
      shift
      ;;
    --classpath|-C)
      [ -n "$1" ] || error "Option --classpath requires an argument" 1
      opt_classpath=$1
      shift
      ;;
    --dryrun|-d)
      opt_dryrun="1"
      ;;
    --plugins-path)
      opt_plugins_dirs=$1
      shift
      ;;
    -D*)
      opt_java_props="$opt_java_props $arg"
      ;;
    -X*)
      opt_java_props="$opt_java_props $arg"
      ;;
    *)
      args="$args $arg"
      ;;
  esac
done

# make opt_conf absolute
if [[ -n "$opt_conf" && -d "$opt_conf" ]]; then
  opt_conf=$(cd $opt_conf; pwd)
fi

# allow users to override the default env vars via conf/flume-env.sh
if [ -z "$opt_conf" ]; then
  warn "No configuration directory set! Use --conf <dir> to override."
elif [ -f "$opt_conf/flume-env.sh" ]; then
  info "Sourcing environment configuration script $opt_conf/flume-env.sh"
  source "$opt_conf/flume-env.sh"
fi

# append command-line java options to stock or env script JAVA_OPTS
if [ -n "${opt_java_props}" ]; then
  JAVA_OPTS="${JAVA_OPTS} ${opt_java_props}"
fi

# prepend command-line classpath to env script classpath
if [ -n "${opt_classpath}" ]; then
  if [ -n "${FLUME_CLASSPATH}" ]; then
    FLUME_CLASSPATH="${opt_classpath}:${FLUME_CLASSPATH}"
  else
    FLUME_CLASSPATH="${opt_classpath}"
  fi
fi

if [ -z "${FLUME_HOME}" ]; then
  FLUME_HOME=$(cd $(dirname $0)/..; pwd)
fi

# prepend $FLUME_HOME/lib jars to the specified classpath (if any)
if [ -n "${FLUME_CLASSPATH}" ] ; then
  FLUME_CLASSPATH="${FLUME_HOME}/lib/*:$FLUME_CLASSPATH"
else
  FLUME_CLASSPATH="${FLUME_HOME}/lib/*"
fi

# load plugins.d directories
PLUGINS_DIRS=""
if [ -n "${opt_plugins_dirs}" ]; then
  PLUGINS_DIRS=$(sed -e 's/:/ /g' <<<${opt_plugins_dirs})
else
  PLUGINS_DIRS="${FLUME_HOME}/plugins.d"
fi

unset plugin_lib plugin_libext plugin_native
for PLUGINS_DIR in $PLUGINS_DIRS; do
  if [[ -d ${PLUGINS_DIR} ]]; then
    for plugin in ${PLUGINS_DIR}/*; do
      if [[ -d "$plugin/lib" ]]; then
        plugin_lib="${plugin_lib}${plugin_lib+:}${plugin}/lib/*"
      fi
      if [[ -d "$plugin/libext" ]]; then
        plugin_libext="${plugin_libext}${plugin_libext+:}${plugin}/libext/*"
      fi
      if [[ -d "$plugin/native" ]]; then
        plugin_native="${plugin_native}${plugin_native+:}${plugin}/native"
      fi
    done
  fi
done

if [[ -n "${plugin_lib}" ]]
then
  FLUME_CLASSPATH="${FLUME_CLASSPATH}:${plugin_lib}"
fi

if [[ -n "${plugin_libext}" ]]
then
  FLUME_CLASSPATH="${FLUME_CLASSPATH}:${plugin_libext}"
fi

if [[ -n "${plugin_native}" ]]
then
  if [[ -n "${FLUME_JAVA_LIBRARY_PATH}" ]]
  then
    FLUME_JAVA_LIBRARY_PATH="${FLUME_JAVA_LIBRARY_PATH}:${plugin_native}"
  else
    FLUME_JAVA_LIBRARY_PATH="${plugin_native}"
  fi
fi

# find java
if [ -z "${JAVA_HOME}" ] ; then
  warn "JAVA_HOME is not set!"
  # Try to use Bigtop to autodetect JAVA_HOME if it's available
  if [ -e /usr/libexec/bigtop-detect-javahome ] ; then
    . /usr/libexec/bigtop-detect-javahome
  elif [ -e /usr/lib/bigtop-utils/bigtop-detect-javahome ] ; then
    . /usr/lib/bigtop-utils/bigtop-detect-javahome
  fi

  # Using java from path if bigtop is not installed or couldn't find it
  if [ -z "${JAVA_HOME}" ] ; then
    JAVA_DEFAULT=$(type -p java)
    [ -n "$JAVA_DEFAULT" ] || error "Unable to find java executable. Is it in your PATH?" 1
    JAVA_HOME=$(cd $(dirname $JAVA_DEFAULT)/..; pwd)
  fi
fi

# look for hadoop libs
add_hadoop_paths
add_HBASE_paths

# prepend conf dir to classpath
if [ -n "$opt_conf" ]; then
  FLUME_CLASSPATH="$opt_conf:$FLUME_CLASSPATH"
fi

set_LD_LIBRARY_PATH

# allow dryrun
EXEC="exec"
if [ -n "${opt_dryrun}" ]; then
  warn "Dryrun mode enabled (will not actually initiate startup)"
  EXEC="echo"
fi

# finally, invoke the appropriate command
if [ -n "$opt_agent" ] ; then
  run_flume $FLUME_AGENT_CLASS $args
elif [ -n "$opt_avro_client" ] ; then
  run_flume $FLUME_AVRO_CLIENT_CLASS $args
elif [ -n "${opt_version}" ] ; then
  run_flume $FLUME_VERSION_CLASS $args
elif [ -n "${opt_tool}" ] ; then
  run_flume $FLUME_TOOLS_CLASS $args
else
  error "This message should never appear" 1
fi

exit 0
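A note on the #zhangzl lines: they put the regular expression on the right of =~ in double quotes. How bash treats that quoting depends on its version: from bash 3.2 onward, a quoted right-hand side is matched as a literal string rather than as a regular expression, so the quoted form presumably suits the older bash the script was fixed on. A version-independent alternative is to keep the pattern in a variable and expand it unquoted. The sketch below is not part of flume-ng; the function names are hypothetical and exist only to show the technique on the two patterns the script uses.

```shell
#!/bin/bash
# Hypothetical helpers (not part of flume-ng): store each pattern in a
# variable and expand it unquoted on the right of =~, so the match is a
# regex match regardless of how the running bash treats quoted patterns.

# Print the value of the first "java.library.path=..." line read from stdin.
extract_library_path() {
  local re='^java\.library\.path=(.*)$'
  local line
  while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
      printf '%s\n' "${BASH_REMATCH[1]}"
      return 0
    fi
  done
  return 1
}

# Succeed when the classpath entry is an slf4j-api or slf4j-log4j12 jar.
is_slf4j_jar() {
  local re='slf4j-(api|log4j12).*\.jar'
  [[ $1 =~ $re ]]
}
```

If the script's classpath filtering or java.library.path detection seems to have no effect on a newer system, swapping the quoted patterns for this form is one thing to try.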

5. Test Configuration File

Create a file named example-conf.properties in the conf directory with the following properties:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = avro
a1.sources.r1.bind = localhost
a1.sources.r1.port = <port>

# Describe the sink
# Write the data to the log
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = <capacity>
a1.channels.c1.transactionCapacity = <transactionCapacity>

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
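The last two lines are where a configuration most easily goes wrong: a source lists one or more channels under `channels`, a sink names exactly one under `channel`, and every name used must be declared in `a1.channels`. As a rough aid, the wiring can be checked with a small script. This is a sketch only; `check_channels` is a hypothetical helper, not something Flume provides, and it assumes the simple one-property-per-line layout shown above.

```shell
# check_channels FILE AGENT
# Hypothetical helper: report channels that a source or sink of AGENT
# refers to but that are not declared in "AGENT.channels".
check_channels() {
  local conf=$1
  local agent=$2
  local declared used ch status=0

  # Channels declared for the agent, padded with spaces for whole-word tests.
  declared=" $(sed -n "s/^[[:space:]]*${agent}\.channels[[:space:]]*=[[:space:]]*//p" "$conf") "

  # Channels referenced by sources (.channels) and sinks (.channel).
  used=$(sed -n \
    -e "s/^[[:space:]]*${agent}\.sources\.[^.]*\.channels[[:space:]]*=[[:space:]]*//p" \
    -e "s/^[[:space:]]*${agent}\.sinks\.[^.]*\.channel[[:space:]]*=[[:space:]]*//p" \
    "$conf")

  for ch in $used; do
    case "$declared" in
      *" $ch "*) ;;
      *) echo "undeclared channel: $ch" >&2; status=1 ;;
    esac
  done
  return $status
}
```

Usage would be along the lines of `check_channels example-conf.properties a1`, which returns 0 when every referenced channel is declared.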

6. Running the Commands

6.1 Start the agent

[hadoop@hadoop1 conf]$ flume-ng agent -n a1 -f example-conf.properties

6.2 Start the avro-client to send data to the agent (this must be done in a separate terminal window)

[hadoop@hadoop1 conf]$ flume-ng avro-client -H localhost -p <port> -F file01
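The two steps can be prepared as in the sketch below, run from the conf directory. Two things here are assumptions rather than part of the original commands: the port value, which is only illustrative and must match a1.sources.r1.port, and the extra `--conf .` and `-Dflume.root.logger=INFO,console` options, which point the launcher at the configuration directory and send log output to the console. The commands are printed rather than executed so the sketch runs without Flume installed.

```shell
# Assumed value for illustration; it must match a1.sources.r1.port.
AVRO_PORT="${AVRO_PORT:-41414}"

# Input file for the avro-client; each line is sent as one event.
printf 'hello world\n' > file01

# Terminal 1: start the agent.
echo "flume-ng agent --conf . -n a1 -f example-conf.properties -Dflume.root.logger=INFO,console"

# Terminal 2: stream file01 to the agent's avro source.
echo "flume-ng avro-client -H localhost -p $AVRO_PORT -F file01"
```

Adding `--dryrun` to either command makes the launcher print the java command line it would run instead of starting Flume, which is a quick way to inspect the resulting classpath.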

7. Checking the Results

<timestamp> INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:<client-port> => /127.0.0.1:<port>] OPEN
<timestamp> INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:<client-port> => /127.0.0.1:<port>] BOUND: /127.0.0.1:<port>
<timestamp> INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:<client-port> => /127.0.0.1:<port>] CONNECTED: /127.0.0.1:<client-port>
<timestamp> INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:<client-port> :> /127.0.0.1:<port>] DISCONNECTED
<timestamp> INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:<client-port> :> /127.0.0.1:<port>] UNBOUND
<timestamp> INFO ipc.NettyServer: [id: 0x0100c7e4, /127.0.0.1:<client-port> :> /127.0.0.1:<port>] CLOSED
<timestamp> INFO ipc.NettyServer: Connection to /127.0.0.1:<client-port> disconnected.
<timestamp> INFO sink.LoggerSink: Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64                hello world }
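In the last line, the logger sink shows the event body as hexadecimal bytes followed by the printable text. To confirm that a hex dump corresponds to the line that was sent, the same bytes can be produced locally. The helper below is a hypothetical convenience for that comparison, not a Flume tool.

```shell
# body_hex STRING
# Hypothetical helper: print the bytes of STRING as space-separated,
# upper-case hex, the same notation as the "body:" field in the log.
body_hex() {
  printf '%s' "$1" \
    | od -An -tx1 \
    | tr -d '\n' \
    | tr 'a-f' 'A-F' \
    | sed 's/^ *//; s/  */ /g; s/ *$//'
}

body_hex 'hello world'
```

For the input "hello world" this prints 68 65 6C 6C 6F 20 77 6F 72 6C 64, matching the body of the logged event.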
