1、下载flume,解压到自建文件夹

2、修改flume-env.sh文件

  在文件中添加JAVA_HOME

3、修改flume.conf 文件(原名好像不叫这个,我自己把模板名改了)

里面我自己配的(具体配置参见 http://flume.apache.org/FlumeUserGuide.html)

agent1 是我的代理名称

source是netcat (数据源)

channel 是memory(内存)

sink是hdfs(输出)

注意配置中添加

agent1.sinks.k1.hdfs.fileType = DataStream

否则hdfs中接收的文件会出现乱码

如果要配置根据时间来分类写入hdfs的功能,要求传入的文件必须要有时间戳(datastamp)

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License. # The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agent' agent1.sources = r1
agent1.channels = c1
agent1.sinks = k1 # For each one of the sources, the type is defined
agent1.sources.r1.type = netcat
agent1.sources.r1.channels = c1
#agent1.sources.r1.ack-every-event = false
agent1.sources.r1.max-line-length = 100
agent1.sources.r1.bind = 192.168.19.107
agent1.sources.r1.port = 44445 # Describe/configure the interceptor #agent1.sources.r1.interceptors = i1
#agent1.sources.r1.interceptors.i1.type = com.nd.bigdata.insight.interceptor.KeyTimestampForKafka$Builder # Each sink's type must be defined
#agent1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
#agent1.sinks.k1.topic = insight-test6
#agent1.sinks.k1.brokerList =192.168.181.120:9092,192.168.181.121:9092,192.168.181.66:9092
#agent1.sinks.k1.batchSize=100
#agent1.sinks.k1.requiredAcks = 0 # logger
#agent1.sinks.k1.type = logger
#agent1.sinks.k1.channel = c1 #HDFS
agent1.sinks.k1.type = hdfs
agent1.sinks.k1.channel = c1
agent1.sinks.k1.hdfs.path = test/flume/events/
agent1.sinks.k1.hdfs.filePrefix = events-
agent1.sinks.k1.hdfs.round = true
agent1.sinks.k1.hdfs.roundValue = 10
agent1.sinks.k1.hdfs.roundUnit = minute
agent1.sinks.k1.hdfs.fileType = DataStream # Each channel's type is defined.
agent1.channels.c1.type = memory
agent1.channels.c1.capacity = 1000
"flume.conf" 70L, 2291C

4、启动flume:(文件根目录下启动)

bin/flume-ng agent --conf conf --conf-file conf/flume.conf --name agent1 -Dflume.root.logger=INFO,console(里面的flume.conf ,agent1 请替换成你自己的名字)

5、用另一台机试着发送文件

telnet 192.168.19.107  44445  (创建连接)

然后发送内容

6、生成的hdfs文件

flume 配置与使用的更多相关文章

  1. 关于flume配置加载(二)

    为什么翻flume的代码,一方面是确实遇到了问题,另一方面是想翻一下flume的源码,看看有什么收获,现在收获还谈不上,因为要继续总结.不够已经够解决问题了,而且确实有好的代码,后续会继续慢慢分享,这 ...

  2. flume 配置

    [root@dtpweb data]#tar -zxvf apache-flume-1.7.0-bin.tar.gz[root@dtpweb conf]# cp flume-env.sh.templa ...

  3. 关于flume配置加载

    最近项目在用到flume,因此翻了下flume的代码, 启动脚本: nohup bin/flume-ng agent -n tsdbflume -c conf -f conf/配置文件.conf -D ...

  4. Flume配置Replicating Channel Selector

    1 官网内容 上面的配置是r1获取到的内容会同时复制到c1 c2 c3 三个channel里面 2 详细配置信息 # Name the components on this agent a1.sour ...

  5. Flume配置Multiplexing Channel Selector

    1 官网内容 上面配置的是根据不同的heder当中state值走不同的channels,如果是CZ就走c1 如果是US就走c2 c3 其他默认走c4 2 我的详细配置信息 一个监听http端口 然后 ...

  6. hadoop生态搭建(3节点)-09.flume配置

    # http://archive.apache.org/dist/flume/1.8.0/# ===================================================== ...

  7. flume配置和说明(转)

    Flume是什么 收集.聚合事件流数据的分布式框架 通常用于log数据 采用ad-hoc方案,明显优点如下: 可靠的.可伸缩.可管理.可定制.高性能 声明式配置,可以动态更新配置 提供上下文路由功能 ...

  8. flume配置参数的意义

    1.监控端口数据: flume启动: [bingo@hadoop102 flume]$ bin/flume-ng agent --conf conf/ --name a1 --conf-file jo ...

  9. Flume配置Failover Sink Processor

    1 官网内容 2 看一张图一目了然 3 详细配置 source配置文件 #配置文件: a1.sources= r1 a1.sinks= k1 k2 a1.channels= c1 #负载平衡 a1.s ...

随机推荐

  1. 洛谷 P3216 [HNOI2011]数学作业

    最近学了矩阵,kzj大佬推荐了我这一道题目. 乍一眼看上去,没看出是矩阵,就随便打了一个暴力,30分. 然后仔细分析了一波,发现蛮简单的. 结果全wa了,先看看下面的错误分析吧! 首先,设f[n]为最 ...

  2. spring mvc 基本原理

    在web.xml配置spring mvc入口servlet: <servlet> <servlet-name>mvc-dispatcher</servlet-name&g ...

  3. SUBMIT 用法

    [转自http://lz357502668.blog.163.com/blog/static/16496743201241195817597/] 1.最普通的用法 *Code used to exec ...

  4. pandas to_datetime()

    >>> import pandas as pd >>> i = pd.date_range() >>> df = pd.DataFrame(dic ...

  5. Datanode启动问题 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering>

    -- ::, INFO org.apache.hadoop.hdfs.server.datanode.DataNode: supergroup = supergroup -- ::, INFO org ...

  6. python作用域和js作用域的比较

    1.python和js一样,作用域链在执行方法之前就已经创建了 # 下面的执行结果就是aa,原因是这点python和js一样,作用域链已经创建了,不会去改变 xo="aa" def ...

  7. java入门了解12

    1.SequenceInputStream序列流:能将其他输入流的串联 用处:读完第一个再去读第二个输入流 用法:构造方法:SequenceInputStream(InputStream s1,Inp ...

  8. 算法(Algorithms)第4版 练习 2.1.4

    E A S Y Q U E S T I O N A E S Y Q U E S T I O N A E S Y Q U E S T I O N A E S Y Q U E S T I O N A E ...

  9. 大话设计模式--装饰者模式 Decorator -- C++实现实例

    1.装饰者模式 Decorator 动态地给一个对象添加一个额外的职责, 就添加功能来说, 装饰模式比生成子类更为灵活. 每个装饰对象的实现和如何使用这个对象分离,  每个装饰对象只关心自己的功能,不 ...

  10. Jquery Uploadify多文件上传实例

    jQuery Uploadify开发使用的语言是java. 详细的相关文档,可以参考官网的doc:http://www.uploadify.com/documentation/ 官网的讲解还是很详细的 ...