There isn't much material about this online, so I'm sharing my notes :)

For the simple pattern (Kafka feeding data into Flink, which then stores it in MySQL) you can refer to:

利用Flink stream从kafka中写数据到mysql (writing data from Kafka into MySQL with a Flink stream)

Maven dependencies:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.xxr</groupId>
<artifactId>flink</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<flink.version>1.4.1</flink.version>
</properties>
<build>
<pluginManagement>
<plugins>
<!-- Set the Java compiler version to 1.8 -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.8</source>
<target>1.8</target>
<encoding>UTF-8</encoding>
<compilerArgs>
<arg>-extdirs</arg>
<arg>${project.basedir}/src/lib</arg>
</compilerArgs>
</configuration>
</plugin>
<!-- Package an executable fat jar with maven-assembly -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.5.5</version>
<configuration>
<archive>
<manifest>
<mainClass>com.xxr.flink.stream_sql</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.8.0-beta1</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.8.0-beta1</version>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.17</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>2.11.1</version>
</dependency>
<dependency>
<groupId>org.scala-lang.modules</groupId>
<artifactId>scala-xml_2.11</artifactId>
<version>1.0.2</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_2.11</artifactId>
<version>${flink.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.11</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.11</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-core -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-core</artifactId>
<version>${flink.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-runtime -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-runtime_2.11</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-wikiedits_2.11</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka-0.8_2.11</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table_2.11</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-jdbc</artifactId>
<version>${flink.version}</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.39</version>
</dependency>
</dependencies>
</project>

Configuration constants and the SQL statements; the time window is one minute:

public class JDBCTestBase {
//Every minute, compute the maximum of num within that one-minute window, using rowtime as the time attribute
public static final String SQL_MAX = "SELECT MAX(num) ,TUMBLE_END(rowtime, INTERVAL '1' minute) as wEnd FROM wiki_table group by TUMBLE(rowtime, interval '1' minute)";
public static final String SQL_AVG = "SELECT AVG(num) ,TUMBLE_END(rowtime, INTERVAL '1' minute) as wEnd FROM wiki_table group by TUMBLE(rowtime, interval '1' minute)";
public static final String SQL_MIN = "SELECT MIN(num) ,TUMBLE_END(rowtime, INTERVAL '1' minute) as wEnd FROM wiki_table group by TUMBLE(rowtime, interval '1' minute)";
public static final String kafka_group = "test-consumer-group";
public static final String kafka_zookper = "localhost:2181";
public static final String kafka_hosts = "localhost:9092";
public static final String kafka_topic = "wiki-result";
public static final String DRIVER_CLASS = "com.mysql.jdbc.Driver";
public static final String DB_URL = "jdbc:mysql://localhost:3306/db?user=user&password=password";
}

Create the MySQL table (a name column is added here so the custom WikiSQLSink shown later also works; the JDBC table sink only fills avg and time):

CREATE TABLE wiki (
Id int(11) NOT NULL AUTO_INCREMENT,
name varchar(255) DEFAULT NULL,
avg double DEFAULT NULL,
time timestamp NULL DEFAULT NULL,
PRIMARY KEY (Id)
);

The data is sent to Kafka by the WikipediaAnalysis producer from the Flink examples:

Monitoring the Wikipedia Edit Stream

import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer08;
import org.apache.flink.streaming.connectors.wikiedits.WikipediaEditEvent;
import org.apache.flink.streaming.connectors.wikiedits.WikipediaEditsSource;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.api.common.functions.FoldFunction;
import org.apache.flink.api.common.functions.MapFunction;

public class WikipediaAnalysis {
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment see = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<WikipediaEditEvent> edits = see.addSource(new WikipediaEditsSource());
KeyedStream<WikipediaEditEvent, String> keyedEdits = edits
.keyBy(new KeySelector<WikipediaEditEvent, String>() {
@Override
public String getKey(WikipediaEditEvent event) {
return event.getUser();
}
});

DataStream<Tuple3<String, Long, Long>> result = keyedEdits
.timeWindow(Time.seconds(10))
.fold(new Tuple3<>("", 0L,0L), new FoldFunction<WikipediaEditEvent, Tuple3<String, Long,Long>>() {
@Override
public Tuple3<String, Long,Long> fold(Tuple3<String, Long,Long> acc, WikipediaEditEvent event) {
acc.f0 = event.getUser().trim();
acc.f1 += event.getByteDiff();
acc.f2 = System.currentTimeMillis();
return acc;
}
});

result
.map(new MapFunction<Tuple3<String,Long,Long>, String>() {
@Override
public String map(Tuple3<String, Long,Long> tuple) {
return tuple.toString();
}
})
.addSink(new FlinkKafkaProducer08<>("localhost:9092", "wiki-result", new SimpleStringSchema()));
result.print();
see.execute();
}
}
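Each Kafka message is simply the Tuple3's toString(), so a record on the wiki-result topic looks roughly like the line below (user, accumulated byte diff, wall-clock millis; the values are made up for illustration):

(SomeUser,1024,1523456789000)

The SQL job further down strips the parentheses and splits on commas to turn each message back into a tuple.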

Implement a custom RichSinkFunction for writing into MySQL:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class WikiSQLSink extends RichSinkFunction<Tuple3<String, Long, Long>> {

private static final long serialVersionUID = 1L;
private Connection connection;
private PreparedStatement preparedStatement;
String drivername = JDBCTestBase.DRIVER_CLASS;
String dburl = JDBCTestBase.DB_URL;

@Override
public void invoke(Tuple3<String, Long, Long> value) throws Exception {
// Opening a connection per record keeps the example short; in a real job the
// connection belongs in the open()/close() lifecycle methods of RichSinkFunction.
Class.forName(drivername);
connection = DriverManager.getConnection(dburl);
String sql = "INSERT into wiki(name,avg,time) values(?,?,?)";
preparedStatement = connection.prepareStatement(sql);
preparedStatement.setString(1, value.f0);
preparedStatement.setLong(2, value.f1);
preparedStatement.setLong(3, value.f2);
preparedStatement.executeUpdate();
if (preparedStatement != null) {
preparedStatement.close();
}
if (connection != null) {
connection.close();
}
}
}
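The tuple order here matches the (user, byteDiff, timestamp) tuples produced by WikipediaAnalysis, so in the simple no-SQL variant you could skip Kafka entirely and attach this sink to that job's result stream. A minimal sketch of that wiring:

// Hypothetical: in WikipediaAnalysis.main(), instead of (or in addition to) the Kafka producer sink
result.addSink(new WikiSQLSink());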

The streaming job itself uses EventTime, aggregates the stream with a SQL query, and writes the result into MySQL. The SQL syntax is provided by another open-source project that Flink builds on, Apache Calcite.

The time-attribute handling is documented here:

https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/streaming.html#time-attributes

package com.xxr.flink;

import java.sql.Timestamp;
import java.util.Date;
import java.util.Properties;
import java.util.concurrent.TimeUnit;

import org.apache.commons.lang3.StringUtils;
import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.io.jdbc.JDBCAppendTableSink;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.api.java.tuple.Tuple4;
import org.apache.flink.core.fs.FileSystem.WriteMode;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.TimestampAssigner;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.WindowedTable;
import org.apache.flink.table.api.java.StreamTableEnvironment;
//Time attributes:
//https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/streaming.html#event-time
//Concepts & Common API:
//https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/common.html#register-a-table
//SQL syntax:
//https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html
public class stream_sql {
public static void main(String[] args) throws Exception {
Properties pro = new Properties();
pro.put("bootstrap.servers", JDBCTestBase.kafka_hosts);
pro.put("zookeeper.connect", JDBCTestBase.kafka_zookper);
pro.put("group.id", JDBCTestBase.kafka_group);
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
StreamTableEnvironment tableEnv = TableEnvironment.getTableEnvironment(env);
// env.getConfig().disableSysoutLogging(); // uncomment to suppress job output in the logs
env.getConfig().setRestartStrategy(RestartStrategies.fixedDelayRestart(4, 10000));
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
env.enableCheckpointing(5000);

DataStream<String> sourceStream = env
.addSource(new FlinkKafkaConsumer08<String>(JDBCTestBase.kafka_topic, new SimpleStringSchema(), pro));

DataStream<Tuple3<Long, String, Long>> sourceStreamTra = sourceStream.filter(new FilterFunction<String>() {
@Override
public boolean filter(String value) throws Exception {
return StringUtils.isNotBlank(value);
}
}).map(new MapFunction<String, Tuple3<Long, String, Long>>() {
@Override
public Tuple3<Long, String, Long> map(String value) throws Exception {
String temp = value.replaceAll("(\\(|\\))", "");
String[] args = temp.split(",");
try {
return new Tuple3<Long, String, Long>(Long.valueOf(args[2]), args[0].trim(), Long.valueOf(args[1]));
} catch (Exception e) {
e.printStackTrace();
return new Tuple3<Long, String, Long>(System.currentTimeMillis(), args[0].trim(), 0L);
}
}
});
//Assign timestamps and watermarks, i.e. choose which field drives the event time
DataStream<Tuple3<Long, String, Long>> withTimestampsAndWatermarks = sourceStreamTra
.assignTimestampsAndWatermarks(new FirstTandW());
//The built-in rowtime.rowtime attribute exposes the EventTime; proctime.proctime would expose the ProcessingTime
tableEnv.registerDataStream("wiki_table", withTimestampsAndWatermarks, "etime, name, num, rowtime.rowtime");
withTimestampsAndWatermarks.print();
// define sink for room data and execute query
JDBCAppendTableSink sink = JDBCAppendTableSink.builder().setDrivername(JDBCTestBase.DRIVER_CLASS)
.setDBUrl(JDBCTestBase.DB_URL).setQuery("INSERT INTO wiki (avg,time) VALUES (?,?)")
.setParameterTypes(Types.LONG, Types.SQL_TIMESTAMP).build();
//Run the query
Table result = tableEnv.sqlQuery(JDBCTestBase.SQL_MIN);
//Alternatively, write the result to CSV:
// result.writeToSink(new CsvTableSink("D:/a.csv", // output path
// "|", // optional: delimit files by '|'
// 1, // optional: write to a single file
// WriteMode.OVERWRITE)); // optional: override existing files
//Write the result to the database
result.writeToSink(sink);

env.execute();
}
}

Implement AssignerWithPeriodicWatermarks to assign timestamps and emit watermarks. This is required when the time characteristic is EventTime; with ProcessingTime it can be skipped:

import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

public class FirstTandW implements AssignerWithPeriodicWatermarks<Tuple3<Long, String, Long>> {

private final long maxOutOfOrderness = 3500; // 3.5 seconds
private long currentMaxTimestamp;
@Override
public long extractTimestamp(Tuple3<Long, String, Long> element, long previousElementTimestamp) {
long timestamp = element.f0;
currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
return timestamp;
}

@Override
public Watermark getCurrentWatermark() {
// The watermark trails the largest timestamp seen so far by maxOutOfOrderness
return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
}
}
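Flink also ships a ready-made base class for exactly this "bounded out-of-orderness" pattern, so if you prefer not to hand-roll the watermark logic, a subclass of BoundedOutOfOrdernessTimestampExtractor does the same job. A minimal sketch with the same 3.5-second slack (the class name is just an illustration):

import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

public class BoundedTandW extends BoundedOutOfOrdernessTimestampExtractor<Tuple3<Long, String, Long>> {

private static final long serialVersionUID = 1L;

public BoundedTandW() {
super(Time.milliseconds(3500)); // accept events that arrive up to 3.5 s late
}

@Override
public long extractTimestamp(Tuple3<Long, String, Long> element) {
return element.f0; // the first field carries the event timestamp
}
}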

Build the fat jar with maven-assembly and submit it to Flink; if you are not sure how to package it, see my other blog post.
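For reference, building and running boils down to roughly the following; the jar name follows the artifactId and version in the pom above, so adjust the path to your setup:

mvn clean package assembly:single
flink run target/flink-0.0.1-SNAPSHOT-jar-with-dependencies.jar

Once the job is running you should see one row per one-minute window in MySQL. Note that with SQL_MIN the avg column actually holds the per-window minimum, because the INSERT statement in the sink reuses the same two columns for every query. A made-up example of the result:

SELECT Id, name, avg, time FROM wiki;
-- 1 | NULL | 12 | 2018-04-11 10:01:00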

Background reading:

Flink 的Window 操作 (Flink's window operations)

Flink流计算编程--在WindowedStream中体会EventTime与ProcessingTime (EventTime vs. ProcessingTime on a WindowedStream)

The Flink documentation is actually quite good. I didn't read it carefully at first and ran into a few pitfalls.

git:https://github.com/xxrznj/flink-kafka-sql
