Elasticsearch Quick Start: Using Elasticsearch from Spring Boot and Syncing MySQL Data to Elasticsearch with Logstash

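This post mainly points you in the right direction; many places are not explained in detail. When I was learning, I did not know where to start and spent most of my time looking for an entry point, so you can follow these pointers and look up the resources yourself.

If your project is not a Spring Boot project, just pick the client dependency whose version matches your Elasticsearch version.

Example: if Elasticsearch is version 5.4.0, then

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>transport</artifactId>
            <version>5.4.0</version>
        </dependency>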

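The rest of this post covers integrating Elasticsearch with Spring Boot and syncing data between MySQL and Elasticsearch.

My versions (I have used this setup myself and confirmed that it works):

  1. OS: Windows 10

  2. Elasticsearch: 7.0.0

  3. IK analyzer: 7.0.0 (must be the same version as Elasticsearch)

  4. Postman: no version constraint (used to view the data in the indices). Unless you want to learn the query syntax you can do without it, because elasticsearch-head-master lets you switch to its data browser tab to view the data, and anything you can do with a tool like Postman can also be done through the elasticsearch-head-master GUI. Similar tools include curl and Kibana; Kibana has to match the Elasticsearch version, which makes it more of a hassle. (The download is hosted abroad; an ordinary download took me a whole day, so feel free to skip it. It makes no difference to the code we write.)

  5. elasticsearch-head-master: no version (you download the source and have to build it). To start it, first start Elasticsearch (open localhost:9200 in a browser; a JSON document with information about Elasticsearch means it is running), then either:

        Option 1: in cmd, change to the project folder and run npm run start, then open localhost:9100 in the browser.

        Option 2: simply double-click index.html in the folder.

  6. Spring Boot: 2.2.4.RELEASE

  If you are unsure about basic Elasticsearch installation and usage, a web search will help. Now let's get started.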
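The version constraints above (the IK analyzer must be exactly the same version as Elasticsearch) can be checked mechanically. The following is a minimal sketch, assuming you have copied the JSON that localhost:9200 returns into a string. The class name `VersionCheck`, the shortened sample JSON, and the regex-based extraction are illustrative assumptions and not part of any Elasticsearch API.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: compares the Elasticsearch version with a plugin version. */
class VersionCheck {

    // Matches the "number" : "7.0.0" entry in the JSON served at localhost:9200.
    private static final Pattern NUMBER =
            Pattern.compile("\"number\"\\s*:\\s*\"([^\"]+)\"");

    /** Extracts the version number from the root-endpoint JSON, if present. */
    static Optional<String> extractVersion(String rootJson) {
        Matcher m = NUMBER.matcher(rootJson);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** True only when the plugin version equals the server version exactly. */
    static boolean pluginMatches(String rootJson, String pluginVersion) {
        return extractVersion(rootJson).map(pluginVersion::equals).orElse(false);
    }

    public static void main(String[] args) {
        // Shortened, illustrative sample of the JSON shown in the browser.
        String sample = "{ \"name\" : \"node-1\", \"version\" : { \"number\" : \"7.0.0\" } }";
        System.out.println("IK 7.0.0 matches: " + pluginMatches(sample, "7.0.0"));
        System.out.println("IK 6.8.6 matches: " + pluginMatches(sample, "6.8.6"));
    }
}
```

A regex is used here only to keep the sketch free of dependencies; in a real project a JSON library would be the more robust choice.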

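Getting started: these are the integration approaches I know of

  1. Use Logstash, the official tool from the Elasticsearch vendor, to sync data between MySQL and Elasticsearch. This is the approach described below. Constraint: Logstash must be the same version as Elasticsearch. It can be downloaded from the Elasticsearch website (the download is also slow; I downloaded 7.0.0, which is over 100 MB and took a whole morning).

  2. Use the open-source project Bboss. Its official claim is that it is compatible with every Elasticsearch version, so version problems should not arise. You can head over to its "Getting Started" page; the explanations there are very detailed.

  3. Use elasticsearch-river-jdbc. Not recommended. There are two more tools like it, also not recommended, because the Elasticsearch versions they support are too old. This one seems to support up to roughly version 6.0, and one of the other two only supports around Elasticsearch 2.0. I no longer remember the details, so I will not cover them; look them up if you are interested.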

Using Elasticsearch from Spring Boot and syncing MySQL and Elasticsearch data with Logstash:

1. Using Elasticsearch from Spring Boot:

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>

<!-- The version follows the Spring Boot parent, 2.2.4.RELEASE by default. It pulls in spring-data-elasticsearch 3.2.4.RELEASE,
which in turn pulls in elasticsearch-rest-high-level-client and transport 6.8.6 -->

For a detailed walkthrough of the integration, refer to this article

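2. Download Logstash. I chose the zip archive and unpacked it. Do not put it under a path that contains Chinese characters (development tools like this are best kept under plain English paths; I once also hit an error because an installation path contained symbols).

  1. Checking that the installation works: I have forgotten the command. It is not important, so let's move straight on.

  2. Go to Logstash's bin directory and create a jdbc.conf file there. The file name is up to you, as long as the cmd command below uses the same name. In cmd, change into the bin folder and run: logstash -f jdbc.conf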

input {
    jdbc {
      jdbc_connection_string => "jdbc:mysql://localhost:3306/seven_forum?useUnicode=true&characterEncoding=UTF-8&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=UTC"
      jdbc_user => "root"
      jdbc_password => "root"
      jdbc_driver_library => "D:\softwareRepository\logstash-7.0.0\config\test-config\mysql-connector-java-5.1.46.jar"
      jdbc_driver_class => "com.mysql.jdbc.Driver"

      # Do not lowercase column names; this relates to the column aliases in the SQL, explained later
      lowercase_column_names => false

      # Location of the file that holds the SQL query. Alternatively the SQL can be written inline
      # with the statement option, in which case statement_filepath is not used
      statement_filepath => "D:\softwareRepository\logstash-7.0.0\config\test-config\postBarInfo.sql"

      # Update the data once every minute
      schedule => "* * * * *"
    }

    # A second jdbc input, so that two tables are synced at the same time
    jdbc {
      jdbc_connection_string => "jdbc:mysql://localhost:3306/seven_forum?useUnicode=true&characterEncoding=UTF-8&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=UTC"
      jdbc_user => "root"
      jdbc_password => "root"
      jdbc_driver_library => "D:\softwareRepository\logstash-7.0.0\config\test-config\mysql-connector-java-5.1.46.jar"
      jdbc_driver_class => "com.mysql.jdbc.Driver"
      lowercase_column_names => false
      statement_filepath => "D:\softwareRepository\logstash-7.0.0\config\test-config\postInfo.sql"
      schedule => "* * * * *"
    }
}

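filter {

    # Honestly, I am not sure what this one is for. (As far as I can tell, the json filter parses the
    # "message" field as JSON; rows from the jdbc input normally carry no "message" field,
    # so it probably has no effect here.)
    json {
        source => "message"
        remove_field => ["message"]
    }
}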

output {

    # Routing: if postBarStatus is 1 the event goes into the index "post_bar_info";
    # if postStatus is 1 it goes into the index "post_info"
    if [postBarStatus] == 1 {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "post_bar_info"

            # Document id. If it is not set, Elasticsearch uses an auto-generated id instead.
            # Here the postBarId column of the query becomes "_id". When two events have
            # the same "_id", the later one overwrites the earlier one
            document_id => "%{postBarId}"
        }
    }
    if [postStatus] == 1 {
        elasticsearch {
            hosts => ["localhost:9200"]
            index => "post_info"
            document_id => "%{postId}"
        }
    }

    stdout {
        codec => json_lines
    }
}
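To make the routing and the role of document_id in the output section easier to picture, here is a minimal in-memory sketch. It is not Logstash code: the class `OutputRouting` and its nested maps are hypothetical stand-ins for Elasticsearch, and only the index names, field names and conditions are taken from the configuration above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the output section; not Logstash code. */
class OutputRouting {

    /** Index name -> (document id -> document), standing in for Elasticsearch. */
    final Map<String, Map<String, Map<String, Object>>> indices = new LinkedHashMap<>();

    /** Mirrors the two independent `if` blocks of the output section. */
    static List<String> targetIndices(Map<String, Object> row) {
        List<String> targets = new ArrayList<>();
        if (Integer.valueOf(1).equals(row.get("postBarStatus"))) {
            targets.add("post_bar_info");
        }
        if (Integer.valueOf(1).equals(row.get("postStatus"))) {
            targets.add("post_info");
        }
        return targets;
    }

    /** Stores the row under its id column; a document with the same id is overwritten. */
    void write(Map<String, Object> row) {
        for (String index : targetIndices(row)) {
            String idColumn = index.equals("post_bar_info") ? "postBarId" : "postId";
            String id = String.valueOf(row.get(idColumn));
            indices.computeIfAbsent(index, k -> new LinkedHashMap<>()).put(id, row);
        }
    }

    public static void main(String[] args) {
        OutputRouting out = new OutputRouting();
        out.write(Map.of("postBarId", 1L, "postBarName", "old name", "postBarStatus", 1));
        out.write(Map.of("postBarId", 1L, "postBarName", "new name", "postBarStatus", 1));
        // Both rows share postBarId 1, so the second write replaces the first.
        System.out.println(out.indices.get("post_bar_info"));
    }
}
```

Because the id comes from the primary-key column, re-running the same query every minute keeps replacing documents instead of piling up duplicates.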

Next, the contents of the two SQL files:

postBarInfo.sql

SELECT post_bar_id postBarId, catalogue_id catalogueId, post_bar_name postBarName,
       post_bar_explain postBarExplain, post_bar_logo_url postBarLogoUrl, user_id userId,
       post_count postCount, user_count userCount, create_time createTime, post_bar_status postBarStatus
FROM post_bar_info WHERE post_bar_status = 1

postInfo.sql

SELECT post_id postId, post_bar_id postBarId, post_title postTitle, post_content postContent,
       user_id userId, top_post topPost, wonderful_post wonderfulPost, audit,
       visit_count visitCount, post_status postStatus, create_time createTime
FROM post_info WHERE post_status = 1
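Writing the aliases by hand is error-prone for wide tables. As a sketch of how the two queries above could be generated, the helper below converts snake_case column names to camelCase and builds the SELECT list. The class `AliasGenerator` and its method names are hypothetical, and the column list in `main` is only a subset of the real table.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that generates the aliased SELECT statements. */
class AliasGenerator {

    /** Converts a snake_case column name to camelCase, e.g. post_bar_id to postBarId. */
    static String toCamelCase(String column) {
        StringBuilder sb = new StringBuilder();
        boolean upperNext = false;
        for (char c : column.toCharArray()) {
            if (c == '_') {
                upperNext = true;          // drop the underscore, capitalize what follows
            } else if (upperNext) {
                sb.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds a SELECT that aliases only the columns whose name actually changes. */
    static String buildSelect(String table, List<String> columns, String where) {
        String selectList = columns.stream()
                .map(col -> {
                    String alias = toCamelCase(col);
                    return alias.equals(col) ? col : col + " " + alias;
                })
                .collect(Collectors.joining(", "));
        return "SELECT " + selectList + " FROM " + table + " WHERE " + where;
    }

    public static void main(String[] args) {
        System.out.println(buildSelect(
                "post_bar_info",
                List.of("post_bar_id", "catalogue_id", "post_bar_name", "post_bar_status"),
                "post_bar_status = 1"));
    }
}
```

Columns without an underscore, such as audit, are left unaliased, which matches how the hand-written postInfo.sql treats them.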

Note the column aliases. They are needed because of how ElasticsearchRepository<T, ID> works. For example:

package ……;

import com.seven.forum.entity.zyl.PostBarInfoEntity;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

import java.util.List;

public interface PostBarInfoRepository extends ElasticsearchRepository<PostBarInfoEntity, Long> {

    // The template methods I looked at receive the data as Iterable<T>, but I use List here.
    // For PostBarName, the query only looks for a field named postBarName in the Elasticsearch mapping.
    // Neither postbarname nor post_bar_name works, which is why the database columns with underscores
    // need aliases and why Logstash's default lowercase conversion has to be turned off.
    List<PostBarInfoEntity> queryByPostBarNameContains(String postBarName);

    List<PostBarInfoEntity> getByCatalogueId(Long id);

}
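The naming rule described in the comments can be illustrated with a deliberately simplified imitation. The sketch below is not Spring Data's actual method-name parser: the class `DerivedQueryName` is hypothetical and it only understands a prefix ending in "By" plus an optional "Contains" suffix, which is enough for the two methods above.

```java
import java.util.Set;

/** Simplified, hypothetical imitation of how a field name follows from a method name. */
class DerivedQueryName {

    /** Drops everything up to "By" and a trailing "Contains", then lowercases the first letter. */
    static String propertyOf(String methodName) {
        int by = methodName.indexOf("By");
        if (by < 0 || by + 2 >= methodName.length()) {
            throw new IllegalArgumentException("No property after 'By' in " + methodName);
        }
        String rest = methodName.substring(by + 2);
        if (rest.endsWith("Contains") && rest.length() > "Contains".length()) {
            rest = rest.substring(0, rest.length() - "Contains".length());
        }
        return Character.toLowerCase(rest.charAt(0)) + rest.substring(1);
    }

    /** The lookup is case-sensitive, so only an exact field name counts as a match. */
    static boolean matches(String methodName, Set<String> fieldNames) {
        return fieldNames.contains(propertyOf(methodName));
    }

    public static void main(String[] args) {
        String method = "queryByPostBarNameContains";
        System.out.println(propertyOf(method));
        System.out.println(matches(method, Set.of("postBarName")));
        System.out.println(matches(method, Set.of("postbarname", "post_bar_name")));
    }
}
```

This is why the field names written by Logstash must already be in camelCase: a document whose field is called postbarname or post_bar_name is never looked at by a query on postBarName.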

Here is the code for the entity class and the test class:

package ……;

import lombok.Data;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

import java.util.Date;

@Data
// With the Elasticsearch I use, if type is not set the mapping uses the lowercased class name.
// I have not yet looked into what difference that makes; I did not retain what the book I read said about it.
@Document(indexName = "post_bar_info", type = "_doc")
public class PostBarInfoEntity {

    @Id
    @Field(type = FieldType.Long)
    private Long postBarId;

    @Field(type = FieldType.Long)
    private Long catalogueId;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String postBarName;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String postBarExplain;

    @Field(type = FieldType.Text)
    private String postBarLogoUrl;

    @Field(type = FieldType.Long)
    private Long userId;

    @Field(type = FieldType.Long)
    private Long postCount;

    @Field(type = FieldType.Long)
    private Long userCount;

    @Field(type = FieldType.Date)
    private Date createTime;

    @Field(type = FieldType.Integer)
    private Integer postBarStatus;

    @Override
    public String toString() {
        return "PostBarInfoEntity{" +
                "postBarId=" + postBarId +
                ", catalogueId=" + catalogueId +
                ", postBarName='" + postBarName + '\'' +
                ", postBarExplain='" + postBarExplain + '\'' +
                ", postBarLogoUrl='" + postBarLogoUrl + '\'' +
                ", userId=" + userId +
                ", postCount=" + postCount +
                ", userCount=" + userCount +
                ", createTime=" + createTime +
                ", postBarStatus=" + postBarStatus +
                '}';
    }

}

import com.seven.forum.SpringBootApp;
import com.seven.forum.entity.zyl.PostBarInfoEntity;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.elasticsearch.core.ElasticsearchRestTemplate;
import org.springframework.test.context.junit4.SpringRunner;

import java.util.List;

@RunWith(SpringRunner.class)
@SpringBootTest(classes = SpringBootApp.class)
public class ElasticsearchTest {

    @Autowired
    private ElasticsearchRestTemplate template;

    // A pitfall: createIndex sometimes also ends up with the mapping in place (putMapping is not
    // actually executed together with it; the mapping simply appears when the index is created).
    // When I was starting out I had two identical pieces of code that only called createIndex:
    // one produced a mapping, the other produced only the index. To me the effect looked random.
    // (A possible explanation: when a repository interface exists for the entity, Spring Data may
    // already create the index and mapping at startup, before the test method runs.)
    @Test
    public void testCreatePostBar() {
        template.createIndex(PostBarInfoEntity.class);
    }

    // For the Logstash sync the mapping is not required; having the index is enough
    @Test
    public void testCreatePostBarMapping() {
        template.putMapping(PostBarInfoEntity.class);
    }

}

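3. Note: Logstash can only sync inserts and updates, so deletes have to be synced by your own code through a deleteXX method (I forget exactly which method and did not look it up). In other words, while the Elasticsearch and Logstash services are running, inserts and updates are synced automatically every minute, and your code has to delete the corresponding documents itself. Because deletes are not synced, updates can also go wrong: an update amounts to a delete plus an insert, and it only works because Elasticsearch overwrites a document that has the same "_id". If the "_id" is the same as before, the update succeeds; if it is not, the updated document may be added while the old one is still there.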
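The behaviour described in this note can be reproduced with two maps. This is a minimal simulation, not real Logstash or Elasticsearch code: the class `SyncSimulation`, its fields and its method names are hypothetical, with one map standing in for the MySQL table and one for the index.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical simulation of the scheduled sync; no real database or index involved. */
class SyncSimulation {

    final Map<Long, String> table = new LinkedHashMap<>(); // MySQL rows: id -> name
    final Map<Long, String> index = new LinkedHashMap<>(); // documents: _id -> name

    /** One scheduled run: every selected row is written under its id (insert or overwrite). */
    void scheduledSync() {
        index.putAll(table);
    }

    /** What application code has to do: delete the row and the matching document. */
    void deleteEverywhere(long id) {
        table.remove(id);
        index.remove(id);
    }

    public static void main(String[] args) {
        SyncSimulation s = new SyncSimulation();

        s.table.put(1L, "first name");
        s.scheduledSync();
        s.table.put(1L, "renamed");      // update with an unchanged id
        s.scheduledSync();
        System.out.println("after update: " + s.index);

        s.table.remove(1L);              // delete in the database only
        s.scheduledSync();
        System.out.println("after db-only delete: " + s.index);

        s.deleteEverywhere(1L);          // explicit delete on both sides
        System.out.println("after explicit delete: " + s.index);
    }
}
```

The second printout still contains the document, which is exactly the stale-data problem the note warns about; only the explicit delete removes it.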

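That is all. If you want more detail, have a look at this article

Thanks also to this article: https://blog.csdn.net/baidu_29092471/article/details/85092545 Seeing the aliases in its database query was what made everything click for me and showed me how to deal with the underscores when syncing from the database. Thanks again. The problem had bothered me for more than half a month and I saw hardly anyone mention it online. It seems obvious once you know, but at the time it simply did not occur to me to solve it with SQL aliases.

Finally, the configuration given above is the simplest possible. For anything more advanced, such as paging or highlighting, you will need to look up further material. I have only shown the direction for configuring and using it; it cost me a lot of time and quite a few detours.

Here is a Logstash configuration write-up on Jianshu: https://www.jianshu.com/p/d127c3799ad1

The official site may well explain things in detail, but getting onto it is too hard for me: a page on a foreign site usually takes tens of minutes to load, and sometimes even an hour is not enough. (Maybe it is my browser or something else? I use the one that ships with Windows, which feels fine with Baidu set as the home page; Chrome and Firefox are painfully slow.)

Summary: 1. The versions have to match. 2. When syncing data, give the underscored columns aliases in the SQL and turn off the lowercase conversion. 3. Watch out for how updates work in Elasticsearch and pay attention to the id.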
