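Upgrading Dolphin Scheduler 1.1.0 to 1.2.0: A Guide to Avoiding the Pitfalls

This article is reposted with permission.

Component Overview

Apache Dolphin Scheduler is a distributed, easily extensible, visual DAG workflow task scheduling system. It aims to untangle the complex dependencies in data processing pipelines so that the scheduling system works out of the box in those pipelines.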

Official website

https://dolphinscheduler.apache.org/en-us/

GitHub

https://github.com/apache/incubator-dolphinscheduler

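Dolphin Scheduler 1.2.0 is the first Apache release of ds and is currently the version recommended by the community. It introduces cross-project dependencies, the Flink and HTTP task components, and other features. For the full release notes, see: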

https://github.com/apache/incubator-dolphinscheduler/releases

Purpose of This Document

To record the process of upgrading Dolphin Scheduler from 1.1.0 to 1.2.0 in a production environment.

Installation Environment

CDH5.16.2
Dolphin Scheduler 1.1.0

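Deployment Method

Build and deploy from source; the hive-jdbc version has to be changed to match CDH.

Prerequisites

Dolphin Scheduler 1.1.0 is deployed on the cluster gateway node
The Dolphin Scheduler source is switched to 1.2.0-release
The ds source is built in an ubuntu environment

Preparing the Installation Package

Clone the Dolphin Scheduler code from github and switch the local checkout to the 1.2.0-release branch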
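
The clone-and-switch step can be sketched roughly as follows. The repository URL is the GitHub address given earlier with `.git` appended; the helper name `fetch_ds_source` and the target directory are only illustrative and not part of the project.

```shell
# Clone the Dolphin Scheduler source and switch to the 1.2.0-release branch.
# fetch_ds_source is a hypothetical helper name; $1 is the target directory.
fetch_ds_source() {
    git clone https://github.com/apache/incubator-dolphinscheduler.git "$1" || return 1
    git -C "$1" checkout 1.2.0-release
}

# Usage (needs network access):
#   fetch_ds_source ./incubator-dolphinscheduler
```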

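Switching the Database

ds 1.1.0 used mysql as its database, and this upgrade keeps mysql as the database

- Change how the mysql package is pulled in by the pom file: remove its test scope

- Edit application-dao.properties in the dolphinscheduler-dao module
- - Change the database connection from pg to mysql

- Edit quartz.properties in the dolphinscheduler-common module
- - Change the database connection from pg to mysql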
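
As a rough illustration of the connection switch, the sketch below rewrites the two JDBC properties in a properties file. The property keys are the ones this article shows for application-dao.properties; the helper name, host, and database name are placeholders. quartz.properties is expected to use different key names, so it has to be edited by hand in the same spirit.

```shell
# Point the datasource in a properties file at MySQL instead of PostgreSQL.
# use_mysql_datasource is a hypothetical helper: $1 = file, $2 = host:port, $3 = database.
use_mysql_datasource() {
    tmp="$1.tmp"
    sed -e "s#^spring.datasource.driver-class-name=.*#spring.datasource.driver-class-name=com.mysql.jdbc.Driver#" \
        -e "s#^spring.datasource.url=.*#spring.datasource.url=jdbc:mysql://$2/$3?characterEncoding=UTF-8#" \
        "$1" > "$tmp" && mv "$tmp" "$1"
}

# Usage (host and database name are placeholders):
#   use_mysql_datasource application-dao.properties 192.168.xx.xx:3306 escheduler
```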

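Change the hive version in the pom file

Building from Source

Update maven
Run: mvn -U clean package -Prelease -Dmaven.test.skip=true

It is advisable to build from source on ubuntu or mac; building on win runs into more problems.
Build complete

Fetch the backend and frontend tar.gz files from the dolphinscheduler-dist module

You can also download them directly from the official website. In that case, using a mysql database requires placing the mysql-connector-java package in the lib directory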

Database Backup

Use the navicat tool to back up the mysql database
You can export the database structure and data files, or simply copy the database
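
For readers without navicat, a command-line backup is a possible alternative. The sketch below assumes the standard `mysqldump` client is installed; the helper name `backup_ds_db` and its argument order are made up for illustration.

```shell
# Dump schema and data of the metadata database to a timestamped SQL file.
# backup_ds_db is a hypothetical helper: $1 = host, $2 = user, $3 = database, $4 = output dir.
# mysqldump prompts for the password because of -p.
backup_ds_db() {
    out="$4/$3_$(date +%Y%m%d_%H%M%S).sql"
    mysqldump -h "$1" -u "$2" -p --single-transaction --routines "$3" > "$out" || return 1
    echo "$out"
}

# Usage (host is a placeholder):
#   backup_ds_db 192.168.xx.xx escheduler escheduler /tmp
```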

Modifying the Configuration

Modify the tar package configuration

Upload the backend tar package

# Create the deployment directory

mkdir -p /opt/dolphinscheduler

# Unpack the tar package

tar -zxvf dolphinscheduler-1.2.0-backend-bin.tar.gz -C /opt/dolphinscheduler/

# Change the permissions and owner of the installation package; the deployment user is still escheduler, as in 1.1.0
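
The article does not give the command for this step, so the following is only one plausible way to do it. It assumes a group named after the deployment user and mode 755, neither of which is stated in the original; `set_deploy_owner` is a hypothetical helper name.

```shell
# Hand the unpacked directory to the deployment user and make it traversable.
# set_deploy_owner is a hypothetical helper: $1 = user (and group), $2 = directory.
# Mode 755 and the same-named group are assumptions. Run as root or via sudo.
set_deploy_owner() {
    chown -R "$1:$1" "$2" && chmod -R 755 "$2"
}

# Usage:
#   set_deploy_owner escheduler /opt/dolphinscheduler
```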

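Modify the environment variables

Edit the .dolphinscheduler_env.sh file in the conf/env directory

- Change the values to match your own cluster. FLINK_HOME has not been configured for now
- Switching spark versions in the Spark component is somewhat problematic here. If you only use spark2, you can comment out SPARK_HOME1 or point it at the SPARK_HOME2 path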

export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop

export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop

export SPARK_HOME1=/opt/cloudera/parcels/CDH/lib/spark

export SPARK_HOME2=/opt/cloudera/parcels/SPARK2/lib/spark2

export PYTHON_HOME=/usr/local/anaconda3/bin/python

export JAVA_HOME=/usr/java/jdk1.8.0_131

export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive

export FLINK_HOME=/opt/soft/flink

export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME:$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH:$FLINK_HOME/bin:$PATH
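
Before moving on, it can help to confirm that the exported paths actually exist on the gateway node. The check below is a small sketch; `check_env_paths` is a hypothetical helper, and which variables to check is left to the reader (FLINK_HOME, for example, is not configured in this setup).

```shell
# Report every listed variable whose value is empty or points at a missing path.
# check_env_paths is a hypothetical helper; pass variable names as arguments.
check_env_paths() {
    missing=0
    for name in "$@"; do
        eval "dir_value=\${$name}"
        if [ -z "$dir_value" ] || [ ! -e "$dir_value" ]; then
            echo "missing: $name=$dir_value"
            missing=1
        fi
    done
    return $missing
}

# Usage, after sourcing the env file:
#   . conf/env/.dolphinscheduler_env.sh
#   check_env_paths HADOOP_HOME HADOOP_CONF_DIR SPARK_HOME2 JAVA_HOME HIVE_HOME
```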

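Modify the deployment parameters in install.sh

When configuring the zk cluster, use the form ip1:2181,ip2:2181,ip3:2181
To use HDFS as the resource center in an HA setup, copy the cluster's core-site.xml and hdfs-site.xml files into the conf directory, and change the permissions of both files to 755
Adjust the other parameters to your own needs, keeping them compatible with 1.1.0. Pay special attention to the parameters below!

# install.sh parameters that need special attention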
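
The two notes above can be sketched as small helpers. Both function names are hypothetical, and the directories passed to `copy_hadoop_conf` depend on your own cluster layout.

```shell
# Build the ZooKeeper quorum string in ip:port,ip:port form (port 2181).
# zk_quorum is a hypothetical helper; pass the ZooKeeper hosts as arguments.
zk_quorum() {
    quorum=""
    for host in "$@"; do
        quorum="${quorum:+$quorum,}${host}:2181"
    done
    echo "$quorum"
}

# Copy the HA client configs into the conf directory and set mode 755.
# copy_hadoop_conf is hypothetical: $1 = Hadoop config dir, $2 = Dolphin Scheduler conf dir.
copy_hadoop_conf() {
    for f in core-site.xml hdfs-site.xml; do
        cp "$1/$f" "$2/$f" || return 1
        chmod 755 "$2/$f" || return 1
    done
}

# Usage (paths are examples):
#   zk_quorum ip1 ip2 ip3
#   copy_hadoop_conf "$HADOOP_CONF_DIR" /opt/dolphinscheduler/conf
```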

# for example postgresql or mysql ...

dbtype="mysql"

# db config

# db address and port

dbhost="192.168.xx.xx:3306"

# db name

dbname="escheduler"

# db username

username="escheduler"

# db password

# Note: if there are special characters, please use the \ transfer character to transfer

passowrd="escheduler"

# conf/config/install_config.conf config

# Note: the installation path is not the same as the current path (pwd)

installPath="/opt/ds_120"

# deployment user

# Note: the deployment user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled, the root directory needs to be created by itself

deployUser="escheduler"

# hdfs root path, the owner of the root path must be the deployment user.

# versions prior to 1.1.0 do not automatically create the hdfs root directory, you need to create it yourself.

hdfsPath="/escheduler"

# common config

# Program root path

programPath="/tmp/escheduler"

# download path

downloadPath="/tmp/escheduler/download"

# task execute path

execPath="/tmp/escheduler/exec"

# api config

# api server port

apiServerPort="12345"

# api session timeout

apiServerSessionTimeout="7200"

# api server context path

apiServerContextPath="/dolphinscheduler/"
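
Since these values have to stay compatible with 1.1.0, one way to spot accidental drift is to compare selected parameters between the old and new install.sh. This is only a sketch: both helper names are hypothetical, and it assumes the compared variables carry the same name in both files and are written as name="value" on one line.

```shell
# Print the value assigned to a variable in an install.sh-style file.
# get_param is a hypothetical helper: $1 = file, $2 = variable name.
get_param() {
    sed -n "s/^$2=\"\(.*\)\"[[:space:]]*$/\1/p" "$1" | head -n 1
}

# Compare selected parameters between the 1.1.0 and 1.2.0 install.sh.
# compare_params is hypothetical: $1 = old file, $2 = new file, rest = variable names.
compare_params() {
    old="$1"; new="$2"; shift 2
    status=0
    for key in "$@"; do
        if [ "$(get_param "$old" "$key")" != "$(get_param "$new" "$key")" ]; then
            echo "differs: $key"
            status=1
        fi
    done
    return $status
}

# Usage (the path to the old install.sh is an example):
#   compare_params /opt/escheduler/install.sh ./install.sh deployUser hdfsPath execPath
```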

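Database Upgrade & Component Upgrade

Run the upgrade script

Edit conf/application-dao.properties
If the test scope was not removed from the mysql jar dependency when building from source, the mysql connector jar has to be placed in the lib directory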

spring.datasource.driver-class-name=com.mysql.jdbc.Driver

spring.datasource.url=jdbc:mysql://xxxx:3306/dolphinscheduler?characterEncoding=UTF-8

spring.datasource.username=xxxxx

spring.datasource.password=xxxxx

Run upgrade-dolphinscheduler.sh in the script directory to upgrade the database

- sh upgrade-dolphinscheduler.sh

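Important

After the upgrade completes, one more DDL statement has to be executed in the ds metadata database to lengthen the app_link column of the task instance table. Otherwise, running a multi-stage hive-ql leaves the task in an incorrect state. The error message is: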

data too long for field 'app_link'

Run the DDL statement

MySQL:

alter table t_ds_task_instance modify column app_link text;

PostgreSQL:

alter table t_ds_task_instance alter column app_link type text;
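
To confirm the change took effect on MySQL, the column type can be read back from information_schema. The sketch below only prints the query so it can be piped into the `mysql` client; `app_link_type_sql` is a hypothetical helper and the database name is whatever your metadata database is called.

```shell
# Print the SQL that reports the current type of app_link in a given schema.
# app_link_type_sql is a hypothetical helper: $1 = metadata database name.
app_link_type_sql() {
    cat <<EOF
SELECT COLUMN_TYPE FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = '$1'
  AND TABLE_NAME = 't_ds_task_instance'
  AND COLUMN_NAME = 'app_link';
EOF
}

# Usage (MySQL; host is a placeholder, -N suppresses the column header):
#   app_link_type_sql escheduler | mysql -h 192.168.xx.xx -u escheduler -p -N
```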

Key Data Checks

Check that the instance name in conf/quartz.properties, that is, the value of the org.quartz.scheduler.instanceName property, is DolphinScheduler
Check that the SCHED_NAME column in the QRTZ_SCHEDULER_STATE table is DolphinScheduler; in 1.1.0 it was EasyScheduler
Check that the JOB_CLASS_NAME column in the QRTZ_JOB_DETAILS table is org.apache.dolphinscheduler.server.quartz.ProcessScheduleJob; in 1.1.0 it was cn.escheduler.server.quartz.ProcessScheduleJob
Check that the context configured in nginx is /dolphinscheduler
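
The first three checks lend themselves to a quick script. The following is a sketch with hypothetical helper names: one function tests the quartz.properties entry, the other prints queries whose results can be compared by eye against the expected values listed above.

```shell
# Succeed only if quartz.properties names the scheduler instance DolphinScheduler.
# check_quartz_instance is a hypothetical helper: $1 = path to quartz.properties.
check_quartz_instance() {
    grep -Eq '^org\.quartz\.scheduler\.instanceName[[:space:]]*[=:][[:space:]]*DolphinScheduler[[:space:]]*$' "$1"
}

# Print queries that list the distinct values to inspect in the Quartz tables.
quartz_check_sql() {
    cat <<'EOF'
SELECT DISTINCT SCHED_NAME FROM QRTZ_SCHEDULER_STATE;
SELECT DISTINCT JOB_CLASS_NAME FROM QRTZ_JOB_DETAILS;
EOF
}

# Usage (host and database name are placeholders):
#   check_quartz_instance conf/quartz.properties && echo "instance name ok"
#   quartz_check_sql | mysql -h 192.168.xx.xx -u escheduler -p escheduler
```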

Backend Service Upgrade

sh install.sh

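Frontend Service Upgrade

Unpack the frontend tar package and overwrite the 1.1.0 dist folder with the new dist folder
Edit the nginx configuration and change the context to dolphinscheduler
Restart nginx with systemctl restart nginx

vi /etc/nginx/conf.d/escheduler.conf

# Restart nginx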

systemctl restart nginx
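
A slightly more defensive variant of the restart is sketched below: it refuses to restart unless the site config mentions the new context and `nginx -t` accepts the configuration. `safe_restart_nginx` is a hypothetical helper, and the grep is only a coarse text check, not a real parse of the nginx config.

```shell
# Restart nginx only if the config mentions /dolphinscheduler and passes the syntax check.
# safe_restart_nginx is a hypothetical helper: $1 = path to the site config file.
safe_restart_nginx() {
    grep -q '/dolphinscheduler' "$1" || { echo "context /dolphinscheduler not found in $1"; return 1; }
    nginx -t && systemctl restart nginx
}

# Usage:
#   safe_restart_nginx /etc/nginx/conf.d/escheduler.conf
```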

This completes the upgrade to 1.2.0

Workflow Test

Upgrade successful!

You are welcome to try Dolphin Scheduler!!!