Singer 学习二 使用Singer进行gitlab 2 postgres 数据转换
Singer 可以方便的进行数据的etl 处理,我们可以处理的数据可以是api 接口,也可以是数据库数据,或者
是文件
备注: 测试使用docker-compose 运行&&提供数据库内容,使用virtualenv && python 3.5 以及以上
环境准备
- docker-compose 文件
version: "3"
services:
gogs-service:
image: gogs/gogs
ports:
- "10022:22"
- "10080:3000"
mysql:
image: mysql:5.7.16
ports:
- 3306:3306
command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
environment:
MYSQL_ROOT_PASSWORD: dalongrong
MYSQL_DATABASE: gogs
MYSQL_USER: gogs
MYSQL_PASSWORD: dalongrong
TZ: Asia/Shanghai
postgres:
image: postgres:9.6.11
ports:
- "5432:5432"
environment:
- "POSTGRES_PASSWORD:dalong"
- postgres target 配置
target.json
{
"host": "localhost",
"port": 5432,
"dbname": "postgres",
"user": "postgres",
"password": "postgres",
"schema": "public"
}
- 创建gitlab virtualenv
virtualenv gitlab
source ./gitlab/bin/activate
pip install tap-gitlab
- 创建access_token
从gitlab 官方网站创建即可 - gitlab tap 配置文件
格式如下,因为隐私没有暴露:
{
"api_url": "https://gitlab.com/api/v4",
"private_token": "xxxxxxx",
"groups": "",
"projects": "",
"start_date":"2010-01-01T00:00:00Z"
}
运行&&效果
- 运行
./gitlab/bin/tap-gitlab -c gitlab.json | ./postgres/bin/target-postgres -c target.json
- 效果
INFO Starting sync
INFO GET https://gitlab.com/api/v4/projects/dalongrong%2Fdemoapp?private_token=XXXXXXXXX
INFO Table 'projects' does not exist. Creating... CREATE TABLE public.projects ("approvals_before_merge" bigint, "archived" boolean, "avatar
_url" character varying, "builds_enabled" boolean, "container_registry_enabled" boolean, "created_at" timestamp without time zone, "creator_
id" bigint, "default_branch" character varying, "description" character varying, "forks_count" bigint, "http_url_to_repo" character varying,
"id" bigint, "issues_enabled" boolean, "last_activity_at" timestamp without time zone, "lfs_enabled" boolean, "merge_requests_enabled" bool
ean, "name" character varying, "name_with_namespace" character varying, "namespace__id" bigint, "namespace__kind" character varying, "namespace__name" character varying, "namespace__path" character varying, "only_allow_merge_if_all_discussions_are_resolved" boolean, "only_allow_merge_if_build_succeeds" boolean, "open_issues_count" bigint, "owner_id" bigint, "path" character varying, "path_with_namespace" character varying, "public" boolean, "public_builds" boolean, "request_access_enabled" boolean, "shared_runners_enabled" boolean, "shared_with_groups" jsonb, "snippets_enabled" boolean, "ssh_url_to_repo" character varying, "star_count" bigint, "tag_list" jsonb, "visibility_level" bigint, "web_url" character varying, "wiki_enabled" boolean, PRIMARY KEY ("id"))
INFO Table 'branches' does not exist. Creating... CREATE TABLE public.branches ("commit_id" character varying, "developers_can_merge" boolean, "developers_can_push" boolean, "merged" boolean, "name" character varying, "project_id" bigint, "protected" boolean, PRIMARY KEY ("project_id", "name"))
INFO Table 'commits' does not exist. Creating... CREATE TABLE public.commits ("allow_failure" boolean, "author_email" character varying, "author_name" character varying, "committer_email" character varying, "committer_name" character varying, "created_at" timestamp without time zone, "id" character varying, "message" character varying, "project_id" bigint, "short_id" character varying, "title" character varying, PRIMARY KEY ("id"))
INFO Table 'issues' does not exist. Creating... CREATE TABLE public.issues ("assignee_id" bigint, "author_id" bigint, "confidential" boolean, "created_at" timestamp without time zone, "description" character varying, "due_date" character varying, "id" bigint, "iid" bigint, "labels" jsonb, "milestone_id" bigint, "project_id" bigint, "state" character varying, "subscribed" boolean, "title" character varying, "updated_at" timestamp wi
说明
使用类似的方法,我们也可以转换github 的以及jira 等基于api 开发的模型
参考资料
https://github.com/singer-io/tap-gitlab
https://github.com/rongfengliang/singer-mysql2postges-demo
Singer 学习二 使用Singer进行gitlab 2 postgres 数据转换的更多相关文章
- Singer 学习三 使用Singer进行mongodb 2 postgres 数据转换
Singer 可以方便的进行数据的etl 处理,我们可以处理的数据可以是api 接口,也可以是数据库数据,或者 是文件 备注: 测试使用docker-compose 运行&&提供数据库 ...
- Singer 学习一 使用Singer进行mysql 2 postgres 数据转换
Singer 因为版本的问题,推荐的运行方式是使用virtualenv,对于taps&& target 的运行都是 推荐使用此方式,不然包兼容的问题太费事了 备注: 使用docker- ...
- Singer 学习七 运行&&开发taps、targets (二 targets 运行说明)
接上文: Singer 学习六 运行&&开发taps.targets (一 taps 运行说明) 说明target 需要tap 进行配合运行,所以需要了解tap 的使用 运行targe ...
- emberjs学习二(ember-data和localstorage_adapter)
emberjs学习二(ember-data和localstorage_adapter) 准备工作 首先我们加入ember-data和ember-localstorage-adapter两个依赖项,使用 ...
- ReactJS入门学习二
ReactJS入门学习二 阅读目录 React的背景和基本原理 理解React.render() 什么是JSX? 为什么要使用JSX? JSX的语法 如何在JSX中如何使用事件 如何在JSX中如何使用 ...
- TweenMax动画库学习(二)
目录 TweenMax动画库学习(一) TweenMax动画库学习(二) TweenMax动画库学习(三) Tw ...
- Hbase深入学习(二) 安装hbase
Hbase深入学习(二) 安装hbase This guidedescribes setup of a standalone hbase instance that uses the local fi ...
- Struts2框架学习(二) Action
Struts2框架学习(二) Action Struts2框架中的Action类是一个单独的javabean对象.不像Struts1中还要去继承HttpServlet,耦合度减小了. 1,流程 拦截器 ...
- Python学习二:词典基础详解
作者:NiceCui 本文谢绝转载,如需转载需征得作者本人同意,谢谢. 本文链接:http://www.cnblogs.com/NiceCui/p/7862377.html 邮箱:moyi@moyib ...
随机推荐
- SQL-34 对于表actor批量插入如下数据
题目描述 对于表actor批量插入如下数据CREATE TABLE IF NOT EXISTS actor (actor_id smallint(5) NOT NULL PRIMARY KEY,fir ...
- Java作业四
1.先在一个包中编写第一个类ClassA,要求该类中具有四种不同访问权限的成员,再在另一个包中编写第二个类ClassB,并在该类中编写一个方法以访问第一个类中的成员.总结类成员访问控制的基本规则. p ...
- vue2+webpack 开发环境配置
前提条件: 1.安装node.js https://nodejs.org/en/ 下载安装合适的平台 2.安装npm 第一步:初始化项目 新建文件夹 E:\app 推荐vue项目目录结构: confi ...
- poj2406(kmp算法)
Given two strings a and b we define a*b to be their concatenation. For example, if a = "abc&quo ...
- <Using parquet with impala>
Operations upon Impala Create table stored as parquet like parquet '/user/etl/datafile1' stored as p ...
- MySQL:数据表基本操作
数据表基本操作 注意点: 1.数据表中已经有数据时,轻易修改数据类型,有可能因为不同的数据类型的数据在机器 中存储的方式及长度并不相同,修改数据类型可能会影响到数据表中已有的数据类型. 2. 数据表 ...
- Vue中 v-html 与 v-text 的区别
解析的效果:
- ClientAsTemplate用法
kbmMW提供了TkbmMWClientQuery查询组件,作为kbmMW开发者都知道,这是一个内存数据集,基于服务端的查询服务(Query Service),可以直接执行sql得取想要的记录,因为是 ...
- Java实现循环链表
本案例需要完成的任务定义如下:实现一个循环链表(单链表),具备增加元素.删除元素.打印循环链表等功能. 网上许多同类问题的实现方式过于复杂.难懂,本文旨在提出一种实现循环链表的简单.易懂的方法. 定义 ...
- Python 网络通信协议 tcp udp区别
网络通信的整个流程 在这一节就给大家讲解,有些同学对网络是既熟悉又陌生,熟悉是因为我们都知道,我们安装一个路由器,拉一个网线,或者用无限路由器,连上网线或者连上wifi就能够上网购物.看片片.吃鸡了, ...