CentOS 7.4 + MongoDB 3.6.4 Sharded Cluster Deployment and Bulk Data Import
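Preface
MongoDB supports automatic sharding: the cluster splits the data and balances the load on its own, which removes most of the difficulty of managing shards by hand. Sharding cuts a collection into small chunks and spreads them over several shards, each of which holds part of the data. The chunks are transparent to the application. It does not need to know which data lives on which shard, or even whether sharding is in use. The application connects to a mongos process; mongos knows the mapping between data and shards, forwards each client request to the right shard, collects the responses, and returns the result to the client.
Sharding is appropriate when:
1) A single server's disk is no longer sufficient
2) A single mongod cannot keep up with increasingly frequent writes
3) A large amount of data should be kept in memory for performance
A sharded cluster needs three roles:
1. shard server
Holds the actual data. In MongoDB 3.6 each shard must be deployed as a replica set (earlier versions also allowed a single mongod instance). Even when a shard has several servers, only one of them is the primary; the others hold copies of the same data. The replica set is what gives each shard automatic failover.
2. config server
To store a collection across multiple shards, you specify a shard key for it; the shard key determines which chunk a document belongs to. The config servers store the configuration of all shard nodes, the shard key range of each chunk, the distribution of chunks across shards, and the sharding configuration of every database and collection in the cluster.
3. route server
The front-end router of the cluster. It routes all requests and aggregates the results. Clients connect here; mongos consults the config server metadata to find out which shards to query or write to, runs the operation on those shards, and returns the result to the client. A client simply sends mongos (the route server) the same requests it used to send to mongod and does not need to know which shard holds the data.
shard key: When sharding a collection you pick one key from it as the basis for splitting the data; this key is the shard key. The choice of shard key determines how inserts are distributed across the shards. The shard key needs enough distinct values (high cardinality) for the data to spread well over multiple servers. It is also important to keep chunks at a reasonable size so that balancing and chunk migration do not consume excessive resources.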
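As a rough illustration of the routing described above, the sketch below models ranged chunks as a sorted list of lower bounds and finds the owning shard with a binary search. The chunk boundaries and the chunk-to-shard assignment are invented for the example; a real mongos reads this mapping from the config servers.

```python
import bisect

# Hypothetical chunk table: chunk i covers [LOWER_BOUNDS[i], LOWER_BOUNDS[i + 1]).
# In a real cluster this metadata lives on the config servers.
CHUNK_LOWER_BOUNDS = [float("-inf"), 20000000, 40000000, 60000000]
CHUNK_OWNERS = ["shard1", "shard2", "shard3", "shard4"]


def route(shard_key_value):
    """Return the shard that owns the chunk containing shard_key_value."""
    # bisect_right returns the position of the first bound greater than the
    # value, so the chunk we want is the one just before that position.
    index = bisect.bisect_right(CHUNK_LOWER_BOUNDS, shard_key_value) - 1
    return CHUNK_OWNERS[index]
```

With these made-up boundaries, every document sharing one shard key value lands in the same chunk, which is why a key with few distinct values limits how evenly data can spread.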
1. References
Reference documents:
https://www.cnblogs.com/ityouknow/p/7566682.html
https://docs.mongodb.com/manual/tutorial/deploy-shard-cluster/
2. Installation environment
2.1. Environment
| Server | server1 | server2 | server3 | server4 |
|---|---|---|---|---|
| Hostname | server1.smartmap.com | server2.smartmap.com | server3.smartmap.com | server4.smartmap.com |
| IP | 192.168.1.31 | 192.168.1.32 | 192.168.1.33 | 192.168.1.34 |
| Subnet mask | 255.255.255.0 | 255.255.255.0 | 255.255.255.0 | 255.255.255.0 |
| Gateway | 192.168.1.1 | 192.168.1.1 | 192.168.1.1 | 192.168.1.1 |
| DNS | 218.30.19.50 61.134.1.5 | 218.30.19.50 61.134.1.5 | 218.30.19.50 61.134.1.5 | 218.30.19.50 61.134.1.5 |
| Services | Config | Config | Config | Route |
| | Shard1 | Shard1 | Shard1 | Shard2 |
| | Shard2 | Shard2 | Shard3 | Shard3 |
| | Shard3 | Shard4 | Shard4 | Shard4 |
| | Route | | | |

| Service | Port |
|---|---|
| Route | 20000 |
| Config | 21000 |
| Shard1 | 27001 |
| Shard2 | 27002 |
| Shard3 | 27003 |
| Shard4 | 27004 |
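The layout above can be written down as data and sanity-checked before any server is configured. This is only a sketch: the member lists follow the replica set definitions used later in this walkthrough, and `members_per_host` and `validate_layout` are hypothetical helper names.

```python
from collections import Counter

# Hosts of each shard replica set, taken from the layout in this section.
SHARD_HOSTS = {
    "shard1": ["192.168.1.31", "192.168.1.32", "192.168.1.33"],
    "shard2": ["192.168.1.34", "192.168.1.31", "192.168.1.32"],
    "shard3": ["192.168.1.33", "192.168.1.34", "192.168.1.31"],
    "shard4": ["192.168.1.32", "192.168.1.33", "192.168.1.34"],
}
SHARD_PORTS = {"shard1": 27001, "shard2": 27002, "shard3": 27003, "shard4": 27004}


def members_per_host():
    """Count how many shard members each host runs."""
    counts = Counter()
    for hosts in SHARD_HOSTS.values():
        counts.update(hosts)
    return dict(counts)


def validate_layout():
    """Raise ValueError if a shard lacks three distinct hosts or ports clash."""
    for name, hosts in SHARD_HOSTS.items():
        if len(set(hosts)) != 3:
            raise ValueError(name + " must have three distinct hosts")
    if len(set(SHARD_PORTS.values())) != len(SHARD_PORTS):
        raise ValueError("each shard needs its own port")
```

In this layout every host carries three shard members, so the loss of any single server still leaves two members of each affected replica set.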
2.2. Install base packages
[root@server1 ~]# yum install unzip wget ntp
2.3. Install and configure NTP
[root@server1 ~]# yum update
[root@server1 ~]# yum install unzip wget ntp
[root@server1 ~]# systemctl is-enabled ntpd
disabled
[root@server1 ~]# systemctl enable ntpd
Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.
[root@server1 ~]# systemctl start ntpd
[root@server1 ~]# ntpdate -u 2.asia.pool.ntp.org
2.4. Add a user
[root@server1 ~]# useradd mongodb
[root@server1 ~]# passwd mongodb
[root@server1 ~]# chmod u+w /etc/sudoers
[root@server1 ~]# vi /etc/sudoers
Add the following line:
mongodb ALL=(ALL) NOPASSWD: ALL
3. Installation
3.1. Download MongoDB
https://www.mongodb.com/download-center?jmp=nav#community
[root@server1 ~]# mkdir /opt/mongodb
[root@server1 ~]# chown -R mongodb:mongodb /opt/mongodb/
3.2. Extract MongoDB
[root@server1 ~]# su - mongodb
[mongodb@server1 mongodb]$ cd /opt/mongodb
[mongodb@server1 mongodb]$ tar -zxvf mongodb-linux-x86_64-rhel70-3.6.4.tgz
[mongodb@server1 mongodb]$ mv mongodb-linux-x86_64-rhel70-3.6.4 mongodb-app
3.3. Create the directories
On every server, create the conf, mongos, config, shard1, shard2, shard3, and shard4 directories. Because mongos stores no data, it only needs a log directory.
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/conf
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/mongos/log
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/config/data
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/config/log
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard1/data
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard1/log
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard2/data
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard2/log
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard3/data
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard3/log
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard4/data
[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard4/log
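The same directory layout can also be created from a script, which is convenient when the step has to be repeated on several hosts. A minimal sketch using only the standard library; `create_layout` is a hypothetical helper, and the base path is passed in rather than hard-coded.

```python
from pathlib import Path

# Components that need both a data and a log directory.
COMPONENTS_WITH_DATA = ["config", "shard1", "shard2", "shard3", "shard4"]


def create_layout(base):
    """Create conf/, data/mongos/log and data/<component>/{data,log} under base."""
    base = Path(base)
    # mongos stores no data, so it only gets a log directory.
    paths = [base / "conf", base / "data" / "mongos" / "log"]
    for name in COMPONENTS_WITH_DATA:
        paths.append(base / "data" / name / "data")
        paths.append(base / "data" / name / "log")
    for path in paths:
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

Calling `create_layout("/opt/mongodb/mongodb-app")` would produce the same twelve directories as the mkdir commands above.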
3.4. Copy to the other nodes
[mongodb@server2 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/
[mongodb@server3 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/
[mongodb@server4 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/
3.5. Environment variables
[mongodb@server1 mongodb]$ sudo vi /etc/profile
Append the following lines:
export MONGODB_HOME=/opt/mongodb/mongodb-app
export PATH=$MONGODB_HOME/bin:$PATH
[mongodb@server1 mongodb]$ source /etc/profile
[mongodb@server1 mongodb]$ mongod --version
4. Configuration
4.1. Config servers
4.1.1. Create the configuration file
Create the following file on servers 31, 32, and 33:
[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/config.conf
## content
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/config/log/config.log
# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/config/data
  journal:
    enabled: true
# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/config/log/configsrv.pid
# network interfaces
net:
  port: 21000
  bindIp: 0.0.0.0
#operationProfiling:
replication:
  replSetName: config
sharding:
  clusterRole: configsvr
4.1.2. Start the config server on all three servers
[mongodb@server1 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/config.conf
about to fork child process, waiting until server is ready for connections.
forked process: 8072
child process started successfully, parent exiting
To stop the instance later, when needed:
[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/config.conf
4.1.3. Log in to any config server and initialize the config replica set
[mongodb@server1 conf]$ mongo 192.168.1.31:21000
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.31:21000/test
MongoDB server version: 3.6.4
> use admin;
switched to db admin
> config = {
    _id : "config",
    members : [
        {_id : 0, host : "192.168.1.31:21000" },
        {_id : 1, host : "192.168.1.32:21000" },
        {_id : 2, host : "192.168.1.33:21000" }
    ]
}
> rs.initiate(config);
config:SECONDARY> rs.status();
The same steps as plain commands:
mongo 192.168.1.31:21000
// switch to the admin database
use admin;
// define the configuration variable
config = {
    _id : "config",
    members : [
        {_id : 0, host : "192.168.1.31:21000" },
        {_id : 1, host : "192.168.1.32:21000" },
        {_id : 2, host : "192.168.1.33:21000" }
    ]
}
// initialize the replica set
rs.initiate(config);
// check the replica set status
rs.status();
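The rs.initiate() call above can also be issued from Python, which helps when five replica sets have to be initialized in the same way. This is a sketch, not the method used in this walkthrough: `replset_config` and `initiate_replset` are hypothetical helpers, and it assumes a PyMongo version that accepts the `directConnection` option.

```python
def replset_config(name, hosts):
    """Build the document that rs.initiate() sends as replSetInitiate."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": host} for i, host in enumerate(hosts)],
    }


def initiate_replset(name, hosts):
    """Send replSetInitiate to the first member of the set."""
    # Imported here so that replset_config can be used without PyMongo installed.
    from pymongo import MongoClient

    # directConnection=True talks to this one member only; the set has no
    # primary yet, so normal replica set discovery would not succeed.
    client = MongoClient("mongodb://" + hosts[0], directConnection=True)
    try:
        return client.admin.command("replSetInitiate", replset_config(name, hosts))
    finally:
        client.close()
```

For the config servers the call would be `initiate_replset("config", ["192.168.1.31:21000", "192.168.1.32:21000", "192.168.1.33:21000"])`.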
4.2. Shard servers
4.2.1. shard1
4.2.1.1. Create the configuration file
Create the following file on servers 31, 32, and 33:
[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard1.conf
# shard1 config
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/shard1/log/shard1.log
# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/shard1/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20
# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/shard1/log/shard1.pid
# network interfaces
net:
  port: 27001
  bindIp: 0.0.0.0
#operationProfiling:
replication:
  replSetName: shard1
sharding:
  clusterRole: shardsvr
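The shard configuration files in this section differ only in the shard name and port, so they could be generated from one template instead of being edited by hand. A minimal sketch, assuming the ports from section 2.1; `render_shard_conf` is a hypothetical helper and the template simply mirrors the file shown above.

```python
SHARD_CONF_TEMPLATE = """\
# {name} config
systemLog:
  destination: file
  logAppend: true
  path: {base}/data/{name}/log/{name}.log
storage:
  dbPath: {base}/data/{name}/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: {cache_gb}
processManagement:
  fork: true
  pidFilePath: {base}/data/{name}/log/{name}.pid
net:
  port: {port}
  bindIp: 0.0.0.0
replication:
  replSetName: {name}
sharding:
  clusterRole: shardsvr
"""


def render_shard_conf(name, port, base="/opt/mongodb/mongodb-app", cache_gb=20):
    """Return the YAML text of the configuration file for one shard."""
    return SHARD_CONF_TEMPLATE.format(name=name, port=port, base=base, cache_gb=cache_gb)
```

Since every host in this layout runs three shard members, the `cache_gb` value is worth checking against the host's RAM: three members at 20 GB each need more than 60 GB in total.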
4.2.1.2. Start the shard1 server on all three servers
[mongodb@server1 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard1.conf
about to fork child process, waiting until server is ready for connections.
forked process: 8420
child process started successfully, parent exiting
To stop the instance later, when needed:
[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard1.conf
killing process with pid: 8420
4.2.1.3. Log in to any shard1 member and initialize the replica set
[mongodb@server1 mongodb]$ mongo 192.168.1.31:27001
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.31:27001/test
MongoDB server version: 3.6.4
> use admin;
switched to db admin
> config = {
    _id : "shard1",
    members : [
        {_id : 0, host : "192.168.1.31:27001" },
        {_id : 1, host : "192.168.1.32:27001" },
        {_id : 2, host : "192.168.1.33:27001" }
    ]
}
> rs.initiate(config);
{ "ok" : 1 }
shard1:SECONDARY> rs.status();
4.2.2. shard2
4.2.2.1. Create the configuration file
Create the following file on servers 34, 31, and 32:
[mongodb@server4 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard2.conf
# shard2 config
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/shard2/log/shard2.log
# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/shard2/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20
# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/shard2/log/shard2.pid
# network interfaces
net:
  port: 27002
  bindIp: 0.0.0.0
#operationProfiling:
replication:
  replSetName: shard2
sharding:
  clusterRole: shardsvr
4.2.2.2. Start the shard2 server on all three servers
[mongodb@server4 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard2.conf
about to fork child process, waiting until server is ready for connections.
forked process: 8025
child process started successfully, parent exiting
To stop the instance later, when needed:
[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard2.conf
killing process with pid: 8420
4.2.2.3. Log in to any shard2 member and initialize the replica set
[mongodb@server4 mongodb]$ mongo 192.168.1.34:27002
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.34:27002/test
MongoDB server version: 3.6.4
Welcome to the MongoDB shell.
> use admin;
switched to db admin
> config = {
    _id : "shard2",
    members : [
        {_id : 0, host : "192.168.1.34:27002" },
        {_id : 1, host : "192.168.1.31:27002" },
        {_id : 2, host : "192.168.1.32:27002" }
    ]
}
> rs.initiate(config);
{ "ok" : 1 }
shard2:OTHER> rs.status();
4.2.3. shard3
4.2.3.1. Create the configuration file
Create the following file on servers 33, 34, and 31:
[mongodb@server3 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard3.conf
# shard3 config
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/shard3/log/shard3.log
# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/shard3/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20
# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/shard3/log/shard3.pid
# network interfaces
net:
  port: 27003
  bindIp: 0.0.0.0
#operationProfiling:
replication:
  replSetName: shard3
sharding:
  clusterRole: shardsvr
4.2.3.2. Start the shard3 server on all three servers
[mongodb@server3 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard3.conf
about to fork child process, waiting until server is ready for connections.
forked process: 8179
child process started successfully, parent exiting
To stop the instance later, when needed:
[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard3.conf
killing process with pid: 8420
4.2.3.3. Log in to any shard3 member and initialize the replica set
[mongodb@server3 mongodb]$ mongo 192.168.1.33:27003
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.33:27003/test
MongoDB server version: 3.6.4
Welcome to the MongoDB shell.
> use admin
switched to db admin
> config = {
    _id : "shard3",
    members : [
        {_id : 0, host : "192.168.1.33:27003" },
        {_id : 1, host : "192.168.1.34:27003" },
        {_id : 2, host : "192.168.1.31:27003" }
    ]
}
> rs.initiate(config);
{ "ok" : 1 }
shard3:OTHER> rs.status();
4.2.4. shard4
4.2.4.1. Create the configuration file
Create the following file on servers 32, 33, and 34:
[mongodb@server2 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard4.conf
# shard4 config
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/shard4/log/shard4.log
# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/shard4/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20
# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/shard4/log/shard4.pid
# network interfaces
net:
  port: 27004
  bindIp: 0.0.0.0
#operationProfiling:
replication:
  replSetName: shard4
sharding:
  clusterRole: shardsvr
4.2.4.2. Start the shard4 server on all three servers
[mongodb@server2 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard4.conf
about to fork child process, waiting until server is ready for connections.
forked process: 8436
child process started successfully, parent exiting
To stop the instance later, when needed:
[mongodb@server4 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard4.conf
killing process with pid: 8238
4.2.4.3. Log in to any shard4 member and initialize the replica set
[mongodb@server2 mongodb]$ mongo 192.168.1.32:27004
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.32:27004/test
MongoDB server version: 3.6.4
Welcome to the MongoDB shell.
For interactive help, type "help".
> use admin
switched to db admin
> config = {
    _id : "shard4",
    members : [
        {_id : 0, host : "192.168.1.32:27004" },
        {_id : 1, host : "192.168.1.33:27004" },
        {_id : 2, host : "192.168.1.34:27004" }
    ]
}
> rs.initiate(config);
{ "ok" : 1 }
shard4:OTHER> rs.status();
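4.3. Route servers (mongos)
4.3.1. Create the configuration file
Create the following file on servers 31 and 34:
[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/mongos.conf
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/mongos/log/mongos.log
processManagement:
  fork: true
#  pidFilePath: /opt/mongodb/mongodb-app/data/mongos/log/mongos.pid
# network interfaces
net:
  port: 20000
  bindIp: 0.0.0.0
# Config servers that mongos connects to, given as <replica set name>/<host list>;
# "config" is the replica set name of the config servers.
# (The old "1 or 3 config servers" rule applied to mirrored config servers, which 3.4 removed.)
sharding:
  configDB: config/192.168.1.31:21000,192.168.1.32:21000,192.168.1.33:21000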
4.3.2. Start mongos on servers 31 and 34
Note: start the config servers and shard servers first, then the mongos instances.
[mongodb@server1 conf]$ mongos --config /opt/mongodb/mongodb-app/conf/mongos.conf
about to fork child process, waiting until server is ready for connections.
forked process: 9869
child process started successfully, parent exiting
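The startup-order note above can be enforced in a script by checking that the config and shard ports accept connections before mongos is launched. A minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, and a successful TCP connect only shows that the port is open, not that the replica set has elected a primary.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # The connection is closed again immediately; we only probe the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

A deployment script could call `wait_for_port("192.168.1.31", 21000)` for each config server and shard member, and start mongos only when all calls return True.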
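4.4. Enable sharding
The config servers, route servers, and shard servers are now all running, but an application that connects to mongos cannot use sharding yet: the shards still have to be registered with the cluster through mongos.
Log in to any mongos: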
[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.31:20000/admin
MongoDB server version: 3.6.4
mongos> use admin;
switched to db admin
mongos> sh.addShard("shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001")
mongos> sh.addShard("shard2/192.168.1.34:27002,192.168.1.31:27002,192.168.1.32:27002")
mongos> sh.addShard("shard3/192.168.1.33:27003,192.168.1.34:27003,192.168.1.31:27003")
mongos> sh.addShard("shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004")
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5b00d641da35619896a78891")
}
shards:
{ "_id" : "shard1", "host" : "shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001", "state" : 1 }
{ "_id" : "shard2", "host" : "shard2/192.168.1.31:27002,192.168.1.32:27002,192.168.1.34:27002", "state" : 1 }
{ "_id" : "shard3", "host" : "shard3/192.168.1.31:27003,192.168.1.33:27003,192.168.1.34:27003", "state" : 1 }
{ "_id" : "shard4", "host" : "shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004", "state" : 1 }
active mongoses:
"3.6.4" : 2
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "config", "primary" : "config", "partitioned" : true }
mongos>
mongos> sh.getBalancerState()
true
mongos> sh.startBalancer()
{
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1526784919, 4),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1526784919, 4)
}
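The sh.addShard() calls above take a string of the form `<replica set name>/<member list>`. The sketch below builds that string and sends the underlying `addShard` command through PyMongo; `shard_connection_string` and `add_shards` are hypothetical helpers, not part of the original walkthrough.

```python
def shard_connection_string(name, members):
    """Format a shard as '<replSetName>/<host:port>,<host:port>,...'."""
    return name + "/" + ",".join(members)


def add_shards(mongos_uri, shards):
    """Register every shard in `shards` (name -> member list) through mongos."""
    # Imported here so the string helper works without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient(mongos_uri)
    try:
        for name, members in shards.items():
            # addShard is an admin command and must be sent to a mongos.
            client.admin.command("addShard", shard_connection_string(name, members))
    finally:
        client.close()
```

For example, `add_shards("mongodb://192.168.1.31:20000", {"shard1": ["192.168.1.31:27001", "192.168.1.32:27001", "192.168.1.33:27001"]})` would register shard1.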
5. Verification
5.1. List the shards
[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.31:20000/admin
MongoDB server version: 3.6.4
mongos> use admin;
switched to db admin
mongos> db.runCommand({ listshards:1 }); // list the shards
{
"shards" : [
{
"_id" : "shard1",
"host" : "shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001",
"state" : 1
},
{
"_id" : "shard2",
"host" : "shard2/192.168.1.31:27002,192.168.1.32:27002,192.168.1.34:27002",
"state" : 1
},
{
"_id" : "shard3",
"host" : "shard3/192.168.1.31:27003,192.168.1.33:27003,192.168.1.34:27003",
"state" : 1
},
{
"_id" : "shard4",
"host" : "shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004",
"state" : 1
}
],
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1526786005, 2),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1526786005, 2)
}
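A reply like the one above is easier to check in a script once it is reduced to a mapping from shard name to members. The sketch assumes the reply comes from `client.admin.command("listShards")` on a mongos and has the shape shown in the transcript; `shard_hosts` is a hypothetical helper.

```python
def shard_hosts(list_shards_reply):
    """Map each shard id to its list of 'host:port' members."""
    result = {}
    for shard in list_shards_reply["shards"]:
        # The host field looks like '<replSetName>/<member>,<member>,...'.
        _, _, members = shard["host"].partition("/")
        result[shard["_id"]] = members.split(",")
    return result
```

Comparing the result against the planned layout shows quickly whether a shard was registered with a missing or wrong member.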
5.2. Enable sharding for a database and a collection
[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.31:20000/admin
MongoDB server version: 3.6.4
mongos> use admin;
switched to db admin
mongos> db.runCommand({ enablesharding: "RHY" });
{
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1526791888, 10),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1526791888, 10)
}
mongos> db.runCommand({ shardcollection: "RHY.ST_RIVER_R", key: {STCD: 1}, unique: false })
{
"collectionsharded" : "RHY.ST_RIVER_R",
"collectionUUID" : UUID("370a8a62-cd5d-488a-b752-6fedba5da507"),
"ok" : 1,
"$clusterTime" : {
"clusterTime" : Timestamp(1526791904, 14),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
},
"operationTime" : Timestamp(1526791904, 14)
}
mongos>
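The two commands above can be sent from Python in the same way. A sketch only: `shard_collection_command` and `enable_and_shard` are hypothetical helpers; the command names and the `key` and `unique` fields are the ones used in the transcript (in their standard camel-case spelling).

```python
def shard_collection_command(namespace, key):
    """Build the shardCollection command document for '<db>.<collection>'."""
    # The command name has to be the first key of the document.
    return {"shardCollection": namespace, "key": key, "unique": False}


def enable_and_shard(mongos_uri, database, collection, key):
    """Enable sharding on a database, then shard one of its collections."""
    # Imported here so the command builder works without PyMongo installed.
    from pymongo import MongoClient

    client = MongoClient(mongos_uri)
    try:
        client.admin.command("enableSharding", database)
        namespace = database + "." + collection
        return client.admin.command(shard_collection_command(namespace, key))
    finally:
        client.close()
```

The key used here, `{"STCD": 1}`, keeps all readings of one station in a single chunk. If a few stations hold most of the rows, a compound key such as `{"STCD": 1, "TM": 1}` may split more evenly; that is a design consideration to evaluate, not something the original setup tested.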
5.3. Import CSV data into a standalone MongoDB
[root@server1 bin]# ./mongoimport -d RHY -c ST_STBPRP_B --type csv --headerline --file /opt/mongodb/mydata/ST_STBPRP_B.CSV
2018-05-20T16:18:37.849+0800 connected to: localhost
2018-05-20T16:18:38.748+0800 imported 27276 documents
5.4. Import data into the sharded cluster
5.4.1. Install the Python package
pip install pymongo
5.4.2. Access the sharded cluster from Python
import datetime
from pymongo import MongoClient

# Connect through both mongos routers.
client = MongoClient('mongodb://192.168.1.31:20000,192.168.1.34:20000')
db = client.RHY
collection = db.ST_RIVER_R

BATCH_SIZE = 1000
# The original run inserted only the batches past line 1451000, which looks
# like a resume point after an interrupted import. Use 0 for a fresh import.
RESUME_AFTER = 1451000

# Column layout of the CSV file:
# STCD,TM,Z,Q,XSA,XSAVV,XSMXV,FLWCHRCD,WPTN,MSQMT,MSAMT,MSVMT
FLOAT_FIELDS = ['Z', 'Q', 'XSA', 'XSAVV', 'XSMXV']             # columns 2-6
TEXT_FIELDS = ['FLWCHRCD', 'WPTN', 'MSQMT', 'MSAMT', 'MSVMT']  # columns 7-11

f = open("D:/bigdata/st_river_r.CSV")
line = f.readline()  # header line
print(line)
line = f.readline()
count = 0
records = []
insertCount = 0
while line:
    count = count + 1
    # strip() also removes the newline that trails the last column
    fieldValues = [value.strip() for value in line.split(',')]
    # A usable row has all 12 columns and a non-empty station code.
    if len(fieldValues) == 12 and fieldValues[0] != '':
        insertObj = {'STCD': fieldValues[0]}
        # Empty values are skipped instead of being stored as ''.
        if fieldValues[1] != '':
            insertObj['TM'] = datetime.datetime.strptime(fieldValues[1], '%Y-%m-%d %H:%M:%S')
        for i, name in enumerate(FLOAT_FIELDS, start=2):
            if fieldValues[i] != '':
                insertObj[name] = float(fieldValues[i])
        for i, name in enumerate(TEXT_FIELDS, start=7):
            if fieldValues[i] != '':
                insertObj[name] = fieldValues[i]
        records.append(insertObj)
        # Write to the cluster in batches to cut down on round trips.
        if len(records) == BATCH_SIZE:
            insertCount = insertCount + 1
            if count > RESUME_AFTER:
                collection.insert_many(records)
            print(str(count) + ' ' + str(insertCount))
            records = []
    else:
        # Report malformed rows instead of inserting them.
        print(line)
    line = f.readline()
# Flush the last, partial batch.
if records and count > RESUME_AFTER:
    collection.insert_many(records)
f.close()
client.close()
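The batching in the script above can be separated from the parsing, which makes both parts easier to test without a running cluster. A sketch using the standard csv module; `drop_empty`, `read_documents`, and `batches` are hypothetical helpers, and type conversion (dates, floats) is left out for brevity.

```python
import csv
from itertools import islice


def drop_empty(row):
    """Strip values and leave out columns that are empty or missing."""
    return {k: v.strip() for k, v in row.items() if isinstance(v, str) and v.strip()}


def read_documents(csv_file):
    """Yield one document per CSV row from an open text file with a header line."""
    for row in csv.DictReader(csv_file):
        yield drop_empty(row)


def batches(items, size):
    """Yield lists of at most `size` items until `items` is exhausted."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each batch from `batches(read_documents(f), 1000)` could then be passed to `collection.insert_many(batch, ordered=False)`; with `ordered=False` the server keeps going after an individual document fails instead of stopping at the first error.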
5.5. Back up data from a standalone MongoDB
[root@server1 bin]# ./mongodump -h 127.0.0.1:27017 -d RHY -o /opt/mongodb/mydata/dump
5.6. Restore data into the sharded cluster
[mongodb@server1 bin]$ ./mongorestore -h 192.168.1.31 --port 20000 -d RHY /opt/mongodb/mydata/dump/RHY
2018-05-20T17:15:20.762+0800 the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2018-05-20T17:15:20.763+0800 building a list of collections to restore from /opt/mongodb/mydata/dump/RHY dir
2018-05-20T17:15:20.767+0800 reading metadata for RHY.ST_RIVER_R from /opt/mongodb/mydata/dump/RHY/ST_RIVER_R.metadata.json
2018-05-20T17:15:20.767+0800 reading metadata for RHY.ST_STBPRP_B from /opt/mongodb/mydata/dump/RHY/ST_STBPRP_B.metadata.json
2018-05-20T17:15:20.779+0800 restoring RHY.ST_RIVER_R from /opt/mongodb/mydata/dump/RHY/ST_RIVER_R.bson
2018-05-20T17:15:20.779+0800 restoring RHY.ST_STBPRP_B from /opt/mongodb/mydata/dump/RHY/ST_STBPRP_B.bson
2018-05-20T17:15:23.772+0800 [####....................] RHY.ST_STBPRP_B 2.61MB/13.6MB (19.1%)
2018-05-20T17:15:23.772+0800 [........................] RHY.ST_RIVER_R 849KB/2.89GB (0.0%)
2018-05-20T17:15:23.772+0800
2018-05-20T17:15:26.759+0800 [##########..............] RHY.ST_STBPRP_B 6.04MB/13.6MB (44.4%)
2018-05-20T17:15:26.759+0800 [........................] RHY.ST_RIVER_R 2.15MB/2.89GB (0.1%)
2018-05-20T17:15:26.759+0800
2018-05-20T17:15:29.759+0800 [#####################...] RHY.ST_STBPRP_B 12.0MB/13.6MB (87.9%)
2018-05-20T17:15:29.759+0800 [........................] RHY.ST_RIVER_R 4.63MB/2.89GB (0.2%)
2018-05-20T17:15:29.759+0800
2018-05-20T17:15:31.447+0800 [########################] RHY.ST_STBPRP_B 13.6MB/13.6MB (100.0%)
2018-05-20T17:15:31.447+0800 no indexes to restore
2018-05-20T17:15:31.447+0800 finished restoring RHY.ST_STBPRP_B (27276 documents)
2018-05-20T17:15:32.758+0800 [........................] RHY.ST_RIVER_R 6.79MB/2.89GB (0.2%)
2018-05-20T17:15:35.758+0800 [........................] RHY.ST_RIVER_R 8.61MB/2.89GB (0.3%)
2018-05-20T17:15:38.758+0800 [........................] RHY.ST_RIVER_R 11.9MB/2.89GB (0.4%)
2018-05-20T17:15:41.758+0800 [........................] RHY.ST_RIVER_R 15.7MB/2.89GB (0.5%)
Check the restored data through mongos:
[mongodb@server4 conf]$ mongo 192.168.1.31:20000
MongoDB shell version v3.6.4
connecting to: mongodb://192.168.1.31:20000/test
MongoDB server version: 3.6.4
Server has startup warnings:
2018-05-20T17:01:28.017+0800 I CONTROL [main]
2018-05-20T17:01:28.018+0800 I CONTROL [main] ** WARNING: Access control is not enabled for the database.
2018-05-20T17:01:28.018+0800 I CONTROL [main] ** Read and write access to data and configuration is unrestricted.
2018-05-20T17:01:28.018+0800 I CONTROL [main]
mongos> use RHY
switched to db RHY
mongos> db.ST_RIVER_R.count()
1589817
mongos> db.ST_RIVER_R.findOne()
{
"_id" : ObjectId("5b012fdee4a39884b75e4880"),
"STCD" : 60812000,
"TM" : "2011-04-13 02:00:00",
"Z" : 405,
"Q" : 60,
"XSA" : "",
"XSAVV" : "",
"XSMXV" : "",
"FLWCHRCD" : "",
"WPTN" : 4,
"MSQMT" : 1,
"MSAMT" : "",
"MSVMT" : ""
}
mongos> db.ST_RIVER_R.find({"STCD":60812000})
{ "_id" : ObjectId("5b012fdee4a39884b75e4880"), "STCD" : 60812000, "TM" : "2011-04-13 02:00:00", "Z" : 405, "Q" : 60, "XSA" : "", "XSAVV" : "", "XSMXV" : "", "FLWCHRCD" : "", "WPTN" : 4, "MSQMT" : 1, "MSAMT" : "", "MSVMT" : "" }
{ "_id" : ObjectId("5b012fdee4a39884b75e48ad"), "STCD" : 60812000, "TM" : "2011-04-19 02:00:00", "Z" : 377.71, "Q" : 13.3, "XSA" : "", "XSAVV" : "", "XSMXV" : "", "FLWCHRCD" : "", "WPTN" : 6, "MSQMT" : 1, "MSAMT" : "", "MSVMT" : "" }
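A document count through mongos shows that the data arrived, but not how it is spread over the shards. The sketch below assumes that `collStats`, when run through mongos on a sharded collection, returns a `shards` sub-document with a `count` per shard; `shard_counts` and `imbalance` are hypothetical helpers.

```python
def shard_counts(coll_stats):
    """Extract {shard name: document count} from a collStats reply."""
    return {name: stats["count"] for name, stats in coll_stats.get("shards", {}).items()}


def imbalance(counts):
    """Return the largest per-shard count divided by the average (1.0 = even)."""
    if not counts:
        return 0.0
    average = sum(counts.values()) / len(counts)
    return max(counts.values()) / average if average else 0.0
```

The reply could be obtained with `client.RHY.command("collStats", "ST_RIVER_R")`. A ratio well above 1.0 right after a large restore may only mean that the balancer has not finished migrating chunks yet.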