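Preface

MongoDB supports automatic sharding: the cluster splits the data and balances load on its own, which removes most of the burden of managing shards by hand. Sharding cuts a collection into small chunks and spreads them across several shards, so each shard is responsible for part of the data. The chunks are transparent to the application, which does not need to know which data lives on which shard, or even whether the collection is sharded at all. The application connects to a mongos process. mongos knows the mapping between data and shards, forwards each client request to the right shard or shards, collects the responses, and returns the result to the client.

When sharding is appropriate:

1) A server is running out of disk space

2) A single mongod can no longer keep up with a growing write load

3) You want to keep a large amount of data in memory to improve performance

A sharded cluster needs three roles:

1. shard server

Holds the actual data. In older releases a shard could be a single mongod instance; since MongoDB 3.6 each shard must be deployed as a replica set. Even when a shard has several servers, only one of them is the primary, and the others keep copies of the same data. The replica set is what gives each shard automatic failover.

2. config server

To store a collection across several shards, you specify a shard key for it. The shard key determines which chunk each document belongs to. The config servers store the configuration of all shard nodes, the shard key range of each chunk, the distribution of chunks across shards, and the sharding configuration of every database and collection in the cluster.

3. route server

The front-end router of the cluster. It routes every request and merges the results. Clients connect here; the router asks the config servers which shards hold the data to be read or written, connects to those shards, and returns the result to the client. A client simply sends to mongos (the route server) the same requests it would otherwise send to a mongod, and does not need to know which shard holds the data.

shard key: when you shard a collection, you pick one key of the collection as the basis for splitting the data. This key is the shard key. The choice of shard key determines how inserts are distributed across shards. The shard key needs enough distinct values (high cardinality) for the data to spread well across servers. It is also important to keep chunks at a reasonable size, so that balancing and chunk migration do not consume excessive resources.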
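The relationship between shard key ranges, chunks and shards can be pictured with a toy model. The sketch below is not how mongos is implemented; the split points and chunk owners are invented for illustration, and the shard key values are treated as strings.

```python
from bisect import bisect_right

# Hypothetical chunk table: three split points define four chunks.
# Chunk i covers [SPLIT_POINTS[i-1], SPLIT_POINTS[i]) of the shard key space;
# the first and last chunks are open-ended.
SPLIT_POINTS = ["3", "6", "9"]
CHUNK_OWNERS = ["shard1", "shard2", "shard3", "shard4"]  # one owner per chunk


def route(shard_key_value: str) -> str:
    """Return the shard owning the chunk that contains shard_key_value."""
    # bisect_right counts the split points <= value, which is the chunk index
    # because each chunk's lower bound is inclusive.
    chunk_index = bisect_right(SPLIT_POINTS, shard_key_value)
    return CHUNK_OWNERS[chunk_index]


for stcd in ["10812000", "60812000", "95000000"]:
    print(stcd, "->", route(stcd))
```

In a real cluster the chunk table lives on the config servers and changes as chunks are split and migrated; the lookup idea is the same.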

1.  References

Reference documents:

https://www.cnblogs.com/ityouknow/p/7566682.html

https://docs.mongodb.com/manual/tutorial/deploy-shard-cluster/

2.  Installation environment

2.1.  Environment

Servers

Host name             IP            Subnet mask    Gateway      DNS                       Services
server1.smartmap.com  192.168.1.31  255.255.255.0  192.168.1.1  218.30.19.50, 61.134.1.5  Config, Shard1, Shard2, Shard3, Route
server2.smartmap.com  192.168.1.32  255.255.255.0  192.168.1.1  218.30.19.50, 61.134.1.5  Config, Shard1, Shard2, Shard4
server3.smartmap.com  192.168.1.33  255.255.255.0  192.168.1.1  218.30.19.50, 61.134.1.5  Config, Shard1, Shard3, Shard4
server4.smartmap.com  192.168.1.34  255.255.255.0  192.168.1.1  218.30.19.50, 61.134.1.5  Route, Shard2, Shard3, Shard4

Ports

Service  Port
Route    20000
Config   21000
Shard1   27001
Shard2   27002
Shard3   27003
Shard4   27004
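Because several services share each host, it helps to write the layout down once and check it before creating any configuration files. The sketch below encodes the hosts, ports and replica set members used in this article. The dictionary structure and the helper names are illustrative assumptions, not part of any MongoDB tool.

```python
# Hosts, ports and members as used in this article.
HOSTS = {
    "server1": "192.168.1.31",
    "server2": "192.168.1.32",
    "server3": "192.168.1.33",
    "server4": "192.168.1.34",
}
PORTS = {
    "mongos": 20000, "config": 21000,
    "shard1": 27001, "shard2": 27002, "shard3": 27003, "shard4": 27004,
}
MEMBERS = {
    "mongos": ["server1", "server4"],
    "config": ["server1", "server2", "server3"],
    "shard1": ["server1", "server2", "server3"],
    "shard2": ["server4", "server1", "server2"],
    "shard3": ["server3", "server4", "server1"],
    "shard4": ["server2", "server3", "server4"],
}


def member_addresses(service):
    """Return the host:port list of one service, in member order."""
    return [f"{HOSTS[server]}:{PORTS[service]}" for server in MEMBERS[service]]


def services_on(server):
    """Return the services that run on one server, sorted by name."""
    return sorted(name for name, servers in MEMBERS.items() if server in servers)


def check_layout():
    """Raise ValueError if the layout breaks one of the assumptions below."""
    # Services share hosts, so every service needs its own port.
    if len(set(PORTS.values())) != len(PORTS):
        raise ValueError("two services use the same port")
    # Every replica set (everything except mongos) spans three distinct hosts.
    for service, servers in MEMBERS.items():
        if service != "mongos" and len(set(servers)) != 3:
            raise ValueError(f"{service} does not span three distinct hosts")


check_layout()
print(member_addresses("shard2"))
print(services_on("server4"))
```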

2.2.  Install base packages

[root@server1 ~]# yum install unzip wget ntp

2.3.  Install and configure NTP

[root@server1 ~]# yum update

[root@server1 ~]# yum install unzip wget ntp

[root@server1 ~]# systemctl is-enabled ntpd

disabled

[root@server1 ~]# systemctl enable ntpd

Created symlink from /etc/systemd/system/multi-user.target.wants/ntpd.service to /usr/lib/systemd/system/ntpd.service.

[root@server1 ~]# systemctl start ntpd

[root@server1 ~]# ntpdate -u 2.asia.pool.ntp.org

2.4.  Add a user

[root@server1 ~]# useradd mongodb

[root@server1 ~]# passwd mongodb

[root@server1 ~]# chmod u+w /etc/sudoers

[root@server1 ~]#

[root@server1 ~]# vi /etc/sudoers

Add the following line:

mongodb  ALL=(ALL)       NOPASSWD: ALL

3.  Installation

3.1.  Download MongoDB

https://www.mongodb.com/download-center?jmp=nav#community

[root@server1 ~]# mkdir /opt/mongodb

[root@server1 ~]# chown -R mongodb:mongodb /opt/mongodb/

3.2.  Unpack MongoDB

[root@server1 ~]# su - mongodb

[mongodb@server1 mongodb]$ cd /opt/mongodb

[mongodb@server1 mongodb]$ tar -zxvf mongodb-linux-x86_64-rhel70-3.6.4.tgz

[mongodb@server1 mongodb]$ mv mongodb-linux-x86_64-rhel70-3.6.4 mongodb-app

3.3.  Create the directories

Create the conf, mongos, config, shard1, shard2, shard3 and shard4 directories on every server. Because mongos stores no data, it only needs a log directory.

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/conf

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/mongos/log

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/config/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/config/log

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard1/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard1/log

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard2/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard2/log

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard3/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard3/log

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard4/data

[mongodb@server1 mongodb]$ mkdir -p /opt/mongodb/mongodb-app/data/shard4/log

3.4.  Copy to the other nodes

[mongodb@server2 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/

[mongodb@server3 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/

[mongodb@server4 ~]$ scp -r mongodb@192.168.1.31:/opt/mongodb/mongodb-app /opt/mongodb/

3.5.  Environment variables

[mongodb@server1 mongodb]$ sudo vi /etc/profile

export MONGODB_HOME=/opt/mongodb/mongodb-app

export PATH=$MONGODB_HOME/bin:$PATH

[mongodb@server1 mongodb]$ source /etc/profile

[mongodb@server1 mongodb]$ mongod --version

4.  Configuration

4.1.  Config server configuration

4.1.1.  Create the configuration file

Apply the following on servers 31, 32 and 33.

[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/config.conf

## content
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/config/log/config.log

# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/config/data
  journal:
    enabled: true

# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/config/log/configsrv.pid

# network interfaces
net:
  port: 21000
  bindIp: 0.0.0.0

#operationProfiling:
replication:
  replSetName: config

sharding:
  clusterRole: configsvr

4.1.2.  Start the config server on all three machines

[mongodb@server1 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/config.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8072

child process started successfully, parent exiting

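To stop the instance later, use --shutdown with the same configuration file. Do not run this now: the instance must stay up for the next step.

[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/config.conf

4.1.3.  Log in to any config server and initialize the config replica set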

[mongodb@server1 conf]$ mongo 192.168.1.31:21000

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:21000/test

MongoDB server version: 3.6.4

> use admin;

switched to db admin

> config = {
    _id : "config",
    members : [
        {_id : 0, host : "192.168.1.31:21000" },
        {_id : 1, host : "192.168.1.32:21000" },
        {_id : 2, host : "192.168.1.33:21000" }
    ]
}

> rs.initiate(config);

config:SECONDARY> rs.status();

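The same steps as a plain command list.

Log in:

mongo 192.168.1.31:21000

// switch to the admin database
use admin;

// define the replica set configuration
config = {
    _id : "config",
    members : [
        {_id : 0, host : "192.168.1.31:21000" },
        {_id : 1, host : "192.168.1.32:21000" },
        {_id : 2, host : "192.168.1.33:21000" }
    ]
}

// initialize the replica set
rs.initiate(config);

// check the replica set status
rs.status();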
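The configuration document passed to rs.initiate() has the same shape for the config replica set and for every shard, so it can be built from a name, a host list and a port. The sketch below does that in Python. The helper names are illustrative, and the initiate function is an untested assumption about how one might send the same document with pymongo: it relies on replSetInitiate, the database command behind rs.initiate(), and on the directConnection option of recent pymongo releases.

```python
def replica_set_config(name, hosts, port):
    """Build the document that rs.initiate() receives in this article."""
    return {
        "_id": name,
        "members": [
            {"_id": index, "host": f"{host}:{port}"}
            for index, host in enumerate(hosts)
        ],
    }


def initiate(name, hosts, port):
    """Send replSetInitiate to the first member (needs pymongo and a running mongod)."""
    from pymongo import MongoClient  # imported here so the builder works without pymongo

    # directConnection=True talks to this one node even though the
    # replica set has not been initialized yet.
    client = MongoClient(hosts[0], port, directConnection=True)
    try:
        return client.admin.command("replSetInitiate", replica_set_config(name, hosts, port))
    finally:
        client.close()


config = replica_set_config("config", ["192.168.1.31", "192.168.1.32", "192.168.1.33"], 21000)
print(config)
```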

4.2.  Shard server configuration

4.2.1.  shard1 configuration

4.2.1.1.  Create the configuration file

Apply the following on servers 31, 32 and 33.

[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard1.conf

# shard1 config
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/shard1/log/shard1.log

# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/shard1/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20

# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/shard1/log/shard1.pid

# network interfaces
net:
  port: 27001
  bindIp: 0.0.0.0

#operationProfiling:
replication:
  replSetName: shard1

sharding:
  clusterRole: shardsvr
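The four shard configuration files in this article differ only in the shard name and the port, so they can be rendered from one template. The sketch below is one way to do that. The template mirrors the settings shown here, while the function name and its default arguments are illustrative assumptions.

```python
# Template for a shard's mongod configuration file (YAML). Indentation matters:
# nested options must be indented under their parent key.
SHARD_TEMPLATE = """\
systemLog:
  destination: file
  logAppend: true
  path: {base}/data/{name}/log/{name}.log
storage:
  dbPath: {base}/data/{name}/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: {cache_gb}
processManagement:
  fork: true
  pidFilePath: {base}/data/{name}/log/{name}.pid
net:
  port: {port}
  bindIp: 0.0.0.0
replication:
  replSetName: {name}
sharding:
  clusterRole: shardsvr
"""


def render_shard_config(name, port, base="/opt/mongodb/mongodb-app", cache_gb=20):
    """Return the configuration file text for one shard member."""
    return SHARD_TEMPLATE.format(name=name, port=port, base=base, cache_gb=cache_gb)


print(render_shard_config("shard1", 27001))
```

Each host in this layout runs three shard members, so a cacheSizeGB of 20 per member assumes the machine has well over 60 GB of RAM; on smaller machines the value would need to be lowered.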

4.2.1.2.  Start the shard1 server on all three machines

[mongodb@server1 mongodb]$ mongod  --config  /opt/mongodb/mongodb-app/conf/shard1.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8420

child process started successfully, parent exiting

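To stop the instance later, use --shutdown with the same configuration file. Do not run this now: the instance must stay up for the next step.

[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard1.conf

killing process with pid: 8420

4.2.1.3.  Log in to any shard1 member and initialize the replica set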

[mongodb@server1 mongodb]$ mongo 192.168.1.31:27001

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:27001/test

MongoDB server version: 3.6.4

> use admin;

switched to db admin

> config = {
    _id : "shard1",
    members : [
        {_id : 0, host : "192.168.1.31:27001" },
        {_id : 1, host : "192.168.1.32:27001" },
        {_id : 2, host : "192.168.1.33:27001" }
    ]
}

> rs.initiate(config);

{ "ok" : 1 }

shard1:SECONDARY> rs.status();

4.2.2.  shard2 configuration

4.2.2.1.  Create the configuration file

Apply the following on servers 34, 31 and 32.

[mongodb@server4 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard2.conf

# shard2 config
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/shard2/log/shard2.log

# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/shard2/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20

# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/shard2/log/shard2.pid

# network interfaces
net:
  port: 27002
  bindIp: 0.0.0.0

#operationProfiling:
replication:
  replSetName: shard2

sharding:
  clusterRole: shardsvr

4.2.2.2.  Start the shard2 server on all three machines

[mongodb@server4 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard2.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8025

child process started successfully, parent exiting

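To stop the instance later, use --shutdown with the same configuration file. Do not run this now: the instance must stay up for the next step.

[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard2.conf

killing process with pid: 8420

4.2.2.3.  Log in to any shard2 member and initialize the replica set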

[mongodb@server4 mongodb]$ mongo 192.168.1.34:27002

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.34:27002/test

MongoDB server version: 3.6.4

Welcome to the MongoDB shell.

> use admin;

switched to db admin

> config = {
    _id : "shard2",
    members : [
        {_id : 0, host : "192.168.1.34:27002" },
        {_id : 1, host : "192.168.1.31:27002" },
        {_id : 2, host : "192.168.1.32:27002" }
    ]
}

> rs.initiate(config);

{ "ok" : 1 }

shard2:OTHER> rs.status();

4.2.3.  shard3 configuration

4.2.3.1.  Create the configuration file

Apply the following on servers 33, 34 and 31.

[mongodb@server3 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard3.conf

# shard3 config
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/shard3/log/shard3.log

# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/shard3/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20

# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/shard3/log/shard3.pid

# network interfaces
net:
  port: 27003
  bindIp: 0.0.0.0

#operationProfiling:
replication:
  replSetName: shard3

sharding:
  clusterRole: shardsvr

4.2.3.2.  Start the shard3 server on all three machines

[mongodb@server3 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard3.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8179

child process started successfully, parent exiting

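To stop the instance later, use --shutdown with the same configuration file. Do not run this now: the instance must stay up for the next step.

[mongodb@server1 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard3.conf

killing process with pid: 8420

4.2.3.3.  Log in to any shard3 member and initialize the replica set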

[mongodb@server3 mongodb]$ mongo 192.168.1.33:27003

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.33:27003/test

MongoDB server version: 3.6.4

Welcome to the MongoDB shell.

> use admin

switched to db admin

> config = {
    _id : "shard3",
    members : [
        {_id : 0, host : "192.168.1.33:27003" },
        {_id : 1, host : "192.168.1.34:27003" },
        {_id : 2, host : "192.168.1.31:27003" }
    ]
}

> rs.initiate(config);

{ "ok" : 1 }

shard3:OTHER> rs.status();

4.2.4.  shard4 configuration

4.2.4.1.  Create the configuration file

Apply the following on servers 32, 33 and 34.

[mongodb@server2 mongodb]$ vi /opt/mongodb/mongodb-app/conf/shard4.conf

# shard4 config
systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/shard4/log/shard4.log

# Where and how to store data.
storage:
  dbPath: /opt/mongodb/mongodb-app/data/shard4/data
  journal:
    enabled: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 20

# how the process runs
processManagement:
  fork: true
  pidFilePath: /opt/mongodb/mongodb-app/data/shard4/log/shard4.pid

# network interfaces
net:
  port: 27004
  bindIp: 0.0.0.0

#operationProfiling:
replication:
  replSetName: shard4

sharding:
  clusterRole: shardsvr

4.2.4.2.  Start the shard4 server on all three machines

[mongodb@server2 mongodb]$ mongod --config /opt/mongodb/mongodb-app/conf/shard4.conf

about to fork child process, waiting until server is ready for connections.

forked process: 8436

child process started successfully, parent exiting

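To stop the instance later, use --shutdown with the same configuration file. Do not run this now: the instance must stay up for the next step.

[mongodb@server4 mongodb]$ mongod --shutdown --config /opt/mongodb/mongodb-app/conf/shard4.conf

killing process with pid: 8238

4.2.4.3.  Log in to any shard4 member and initialize the replica set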

[mongodb@server2 mongodb]$ mongo 192.168.1.32:27004

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.32:27004/test

MongoDB server version: 3.6.4

Welcome to the MongoDB shell.

For interactive help, type "help".

> use admin

switched to db admin

> config = {
    _id : "shard4",
    members : [
        {_id : 0, host : "192.168.1.32:27004" },
        {_id : 1, host : "192.168.1.33:27004" },
        {_id : 2, host : "192.168.1.34:27004" }
    ]
}

> rs.initiate(config);

{ "ok" : 1 }

shard4:OTHER> rs.status();

4.3.  Configure the router (mongos)

4.3.1.  Create the configuration file

Apply the following on servers 31 and 34.

[mongodb@server1 mongodb]$ vi /opt/mongodb/mongodb-app/conf/mongos.conf

systemLog:
  destination: file
  logAppend: true
  path: /opt/mongodb/mongodb-app/data/mongos/log/mongos.log

processManagement:
  fork: true
#  pidFilePath: /opt/mongodb/mongodb-app/data/mongos/log/mongos.pid

# network interfaces
net:
  port: 20000
  bindIp: 0.0.0.0

# Config servers that mongos connects to, written as <replica set name>/<host:port list>.
# "config" is the name of the config server replica set.
sharding:
  configDB: config/192.168.1.31:21000,192.168.1.32:21000,192.168.1.33:21000

4.3.2.  Start mongos on both servers

Note: start the config servers and the shard servers first, then the mongos instances.

[mongodb@server1 conf]$ mongos --config /opt/mongodb/mongodb-app/conf/mongos.conf

about to fork child process, waiting until server is ready for connections.

forked process: 9869

child process started successfully, parent exiting
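The startup order noted above (config servers and shard servers before mongos) can be checked from a script by waiting until the relevant ports accept TCP connections. The sketch below uses only the Python standard library. The function name and its defaults are illustrative. An open port shows only that the process is listening, not that its replica set has elected a primary.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("192.168.1.31", 21000) could be called for each config server address before running the mongos command shown above.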

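4.4.  Enable sharding

The config servers, the routers and the shard servers are now all running, but an application that connects to mongos cannot use sharding yet. The shards still have to be added to the cluster through mongos before sharding takes effect.

Log in to either mongos instance: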

[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:20000/admin

MongoDB server version: 3.6.4

mongos> use admin;

switched to db admin

mongos> sh.addShard("shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001")

mongos> sh.addShard("shard2/192.168.1.34:27002,192.168.1.31:27002,192.168.1.32:27002")

mongos> sh.addShard("shard3/192.168.1.33:27003,192.168.1.34:27003,192.168.1.31:27003")

mongos> sh.addShard("shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004")

mongos> sh.status()

--- Sharding Status ---

sharding version: {

"_id" : 1,

"minCompatibleVersion" : 5,

"currentVersion" : 6,

"clusterId" : ObjectId("5b00d641da35619896a78891")

}

shards:

{  "_id" : "shard1", "host" : "shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001",  "state" : 1 }

{  "_id" : "shard2", "host" : "shard2/192.168.1.31:27002,192.168.1.32:27002,192.168.1.34:27002",  "state" : 1 }

{  "_id" : "shard3", "host" : "shard3/192.168.1.31:27003,192.168.1.33:27003,192.168.1.34:27003",  "state" : 1 }

{  "_id" : "shard4", "host" : "shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004",  "state" : 1 }

active mongoses:

"3.6.4" : 2

autosplit:

Currently enabled: yes

balancer:

Currently enabled:  yes

Currently running:  no

Failed balancer rounds in last 5 attempts:  0

Migration Results for the last 24 hours:

No recent migrations

databases:

{  "_id" : "config", "primary" : "config",  "partitioned" : true }

mongos>

mongos> sh.getBalancerState()

true

mongos> sh.startBalancer()

{

"ok" : 1,

"$clusterTime" : {

"clusterTime" : Timestamp(1526784919, 4),

"signature" : {

"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),

"keyId" : NumberLong(0)

}

},

"operationTime" : Timestamp(1526784919, 4)

}
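Each sh.addShard() call in this section takes a seed string of the form replica set name, a slash, then a comma-separated host:port list. The sketch below builds those strings. The add_shards function is an untested assumption about sending the same request with pymongo through the addShard database command, and the helper names are illustrative.

```python
def shard_seed(replica_set, hosts, port):
    """Build the '<replica set>/<host:port>,...' string that addShard expects."""
    return replica_set + "/" + ",".join(f"{host}:{port}" for host in hosts)


def add_shards(mongos_uri, seeds):
    """Register every seed through mongos (needs pymongo and a running cluster)."""
    from pymongo import MongoClient  # imported here so shard_seed works without pymongo

    with MongoClient(mongos_uri) as client:
        for seed in seeds:
            client.admin.command("addShard", seed)


print(shard_seed("shard1", ["192.168.1.31", "192.168.1.32", "192.168.1.33"], 27001))
```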

5.  Verification

5.1.  List the shards

[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:20000/admin

MongoDB server version: 3.6.4

mongos> use admin;

switched to db admin

mongos> db.runCommand({ listshards: 1 });   // list the shards

{

"shards" : [

{

"_id" : "shard1",

"host" : "shard1/192.168.1.31:27001,192.168.1.32:27001,192.168.1.33:27001",

"state" : 1

},

{

"_id" : "shard2",

"host" : "shard2/192.168.1.31:27002,192.168.1.32:27002,192.168.1.34:27002",

"state" : 1

},

{

"_id" : "shard3",

"host" : "shard3/192.168.1.31:27003,192.168.1.33:27003,192.168.1.34:27003",

"state" : 1

},

{

"_id" : "shard4",

"host" : "shard4/192.168.1.32:27004,192.168.1.33:27004,192.168.1.34:27004",

"state" : 1

}

],

"ok" : 1,

"$clusterTime" : {

"clusterTime" : Timestamp(1526786005, 2),

"signature" : {

"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),

"keyId" : NumberLong(0)

}

},

"operationTime" : Timestamp(1526786005, 2)

}

5.2.  Enable sharding for a database and a collection

[mongodb@server1 data]$ mongo 192.168.1.31:20000/admin

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:20000/admin

MongoDB server version: 3.6.4

mongos> use admin;

switched to db admin

mongos> db.runCommand({ enablesharding: "RHY" });

{

"ok" : 1,

"$clusterTime" : {

"clusterTime" : Timestamp(1526791888, 10),

"signature" : {

"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),

"keyId" : NumberLong(0)

}

},

"operationTime" : Timestamp(1526791888, 10)

}

mongos> db.runCommand({ shardcollection: "RHY.ST_RIVER_R", key: {STCD: 1}, unique: false })

{

"collectionsharded" : "RHY.ST_RIVER_R",

"collectionUUID" : UUID("370a8a62-cd5d-488a-b752-6fedba5da507"),

"ok" : 1,

"$clusterTime" : {

"clusterTime" : Timestamp(1526791904, 14),

"signature" : {

"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),

"keyId" : NumberLong(0)

}

},

"operationTime" : Timestamp(1526791904, 14)

}

mongos>
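The two commands above can also be issued from a script. The sketch below builds the command documents for a database, a collection and a shard key field. The apply_commands function is an untested assumption about running them through pymongo against mongos, and the helper names are illustrative.

```python
def sharding_commands(database, collection, key_field):
    """Return the enableSharding and shardCollection command documents, in order."""
    namespace = f"{database}.{collection}"
    return [
        {"enableSharding": database},
        # Ascending (range-based) shard key on one field, as in this article.
        {"shardCollection": namespace, "key": {key_field: 1}, "unique": False},
    ]


def apply_commands(mongos_uri, commands):
    """Run each command against the admin database (needs pymongo and a running mongos)."""
    from pymongo import MongoClient  # imported here so the builder works without pymongo

    with MongoClient(mongos_uri) as client:
        return [client.admin.command(command) for command in commands]


for command in sharding_commands("RHY", "ST_RIVER_R", "STCD"):
    print(command)
```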

5.3.  Import CSV data into a standalone MongoDB

[root@server1 bin]# ./mongoimport -d RHY -c ST_STBPRP_B --type csv --headerline --file /opt/mongodb/mydata/ST_STBPRP_B.CSV

2018-05-20T16:18:37.849+0800    connected to: localhost

2018-05-20T16:18:38.748+0800    imported 27276 documents

5.4.  Import data into the sharded cluster

5.4.1.  Install the Python package

pip install pymongo

5.4.2.  Access the sharded cluster from Python

import csv
import datetime

from pymongo import MongoClient

COLUMNS = ('STCD', 'TM', 'Z', 'Q', 'XSA', 'XSAVV', 'XSMXV',
           'FLWCHRCD', 'WPTN', 'MSQMT', 'MSAMT', 'MSVMT')
FLOAT_FIELDS = ('Z', 'Q', 'XSA', 'XSAVV', 'XSMXV')
TEXT_FIELDS = ('FLWCHRCD', 'WPTN', 'MSQMT', 'MSAMT', 'MSVMT')
BATCH_SIZE = 1000
# Batches ending at or before this row count are parsed but not inserted.
# The original run used 1451000 to resume an interrupted import; 0 imports everything.
RESUME_AFTER = 0

# Connect through both mongos routers
client = MongoClient('mongodb://192.168.1.31:20000,192.168.1.34:20000')
collection = client.RHY.ST_RIVER_R

count = 0
insertCount = 0
records = []

with open("D:/bigdata/st_river_r.CSV", newline='') as f:
    reader = csv.reader(f)
    print(next(reader))  # header row
    for fieldValues in reader:
        count += 1
        # Skip malformed rows: wrong column count or empty shard key
        if len(fieldValues) != len(COLUMNS) or fieldValues[0].strip() == '':
            print(fieldValues)
            continue
        row = dict(zip(COLUMNS, (value.strip() for value in fieldValues)))
        insertObj = {'STCD': row['STCD']}
        # Empty fields are left out of the document
        if row['TM']:
            insertObj['TM'] = datetime.datetime.strptime(row['TM'], '%Y-%m-%d %H:%M:%S')
        for name in FLOAT_FIELDS:
            if row[name]:
                insertObj[name] = float(row[name])
        for name in TEXT_FIELDS:
            if row[name]:
                insertObj[name] = row[name]
        records.append(insertObj)
        if len(records) == BATCH_SIZE:
            insertCount += 1
            if count > RESUME_AFTER:
                collection.insert_many(records)
                print(str(count) + '  ' + str(insertCount))
            records = []

# Insert the final partial batch, which the loop above does not flush
if records and count > RESUME_AFTER:
    collection.insert_many(records)
print(count)

client.close()
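When rows are inserted in batches of 1000, the rows left over after the last full batch also have to be written. A small generic helper keeps that logic in one place. The sketch below is plain Python with no MongoDB dependency, and the name batched is illustrative.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to size items; the last list may be shorter."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


for batch in batched(range(5), 2):
    print(batch)
```

With such a helper, an import loop reduces to calling collection.insert_many(batch) once per yielded batch, where collection is the pymongo collection object.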

5.5.  Back up data from the standalone MongoDB

[root@server1 bin]# ./mongodump -h 127.0.0.1:27017 -d RHY -o /opt/mongodb/mydata/dump

5.6.  Restore data into the sharded cluster

[mongodb@server1 bin]$ ./mongorestore -h 192.168.1.31 --port 20000 -d RHY /opt/mongodb/mydata/dump/RHY

2018-05-20T17:15:20.762+0800    the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead

2018-05-20T17:15:20.763+0800    building a list of collections to restore from /opt/mongodb/mydata/dump/RHY dir

2018-05-20T17:15:20.767+0800    reading metadata for RHY.ST_RIVER_R from /opt/mongodb/mydata/dump/RHY/ST_RIVER_R.metadata.json

2018-05-20T17:15:20.767+0800    reading metadata for RHY.ST_STBPRP_B from /opt/mongodb/mydata/dump/RHY/ST_STBPRP_B.metadata.json

2018-05-20T17:15:20.779+0800    restoring RHY.ST_RIVER_R from /opt/mongodb/mydata/dump/RHY/ST_RIVER_R.bson

2018-05-20T17:15:20.779+0800    restoring RHY.ST_STBPRP_B from /opt/mongodb/mydata/dump/RHY/ST_STBPRP_B.bson

2018-05-20T17:15:23.772+0800    [####....................]  RHY.ST_STBPRP_B  2.61MB/13.6MB (19.1%)

2018-05-20T17:15:23.772+0800    [........................]   RHY.ST_RIVER_R   849KB/2.89GB   (0.0%)

2018-05-20T17:15:23.772+0800

2018-05-20T17:15:26.759+0800    [##########..............]  RHY.ST_STBPRP_B  6.04MB/13.6MB (44.4%)

2018-05-20T17:15:26.759+0800    [........................]   RHY.ST_RIVER_R  2.15MB/2.89GB   (0.1%)

2018-05-20T17:15:26.759+0800

2018-05-20T17:15:29.759+0800    [#####################...]  RHY.ST_STBPRP_B  12.0MB/13.6MB (87.9%)

2018-05-20T17:15:29.759+0800    [........................]   RHY.ST_RIVER_R  4.63MB/2.89GB   (0.2%)

2018-05-20T17:15:29.759+0800

2018-05-20T17:15:31.447+0800    [########################]  RHY.ST_STBPRP_B  13.6MB/13.6MB (100.0%)

2018-05-20T17:15:31.447+0800    no indexes to restore

2018-05-20T17:15:31.447+0800    finished restoring RHY.ST_STBPRP_B (27276 documents)

2018-05-20T17:15:32.758+0800    [........................]  RHY.ST_RIVER_R  6.79MB/2.89GB (0.2%)

2018-05-20T17:15:35.758+0800    [........................]  RHY.ST_RIVER_R  8.61MB/2.89GB (0.3%)

2018-05-20T17:15:38.758+0800    [........................]  RHY.ST_RIVER_R  11.9MB/2.89GB (0.4%)

2018-05-20T17:15:41.758+0800    [........................]  RHY.ST_RIVER_R  15.7MB/2.89GB (0.5%)

[mongodb@server4 conf]$ mongo 192.168.1.31:20000

MongoDB shell version v3.6.4

connecting to: mongodb://192.168.1.31:20000/test

MongoDB server version: 3.6.4

Server has startup warnings:

2018-05-20T17:01:28.017+0800 I CONTROL  [main]

2018-05-20T17:01:28.018+0800 I CONTROL  [main] ** WARNING: Access control is not enabled for the database.

2018-05-20T17:01:28.018+0800 I CONTROL  [main] **          Read and write access to data and configuration is unrestricted.

2018-05-20T17:01:28.018+0800 I CONTROL  [main]

mongos> use RHY

switched to db RHY

mongos> db.ST_RIVER_R.count()

1589817

mongos> db.ST_RIVER_R.findOne()

{

"_id" : ObjectId("5b012fdee4a39884b75e4880"),

"STCD" : 60812000,

"TM" : "2011-04-13 02:00:00",

"Z" : 405,

"Q" : 60,

"XSA" : "",

"XSAVV" : "",

"XSMXV" : "",

"FLWCHRCD" : "",

"WPTN" : 4,

"MSQMT" : 1,

"MSAMT" : "",

"MSVMT" : ""

}

mongos> db.ST_RIVER_R.find({"STCD":60812000})

{ "_id" : ObjectId("5b012fdee4a39884b75e4880"), "STCD" : 60812000, "TM" : "2011-04-13 02:00:00", "Z" : 405, "Q" : 60, "XSA" : "", "XSAVV" : "", "XSMXV" : "", "FLWCHRCD" : "", "WPTN" : 4, "MSQMT" : 1, "MSAMT" : "", "MSVMT" : "" }

{ "_id" : ObjectId("5b012fdee4a39884b75e48ad"), "STCD" : 60812000, "TM" : "2011-04-19 02:00:00", "Z" : 377.71, "Q" : 13.3, "XSA" : "", "XSAVV" : "", "XSMXV" : "", "FLWCHRCD" : "", "WPTN" : 6, "MSQMT" : 1, "MSAMT" : "", "MSVMT" : "" }
