A previous post covered the deployment of a GlusterFS distributed storage cluster; this note walks through simulating the replacement of a failed brick:

1) The GlusterFS cluster consists of 4 nodes; the cluster information is as follows:

  1. On every node, configure /etc/hosts, synchronize the system time, and disable the firewall and SELinux
  2. [root@GlusterFS-slave data]# cat /etc/hosts
  3. 192.168.10.239 GlusterFS-master
  4. 192.168.10.212 GlusterFS-slave
  5. 192.168.10.204 GlusterFS-slave2
  6. 192.168.10.220 GlusterFS-slave3
  7.  
  8. ------------------------------------------------------------------------------------
  9. On each of the four nodes, use dd to create a virtual partition, then create the storage directory on it
  10. [root@GlusterFS-master ~]# df -h
  11. Filesystem Size Used Avail Use% Mounted on
  12. /dev/mapper/centos-root 36G 1.8G 34G 5% /
  13. devtmpfs 2.9G 0 2.9G 0% /dev
  14. tmpfs 2.9G 0 2.9G 0% /dev/shm
  15. tmpfs 2.9G 8.5M 2.9G 1% /run
  16. tmpfs 2.9G 0 2.9G 0% /sys/fs/cgroup
  17. /dev/vda1 1014M 143M 872M 15% /boot
  18. /dev/mapper/centos-home 18G 33M 18G 1% /home
  19. tmpfs 581M 0 581M 0% /run/user/0
  20.  
  21. Use dd to carve out a virtual partition, then format it and mount it at /data
  22. [root@GlusterFS-master ~]# dd if=/dev/vda1 of=/dev/vdb1
  23. 2097152+0 records in
  24. 2097152+0 records out
  25. 1073741824 bytes (1.1 GB) copied, 2.0979 s, 512 MB/s
  26.  
  27. [root@GlusterFS-master ~]# du -sh /dev/vdb1
  28. 1.0G /dev/vdb1
  29.  
  30. [root@GlusterFS-master ~]# mkfs.xfs -f /dev/vdb1 //format it as XFS here; formatting it as ext4 would also work.
  31. meta-data=/dev/vdb1 isize=512 agcount=4, agsize=65536 blks
  32. = sectsz=512 attr=2, projid32bit=1
  33. = crc=1 finobt=0, sparse=0
  34. data = bsize=4096 blocks=262144, imaxpct=25
  35. = sunit=0 swidth=0 blks
  36. naming =version 2 bsize=4096 ascii-ci=0 ftype=1
  37. log =internal log bsize=4096 blocks=2560, version=2
  38. = sectsz=512 sunit=0 blks, lazy-count=1
  39. realtime =none extsz=4096 blocks=0, rtextents=0
  40.  
  41. [root@GlusterFS-master ~]# mkdir /data
  42.  
  43. [root@GlusterFS-master ~]# mount /dev/vdb1 /data
  44.  
  45. [root@GlusterFS-master ~]# df -h
  46. Filesystem Size Used Avail Use% Mounted on
  47. /dev/mapper/centos-root 36G 1.8G 34G 5% /
  48. devtmpfs 2.9G 34M 2.8G 2% /dev
  49. tmpfs 2.9G 0 2.9G 0% /dev/shm
  50. tmpfs 2.9G 8.5M 2.9G 1% /run
  51. tmpfs 2.9G 0 2.9G 0% /sys/fs/cgroup
  52. /dev/vda1 1014M 143M 872M 15% /boot
  53. /dev/mapper/centos-home 18G 33M 18G 1% /home
  54. tmpfs 581M 0 581M 0% /run/user/0
  55. /dev/loop0 976M 2.6M 907M 1% /data
  56.  
  57. [root@GlusterFS-master ~]# fdisk -l
  58. .......
  59. Disk /dev/loop0: 1073 MB, 1073741824 bytes, 2097152 sectors
  60. Units = sectors of 1 * 512 = 512 bytes
  61. Sector size (logical/physical): 512 bytes / 512 bytes
  62. I/O size (minimum/optimal): 512 bytes / 512 bytes
  63.  
  64. Configure automatic mounting at boot
  65. [root@GlusterFS-master ~]# echo '/dev/loop0 /data xfs defaults 1 2' >> /etc/fstab
  66.  
  67. Remember: all of the steps above must be repeated on each of the four nodes to prepare the brick storage directories (a consolidated sketch follows this list)!
  68. ----------------------------------------------------------------------------------
  69.  
  70. The intermediate steps of deploying the GlusterFS cluster are omitted here; for details see: http://www.cnblogs.com/kevingrace/p/8743812.html
  71.  
  72. Create the trusted pool; run the following on the GlusterFS-master node (a scripted version is sketched after this list):
  73. [root@GlusterFS-master ~]# gluster peer probe 192.168.10.212
  74. peer probe: success.
  75. [root@GlusterFS-master ~]# gluster peer probe 192.168.10.204
  76. peer probe: success.
  77. [root@GlusterFS-master ~]# gluster peer probe 192.168.10.220
  78. peer probe: success.
  79.  
  80. Check the cluster status
  81. [root@GlusterFS-master ~]# gluster peer status
  82. Number of Peers: 3
  83.  
  84. Hostname: 192.168.10.212
  85. Uuid: f8e69297-4690-488e-b765-c1c404810d6a
  86. State: Peer in Cluster (Connected)
  87.  
  88. Hostname: 192.168.10.204
  89. Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
  90. State: Peer in Cluster (Connected)
  91.  
  92. Hostname: 192.168.10.220
  93. Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
  94. State: Peer in Cluster (Connected)
  95.  
  96. Checking the cluster status on any of the other nodes now shows the GlusterFS-master node
  97. [root@GlusterFS-slave ~]# gluster peer status
  98. Number of Peers: 3
  99.  
  100. Hostname: GlusterFS-master
  101. Uuid: 5dfd40e2-096b-40b5-bee3-003b57a39007
  102. State: Peer in Cluster (Connected)
  103.  
  104. Hostname: 192.168.10.204
  105. Uuid: a989394c-f64a-40c3-8bc5-820f623952c4
  106. State: Peer in Cluster (Connected)
  107.  
  108. Hostname: 192.168.10.220
  109. Uuid: dd99743a-285b-4aed-b3d6-e860f9efd965
  110. State: Peer in Cluster (Connected)
  111.  
  112. Create a replicated volume
  113. [root@GlusterFS-master ~]# gluster volume info
  114. No volumes present
  115.  
  116. [root@GlusterFS-master ~]# gluster volume create models replica 2 192.168.10.239:/data/gluster 192.168.10.212:/data/gluster force
  117. volume create: models: success: please start the volume to access data
  118.  
  119. [root@GlusterFS-master ~]# gluster volume list
  120. models
  121.  
  122. [root@GlusterFS-master ~]# gluster volume info
  123.  
  124. Volume Name: models
  125. Type: Replicate
  126. Volume ID: 8eafb261-e0d2-4f3b-8e09-05475c63dcc6
  127. Status: Created
  128. Number of Bricks: 1 x 2 = 2
  129. Transport-type: tcp
  130. Bricks:
  131. Brick1: 192.168.10.239:/data/gluster
  132. Brick2: 192.168.10.212:/data/gluster
  133.  
  134. Start the models volume
  135. [root@GlusterFS-master ~]# gluster volume start models
  136. volume start: models: success
  137.  
  138. [root@GlusterFS-master ~]# gluster volume status models
  139. Status of volume: models
  140. Gluster process Port Online Pid
  141. ------------------------------------------------------------------------------
  142. Brick 192.168.10.239:/data/gluster 49156 Y 16040
  143. Brick 192.168.10.212:/data/gluster 49157 Y 5544
  144. NFS Server on localhost N/A N N/A
  145. Self-heal Daemon on localhost N/A Y 16059
  146. NFS Server on 192.168.10.204 N/A N N/A
  147. Self-heal Daemon on 192.168.10.204 N/A Y 12412
  148. NFS Server on 192.168.10.220 N/A N N/A
  149. Self-heal Daemon on 192.168.10.220 N/A Y 17656
  150. NFS Server on 192.168.10.212 N/A N N/A
  151. Self-heal Daemon on 192.168.10.212 N/A Y 5563
  152.  
  153. Task Status of Volume models
  154. ------------------------------------------------------------------------------
  155. There are no active volume tasks
  156.  
  157. Append the bricks of the other two nodes to the volume, i.e. expand the volume
  158. [root@GlusterFS-master ~]# gluster volume add-brick models 192.168.10.204:/data/gluster 192.168.10.220:/data/gluster force
  159. volume add-brick: success
  160.  
  161. [root@GlusterFS-master ~]# gluster volume info
  162.  
  163. Volume Name: models
  164. Type: Distributed-Replicate
  165. Volume ID: 8eafb261-e0d2-4f3b-8e09-05475c63dcc6
  166. Status: Started
  167. Number of Bricks: 2 x 2 = 4
  168. Transport-type: tcp
  169. Bricks:
  170. Brick1: 192.168.10.239:/data/gluster
  171. Brick2: 192.168.10.212:/data/gluster
  172. Brick3: 192.168.10.204:/data/gluster
  173. Brick4: 192.168.10.220:/data/gluster
  174. ------------------------------------------------------------------------------
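The per-node preparation above can be collected into one small script. The sketch below is illustrative only and deviates from the walkthrough in one detail: it builds the 1 GB backing store as a file filled from /dev/zero (at the hypothetical path /root/brick.img) instead of copying /dev/vda1 into /dev/vdb1; the mount point /data and the brick directory /data/gluster match the rest of this post.

```bash
#!/bin/bash
# Illustrative sketch: prepare a loopback-backed XFS filesystem for one brick.
# Run the same script on all four nodes. The backing-file path is an assumption,
# not taken from the original walkthrough.
set -e

IMG=/root/brick.img   # hypothetical 1 GB backing file
MNT=/data             # mount point used throughout this post

dd if=/dev/zero of="$IMG" bs=1M count=1024   # create the backing file
mkfs.xfs -f "$IMG"                           # XFS, as in the walkthrough (ext4 also works)
mkdir -p "$MNT"
mount -o loop "$IMG" "$MNT"                  # the kernel picks a free /dev/loopN
mkdir -p "$MNT/gluster"                      # brick directory used by the volume

# Persist across reboots. Referencing the backing file (with the loop option)
# is more robust than hard-coding /dev/loop0 as in the fstab line above.
grep -q "$IMG" /etc/fstab || echo "$IMG $MNT xfs loop,defaults 0 0" >> /etc/fstab
```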

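The trusted-pool and volume steps can be scripted the same way. A minimal sketch, run from GlusterFS-master, assuming glusterd is already running on all four nodes and the /data/gluster brick directory exists on each of them (see the previous sketch):

```bash
#!/bin/bash
# Illustrative sketch: build the "models" volume exactly as in the manual
# commands above: a 2-brick replica first, then expansion to 2 x 2.
set -e

PEERS="192.168.10.212 192.168.10.204 192.168.10.220"
BRICK_DIR=/data/gluster

for p in $PEERS; do
    gluster peer probe "$p"
done
gluster peer status

# Create and start a plain 2-way replicated volume...
gluster volume create models replica 2 \
    192.168.10.239:$BRICK_DIR 192.168.10.212:$BRICK_DIR force
gluster volume start models

# ...then add a second replica pair, which turns it into Distributed-Replicate.
gluster volume add-brick models \
    192.168.10.204:$BRICK_DIR 192.168.10.220:$BRICK_DIR force
gluster volume info models
```

Creating the volume with one replica pair and then running add-brick with a second pair is what produces the 2 x 2 Distributed-Replicate layout shown in the volume info output above.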
2) Test the Gluster volume

  1. Mount the GlusterFS volume on the client
  2. [root@Client ~]# mount -t glusterfs 192.168.10.239:models /opt/gfsmount
  3. [root@Client gfsmount]# df -h
  4. ........
  5. 192.168.10.239:models 2.0G 65M 2.0G 4% /opt/gfsmount
  6.  
  7. [root@Client ~]# cd /opt/gfsmount/
  8. [root@Client gfsmount]# ls
  9. [root@Client gfsmount]#
  10.  
  11. Write some test data
  12. [root@Client gfsmount]# for i in `seq -w 1 100`; do cp -rp /var/log/messages /opt/gfsmount/copy-test-$i; done
  13. [root@Client gfsmount]# ls /opt/gfsmount/
  14. copy-test-001 copy-test-014 copy-test-027 copy-test-040 copy-test-053 copy-test-066 copy-test-079 copy-test-092
  15. copy-test-002 copy-test-015 copy-test-028 copy-test-041 copy-test-054 copy-test-067 copy-test-080 copy-test-093
  16. copy-test-003 copy-test-016 copy-test-029 copy-test-042 copy-test-055 copy-test-068 copy-test-081 copy-test-094
  17. copy-test-004 copy-test-017 copy-test-030 copy-test-043 copy-test-056 copy-test-069 copy-test-082 copy-test-095
  18. copy-test-005 copy-test-018 copy-test-031 copy-test-044 copy-test-057 copy-test-070 copy-test-083 copy-test-096
  19. copy-test-006 copy-test-019 copy-test-032 copy-test-045 copy-test-058 copy-test-071 copy-test-084 copy-test-097
  20. copy-test-007 copy-test-020 copy-test-033 copy-test-046 copy-test-059 copy-test-072 copy-test-085 copy-test-098
  21. copy-test-008 copy-test-021 copy-test-034 copy-test-047 copy-test-060 copy-test-073 copy-test-086 copy-test-099
  22. copy-test-009 copy-test-022 copy-test-035 copy-test-048 copy-test-061 copy-test-074 copy-test-087 copy-test-100
  23. copy-test-010 copy-test-023 copy-test-036 copy-test-049 copy-test-062 copy-test-075 copy-test-088
  24. copy-test-011 copy-test-024 copy-test-037 copy-test-050 copy-test-063 copy-test-076 copy-test-089
  25. copy-test-012 copy-test-025 copy-test-038 copy-test-051 copy-test-064 copy-test-077 copy-test-090
  26. copy-test-013 copy-test-026 copy-test-039 copy-test-052 copy-test-065 copy-test-078 copy-test-091
  27.  
  28. [root@Client gfsmount]# ls -lA /opt/gfsmount|wc -l
  29. 101
  30.  
  31. Confirm on the nodes as well: the 100 files were split by DHT hashing into two balanced sets of 50, one replicated to nodes 1-2 and the other to nodes 3-4 (see the verification sketch after this list).
  32. [root@GlusterFS-master ~]# ls /data/gluster
  33. copy-test-001 copy-test-016 copy-test-028 copy-test-038 copy-test-054 copy-test-078 copy-test-088 copy-test-100
  34. copy-test-004 copy-test-017 copy-test-029 copy-test-039 copy-test-057 copy-test-079 copy-test-090
  35. copy-test-006 copy-test-019 copy-test-030 copy-test-041 copy-test-060 copy-test-081 copy-test-093
  36. copy-test-008 copy-test-021 copy-test-031 copy-test-046 copy-test-063 copy-test-082 copy-test-094
  37. copy-test-011 copy-test-022 copy-test-032 copy-test-048 copy-test-065 copy-test-083 copy-test-095
  38. copy-test-012 copy-test-023 copy-test-033 copy-test-051 copy-test-073 copy-test-086 copy-test-098
  39. copy-test-015 copy-test-024 copy-test-034 copy-test-052 copy-test-077 copy-test-087 copy-test-099
  40. [root@GlusterFS-master ~]# ll /data/gluster|wc -l
  41. 51
  42.  
  43. [root@GlusterFS-slave ~]# ls /data/gluster/
  44. copy-test-001 copy-test-016 copy-test-028 copy-test-038 copy-test-054 copy-test-078 copy-test-088 copy-test-100
  45. copy-test-004 copy-test-017 copy-test-029 copy-test-039 copy-test-057 copy-test-079 copy-test-090
  46. copy-test-006 copy-test-019 copy-test-030 copy-test-041 copy-test-060 copy-test-081 copy-test-093
  47. copy-test-008 copy-test-021 copy-test-031 copy-test-046 copy-test-063 copy-test-082 copy-test-094
  48. copy-test-011 copy-test-022 copy-test-032 copy-test-048 copy-test-065 copy-test-083 copy-test-095
  49. copy-test-012 copy-test-023 copy-test-033 copy-test-051 copy-test-073 copy-test-086 copy-test-098
  50. copy-test-015 copy-test-024 copy-test-034 copy-test-052 copy-test-077 copy-test-087 copy-test-099
  51. [root@GlusterFS-slave ~]# ll /data/gluster/|wc -l
  52. 51
  53.  
  54. [root@GlusterFS-slave2 ~]# ls /data/gluster/
  55. copy-test-002 copy-test-014 copy-test-036 copy-test-047 copy-test-059 copy-test-069 copy-test-080 copy-test-097
  56. copy-test-003 copy-test-018 copy-test-037 copy-test-049 copy-test-061 copy-test-070 copy-test-084
  57. copy-test-005 copy-test-020 copy-test-040 copy-test-050 copy-test-062 copy-test-071 copy-test-085
  58. copy-test-007 copy-test-025 copy-test-042 copy-test-053 copy-test-064 copy-test-072 copy-test-089
  59. copy-test-009 copy-test-026 copy-test-043 copy-test-055 copy-test-066 copy-test-074 copy-test-091
  60. copy-test-010 copy-test-027 copy-test-044 copy-test-056 copy-test-067 copy-test-075 copy-test-092
  61. copy-test-013 copy-test-035 copy-test-045 copy-test-058 copy-test-068 copy-test-076 copy-test-096
  62. [root@GlusterFS-slave2 ~]# ll /data/gluster/|wc -l
  63. 51
  64.  
  65. [root@GlusterFS-slave3 ~]# ls /data/gluster/
  66. copy-test-002 copy-test-014 copy-test-036 copy-test-047 copy-test-059 copy-test-069 copy-test-080 copy-test-097
  67. copy-test-003 copy-test-018 copy-test-037 copy-test-049 copy-test-061 copy-test-070 copy-test-084
  68. copy-test-005 copy-test-020 copy-test-040 copy-test-050 copy-test-062 copy-test-071 copy-test-085
  69. copy-test-007 copy-test-025 copy-test-042 copy-test-053 copy-test-064 copy-test-072 copy-test-089
  70. copy-test-009 copy-test-026 copy-test-043 copy-test-055 copy-test-066 copy-test-074 copy-test-091
  71. copy-test-010 copy-test-027 copy-test-044 copy-test-056 copy-test-067 copy-test-075 copy-test-092
  72. copy-test-013 copy-test-035 copy-test-045 copy-test-058 copy-test-068 copy-test-076 copy-test-096
  73. [root@GlusterFS-slave3 ~]# ll /data/gluster/|wc -l
  74. 51
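To confirm the distribution described above without logging into each node by hand, the brick listings can be compared pairwise from any machine. A minimal sketch, assuming passwordless SSH as root to the four hostnames from /etc/hosts:

```bash
#!/bin/bash
# Illustrative sketch: check that each replica pair holds identical file sets
# and that the two pairs together hold all 100 test files.

BRICK=/data/gluster

list() { ssh "root@$1" "ls $BRICK" | sort; }

diff <(list GlusterFS-master) <(list GlusterFS-slave)  && echo "pair 1 in sync"
diff <(list GlusterFS-slave2) <(list GlusterFS-slave3) && echo "pair 2 in sync"

# The union of the two pairs should be the full set of test files.
total=$( { list GlusterFS-master; list GlusterFS-slave2; } | sort -u | wc -l )
echo "distinct files across both pairs: $total"
```

Each replica pair should list the same set of roughly 50 files, and the union of the two pairs should come to 100.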

3) Simulate a brick failure

  1. 1) Check the current storage status
  2. Run on the GlusterFS-slave3 node
  3. [root@GlusterFS-slave3 ~]# gluster volume status
  4. Status of volume: models
  5. Gluster process Port Online Pid
  6. ------------------------------------------------------------------------------
  7. Brick 192.168.10.239:/data/gluster 49156 Y 16040
  8. Brick 192.168.10.212:/data/gluster 49157 Y 5544
  9. Brick 192.168.10.204:/data/gluster 49157 Y 12432
  10. Brick 192.168.10.220:/data/gluster 49158 Y 17678
  11. NFS Server on localhost N/A N N/A
  12. Self-heal Daemon on localhost N/A Y 17697
  13. NFS Server on GlusterFS-master N/A N N/A
  14. Self-heal Daemon on GlusterFS-master N/A Y 16104
  15. NFS Server on 192.168.10.204 N/A N N/A
  16. Self-heal Daemon on 192.168.10.204 N/A Y 12451
  17. NFS Server on 192.168.10.212 N/A N N/A
  18. Self-heal Daemon on 192.168.10.212 N/A Y 5593
  19.  
  20. Task Status of Volume models
  21. ------------------------------------------------------------------------------
  22. There are no active volume tasks
  23.  
  24. Note: the Online column shows "Y" for every process
  25.  
  26. 2) Create the fault (this simulates a filesystem failure; assume the physical disk itself is fine, or that the failed disk in the array has already been replaced)
  27. Run on the GlusterFS-slave3 node
  28. [root@GlusterFS-slave3 ~]# vim /etc/fstab //comment out the following line
  29. ......
  30. #/dev/loop0 /data xfs defaults 1 2
  31.  
  32. Reboot the server
  33. [root@GlusterFS-slave3 ~]# reboot
  34.  
  35. After the reboot, /data on GlusterFS-slave3 is no longer mounted
  36. [root@GlusterFS-slave3 ~]# df -h
  37.  
  38. After the reboot, the storage directory on GlusterFS-slave3 is gone and its data is missing.
  39. [root@GlusterFS-slave3 ~]# ls /data/
  40. [root@GlusterFS-slave3 ~]#
  41.  
  42. After the reboot, remember to start the glusterd service
  43. [root@GlusterFS-slave3 ~]# /usr/local/glusterfs/sbin/glusterd
  44. [root@GlusterFS-slave3 ~]# ps -ef|grep gluster
  45. root 11122 1 4 23:13 ? 00:00:00 /usr/local/glusterfs/sbin/glusterd
  46. root 11269 1 2 23:13 ? 00:00:00 /usr/local/glusterfs/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /usr/local/glusterfs/var/lib/glusterd/glustershd/run/glustershd.pid -l /usr/local/glusterfs/var/log/glusterfs/glustershd.log -S /var/run/98e3200bc6620c9d920e9dc65624dbe0.socket --xlator-option *replicate*.node-uuid=dd99743a-285b-4aed-b3d6-e860f9efd965
  47. root 11280 5978 0 23:13 pts/0 00:00:00 grep --color=auto gluster
  48.  
  49. 3) Check the storage status again
  50. [root@GlusterFS-slave3 ~]# gluster volume status
  51. Status of volume: models
  52. Gluster process Port Online Pid
  53. ------------------------------------------------------------------------------
  54. Brick 192.168.10.239:/data/gluster 49156 Y 16040
  55. Brick 192.168.10.212:/data/gluster 49157 Y 5544
  56. Brick 192.168.10.204:/data/gluster 49157 Y 12432
  57. Brick 192.168.10.220:/data/gluster N/A N N/A
  58. NFS Server on localhost N/A N N/A
  59. Self-heal Daemon on localhost N/A Y 11269
  60. NFS Server on GlusterFS-master N/A N N/A
  61. Self-heal Daemon on GlusterFS-master N/A Y 16104
  62. NFS Server on 192.168.10.212 N/A N N/A
  63. Self-heal Daemon on 192.168.10.212 N/A Y 5593
  64. NFS Server on 192.168.10.204 N/A N N/A
  65. Self-heal Daemon on 192.168.10.204 N/A Y 12451
  66.  
  67. Task Status of Volume models
  68. ------------------------------------------------------------------------------
  69. There are no active volume tasks
  70.  
  71. Note: the Online status of the GlusterFS-slave3 brick (192.168.10.220) is now "N"!
  72.  
  73. 4) How to recover the failed brick (a consolidated outline script follows this list)
  74.  
  75. 4.1) Stop the failed brick's process
  76. In the "gluster volume status" output above, if the brick on GlusterFS-slave3 whose Online column is "N" still shows a PID (rather than N/A), kill it with "kill -15 <pid>"!
  77. Normally, once the Online column is "N", no PID is shown anyway.
  78.  
  79. 4.2) Create a new data directory (it must not be the same path as the previous one)
  80. [root@GlusterFS-slave3 ~]# dd if=/dev/vda1 of=/dev/vdb1
  81. 2097152+0 records in
  82. 2097152+0 records out
  83. 1073741824 bytes (1.1 GB) copied, 2.05684 s, 522 MB/s
  84. [root@GlusterFS-slave3 ~]# du -sh /dev/vdb1
  85. 1.0G /dev/vdb1
  86. [root@GlusterFS-slave3 ~]# mkfs.xfs -f /dev/vdb1
  87. meta-data=/dev/vdb1 isize=512 agcount=4, agsize=65536 blks
  88. = sectsz=512 attr=2, projid32bit=1
  89. = crc=1 finobt=0, sparse=0
  90. data = bsize=4096 blocks=262144, imaxpct=25
  91. = sunit=0 swidth=0 blks
  92. naming =version 2 bsize=4096 ascii-ci=0 ftype=1
  93. log =internal log bsize=4096 blocks=2560, version=2
  94. = sectsz=512 sunit=0 blks, lazy-count=1
  95. realtime =none extsz=4096 blocks=0, rtextents=0
  96.  
  97. Remount the partition
  98. [root@GlusterFS-slave3 ~]# mount /dev/vdb1 /data
  99.  
  100. [root@GlusterFS-slave3 ~]# vim /etc/fstab //uncomment the following line
  101. ......
  102. /dev/loop0 /data xfs defaults 1 2
  103.  
  104. 4.3) Query the extended attributes of the brick directory on the failed node's replica partner (GlusterFS-slave2) ("yum search getfattr" shows which package provides the getfattr tool)
  105. [root@GlusterFS-slave2 ~]# yum install -y attr.x86_64
  106. [root@GlusterFS-slave2 ~]# getfattr -d -m. -e hex /data/gluster
  107. getfattr: Removing leading '/' from absolute path names
  108. # file: data/gluster
  109. trusted.gfid=0x00000000000000000000000000000001
  110. trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
  111. trusted.glusterfs.volume-id=0x8eafb261e0d24f3b8e0905475c63dcc6
  112.  
  113. 4.4) Mount the volume and trigger self-heal
  114. On the client, unmount the previous mount first
  115. [root@Client ~]# umount /opt/gfsmount
  116.  
  117. Then remount via GlusterFS-slave3 (in fact, mounting through any node works)
  118. [root@Client ~]# mount -t glusterfs 192.168.10.220:models /opt/gfsmount
  119. [root@Client ~]# df -h
  120. .......
  121. 192.168.10.220:models 2.0G 74M 2.0G 4% /opt/gfsmount
  122.  
  123. Create a directory that does not yet exist in the volume, then delete it
  124. [root@Client ~]# cd /opt/gfsmount/
  125. [root@Client gfsmount]# mkdir testDir001
  126. [root@Client gfsmount]# rm -rf testDir001
  127.  
  128. Set and remove an extended attribute to trigger self-heal
  129. [root@Client gfsmount]# setfattr -n trusted.non-existent-key -v abc /opt/gfsmount
  130. [root@Client gfsmount]# setfattr -x trusted.non-existent-key /opt/gfsmount
  131.  
  132. 4.5) Check whether pending xattrs have been set on the surviving node
  133. Query the extended attributes of the brick directory on the failed node's replica partner (GlusterFS-slave2) again
  134. [root@GlusterFS-slave2 ~]# getfattr -d -m. -e hex /data/gluster
  135. getfattr: Removing leading '/' from absolute path names
  136. # file: data/gluster
  137. trusted.afr.dirty=0x000000000000000000000000
  138. trusted.afr.models-client-2=0x000000000000000000000000
  139. trusted.afr.models-client-3=0x000000000000000200000002
  140. trusted.gfid=0x00000000000000000000000000000001
  141. trusted.glusterfs.dht=0x00000001000000007fffffffffffffff
  142. trusted.glusterfs.volume-id=0x8eafb261e0d24f3b8e0905475c63dcc6
  143.  
  144. Note: look at the trusted.afr.models-client-3 line; its non-zero value records pending changes for the replaced brick, i.e. this node is marked as the heal source for GlusterFS-slave3:/data/gluster
  145.  
  146. 4.6) Check whether the volume heal status shows that the brick needs to be replaced
  147. [root@GlusterFS-slave3 ~]# gluster volume heal models info
  148. Brick GlusterFS-master:/data/gluster/
  149. Number of entries: 0
  150.  
  151. Brick GlusterFS-slave:/data/gluster/
  152. Number of entries: 0
  153.  
  154. Brick GlusterFS-slave2:/data/gluster/
  155. /
  156. Number of entries: 1
  157.  
  158. Brick 192.168.10.220:/data/gluster
  159. Status: Transport endpoint is not connected
  160.  
  161. Note: the last line reports that the transport endpoint is not connected
  162.  
  163. 4.7) Complete the operation with a forced commit
  164. [root@GlusterFS-slave3 ~]# gluster volume replace-brick models 192.168.10.220:/data/gluster 192.168.10.220:/data/gluster1 commit force
  165. Output like the following indicates successful completion:
  166. volume replace-brick: success: replace-brick commit force operation successful
  167.  
  168. -------------------------------------------------------------------------------------
  169. Note: the data can also be recovered onto a different server; the commands are as follows (192.168.10.230 is another, newly added GlusterFS node) (optional):
  170. # gluster peer probe 192.168.10.230
  171. # gluster volume replace-brick models 192.168.10.220:/data/gluster 192.168.10.230:/data/gluster commit force
  172. -------------------------------------------------------------------------------------
  173.  
  174. 4.8) Check the brick's online status
  175. [root@GlusterFS-slave3 ~]# gluster volume status
  176. Status of volume: models
  177. Gluster process Port Online Pid
  178. ------------------------------------------------------------------------------
  179. Brick 192.168.10.239:/data/gluster 49156 Y 16040
  180. Brick 192.168.10.212:/data/gluster 49157 Y 5544
  181. Brick 192.168.10.204:/data/gluster 49157 Y 12432
  182. Brick 192.168.10.220:/data/gluster1 49159 Y 11363
  183. NFS Server on localhost N/A N N/A
  184. Self-heal Daemon on localhost N/A Y 11375
  185. NFS Server on 192.168.10.204 N/A N N/A
  186. Self-heal Daemon on 192.168.10.204 N/A Y 12494
  187. NFS Server on 192.168.10.212 N/A N N/A
  188. Self-heal Daemon on 192.168.10.212 N/A Y 5625
  189. NFS Server on GlusterFS-master N/A N N/A
  190. Self-heal Daemon on GlusterFS-master N/A Y 16161
  191.  
  192. Task Status of Volume models
  193. ------------------------------------------------------------------------------
  194. There are no active volume tasks
  195.  
  196. The output above shows that the Online status of the 192.168.10.220 (GlusterFS-slave3) brick is "Y" again, although its storage directory is now /data/gluster1
  197.  
  198. At this point, check the new storage directory on GlusterFS-slave3: the data has been restored
  199. [root@GlusterFS-slave3 ~]# ls /data/gluster1/
  200. copy-test-002 copy-test-014 copy-test-036 copy-test-047 copy-test-059 copy-test-069 copy-test-080 copy-test-097
  201. copy-test-003 copy-test-018 copy-test-037 copy-test-049 copy-test-061 copy-test-070 copy-test-084
  202. copy-test-005 copy-test-020 copy-test-040 copy-test-050 copy-test-062 copy-test-071 copy-test-085
  203. copy-test-007 copy-test-025 copy-test-042 copy-test-053 copy-test-064 copy-test-072 copy-test-089
  204. copy-test-009 copy-test-026 copy-test-043 copy-test-055 copy-test-066 copy-test-074 copy-test-091
  205. copy-test-010 copy-test-027 copy-test-044 copy-test-056 copy-test-067 copy-test-075 copy-test-092
  206. copy-test-013 copy-test-035 copy-test-045 copy-test-058 copy-test-068 copy-test-076 copy-test-096
  207. [root@GlusterFS-slave3 ~]# ll /data/gluster1/|wc -l
  208. 51
  209.  
  210. Tips:
  211. The failure simulated above is the case where the partition backing a gluster node's storage directory fails to mount, so the storage directory disappears; the steps above repair the data in that scenario.
  212. If the storage directory itself has been deleted, the data can also be recovered using the replicated-volume repair methods described in http://www.cnblogs.com/kevingrace/p/8778123.html.
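The recovery in steps 4.1 through 4.8 boils down to: stop the dead brick process if one is still listed, create a brick directory at a new path on the repaired filesystem, and re-point the volume at it with replace-brick, after which self-heal copies the data back from the replica partner. A hedged outline of that sequence, using the volume and brick names from this post (treat it as a checklist rather than a drop-in script):

```bash
#!/bin/bash
# Illustrative outline of the brick-replacement procedure (steps 4.1-4.8),
# run on the node that lost its brick. It assumes the replacement filesystem
# is already mounted at /data and that the Pid column is the last field of
# "gluster volume status" output, as shown above.
set -e

VOL=models
OLD_BRICK=192.168.10.220:/data/gluster
NEW_BRICK=192.168.10.220:/data/gluster1   # must differ from the old brick path

# 4.1 If the failed brick still shows a PID (not N/A), stop it gracefully.
pid=$(gluster volume status "$VOL" | awk -v b="$OLD_BRICK" 'index($0, b) {print $NF; exit}')
if [ -n "$pid" ] && [ "$pid" != "N/A" ]; then
    kill -15 "$pid" || true   # ignore if the process is already gone
fi

# 4.2 Create the brick directory at the new path on the re-created filesystem.
mkdir -p "${NEW_BRICK#*:}"

# 4.7 Re-point the volume at the new brick; self-heal then copies the data
#     back from the replica partner (GlusterFS-slave2).
gluster volume replace-brick "$VOL" "$OLD_BRICK" "$NEW_BRICK" commit force

# Trigger a full heal (an alternative to the client-side setfattr trick in 4.4),
# then watch progress and confirm the brick is back online.
gluster volume heal "$VOL" full
gluster volume heal "$VOL" info
gluster volume status "$VOL"
```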
