Managing and customizing the CRUSH map

CRUSH defines the mapping from PGs to OSDs.

The CRUSH algorithm maps the three replicas onto the desired hosts or racks.

Changing the failure domain improves reliability.

The PG-to-OSD mapping is computed by CRUSH.

On reads, the objects that make up a file have to be located on the OSDs and reassembled. Searching object by object gets slower as the object count grows, so lookups go through PGs instead, which is much more efficient.

An object is assigned to a PG by hashing: the object name is hashed and the result is taken modulo the number of PGs. For example, with 100 PGs, a hash whose remainder is 95 puts the object into PG 95.

PGs belong to a storage pool.

A pool has 32 PGs by default.
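
A pool's PG count can be checked and, if needed, raised at any time. A quick sketch (pool1 is a hypothetical pool name, and the autoscaler may adjust the value again if it is enabled):

  1. [ceph: root@clienta /]# ceph osd pool get pool1 pg_num
  2. pg_num: 32
  3. [ceph: root@clienta /]# ceph osd pool set pool1 pg_num 64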

Mapping PGs onto OSDs is what the CRUSH algorithm does.

The three access methods (RBD, CephFS, RGW) all sit on the same underlying object store.

Splitting a file into objects happens on the client side.

The client is handed the cluster maps and then writes to the OSDs directly.

What CRUSH does is take a PG ID and return a list of OSDs.
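
The cluster can be asked for this mapping directly from the CLI. A sketch (pool1, testobject, and the PG ID 8.4 are only placeholders here; the object does not even have to exist yet):

  1. # object name -> PG -> acting OSD list (primary first)
  2. [ceph: root@clienta /]# ceph osd map pool1 testobject
  3. # PG ID -> acting OSD list
  4. [ceph: root@clienta /]# ceph pg map 8.4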

Decompiling, compiling, and updating the CRUSH map

Export the CRUSH map as a binary file:

  1. [root@clienta ~]# cephadm shell
  2. [ceph: root@clienta /]# ceph osd getcrushmap -o crushmap.bin
  3. 20
  4. [ceph: root@clienta /]# ls
  5. bin crushmap.bin etc lib lost+found mnt proc run srv tmp var
  6. boot dev home lib64 media opt root sbin sys usr
  7. [ceph: root@clienta /]#

Convert the binary into a text file:

  1. [ceph: root@clienta /]# crushtool -d crushmap.bin -o crushmap.txt
  2. [ceph: root@clienta /]# cat crushmap.txt
  3. # begin crush map
  4. tunable choose_local_tries 0
  5. tunable choose_local_fallback_tries 0
  6. tunable choose_total_tries 50
  7. tunable chooseleaf_descend_once 1
  8. tunable chooseleaf_vary_r 1
  9. tunable chooseleaf_stable 1
  10. tunable straw_calc_version 1
  11. tunable allowed_bucket_algs 54
  12. # devices
  13. device 0 osd.0 class hdd
  14. device 1 osd.1 class hdd
  15. device 2 osd.2 class hdd
  16. device 3 osd.3 class hdd
  17. device 4 osd.4 class hdd
  18. device 5 osd.5 class hdd
  19. device 6 osd.6 class hdd
  20. device 7 osd.7 class hdd
  21. device 8 osd.8 class hdd
  22. # types
  23. type 0 osd
  24. type 1 host
  25. type 2 chassis
  26. type 3 rack
  27. type 4 row
  28. type 5 pdu
  29. type 6 pod
  30. type 7 room
  31. type 8 datacenter
  32. type 9 zone
  33. type 10 region
  34. type 11 root
  35. # buckets
  36. host serverc {
  37. id -3 # do not change unnecessarily
  38. id -4 class hdd # do not change unnecessarily
  39. # weight 0.029
  40. alg straw2
  41. hash 0 # rjenkins1
  42. item osd.0 weight 0.010
  43. item osd.1 weight 0.010
  44. item osd.2 weight 0.010
  45. }
  46. host serverd {
  47. id -5 # do not change unnecessarily
  48. id -6 class hdd # do not change unnecessarily
  49. # weight 0.029
  50. alg straw2
  51. hash 0 # rjenkins1
  52. item osd.3 weight 0.010
  53. item osd.5 weight 0.010
  54. item osd.7 weight 0.010
  55. }
  56. host servere {
  57. id -7 # do not change unnecessarily
  58. id -8 class hdd # do not change unnecessarily
  59. # weight 0.029
  60. alg straw2
  61. hash 0 # rjenkins1
  62. item osd.4 weight 0.010
  63. item osd.6 weight 0.010
  64. item osd.8 weight 0.010
  65. }
  66. root default {
  67. id -1 # do not change unnecessarily
  68. id -2 class hdd # do not change unnecessarily
  69. # weight 0.088
  70. alg straw2
  71. hash 0 # rjenkins1
  72. item serverc weight 0.029
  73. item serverd weight 0.029
  74. item servere weight 0.029
  75. }
  76. # rules
  77. rule replicated_rule {
  78. id 0
  79. type replicated
  80. min_size 1
  81. max_size 10
  82. step take default
  83. step chooseleaf firstn 0 type host
  84. step emit
  85. }
  86. # end crush map

The devices section records whether each disk was detected as an SSD or an HDD.

The detection is occasionally wrong, but it can be corrected manually.

The types section lists the bucket types, i.e. the available failure domains.

With three replicas:

failure domain osd: pick any three OSDs (all three may land on one host, which makes data loss easy);

failure domain host: pick three OSDs on three different hosts;

failure domain rack: the same at rack level;

room and datacenter: room- and data-center-level failure domains.

Ceph only builds the hierarchy up to the host level automatically;

any level above that has to be defined by hand, for example as sketched below.
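
For example, rack and datacenter buckets can be created and linked from the CLI instead of editing the map file. A sketch with made-up bucket names (dcA, rackA); the same thing is done by editing the map text later in these notes:

  1. [ceph: root@clienta /]# ceph osd crush add-bucket dcA datacenter
  2. [ceph: root@clienta /]# ceph osd crush add-bucket rackA rack
  3. [ceph: root@clienta /]# ceph osd crush move dcA root=default
  4. [ceph: root@clienta /]# ceph osd crush move rackA datacenter=dcA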



Buckets are only related to each other if they sit under the same root.

With three replicas, if you used the data-center level of this diagram as the failure domain, two data centers would not be enough.

The default hierarchy:

  1. [ceph: root@clienta /]# ceph osd tree
  2. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  3. -1 0.08817 root default #root
  4. -3 0.02939 host serverc
  5. 0 hdd 0.00980 osd.0 up 1.00000 1.00000
  6. 1 hdd 0.00980 osd.1 up 1.00000 1.00000
  7. 2 hdd 0.00980 osd.2 up 1.00000 1.00000
  8. -5 0.02939 host serverd
  9. 3 hdd 0.00980 osd.3 up 1.00000 1.00000
  10. 5 hdd 0.00980 osd.5 up 1.00000 1.00000
  11. 7 hdd 0.00980 osd.7 up 1.00000 1.00000
  12. -7 0.02939 host servere
  13. 4 hdd 0.00980 osd.4 up 1.00000 1.00000
  14. 6 hdd 0.00980 osd.6 up 1.00000 1.00000
  15. 8 hdd 0.00980 osd.8 up 1.00000 1.00000
  16. [ceph: root@clienta /]#
  17. # buckets
  18. host serverc {
  19. id -3 # do not change unnecessarily
  20. id -4 class hdd # do not change unnecessarily
  21. # weight 0.029
  22. alg straw2
  23. hash 0 # rjenkins1
  24. item osd.0 weight 0.010
  25. item osd.1 weight 0.010
  26. item osd.2 weight 0.010
  27. }

All we need to change is how data gets distributed; the hierarchy itself was detected automatically by the algorithm and does not need touching.

A weight of 1 corresponds to 1 TiB; my OSDs are 10 GB each, hence roughly 0.010. A host's weight is the sum of its OSDs' weights.

  1. root default {
  2. id -1 # do not change unnecessarily
  3. id -2 class hdd # do not change unnecessarily
  4. # weight 0.088
  5. alg straw2
  6. hash 0 # rjenkins1
  7. item serverc weight 0.029
  8. item serverd weight 0.029
  9. item servere weight 0.029
  10. }
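
If an OSD's CRUSH weight ever needs adjusting by hand (say, after swapping in a larger disk), it can be changed without editing the map file. A minimal sketch, with the value matching a 10 GB OSD:

  1. [ceph: root@clienta /]# ceph osd crush reweight osd.0 0.010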

Add a rack level:

The three nodes go into three different racks.

  1. rack rack1 {
  2. id -9 # do not change unnecessarily
  3. id -10 class hdd # do not change unnecessarily
  4. # weight 0.088
  5. alg straw2
  6. hash 0 # rjenkins1
  7. item serverc weight 0.029
  8. }
  9. rack rack2 {
  10. id -11 # do not change unnecessarily
  11. id -12 class hdd # do not change unnecessarily
  12. # weight 0.088
  13. alg straw2
  14. hash 0 # rjenkins1
  15. item serverd weight 0.029
  16. }
  17. rack rack3 {
  18. id -13 # do not change unnecessarily
  19. id -14 class hdd # do not change unnecessarily
  20. # weight 0.088
  21. alg straw2
  22. hash 0 # rjenkins1
  23. item servere weight 0.029
  24. }
  25. [ceph: root@clienta /]# cp crushmap.txt crushmap-new.txt
  26. Add the rack definitions to crushmap-new.txt, then compile it back to binary and load it:
  27. [ceph: root@clienta /]# crushtool -c crushmap-new.txt -o crushmap-new.bin
  28. [ceph: root@clienta /]# ceph osd setcrushmap -i crushmap-new.bin
  29. 21
  30. The map epoch went from 20 to 21 (+1); the changed number confirms the update took effect.
  31. [ceph: root@clienta /]# ceph osd tree
  32. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  33. -13 0.02899 rack rack3
  34. -3 0.02899 host serverc
  35. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  36. 1 hdd 0.00999 osd.1 up 1.00000 1.00000
  37. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  38. -11 0.02899 rack rack2
  39. -3 0.02899 host serverd
  40. 0 hdd 0.00999 osd.3 up 1.00000 1.00000
  41. 1 hdd 0.00999 osd.5 up 1.00000 1.00000
  42. 2 hdd 0.00999 osd.7 up 1.00000 1.00000
  43. -9 0.02899 rack rack1
  44. -3 0.02899 host servere
  45. 0 hdd 0.00999 osd.4 up 1.00000 1.00000
  46. 1 hdd 0.00999 osd.6 up 1.00000 1.00000
  47. 2 hdd 0.00999 osd.8 up 1.00000 1.00000
  48. -1 0.08698 root default
  49. -3 0.02899 host serverc
  50. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  51. 1 hdd 0.00999 osd.1 up 1.00000 1.00000
  52. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  53. -5 0.02899 host serverd
  54. 3 hdd 0.00999 osd.3 up 1.00000 1.00000
  55. 5 hdd 0.00999 osd.5 up 1.00000 1.00000
  56. 7 hdd 0.00999 osd.7 up 1.00000 1.00000
  57. -7 0.02899 host servere
  58. 4 hdd 0.00999 osd.4 up 1.00000 1.00000
  59. 6 hdd 0.00999 osd.6 up 1.00000 1.00000
  60. 8 hdd 0.00999 osd.8 up 1.00000 1.00000
  61. [ceph: root@clienta /]#
  62. Without a root node the three racks are not tied together; there must always be a root.
  63. When creating a pool you can choose which root it uses; with several roots you have to specify one (through the CRUSH rule).

If a dc bucket is a root, the hierarchy stops there; if it is not, it can keep extending upward. Separate roots have no relationship to one another.
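
Once such a root actually exists in the cluster's map, a replicated rule that starts from it can also be generated straight from the CLI instead of hand-editing the text. A sketch (dc3_rack_rule is a made-up name; dc3 is one of the roots defined just below):

  1. [ceph: root@clienta /]# ceph osd crush rule create-replicated dc3_rack_rule dc3 rack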

  1. root dc1 {
  2. id -15 # do not change unnecessarily
  3. id -16 class hdd # do not change unnecessarily
  4. # weight 0.088
  5. alg straw2
  6. hash 0 # rjenkins1
  7. item rack1 weight 0.029
  8. }
  9. root dc2 {
  10. id -17 # do not change unnecessarily
  11. id -18 class hdd # do not change unnecessarily
  12. # weight 0.088
  13. alg straw2
  14. hash 0 # rjenkins1
  15. item rack2 weight 0.029
  16. item rack3 weight 0.029
  17. }
  18. root dc3 {
  19. id -19 # do not change unnecessarily
  20. id -20 class hdd # do not change unnecessarily
  21. # weight 0.088
  22. alg straw2
  23. hash 0 # rjenkins1
  24. item rack1 weight 0.029
  25. item rack2 weight 0.029
  26. item rack3 weight 0.029
  27. }

This adds three dc root nodes.

There is one host per rack, so each rack's weight equals its host's weight.

  1. [ceph: root@clienta /]# ceph osd tree
  2. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  3. -19 0.08698 root dc3
  4. -9 0.02899 rack rack1
  5. -3 0.02899 host serverc
  6. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  7. 1 hdd 0.00999 osd.1 up 1.00000 1.00000
  8. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  9. -11 0.02899 rack rack2
  10. -5 0.02899 host serverd
  11. 3 hdd 0.00999 osd.3 up 1.00000 1.00000
  12. 5 hdd 0.00999 osd.5 up 1.00000 1.00000
  13. 7 hdd 0.00999 osd.7 up 1.00000 1.00000
  14. -13 0.02899 rack rack3
  15. -7 0.02899 host servere
  16. 4 hdd 0.00999 osd.4 up 1.00000 1.00000
  17. 6 hdd 0.00999 osd.6 up 1.00000 1.00000
  18. 8 hdd 0.00999 osd.8 up 1.00000 1.00000
  19. -17 0.05798 root dc2
  20. -11 0.02899 rack rack2
  21. -5 0.02899 host serverd
  22. 3 hdd 0.00999 osd.3 up 1.00000 1.00000
  23. 5 hdd 0.00999 osd.5 up 1.00000 1.00000
  24. 7 hdd 0.00999 osd.7 up 1.00000 1.00000
  25. -13 0.02899 rack rack3
  26. -7 0.02899 host servere
  27. 4 hdd 0.00999 osd.4 up 1.00000 1.00000
  28. 6 hdd 0.00999 osd.6 up 1.00000 1.00000
  29. 8 hdd 0.00999 osd.8 up 1.00000 1.00000
  30. -15 0.02899 root dc1
  31. -9 0.02899 rack rack1
  32. -3 0.02899 host serverc
  33. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  34. 1 hdd 0.00999 osd.1 up 1.00000 1.00000
  35. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  36. -1 0.08698 root default
  37. -3 0.02899 host serverc
  38. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  39. 1 hdd 0.00999 osd.1 up 1.00000 1.00000
  40. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  41. -5 0.02899 host serverd
  42. 3 hdd 0.00999 osd.3 up 1.00000 1.00000
  43. 5 hdd 0.00999 osd.5 up 1.00000 1.00000
  44. 7 hdd 0.00999 osd.7 up 1.00000 1.00000
  45. -7 0.02899 host servere
  46. 4 hdd 0.00999 osd.4 up 1.00000 1.00000
  47. 6 hdd 0.00999 osd.6 up 1.00000 1.00000
  48. 8 hdd 0.00999 osd.8 up 1.00000 1.00000



This matches the architecture diagram.

The new roots cannot be used yet, because no rule references them.

  1. pool 6 'pool1' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 208 flags hashpspool stripe_width 0

Creating a pool1 still leaves it on crush_rule 0; nothing has changed.
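
Which rule a pool is using can be checked directly (a quick sketch):

  1. [ceph: root@clienta /]# ceph osd pool get pool1 crush_rule
  2. crush_rule: replicated_rule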

  1. [ceph: root@clienta /]# ceph osd crush rule ls
  2. replicated_rule
  3. # rules
  4. rule replicated_rule {
  5. id 0
  6. type replicated # rule type
  7. min_size 1 # the rule applies to pools with 1 to 10 replicas; above 10 it cannot be used
  8. max_size 10
  9. step take default # this rule starts from the default root, not from my dc roots
  10. step chooseleaf firstn 0 type host # the failure domain; host is the default, but it could also be rack or osd
  11. step emit
  12. }

Write a rule of our own:

  1. # rules
  2. rule replicated_rule1 {
  3. id 1
  4. type replicated
  5. min_size 1
  6. max_size 10
  7. step take dc3 # the root to start from
  8. step chooseleaf firstn 0 type rack # the failure domain
  9. step emit
  10. }
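
Before loading the new map, the rule can be dry-run with crushtool to see which OSDs it would pick. A sketch, assuming the edited text compiles to crushmap-new.bin (rule 1 is replicated_rule1 above):

  1. [ceph: root@clienta /]# crushtool -c crushmap-new.txt -o crushmap-new.bin
  2. [ceph: root@clienta /]# crushtool -i crushmap-new.bin --test --rule 1 --num-rep 3 --show-mappings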

Apply it:

  1. [ceph: root@clienta /]# vi crushmap-new.txt
  2. [ceph: root@clienta /]#
  3. [ceph: root@clienta /]# crushtool -c crushmap-new.txt -o crushmap-new.bin
  4. [ceph: root@clienta /]# ceph osd setcrushmap -i crushmap-new.bin
  5. 24
  6. [ceph: root@clienta /]# ceph osd crush rule ls
  7. replicated_rule
  8. replicated_rule1
  9. [ceph: root@clienta /]# ceph pg dump pgs_brief | grep ^8
  10. dumped pgs_brief
  11. 8.4 active+clean [6,5,1] 6 [6,5,1] 6
  12. 8.7 active+clean [1,5,4] 1 [1,5,4] 1
  13. 8.6 active+clean [6,2,7] 6 [6,2,7] 6
  14. 8.1 active+clean [3,1,8] 3 [3,1,8] 3
  15. 8.0 active+clean [1,7,4] 1 [1,7,4] 1
  16. 8.3 active+clean [8,3,2] 8 [8,3,2] 8
  17. 8.2 active+clean [5,8,1] 5 [5,8,1] 5
  18. 8.d active+clean [2,7,6] 2 [2,7,6] 2
  19. 8.c active+clean [0,5,4] 0 [0,5,4] 0
  20. 8.f active+clean [4,5,1] 4 [4,5,1] 4
  21. 8.a active+clean [5,4,1] 5 [5,4,1] 5
  22. 8.9 active+clean [6,3,0] 6 [6,3,0] 6
  23. 8.b active+clean [0,8,5] 0 [0,8,5] 0
  24. 8.8 active+clean [1,3,6] 1 [1,3,6] 1
  25. 8.e active+clean [5,2,4] 5 [5,2,4] 5
  26. 8.5 active+clean [6,2,7] 6 [6,2,7] 6
  27. 8.1a active+clean [0,7,4] 0 [0,7,4] 0
  28. 8.1b active+clean [5,4,0] 5 [5,4,0] 5
  29. 8.18 active+clean [4,2,7] 4 [4,2,7] 4
  30. 8.19 active+clean [8,5,1] 8 [8,5,1] 8
  31. 8.1e active+clean [1,7,6] 1 [1,7,6] 1
  32. 8.1f active+clean [7,6,1] 7 [7,6,1] 7
  33. 8.1c active+clean [2,8,7] 2 [2,8,7] 2
  34. 8.1d active+clean [6,7,2] 6 [6,7,2] 6
  35. 8.12 active+clean [8,7,0] 8 [8,7,0] 8
  36. 8.13 active+clean [3,4,1] 3 [3,4,1] 3
  37. 8.10 active+clean [0,4,3] 0 [0,4,3] 0
  38. 8.11 active+clean [2,8,3] 2 [2,8,3] 2
  39. 8.16 active+clean [5,4,0] 5 [5,4,0] 5
  40. 8.17 active+clean [8,2,5] 8 [8,2,5] 8
  41. 8.14 active+clean [4,2,7] 4 [4,2,7] 4
  42. 8.15 active+clean [3,8,1] 3 [3,8,1] 3
  43. [ceph: root@clienta /]#

The PGs are now spread across the three racks.

  1. # rules
  2. rule replicated_rule1 {
  3. id 1
  4. type replicated
  5. min_size 1
  6. max_size 10
  7. step take dc3
  8. step chooseleaf firstn 0 type osd
  9. step emit
  10. }

After changing the failure domain to osd, you can see sets such as [0,5,1], where osd.0 and osd.1 sit on the same host (and therefore the same rack).

  1. [ceph: root@clienta /]# vi crushmap-new.txt
  2. [ceph: root@clienta /]# crushtool -c crushmap-new.txt -o crushmap-new.bin
  3. [ceph: root@clienta /]# ceph osd setcrushmap -i crushmap-new.bin
  4. 26
  5. [ceph: root@clienta /]# ceph pg dump pgs_brief | grep ^8
  6. dumped pgs_brief
  7. 8.4 active+clean [6,5,3] 6 [6,5,3] 6
  8. 8.7 active+clean [1,5,3] 1 [1,5,3] 1
  9. 8.6 active+clean [6,2,8] 6 [6,2,8] 6
  10. 8.1 active+clean [3,7,1] 3 [3,7,1] 3
  11. 8.0 active+clean [1,0,7] 1 [1,0,7] 1
  12. 8.3 active+clean [8,4,3] 8 [8,4,3] 8
  13. 8.2 active+clean [5,8,7] 5 [5,8,7] 5
  14. 8.d active+clean [2,7,6] 2 [2,7,6] 2
  15. 8.c active+clean [0,5,1] 0 [0,5,1] 0
  16. 8.f active+clean [4,5,1] 4 [4,5,1] 4
  17. 8.a active+clean [5,4,6] 5 [5,4,6] 5
  18. 8.9 active+clean [6,3,5] 6 [6,3,5] 6
  19. 8.b active+clean [0,8,2] 0 [0,8,2] 0
  20. 8.8 active+clean [1,3,6] 1 [1,3,6] 1
  21. 8.e active+clean [5,2,1] 5 [5,2,1] 5
  22. 8.5 active+clean [6,2,7] 6 [6,2,7] 6
  23. 8.1a active+clean [0,7,4] 0 [0,7,4] 0
  24. 8.1b active+clean [5,4,0] 5 [5,4,0] 5
  25. 8.18 active+clean [4,2,7] 4 [4,2,7] 4
  26. 8.19 active+clean [8,4,5] 8 [8,4,5] 8
  27. 8.1e active+clean [1,7,6] 1 [1,7,6] 1
  28. 8.1f active+clean [7,5,6] 7 [7,5,6] 7
  29. 8.1c active+clean [2,8,7] 2 [2,8,7] 2
  30. 8.1d active+clean [6,7,2] 6 [6,7,2] 6
  31. 8.12 active+clean [8,7,0] 8 [8,7,0] 8
  32. 8.13 active+clean [3,4,1] 3 [3,4,1] 3
  33. 8.10 active+clean [0,4,1] 0 [0,4,1] 0
  34. 8.11 active+clean [2,8,6] 2 [2,8,6] 2
  35. 8.16 active+clean [5,4,8] 5 [5,4,8] 5
  36. 8.17 active+clean [8,6,2] 8 [8,6,2] 8
  37. 8.14 active+clean [4,2,7] 4 [4,2,7] 4
  38. 8.15 active+clean [3,8,1] 3 [3,8,1] 3
  39. [ceph: root@clienta /]#

An example of a conflicting rule:

  1. # rules
  2. rule replicated_rule2 {
  3. id 1
  4. type replicated
  5. min_size 1
  6. max_size 10
  7. step take dc2 # the root to start from (a device class such as ssd could also be given here)
  8. step chooseleaf firstn 0 type rack # firstn 0: with three replicas it tries to pick three racks
  9. step emit
  10. }

The pool uses the default of three replicas, but dc2 contains only two racks (two hosts, one host per rack). firstn 0 still insists on placing three replicas across racks, so a conflict arises.

Add replicated_rule2 to the map file and apply it:

  1. [ceph: root@clienta /]# crushtool -c crushmap-new.txt -o crushmap-new.bin
  2. [ceph: root@clienta /]# ceph osd setcrushmap -i crushmap-new.bin
  3. 27
  4. [ceph: root@clienta /]# ceph osd pool create pool4 replicated_rule2
  5. pool 'pool4' created
  6. [ceph: root@clienta /]# ceph pg dump pgs_brief | grep ^9
  7. dumped pgs_brief
  8. 9.5 active+undersized [5,6] 5 [5,6] 5
  9. 9.6 active+undersized [4,5] 4 [4,5] 4
  10. 9.7 active+undersized [6,3] 6 [6,3] 6
  11. 9.0 active+undersized [5,4] 5 [5,4] 5
  12. 9.1 active+undersized [3,4] 3 [3,4] 3
  13. 9.2 active+undersized [8,5] 8 [8,5] 8
  14. 9.3 active+undersized [7,4] 7 [7,4] 7
  15. 9.c active+undersized [3,4] 3 [3,4] 3
  16. 9.d active+undersized [3,4] 3 [3,4] 3
  17. 9.e active+undersized [7,4] 7 [7,4] 7
  18. 9.b active+undersized [5,4] 5 [5,4] 5
  19. 9.8 active+undersized [8,3] 8 [8,3] 8
  20. 9.a active+undersized [3,4] 3 [3,4] 3
  21. 9.9 active+undersized [5,8] 5 [5,8] 5
  22. 9.f active+undersized [4,5] 4 [4,5] 4
  23. 9.4 active+undersized [8,3] 8 [8,3] 8
  24. 9.1b active+undersized [5,4] 5 [5,4] 5
  25. 9.1a active+undersized [8,7] 8 [8,7] 8
  26. 9.19 active+undersized [6,3] 6 [6,3] 6
  27. 9.18 active+undersized [5,4] 5 [5,4] 5
  28. 9.1f active+undersized [6,7] 6 [6,7] 6
  29. 9.1e active+undersized [7,8] 7 [7,8] 7
  30. 9.1d active+undersized [6,3] 6 [6,3] 6
  31. 9.1c active+undersized [5,4] 5 [5,4] 5
  32. 9.13 active+undersized [8,5] 8 [8,5] 8
  33. 9.12 active+undersized [5,8] 5 [5,8] 5
  34. 9.11 active+undersized [8,3] 8 [8,3] 8
  35. 9.10 active+undersized [5,4] 5 [5,4] 5
  36. 9.17 active+undersized [8,7] 8 [8,7] 8
  37. 9.16 active+undersized [5,4] 5 [5,4] 5
  38. 9.15 active+undersized [7,4] 7 [7,4] 7
  39. 9.14 active+undersized [4,3] 4 [4,3] 4
  40. [ceph: root@clienta /]#

undersized means a PG has fewer replicas than the pool's size; the cluster is not healthy.

  1. [ceph: root@clienta /]# ceph -s
  2. cluster:
  3. id: 2ae6d05a-229a-11ec-925e-52540000fa0c
  4. health: HEALTH_WARN
  5. Degraded data redundancy: 32 pgs undersized
  6. services:
  7. mon: 4 daemons, quorum serverc.lab.example.com,clienta,serverd,servere (age 3h)
  8. mgr: serverc.lab.example.com.aiqepd(active, since 3h), standbys: clienta.nncugs, servere.kjwyko, serverd.klrkci
  9. osd: 9 osds: 9 up (since 3h), 9 in (since 9M)
  10. rgw: 2 daemons active (2 hosts, 1 zones)
  11. data:
  12. pools: 9 pools, 233 pgs
  13. objects: 221 objects, 4.9 KiB
  14. usage: 245 MiB used, 90 GiB / 90 GiB avail
  15. pgs: 201 active+clean
  16. 32 active+undersized
  17. io:
  18. client: 71 KiB/s rd, 0 B/s wr, 71 op/s rd, 47 op/s wr

ceph -s shows the warning.

The PGs are degraded: they have not reached the required replica count.

Fix it by editing the map file:

  1. # rules
  2. rule replicated_rule2 {
  3. id 2
  4. type replicated
  5. min_size 1
  6. max_size 10
  7. step take dc2
  8. step chooseleaf firstn 2 type rack # place 2 replicas here
  9. step emit
  10. step take dc1
  11. step chooseleaf firstn 1 type rack # place the remaining replica here
  12. step emit
  13. }
  14. [ceph: root@clienta /]# vi crushmap-new.txt
  15. [ceph: root@clienta /]# crushtool -c crushmap-new.txt -o crushmap-new.bin
  16. [ceph: root@clienta /]# ceph osd setcrushmap -i crushmap-new.bin
  17. 28
  18. [ceph: root@clienta /]# ceph pg dump pgs_brief | grep ^9
  19. dumped pgs_brief
  20. 9.5 active+clean [5,6,0] 5 [5,6,0] 5
  21. 9.6 activating [4,5,2] 4 [4,5,2] 4
  22. 9.7 activating [6,3,2] 6 [6,3,2] 6
  23. 9.0 activating [5,4,1] 5 [5,4,1] 5
  24. 9.1 active+clean [3,4,0] 3 [3,4,0] 3
  25. 9.2 activating [8,5,1] 8 [8,5,1] 8
  26. 9.3 active+clean [7,4,0] 7 [7,4,0] 7
  27. 9.c activating [3,4,2] 3 [3,4,2] 3
  28. 9.d active+clean [3,4,0] 3 [3,4,0] 3
  29. 9.e activating [7,4,2] 7 [7,4,2] 7
  30. 9.b active+clean [5,4,0] 5 [5,4,0] 5
  31. 9.8 activating [8,3,2] 8 [8,3,2] 8

firstn 0: under the chosen root, pick as many leaves as the pool has replicas (here 3).

firstn N with N > 0: pick N leaves under the chosen root (here 2 of the 3 replicas); the remaining replicas are handled by the steps that follow.

firstn N with N < 0: pick (replica count minus |N|) leaves under the chosen root. (A negative number looks odd, but it lets a later step grab "all remaining replicas except |N|" without hard-coding the pool size; the ssd_first rule further down uses it, and the dry run below shows the combined effect.)
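
One way to check how these firstn steps combine is another crushtool dry run against the fixed rule (id 2) before loading it. A sketch, assuming the edited text has been compiled to crushmap-new.bin:

  1. [ceph: root@clienta /]# crushtool -i crushmap-new.bin --test --rule 2 --num-rep 3 --show-mappings
  2. # each mapping should list 2 OSDs from dc2's racks plus 1 OSD from dc1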

Revert the changes made above.

Creating an SSD-backed pool

Metadata lookups want fast disks (SSD).

First remove the existing device class, then set the class by hand:

  1. [ceph: root@clienta /]# ceph osd crush rm-device-class osd.1
  2. done removing class of osd(s): 1
  3. [ceph: root@clienta /]# ceph osd crush rm-device-class osd.5
  4. done removing class of osd(s): 5
  5. [ceph: root@clienta /]# ceph osd crush rm-device-class osd.6
  6. done removing class of osd(s): 6
  7. [ceph: root@clienta /]#
  8. [ceph: root@clienta /]# ceph osd tree
  9. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  10. -1 0.08698 root default
  11. -3 0.02899 host serverc
  12. 1 0.00999 osd.1 up 1.00000 1.00000
  13. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  14. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  15. -5 0.02899 host serverd
  16. 5 0.00999 osd.5 up 1.00000 1.00000
  17. 3 hdd 0.00999 osd.3 up 1.00000 1.00000
  18. 7 hdd 0.00999 osd.7 up 1.00000 1.00000
  19. -7 0.02899 host servere
  20. 6 0.00999 osd.6 up 1.00000 1.00000
  21. 4 hdd 0.00999 osd.4 up 1.00000 1.00000
  22. 8 hdd 0.00999 osd.8 up 1.00000 1.00000
  23. [ceph: root@clienta /]#

Change them to ssd:

  1. [ceph: root@clienta /]# for i in 1 5 6;do ceph osd crush set-device-class ssd osd.$i; done
  2. [ceph: root@clienta /]# ceph osd tree
  3. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  4. -1 0.08698 root default
  5. -3 0.02899 host serverc
  6. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  7. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  8. 1 ssd 0.00999 osd.1 up 1.00000 1.00000
  9. -5 0.02899 host serverd
  10. 3 hdd 0.00999 osd.3 up 1.00000 1.00000
  11. 7 hdd 0.00999 osd.7 up 1.00000 1.00000
  12. 5 ssd 0.00999 osd.5 up 1.00000 1.00000
  13. -7 0.02899 host servere
  14. 4 hdd 0.00999 osd.4 up 1.00000 1.00000
  15. 8 hdd 0.00999 osd.8 up 1.00000 1.00000
  16. 6 ssd 0.00999 osd.6 up 1.00000 1.00000
  17. [ceph: root@clienta /]#
  18. [ceph: root@clienta /]# ceph osd crush class ls
  19. [
  20. "hdd",
  21. "ssd"
  22. ]
  23. [ceph: root@clienta /]#

Create the rule from the command line:

  1. [ceph: root@clienta /]# ceph osd crush rule create-replicated ssd_rule default host ssd
  2. [ceph: root@clienta /]# ceph osd crush rule ls
  3. replicated_rule
  4. ssd_rule
  5. [ceph: root@clienta /]#

Create the pool:

  1. [ceph: root@clienta /]# ceph osd crush rule create-replicated ssd_rule default host ssd
  2. [ceph: root@clienta /]# ceph osd crush rule ls
  3. replicated_rule
  4. ssd_rule
  5. [ceph: root@clienta /]# ceph osd pool create pool1 ssd_rule
  6. pool 'pool1' created
  7. [ceph: root@clienta /]# ceph osd pool ls detail
  8. pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 249 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
  9. pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 48 flags hashpspool stripe_width 0 application rgw
  10. pool 3 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 50 flags hashpspool stripe_width 0 application rgw
  11. pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 52 flags hashpspool stripe_width 0 application rgw
  12. pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 184 lfor 0/184/182 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
  13. pool 10 'pool1' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 266 flags hashpspool stripe_width 0
  14. [ceph: root@clienta /]# ceph pg dump pgs_brief | grep ^10
  15. dumped pgs_brief
  16. 10.6 active+clean [6,1,5] 6 [6,1,5] 6
  17. 10.5 active+clean [1,5,6] 1 [1,5,6] 1
  18. 10.4 active+clean [5,1,6] 5 [5,1,6] 5
  19. 10.3 active+clean [5,6,1] 5 [5,6,1] 5
  20. 10.2 active+clean [6,5,1] 6 [6,5,1] 6
  21. 10.1 active+clean [1,5,6] 1 [1,5,6] 1
  22. 10.0 active+clean [6,5,1] 6 [6,5,1] 6
  23. 10.f active+clean [6,1,5] 6 [6,1,5] 6
  24. 10.e active+clean [1,5,6] 1 [1,5,6] 1
  25. 10.d active+clean [6,5,1] 6 [6,5,1] 6
  26. 10.8 active+clean [1,5,6] 1 [1,5,6] 1
  27. 10.b active+clean [6,5,1] 6 [6,5,1] 6
  28. 10.9 active+clean [1,5,6] 1 [1,5,6] 1
  29. 10.a active+clean [5,1,6] 5 [5,1,6] 5
  30. 10.c active+clean [1,5,6] 1 [1,5,6] 1
  31. 10.7 active+clean [5,6,1] 5 [5,6,1] 5
  32. 10.18 active+clean [1,5,6] 1 [1,5,6] 1
  33. 10.19 active+clean [1,6,5] 1 [1,6,5] 1
  34. 10.1a active+clean [6,1,5] 6 [6,1,5] 6
  35. 10.1b active+clean [6,5,1] 6 [6,5,1] 6
  36. 10.1c active+clean [5,1,6] 5 [5,1,6] 5
  37. 10.1d active+clean [6,1,5] 6 [6,1,5] 6
  38. 10.1e active+clean [1,5,6] 1 [1,5,6] 1
  39. 10.1f active+clean [6,1,5] 6 [6,1,5] 6
  40. 10.10 active+clean [1,6,5] 1 [1,6,5] 1
  41. 10.11 active+clean [5,6,1] 5 [5,6,1] 5
  42. 10.12 active+clean [6,5,1] 6 [6,5,1] 6
  43. 10.13 active+clean [5,1,6] 5 [5,1,6] 5
  44. 10.14 active+clean [1,6,5] 1 [1,6,5] 1
  45. 10.15 active+clean [6,1,5] 6 [6,1,5] 6
  46. 10.16 active+clean [6,1,5] 6 [6,1,5] 6
  47. 10.17 active+clean [1,6,5] 1 [1,6,5] 1
  48. [ceph: root@clienta /]#

Everything lands on the specified (SSD) disks.

So you can build a higher-performance pool backed entirely by SSD drives.
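
An existing pool can be moved onto the SSD rule the same way, and its data then migrates automatically. A sketch, assuming a cephfs_metadata pool exists (this is exactly what the command summary at the end of these notes does):

  1. [ceph: root@clienta /]# ceph osd pool set cephfs_metadata crush_rule ssd_rule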

What the command line actually changed in the map file:

  1. [ceph: root@clienta /]# ceph osd getcrushmap -o crushmap.bin
  2. 44
  3. [ceph: root@clienta /]# crushtool -d crushmap.bin -o crushmap.txt
  4. [ceph: root@clienta /]# vi crushmap.txt
  5. rule ssd_rule {
  6. id 1
  7. type replicated
  8. min_size 1
  9. max_size 10
  10. step take default class ssd
  11. step chooseleaf firstn 0 type host
  12. step emit
  13. }
  14. # devices
  15. device 0 osd.0 class hdd
  16. device 1 osd.1 class ssd
  17. device 2 osd.2 class hdd
  18. device 3 osd.3 class hdd
  19. device 4 osd.4 class hdd
  20. device 5 osd.5 class ssd
  21. device 6 osd.6 class ssd
  22. device 7 osd.7 class hdd
  23. device 8 osd.8 class hdd

Run the lab command (it will error out when it changes the devices' class labels; change the labels manually instead):

lab start map-crush

  1. [ceph: root@clienta /]# ceph osd crush add-bucket cl260 root
  2. added bucket cl260 type root to crush map
  3. [ceph: root@clienta /]# ceph osd crush add-bucket rack1 rack
  4. added bucket rack1 type rack to crush map
  5. [ceph: root@clienta /]# ceph osd crush add-bucket rack2 rack
  6. added bucket rack2 type rack to crush map
  7. [ceph: root@clienta /]# ceph osd crush add-bucket rack3 rack
  8. added bucket rack3 type rack to crush map
  9. [ceph: root@clienta /]# ceph osd tree
  10. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  11. -16 0 rack rack3
  12. -15 0 rack rack2
  13. -14 0 rack rack1
  14. -13 0 root cl260
  15. -1 0.08698 root default
  16. -3 0.02899 host serverc
  17. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  18. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  19. 1 ssd 0.00999 osd.1 up 1.00000 1.00000
  20. -5 0.02899 host serverd
  21. 3 hdd 0.00999 osd.3 up 1.00000 1.00000
  22. 7 hdd 0.00999 osd.7 up 1.00000 1.00000
  23. 5 ssd 0.00999 osd.5 up 1.00000 1.00000
  24. -7 0.02899 host servere
  25. 4 hdd 0.00999 osd.4 up 1.00000 1.00000
  26. 8 hdd 0.00999 osd.8 up 1.00000 1.00000
  27. 6 ssd 0.00999 osd.6 up 1.00000 1.00000
  28. [ceph: root@clienta /]#
  29. [ceph: root@clienta /]# ceph osd crush add-bucket hostc host
  30. added bucket hostc type host to crush map
  31. [ceph: root@clienta /]# ceph osd crush add-bucket hostd host
  32. added bucket hostd type host to crush map
  33. [ceph: root@clienta /]# ceph osd crush add-bucket hoste host
  34. added bucket hoste type host to crush map
  35. [ceph: root@clienta /]# ceph osd crush move rack1 root=cl260
  36. moved item id -14 name 'rack1' to location {root=cl260} in crush map
  37. [ceph: root@clienta /]# ceph osd crush move rack2 root=cl260
  38. moved item id -15 name 'rack2' to location {root=cl260} in crush map
  39. [ceph: root@clienta /]# ceph osd crush move rack3 root=cl260
  40. moved item id -16 name 'rack3' to location {root=cl260} in crush map
  41. [ceph: root@clienta /]# ceph osd crush move hostc rack=rack1
  42. moved item id -17 name 'hostc' to location {rack=rack1} in crush map
  43. [ceph: root@clienta /]# ceph osd crush move hostd rack=rack2
  44. moved item id -18 name 'hostd' to location {rack=rack2} in crush map
  45. [ceph: root@clienta /]# ceph osd crush move hoste rack=rack3
  46. moved item id -19 name 'hoste' to location {rack=rack3} in crush map
  47. [ceph: root@clienta /]# ceph osd tree
  48. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  49. -13 0 root cl260
  50. -14 0 rack rack1
  51. -17 0 host hostc
  52. -15 0 rack rack2
  53. -18 0 host hostd
  54. -16 0 rack rack3
  55. -19 0 host hoste
  56. -1 0.08698 root default
  57. -3 0.02899 host serverc
  58. 0 hdd 0.00999 osd.0 up 1.00000 1.00000
  59. 2 hdd 0.00999 osd.2 up 1.00000 1.00000
  60. 1 ssd 0.00999 osd.1 up 1.00000 1.00000
  61. -5 0.02899 host serverd
  62. 3 hdd 0.00999 osd.3 up 1.00000 1.00000
  63. 7 hdd 0.00999 osd.7 up 1.00000 1.00000
  64. 5 ssd 0.00999 osd.5 up 1.00000 1.00000
  65. -7 0.02899 host servere
  66. 4 hdd 0.00999 osd.4 up 1.00000 1.00000
  67. 8 hdd 0.00999 osd.8 up 1.00000 1.00000
  68. 6 ssd 0.00999 osd.6 up 1.00000 1.00000
  69. [ceph: root@clienta /]#
  70. [ceph: root@clienta /]# ceph osd crush set osd.1 1.0 root=cl260 rack=rack1 host=hostc
  71. set item id 1 name 'osd.1' weight 1 at location {host=hostc,rack=rack1,root=cl260} to crush map
  72. [ceph: root@clienta /]# ceph osd crush set osd.5 1.0 root=cl260 rack=rack1 host=hostc
  73. set item id 5 name 'osd.5' weight 1 at location {host=hostc,rack=rack1,root=cl260} to crush map
  74. [ceph: root@clienta /]# ceph osd crush set osd.6 1.0 root=cl260 rack=rack1 host=hostc
  75. set item id 6 name 'osd.6' weight 1 at location {host=hostc,rack=rack1,root=cl260} to crush map
  76. [ceph: root@clienta /]#
  77. Weight 1.0: all the OSDs are the same size, so 1.0 works for every one of them.
  78. [ceph: root@clienta /]# ceph osd crush set osd.1 1.0 root=cl260 rack=rack1 host=hostc
  79. set item id 1 name 'osd.1' weight 1 at location {host=hostc,rack=rack1,root=cl260} to crush map
  80. [ceph: root@clienta /]# ceph osd crush set osd.5 1.0 root=cl260 rack=rack1 host=hostc
  81. set item id 5 name 'osd.5' weight 1 at location {host=hostc,rack=rack1,root=cl260} to crush map
  82. [ceph: root@clienta /]# ceph osd crush set osd.6 1.0 root=cl260 rack=rack1 host=hostc
  83. set item id 6 name 'osd.6' weight 1 at location {host=hostc,rack=rack1,root=cl260} to crush map
  84. [ceph: root@clienta /]# ceph osd crush set osd.0 1.0 root=cl260 rack=rack2 host=hostd
  85. set item id 0 name 'osd.0' weight 1 at location {host=hostd,rack=rack2,root=cl260} to crush map
  86. [ceph: root@clienta /]# ceph osd crush set osd.3 1.0 root=cl260 rack=rack2 host=hostd
  87. set item id 3 name 'osd.3' weight 1 at location {host=hostd,rack=rack2,root=cl260} to crush map
  88. [ceph: root@clienta /]# ceph osd crush set osd.4 1.0 root=cl260 rack=rack2 host=hostd
  89. set item id 4 name 'osd.4' weight 1 at location {host=hostd,rack=rack2,root=cl260} to crush map
  90. [ceph: root@clienta /]# ceph osd crush set osd.2 1.0 root=cl260 rack=rack3 host=hoste
  91. set item id 2 name 'osd.2' weight 1 at location {host=hoste,rack=rack3,root=cl260} to crush map
  92. [ceph: root@clienta /]# ceph osd crush set osd.7 1.0 root=cl260 rack=rack3 host=hoste
  93. set item id 7 name 'osd.7' weight 1 at location {host=hoste,rack=rack3,root=cl260} to crush map
  94. [ceph: root@clienta /]# ceph osd crush set osd.8 1.0 root=cl260 rack=rack3 host=hoste
  95. set item id 8 name 'osd.8' weight 1 at location {host=hoste,rack=rack3,root=cl260} to crush map
  96. [ceph: root@clienta /]# ceph osd tree
  97. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  98. -13 9.00000 root cl260
  99. -14 3.00000 rack rack1
  100. -17 3.00000 host hostc
  101. 1 ssd 1.00000 osd.1 up 1.00000 1.00000
  102. 5 ssd 1.00000 osd.5 up 1.00000 1.00000
  103. 6 ssd 1.00000 osd.6 up 1.00000 1.00000
  104. -15 3.00000 rack rack2
  105. -18 3.00000 host hostd
  106. 0 hdd 1.00000 osd.0 up 1.00000 1.00000
  107. 3 hdd 1.00000 osd.3 up 1.00000 1.00000
  108. 4 hdd 1.00000 osd.4 up 1.00000 1.00000
  109. -16 3.00000 rack rack3
  110. -19 3.00000 host hoste
  111. 2 hdd 1.00000 osd.2 up 1.00000 1.00000
  112. 7 hdd 1.00000 osd.7 up 1.00000 1.00000
  113. 8 hdd 1.00000 osd.8 up 1.00000 1.00000
  114. -1 0 root default
  115. -3 0 host serverc
  116. -5 0 host serverd
  117. -7 0 host servere
  118. [ceph: root@clienta /]#

Edit the map file:

  1. [ceph: root@clienta /]# ceph osd getcrushmap -o cm-org.bin
  2. 66
  3. [ceph: root@clienta /]# crushtool -d cm-org.bin -o cm-org.txt
  4. [ceph: root@clienta /]# cp cm-org.txt cm-new.txt
  5. [ceph: root@clienta /]# vi cm-new.txt
  6. [ceph: root@clienta /]#
  7. rule ssd_first {
  8. id 2
  9. type replicated
  10. min_size 1
  11. max_size 10
  12. step take rack1 class ssd
  13. step chooseleaf firstn 1 type host
  14. step emit
  15. step take cl260 class hdd
  16. step chooseleaf firstn -1 type rack
  17. step emit
  18. }

The first replica is the primary OSD, so put it on an SSD.

The remaining replicas each pick an OSD from a different (HDD) rack.

  1. [ceph: root@clienta /]# crushtool -c cm-new.txt -o cm-new.bin
  2. [ceph: root@clienta /]# ceph osd setcrushmap -i cm-new.bin
  3. 67
  4. [ceph: root@clienta /]# ceph osd tree
  5. ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  6. -13 9.00000 root cl260
  7. -14 3.00000 rack rack1
  8. -17 3.00000 host hostc
  9. 1 ssd 1.00000 osd.1 up 1.00000 1.00000
  10. 5 ssd 1.00000 osd.5 up 1.00000 1.00000
  11. 6 ssd 1.00000 osd.6 up 1.00000 1.00000
  12. -15 3.00000 rack rack2
  13. -18 3.00000 host hostd
  14. 0 hdd 1.00000 osd.0 up 1.00000 1.00000
  15. 3 hdd 1.00000 osd.3 up 1.00000 1.00000
  16. 4 hdd 1.00000 osd.4 up 1.00000 1.00000
  17. -16 3.00000 rack rack3
  18. -19 3.00000 host hoste
  19. 2 hdd 1.00000 osd.2 up 1.00000 1.00000
  20. 7 hdd 1.00000 osd.7 up 1.00000 1.00000
  21. 8 hdd 1.00000 osd.8 up 1.00000 1.00000
  22. -1 0 root default
  23. -3 0 host serverc
  24. -5 0 host serverd
  25. -7 0 host servere
  26. [ceph: root@clienta /]# ceph osd crush rule ls
  27. replicated_rule
  28. ssd_rule
  29. ssd_first
  30. [ceph: root@clienta /]#
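
The steps that actually got compiled into the cluster can also be read back as JSON (a quick sketch):

  1. [ceph: root@clienta /]# ceph osd crush rule dump ssd_first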

Create a pool and look at the result:

  1. [ceph: root@clienta /]# ceph osd pool create ssdpool ssd_first
  2. pool 'ssdpool' created
  3. [ceph: root@clienta /]# ceph osd pool ls detail
  4. pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 249 flags hashpspool stripe_width 0 pg_num_min 1 application mgr_devicehealth
  5. pool 2 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 48 flags hashpspool stripe_width 0 application rgw
  6. pool 3 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 50 flags hashpspool stripe_width 0 application rgw
  7. pool 4 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 52 flags hashpspool stripe_width 0 application rgw
  8. pool 5 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 autoscale_mode on last_change 184 lfor 0/184/182 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 8 application rgw
  9. pool 10 'pool1' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 266 flags hashpspool stripe_width 0
  10. pool 11 'ssdpool' replicated size 3 min_size 2 crush_rule 2 object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 338 flags hashpspool stripe_width 0
  11. [ceph: root@clienta /]# ceph pg dump pgs_brief | grep ^11
  12. dumped pgs_brief
  13. 11.7 active+clean [1,7,4] 1 [1,7,4] 1
  14. 11.4 active+clean [5,0,8] 5 [5,0,8] 5
  15. 11.5 active+clean [1,8,3] 1 [1,8,3] 1
  16. 11.2 active+clean [5,8,3] 5 [5,8,3] 5
  17. 11.3 active+clean [6,7,3] 6 [6,7,3] 6
  18. 11.0 active+clean [1,2,0] 1 [1,2,0] 1
  19. 11.1 active+clean [6,0,2] 6 [6,0,2] 6
  20. 11.e active+clean [5,4,7] 5 [5,4,7] 5
  21. 11.f active+clean [6,0,7] 6 [6,0,7] 6
  22. 11.c active+clean [1,4,8] 1 [1,4,8] 1
  23. 11.9 active+clean [1,8,3] 1 [1,8,3] 1
  24. 11.a active+clean [5,7,3] 5 [5,7,3] 5
  25. 11.8 active+clean [6,8,4] 6 [6,8,4] 6
  26. 11.b active+clean [6,2,4] 6 [6,2,4] 6
  27. 11.d active+clean [6,7,3] 6 [6,7,3] 6
  28. 11.6 active+clean [1,8,3] 1 [1,8,3] 1
  29. 11.19 active+clean [1,8,3] 1 [1,8,3] 1
  30. 11.18 active+clean [6,2,4] 6 [6,2,4] 6
  31. 11.1b active+clean [6,2,4] 6 [6,2,4] 6
  32. 11.1a active+clean [5,3,7] 5 [5,3,7] 5
  33. 11.1d active+clean [1,7,4] 1 [1,7,4] 1
  34. 11.1c active+clean [6,2,4] 6 [6,2,4] 6
  35. 11.1f active+clean [6,4,8] 6 [6,4,8] 6
  36. 11.1e active+clean [5,3,7] 5 [5,3,7] 5
  37. 11.11 active+clean [1,0,7] 1 [1,0,7] 1
  38. 11.10 active+clean [5,2,3] 5 [5,2,3] 5
  39. 11.13 active+clean [5,0,8] 5 [5,0,8] 5
  40. 11.12 active+clean [5,2,0] 5 [5,2,0] 5
  41. 11.15 active+clean [5,0,2] 5 [5,0,2] 5
  42. 11.14 active+clean [6,3,2] 6 [6,3,2] 6
  43. 11.17 active+clean [5,0,8] 5 [5,0,8] 5
  44. 11.16 active+clean [1,0,7] 1 [1,0,7] 1
  45. [ceph: root@clienta /]#

This improves read efficiency, because reads are served by the primary OSD, so client-facing service gets better.

Writes also feel faster for the first copy: the primary (SSD) OSD writes first and the data is then replicated to the other two HDD OSDs.

Erasure-coded pools and replicated pools share the same bucket hierarchy, but their rules differ and each pool type uses its own.

  1. Rules for erasure-coded pools:
  2. ceph osd erasure-code-profile set myprofile k=3 m=2 crush-root=DC2 crush-failure-domain=rack crush-device-class=ssd
  3. ceph osd pool create myecpool 50 50 erasure myprofile
  4. ceph osd crush rule ls
  5. [ceph: root@clienta /]# ceph osd erasure-code-profile set myprofile2 crush-root=cl260 crush-failure-domain=osd
  6. [ceph: root@clienta /]# ceph osd pool create myecpool2 erasure myprofile2
  7. pool 'myecpool2' created
  8. [ceph: root@clienta /]# ceph pg dump pgs_brief | grep ^13
  9. dumped pgs_brief
  10. 13.1 creating+peering [8,4,5,7] 8 [8,4,5,7] 8
  11. 13.2 creating+peering [8,0,5,4] 8 [8,0,5,4] 8
  12. 13.3 creating+peering [2,6,0,4] 2 [2,6,0,4] 2
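
The profile and the rule that was generated from it can be inspected afterwards; the new erasure rule shows up alongside the replicated ones (a sketch):

  1. [ceph: root@clienta /]# ceph osd erasure-code-profile get myprofile2
  2. [ceph: root@clienta /]# ceph osd crush rule ls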

Manually changing a mapping:

  1. [ceph: root@clienta /]# ceph pg map 11.7
  2. osdmap e338 pg 11.7 (11.7) -> up [1,7,4] acting [1,7,4]
  3. [ceph: root@clienta /]# ceph osd pg-upmap-items 11.7 7 8
  4. set 11.7 pg_upmap_items mapping to [7->8]
  5. [ceph: root@clienta /]# ceph pg map 11.7
  6. osdmap e340 pg 11.7 (11.7) -> up [1,8,4] acting [1,8,4]
  7. [ceph: root@clienta /]#
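
The override is stored as a pg_upmap_items entry in the OSD map; it can be removed again so that CRUSH takes back over (a sketch):

  1. [ceph: root@clienta /]# ceph osd rm-pg-upmap-items 11.7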

Command summary

  1. 1. Assume the last OSD on each host is an SSD
  2. for i in 0 3 6;do ceph osd crush rm-device-class osd.$i;done
  3. for i in 0 3 6;do ceph osd crush set-device-class ssd osd.$i;done
  4. ceph osd crush class ls
  5. ceph osd crush rule create-replicated ssd_rule default host ssd
  6. ceph osd crush rule ls
  7. 1. Create a pool based on the ssd_rule rule
  8. ceph osd pool create cache 64 64 ssd_rule
  9. 1. Migrate an existing pool onto the SSD OSDs
  10. ceph osd pool set cephfs_metadata crush_rule ssd_rule
  11. 1. Write some data and check where it lands
  12. rados -p cache put test test.txt
  13. ceph osd map cache test
  14. 3. Managing the CRUSH map from the command line
  15. 1. Remove the device class from osd.1, osd.5 and osd.6
  16. ceph osd crush rm-device-class osd.1
  17. ceph osd crush rm-device-class osd.5
  18. ceph osd crush rm-device-class osd.6
  19. 2. Set the device class of osd.1, osd.5 and osd.6 to ssd
  20. ceph osd crush set-device-class ssd osd.1
  21. ceph osd crush set-device-class ssd osd.5
  22. ceph osd crush set-device-class ssd osd.6
  23. 3. Add the root bucket
  24. ceph osd crush add-bucket cl260 root
  25. 4. Add the rack buckets
  26. ceph osd crush add-bucket rack1 rack
  27. ceph osd crush add-bucket rack2 rack
  28. ceph osd crush add-bucket rack3 rack
  29. 5. Add the host buckets
  30. ceph osd crush add-bucket hostc host
  31. ceph osd crush add-bucket hostd host
  32. ceph osd crush add-bucket hoste host
  33. 6. Move the racks under the root
  34. ceph osd crush move rack1 root=cl260
  35. ceph osd crush move rack2 root=cl260
  36. ceph osd crush move rack3 root=cl260
  37. 7. Move the hosts into the racks
  38. ceph osd crush move hostc rack=rack1
  39. ceph osd crush move hostd rack=rack2
  40. ceph osd crush move hoste rack=rack3
  41. 8. Move the OSDs into the hosts
  42. ceph osd crush set osd.1 1.0 root=cl260 rack=rack1 host=hostc
  43. ceph osd crush set osd.5 1.0 root=cl260 rack=rack1 host=hostc
  44. ceph osd crush set osd.6 1.0 root=cl260 rack=rack1 host=hostc
  45. ceph osd crush set osd.0 1.0 root=cl260 rack=rack2 host=hostd
  46. ceph osd crush set osd.3 1.0 root=cl260 rack=rack2 host=hostd
  47. ceph osd crush set osd.4 1.0 root=cl260 rack=rack2 host=hostd
  48. ceph osd crush set osd.2 1.0 root=cl260 rack=rack3 host=hoste
  49. ceph osd crush set osd.7 1.0 root=cl260 rack=rack3 host=hoste
  50. ceph osd crush set osd.8 1.0 root=cl260 rack=rack3 host=hoste
  51. 9. Add the rule
  52. ceph osd getcrushmap -o cm-org.bin
  53. crushtool -d cm-org.bin -o cm-org.txt
  54. cp cm-org.txt cm-new.txt
  55. vi cm-new.txt
  56. rule ssd_first {
  57. id 2
  58. type replicated
  59. min_size 1
  60. max_size 10
  61. step take rack1
  62. step chooseleaf firstn 1 type host # the first replica comes from rack1
  63. step emit
  64. step take cl260 class hdd
  65. step chooseleaf firstn -1 type rack # the remaining replicas go to HDDs under the cl260 root
  66. step emit
  67. }
  68. crushtool -c cm-new.txt -o cm-new.bin
  69. ceph osd setcrushmap -i cm-new.bin
  70. ceph osd tree
  71. ceph osd crush ls
  72. ceph osd crush rule ls
  73. ceph osd pool create ssdpool ssd_first
  74. ceph pg dump pgs_brief | grep ^10
