Setting up a Rook Ceph storage cluster
1. Install the Rook cluster
Download the code:
```
git clone --single-branch --branch v1.8.3 https://github.com/rook/rook.git
cd rook/deploy/examples
```
Edit the operator.yaml file. The main change is the image addresses: the images cannot be pulled from k8s.gcr.io directly from a network inside mainland China, so it is best to pull them on a machine with unrestricted access, push them to your own registry, and reference that registry on the production servers. As shown below:
```
ROOK_CSI_CEPH_IMAGE: "testharbor.zuoyejia.com/k8s_image/cephcsi@sha256:19634b6ef9fc6df2902cf6ff0b3dbccc56a6663d0cbfd065da44ecd2f955d848"
ROOK_CSI_REGISTRAR_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-node-driver-registrar@sha256:01b341312ea19cefc29f46fa0dd54255530b9039dd80834f50d582ecd93cc3ca"
ROOK_CSI_RESIZER_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-resizer@sha256:d2d2e429a0a87190ee73462698a02a08e555055246ad87ad979b464b999fedae"
ROOK_CSI_PROVISIONER_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-provisioner@sha256:bbae7cde811054f6a51060ba7a42d8bf2469b8c574abb50fec8b46c13e32541e"
ROOK_CSI_SNAPSHOTTER_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-snapshotter@sha256:551b9692943f915b5ee4b7274e3a918692a6175bb028f1f0236a38596c46cbe0"
ROOK_CSI_ATTACHER_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-attacher@sha256:221c1c6930fb1cb93b57762a74ccb59194c4c74a63c0fd49309d1158d4f8c72c"
```
Edit the cluster.yaml file. The main changes are the image addresses, the mon count, the mgr count, and the dashboard SSL setting:
- mgr count: the Ceph cluster needs to stay available, so it is best to set this to 2. Beware of a pitfall: this "high availability" is an active/standby pair, which matters later when configuring the Dashboard Service.
- dashboard SSL: we change this because we expose the Dashboard through an Ingress, and in front of the Ingress sits a Huawei Cloud ELB. The ELB does not support TCP as the backend protocol, so the Dashboard's SSL has to be turned off and HTTPS is terminated at the ELB layer.
The relevant part of cluster.yaml is shown below:
```
mon:
  # Set the number of mons to be started. Generally recommended to be 3.
  # For highest availability, an odd number of mons should be specified.
  count: 3
  # The mons should be on unique nodes. For production, at least 3 nodes are recommended for this reason.
  # Mons should only be allowed on the same node for test environments where data loss is acceptable.
  allowMultiplePerNode: false
mgr:
  # When higher availability of the mgr is needed, increase the count to 2.
  # In that case, one mgr will be active and one in standby. When Ceph updates which
  # mgr is active, Rook will update the mgr services to match the active mgr.
  count: 2
  modules:
    # Several modules should not need to be included in this list. The "dashboard" and "monitoring" modules
    # are already enabled by other settings in the cluster CR.
    - name: pg_autoscaler
      enabled: true
# enable the ceph dashboard for viewing cluster status
dashboard:
  enabled: true
  # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
  # urlPrefix: /ceph-dashboard
  # serve the dashboard at the given port.
  # port: 8443
  # serve the dashboard using SSL
  ssl: false
# enable prometheus alerting for cluster
monitoring:
  # requires Prometheus to be pre-installed
  enabled: false
  # namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used.
  # Recommended:
  # If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty.
  # If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
  # deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
  rulesNamespace: rook-ceph
```
Deploy the Rook Operator:
```
cd rook/deploy/examples
kubectl apply -f crds.yaml -f common.yaml -f operator.yaml
```
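Before creating the cluster it is worth waiting until the operator pod is Running; a quick check could look like the sketch below (assuming the default app=rook-ceph-operator label from the example operator.yaml):
```
# Watch the operator pod until it reaches Running before applying cluster.yaml
kubectl -n rook-ceph get pod -l app=rook-ceph-operator -w
```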
Create the Ceph cluster:
```
kubectl apply -f cluster.yaml
```
Check that the Ceph cluster is healthy, as shown below:
```
kubectl -n rook-ceph get pod
```
2. Set up the Ceph Dashboard
Expose the Dashboard outside the cluster. Here we use an HTTP Service of type NodePort. At this point you will find that the Dashboard still cannot be reached; see step 2 below for the fix.
```
cd rook/deploy/examples
kubectl create -f dashboard-external-http.yaml
```
Work out which mgr pod is actually serving in the active/standby setup
In active/standby mode one of the two pods does not serve requests, so the Service may end up proxying to the standby mgr pod, which is why the Dashboard cannot be reached. Look at the Service created in step 1 above: the container port it targets (targetPort) is 7000.
```
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-http
  namespace: rook-ceph # namespace:cluster
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph # namespace:cluster
spec:
  ports:
    - name: dashboard
      port: 7000
      protocol: TCP
      targetPort: 7000
  selector:
    app: rook-ceph-mgr
    ceph_daemon_id: a
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
```
Find the IP addresses of the two mgr pods:
```
kubectl -n rook-ceph get pod -owide
```
Use curl against port 7000 to see which pod responds normally:
```
curl 10.244.8.83:7000
curl 10.244.68.188:7000
```
Check the labels of these two pods and note their ceph_daemon_id:
```
kubectl get pod -n rook-ceph --show-labels
```
Edit dashboard-external-http.yaml and add the ceph_daemon_id of the working pod to the selector:
```
selector:
  app: rook-ceph-mgr
  ceph_daemon_id: a
  rook_cluster: rook-ceph
```
Re-apply the dashboard-external-http.yaml Service. This time the Dashboard is reachable at the node's public IP plus the NodePort.
```
kubectl apply -f dashboard-external-http.yaml
```
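To find out which port to pair with the public IP, read the NodePort that Kubernetes assigned to the Service (the Service name comes from dashboard-external-http.yaml; the node IP below is only a placeholder):
```
# Show the NodePort mapped to the dashboard's port 7000
kubectl -n rook-ceph get svc rook-ceph-mgr-dashboard-external-http \
  -o jsonpath='{.spec.ports[0].nodePort}{"\n"}'
# Then open http://<node-public-ip>:<nodePort> in a browser
```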
Create the Ingress file dashboard-ingress-http.yaml:
```
#
# This example is for Kubernetes running an nginx-ingress
# and an ACME (e.g. Let's Encrypt) certificate service
#
# The nginx-ingress annotations support the dashboard
# running using HTTPS with a self-signed certificate
#
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rook-ceph-mgr-dashboard
  namespace: rook-ceph # namespace:cluster
spec:
  ingressClassName: nginx
  rules:
    - host: rookceph.zuoyejia.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: rook-ceph-mgr-dashboard-external-http
                port:
                  number: 7000
```
Apply the Ingress:
```
kubectl apply -f dashboard-ingress-http.yaml
```
- Note: steps 4 and 5 above require ingress-nginx and a Huawei Cloud ELB. If you do not have them, simply use the manifests from the official documentation as-is. And again: the Service must carry a selector that points at the working mgr pod.
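Logging in to the Dashboard needs the admin password that Rook generates. Per the Rook docs it is stored in the rook-ceph-dashboard-password secret, so something like the following should print it (the user name is admin):
```
# Decode the auto-generated password for the "admin" dashboard user
kubectl -n rook-ceph get secret rook-ceph-dashboard-password \
  -o jsonpath="{['data']['password']}" | base64 --decode && echo
```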
3. Ceph storage
3.1 Block storage (RBD)
RBD: RADOS Block Device
RADOS: Reliable, Autonomic Distributed Object Store
Configuration
RWO (ReadWriteOnce)
The commonly used block storage, RWO mode. Note that when a StatefulSet is deleted its PVCs are not deleted along with it, so you have to clean them up manually. See https://www.rook.io/docs/rook/v1.8/ceph-block.html
```
kubectl create -f deploy/examples/csi/rbd/storageclass.yaml
```
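As a quick sanity check, a PVC bound to this StorageClass could look like the sketch below; test-rbd-pvc is a made-up name, and rook-ceph-block is the StorageClass name used in the example storageclass.yaml (adjust if you renamed it):
```
# Hypothetical RWO claim against the RBD StorageClass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-rbd-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-ceph-block
```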
3.2 Shared filesystem storage (CephFS)
Configuration
The commonly used file storage, RWX mode: for example, 10 pods reading and writing the same location. See https://rook.io/docs/rook/v1.8/ceph-filesystem.html
```
cd rook
kubectl apply -f deploy/examples/filesystem.yaml
kubectl create -f deploy/examples/csi/cephfs/storageclass.yaml
```
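Similarly, a shared RWX claim against the CephFS StorageClass might look like this sketch; test-cephfs-pvc is made up, and rook-cephfs is the StorageClass name from the example manifest:
```
# Hypothetical RWX claim that several pods can mount at the same time
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-cephfs-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: rook-cephfs
```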
4. Tear down Rook Ceph
Reference: https://rook.io/docs/rook/v1.8/ceph-teardown.html
1. Clean up the cluster
```
rm -rf /var/lib/rook
```
2. Delete the block storage and the deployed manifests
```
kubectl delete -f crds.yaml -f common.yaml -f operator.yaml
```
3. Delete the CephCluster CRD
```
# 1. Edit the CephCluster and add the cleanupPolicy
```
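The rest of this step is spelled out in the teardown doc linked above; roughly it amounts to the following sketch (destructive, double-check before running):
```
# 1. Confirm data destruction by setting the cleanupPolicy on the CephCluster
kubectl -n rook-ceph patch cephcluster rook-ceph --type merge \
  -p '{"spec":{"cleanupPolicy":{"confirmation":"yes-really-destroy-data"}}}'
# 2. Delete the CephCluster CR and wait until it is gone
kubectl -n rook-ceph delete cephcluster rook-ceph
kubectl -n rook-ceph get cephcluster
```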
4. Delete the Operator and related resources
```
kubectl delete -f operator.yaml
```
5. Delete the data on the hosts
```
DISK="/dev/vdb"
```
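The full disk-cleanup snippet in the teardown doc continues roughly as below (a sketch; /dev/vdb is just the example device from the line above, point DISK at whatever device each OSD actually used):
```
DISK="/dev/vdb"
# Wipe the partition table and the start of the disk
sgdisk --zap-all "$DISK"
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# Remove the LVM state left behind by ceph-volume
ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
rm -rf /dev/ceph-*
rm -rf /dev/mapper/ceph--*
```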
6. Troubleshooting
Check the pods:
```
kubectl -n rook-ceph get pod
```
Check the cluster CRD:
```
kubectl -n rook-ceph get cephcluster
```
Delete the CRDs (by clearing the finalizers on their resources):
```
# Clear the finalizers on every resource of each Ceph CRD
for CRD in $(kubectl get crd -n rook-ceph | awk '/ceph.rook.io/ {print $1}'); do
  kubectl get -n rook-ceph "$CRD" -o name | \
    xargs -I {} kubectl patch -n rook-ceph {} --type merge -p '{"metadata":{"finalizers": [null]}}'
done
```
If the namespace is still stuck in the Terminating state, check which resources are blocking the deletion, then remove their finalizers and delete them:
```
kubectl api-resources --verbs=list --namespaced -o name \
  | xargs -n 1 kubectl get --show-kind --ignore-not-found -n rook-ceph
```
Remove the finalizers from the remaining resources:
```
kubectl -n rook-ceph patch configmap rook-ceph-mon-endpoints --type merge -p '{"metadata":{"finalizers": []}}'
kubectl -n rook-ceph patch secrets rook-ceph-mon --type merge -p '{"metadata":{"finalizers": []}}'
# If the cephcluster and replicapool still exist, also run the following
kubectl -n rook-ceph patch cephclusters.ceph.rook.io rook-ceph -p '{"metadata":{"finalizers": []}}' --type=merge
kubectl -n rook-ceph patch cephblockpool.ceph.rook.io replicapool -p '{"metadata":{"finalizers": []}}' --type=merge
```