Setting Up a Rook Ceph Storage Cluster

I. Installing the Rook Cluster

https://www.rook.io/docs/rook/v1.8/quickstart.html

  1. Download the code

    git clone --single-branch --branch v1.8.3 https://github.com/rook/rook.git
    cd rook/deploy/examples
  2. Edit operator.yaml (mainly to change the image download addresses: the k8s.gcr.io images cannot be pulled directly from inside mainland China, so it is best to pull them on a machine with unrestricted access and push them to a registry your production servers can reach; a sketch of that mirroring step follows the settings below). The relevant entries look like this:

    ROOK_CSI_CEPH_IMAGE: "testharbor.zuoyejia.com/k8s_image/cephcsi@sha256:19634b6ef9fc6df2902cf6ff0b3dbccc56a6663d0cbfd065da44ecd2f955d848"
    ROOK_CSI_REGISTRAR_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-node-driver-registrar@sha256:01b341312ea19cefc29f46fa0dd54255530b9039dd80834f50d582ecd93cc3ca"
    ROOK_CSI_RESIZER_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-resizer@sha256:d2d2e429a0a87190ee73462698a02a08e555055246ad87ad979b464b999fedae"
    ROOK_CSI_PROVISIONER_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-provisioner@sha256:bbae7cde811054f6a51060ba7a42d8bf2469b8c574abb50fec8b46c13e32541e"
    ROOK_CSI_SNAPSHOTTER_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-snapshotter@sha256:551b9692943f915b5ee4b7274e3a918692a6175bb028f1f0236a38596c46cbe0"
    ROOK_CSI_ATTACHER_IMAGE: "testharbor.zuoyejia.com/k8s_image/csi-attacher@sha256:221c1c6930fb1cb93b57762a74ccb59194c4c74a63c0fd49309d1158d4f8c72c"
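
    As a reference, the mirroring step might look like the sketch below. The registry path testharbor.zuoyejia.com/k8s_image comes from the settings above; the upstream image names and tags are only placeholders, so substitute the defaults commented in your operator.yaml:

    # Run on a machine that can reach quay.io / k8s.gcr.io, then push to the
    # private registry that the production cluster pulls from.
    REGISTRY=testharbor.zuoyejia.com/k8s_image
    # placeholder upstream images -- replace with the defaults commented in operator.yaml
    for IMG in \
      quay.io/cephcsi/cephcsi:v3.4.0 \
      k8s.gcr.io/sig-storage/csi-node-driver-registrar:v2.3.0
    do
      docker pull "$IMG"
      docker tag  "$IMG" "$REGISTRY/$(basename "$IMG")"
      docker push "$REGISTRY/$(basename "$IMG")"
    done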

  3. Edit cluster.yaml (mainly the image address, the mon count, the mgr count, and the dashboard SSL setting).

    • mgr count: the Ceph cluster needs to stay available, so it is best to set this to 2 (note the pitfall here: this high availability is active/standby, which matters later when configuring the dashboard).
    • dashboard SSL: we change this because we expose the dashboard through an Ingress, and in front of the Ingress sits a Huawei Cloud ELB (the ELB does not support TCP as a backend protocol, so the dashboard's SSL has to be turned off and HTTPS terminated at the ELB layer).

      The relevant settings look like this:

      mon:
        # Set the number of mons to be started. Generally recommended to be 3.
        # For highest availability, an odd number of mons should be specified.
        count: 3
        # The mons should be on unique nodes. For production, at least 3 nodes are recommended for this reason.
        # Mons should only be allowed on the same node for test environments where data loss is acceptable.
        allowMultiplePerNode: false
      mgr:
        # When higher availability of the mgr is needed, increase the count to 2.
        # In that case, one mgr will be active and one in standby. When Ceph updates which
        # mgr is active, Rook will update the mgr services to match the active mgr.
        count: 2
        modules:
          # Several modules should not need to be included in this list. The "dashboard" and "monitoring" modules
          # are already enabled by other settings in the cluster CR.
          - name: pg_autoscaler
            enabled: true
      # enable the ceph dashboard for viewing cluster status
      dashboard:
        enabled: true
        # serve the dashboard under a subpath (useful when you are accessing the dashboard via a reverse proxy)
        # urlPrefix: /ceph-dashboard
        # serve the dashboard at the given port.
        # port: 8443
        # disable SSL on the dashboard; HTTPS is terminated at the Huawei Cloud ELB instead
        ssl: false
      # enable prometheus alerting for cluster
      monitoring:
        # requires Prometheus to be pre-installed
        enabled: false
        # namespace to deploy prometheusRule in. If empty, namespace of the cluster will be used.
        # Recommended:
        # If you have a single rook-ceph cluster, set the rulesNamespace to the same namespace as the cluster or keep it empty.
        # If you have multiple rook-ceph clusters in the same k8s cluster, choose the same namespace (ideally, namespace with prometheus
        # deployed) to set rulesNamespace for all the clusters. Otherwise, you will get duplicate alerts with multiple alert definitions.
        rulesNamespace: rook-ceph

  4. Deploy the Rook operator (a readiness check is sketched after the commands below)

    cd rook/deploy/examples
    kubectl apply -f crds.yaml -f common.yaml -f operator.yaml
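
    Before creating the cluster, it helps to wait until the operator pod is Running; a quick check, assuming the default rook-ceph namespace and the labels from the example manifests:

    kubectl -n rook-ceph get pod -l app=rook-ceph-operator
    # or block until it is ready
    kubectl -n rook-ceph wait pod -l app=rook-ceph-operator --for=condition=Ready --timeout=300s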
  5. Create the Ceph cluster

    kubectl apply -f cluster.yaml
  6. Check that the Ceph cluster is healthy (a deeper health check is sketched after the command below):

    kubectl -n rook-ceph get pod
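
    Beyond checking that the pods are Running, the cluster health can be inspected with the ceph CLI from the toolbox pod shipped in the same examples directory (a sketch; toolbox.yaml and the rook-ceph-tools deployment name are the Rook v1.8 defaults):

    kubectl -n rook-ceph get cephcluster          # PHASE and HEALTH columns
    kubectl create -f toolbox.yaml
    kubectl -n rook-ceph exec -it deploy/rook-ceph-tools -- ceph status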

II. Setting Up the Ceph Dashboard

  1. Expose the Dashboard outside the cluster. Here we use the HTTP Service in NodePort mode. At this point you will find that the Dashboard cannot be reached; the fix is described in step 2. (A check of the Service and its NodePort follows the commands below.)

    cd rook/deploy/examples
    kubectl create -f dashboard-external-http.yaml
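
    You can confirm the Service and the NodePort it was assigned before trying to reach it (the Service name comes from the example manifest):

    kubectl -n rook-ceph get svc rook-ceph-mgr-dashboard-external-http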
  2. Find out which mgr pod actually works in the active/standby high-availability setup.
    In active/standby mode one of the two pods does not serve requests, so the Service may end up proxying to the unusable mgr pod, which is why the Dashboard cannot be reached. (A sketch that automates this check follows the bullets below.)

    • Look at the Service created in step 1 above. We can see the pod port (targetPort) is 7000

      apiVersion: v1
      kind: Service
      metadata:
        name: rook-ceph-mgr-dashboard-external-http
        namespace: rook-ceph # namespace:cluster
        labels:
          app: rook-ceph-mgr
          rook_cluster: rook-ceph # namespace:cluster
      spec:
        ports:
          - name: dashboard
            port: 7000
            protocol: TCP
            targetPort: 7000
        selector:
          app: rook-ceph-mgr
          ceph_daemon_id: a
          rook_cluster: rook-ceph
        sessionAffinity: None
        type: NodePort
    • Find the IP addresses of the two mgr pods

      kubectl -n rook-ceph get pod -owide

    • curl port 7000 on each pod IP to see which one responds normally

      curl 10.244.8.83:7000
      curl 10.244.68.188:7000

    • Check the labels of the two pods and note the ceph_daemon_id values

      kubectl get pod -n rook-ceph --show-labels

    • Edit dashboard-external-http.yaml and add the working pod's ceph_daemon_id to the selector

      selector:
        app: rook-ceph-mgr
        ceph_daemon_id: a
        rook_cluster: rook-ceph
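
    Instead of curling pod IPs by hand, you can also ask Ceph which mgr daemon is active and match it against the pod labels; a sketch, assuming the toolbox deployment from section I is installed:

      # Prints the active mgr name (the "active_name" field), e.g. "a"
      kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph mgr stat
      # List the mgr pods together with their ceph_daemon_id labels
      kubectl -n rook-ceph get pod -l app=rook-ceph-mgr -L ceph_daemon_id -o wide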

  3. Reapply the dashboard-external-http.yaml Service. This time the Dashboard can be reached normally via public IP + NodePort

    kubectl apply -f dashboard-external-http.yaml
  4. Create the Ingress file dashboard-ingress-http.yaml

    #
    # This example is for Kubernetes running an nginx-ingress
    # and an ACME (e.g. Let's Encrypt) certificate service
    #
    # The nginx-ingress annotations support the dashboard
    # running using HTTPS with a self-signed certificate
    #
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: rook-ceph-mgr-dashboard
      namespace: rook-ceph # namespace:cluster
    spec:
      ingressClassName: nginx
      rules:
        - host: rookceph.zuoyejia.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: rook-ceph-mgr-dashboard-external-http
                    port:
                      number: 7000
  5. Apply the Ingress

    kubectl apply -f dashboard-ingress-http.yaml
  6. Note: steps 4 and 5 above require ingress-nginx and a Huawei Cloud ELB. If you do not have them, just use the files from the official documentation. As a reminder, the Service must carry a selector that matches the working mgr pod. The default admin password for the Dashboard login can be retrieved as shown below.
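
    The password is stored in a secret generated by Rook; the command below is the one given in the Rook dashboard docs:

    kubectl -n rook-ceph get secret rook-ceph-dashboard-password \
      -o jsonpath="{['data']['password']}" | base64 --decode && echo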

III. Ceph Storage

1.1 Block Storage (RBD)

RBD: RADOS Block Device

RADOS: Reliable, Autonomic Distributed Object Store

  1. Configuration
    RWO (ReadWriteOnce)
    Commonly used for block storage in RWO mode. When a StatefulSet is deleted, its PVCs are not deleted; you have to clean them up yourself. (A sample PVC is sketched after the command below.)

    https://www.rook.io/docs/rook/v1.8/ceph-block.html

    kubectl create -f deploy/examples/csi/rbd/storageclass.yaml
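
    As a usage sketch, a PVC that consumes this class could look like the following; rook-ceph-block is the StorageClass name created by the example manifest, while the PVC name and size are made up:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: rbd-demo-pvc            # hypothetical name
    spec:
      storageClassName: rook-ceph-block
      accessModes:
        - ReadWriteOnce             # RWO: one node mounts the volume read-write
      resources:
        requests:
          storage: 1Gi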

1.2 Shared Filesystem (CephFS)

  1. Configuration
    Commonly used for file storage in RWX mode, e.g. ten Pods all reading and writing the same location. (A sample PVC is sketched after the commands below.)
    https://rook.io/docs/rook/v1.8/ceph-filesystem.html

    cd rook
    kubectl apply -f deploy/examples/filesystem.yaml
    kubectl create -f deploy/examples/csi/cephfs/storageclass.yaml
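
    A PVC against this class requests ReadWriteMany so that several Pods can mount the same volume; a sketch, where rook-cephfs is the StorageClass name from the example manifest and the PVC name is made up:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: cephfs-demo-pvc         # hypothetical name
    spec:
      storageClassName: rook-cephfs
      accessModes:
        - ReadWriteMany             # RWX: many Pods on many nodes can mount it
      resources:
        requests:
          storage: 1Gi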

IV. Uninstalling Rook Ceph

Reference: https://rook.io/docs/rook/v1.8/ceph-teardown.html

1. Clean up the cluster data directory (on every node)

rm -rf /var/lib/rook

2. Delete the block pool and the deployed manifests (a check for leftover volumes follows the commands below)

kubectl delete -f crds.yaml -f common.yaml -f operator.yaml
kubectl delete -f cluster.yaml
kubectl delete -n rook-ceph cephblockpool replicapool
kubectl delete storageclass rook-ceph-block
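
Before going further, it is worth confirming that no PVCs or PVs still reference the Rook storage classes; the official teardown guide recommends removing everything that consumes Rook storage before deleting the cluster. A quick check:

kubectl get pvc --all-namespaces | grep -E 'rook-ceph-block|rook-cephfs'
kubectl get pv | grep -E 'rook-ceph-block|rook-cephfs'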

3. Delete the CephCluster CR

# 1. Edit the CephCluster and add the cleanupPolicy
# 2. Delete the CephCluster CR
# 3. Confirm the cluster CR has been deleted
kubectl -n rook-ceph patch cephcluster rook-ceph --type merge -p '{"spec":{"cleanupPolicy":{"confirmation":"yes-really-destroy-data"}}}'
kubectl -n rook-ceph delete cephcluster rook-ceph
kubectl -n rook-ceph get cephcluster

4. Delete the operator and related resources

kubectl delete -f operator.yaml
kubectl delete -f common.yaml
kubectl delete -f crds.yaml

5. Delete the data on the hosts (a quick check of the disk follows the commands below)

DISK="/dev/vdb"
sgdisk --zap-all $DISK
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# for SSDs, use blkdiscard instead of dd
# blkdiscard $DISK
partprobe $DISK

ls /dev/mapper/ceph-* | xargs -I% -- dmsetup remove %
rm -rf /dev/ceph-*
rm -rf /dev/mapper/ceph--*
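
After zapping, a quick way to confirm the disk carries no leftover filesystem signatures or ceph device-mapper entries (a sketch using the same $DISK variable):

lsblk -f "$DISK"
ls /dev/mapper/ | grep ceph || echo "no ceph device-mapper entries left"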

6. Troubleshooting

  • Check the pods

    kubectl -n rook-ceph get pod
  • Check the cluster CR

    kubectl -n rook-ceph get cephcluster
  • Remove the remaining ceph.rook.io custom resources (by clearing their finalizers)

    # Clear the finalizers on every ceph.rook.io custom resource
    for CRD in $(kubectl get crd -n rook-ceph | awk '/ceph.rook.io/ {print $1}'); do
      kubectl get -n rook-ceph "$CRD" -o name | \
        xargs -I {} kubectl patch -n rook-ceph {} --type merge -p '{"metadata":{"finalizers": [null]}}'
    done
  • If the namespace is still stuck in the Terminating state, check which resources are blocking the deletion and remove their finalizers:

    kubectl api-resources --verbs=list --namespaced -o name \
    | xargs -n 1 kubectl get --show-kind --ignore-not-found -n rook-ceph
  • Remove the finalizers from those resources

    kubectl -n rook-ceph patch configmap rook-ceph-mon-endpoints --type merge -p '{"metadata":{"finalizers": []}}'
    kubectl -n rook-ceph patch secrets rook-ceph-mon --type merge -p '{"metadata":{"finalizers": []}}'

    # if the cluster and replicapool still exist, run the commands below
    kubectl -n rook-ceph patch cephclusters.ceph.rook.io rook-ceph -p '{"metadata":{"finalizers": []}}' --type=merge
    kubectl -n rook-ceph patch cephblockpool.ceph.rook.io replicapool -p '{"metadata":{"finalizers": []}}' --type=merge