Ceph OSD scaling and rebalancing

This guide covers how to scale a Ceph cluster by adding new OSDs (Object Storage Daemons) and optimize the rebalancing process to prevent performance degradation.

Set up a Grafana alert to trigger when disk usage exceeds 75%. This gives you time to scale the cluster before storage runs out.
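Until such an alert is in place, raw usage can be checked manually. A minimal sketch, using the same --conf path as the commands later in this guide:

```shell
# Show raw capacity, usage, and per-pool consumption.
# The %RAW USED column is the value a 75% alert should track.
ceph df --conf=/var/lib/rook/openshift-storage/openshift-storage.config
```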

1. Scaling OSDs

When your Ceph cluster is running low on storage, increase the number of OSDs by updating the StorageCluster custom resource (CR); the operator provisions the matching PVCs automatically.

1.1. Adding OSDs via StorageCluster CR

  1. Edit the StorageCluster resource in the openshift-storage namespace:

    oc edit StorageCluster -n openshift-storage ocs-storagecluster
  2. Locate the storage.storageClassDeviceSets section and increase the count value to add more OSD groups. Ensure that the replica count and the storage size per OSD group match your requirements.

Example: Adding one new group with 3 replicas of 512Gi each
storage:
  storageClassDeviceSets:
    - name: ocs-deviceset-gp3-csi
      count: 2  # was 1 — add one more group (1)
      replica: 3 (2)
      resources:
        limits:
          cpu: "4"
          memory: 16Gi
        requests:
          cpu: "2"
          memory: 8Gi
      volumeClaimTemplates:
        - metadata:
            name: data
          spec:
            resources:
              requests:
                storage: 512Gi (3)
            storageClassName: gp3-csi
            volumeMode: Block
            accessModes:
              - ReadWriteOnce
1 count — number of OSD groups to deploy. Increasing by 1 adds a new group based on the given replica count.
2 replica — number of OSDs in each group. For example, replica: 3 deploys three OSDs per group.
3 storage — PVC size for each OSD. Here, each replica is allocated 512Gi.

Before scaling, check the capacity of the existing OSD PVCs. New OSDs must match the existing size, since CRUSH weights data placement by OSD capacity; mismatched sizes lead to uneven utilization.
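A quick way to inspect the existing sizes, assuming the device-set naming from the example CR above (ocs-deviceset-gp3-csi):

```shell
# List the PVCs backing the existing OSDs and their requested sizes;
# the device-set name prefix is taken from the example CR above.
oc get pvc -n openshift-storage | grep ocs-deviceset
```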

1.2. Verifying the result

Run the following commands to verify that the cluster has scaled:

  1. Check Ceph cluster status:

    ceph -s --conf=/var/lib/rook/openshift-storage/openshift-storage.config

    This command shows the overall cluster state, including OSD count, monitor health, PG states, and replication status.

  2. Check the OSD tree to confirm new OSDs were added:

    ceph osd tree --conf=/var/lib/rook/openshift-storage/openshift-storage.config

    This command displays the cluster OSD hierarchy, including host associations and status.
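As a quick sanity check, ceph osd stat summarizes the OSD count on one line; the expected total is count × replica from the CR. A sketch, using the same --conf path as above:

```shell
# Prints a one-line summary such as "6 osds: 6 up, 6 in".
# All OSDs should be both "up" and "in" once scaling completes.
ceph osd stat --conf=/var/lib/rook/openshift-storage/openshift-storage.config
```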

2. Configuring and optimizing rebalancing

After adding or removing OSDs, Ceph automatically triggers rebalancing — redistributing objects across OSDs to maintain balanced storage usage.

This process can be resource-intensive and may affect cluster performance. Use the parameters below to either throttle or accelerate the process, depending on your needs.

2.1. Throttling rebalancing

Use this option when maintaining stable performance during business hours is the priority.

  1. Limit the number of concurrent backfill and recovery operations per OSD:

    ceph config set osd osd_max_backfills 1 --conf=/var/lib/rook/openshift-storage/openshift-storage.config
    
    ceph config set osd osd_recovery_max_active 1 --conf=/var/lib/rook/openshift-storage/openshift-storage.config

    These settings reduce concurrent backfill and recovery operations to minimize impact on disk and CPU.

  2. Add a delay between recovery operations:

    ceph config set osd osd_recovery_sleep 0.1 --conf=/var/lib/rook/openshift-storage/openshift-storage.config

    This introduces a 0.1-second pause between consecutive recovery operations, reducing disk and CPU load.
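To confirm the throttling took effect, the values can be read back with ceph config get:

```shell
CONF=/var/lib/rook/openshift-storage/openshift-storage.config

# Each command prints the currently active value for the osd section.
ceph config get osd osd_max_backfills --conf="$CONF"
ceph config get osd osd_recovery_max_active --conf="$CONF"
ceph config get osd osd_recovery_sleep --conf="$CONF"
```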

2.2. Speeding up rebalancing (recommended during maintenance windows)

During low-traffic periods (e.g., overnight or weekends), you can speed up rebalancing by allowing more parallel operations:

ceph config set osd osd_max_backfills 4 --conf=/var/lib/rook/openshift-storage/openshift-storage.config

ceph config set osd osd_recovery_max_active 4 --conf=/var/lib/rook/openshift-storage/openshift-storage.config

These settings increase the number of simultaneous backfill and recovery operations. Use only during scheduled maintenance windows.
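When the maintenance window ends, consider removing the overrides so the OSDs fall back to their defaults; a sketch using ceph config rm:

```shell
CONF=/var/lib/rook/openshift-storage/openshift-storage.config

# Remove the overrides set above; the OSDs revert to default values.
ceph config rm osd osd_max_backfills --conf="$CONF"
ceph config rm osd osd_recovery_max_active --conf="$CONF"
ceph config rm osd osd_recovery_sleep --conf="$CONF"
```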

2.3. Monitoring rebalancing progress

Use these commands to monitor rebalancing activity and Placement Group status:

ceph -s --conf=/var/lib/rook/openshift-storage/openshift-storage.config

Displays current cluster status, including recovery, backfill, and degraded PGs.

ceph pg stat --conf=/var/lib/rook/openshift-storage/openshift-storage.config

Shows detailed PG statistics and their current state (active, clean, backfilling, recovering).

If the cluster remains in a degraded state for an extended time after scaling or OSD removal, check for inactive PGs or disk-related errors in the Ceph logs.
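For continuous observation during a long rebalance, the status command can simply be polled; a minimal sketch:

```shell
# Refresh cluster status every 30 seconds; the recovery/backfill lines
# should shrink until all PGs report active+clean.
watch -n 30 "ceph -s --conf=/var/lib/rook/openshift-storage/openshift-storage.config"
```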