Migrating registry database instance disks from ceph-rbd to cloud storage disk types
- 1. Main disk types
- 2. Replacement process overview
- 3. Replacement algorithm
- Step 1: Preparation for migration
- Step 2: Closing external access
- Step 3: Pre-migration backup
- Step 4: Preparing configurations
- Step 5: Verifying backups
- Step 6: Recreating the operational instance
- Step 7: Restoring the operational database
- Step 8: Recreating and restoring the analytical instance
- Step 9: Post-migration verification
- Step 10: Updating registry-postgres component code
- Step 11: Running pipelines
- Step 12: Finalizing the migration
- 4. Conclusion
This guide describes the process of replacing disks for the operational and analytical instances of the registry database, moving them from the default ceph-rbd storage class to native cloud provider disks, such as gp3 (AWS) or thin and thin-csi (vSphere).
1. Main disk types
The table below lists the supported disk types for migration:
Disk Type | Description |
---|---|
gp3 | AWS native disk with flexible volume expansion and configurable performance parameters (IOPS, throughput). |
thin-csi | vSphere native disk with support for flexible volume expansion. |
thin | vSphere native disk without support for flexible volume expansion. |
2. Replacement process overview
The replacement process involves recreating the PostgresCluster resources for both the analytical and operational instances with updated disks in a new storage class (gp3, thin, or thin-csi). The databases are then restored from existing backups using standard PostgreSQL tools.
3. Replacement algorithm
Step 1: Preparation for migration
- Check instance logs: Ensure that there are no errors in the logs of the analytical and operational pods, especially errors related to replication to the analytical instance.
- Verify backup availability:
  - Ensure that MinIO contains up-to-date full backups for both the analytical and operational instances, created automatically according to the schedule.
  - By default, backups are created once a day.
Step 2: Closing external access
Restrict registry access: Close external access to the registry (user portals, bp-web-service-gateway, and so on).
Step 3: Pre-migration backup
- Create a full registry backup: Use Velero to back up the registry. Open the central Jenkins, find your registry folder, and run the Create-registry-backup pipeline.
- Wait for the build to complete successfully.
- Adjust backup retention settings: In the PostgresCluster configuration, change the repo1-retention-full parameter to '5' or higher.

  Backup configuration example:

      global:
        log-level-file: detail
        repo1-path: /postgres-backup/smt-dev/operational
        repo1-retention-full: '5'  # (1)
        repo1-retention-full-type: count
        repo1-s3-uri-style: path
        repo1-storage-port: '443'
        repo1-storage-verify-tls: 'n'
        start-fast: 'y'

  Parameter explanation:
  (1) repo1-retention-full: '5' specifies the number of retained full backups. In this case, the system keeps the latest five full backups.
- Create new full backups of instances: Use standard PostgreSQL tools to create new full backups for the analytical and operational instances.

For detailed instructions on creating a one-time backup, see One-time backup creation.
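The retention requirement from this step can be sanity-checked before the new backups are created. The sketch below assumes the global section from the example has already been loaded into a Python dict. The function name retention_is_sufficient and the sample dict are illustrative only and are not part of any Platform tooling.

```python
def retention_is_sufficient(global_config: dict, minimum: int = 5) -> bool:
    """Return True if repo1-retention-full is present and at least `minimum`."""
    raw = global_config.get("repo1-retention-full")
    if raw is None:
        return False
    try:
        # The value is stored as a quoted string in the YAML ('5').
        return int(raw) >= minimum
    except (TypeError, ValueError):
        return False


# Sample values copied from the configuration example in this step.
example_global = {
    "repo1-path": "/postgres-backup/smt-dev/operational",
    "repo1-retention-full": "5",
    "repo1-retention-full-type": "count",
}

print(retention_is_sufficient(example_global))
```

A missing or non-numeric value is treated as insufficient, so the check fails on the safe side.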
Step 4: Preparing configurations
- Save current configurations:
  - Save the YAML files for the PostgresCluster resources (analytical and operational).
  - Save the postgres user passwords from the operational-pguser-postgres and analytical-pguser-postgres secrets.
- Update configuration for new disks. Modify the following fields in the saved resource files:

  Updating configuration for new disks:

      instances:
        - dataVolumeClaimSpec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 21Gi  # (1)
            storageClassName: thin  # (2)

  Parameter explanation:
  (1) storage: 21Gi sets the volume size for the new instance disk. Adjust this value as needed.
  (2) storageClassName: thin specifies the storage class for the new disk. In this example, thin (vSphere) is used. Other options include thin-csi (vSphere) or gp3 (AWS).
- Remove the following fields: metadata, finalizers, status.

Make sure to remove the annotations related to full backups that were added in Step 3.
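The manifest edits in this step can also be scripted, which helps when the same changes are applied to both instances. The following is a minimal sketch, assuming the saved PostgresCluster resource has been loaded into a Python dict. The guide does not spell out which metadata fields to remove, so the sketch assumes it means the server-populated ones (uid, resourceVersion, creationTimestamp, generation, managedFields) plus finalizers, and keeps the name, namespace, and labels. The function name prepare_manifest and the annotation key in the example are hypothetical.

```python
import copy

# Assumption: "metadata" in the guide refers to these server-populated fields.
SERVER_METADATA_FIELDS = (
    "uid", "resourceVersion", "creationTimestamp",
    "generation", "managedFields", "finalizers",
)


def prepare_manifest(manifest, storage_class, size, drop_annotations=()):
    """Return a cleaned copy of a saved manifest with the new disk settings."""
    result = copy.deepcopy(manifest)  # never mutate the saved original
    result.pop("status", None)

    metadata = result.get("metadata") or {}
    for field in SERVER_METADATA_FIELDS:
        metadata.pop(field, None)
    annotations = metadata.get("annotations") or {}
    for key in drop_annotations:
        annotations.pop(key, None)

    # Point every instance at the new storage class and volume size.
    for instance in result.get("spec", {}).get("instances", []):
        claim = instance.setdefault("dataVolumeClaimSpec", {})
        claim["storageClassName"] = storage_class
        requests = claim.setdefault("resources", {}).setdefault("requests", {})
        requests["storage"] = size
    return result


# Illustrative input; the annotation key is a placeholder, not a real one.
saved = {
    "kind": "PostgresCluster",
    "metadata": {
        "name": "operational",
        "uid": "1234",
        "finalizers": ["example/finalizer"],
        "annotations": {"example.com/manual-backup": "2024-05-01"},
    },
    "spec": {"instances": [{"dataVolumeClaimSpec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "20Gi"}},
        "storageClassName": "ceph-rbd",
    }}]},
    "status": {"observedGeneration": 3},
}

cleaned = prepare_manifest(
    saved, "thin", "21Gi", drop_annotations=("example.com/manual-backup",)
)
print(cleaned["spec"]["instances"][0]["dataVolumeClaimSpec"]["storageClassName"])
```

Working on a deep copy keeps the saved file contents intact, so the original configuration remains available if the migration has to be rolled back.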
Step 5: Verifying backups
Check backup status:
- Wait at least 15–20 minutes after backup creation.
- Check the PostgreSQLDetails Grafana dashboard to confirm that the latest backups are marked "successful".

Confirm that the latest backups are recent, that is, created shortly before this check.
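The recency check can be expressed as a simple age comparison. The sketch below assumes the completion time of the latest backup has been read manually from the dashboard. The one-hour threshold and the function name backup_is_fresh are assumptions for illustration, since the guide does not define an exact limit.

```python
from datetime import datetime, timedelta, timezone


def backup_is_fresh(finished_at, now, max_age=timedelta(hours=1)):
    """Return True if the backup finished before `now` and within `max_age`."""
    age = now - finished_at
    # A negative age means the timestamp lies in the future, which is suspicious.
    return timedelta(0) <= age <= max_age


# Illustrative timestamps only.
finished = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)
checked = datetime(2024, 5, 1, 12, 20, tzinfo=timezone.utc)

print(backup_is_fresh(finished, checked))
```

Using timezone-aware timestamps on both sides avoids false results when the dashboard and the operator's workstation are in different time zones.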
Step 6: Recreating the operational instance
- Create a new instance:
  - Delete the current operational resource.
  - Create a new resource using the updated configuration from Step 4.
- Verify component restoration:
  - Ensure that all components of the operational resource are restored.
  - Verify the new disk is attached to the correct storage class.
  - Restore the user password. Insert the saved postgres password from Step 4 into the newly created operational-pguser-postgres secret.
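The storage class verification from this step can be done against exported PersistentVolumeClaim objects. This is a minimal sketch, assuming the PVCs of the recreated instance have been exported as JSON and loaded into a list of dicts; spec.storageClassName is the standard PVC field, while the function name and the PVC names in the example are hypothetical.

```python
def pvcs_with_wrong_class(pvcs, expected_class):
    """Return names of PVCs whose spec.storageClassName is not expected_class."""
    wrong = []
    for pvc in pvcs:
        actual = pvc.get("spec", {}).get("storageClassName")
        if actual != expected_class:
            wrong.append(pvc.get("metadata", {}).get("name", "<unnamed>"))
    return wrong


# Illustrative PVC objects; the names are placeholders.
example_pvcs = [
    {"metadata": {"name": "operational-data"}, "spec": {"storageClassName": "thin"}},
    {"metadata": {"name": "operational-old"}, "spec": {"storageClassName": "ceph-rbd"}},
]

print(pvcs_with_wrong_class(example_pvcs, "thin"))
```

An empty result means every listed claim uses the expected storage class.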
Step 7: Restoring the operational database
Restore the database: Use standard PostgreSQL tools to restore the database from the created backup.
Detailed restoration guide: Restore to target time or backup.
Use either point-in-time recovery or restore from a specific backup. If using point-in-time recovery, it's recommended to select a time 10–15 minutes before the backup creation time from Step 3 to ensure data completeness.
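For point-in-time recovery, the target time follows directly from the backup creation time. The sketch below subtracts the recommended offset and prints the result as a timestamp with an explicit UTC offset. The function name pitr_target and the output format are assumptions; adjust the format to whatever the restoration guide expects.

```python
from datetime import datetime, timedelta, timezone


def pitr_target(backup_created_at, offset_minutes=15):
    """Return a recovery target `offset_minutes` before the backup creation time."""
    if backup_created_at.tzinfo is None:
        # Naive timestamps are ambiguous, so refuse them outright.
        raise ValueError("backup_created_at must be timezone-aware")
    target = backup_created_at - timedelta(minutes=offset_minutes)
    return target.isoformat(sep=" ", timespec="seconds")


# Illustrative backup creation time.
created = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)

print(pitr_target(created))
```

Rejecting naive timestamps is a deliberate choice here: a recovery target without a time zone can silently shift by several hours.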
Step 8: Recreating and restoring the analytical instance
Repeat Steps 6 and 7 for the analytical instance.
Step 9: Post-migration verification
Check instance logs:
- Ensure there are no errors in the analytical and operational pod logs.
- Verify that replication to the analytical instance is functioning correctly.
Step 10: Updating registry-postgres component code
Update the registry-postgres component settings in your registry. Modify the code to reflect the new disk types and sizes:
- For thin (vSphere): hardcode the values.
- For thin-csi (vSphere) and gp3 (AWS): use values.yaml.

It's recommended to make these changes in a new branch and update the registry's helmfile.yaml.
Step 11: Running pipelines
Run registry pipelines:
- Execute the MASTER-Build pipeline in the Platform's central Jenkins.
- (Optional) After a successful MASTER-Build, run the registry's Jenkins pipeline to publish the regulation.
Step 12: Finalizing the migration
- Check pipeline results: Ensure all pipelines completed successfully.
- Reopen external registry access: Restore access for users.
4. Conclusion
After completing all steps, both the operational and analytical instances should operate with new disks in the appropriate storage classes, with all data restored from backups.