Replace/Install a new SSD in Server

We use Ceph, so RAID is unnecessary for our installation, for the following reasons:
 
* Built-in Redundancy: Ceph provides data replication and redundancy across multiple OSDs (Object Storage Daemons) in the cluster. It automatically replicates data to ensure availability, so RAID's redundancy is redundant (the replica count can be checked as shown below).
* Performance Impact: RAID can introduce additional latency and reduce the performance of SSDs. Ceph is designed to work efficiently with raw disks, and adding RAID can slow down operations.
* Wasted Resources: Using RAID means you're dedicating some of your disk capacity to redundancy (like RAID 1 mirroring). Ceph already replicates data across multiple disks or nodes, so this would lead to unnecessary resource usage.
* Complexity: RAID adds another layer of complexity that isn't needed. Managing disks individually with Ceph simplifies management and reduces potential points of failure.
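Since we lean on Ceph's replication instead of RAID, it is worth confirming the replica count of the pools on the cluster. A minimal check from any node (the pool name `rbd` is only an example):
```
# Show every pool with its replication settings: "size" is the number of copies
# Ceph keeps of each object, "min_size" is the minimum needed to keep serving I/O.
ceph osd pool ls detail

# Or query a single pool directly (replace "rbd" with the actual pool name):
ceph osd pool get rbd size
ceph osd pool get rbd min_size
```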
 
== How to Install a New SSD ==
=== Physically Install the SSD ===
* Create a maintenance period in Zabbix for the server.
* Check the slot number of the empty drive bays in the iDRAC (or check which SSDs are blinking/active to identify the empty bay); see the RACADM sketch after this list.
* Attach the caddy to the SSD.
* Pull out the empty drive bay.
* Physically install the SSD.
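If the iDRAC's RACADM CLI is reachable (for example over SSH), the bays can also be listed and a drive LED blinked from the command line to avoid pulling the wrong caddy. This is a sketch only; the disk FQDD below is an example and must be replaced with one reported by the first command:
```
# List the physical disks (and empty bays) the storage controller reports, with their FQDDs
racadm storage get pdisks -o

# Blink the bay LED to confirm the right slot (example FQDD, adjust to your system)
racadm storage blink:Disk.Bay.4:Enclosure.Internal.0-1:RAID.Integrated.1-1

# Stop blinking once the bay has been located
racadm storage unblink:Disk.Bay.4:Enclosure.Internal.0-1:RAID.Integrated.1-1
```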
 
=== Verify the SSD is Recognized by the System ===
Log in to Proxmox and run `lsblk` to list all block devices and verify that the new SSD is recognized.
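A slightly more detailed `lsblk` invocation makes the new drive easier to spot; the `smartctl` check is optional and assumes the `smartmontools` package is installed (`/dev/sdX` is a placeholder for the new device):
```
# List block devices with size, model and serial so the new SSD stands out
lsblk -o NAME,SIZE,MODEL,SERIAL,TRAN

# Optional: confirm the drive's identity before handing it to Ceph
smartctl -i /dev/sdX
```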
 
=== Prepare the SSD for Ceph ===
Install the necessary packages:
```
apt-get update
apt-get install ceph ceph-osd
```
Then:
* Create and add the OSD: `pveceph createosd /dev/sdX`
* Run `ceph osd tree` to make sure the new OSD has been added.
* Check the Ceph status: `ceph -s`
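Once the OSD is created, Ceph starts backfilling data onto the new disk. A small sketch for keeping an eye on that and waiting for the cluster to become healthy again before closing the Zabbix maintenance window (interval values are arbitrary):
```
# Watch recovery/backfill progress; the new OSD should appear with a weight and start filling up
watch -n 10 'ceph -s; echo; ceph osd df tree'

# Or block until the cluster reports HEALTH_OK again
until ceph health | grep -q HEALTH_OK; do
    sleep 30
done
echo "Ceph is healthy again"
```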
 
=== Change the RAID configuration ===

Most people change the RAID configuration from within the OS, which avoids downtime. We don't have that configuration utility installed, so we usually need to configure the RAID through the iDRAC or the boot utility.
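When the change has to go through the iDRAC, RACADM can convert a disk to Non-RAID without entering the boot utility. This is a sketch under the assumption of a PERC controller behind an iDRAC; the FQDDs are examples, and depending on the controller and iDRAC version the queued job may still need a reboot to apply:
```
# Find the FQDDs of the new disk and of the RAID controller
racadm storage get pdisks
racadm storage get controllers

# Convert the disk to Non-RAID so the OS (and Ceph) sees it as a raw device (example FQDD)
racadm storage converttononraid:Disk.Bay.4:Enclosure.Internal.0-1:RAID.Integrated.1-1

# Queue the configuration job on the controller; progress can be followed with "racadm jobqueue view"
racadm jobqueue create RAID.Integrated.1-1
```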
 
=== This is what we did before ===
* Before rebooting the server, create the OSD in Proxmox and check `ceph pg dump | grep remapped` to make sure there are no remapped PGs and we have multiple copies in Scorpion; only then proceed with the reboot.
* Reboot the server to put the SSD in Non-RAID mode (we did it this way, but it is not clear it is the best approach).
* After rebooting, the SSD should be detected; check in the iDRAC whether the SSD is blinking.
* Go to the Lifecycle Controller and boot the server; the SSD should show up in the RAID controller.
* Check that the SSD shows up in the system diagnostics and is recognized by the server.
* Make sure Ceph is healthy.
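The pre-reboot check described above boils down to a few commands; if `ceph pg dump | grep remapped` prints nothing and the status shows HEALTH_OK with all PGs active+clean, every placement group has its full number of copies and the reboot is safe:
```
# No output here means no placement groups are remapped / still moving data
ceph pg dump | grep remapped

# Overall status: look for HEALTH_OK and all PGs reported as active+clean
ceph -s
ceph health detail
```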
