After my initial failure replacing an NVMe caching card, where I hit a bug in the 2016 version I was on, I replaced another one today. We started our cluster out with Intel 750 drives, and these NVMe PCIe cards are only rated for 70 GB of writes per day, so I decided to replace them with the Intel DC P3600. The first attempt failed, as can be seen here.
Drain the node and pause it, then shut it down.
Replace the NVMe card and boot the node back up.
Open PowerShell and run the following to see the new NVMe card.
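The drain, pause and shutdown can also be done from PowerShell on another cluster node. This is only a rough sketch: the node name `S2D-Node01` is a made-up placeholder, and it assumes the FailoverClusters module is available where you run it.

```powershell
# Hypothetical node name, replace with the node that holds the card.
$node = "S2D-Node01"

# Pause the node and drain its roles; -Wait blocks until the drain is done.
Suspend-ClusterNode -Name $node -Drain -Wait

# Confirm the node shows as Paused before powering it off.
Get-ClusterNode -Name $node

# Shut the node down so the card can be swapped.
Stop-Computer -ComputerName $node -Force
```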
Get-PhysicalDisk -CanPool $true
Now let’s set the old NVMe card to Retired.
Get-PhysicalDisk -Usage Journal -HealthStatus Warning | Set-PhysicalDisk -Usage Retired
This will retire the disk. Run the following to confirm it:
Get-PhysicalDisk -Usage Retired
Now let’s add the new NVMe disk. It is important to use the full name of the storage pool here.
Add-PhysicalDisk -PhysicalDisks (Get-PhysicalDisk -CanPool $True) -StoragePoolFriendlyName "Full name of storage pool"
Now let’s set the disk as a Journal drive and see how it looks afterwards. S2D might even do this automatically.
Set-PhysicalDisk -FriendlyName "INTEL SSDPEDME400G4" -Usage Journal
Get-PhysicalDisk -FriendlyName Intel*
Now run this command
Repair-ClusterS2D -RecoverUnboundDrives -Node (Name of node)
Now it’s added and you can resume the node. Then pause the node and resume it again, and the disks will go out of maintenance mode.
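The resume, pause, resume cycle could look roughly like this from PowerShell. Again, `S2D-Node01` is a placeholder name, and the `-Failback Immediate` choice is my assumption; use whatever failback policy you normally run with.

```powershell
# Hypothetical node name, replace with your own.
$node = "S2D-Node01"

# First resume after the card swap.
Resume-ClusterNode -Name $node -Failback Immediate

# Pause and resume once more to get the disks out of maintenance mode.
Suspend-ClusterNode -Name $node -Drain -Wait
Resume-ClusterNode -Name $node -Failback Immediate

# Check the operational status of the disks afterwards.
Get-PhysicalDisk | Select-Object FriendlyName, Usage, OperationalStatus, HealthStatus
```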
Finally, remove the retired disk from the storage pool.
$faileddisk = Get-PhysicalDisk -Usage Retired
$faileddisk
Remove-PhysicalDisk -PhysicalDisks $faileddisk -StoragePoolFriendlyName "Use full name of storagepool"
Get-PhysicalDisk -Usage Retired
If you run Get-PhysicalDisk -Usage Retired again, you will see that the retired disk is gone.
After you enable the node again, the storage rebuild job will run. This needs to finish before you replace another drive, and you won’t be able to pause a node before it’s finished.
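To keep an eye on the rebuild, something like the loop below can be used. It is just a sketch: the 30 second interval is arbitrary, and it assumes that finished jobs report a JobState of Completed, so check what Get-StorageJob shows on your own cluster first.

```powershell
# Poll the storage jobs until none are left unfinished.
while ($true) {
    # Assumption: jobs that are done report JobState 'Completed'.
    $jobs = Get-StorageJob | Where-Object { $_.JobState -ne 'Completed' }
    if (-not $jobs) { break }

    # Show progress of the jobs that are still running.
    $jobs | Format-Table Name, JobState, PercentComplete, BytesProcessed, BytesTotal
    Start-Sleep -Seconds 30
}
Write-Host "No unfinished storage jobs, safe to move on to the next node."
```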