A few days ago a friend of mine asked me if I had any idea how to get his SSDs and HDDs bound to the NVMe cache devices, as he had added a lot of disks to his S2D cluster over the last 7 months. The normal behavior is that any new disk is automatically bound to a cache device.
He noticed things were getting slower and slower, so we ran this command to check the bind list of drives against the NVMe cache devices. It turned out that only 6 of the 27 drives per node were bound to the cache devices, which means that IO to the unbound drives bypasses the cache. And that is not good.
(get-counter -ListSet 'Cluster Storage Hybrid Disks').PathsWithInstances
That will give you an output like this:
Now what you will see is that if you have 2 NVMe cache devices, every other SSD and HDD binds to one of the NVMe devices, so half of the disks sit on one NVMe device and the other half on the other. If you have 20 drives per server, you should see all 20 disks in the list, bound across the 2 NVMe cache devices.
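If the list is long, a quick way to read it is to group the counter instances by cache device and count how many capacity drives are bound to each one. This is only a sketch: it assumes the instance names follow the disk:cache pattern shown in the output (e.g. 10:1), so adjust the parsing to whatever your output actually looks like.

#Grab the counter paths and pull out the instance names between the parentheses
$instances = (Get-Counter -ListSet 'Cluster Storage Hybrid Disks').PathsWithInstances |
    ForEach-Object { if ($_ -match '\(([^)]+)\)') { $Matches[1] } } |
    Sort-Object -Unique

#Group the disk:cache pairs by the cache device (the second number) and count the bound disks
$instances | Where-Object { $_ -match '^\d+:\d+$' } |
    Group-Object { ($_ -split ':')[1] } |
    Select-Object @{n='CacheDevice';e={$_.Name}}, @{n='BoundDisks';e={$_.Count}}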
Now if this is not the case you will need to rebind them. In many cases you can do this without moving any roles off the nodes, but it requires that you have disks in the same slots in the enclosures and enough free space to fail data over. Note that this was done on a 2-node cluster.
- Let’s find the disks that are missing. In the screenshot above, the first number in each instance is the physical disk and the second is the cache device. So if disk 10 is not bound, it would for instance have shown up as 10:1.
gwmi -Namespace root\wmi ClusPortDeviceInformation | sort ConnectedNode,ConnectedNodeDeviceNumber,ProductId | ft ConnectedNode,ConnectedNodeDeviceNumber,ProductId
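Before retiring anything in the next step, it can help to list the candidate drives by slot so you know exactly which SlotNumber and FriendlyName values to plug in. A quick sketch (the *SSDSC2BB01* model filter is just the example model used in this post, change it to match your drives):

#List the capacity drives with the properties used in the rebind step below
Get-PhysicalDisk |
    ? { $_.FriendlyName -like "*SSDSC2BB01*" } |
    Sort-Object SlotNumber |
    ft SlotNumber, FriendlyName, SerialNumber, MediaType, Usage, HealthStatus -AutoSize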
- Let’s say you want to rebind disk 10 in both enclosures. You would run the following commands to remove and rebind it. Now I don’t have 10 disks in this server, but let’s imagine 🙂
#This captures the physical disks you want to remove
$pd = Get-PhysicalDisk | ? {$_.SlotNumber -eq "10" -and ($_.FriendlyName -like "*SSDSC2BB01")}

#Set the disks as retired
Get-PhysicalDisk | ? {$_.SlotNumber -eq "10" -and ($_.FriendlyName -like "*SSDSC2BB01")} | Set-PhysicalDisk -Usage Retired

#Remove the physical disks and answer yes
Remove-PhysicalDisk -StoragePoolFriendlyName s2d* -PhysicalDisks $pd

#There are 2 ways to do the next step, so you can use either of these 2 PS commands
Enable-ClusterS2D -Autoconfig:0 -CacheState Enabled -Verbose
Repair-ClusterS2D -RecoverUnboundDrives

#Now add the disks back to the pool again
Add-PhysicalDisk -StoragePoolFriendlyName s2d* -PhysicalDisks $pd

#Run this again to check that the disks are bound on the server again. Needs to be checked on all nodes.
(Get-Counter -ListSet 'Cluster Storage Hybrid Disks').PathsWithInstances

#If a disk is still missing you will need to do the following
Repair-ClusterS2D -RecoverUnboundDrives -Node "Name of Node"

#Then run
Enable-ClusterS2D -Autoconfig:0 -CacheState Enabled -Verbose

#There will be some repairs and a rebalance going on, so make sure all jobs are done before doing the next drives. Check that by running Get-StorageJob
- Now the disks you just ran this on should be bound again. If you have more disks, you will need to start again from the top. Do not do this until all storage jobs are done (a small wait loop for that is sketched below).
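Since you have to wait for the repairs between each set of drives, a small wait loop saves you from running Get-StorageJob by hand. Just a sketch, and the 30 second interval is arbitrary:

#Wait until no storage jobs are still running before moving on to the next drives
while (Get-StorageJob | ? { $_.JobState -eq 'Running' }) {
    Get-StorageJob | ft Name, JobState, PercentComplete -AutoSize
    Start-Sleep -Seconds 30
}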
I’d like to thank my good friend Vidar Friis for providing this solution after having this issue on his production cluster.
Reference: https://jtpedersen.com/2017/11/how-to-rebind-mirror-or-performance-drives-back-to-s2d-cache-device/