In this guide I will give you a quick overview of how to troubleshoot Storage Spaces, covering ordinary storage pools with and without tiering as well as Storage Spaces Direct (S2D). The troubleshooting is based on an issue we had with our test Storage Spaces Direct cluster.

What happened was that we started to experience really bad performance on the VMs. Response times were going through the roof: we saw response times of 8000 ms on normal OS operations. We traced it down to faulty SSD drives. These were Kingston V310 consumer SSDs without power loss protection, and that is a problem because S2D, and Windows storage in general, wants its writes to land in a safe place. The caching on these Kingston drives worked for a while, but after too much writing it failed. You can read all about SSDs and power loss protection here.

So what I will cover here is how to identify slow disks.

First you will notice performance issues on your VMs or physical server. Open Resource Monitor and look at the response times.


First rule out RDMA if this is a Scale-Out File Server or S2D.

Download this PowerShell script and run it on your server.

You should get output like this. Look for the SMB Connectivity Error Check and at the errors in the last 5 minutes. Run the test a few times and see whether the numbers increase, as that is an indication of RDMA problems.


As you can see there were some RDMA issues here, but nothing serious. Node 4 is being looked into. I will come back with a guide for that.
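If you want a quick look at SMB connectivity errors without the script, a rough sketch like the one below can help. It counts recent events in the SMB client connectivity log a few times in a row, so you can see whether the number grows, and then lists the RDMA state of the adapters and SMB connections. The log name `Microsoft-Windows-SmbClient/Connectivity`, the 5 minute window and the 60 second pause are assumptions for illustration, not taken from the script mentioned above.

```powershell
# Sketch only: count SMB client connectivity events from the last 5 minutes,
# three times in a row, to see whether the number keeps growing.
# Assumption: this log name exists on your Windows Server version.
$logName = 'Microsoft-Windows-SmbClient/Connectivity'

1..3 | ForEach-Object {
    $events = @(Get-WinEvent -FilterHashtable @{
        LogName   = $logName
        StartTime = (Get-Date).AddMinutes(-5)
    } -ErrorAction SilentlyContinue)   # no matching events is not an error here

    '{0:HH:mm:ss}  events in the last 5 min: {1}' -f (Get-Date), $events.Count
    Start-Sleep -Seconds 60
}

# Is RDMA enabled on the adapters, and do the SMB connections report it?
Get-NetAdapterRdma | Format-Table Name, Enabled -AutoSize
Get-SmbMultichannelConnection |
    Format-Table ServerName, ClientIpAddress, ServerIpAddress, ClientRdmaCapable -AutoSize
```

A count that keeps climbing between runs points in the same direction as increasing numbers in the script output.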

Now let’s have a look at the disks on the physical machine.

If you have S2D, you should check that all drives in the storage pool are attached to the caching device. I have NVMe as the caching device, and SSD and HDD as performance and capacity. The caching device will be the :0, and the disks will be 1:0, 2:0 and so on, depending on how many drives you have.

So what you do is run this command.

(get-counter -ListSet 'Cluster Storage Hybrid Disks').PathsWithInstances


And you will get this list. It’s pretty long. I have 6 HDDs and 2 SSDs, 8 drives in total, and all 8 are attached to the caching device. This is important.
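Since the list is long, it can be easier to reduce it to the unique instance names and count them. This is a minimal sketch that assumes each path has the usual form `\Counter Set(instance)\Counter Name`; the variable names are only for illustration.

```powershell
# Pull the instance name out of every counter path and keep each one once.
$paths = (Get-Counter -ListSet 'Cluster Storage Hybrid Disks').PathsWithInstances

$instances = @(
    $paths |
        ForEach-Object { if ($_ -match '\(([^)]+)\)') { $Matches[1] } } |
        Sort-Object -Unique
)

$instances
"Instances found: $($instances.Count)"
```

The count may include an aggregate instance such as `_Total`, so compare the names, not only the number, against the drives you expect.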


Now let’s jump to Performance Monitor and add a counter there.


Now, to know which disk is an SSD and which is an HDD, you will need to run this next command. Disk 0 is the caching device.

gwmi -Namespace root\wmi ClusPortDeviceInformation | sort ConnectedNode,ConnectedNodeDeviceNumber,ProductId | ft ConnectedNode,ConnectedNodeDeviceNumber,ProductId
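`gwmi` is an alias for `Get-WmiObject`, which is not available in PowerShell 6 and later. If that applies to you, the same query can be written with `Get-CimInstance`. This is a sketch of an equivalent command, using the same class and property names as the command above:

```powershell
# Same query as the gwmi command, written with the CIM cmdlets.
Get-CimInstance -Namespace root\wmi -ClassName ClusPortDeviceInformation |
    Sort-Object ConnectedNode, ConnectedNodeDeviceNumber, ProductId |
    Format-Table ConnectedNode, ConnectedNodeDeviceNumber, ProductId -AutoSize
```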



For a single node the command is

Now we can go back to Performance Monitor. We can see on node 1 that the SSDs are disks 4 and 5. In the screenshot I have here, the SSDs were disks 1 and 2.


If you look closely at Avg. Disk sec/Read, Avg. Disk sec/Transfer and Avg. Disk sec/Write, you will see high numbers. These counters are in seconds and should be around 0.001 to 0.003 (1 to 3 ms), or as low as possible. We saw numbers in the range of 1.870, which is close to two seconds. And note that these are average numbers, not maximums, so you can imagine how high the latency peaks were. This was caused by Windows Server telling the disk to write to a safe place. When the SSD’s cache is not working, the safe place is directly on the NAND cells, which manage about 200 IOPS. The SSD should do 25k IOPS when writing.
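The same latency counters can also be sampled from PowerShell, which is handy when you want to check several nodes quickly. The sketch below assumes the standard `PhysicalDisk` counter set on an English-language system (counter names are localized) and uses 0.003 seconds as the threshold, taken from the range mentioned above. The sample interval and count are arbitrary choices.

```powershell
# Threshold in seconds: 0.003 s = 3 ms, the upper end of the healthy range.
$thresholdSec = 0.003

$counters = '\PhysicalDisk(*)\Avg. Disk sec/Read',
            '\PhysicalDisk(*)\Avg. Disk sec/Write',
            '\PhysicalDisk(*)\Avg. Disk sec/Transfer'

# Take 5 samples, 2 seconds apart, and list every disk above the threshold.
Get-Counter -Counter $counters -SampleInterval 2 -MaxSamples 5 |
    Select-Object -ExpandProperty CounterSamples |
    Where-Object { $_.InstanceName -ne '_total' -and $_.CookedValue -gt $thresholdSec } |
    Sort-Object CookedValue -Descending |
    Format-Table Timestamp, Path,
        @{ Name = 'Seconds'; Expression = { [math]::Round($_.CookedValue, 4) } } -AutoSize
```

Disks that show up in every sample with values far above the threshold are the ones to look at first.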

To replace failed drives in a storage pool I used this guide. I will paste it in here for reference.

# Find the failed disk
Get-PhysicalDisk

# Shut down, take the disk out and reboot. Set the missing disk to a variable
$missingDisk = Get-PhysicalDisk | Where-Object { $_.OperationalStatus -eq 'Lost Communication' }

# Retire the missing disk
$missingDisk | Set-PhysicalDisk -Usage Retired

# Find the name of your new disk
Get-PhysicalDisk

# Set the replacement disk object to a variable
$replacementDisk = Get-PhysicalDisk -FriendlyName PhysicalDisk1

# Add the replacement disk to the storage pool
Add-PhysicalDisk -PhysicalDisks $replacementDisk -StoragePoolFriendlyName pool

# Repair each Volume
Repair-VirtualDisk -FriendlyName <VolumeName>

# Get the status of the rebuilds
Get-StorageJob

# Remove failed virtual disks (only those that could not be repaired; this deletes the virtual disk and its data)
Remove-VirtualDisk -FriendlyName <FriendlyName>

# Remove the failed Physical Disk from the pool
Remove-PhysicalDisk -PhysicalDisks $missingDisk -StoragePoolFriendlyName pool
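
After the replacement it is worth confirming that the rebuild has finished and that everything reports healthy again. The following is a minimal sketch and not part of the guide quoted above. The 30 second polling interval is an arbitrary choice.

```powershell
# Wait until no storage job is still running, showing progress along the way.
while (Get-StorageJob | Where-Object { $_.JobState -eq 'Running' }) {
    Get-StorageJob | Format-Table Name, JobState, PercentComplete -AutoSize
    Start-Sleep -Seconds 30
}

# Then check the health of the virtual disks and the physical disks.
Get-VirtualDisk |
    Format-Table FriendlyName, HealthStatus, OperationalStatus -AutoSize
Get-PhysicalDisk |
    Format-Table FriendlyName, MediaType, Usage, HealthStatus, OperationalStatus -AutoSize
```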