Is storage expensive – part 5 – Monitoring

Storage Spaces events are logged in the Microsoft-Windows-StorageSpaces-Driver/Operational log. A SCOM rule can be created to alert on these events.

Event Criticality Description
100 Error Storage Spaces – Failed to read the storage pool configuration
102 Error Storage Spaces – Majority of the physical drives of storage pool failed a configuration update
103 Error Storage Spaces – The capacity consumption of the storage pool has exceeded the threshold limit set on the pool
104 Information Storage Spaces – The capacity consumption of the storage pool is now below the threshold limit set on the pool
200 Error Storage Spaces – Windows was unable to read the drive header for physical drive
201 Error Storage Spaces – Physical drive has invalid meta-data
202 Error Storage Spaces – Physical drive has invalid meta-data
203 Error Storage Spaces – An IO failure has occurred on Physical drive
300 Error Storage Spaces – Physical drive failed to read the configuration or returned corrupt data for storage space
301 Error Storage Spaces – All pool drives failed to read the configuration or returned corrupt data for storage space
302 Error Storage Spaces – Majority of the pool drives hosting space meta-data for storage space failed a space meta-data update
303 Error Storage Spaces – Drives hosting data for storage space have failed or are missing
304 Warning Storage Spaces – One or more drives hosting data for storage space have failed or are missing. The virtual disk is in a degraded state
305 Information Storage Spaces – A Virtual disk is now healthy
306 Error Storage Spaces – The attempt to map, or allocate more storage for, the storage space has failed
308 Information Storage Spaces – A repair attempt for storage space was initiated by the driver
400 Information Storage Spaces – A path to storage enclosure has been detected
401 Information Storage Spaces – A path to a storage enclosure has been removed
402 Warning Storage Spaces – The health of a storage enclosure has changed from Healthy to Unhealthy
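
Before building the SCOM rules, it can help to check on a host whether any of these events have actually been logged. The following is a minimal sketch, not part of the original procedure: `New-StorageSpacesEventFilter` is a hypothetical helper name, and the default ID list (the Error and Warning events from the table above) and the 24-hour lookback are assumptions to adjust as needed.

```powershell
# Hypothetical helper: builds a filter hashtable for Get-WinEvent that targets
# the Storage Spaces driver log. The default IDs are the Error and Warning
# events from the table above; the 24-hour lookback is an arbitrary assumption.
function New-StorageSpacesEventFilter {
    param(
        [int[]]$EventId = @(100, 102, 103, 200, 201, 202, 203, 300, 301, 302, 303, 304, 306, 402),
        [int]$LookbackHours = 24
    )
    @{
        LogName   = 'Microsoft-Windows-StorageSpaces-Driver/Operational'
        Id        = $EventId
        StartTime = (Get-Date).AddHours(-$LookbackHours)
    }
}

# Usage on a Storage Spaces host (commented out so the helper can be loaded anywhere):
# Get-WinEvent -FilterHashtable (New-StorageSpacesEventFilter) -ErrorAction SilentlyContinue |
#     Select-Object TimeCreated, Id, LevelDisplayName, Message
```

If the query returns nothing, either no matching events occurred in the lookback window or the log is not present on that machine.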

 

SCOM console → Authoring → Rules → Create a new rule

Create a new Management pack for this rule

 

 

 

 

Instead of creating the rest of the rules this way, I used the PowerShell method published by Russ Slaten, shortened a bit by using the manually created rule above as a reference. I opened the SCOM PowerShell session by right-clicking any alert and ran the script below.


# Load the management group, the custom management pack and the manually created base rule (event 100)
$MG = Get-SCOMManagementGroup -ErrorVariable SCOM_MG
$MP = Get-SCOMManagementPack -DisplayName "Windows Storage Spaces Alerts Custom"
$Rule_base = Get-SCOMRule -DisplayName "Storage Spaces - Failed to read the storage pool configuration"

# Event ID -> criticality
$Rule_criticality_list = @{
    "100"="Error"
    "102"="Error"
    "103"="Error"
    "104"="Information"
    "200"="Error"
    "201"="Error"
    "202"="Error"
    "203"="Error"
    "300"="Error"
    "301"="Error"
    "302"="Error"
    "303"="Error"
    "304"="Warning"
    "305"="Information"
    "306"="Error"
    "308"="Information"
    "400"="Information"
    "401"="Information"
    "402"="Warning"
}
# Event ID -> rule display name / alert text
$Rule_description_list = @{
    "100"="Storage Spaces - Failed to read the storage pool configuration"
    "102"="Storage Spaces - Majority of the physical drives of storage pool failed a configuration update"
    "103"="Storage Spaces - The capacity consumption of the storage pool has exceeded the threshold limit set on the pool"
    "104"="Storage Spaces - The capacity consumption of the storage pool is now below the threshold limit set on the pool"
    "200"="Storage Spaces - Windows was unable to read the drive header for physical drive"
    "201"="Storage Spaces - Physical drive has invalid meta-data"
    "202"="Storage Spaces - Physical drive has invalid meta-data"
    "203"="Storage Spaces - An IO failure has occurred on Physical drive"
    "300"="Storage Spaces - Physical drive failed to read the configuration or returned corrupt data for storage space"
    "301"="Storage Spaces - All pool drives failed to read the configuration or returned corrupt data for storage space"
    "302"="Storage Spaces - Majority of the pool drives hosting space meta-data for storage space failed a space meta-data update"
    "303"="Storage Spaces - Drives hosting data for storage space have failed or are missing"
    "304"="Storage Spaces - One or more drives hosting data for storage space have failed or are missing. The virtual disk is in a degraded state"
    "305"="Storage Spaces - A Virtual disk is now healthy"
    "306"="Storage Spaces - The attempt to map, or allocate more storage for, the storage space has failed"
    "308"="Storage Spaces - A repair attempt for storage space was initiated by the driver"
    "400"="Storage Spaces - A path to storage enclosure has been detected"
    "401"="Storage Spaces - A path to a storage enclosure has been removed"
    "402"="Storage Spaces - The health of a storage enclosure has changed from Healthy to Unhealthy"
}
# Event 100 is not in this list: its rule was created manually and serves as the template
"102","103","104","200","201","202","203","300","301","302","303","304","305","306","308","400","401","402" | ForEach-Object {
    $Rule_EventID = $_
    $Rule_criticality = $Rule_criticality_list[$Rule_EventID]
    $Rule_DisplayName = $Rule_description_list[$Rule_EventID]

    # Map the criticality to the alert severity value used in the rule configuration
    Switch ($Rule_criticality) {
        "Error" { $Alert_type = 2 }
        "Warning" { $Alert_type = 1 }
        "Information" { $Alert_type = 0 }
    }

    # Create a new rule with a unique ID in the custom management pack
    $RuleID = "ScriptedRule"+([guid]::NewGuid()).Guid.Replace("-","")
    $Rule = New-Object Microsoft.EnterpriseManagement.Configuration.ManagementPackRule($MP, $RuleID)

    # Data source: copy the base rule's event provider configuration and swap in the event ID.
    # Note: this replaces every occurrence of "100" in the configuration, not just the event ID.
    $DSModuleType = $MG.GetMonitoringModuleTypes("Microsoft.Windows.EventProvider")[0]
    $DSModule = New-Object Microsoft.EnterpriseManagement.Configuration.ManagementPackDataSourceModule($Rule, "DS")
    $DSModule.TypeID = [Microsoft.EnterpriseManagement.Configuration.ManagementPackDataSourceModuleType]$DSModuleType
    $DSModule.Configuration = $Rule_base.DataSourceCollection.Configuration.Replace("100",$Rule_EventID)
    $Rule.DataSourceCollection.Add($DSModule)

    # Write action: copy the base rule's GenerateAlert configuration and point it at the new rule
    $AlertMessageID = '{0}.AlertMessage' -f $RuleID
    $AlertMessageObject = New-Object Microsoft.EnterpriseManagement.Configuration.ManagementPackStringResource($MP, $AlertMessageID)
    $AlertMessageObject.DisplayName = $Rule_DisplayName
    $WAModuleType = $MG.GetMonitoringModuleTypes("System.Health.GenerateAlert")[0]
    $WAModule = New-Object Microsoft.EnterpriseManagement.Configuration.ManagementPackWriteActionModule($Rule, "GenerateAlert")
    $WAModule.TypeID = [Microsoft.EnterpriseManagement.Configuration.ManagementPackWriteActionModuleType]$WAModuleType
    $WAModule.Configuration = $Rule_base.WriteActionCollection.Configuration.Replace($Rule_base.Name,$Rule.Name)
    # Replace only the Severity element. A bare Replace("2", ...) would also change any "2" in the
    # GUID-based rule name and break the alert message reference.
    # Assumption: the base rule's alert severity is Error, stored as <Severity>2</Severity>.
    $WAModule.Configuration = $WAModule.Configuration.Replace("<Severity>2</Severity>","<Severity>$Alert_type</Severity>")
    $Rule.WriteActionCollection.Add($WAModule)

    # Copy target and category from the base rule, enable the rule and commit it to the management pack
    $Rule.Target = $Rule_base.Target
    $Rule.Category = $Rule_base.Category
    $Rule.DisplayName = $Rule_DisplayName
    $Rule.Enabled = [Microsoft.EnterpriseManagement.Configuration.ManagementPackMonitoringLevel]::True
    $MP.Verify()
    $MP.AcceptChanges()
}

This is a very narrowly targeted script with no error checking, so it will fail if any of its inputs, such as the name of the base rule, differ. With these rules in place, Storage Spaces issues start triggering alerts in SCOM.
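
One way to add the missing checks is to confirm that each lookup returned exactly one object before the script touches the management pack. The following is only a sketch: `Assert-SingleResult` is a hypothetical helper name, not part of the SCOM module or of Russ Slaten's script.

```powershell
# Hypothetical guard: returns the single non-null item in $Value,
# or throws if the lookup returned nothing or more than one object.
function Assert-SingleResult {
    param(
        [object[]]$Value,
        [Parameter(Mandatory)][string]$What
    )
    # Drop nulls so that an empty lookup counts as zero results
    $items = @($Value | Where-Object { $null -ne $_ })
    if ($items.Count -ne 1) {
        throw "Expected exactly one $What but found $($items.Count)."
    }
    $items[0]
}

# Usage in the SCOM shell (commented out because it needs the SCOM cmdlets):
# $MP = Assert-SingleResult -What 'management pack' -Value (Get-SCOMManagementPack -DisplayName "Windows Storage Spaces Alerts Custom")
# $Rule_base = Assert-SingleResult -What 'base rule' -Value (Get-SCOMRule -DisplayName "Storage Spaces - Failed to read the storage pool configuration")
```

With guards like these, a mistyped display name stops the script at the lookup instead of failing later inside the loop.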


For physical disks, SMART (Self-Monitoring, Analysis and Reporting Technology) information comes in handy. The Get-StorageReliabilityCounter cmdlet retrieves it for a single disk or for all disks. For a single disk, filter on its ObjectId:

Get-PhysicalDisk | Where-Object { $_.ObjectId -like "*{05aa4f22-0162-f757-4d5e-6b93c8ec3bc4}*" } | Get-StorageReliabilityCounter

To get the values for all disks, drop the filter:

Get-PhysicalDisk | Get-StorageReliabilityCounter

High values in the read and write error counters (such as ReadErrorsTotal and WriteErrorsTotal) or in the read and write latency counters (such as ReadLatencyMax and WriteLatencyMax) indicate that a disk is experiencing problems and may need replacement. An enclosure management tool that lists drives usually has an option to retrieve the same information.
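
The check described above can be sketched as a small pipeline filter. `Select-SuspectDisk` is a hypothetical function name, and both thresholds (any error at all, and a maximum latency of 500, assumed to be in milliseconds) are illustrative assumptions rather than recommended values.

```powershell
# Hypothetical filter: passes through reliability counter objects whose
# error totals or maximum latencies exceed the given thresholds.
# The default thresholds are assumptions for illustration only.
function Select-SuspectDisk {
    param(
        [Parameter(ValueFromPipeline)]$Counter,
        [int]$MaxErrors = 0,
        [int]$MaxLatencyMs = 500
    )
    process {
        if ($Counter.ReadErrorsTotal -gt $MaxErrors -or
            $Counter.WriteErrorsTotal -gt $MaxErrors -or
            $Counter.ReadLatencyMax -gt $MaxLatencyMs -or
            $Counter.WriteLatencyMax -gt $MaxLatencyMs) {
            $Counter
        }
    }
}

# Usage on a Storage Spaces host (commented out because it needs the Storage module):
# Get-PhysicalDisk | Get-StorageReliabilityCounter | Select-SuspectDisk |
#     Select-Object DeviceId, ReadErrorsTotal, WriteErrorsTotal, ReadLatencyMax, WriteLatencyMax
```

Counters that a drive does not report come back empty and are never flagged by this sketch, so an empty result does not prove that every disk is healthy.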


Along with the virtual and physical disks, Storage Spaces depends on the SAS HBA that connects the enclosures to the host machine. A warning event gives an early indication, whereas an error event confirms an issue that requires attention.

 



 

