Recently I was involved in troubleshooting the performance issue. All the queries were running slow on SQL instance. Most of the queries usually complete in seconds and that day it was running more than 15 minutes and not return any output.
I have followed below approach to fix this issue.
Step 1: Checked for the blocking if there is any on the SQL instace but didn’t find any.
select * from sys.sysprocesses where blocked <> 0
Step 2: Checked for the active sessions on the server. To find out if there is any session which in KILLED/Rollback state but didn’t find out any. Sessions in KILLED/Rollback states are resource incentive sessions based on different conditions.
Step 3: Checked for the overall wait types on the SQL Server and found RESOURCE_SEMAPHORE and Page IO related waits.
SELECT TOP 10 wait_type , max_wait_time_ms wait_time_ms , signal_wait_time_ms , wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms , 100.0 * wait_time_ms / SUM(wait_time_ms) OVER ( ) AS percent_total_waits , 100.0 * signal_wait_time_ms / SUM(signal_wait_time_ms) OVER ( ) AS percent_total_signal_waits , 100.0 * ( wait_time_ms - signal_wait_time_ms ) / SUM(wait_time_ms) OVER ( ) AS percent_total_resource_waits FROM sys.dm_os_wait_stats WHERE wait_time_ms > 0 AND wait_type NOT IN ( 'SLEEP_TASK', 'BROKER_TASK_STOP', 'BROKER_TO_FLUSH', 'SQLTRACE_BUFFER_FLUSH','CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT', 'LAZYWRITER_SLEEP', 'SLEEP_SYSTEMTASK', 'SLEEP_BPOOL_FLUSH', 'BROKER_EVENTHANDLER', 'XE_DISPATCHER_WAIT', 'FT_IFTSHC_MUTEX', 'CHECKPOINT_QUEUE', 'FT_IFTS_SCHEDULER_IDLE_WAIT', 'BROKER_TRANSMITTER', 'FT_IFTSHC_MUTEX', 'KSOURCE_WAKEUP', 'LAZYWRITER_SLEEP', 'LOGMGR_QUEUE', 'ONDEMAND_TASK_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT', 'BAD_PAGE_PROCESS', 'DBMIRROR_EVENTS_QUEUE', 'BROKER_RECEIVE_WAITFOR', 'PREEMPTIVE_OS_GETPROCADDRESS', 'PREEMPTIVE_OS_AUTHENTICATIONOPS', 'WAITFOR', 'DISPATCHER_QUEUE_SEMAPHORE', 'XE_DISPATCHER_JOIN', 'RESOURCE_QUEUE' ) ORDER BY wait_time_ms DESC
Step 4: Analyzed the wait types, while analyzing the wait types I found the RESOURCE_SEMAPHORE wait type. Let’s see what is RESOURCE_SEMAPHORE wait type and what does it indicate.
RESOURCE SEMAPHORE SQL wait type “Occurs when a memory request for query cannot be granted immediately due to other concurrent queries or memory pressure, Resource Semaphore high waits and wait times may indicate excessive number of concurrent queries, or excessive memory request amounts”
Let’s query DMV to get more idea on Resource Semaphore wait. In the output of the below query output you have to check grantee_count and waiter_count column values, grantee_count is the number of queries which have their memory granted and the waiter_count is the number of queries which are waiting in queue to get memory
SELECT * FROM sys.dm_exec_query_resource_semaphores
Now let’s check queries waiting for memory using “sys.dm_exec_query_memory_grants” DMV. Columns grant_time and granted_memory_kb will be NULL for those queries which are waiting to get their requested memory
SELECT * FROM sys.dm_exec_query_memory_grants where grant_time is null
I have also checked the “Memory Grants Pending” performance counter to get more idea. If Memory Grants Pending values is great than 0, than there is a problem.
SELECT OBJECT_NAME,cntr_value AS [Memory Grants Pending] FROM sys.dm_os_performance_counters WHERE OBJECT_NAME = 'MSSQL$MyInstance:Memory Manager' AND counter_name = 'Memory Grants Pending'
Step 5 To resolve the issue immediately, as it was active-active cluster and there was memory available to allocate to SQL Server, I have increased the memory.
Long term fix
Find out the memory consuming queries and tune them
Schedule BULK Processing Jobs during off business hours
Make sure stats and indexes are up to date
If the server load increase, plan to add more memory