ServMan Performance Troubleshooting


Troubleshooting Critical Systems when ServMan Performance is Slow

The ServMan application relies heavily on the full network infrastructure to perform well. This document outlines a common troubleshooting protocol to help diagnose performance problems.

  1. Document the exact steps you take to get to the slow process.

  2. Record the time it takes to do the slow process task.

  3. Repeat this a few times to get a general benchmark. If the slowdown is tied to saving or loading a screen, try the same screen with a different record (e.g., if on the order screen, try a different order number). If performance differs significantly between records, note the document IDs and contact ServMan support to have them review the data-load differences between the documents and see whether any configuration options can be utilized. Also make note of whether all users are slow or just an individual user.

 

Troubleshooting the workstation or Terminal Server  

It is critical that this be the EXACT computer that is experiencing the problem. If you have terminal server load balancing, you will have to identify the exact terminal server and sign in to it directly.

As the system administrator, load Task Manager (go full screen to capture more information) on the computer that is running the SERVMAN.EXE file and capture the following:

- Get 3 screenshots, 5 seconds apart, of the "Processes" tab sorted by CPU usage.

- Get 3 screenshots, 5 seconds apart, of the "Processes" tab sorted by Memory usage.

- Get 3 screenshots, 5 seconds apart, of the "Processes" tab sorted by Disk usage.

- Get 3 screenshots, 5 seconds apart, of the "Processes" tab sorted by Network usage.

- Click on the "Performance" tab and get 3 screenshots, 5 seconds apart.

 

Troubleshooting SQL Server Performance

- Log in to SQL Server Management Studio, right-click on the server, and choose Reports >> Standard Reports >> Activity - All Blocking Transactions. If there are results, this is likely the problem.
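If you prefer a query over the report, a sketch like the following (using the sys.dm_exec_requests DMV available in SQL Server 2005 and later) lists the sessions that are currently blocked, together with the session blocking each one:

```sql
-- List blocked sessions and the session blocking them (SQL Server 2005+)
select r.session_id,
       r.blocking_session_id,   -- the SPID holding the contested lock
       r.wait_type,
       r.wait_time,             -- milliseconds this request has been waiting
       st.text as blocked_statement
from sys.dm_exec_requests r
cross apply sys.dm_exec_sql_text(r.sql_handle) st
where r.blocking_session_id <> 0
order by r.wait_time desc
```

If the same blocking_session_id appears at the head of many rows, that session is the one to investigate.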

-  Repeatedly run the following SQL query. If you see the same query returned after several executions, then copy the results of that query for review. Queries should be designed to run fast and should disappear from the results on subsequent runs.

-- this query will show all currently running queries on the server
select st.text as StatementText,
       case when p.statement_end_offset <= 0 then st.text
            when p.statement_end_offset > p.statement_start_offset and p.statement_start_offset >= 0 then
                 -- offsets are byte offsets into Unicode text: divide by 2 and add 1
                 substring(st.text, p.statement_start_offset / 2 + 1,
                           (p.statement_end_offset - p.statement_start_offset) / 2 + 1)
            else st.text
       end as SQLText,
       q.nt_username, p.*, q.*
from sys.dm_exec_requests p
left outer join sys.sysprocesses q on q.spid = p.session_id
cross apply sys.dm_exec_sql_text(p.sql_handle) st
where st.[text] not like '%when p.statement_end_offset > p.statement_start_offset%'
order by q.nt_username

 

Troubleshooting the SQL Server Computer

- Replicate the steps performed on the workstation in the "Troubleshooting the workstation or Terminal Server" section above.

 

Virtual Environments Only: Troubleshooting all VM Host Computers Involved

- Replicate the steps performed on the workstation in the "Troubleshooting the workstation or Terminal Server" section above.

 

Troubleshooting the Application Server (if different from the SQL Server)

- Replicate the steps performed on the workstation in the "Troubleshooting the workstation or Terminal Server" section above.

- Record the time it takes to do the same slow ServMan task that you did in step 2.

- Review the findings or submit them to a qualified engineer.



SQL Server Database Engine Tip from Microsoft  

 

An unsuitable configuration of the SQL Server TempDB system database is one of the most common causes of performance degradation in ServMan. It is therefore paramount to manage, troubleshoot, and tune the TempDB resources appropriately so that the SQL Server instance behind ServMan runs smoothly. If you consume all the TempDB resources, you will render your SQL Server instance unusable.

The following SQL Server features (implemented or used in some way by ServMan) make use of TempDB:

- Queries

- Triggers

- Snapshot isolation and read-committed snapshots

- Index creation and online index creation

- Temporary tables, table variables, and table-valued functions

- DBCC CHECKDB

- User-defined functions

- Cursors

- Database Mail, etc.

The worst thing you can do is leave the TempDB settings at their default values, as shown in the SQL Server 2005 TempDB properties window.

 

Consider implementing the following top-10 Microsoft best practices to improve the TempDB performance:

 

1) To maximize disk bandwidth, divide TempDB into multiple equally sized data files (one primary .mdf file and several secondary .ndf files), one file per physical CPU core, because only one thread is active per CPU at a time. Note that a dual-core processor counts as 2 CPUs; logical processors (hyper-threading) do not count. SQL Server will round-robin (proportional fill) the single-page allocations across the TempDB files, alleviating the contention somewhat.

Using multiple files reduces TempDB storage contention and yields significantly better scalability. However, avoid creating too many files, because this can reduce performance and increase management overhead. Microsoft recommends a maximum of 8 data files, but this is only advice; the number of files does not have to match the number of physical CPUs.

To create multiple secondary files for TempDB, you can use the SQL Server Management Studio GUI (right-click TempDB, select Properties from the shortcut menu, and click the Files page) or Transact-SQL.

It is common to make mistakes when splitting TempDB according to the number of physical CPUs. You can use the attached test script to catch some of these problems, such as the number of files not matching the number of schedulers, the data file sizes not being identical, or trace flag T1118 not being set.
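As a Transact-SQL alternative to the GUI, the following sketch splits TempDB into 4 equally sized data files. The file names, paths, and sizes here are placeholders, not values from the original article; substitute your own, and adjust the file count to your CPU core count (up to Microsoft's suggested maximum of 8):

```sql
-- Resize the existing primary data file to match the new secondary files
alter database tempdb
    modify file (name = tempdev, size = 1024MB);

-- Add three equally sized secondary (.ndf) data files on a dedicated drive
alter database tempdb
    add file (name = tempdev2, filename = 'T:\TempDB\tempdev2.ndf',
              size = 1024MB, filegrowth = 10%);
alter database tempdb
    add file (name = tempdev3, filename = 'T:\TempDB\tempdev3.ndf',
              size = 1024MB, filegrowth = 10%);
alter database tempdb
    add file (name = tempdev4, filename = 'T:\TempDB\tempdev4.ndf',
              size = 1024MB, filegrowth = 10%);
```

Keeping all data files the same size is what makes proportional fill behave like a round-robin across the files.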

2) Identify whether there are I/O bottlenecks by monitoring the following perfmon counters for the physical devices associated with TempDB:

a.        PhysicalDisk Object: Avg. Disk Queue Length
The average number of physical read and write requests that were queued on the selected physical disk during the sampling period. If the I/O system is overloaded, more read/write operations will be waiting. If the disk queue length exceeds a specified value too frequently during peak usage of SQL Server, there might be an I/O bottleneck.

b.        Avg. Disk Sec/Read
The average time, in seconds, of a read of data from the disk. Use the following to analyze numbers in the output.

  i.            Less than 10 milliseconds (ms) = very good

 ii.            Between 10-20 ms = okay

iii.            Between 20-50 ms = slow, needs attention

iv.            Greater than 50 ms = serious IO bottleneck

c.        Avg. Disk Sec/Write
The average time, in seconds, of a write of data to the disk. See the guidelines for the previous item, Avg. Disk Sec/Read.

d.        Physical Disk: %Disk Time
The percentage of elapsed time that the selected disk drive was busy servicing read or write requests. A general guideline is that if this value > 50%, there is an I/O bottleneck.

e.        Avg. Disk Reads/Sec
The rate of read operations on the disk. Make sure that this number is less than 85% of disk capacity. Disk access time increases exponentially beyond 85% capacity.

f.         Avg. Disk Writes/Sec
The rate of write operations on the disk. Make sure that this number is less than 85% of the disk capacity. Disk access time increases exponentially beyond 85% capacity.

g.        Database: Log Bytes Flushed/sec
The total number of log bytes flushed. A large value indicates heavy log activity in TempDB.

h.        Database: Log Flush Waits/sec
The number of commits that are waiting on a log flush. Although transactions do not wait for the log to be flushed in TempDB, a high number in this performance counter indicates an I/O bottleneck in the disk(s) associated with the log.

 

3) By default, TempDB is set to auto-grow; once the space in a file is full, the file is allowed to grow by 10% of its current size. However, relying on auto-grow to manage file growth causes all applications and internal operations that use TempDB to pause. It is important to note that auto-grow can lead to data fragmentation and should be treated as a last resort; avoid relying on it when possible.

That said, it is recommended to use the "instant data file initialization" feature to improve the performance of auto-grow operations. This feature skips zeroing out data pages, which reduces the time taken by operations such as creating databases, adding files to an existing database, increasing the size of an existing database file (manually or through auto-growth), and restoring a database or filegroup. In previous versions of SQL Server, data and log files were first initialized by filling them with zeros when you performed one of these operations.

Bear in mind that even though this feature is available, your instance might not be configured to use it. To enable it, your SQL Server service account or the local group SQLServerMSSQLUser$instancename must be granted the SE_MANAGE_VOLUME_NAME privilege, i.e., the "Perform Volume Maintenance Tasks" local security right. The SQLServerMSSQLUser$instancename local group is created when you install SQL Server, and the SQL Server service account is added to it. By default, the local Administrators group already has this permission, so if your service account is a member of that group, you don't have to do anything.

This procedure improves file-growth performance on auto-grow events; however, it should not be used as a substitute for properly sizing your database files (see the last tip in this list) to minimize, if not prevent, auto-growth. Moreover, if you grant this permission to the service account while the SQL Server service is running, remember to restart the service for the feature to take effect. You can also run the command gpupdate /force to apply the security-policy modification immediately.
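One way to verify whether instant file initialization is actually in effect is a diagnostic sketch using trace flags 3004 and 3605, which write file-initialization details to the SQL Server error log (the test database name below is arbitrary):

```sql
-- Diagnostic sketch: check instant file initialization via the error log
dbcc traceon (3004, 3605, -1);
create database IFI_Test;   -- throwaway test database
exec sp_readerrorlog;       -- inspect the recent error-log entries
-- If a "Zeroing" message appears for the .mdf data file, instant file
-- initialization is NOT enabled. The .ldf log file is always zeroed,
-- so a zeroing message for the log alone is expected.
drop database IFI_Test;
dbcc traceoff (3004, 3605, -1);
```

Remember to turn the trace flags back off, as above, since they add noise to the error log.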

4) Set the TempDB file-growth increment to a reasonable size to keep the TempDB database files from growing by too small a value. If the file growth is too small compared to the amount of data being written to TempDB, TempDB may have to expand constantly, and this will surely affect performance. Use the following general guidelines when setting the FILEGROWTH increment for TempDB:

TempDB file size      FILEGROWTH increment for TempDB
0 to 100 MB           10 MB
100 to 200 MB         20 MB
200 MB or more        10%
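Applying the guideline is a one-line ALTER DATABASE per file. For example, for a data file currently sized between 100 and 200 MB (the file name tempdev below is the default for the primary TempDB data file; substitute your own file names):

```sql
-- Set a fixed 20 MB growth increment on the primary TempDB data file
alter database tempdb
    modify file (name = tempdev, filegrowth = 20MB);
```

Repeat for each data file so that all files keep growing in identical steps and stay equally sized.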

 

5) The TempDB database recovery model is set to 'Simple'. However, it is important to estimate the log space for TempDB. Make the TempDB log file large enough to avoid truncating the log frequently (on automatic or manual checkpoints).

Take into account, however, that most operations in TempDB are not logged. A good practice is to start with an initial log size of about 20% of the data file size, and monitor possible space bottlenecks in TempDB (on both data and log files) using the perfmon counters Database: Log File(s) Size (KB), Database: Log File(s) Used Size (KB), and Free Space in tempdb (KB), and the SQL Server 2005/2008 dynamic management views sys.dm_db_file_space_usage, sys.dm_db_session_space_usage, and sys.dm_db_task_space_usage.
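As a quick sketch of using those DMVs, the following query (assumed usage, not from the original article) aggregates sys.dm_db_task_space_usage to show which sessions are currently allocating the most TempDB pages:

```sql
-- Top current TempDB consumers by session (8 KB pages converted to MB)
select tsu.session_id,
       sum(tsu.user_objects_alloc_page_count
         + tsu.internal_objects_alloc_page_count) * 8 / 1024.0 as allocated_mb
from sys.dm_db_task_space_usage tsu
group by tsu.session_id
order by allocated_mb desc
```

A session near the top of this list during a slowdown is a good candidate for the running-queries check described earlier in this document.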

6) Do not shrink TempDB, since shrinking causes data fragmentation (performance degradation) and the files will probably just grow again.

7) Create TempDB on a fast I/O subsystem, and ensure that you have an adequate number of spindles to support your I/O requirements with acceptable latency. Use RAID 1+0 if possible, because it provides better protection from hardware failure and better write performance (logs and TempDB are update-intensive). RAID 5 has well-known write-performance penalties to maintain parity: one logical write requires one physical read of the data block, one read of the parity, one write of the data, and one write of the parity, for a total of 4 I/Os.

- RAID 5 I/Os per disk = [Reads + (4 * Writes)] / Number of disks

- RAID 1+0 I/Os per disk = [Reads + (2 * Writes)] / Number of disks

Also, it’s very important to separate TempDB data and log files on different disks from those used by ServMan data and log files.

8) In general, avoid using trace flag T1118 on SQL Server 2005/2008 (it essentially eliminates mixed-extent allocations), because TempDB object caching was improved in those SQL Server versions, so there should be significantly less contention on allocation structures. Under this trace flag, SQL Server allocates full extents (8 pages of 8 KB each) to each TempDB object, thereby eliminating contention on the SGAM (Shared Global Allocation Map) page. This is done at the expense of some wasted disk space in TempDB.
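To check whether this trace flag is currently enabled on an instance (for example, left over from an older configuration), you can run:

```sql
-- Check whether trace flag 1118 is enabled globally (-1 = all sessions)
dbcc tracestatus (1118, -1)
```

No rows (or a Status of 0) means the flag is off.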

9) Remember that when the server restarts, the TempDB file size is reset to the configured value (the default is 8 MB). Auto-grow is temporary for TempDB (unlike other SQL databases) and is also reset when SQL Server restarts. Moreover, each time SQL Server restarts, TempDB is copied from the Model system database and inherits certain database configuration options, such as ALLOW_SNAPSHOT_ISOLATION, from it.

10) Last but not least, bear in mind that no capacity-planning tool is available for appropriately sizing the TempDB system database. You need to monitor how much TempDB is being used by applications, in this case by the ServMan database queries, triggers, functions, temporary tables, index creation, and so on. You can create a historical record of TempDB database and log file usage, along with a stored procedure that captures the TempDB database and log file space used.
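A minimal sketch of such a historical capture is shown below. The table and procedure names are illustrative, not from the original article; on SQL Server 2005/2008, sys.dm_db_file_space_usage reports TempDB space usage regardless of the current database context:

```sql
-- Illustrative history table and capture procedure for TempDB space usage
create table dbo.TempDBSpaceHistory (
    capture_time     datetime       not null default getdate(),
    file_id          int            not null,
    free_mb          decimal(18, 2) not null,
    user_mb          decimal(18, 2) not null,
    internal_mb      decimal(18, 2) not null,
    version_store_mb decimal(18, 2) not null
);
go
create procedure dbo.CaptureTempDBSpace
as
    -- snapshot current TempDB space usage, converting 8 KB pages to MB
    insert into dbo.TempDBSpaceHistory
        (file_id, free_mb, user_mb, internal_mb, version_store_mb)
    select file_id,
           unallocated_extent_page_count       * 8 / 1024.0,
           user_object_reserved_page_count     * 8 / 1024.0,
           internal_object_reserved_page_count * 8 / 1024.0,
           version_store_reserved_page_count   * 8 / 1024.0
    from sys.dm_db_file_space_usage;
go
```

Schedule the procedure from a SQL Server Agent job (for example, every 5 minutes) and review the history table after a slow period to see how TempDB usage trended.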

 
