Top Ten Things You Should Know About Optimizing SQL Server Performance - Part Two

Hi Friends,

In our last article, Top Ten Things You Should Know About Optimizing SQL Server Performance - Part One  we discussed five things. The remaining five are described below.

SQL PROFILER IS YOUR FRIEND

If we have SQL Server 2000 (which means no DMVs to make your life easier), we can still obtain (with a little more work) similar information to identify and classify our workload. In the following step, I use Profiler Traces from a remote machine. There is nothing
to stop you from using Server side traces instead. The important part is what you do with the raw data once it gets into a database table.

Start a Trace

The goal is to classify the workload, so I have chosen these four SQL-related events.

  • RPC:Completed
  • SP:Completed
  • SQL:BatchCompleted
  • SQL:StmtCompleted
Figure 1 shows the Trace Properties Dialog. I have also chosen all possible columns for each of these event types.

SQL Server

Figure 2 shows the General tab in the same dialog. I have configured the trace to store into a table on a Server other than the Server I am tracing. I have also configured the trace to stop after an hour.

SQL Server

Once the trace is finished, the data should now be available in the database table that I configured. For those who wish to use server side tracing, we will also assume from this point that the trace data now exists in a table. On a server with a large amount of throughput, there will be a large number of rows in the trace table. In order to make sense of all this data, it will be necessary to aggregate. I suggest aggregating by at least the SQL text, or TextData column. You can include other columns in your aggregation, such as user or client host name, but for now I will concentrate on TextData. 

TextData is a text column, which means I can’t do a GROUP BY on it. So I will convert it to something we can do a GROUP BY on. In order to do this, I will create a column on the trace table called TextDataTrunc. Figure 3 illustrates the populating of this column with a simple UPDATE.

I HAVE A BOTTLENECK – WHAT DO I DO NOW?

Once you have identified a bottleneck and worked out that it is best to leave the sp_configure settings alone, you need to find the workload that is causing the bottleneck. This is a lot easier to do in SQL Server 2005. Users of SQL Server 2000 will have to be content with using Profiler or Trace (more on that in #6).

In SQL Server 2005, if you identified a CPU bottleneck, the first thing that you would want to do is to get the top CPU consumers on the server. This is a very simple query on sys.dm_exec_query_stats,

  1. select top 50  
  2. qs.total_worker_time / execution_count as avg_worker_time,  
  3. substring(st.text, (qs.statement_start_offset/2)+1,  
  4. ((case qs.statement_end_offset  
  5. when -1 then datalength(st.text)  
  6. else qs.statement_end_offset  
  7. end – qs.statement_start_offset)/2) + 1) as statement_text,  
  8. *  
  9. from  
  10. sys.dm_exec_query_stats as qs  
  11. cross apply sys.dm_exec_sql_text(qs.sql_handle) as st  
  12. order by  
  13. avg_worker_time desc  

WHY CHANGING SP_CONFIGURE SETTINGS PROBABLY WON’T HELP

SQL Server is not like other databases. Very few switches and knobs are available to tweak performance. There are certainly no magic silver bullets to solve performance problems simply by changing an sp_configure setting.

It is generally best to leave the sp_configure settings at their defaults, thereby letting SQL Server manage things. Your time is best spent looking at performance from a workload perspective, such as database design, application interaction and indexing issues.

Let’s look at a workload example at a setting and see why it is generally best to leave things alone.

The “max worker threads” setting is used to govern how many threads SQL Server will use. The default value (in SQL Server 2005 on commodity hardware) is 256 worker threads.

This does not mean that SQL Server can have only 256 connections. On the contrary, SQL Server can service thousands of connections using up to the maximum number of worker threads.

If you were responsible for a SQL Server that regularly had 300 users connected, you might be tempted to raise the maximum number of worker threads to 300. You might think that having one thread per user would result in better performance. This is incorrect. Raising this number to 300 does two things,

  1. Increases the amount of memory that SQL Server uses. Even worse, it decreases the amount of memory that SQL Server can use for buffer cache, because each thread needs a stack
  2. Increases the context switching overhead that exists in all multithreaded software In all likelihood, raising the maximum number of worker threads to 300 made things worse. It also pays to remember that even in a four-processor box, there can only be four threads running at any given time. Unless you are directed to do so by Microsoft support, it is best to focus your efforts on index tuning and resolving application contention issues.

PERFORMANCE COUNTERS – HOW TO CUT TO THE CHASE

A very common question related to SQL Server performance optimization is: What counters should I monitor?

In terms of managing SQL Server, there are two broad reasons for monitoring performance counters,

  1. Operational
  2. Bottlenecks
Although they have some overlap, these two reasons allow you to easily choose a number of data points to monitor.

Operational Monitoring

Operational monitoring checks for general resource usage. It helps answer questions like,
  • Is the server about to run out of resources like CPU, disk space or memory?
  • Are the data files able to grow? • Do fixed size data files have enough free space for data?
You also could collect data for trending purposes. A good example would be collecting the sizes of all the data files. From this information, you could trend the data file growth rates. This would allow you to more easily forecast what resource requirements you might have in the future.

Bottleneck Monitoring

Bottleneck monitoring focuses more on performance-related matters. The data you collect helps answer questions such as,

  • Is there a CPU bottleneck?
  • Is there an I/O bottleneck?
  • Are the major SQL Server subsystems, such as the Buffer Cache and Procedure Cache, healthy?
  • Do we have contention in the database?

Baselining and Benchmarking are means to an END

What are Baselining and Benchmarking?

Baselining and benchmarking give you a picture of resource consumption over time. If your application has not yet been deployed into production, you need to run a simulation. This can be achieved by,

  • Observing the application in real time in a test environment
  • Playing back a recording of the application executing in real time
The best outcome is achieved by observing actual workloads in real time or playing back a recording of a real time simulation. Ideally, you would also want to run the workload on hardware comparable to what the application will be deployed on and with “realistic” data volumes. SQL statements that  deliver good performance on small tables often degrade dramatically as data volumes increase. The resulting data can then be plotted to easily identify trends.

The practical upshot is that you can evaluate future behavior against a baseline, to determine whether resource consumption has improved or worsened over time. What Baselining Can’t Do Baselining is not the only tool in your performance optimization toolbox. To explain what baselining and benchmarking can’t do, let’s use the ubiquitous car analogy and talk about fuel consumption.

The performance counter we are going to sample is obviously a fuel gauge.

SQL Server

For the period of a few days, we will sample the level of the fuel in the fuel tank and plot it in a graph shown below.

SQL Server

The plot displays the fuel remaining in the fuel tank over time. We can see that the baseline behavior represents the level of fuel in the tank that decreases slowly at first and then starts to accelerate more quickly towards the end of the measured time period. In general, this is the normal behavior for fuel in a fuel tank over time. 

Assuming this graph represents normal behavior, we can measure and plot a different behavior and compare the two graphs. We would easily see the change in behavior. Emerging trends may also be easily identified since we can plot against time.

A baseline cannot, however, provide any qualitative measure of efficiency. From the chart above, you cannot draw any conclusions about how efficient the car is—you must investigate elsewhere for this information. The baseline can tell you only whether you used more (or less) fuel between two days.

Similarly, for SQL Server, a baseline can tell you only that something is outside that range of normally observed behavior. It cannot tell you whether the server is running as efficiently as possible.

The point is that you should not start with baselining. You need to make sure that your application workload is running as efficiently as possible. Once performance has been optimized, you can then take a baseline. Also, you cannot simply stop with baselining. You should keep our application running as efficiently as possible and use your baseline as an early warning system that can alert you when performance starts to degrade.

Hope this will help you!!