SP Diagnostics Tool - Part 3 of 3

Parts 1 and 2 covered the reports on performance and HTTP requests. Other out-of-the-box reports are also available to help diagnose performance issues and show usage patterns.

Failed User Requests

These are user requests that failed or that were so slow that users might have assumed they failed.

Select a failed request to fetch its trace logs. Look for traces that mention a failure in some component of the system. If the cause is not apparent, look at the Windows Events report for signs of a system failure on the server or in IIS.

If a request failed because it was too slow, look for a gap in the log, which might be highlighted. If the lines before the gap indicate that the delay occurred in SQL Server, the request was most likely a lock victim. Look at the SQL Blocking report to find the blocking query that is the root cause of the issue.
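The idea of looking for a gap in the trace log can be sketched in a few lines of Python. This is only an illustration, not how SPDiag itself detects gaps: the function name find_gaps, the 5-second threshold, and the sample trace messages are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def find_gaps(entries, threshold=timedelta(seconds=5)):
    """Return (gap, before, after) for consecutive traces further apart than threshold.

    entries is a list of (timestamp, message) tuples; the shape is hypothetical.
    """
    ordered = sorted(entries, key=lambda e: e[0])
    gaps = []
    for before, after in zip(ordered, ordered[1:]):
        delta = after[0] - before[0]
        if delta > threshold:
            gaps.append((delta, before, after))
    return gaps

# Made-up traces for a single slow request.
traces = [
    (datetime(2011, 3, 15, 10, 22, 31), "Begin request"),
    (datetime(2011, 3, 15, 10, 22, 32), "Executing SQL query"),
    (datetime(2011, 3, 15, 10, 22, 58), "SQL query returned"),
    (datetime(2011, 3, 15, 10, 22, 59), "End request"),
]

for delta, before, after in find_gaps(traces):
    print(f"{delta.total_seconds():.0f}s gap after: {before[1]}")
```

In this sample the only gap follows the line that mentions SQL, which is the pattern that would point you to the SQL Blocking report.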

Some requests, such as downloads of large files, can be expected to be slow.

Crashes

This report displays all of the IIS worker process crashes that occurred in the specified time range. After a row is selected in the top report, the last few seconds of traces from the crashing process are displayed in the bottom panel. These traces might indicate why the crash occurred.

Crashes can significantly affect availability. The availability report might underestimate the effect of crashes because requests that are being executed at the time of a crash are not recorded. Even when a crash does not noticeably affect availability, it could lead to data loss or other problems and should be investigated.

Usage Report group

The Usage Report group contains several reports that display information about farm usage trends and issues.

Requests Per URL

This report displays the most frequently requested URLs. You can use this report to identify pages that are often accessed and therefore might be high-priority candidates for optimization.

Requests Per User

This report displays the percentage of requests made by the most common user accounts. Some system accounts, such as the search crawler service account, might be expected to generate many requests. At certain times, individual users might also perform operations that create an unexpected peak in resource usage.
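To make the "percentage of requests per account" figure concrete, here is a minimal sketch of the calculation. It assumes you already have one account name per request in a list; the function name request_share and the account names are hypothetical and are not taken from SPDiag.

```python
from collections import Counter

def request_share(users, top=3):
    """Return (user, percent_of_all_requests) for the most active accounts."""
    counts = Counter(users)
    total = sum(counts.values())
    return [(user, 100.0 * n / total) for user, n in counts.most_common(top)]

# Made-up data: one entry per request.
requests = ["svc_crawl"] * 6 + ["alice"] * 3 + ["bob"]

for user, pct in request_share(requests):
    print(f"{user}: {pct:.0f}%")
```

A crawler-style service account dominating the list, as in this sample, is usually expected; an ordinary user at the top is the kind of peak worth a closer look.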

Application Workload

This report displays the time spent serving requests from various client applications in a given time range. The report provides an estimate of which resources are being consumed by client requests. The report might indicate the following considerations:

  • High total durations indicate a need for additional memory on the Web servers.
  • High SQL Server process durations imply high SQL I/O or processor usage, or that requests from client applications might be blocked by other queries.
  • High Web server durations might indicate high processor usage on the farm Web servers.

Requests Per Site

This report displays the percentage of requests made to each site in the farm.

Saving Reports and Exporting Graphs and Reports

To save a report, click the "Save" button and give the report a name. The report is saved in the Custom section of the Reports panel.


Figure 12: SP Diagnostics: Saving and Exporting Reports

You can also export a report to a .csv file, or a graph to a .png file, and use it for further analysis or reporting.
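As an example of further analysis, an exported .csv file can be read with a short Python script. The sketch below is only a starting point: the column names Url and Requests are assumptions, so check the header row of your own export and adjust them. The helper name top_rows is also hypothetical.

```python
import csv
import io

def top_rows(csv_text, value_column, n=5):
    """Return the n rows with the largest numeric value in value_column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: float(r[value_column]), reverse=True)
    return rows[:n]

# Made-up sample standing in for an exported report; real column names may differ.
sample = (
    "Url,Requests\n"
    "/sites/hr/default.aspx,120\n"
    "/sites/it/default.aspx,340\n"
    "/default.aspx,90\n"
)

for row in top_rows(sample, "Requests", n=2):
    print(row["Url"], row["Requests"])
```

To work with a real export, read the file contents into a string first and pass that string in place of sample.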

Saving a Snapshot of Data

You can take a snapshot of all the logs or of only the current reports, depending on your needs. Once a snapshot is taken, it can be moved off the production system for further research and analysis.


Figure 13: SP Diagnostics: Snapshot of Farm State

After you provide the export directory, all of the data from the usage database and the ULS logs for the given time frame is exported to that location. You can then take the data offline and extract reports from it without affecting the production systems.

Summary

SPDiag is an essential tool for a SharePoint administrator. It makes it much easier to diagnose obscure issues where the error message box shows nothing more than a correlation ID.