Monitoring features of SharePoint 2010


The monitoring features in Microsoft SharePoint Server 2010 help you to understand how the SharePoint Server 2010 system is running, analyze and repair problems, and view metrics for the sites. Monitoring the SharePoint Server 2010 environment includes the following tasks:

  • Configuring the various aspects of monitoring to suit business needs.
  • Monitoring the environment and resolving any problems that might arise.
  • Viewing reports and logs of the environment activity.

Regular performance tests

SharePoint Health Analyzer checks for potential configuration, performance, and usage problems in SharePoint Server 2010. It runs predefined health rules against servers in the farm and returns a status that tells you the outcome of the test.

Receive auto alerts

SharePoint Health Analyzer also creates an alert in the Health Analyzer Reports list in Central Administration. You can click an alert to view more information about the problem and see steps to resolve the problem.

Configuring monitoring

SharePoint Server 2010 is installed with default settings for its monitoring features. However, you might want to change some of these settings to better suit your business needs. The settings that you might change include those for diagnostic logging and for health and usage data collection.

Diagnostic logging

SharePoint Server 2010 collects data in the diagnostic log that can be useful in troubleshooting. The default settings are sufficient for most situations, but depending upon the business needs and life cycle of the farm, you might want to change these settings. For example, if you are deploying a new feature or making large-scale changes to the environment, you might raise logging to a more verbose level to capture as much data as possible about the state of the system during the changes. Conversely, you might lower the logging level to reduce the size of the log and the resources needed to log the data.

The SharePoint Server 2010 environment might require configuration of the diagnostic logging settings after initial deployment or upgrade, and possibly throughout the system's life cycle. The guidelines in the following list can help you form best practices for the specific environment.

  • Change the drive that logging writes to. By default, diagnostic logging is configured to write logs to the same drive and partition that SharePoint Server 2010 was installed on. Because diagnostic logging can use lots of drive space and writing to the logs can affect drive performance, you should configure logging to write to a drive that is different from the drive on which SharePoint Server 2010 was installed. You should also consider the connection speed to the drive that logs are written to. If verbose-level logging is configured, lots of log data is recorded. Therefore, a slow connection might result in poor log performance.
     
  • Restrict log disk space usage. By default, the amount of disk space that diagnostic logging can use is not limited. Therefore, limit the disk space that logging uses to make sure that the disk does not fill up, especially if you configure logging to write verbose-level events. When the disk space limit is reached, the oldest logs are removed and new logging data is recorded.

  • Use the Verbose setting sparingly. You can configure diagnostic logging to record verbose-level events. This means that the system will log every action that SharePoint Server 2010 takes. Verbose-level logging can quickly use drive space and affect drive and server performance. You can use verbose-level logging to record a greater level of detail when you are making critical changes and then reconfigure logging to record only higher-level events after you make the change.

  • Regularly back up logs. The diagnostic logs contain important data. Therefore, back them up regularly to make sure that this data is preserved. When you restrict log drive space usage, or if you keep logs for only a few days, log files are automatically deleted, starting with the oldest files, when the threshold is met.

  • Enable event log flooding protection. Enabling this setting configures the system to detect repeating events in the Windows event log. When the same event is logged repeatedly, the repeating events are detected and suppressed until conditions return to a typical state.
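
The disk space restriction described in the guidelines above can be pictured as a simple retention policy. The following Python sketch is an illustration only and is not SharePoint code: the LogFile type, the enforce_disk_cap function, and the sample file names and sizes are all hypothetical. It assumes that the only rule is "remove the oldest files until the total size fits under the limit."

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LogFile:
    """Hypothetical stand-in for one diagnostic log file on disk."""
    name: str      # illustrative file name
    size_mb: int   # size on disk, in megabytes
    created: int   # creation order or timestamp; smaller means older


def enforce_disk_cap(logs: List[LogFile], cap_mb: int) -> Tuple[List[LogFile], List[LogFile]]:
    """Return (kept, removed) so that the kept files fit within cap_mb.

    The oldest files are removed first, as the guideline describes.
    """
    kept = sorted(logs, key=lambda f: f.created)  # oldest first
    removed: List[LogFile] = []
    total = sum(f.size_mb for f in kept)
    while kept and total > cap_mb:
        oldest = kept.pop(0)
        removed.append(oldest)
        total -= oldest.size_mb
    return kept, removed


if __name__ == "__main__":
    sample = [
        LogFile("c.log", 500, 3),
        LogFile("a.log", 400, 1),
        LogFile("b.log", 300, 2),
    ]
    kept, removed = enforce_disk_cap(sample, cap_mb=900)
    print([f.name for f in removed])  # ['a.log']
```

Files dropped this way are gone unless they were copied elsewhere first, which is why the guidelines pair a disk space limit with regular backups.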

Health and usage data collection

The monitoring features in SharePoint Server 2010 use specific timer jobs to perform monitoring tasks and collect monitoring data. The health and usage data might consist of performance counter data, event log data, timer service data, metrics for site collections and sites, search usage data, or various performance aspects of the Web servers. The system uses this data to create health reports, Web Analytics reports, and administrative reports. The system writes usage and health data to the logging folder and to the logging database.

A timer job is a trigger that starts a specific Windows service for one of the SharePoint 2010 products. It contains a definition of the service to run and specifies how frequently the service should be started. The Windows SharePoint Services Timer v4 service (SPTimerV4) runs timer jobs. Many features in SharePoint 2010 products rely on timer jobs to run services according to a schedule.

You might want to change the schedules that the timer jobs run on to collect data more frequently or less frequently. You might even want to disable jobs that collect data that you are not interested in. You can perform the following tasks on timer jobs:

  • Modify the schedule that the timer job runs on.
  • Run timer jobs immediately.
  • Enable or disable timer jobs.
  • View timer job status. You can view currently scheduled jobs, failed jobs, currently running jobs, and a complete timer job history.
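
The four timer job tasks listed above can be modeled in a few lines. The following Python sketch is a conceptual model only, not the SharePoint object model: the TimerJob class, the tick function, the job name, and the use of plain integers as minutes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TimerJob:
    """Hypothetical model of a scheduled job (not a SharePoint type)."""
    name: str
    action: Callable[[], None]   # the work that the job triggers
    interval_minutes: int        # schedule: how often the job should run
    enabled: bool = True
    last_run: int = -1           # minute of the last run; -1 means never
    history: List[Tuple[int, str]] = field(default_factory=list)

    def is_due(self, now: int) -> bool:
        """A disabled job is never due; otherwise compare against the schedule."""
        if not self.enabled:
            return False
        return self.last_run < 0 or now - self.last_run >= self.interval_minutes

    def run(self, now: int) -> None:
        """Run immediately, regardless of schedule, and record the outcome."""
        try:
            self.action()
            self.history.append((now, "Succeeded"))
        except Exception:
            self.history.append((now, "Failed"))
        self.last_run = now


def tick(jobs: List[TimerJob], now: int) -> None:
    """One pass of the timer service: run every job that is due."""
    for job in jobs:
        if job.is_due(now):
            job.run(now)


if __name__ == "__main__":
    collected = []
    job = TimerJob("usage-data-import", lambda: collected.append("data"), 30)
    tick([job], now=0)         # first pass: the job has never run, so it runs
    tick([job], now=10)        # not due yet
    job.run(now=15)            # "run now"
    job.interval_minutes = 5   # modify the schedule
    job.enabled = False        # disable the job
    tick([job], now=100)       # skipped because the job is disabled
    print(job.history)         # [(0, 'Succeeded'), (15, 'Succeeded')]
```

Keeping the history on the job itself is one simple way to support the status views mentioned in the list; a real timer service would persist this information elsewhere.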

Monitoring the farm and resolving problems by using SharePoint Health Analyzer

SharePoint Server 2010 includes a new, integrated health analysis tool named SharePoint Health Analyzer, which enables you to check for potential configuration, performance, and usage problems. SharePoint Health Analyzer runs predefined health rules against servers in the farm. A health rule runs a test and returns a status that tells you the outcome of the test. When any rule fails, the status is written to the Health Reports list in SharePoint Server 2010 and to the Windows event log. SharePoint Health Analyzer also creates an alert in the Health Analyzer Reports list on the Review problems and solutions page in Central Administration. You can click an alert to view more information about the problem and see steps to resolve it. You can also open the rule that raised the alert and change its settings.

As with all SharePoint Server 2010 lists, you can edit Health Analyzer Reports list items, create custom views, export the list items to Microsoft Excel, subscribe to the RSS feed for the list, and perform many other tasks. Each health rule falls into one of the following categories: Security, Performance, Configuration, or Availability.

A health rule can be run on a defined schedule or on an impromptu basis. All health rules are available through Central Administration, on the Monitoring page, for either immediate or scheduled execution.

Farm administrators can do the following with specific health rules:

  • Enable or disable rules.
  • Configure rules to run on a predefined schedule.
  • Define the scope where the rules run.
  • Receive e-mail alerts when problems are found.
  • Run rules on an impromptu basis.
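
The way a health rule runs a test against each server and reports failures can be sketched as follows. This Python model is illustrative only and does not use any SharePoint API: the HealthRule class, the run_rules function, the server names, and the free_disk_gb fact are all hypothetical. Only the four category names come from the text above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# The four categories named in this article.
CATEGORIES = {"Security", "Performance", "Configuration", "Availability"}


@dataclass
class HealthRule:
    """Hypothetical model of a health rule (not a SharePoint type)."""
    title: str
    category: str
    check: Callable[[Dict[str, float]], bool]  # returns True when a server passes
    remedy: str                                # steps to resolve the problem
    enabled: bool = True

    def __post_init__(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")


def run_rules(rules: List[HealthRule],
              servers: Dict[str, Dict[str, float]]) -> List[Dict[str, str]]:
    """Run every enabled rule against every server and collect failures.

    Each returned item plays the role of one entry in a reports list.
    """
    reports: List[Dict[str, str]] = []
    for rule in rules:
        if not rule.enabled:
            continue  # disabled rules are skipped entirely
        for server, facts in servers.items():
            if not rule.check(facts):
                reports.append({
                    "server": server,
                    "rule": rule.title,
                    "category": rule.category,
                    "remedy": rule.remedy,
                })
    return reports


if __name__ == "__main__":
    low_disk = HealthRule(
        title="Drives are running out of free space",  # illustrative title
        category="Availability",
        check=lambda facts: facts["free_disk_gb"] >= 5.0,
        remedy="Free up disk space on the server.",
    )
    farm = {"WEB01": {"free_disk_gb": 2.0}, "APP01": {"free_disk_gb": 40.0}}
    for item in run_rules([low_disk], farm):
        print(item["server"], "-", item["rule"])  # WEB01 - Drives are running out of free space
```

Calling run_rules directly corresponds to an impromptu run; a scheduled run would invoke the same function from a timer.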

View and use reports

SharePoint Server 2010 can be configured to collect data and create reports about server status and site use. You can perform the following tasks by using reporting:

  • View administrative reports, such as search reports.
  • Create and review Information Management Policy Usage reports.
  • View health reports that include slowest pages and top active pages.
  • View Web Analytics reports that include Web site traffic reports, search query reports, and customized reports.