PDF Search in SharePoint


SharePoint Search! This is not a new topic at all, but it is certainly the most important area to explore in SharePoint. Microsoft boasts about SharePoint's search functionality as one of the best in the world. SharePoint uses full-text search to retrieve relevant information from a large number of documents. Search works quite well for the most popular formats such as doc, txt, htm, xls, and ppt, but fails for mht, aspx, asp, etc.

PDF is not supported by default; you have to install the iFilter, which can be downloaded from the Adobe site. Install the filter, tweak your configuration files, and within minutes you will be able to index and search PDF documents. Here are some links that show how to enable PDF searching in SharePoint:

http://support.microsoft.com/?id=555209

http://support.microsoft.com/default.aspx?scid=kb;EN-US;832809

You will notice that the link shown in the above article takes you to the Adobe site, where you can download iFilter v5.0. For your information, Adobe has also released iFilter v6.0, which is available for download here:

http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611

These filters help you index Adobe PDF documents with Microsoft indexing clients. This allows users to easily search for text within PDF documents.

Theoretically, SharePoint searches the following document types:

  • ascx - ascx document
  • asp - asp document
  • aspx - aspx document
  • doc - Microsoft Word Document
  • dot - Microsoft Word Template
  • eml - Internet E-Mail Message
  • exch - exch document
  • htm - HTML Document
  • html - HTML Document
  • jhtml - jhtml document
  • jsp - jsp document
  • mht - MHTML Document
  • mhtml - MHTML Document
  • nsf - nsf document
  • odc - Microsoft Office Data Connection
  • ppt - Microsoft PowerPoint Presentation
  • pub - Microsoft Office Publisher Document
  • tif - Microsoft Office Document Imaging File
  • tiff - Microsoft Office Document Imaging File
  • txt - Text Document
  • url - Internet Shortcut
  • vdx - vdx document
  • vsd - vsd document
  • vss - vss document
  • vst - vst document
  • vsx - vsx document
  • vtx - vtx document
  • xls - Microsoft Excel Worksheet
  • xml - XML Document
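
If you want to check quickly whether a given file falls into one of the types above, the list can be turned into a small lookup. This is only a rough sketch: the function name is_indexable and the extra_extensions parameter are made up for illustration, the set is copied from the list above, and passing "pdf" as an extra extension assumes the Adobe iFilter has already been installed and configured. It says nothing about whether SharePoint will actually return the document in its results.

```python
from pathlib import PurePath

# Extensions copied from the list above (29 entries).
DEFAULT_INDEXED = frozenset({
    "ascx", "asp", "aspx", "doc", "dot", "eml", "exch", "htm", "html",
    "jhtml", "jsp", "mht", "mhtml", "nsf", "odc", "ppt", "pub", "tif",
    "tiff", "txt", "url", "vdx", "vsd", "vss", "vst", "vsx", "vtx",
    "xls", "xml",
})


def is_indexable(filename, extra_extensions=()):
    """Return True if the file's extension is in the listed set.

    extra_extensions is a hypothetical hook for types added through an
    iFilter, e.g. ["pdf"] once the Adobe iFilter is installed.
    """
    # Take the last extension, drop the dot, and compare case-insensitively.
    ext = PurePath(filename).suffix.lstrip(".").lower()
    allowed = DEFAULT_INDEXED | {e.lstrip(".").lower() for e in extra_extensions}
    return ext in allowed


if __name__ == "__main__":
    for name in ("specs.doc", "manual.PDF", "README"):
        print(name, is_indexable(name), is_indexable(name, ["pdf"]))
```

The second column printed for each file shows the answer for the default list, and the third shows the answer once pdf is treated as an installed extra type.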

In practice, not all documents are returned in the results, even if you provide the correct keywords. Please read this great article from Bil Simser:

http://weblogs.asp.net/bsimser/archive/2004/10/06/238787.aspx

Things to remember:

  1. Restart IIS after adding a new iFilter. You don't need to restart the machine.
  2. The iFilter must be installed on the machine where SQL Server runs, not on the machine where the SharePoint front end is installed.