
Architecture of Search Schema in SharePoint 2013

Posted by Destin Joy | Articles | SharePoint | October 23, 2012
This article looks at the search schema in SharePoint 2013. The search index is one of the most important elements of the search architecture. What is in the search index determines what people will find when they look for information by entering search queries or by interacting with internet or intranet pages.

To include content in the search index, you must first crawl the content. There are various content sources that you can crawl, for example SharePoint sites, file shares or user profiles. The contents and metadata of the items that you crawl are represented as crawled properties in the search service application. To include the contents of the crawled properties in the search index, you map crawled properties to managed properties. There is a set of default mappings, but often you want to use the search schema to, for example, change the default mappings, create new mappings or create new managed properties. The way the search index is structured depends on the settings on the managed properties.

Search schema

In SharePoint Server 2013 you can have multiple search schemas. The main search schema is defined in the Search Service Application, and you can edit it in Central Administration. In addition, both site collection administrators and tenant administrators can modify the search schema for a particular site collection or tenant. For example, by modifying the search schema, a site collection administrator can customize what is included in the search index for the content of that site collection and customize the search experience in that site collection. The search schema contains the crawled properties, the managed properties and their settings, and the mappings between crawled and managed properties.

Each item that has been crawled and passed on to the content processing pipeline has crawled properties associated with it, for example Author, Title and Creation Date. If a crawled property is new and hasn't been crawled before, it is discovered automatically. Crawled properties are grouped into categories that are based on the IFilter or protocol handler of the item. Example categories are Office (containing crawled properties from Word documents, Excel worksheets, and so on), Business Data (containing crawled properties from, for example, databases) and Web (containing crawled properties from web sites).

You can map crawled properties to managed properties to include the contents of the crawled property in the search index. You can map multiple crawled properties to a single managed property. For example, you can map both the crawled properties "Writer" and "Author" to the "Author" managed property. You can also map a single crawled property to multiple managed properties. If a managed property has multiple crawled properties mapped to it, and a document contains values for more than one of those crawled properties, the order in which the crawled properties are mapped to the managed property determines the content of the managed property.
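The effect of the mapping order can be illustrated with a small Python sketch. This is a toy model, not SharePoint code: the function `resolve_managed_value`, its `include_all` flag and the sample values are hypothetical names introduced for illustration. It assumes that a managed property either takes its content from the first mapped crawled property that is not empty, or collects the content of all mapped crawled properties.

```python
from typing import Dict, List, Optional


def resolve_managed_value(
    item: Dict[str, Optional[str]],
    mapped_crawled_properties: List[str],
    include_all: bool = False,
) -> List[str]:
    """Return the value(s) a managed property would receive for one item.

    mapped_crawled_properties is the ordered mapping list (hypothetical model).
    The order only matters when include_all is False.
    """
    # Collect the non-empty values in mapping order.
    values = [
        item[name]
        for name in mapped_crawled_properties
        if item.get(name)  # skips missing keys, None and empty strings
    ]
    if include_all:
        return values
    # Otherwise keep only the first crawled property that is not empty.
    return values[:1]


# A document that carries values for both crawled properties.
doc = {"Writer": "D. Joy", "Author": "Destin Joy"}

print(resolve_managed_value(doc, ["Author", "Writer"]))  # ['Destin Joy']
print(resolve_managed_value(doc, ["Writer", "Author"]))  # ['D. Joy']
print(resolve_managed_value(doc, ["Writer", "Author"], include_all=True))
```

Swapping the two entries in the mapping list changes which value ends up in the managed property, which is the behavior the paragraph above describes.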

When you create a new managed property, a full crawl must complete before the managed property and its values are included in the search index.

Overview of managed property settings

Settings on the managed properties determine which indexing structure is used, that is, how content is saved in the search index.

Site administrators can read the search schema, such as the mappings between crawled and managed properties on the site collection level, but they can't edit the search schema. Administrators using Central Administration, Site Collection Administration and Tenant Administration can edit the search schema. The following table shows which settings are available in each of these administration levels.

Searchable
  What it does: Enables querying against the content of the managed property. The content of this managed property is included in the full-text index.
  Example: If the property is "author", a simple query for "Smith" returns items that contain the word "Smith" and items whose author property contains "Smith".
  Available in: Central Administration / Site Collection Administration / Tenant Administration

Queryable
  What it does: Enables querying against the specific managed property. The managed property name must be included in the query, either specified in the query itself or included in the query programmatically.
  Example: If the managed property is "author", the query must contain "author:Destin".
  Available in: Central Administration / Site Collection Administration / Tenant Administration

Retrievable
  What it does: Enables the content of this managed property to be returned in search results. Enable this setting for managed properties that are relevant to present in search results.
  Available in: Central Administration / Site Collection Administration / Tenant Administration

Allow multiple values
  What it does: Allows multiple values of the same type in this managed property.
  Example: If this is the "author" managed property, and a document has multiple authors, each author name will be stored as a separate value in this managed property.
  Available in: Central Administration

Refinable
  What it does:
    Yes - active: Enables using the property as a refiner for search results in the front end. You must manually configure the refiner in the web part.
    Yes - latent: Enables switching refinable to active later, without having to do a full re-crawl when you switch.
    Both options require a full crawl to take effect.
  Available in: Central Administration

Sortable
  What it does:
    Yes - active: Enables sorting the result set based on the property before the result set is returned.
    Yes - latent: Enables switching sorting to active later, without having to do a full re-crawl when you switch.
    Both options require a full crawl to take effect.
  Example: Use for large result sets that cannot be sorted and retrieved at the same time.
  Available in: Central Administration

Alias
  What it does: Defines an alias for a managed property if you want to use the alias instead of the managed property name in queries and in search results. Use the original managed property and not the alias to map to a crawled property.
  Example: Use an alias if you don't want to or don't have permission to create a new managed property.
  Available in: Central Administration / Site Collection Administration / Tenant Administration

Token normalization
  What it does: Enables returning results independent of letter casing and diacritics used in the query.
  Example: The query "destin" will also match "Destin", "DESTIN" and "DEstin".
  Available in: Central Administration

Complete matching
  What it does: Queries will only be matched against the exact content of the property.
  Example: If you have a managed property "ID" that contains the string "1-23-456#7", complete matching only returns results on the query ID:"1-23-456#7", and not on ID:"1-23" or ID:"1 23 456 7".
  Available in: Central Administration

Mappings to crawled properties
  What it does: The list shows all the crawled properties that are mapped to this managed property. A managed property can get its content from one or more crawled properties. You can either include content from all crawled properties or include content from the first crawled property that is not empty, based on a specified order.
  Available in: Central Administration / Site Collection Administration / Tenant Administration

Extraction of company names
  What it does: Enables the system to extract company name entities from the managed property when crawling new or updated items. Afterwards, the extracted entities can be used to set up refiners in the web part.
  Example: There is one pre-populated dictionary for company name extraction. The system saves the original managed property content unchanged in the index and, in addition, copies the extracted entities to the managed property "companies". The "companies" managed property is configured to be searchable, queryable, retrievable, sortable and refinable. You can edit the company name dictionary in the Term Store.
  Available in: Central Administration / Site Collection Administration / Tenant Administration

Custom entity extraction
  What it does: Enables one or more custom entity extractors to be associated with this managed property. This enables the system to extract entities from the managed property when crawling new or updated items. Afterwards, the extracted entities can be used to set up refiners in the web part.
  Example: There are four types of custom extraction dictionaries. You create your own, separate custom entity extraction dictionaries that you deploy using the PowerShell cmdlet Import-SPCustomExtractionDictionary. The system saves the original managed property content unchanged in the index and, in addition, copies the extracted entities to the managed properties "WordCustomRefiner1" through "WordCustomRefiner5", "WordPartCustomRefiner1" through "WordPartCustomRefiner5", "WordExactCustomRefiner" and/or "WordPartExactCustomRefiner" respectively. These managed properties are configured to be searchable, queryable, retrievable, sortable and refinable.
  Available in: Central Administration / Site Collection Administration
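To make the difference between some of these settings concrete, here is a minimal Python sketch of a toy matcher. It is not how SharePoint implements its index. The class `ManagedProperty`, the functions `normalize`, `tokenize` and `matches`, and the sample item are all hypothetical and exist only for illustration. The sketch assumes that token normalization (case and diacritics folding) is always applied, that a query of the form `name:value` is a property query, and that anything else is a full-text query.

```python
import re
import unicodedata
from dataclasses import dataclass
from typing import Dict, List


def normalize(text: str) -> str:
    """Fold letter casing and strip diacritics (toy token normalization)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def tokenize(text: str) -> List[str]:
    """Split normalized text into word tokens, dropping punctuation."""
    return re.findall(r"\w+", normalize(text))


@dataclass
class ManagedProperty:
    name: str
    searchable: bool = False         # content is part of the full-text index
    queryable: bool = False          # "name:value" queries are allowed
    complete_matching: bool = False  # value must equal the whole property


def matches(query: str, item: Dict[str, str], schema: List[ManagedProperty]) -> bool:
    """Return True if the item matches the query under the toy schema."""
    by_name = {normalize(p.name): p for p in schema}
    prop_name, sep, value = query.partition(":")
    prop = by_name.get(normalize(prop_name)) if sep else None
    if prop is not None:
        # Property query: only allowed on queryable properties.
        if not prop.queryable:
            return False
        value = value.strip('"')
        stored = item.get(prop.name, "")
        if prop.complete_matching:
            return normalize(stored) == normalize(value)
        wanted = tokenize(value)
        return bool(wanted) and set(wanted) <= set(tokenize(stored))
    # Full-text query: only searchable properties contribute tokens.
    full_text = set()
    for p in schema:
        if p.searchable:
            full_text.update(tokenize(item.get(p.name, "")))
    wanted = tokenize(query)
    return bool(wanted) and set(wanted) <= full_text


schema = [
    ManagedProperty("body", searchable=True),
    ManagedProperty("author", searchable=True, queryable=True),
    ManagedProperty("ID", queryable=True, complete_matching=True),
]
item = {
    "body": "Trip report from Curaçao for the Smith account",
    "author": "Destin Joy",
    "ID": "1-23-456#7",
}

print(matches("DESTIN", item, schema))          # True: author is searchable
print(matches("curacao", item, schema))         # True: diacritics are folded
print(matches("author:Smith", item, schema))    # False: Smith is only in body
print(matches('ID:"1-23-456#7"', item, schema)) # True: exact content
print(matches('ID:"1-23"', item, schema))       # False: complete matching
```

In this model a plain query for "Smith" matches through the full-text index, while `author:Smith` does not, because the property query only looks at the content of the "author" property.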

 

Overview of crawled property settings
 

Name and information
  What it does: Name and description of the crawled property. This information about the crawled property is emitted by the filter or protocol handler.
  Available in: Central Administration

Mappings to managed properties
  What it does: Maps this crawled property to one or more managed properties.
  Available in: Central Administration

Include in full-text index
  What it does: Includes the content of this crawled property in the full-text index. This enables searching for the content of this crawled property without mapping it to a managed property. Use this setting if the content of this crawled property may be relevant for end-user queries, but you do not see a need for a managed property that contains this content.
    Note: Including unnecessary properties in the full-text index may adversely affect search relevance and performance.
  Example: If the crawled property is "reviewer", simple queries such as "Destin" will return both items containing the word "Destin" and items whose reviewer crawled property contains "Destin". When not enabled, you must specify a managed property mapping, and users must specify a property filter in the query (reviewer:destin) to find the same items.
  Available in: Central Administration
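A query with a property filter such as reviewer:destin can also be submitted programmatically. The following Python sketch only builds the request URL for the SharePoint 2013 Search REST endpoint (`/_api/search/query`) with its `querytext` and `selectproperties` parameters. The site URL and the helper name `build_search_url` are hypothetical, "reviewer" is assumed to be a queryable managed property mapped from the crawled property, and the sketch does not handle single quotes inside the query text or perform authentication.

```python
from typing import Sequence
from urllib.parse import quote


def build_search_url(
    site_url: str,
    query_text: str,
    select_properties: Sequence[str] = (),
) -> str:
    """Build a GET URL for the Search REST endpoint (illustrative helper)."""
    # String parameters are wrapped in single quotes and percent-encoded.
    url = f"{site_url.rstrip('/')}/_api/search/query?querytext='{quote(query_text)}'"
    if select_properties:
        # Only retrievable managed properties can be returned in results.
        joined = ",".join(select_properties)
        url += f"&selectproperties='{quote(joined)}'"
    return url


# Hypothetical site URL; "reviewer" is the example property from the text.
print(build_search_url(
    "http://intranet.example.com/",
    "reviewer:destin",
    ["Title", "Author"],
))
```

The resulting URL can be requested with any HTTP client that authenticates against the site. The properties listed in `selectproperties` need the Retrievable setting described earlier.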

 

Search schema tasks

The following table provides an overview of the most common tasks related to crawled and managed properties:
 

Create a new managed property
  Available in: Central Administration

Create a new resource-intensive managed property (refinable, sortable, or of a type other than text)
  Available in: Central Administration

Edit an existing managed property
  Available in: Central Administration

Map crawled properties to managed properties
  Available in: Central Administration

View crawled and managed properties and their mapping
  Available in: Central Administration

Edit crawled property categories
  Available in: Central Administration

View crawled property categories
  Available in: Central Administration


Manage the search schema in Central Administration

Create a new managed property

  1. Verify that the user account that is performing this procedure is an administrator for the Search Service Application.
  2. In SharePoint Server 2013 Central Administration, in the Application Management section, click Manage service applications.
  3. Click Search Service Application.
  4. On the Search Administration page, in the Quick Launch, under Queries and Results, click Search Schema.
  5. On the Managed Properties page, click New Managed Property.
  6. In the New Managed Property page, select the options that you want and then click OK.

Edit a managed property

  1. Verify that the user account that is performing this procedure is an administrator for the Search Service Application.
  2. In SharePoint 2013 Central Administration, in the Application Management section, click Manage service applications.
  3. Click Search Service Application.
  4. On the Search Administration page, in the Quick Launch, under Queries and Results, click Search Schema.
  5. Find the managed property that you want to edit in the list that is displayed on the Managed Properties page, or find it by typing the name of the managed property in the Filter box.
  6. Click the managed property or point to the managed property that you want to edit, click the arrow that appears, and then click Edit/Map property.
  7. Select the options that you want and then click OK.

Map a crawled property to a managed property

  1. Verify that the user account that is performing this procedure is an administrator for the Search Service Application.
  2. In SharePoint 2013 Central Administration, in the Application Management section, click Manage service applications.
  3. Click Search Service Application.
  4. On the Search Administration page, in the Quick Launch, under Queries and Results, click Search Schema.
  5. Do one of the following:
    Use the Managed Properties page:
    1. Find the managed property that you want to map to a crawled property in the list that is displayed on the Managed Properties page, or find it by typing the name of the managed property in the Filter box.
    2. Click the managed property or point to the managed property that you want to map, click the arrow that appears, and then click Edit/Map property.
    3. Add the mapping and optionally the order in the Mappings to crawled properties section.
    4. Click OK.

    Use the Crawled Properties page:
    1. Find the crawled property that you want to map to a managed property in the list that is displayed on the Crawled Properties page, or find it by typing the name of the crawled property in the Filter box.
    2. Click the crawled property or point to the crawled property that you want to map, click the arrow that appears, and then click Edit/Map property.
    3. Add the mapping in the Mappings to managed properties section.
    4. Click OK.

View crawled to managed property mappings

  1. Verify that the user account that is performing this procedure is an administrator for the Search Service Application.
  2. In SharePoint 2013 Central Administration, in the Application Management section, click Manage service applications.
  3. Click Search Service Application.
  4. On the Search Administration page, in the Quick Launch, under Queries and Results, click Search Schema.
  5. On the Managed Properties page, you will see an overview of all the managed properties, an overview of the settings on the managed properties and the crawled properties they are currently mapped to.

Manage crawled property categories

  1. Verify that the user account that is performing this procedure is an administrator for the Search Service Application.
  2. In SharePoint 2013 Central Administration, in the Application Management section, click Manage service applications.
  3. Click Search Service Application.
  4. On the Search Administration page, in the Quick Launch, under Queries and Results, click Search Schema.
  5. Click Categories.
  6. Point to the crawled property category that you want to edit, click the arrow that appears, and then click Edit category.
  7. Make the changes that you want and then click OK.
