SIGN UP MEMBER LOGIN:    
ARTICLE

Clustered and Non-Clustered index in SQL 2005

Posted by Sivaraman Dhamodaran Articles | SQL Server 2012 January 27, 2011
This article shows you how you can create Clustered and Non-Clustered Indexes. Also it explains how they work.
Reader Level:

1. Introduction

 

We all know that data entered in tables are persisted in the physical drive in the form of database files. Think about a table say Customer table (for any leading bank in India) that has around 16 Million records. When we try to retrieve records for two or three customers based on his customer id, all 16 million records are taken and compared to the supplied customer ids. Think about how much time that will take; if it is a web application and there are 25 to 30 customers that want to access their data through the internet. Does the database server search 16 Million x 30 records? The answer is no because all modern databases use the concept of an index.

 

2. What is Index

 

An index is a database object, which can be created on one or more columns (16 Max column combination). When creating the index will read the column(s) and forms a relevant data structure to minimize the number of data comparisons. The index will improve the performance of data retrieval and adds some overhead on data modification such as create, delete and modify.  So it depends how much data retrieval can be performed on table versus how much of DML (Insert, Delete and Update) Operations.

 

In this article we will see creation of an Index. The two sections below are taken from my previous article as it is required here. If your database has changes for the next two sections, you can directly go to the section 5.

 

3. First Create two tables

 

To explain these constraints we need two tables. First let us create these tables. Run the scripts below to create the tables. Copy and paste the code on the new Query Editor window then execute it.

 

CREATE TABLE Student(StudId smallint, StudName varchar(50), Class tinyint);

CREATE TABLE TotalMarks(StudentId smallint, TotalMarks smallint);

Go

 

Note that there are no constraints at present on these tables. We will add the constraints one by one.

 

4. Primary Key Constraint

 

A table column with this constraint is called the key column for the table. This constraint helps the table to make sure that the value is not repeated and also no null entries. We will mark the StudId column of the Student table as primary key. Follow these steps:

 

1.Right click the student table and click on the modify button

2. From the displayed layout select the StudId row by clicking the Small Square like button on the left side of the row.

3. Click on the Set Primary Key toolbar button to set the StudId column as primary key column.

 

Pic000C.JPG Now this column does not allow null values and duplicate values. You can try inserting values to violate these conditions and see what happens. A table can have only one Primary key. Multiple columns can participate on the primary key column. Then the uniqueness is considered among all the participant columns by combining their values.

 

5. Clustered Index

 

The primary key created for the StudId column will create a clustered index for the Studid column. A table can have only one clustered index on it.

 

When creating the clustered index, SQL server 2005 reads the Studid column and forms a Binary tree on it. This binary tree information is then stored separately in the disc. Expand the table Student and then expand the Indexes. You will see the following index created for you when the primary key is created:

 

000.jpg

 

With the use of the binary tree, now the search for the student based on the studid decreases the number of comparisons to a large amount. Let us assume that you had entered the following data in the table student:

 

000A.jpg

 

The index will form the below specified binary tree. Note that for a given parent, there are only one or two Childs. Left side will always have a lesser value and right side will always have a greater value when compared to parent. The tree can be constructed in the reverse way also. That is, left side higher and right side lower.

 

000B.JPG

 

 

Now let us assume that we had written a query like below:

 

Select * from student where studid = 103;

Select * from student where studid = 107;

 

Execution without index will return value for the first query after third comparison.

Execution without index will return value for the second query at eights comparison.

 

Execution of first query with index will return value at first comparison.
 

Execution of Second query with index will return the value at the third comparison. Look below:
 

1. Compare 107 vs 103 : Move to right node

2. Compare 107 vs 106 : Move to right node

3. Compare 107 vs 107 : Matched, return the record

 

If the numbers of records are less, you cannot see a difference. Now apply this technique with a Yahoo email user accounts stored in a table called say YahooLogin. Let us assume there are 33 million users around the world that have yahoo email id and that is stored in the YahooLogin. When a user logs in by giving the user name and password, the comparison required is 1 to 25, with the binary tree that is clustered index. Look at the above picture and guess yourself how fast you will reach into the level 25. Without Clustered index the comparison required is 1 to 33 millions.

 

Got the usage of Clustered index? Let us move to Non-Clustered index.

 

6. Non Clustered Index

 

A non-clustered index is useful for columns that have some repeated values. Say fox example AccountType column of a bank database may have 10 million rows. But, the distinct values of account type may be 10-15. A clustered index is automatically created when we create the primary key for the table. We need to take care of the creation of the non-clustered index.

 

Follow the steps below to create a Non-clustered index on our table Student based on the column class.

 

1. After expanding the Student table, right click on the Indexes. And click on the New Index.

 

001.jpg

 

2. From the displayed dialog type the index name as shown below and then click on the Add button to select the column(s) that participate in the index. Make sure the Index type is Non-Clustered.

 

002.jpg

 

3. In the select column dialog, place a check mark for the column class. This indicates that we need a non-clustered index for the column Student.Class. You can also combine more than one column to create the Index. Once the column is selected click on the OK button. You will return the dialog shown above with the selected column marked in blue. Our index has only one column. If you selected more than one column, using the MoveUp and MoveDown button you can change order of the indexed columns. When you are using the combination of columns always use the highly repeated column first and more unique columns down in the list. For example, let use assume the correct order for creating the Non-clustered index is:  Class, DateOfBirth, PlaceOfBirth

 

003.jpg

 

4. Click on the Index folder on the right side and you will the non-clustered index based on the column class is created for you.

 

004.jpg

 

7. How does a Non-Clustered Index work?

 

A table can have more than one Non-Clustered index. But, it should have only one clustered index that work based on the Binary tree concept. Non-Clustered column always depends on the Clustered column on the database.

 

This can be easily explained with the concept of a book and their index page at the end. Let us assume that you are going to a bookshop and find a big 1500 page C# book that says all about C#.  When you glanced at the book, it has all beautiful color pages and shiny papers. But, that is not the only eligibility for a good book right? Since you are impressed, you want to see your favorite topic, Regular Expressions and how they are explained in the book. What will you do? I just peeped at you from behind and recorded what you did as below:

 

1. You went to the Index page (It has a total 25 pages). It is already sorted and hence you easily picked up Regular Expressions that comes on the page Number 17.

2. Next you noted down the number displayed next to it which is 407, 816, 1200-1220

3. Your first target is Page 407. You opened a page in the middle, the page is greater than 500.

4. Then you moved to a somewhat lower page. But it still reads 310.

5. Then you moved to a higher page. You are very lucky you exactly got page 407. [Yes man you got it. Otherwise I need to write more. OK?]

6. That's all, you started exploring what is written about Regular expressions on that page, keeping in mind that you need to find page 816 also.

 

In the above scenario, the index page is like a Non-Clustered index and the page numbers are like clustered indexes arranged in a binary tree. See how you came to the page 407 very quickly. Your mind actually traversed the binary tree way left and right to reach the page 407 quickly.

 

Here, the class column with distinct values 1,2,3..12 will store the clustered index columns value along with it. Say for example, let us take only a class value of 1. The Index goes like this:

 

1: 100, 104, 105

 

So here, you can easily get all the records that has value for class = 1. Map this with the Book index example now. You will see all in the subsequent articles.

Login to add your contents and source code to this article
share this article :
post comment
 

Very well explained. Goog one...

Posted by Mohit Singla May 04, 2011

Thanks Man.

Posted by Sivaraman Dhamodaran Mar 24, 2011

thnx now i got the index concept.

Posted by Ankit Nandekar Mar 24, 2011

Thanks Man

Posted by Sivaraman Dhamodaran Mar 16, 2011

Nice article, I have getting proper concept about the index

Posted by shubhangi khandale Mar 16, 2011
6 Months Free & No Setup Fees ASP.NET Hosting!
Become a Sponsor
PREMIUM SPONSORS
  • Finally – a virtual platform that delivers next-generation Windows Server 2008 Hyper-V virtualization technology from a managed hosting partner you can truly depend on. Visit www.maximumasp.com/max for a FREE 30 day trial. Hurry offer ends soon. Climb aboard the MaxV platform and take advantage of High Availability, Intelligent Monitoring, Recurrent Backups, and Scalability – with no hassle or hidden fees. As a managed hosting partner focused solely on Microsoft technologies since 2000, MaximumASP is uniquely qualified to provide the superior support that our business is built on. Unparalleled expertise with Microsoft technologies lead to working directly with Microsoft as first to offer IIS 7 and SQL 2008 betas in a hosted environment; partnering in the Go Live Program for Hyper-V; and product co-launches built on WS 2008 with Hyper-V technology.
    Finally – a virtual platform that delivers next-generation Windows Server 2008 Hyper-V virtualization technology from a managed hosting partner you can truly depend on. Visit www.maximumasp.com/max for a FREE 30 day trial. Hurry offer ends soon. Climb aboard the MaxV platform and take advantage of High Availability, Intelligent Monitoring, Recurrent Backups, and Scalability – with no hassle or hidden fees. As a managed hosting partner focused solely on Microsoft technologies since 2000, MaximumASP is uniquely qualified to provide the superior support that our business is built on. Unparalleled expertise with Microsoft technologies lead to working directly with Microsoft as first to offer IIS 7 and SQL 2008 betas in a hosted environment; partnering in the Go Live Program for Hyper-V; and product co-launches built on WS 2008 with Hyper-V technology.
Become a Sponsor