SQL Coding Best Practices

In this article, we are going to learn the best practices while working with SQL. So,  let's start.

Do not name your procedures with the prefix ‘sp_’

The following script returns all system procedures; 1390 system procedures in this case and all starting with prefix ‘sp_’.

  1. SELECT QUOTENAME(SCHEMA_NAME(SCHEMA_ID))+'.'+QUOTENAME(NAME)) AS SystemProcedures  
  2. FROM sys.all_objects  
  3. WHERE TYPE='P' AND is_ms_shipped=1 

Coding Best Practices

Now, if we create procedures with the same prefix, then there are two things that will happen within the system.

  1. While executing your procedure, the system will first scan through all system procedures and then user-defined procedures. This means that the procedure might take more time for execution, thus decreasing system performance.
  2. If while scanning the system procedure, a match is found, i.e., the name of your procedure is the same as that of a system procedure, then the system procedure will always get executed.

Because of this, you will get an unexpected output which can be an exception, error, some system related information etc.; and you might never understand the reason behind this.

Consider the following example where we have created a procedure named sp_add_agent_parameter.

  1. CREATE PROCEDURE sp_add_agent_parameter  
  2. AS  
  3. BEGIN  
  4. SELECT 'Creating procedure with prefix ''sp_'''  
  5. END    

And, when we execute it, we get the output as:

Coding Best Practices

Seeing this result, we will never come to know why our procedure is not working.

NOTE

You can avoid both the problems mentioned above by executing the procedure as schema_name.procedure_name. But still, it is better to avoid using the prefix ‘sp_’.

Stop using select(*) 

Selecting all columns from a table becomes easy by using Select (*) because then, we don't need to list the column names individually. But using this approach can lead to some problems.

  1. Select (*) returns columns in the same order as defined in the table design. So, if the desired output needs the columns to be in some other order, using this approach won't work.

  2. Suppose, we have a table ‘Product’ having the following columns.

    Coding Best Practices
We write the following SQL code to copy table all those products whose ProductType is ‘Electronic’ from ‘Product’ table to ‘ElectronicProducts’ 

  1. INSERT INTO ElectronicProducts SELECT * FROM ProductMaster WHERE ProductType='Electric'    

This works perfectly fine but then sometime in the future, we add another column to ‘Product’ table and now, its structure is:

Coding Best Practices

And if now, if we run the above SQL code to copy the products, then it will throw the error:

Coding Best Practices

Instead, if we had written the script as:

  1. INSERT INTO ElectronicProducts   
  2. SELECT ProductID,ProductName,ProductType,ProductCost FROM ProductMaster WHERE ProductType='Electric'  

Then, it would have worked in all situations.

Never use column number with ‘order by’ clause

The ‘order by’ clause is used to sort the records in ascending or descending order. Either of the column numbers and column names can be used with the ‘order by’ clause.

Now, consider that we have a table ‘Employee’ having columns EmployeeID, EmployeeName, EmployeeSalary; and we want to find the Employee who earns the lowest. And for doing so we write the following script,

  1. SELECT TOP 1 * FROM Employee ORDER BY 3  
  1. By just looking at the script, we can never understand according to which column the records are being sorted. The person who is trying to understand the script needs to do an extra step of checking the table design to understand what column 3 is. To avoid this extra step and make it simple for anyone to understand the script it is better to use column name with ‘order by’.

  2. Consider a situation where due to some reason we need to add a new column ‘EmployeeDepartment’ to the ‘Employee’ table, but this column is to be added after the EmployeeName column. So now the table has columns in the order- EmployeeID, EmployeeName, EmployeeDepartment, EmployeeSalary.
Try running the same SQL script for sorting the records according to EmployeeSalary. The result that we would be getting won't be the correct one because the records are getting sorted according to the EmployeeDepartment and not EmployeeSalary.

Instead, if we had written column name with the ‘order by’ clause then modifying the table structure would not have caused an issue.

  1. SELECT TOP 1 * FROM Employee ORDER BY EmployeeSalary  
Even if the query is simple, think about performance

Suppose you are designing an Employee management system in which there is an option to search employees by their name.

You write the following script for searching and displaying employee details.

  1. SELECT EmployeeID, EmployeeName, EmployeeSalary, EmployeeDepartment FROM Employee WHERE EmployeeName='Jacob'    
This script is working fine now as the system has just been launched and there are much fewer records, maybe 1000. But now, assume that a few years have passed and the database contains millions of records; will this script give the same performance?

The answer is NO, as it takes more time to scan a table having millions of records than a table having just 1000 records. Therefore we should always think about future performance while writing a script even if it's an easy one.

In this case, for faster search, we should create the index on EmployeeName. The index can be created using the script.

  1. CREATE INDEX EmployeeNameIndex ON Employee(EmployeeName)  
Start using ‘nocount’

Consider the following procedure ‘InsertData’ , which is used to insert data into the Employee table.

  1. ALTER PROCEDURE [dbo].[InsertData] @employeeName varchar(30),  
  2. @employeeDepartment varchar(15),  
  3. @employeeSalary money  
  4. AS  
  5. BEGIN  
  6.   INSERT INTO Employee (EmployeeName, EmployeeDeprtment, EmployeeSalary)  
  7.     VALUES (@employeeName, @employeeDepartment, @employeeSalary)  
  8. END  
On executing this procedure, we get the output

Coding Best Practices

When a client sends a request to insert data; this output is stored in the packet DONE_IN_PROC and sent back to the client, but it is of no use on the client.

Also, sending this information leads to more network usage. To avoid this, we should use ‘nocount’.

It can be used in the above procedure as,

  1. ALTER PROCEDURE [dbo].[InsertData] @employeeName varchar(30),  
  2. @employeeDepartment varchar(15),  
  3. @employeeSalary money  
  4. AS  
  5. BEGIN  
  6.   SET NOCOUNT ON;  
  7.   INSERT INTO Employee (EmployeeName, EmployeeDeprtment, EmployeeSalary)  
  8.     VALUES (@employeeName, @employeeDepartment, @employeeSalary)  
  9. END  

If we do need to know the number of rows returned/affected, we can use @@ROWCOUNT with ‘nocount’.

Table aliases- good or bad?

To be honest, table aliases are both good and bad; it all depends on the way you use them. The following code snippet returns employee names with their respective boss names.

  1. SELECT e1.EmployeeName, e2.EmployeeName as BossName from Employee e1 inner join Employee as e2 ON e1.BossID=e2.EmployeeID  

Table aliases like this (e1 and e2) make it difficult for anyone to understand the code. Even table aliases should be meaningful for fast understanding and to improve the readability. Mostly, using table aliases should be avoided unless you need to do a self-join.

The above code can be re-written as follows to improve its readability.

  1. SELECT Employee .EmployeeName, Boss.EmployeeName as BossName from Employee inner join Employee as Boss ON Employee .BossID=Boss.EmployeeID  
 

X

Build smarter apps with Machine Learning, Bots, Cognitive Services - Start free.

Start Learning Now