In Focus

Sub Queries In SQL Server

In this write-up, we will learn about subqueries in SQL Server. Subqueries are also termed as Nested queries.

Subqueries are nothing but literally a technique of nesting one query within the other. If you make nested queries up to multiple levels, then it will be the worst case scenario in your project and will affect the performance. Subqueries are also termed as Nested queries.

There are many places where we can use the subqueries but the most common places are:

  • As a search condition in a WHERE clause (most common)
  • Like a Table in a FROM clause
  • As a search condition in a HAVING clause
  • As a column in the SELECT clause

Let’s see a simple example of Subquery.

  1. SELECT InvoiceId, CustomerId, Total  
  2. FROM Invoice  
  3. WHERE Total > (  
  4.         SELECT AVG(Total)  
  5.         FROM Invoice  
  6.     )  
  7. ORDER BY Total DESC  

In subqueries, we can use any number of tables like we’re doing here. We’re using the same table twice. So, you might get confused about the difference between Joins and Subqueries.

Comparison of Joins and Subqueries

Joins let you include columns from all the tables in the query. And subquery only includes the columns for the table in the outer query.

  1. SELECT InvoiceId, CustomerId, Total  
  2. FROM Invoice  
  3. WHERE Total > (  
  4.         SELECT AVG(Total)  
  5.         FROM Invoice  
  6.     )  
  7. ORDER BY Total DESC  

So the explanation of the above point is that when we’re working with joins, we can include any column of any table in the joins query. Whereas when we’re working with subqueries, we can just select the columns of the outer query, as we have here.

  1. SELECT InvoiceId, CustomerId, Total  
  2. FROM Invoice  
  3. WHERE Total > ...  

Joins are easier to use and their performance is faster, whereas, Subquery has this benefit over the join that we can pass the aggregate values to the outer query with the help of a subquery.

Let’s see an example.

  1. SELECT InvoiceId  
  2.        , InvoiceDate  
  3.        , Total  
  4. FROM Invoice  
  5. JOIN Customer  
  6. ON Invoice.CustomerId = Customer.CustomerId  
  7. WHERE Country = 'USA'  
  8. ORDER BY InvoiceDate DESC  

This is the join query. Now, let’s write the same query in a subquery.

  1. SELECT InvoiceId  
  2.        , InvoiceDate  
  3.        , Total  
  4. FROM Invoice  
  5. WHERE CustomerId IN (  
  6.         SELECT DISTINCT CustomerId  
  7.         FROM Customer  
  8.         WHERE Country = 'USA'  
  9.     )  
  10. ORDER BY InvoiceDate DESC  

Another example of Subquery is where we’re applying the subquery at WHERE clause.

  1. SELECT CustomerId, FirstName, LastName  
  2. FROM Customer  
  3. WHERE NOT EXISTS (  
  4.     SELECT * FROM Invoice  
  5.     WHERE Invoice.CustomerId = Customer.CustomerId  
  6. )  

And we can put the same query in a little bit more different way.

  1. SELECT CustomerId, FirstName, LastName  
  2. FROM Customer  
  3. WHERE CustomerId NOT IN (  
  4.     SELECT DISTINCT CustomerId  
  5.     FROM Invoice  
  6. )  

Co-related Queries

Co-related queries are a type of variation of subquery or nested query. Both (co-related queries and subqueries) have outer queries and inner queries. But the big difference is in co-related queries that our inner queries also referencing the outer queries table.

This is the most common interview question to find the 2nd most highest Salary. [Co-Related Query]

  1. SELECT Name, Salary   
  2. FROM Employees e1  
  3. WHERE 2 = (  
  4.     SELECT COUNT(e2.Salary)  
  5.     FROM Employees e2  
  6.     WHERE e2.Salary >= e1.Salary  
  7. )  

And if we write the same query in [Subquery].

  1. SELECT MAX(Salary)  
  2. FROM Employees  
  3. WHERE Salary NOT IN (  
  4.     SELECT MAX(Salary)  
  5.     FROM Employees  
  6. )  

Look here, in subquery, both inner query and outer query are independent of each other. There are no references.

Now, you might be wondering how this query works actually. So for demonstration, let's see the below code.

  1. SELECT * FROM Employees  

And we’ll get.

 Sub Queries In SQL Server

Now let’s take a look at this query.

  1. SELECT Name, Salary   
  2. FROM Employees e1  
  3. WHERE 2 = (  
  4.     SELECT COUNT(e2.Salary)  
  5.     FROM Employees e2  
  6.     WHERE e2.Salary >= e1.Salary  
  7. )  

Look in the inner query we’re referencing the outer query table.

So in the first iteration, let’s say e1,.Salary is 5000 and the check is WHERE e2.Salary is greater or equal to e1.Salary. So we’ll find 7 matches. And the Count (e2.Salary) will become 7 which is not equal to 2. It means 5000 is at the 7th position in our table.

Similarly, let’s suppose e1.Salary value is 6500 which is greater than and equal to 2 matches in the complete table. Count(e2.Salary) will become 2, it means it is the 2nd highest Salary.

Now let’s try one more time, e1.Salary is 5500. And we want to find the 3rd highest Salary. And obviously, it is the 3rd highest Salary.

  1. SELECT Name, Salary   
  2. FROM Employees e1  
  3. WHERE 3 = (  
  4.     SELECT COUNT(e2.Salary)  
  5.     FROM Employees e2  
  6.     WHERE e2.Salary >= e1.Salary  
  7. )  

Now, when the inner executes, it will find 4 matches of 5500 in the table which means Count is 4 containing (5500, 5500, 6500, 7000). And here is the problem, we need to count just different Salaries. So:

  1. SELECT Name, Salary   
  2. FROM Employees e1  
  3. WHERE 3 = (  
  4.     SELECT COUNT(DISTINCT e2.Salary)  
  5.     FROM Employees e2  
  6.     WHERE e2.Salary >= e1.Salary  
  7. )  

And now, our problem is solved. There is even a replication of Salaries multiple times in the table.