SQL- Self Join And Group By

This is an interview question. It seems very simple, but there are some tricks in it. Analyzing it can review some important concepts of SQL Query.
 
Question
 
Given a table with four columns,
  • Employees - table with columns,

    • Id
    • Name
    • Salary
    • ManagerId
Request
 
Get Manager Name, employee amount per each manager, and the total salaries for each team.
 

A- Set up Table and initial Data

 
Set up Table by the question,
  1. USE [TestDB]  
  2. GO  
  3.   
  4. /****** Object:  Table [dbo].[Employees]    Script Date: 5/28/2021 7:42:51 PM ******/  
  5. SET ANSI_NULLS ON  
  6. GO  
  7.   
  8. SET QUOTED_IDENTIFIER ON  
  9. GO  
  10.   
  11. CREATE TABLE [dbo].[Employees](  
  12.     [Id] [int] IDENTITY(1,1) NOT NULL,  
  13.     [Name] [nvarchar](maxNOT NULL,  
  14.     [Salary] [intNOT NULL,  
  15.     [ManagerId] [intNOT NULL,  
  16.  CONSTRAINT [PK_dbo.Employees] PRIMARY KEY CLUSTERED   
  17. (  
  18.     [Id] ASC  
  19. )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFFON [PRIMARY]  
  20. ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]  
  21. GO 
Insert initial Data
  1. insert dbo.Employees  
  2. Values  ('Greg', 100000, 1),  
  3.         ('George', 150000, 1),  
  4.         ('Helen', 130000, 1),  
  5.         ('Tom', 120000, 2),  
  6.         ('Kevin', 110000, 2),  
  7.         ('David', 120000, 3),  
  8.         ('Geek', 110000,3),  
  9.         ('Tesla', 120000,3) 
The result will be,
 

B- Self JOIN

 
This is a self-JOIN issue. A self-join is a regular join, but the table is joined with itself. We review the JOIN types below,
 

Different Types of SQL JOINs[ref]

 
Here are the different types of JOINs in SQL,
  • (INNER)JOIN- Returns records that have matching values in both tables
  • LEFT(OUTER)JOIN- Returns all records from the left table, and the matched records from the right table
  • RIGHT(OUTER)JOIN- Returns all records from the right table, and the matched records from the left table
  • FULL(OUTER)JOIN- Returns all records when there is a match in either left or right table
 
We make a self JOIN,
  • The left table is the employee table - dbo.Employees e, and
  • The right table is the manager table - dbo.Employees m,
  • Where the Employee table's ManagerId == Manager table's Id (EmployeeId)
  1. SELECT m.Name [Manager Name], e.Id [Employee ID], e.salary [Employee Salary]  
  2. FROM dbo.Employees e JOIN dbo.Employees m  
  3. ON e.ManagerId = m.Id 
We got the result,
 
Where the Manager Name is what we need, the employee ID and Salary were SELECT from employee table (left) which we need to count and sum.
 

C- Group by[ref]

 
To get count and sum, we need to use Group by Clause,
 

The SQL GROUP BY Statement

 
The GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country".

The GROUP BY statement is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result-set by one or more columns.

  1. SELECT m.Name [Manager Name], count(m.Id) [team amount], sum(e.salary) [Total Salary]  
  2. FROM dbo.Employees e JOIN dbo.Employees m  
  3. ON e.ManagerId = m.Id  
  4. group by m.name 
We groupby m.name, and use aggregate function Count() and Sum().
 
 
NOTE
When we select m.name, we have to group by m.name, otherwise, suppose we group by m.id (that is correct),
  1. SELECT m.Name [Manager Name], count(m.Id) [team amount], sum(e.salary) [Total Salary]  
  2. FROM dbo.Employees e JOIN dbo.Employees m  
  3. ON e.ManagerId = m.Id  
  4. group by m.id 
Then we got an error message for the selected m.name,
 
 
Msg 8120, Level 16, State 1, Line 1,

Column 'dbo.Employees.Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
 

Summary

 
This article discussed the SQL query major features - JOIN, and Group By.

Reference