SQL- Self Join And Group By

Introduction

This is an interview question. It seems very simple, but there are some tricks to it. Analyzing it can review some important concepts of SQL Query.

Question

Given a table with four columns,

  • Employees - table with columns, 
    • Id
    • Name
    • Salary
    • ManagerId

Request

Get the Manager's Name, the employee amount per manager, and the total salaries for each team.

A. Set up Table and initial Data

Set up the Table by the question,

USE [TestDB]   
GO  
  
    /****** Object:  Table [dbo].[Employees]    Script Date: 5/28/2021 7:42:51 PM ******/   SET ANSI_NULLS ON   
GO  
   SET QUOTED_IDENTIFIER ON   
GO  
   CREATE TABLE [dbo]
        .
[Employees](  
    [Id] [int] IDENTITY(1, 1) NOT NULL,  
    [Name] [nvarchar](max) NOT NULL,  
    [Salary] [int] NOT NULL,  
    [ManagerId] [int] NOT NULL,  
 CONSTRAINT [PK_dbo.Employees] PRIMARY KEY CLUSTERED    (  
    [Id] ASC   )WITH (
                PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON 
            [PRIMARY]   ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]   
GO 

Insert initial Data

insert dbo.Employees  
Values  ('Greg', 100000, 1),  
        ('George', 150000, 1),  
        ('Helen', 130000, 1),  
        ('Tom', 120000, 2),  
        ('Kevin', 110000, 2),  
        ('David', 120000, 3),  
        ('Geek', 110000,3),  
        ('Tesla', 120000,3) 

The result will be,

B. Self JOIN in SQL

This is a self-JOIN issue. A self-join is a regular join, but the table is joined with itself. We review the JOIN types below; find a detailed article about Joins in SQL here: Joins in SQL 

Different Types of SQL JOINs

Here are the different types of JOINs in SQL,

  • (INNER)JOIN. Returns records that have matching values in both tables
  • LEFT(OUTER)JOIN. Returns all records from the left table and the matched records from the right table
  • RIGHT(OUTER)JOIN. Returns all records from the right table, and the matched records from the left table
  • FULL(OUTER)JOIN. Returns all records when there is a match in the either the left or right table

We make a self-JOIN,

  • The left table is the employee table - dbo.Employees e, and
  • The right table is the manager table - dbo.Employees m,
  • Where the Employee table's ManagerId == Manager table's Id (EmployeeId)
    SELECT m.Name [Manager Name], e.Id [Employee ID], e.salary [Employee Salary]  
    FROM dbo.Employees e JOIN dbo.Employees m  
    ON e.ManagerId = m.Id 

     

We got the result,

Where the Manager's Name is what we need, the employee ID and Salary were SQL SELECT from the employee table (left) which we need to count and sum.

C. Group by in SQL

To get count and sum, we need to use Group by Clause,

The SQL GROUP BY Statement

The GROUP BY statement groups rows with the same values into summary rows, like "find the number of customers in each country."

The GROUP BY statement is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result set by one or more columns.

SELECT m.Name [Manager Name], count(m.Id) [team amount], sum(e.salary) [Total Salary]  
FROM dbo.Employees e JOIN dbo.Employees m  
ON e.ManagerId = m.Id  
group by m.name 

We groupby m.name and use aggregate functions Count() and Sum().

Note

When we select m.name, we have to group by m.nam; otherwise, suppose we group by m.id (that is correct),

SELECT m.Name [Manager Name], count(m.Id) [team amount], sum(e.salary) [Total Salary]  
FROM dbo.Employees e JOIN dbo.Employees m  
ON e.ManagerId = m.Id  
group by m.id 

Then we got an error message for the selected m.name,

Msg 8120, Level 16, State 1, Line 1,

Column 'dbo.Employees.Name' is invalid in the select list because it is not contained in an aggregate function or the GROUP BY clause.

Summary

This article discussed the major features of the SQL query - JOI and Group By.

Reference


Similar Articles