Pivot Tables And Reporting In SQL

Introduction

 
Quite often, there’s a need to change how data is displayed in a database so that the users could analyze it more easily. One of the cases of such transformation is transposing data from rows to columns in SQL Server.
 
Various analytical reports provide the ability to visualize data and speed up the process of making decisions based on the retrieved results.
 
Let’s look at these operations closer in an example of a database for a job recruitment service.
 

Problem statement

 
We need to quickly understand who qualifies for a position that requires knowledge of C# and PostgreSQL.
 
First, let’s define the selection criteria.
 
For simplicity’s sake, let’s assume that we need to search for two skills – C# and PostgreSQL. We’ll omit the other ones to not overcomplicate our example.
 
We’re interested in the Employee, JobHistory, Project, and Skill entities, as well as the ProjectSkill cluster.
 
We also need to keep in mind that the potential employees should have used the required skills (C#, PostgreSQL) in the last 3 years, i.e., their experience is relevant.
 
It is also important to display the use of each skill (technology) for a particular year. As we’re taking data from the last 3 years, we need to show the use of skills for 2017, 2018, and 2019. The current year is 2020 (and I wrote this article in January, so the year has just only begun) and we will only analyze data from the previous 3 years to make things simpler.
 
It’s possible that no potential employees from the database have the relevant PostgreSQL experience. In this case, after confirming this with the employer, we can include the Oracle skill into our search (this database is the closest one to PostgreSQL).
 
This is what the schema of our database for a job recruitment looks like:
 
Database schema
Fig.1 Database schema
 

Solution

 
Considering the starting conditions we described earlier and the description of the desired output, we get the following query:
  1. SELECT emp.[EmployeeID]  
  2.    ,emp.[LastName]  
  3.    ,emp.[FirstName]  
  4.    ,s.[SkillName]  
  5.    ,DATEDIFF(DAY, jh.[StartDate], jh.[FinishDate]) / (DATEDIFF(YEAR, jh.[StartDate], jh.[FinishDate]) + 1) AS [PeriodDay]  
  6.    ,CASE  
  7. WHEN ((2017 >= year(jh.[StartDate])) AND  
  8. (2018 > year(jh.[FinishDate]))) THEN 2017  
  9. WHEN ((2018 >= year(jh.[StartDate])) AND  
  10. (2019 > year(jh.[FinishDate]))) THEN 2018  
  11. WHEN ((2019 >= year(jh.[StartDate])) AND  
  12. (2020 > COALESCE(year(jh.[FinishDate]), year(GetDate())))) THEN 2019  
  13. END AS [Year]  
  14. FROM [Employee] AS emp  
  15. INNER JOIN [JobHistory] AS jh  
  16. ON emp.[EmployeeID] = jh.[EmployeeID]  
  17. INNER JOIN [Project] AS p  
  18. ON p.[ProjectID] = jh.[ProjectID]  
  19. INNER JOIN [ProjectSkill] AS ps  
  20. ON p.[ProjectID] = ps.[ProjectID]  
  21. INNER JOIN [Skill] AS s  
  22. ON s.[SkillID] = ps.[SkillID]  
  23. WHERE (jh.[FinishDate] >= DATEADD(YEAR, -3, GetDate())  
  24. OR (jh.[FinishDate] IS NULL));  
Here’s a possible result of this query,
 
 
 Fig.2 Pivot Table Example
 
This query provides the following information:
  1. EmployeeID – applicant’s ID
  2. LastName – applicant’s second name
  3. FirstName – applicant’s first name
  4. SkillName – skill (technology)
  5. PeriodDay – a distribution of the days in which the required skills were used for each year. It’s calculated in the [Year] column, omitting the non-working days (non-working days are included in the working periods)
  6. Year – the year to which a PeriodDay is related.
We have received the necessary data.
 
However, it’s not very simple to analyze data presented in this way.
 
Transposing data with PIVOT operator
 
For a more convenient visualization, let’s apply data transposition with the help of the PIVOT operator, and output the distribution of used skills for each year in separate columns,
  1. SELECT  
  2. [EmployeeID]  
  3.    ,[LastName]  
  4.    ,[FirstName]  
  5.    ,[SkillName]  
  6.    ,[2017]  
  7.    ,[2018]  
  8.    ,[2019]  
  9. FROM (SELECT  
  10. emp.[EmployeeID]  
  11.   ,emp.[LastName]  
  12.   ,emp.[FirstName]  
  13.   ,s.[SkillName]  
  14.   ,DATEDIFF(DAY, jh.[StartDate], jh.[FinishDate]) / (DATEDIFF(YEAR, jh.[StartDate], jh.[FinishDate]) + 1) AS [PeriodDay]  
  15.   ,CASE  
  16. WHEN ((2017 >= year(jh.[StartDate])) AND  
  17. (2018 > year(jh.[FinishDate]))) THEN 2017  
  18. WHEN ((2018 >= year(jh.[StartDate])) AND  
  19. (2019 > year(jh.[FinishDate]))) THEN 2018  
  20. WHEN ((2019 >= year(jh.[StartDate])) AND  
  21. (2020 > COALESCE(year(jh.[FinishDate]), year(GetDate())))) THEN 2019  
  22. END AS [Year]  
  23. FROM [Employee] AS emp  
  24. INNER JOIN [JobHistory] AS jh  
  25. ON emp.[EmployeeID] = jh.[EmployeeID]  
  26. INNER JOIN [Project] AS p  
  27. ON p.[ProjectID] = jh.[ProjectID]  
  28. INNER JOIN [ProjectSkill] AS ps  
  29. ON p.[ProjectID] = ps.[ProjectID]  
  30. INNER JOIN [Skill] AS s  
  31. ON s.[SkillID] = ps.[SkillID]  
  32. WHERE (jh.[FinishDate] >= DATEADD(YEAR, -3, GetDate())  
  33. OR (jh.[FinishDate] IS NULL))) AS SourceTable  
  34. PIVOT  
  35. (  
  36. AVG([PeriodDay])  
  37. FOR [YearIN ([2017], [2018], [2019])  
  38. AS PivotTable;  
Here’s a possible result of this query:

Fig.3 Transposing data with PIVOT operator
 
In the first four columns, the same information is displayed as in the previous query, and the last three columns show the distribution of used skills for years 2017, 2018, and 2019 respectively.
 
You can create this visualization in tools like SSRS and dbForge Studio for SQL Server. These solutions also allow you to create various reports.
 
Transposing data in dbForge Studio for SQL Server
 
Let’s take a closer look at how we can pivot tables in dbForge Studio for SQL Server.
 
To do this, first, click Pivot Table in the main menu or Report Designer in the Database menu.
 
 Navigating to the report feature
Fig.4 Navigating to the report feature
 
The first command allows you to pivot the data from the table, and the second one helps you create a user report.
 
First, let’s learn about the Pivot Table functionality.
 
Let’s click it and insert the first script in the bottom of the newly-opened window:
 
Refreshing the data with a user query
Fig.5 Refreshing the data with a user query
 
Now we need to press the Refresh button in the top left corner of the window. The result should be similar to the following:
 
Pivot Table in dbForge Studio for SQL Server
Fig.6 Pivot Table in dbForge Studio for SQL Server
 
We should remember what the buttons on the top left do – one refreshes the data according to a query, and Edit SQL Text allows you to change the query.
 
At the bottom, you can switch between data (Data button) and query text (Text button). With the help of the Pivot Table button, you can pivot the table. We’ll get back to it later.
 
You can also change data output using the following buttons:
 
Controlling the presentation of retrieved data
Fig.7 Controlling the presentation of retrieved data – 2
 
and
Controlling the presentation of retrieved data 
Fig.8 Controlling the presentation of retrieved data – 3
 
In the last example, we grouped the data by the first four columns.
 
You can also sort and group data using the context menu:
 
Context menu of the result data set window
Fig.9 Context menu of the result data set window
 
As you can see, you can set filtering, visibility of certain data, and other parameters in the context menu.
 
Now, let’s get back to pivoting our table. Click Pivot Table and select the necessary fields:
 
Configuring the data transposition
Fig.10 Configuring the data transposition
 
You need to select a column in the top right and the necessary command in the bottom right:
  1. Row Area – to a row
  2. Column Area – to a table
  3. Filter Area – to a filter
  4. Data Area – to data
  5. And then press Add To.
In our example, the Row Area command was selected for the following attributes:
  1. EmployeeID
  2. LastName
  3. FirstName
  4. SkillName
The Column Area command was selected for the Year column.
 
The Data Area command was selected for the PeriodDay attribute.
 

Creating a user report in dbForge Studio for SQL Server

 
Let’s now briefly discuss creating user reports in dbForge Studio for SQL Server.
 
First, go to Database\Report Designer:
 
Navigating to the user report functionality
Fig.11 Navigating to the user report functionality
 
In the window that will be opened, select the desired report type (in this case, it’s Standard Report) and click Next:
 
Selecting a report type
Fig.12 Selecting a report type
 
Now, select the connection and data type (in this case, Custom Query) and then click Next,
 
Configuring the report’s connection 
Fig.13 Configuring the report’s connection
 
Keep in mind that you can select the Simple Table/View data type instead of Custom Query if you need to build the report from one table or view. In this particular case, we need to select Custom Query as the report is built on the data from several tables.
 
You should also remember that the connection to a specific database should be established with SQL Server authentication, otherwise the query won’t work.
 
Let’s insert the necessary query and click Next:
 
User query for the report
Fig.14 User query for the report
 
If you click Design before clicking Next, Query Builder will be opened. With its help, you can visually create the query you need.
 
The Load button allows you to load a previously saved query script.
 
Next, choose all the columns needed for the report and press Next,
 
Selecting the fields on which you need to report 
Fig.15 Selecting the fields on which you need to report
 
Then, configure the grouping and click Next:
 
Configuring grouping for the report 
Fig.16 Configuring grouping for the report
 
Next, select the necessary aggregate functions and ignore the NULL-values as we don’t need them for data analysis,
 
Setting data aggregation for the report 
Fig.17 Setting data aggregation for the report
 
Here, we’ll leave default settings and only change the page orientation from portrait to landscape,
 
Configuring the report page 
Fig.18 Configuring the report page
 
We’ll leave the Bold style and click Next,
 
Configuring the report style 
Fig.19 Configuring the report style
 
Finally, let’s name the report and click Finish,
 
Specifying the report’s name 
Fig.20 Specifying the report’s name
 
You can always return to any of the previous steps by clicking Back and save the report as a .bat file with the help of the Save Command Line button or even cancel the report creation by clicking Cancel.
 
Here’s the resulting report project,
 
New user report 
Fig.21 New user report
 
Report results can be saved in various frequently-used formats like RDP, HTML, RTF, and others.
 
The report projects are saved in the RDB format and you can edit them as easily with dbForge Studio for SQL Server as you can RDL files in the Report Builder or in Visual Studio.
 

Conclusion

 
When there is a need to change how data is displayed in a database, we often need to transpose data from rows to columns, that is, to pivot tables. In this article, we provided a step-by-step guide on how to perform the PIVOT command in a T-SQL script and with dbForge Studio for SQL Server. The latter tool enables to set filtering as well as visibility of certain data and allows users to save reports in different widely-used formats including RDP, HTML, RTF, which makes it easier for users to edit.
 
Initially posted here.