How to Generate a CREATE TABLE Script For an Existing Table: Part 1

In this article, we learn how to generate a CREATE TABLE Script For an Existing Table.

SQL Server stores information about all objects and their properties as metadata that can be accessed through system views. In addition, some of the system views hold interesting nuances that can help to better understand how a DBMS works.

To see the system view body, just as for any other script object, the OBJECT_DEFINITION function is used: 

  1. PRINT OBJECT_DEFINITION(OBJECT_ID('sys.objects'))  

However, OBJECT_DEFINITION, as well as its analogue sp_helptext, has a significant disadvantage; it does not allow the return script description for a table object. 

  1. IF OBJECT_ID('dbo.Table1''U'IS NOT NULL  
  2.   DROP TABLE dbo.Table1  
  3. GO  
  4.   
  5. CREATE TABLE dbo.Table1 (ColumnID INT PRIMARY KEY)  
  6. GO  
  7.   
  8. EXEC sys.sp_helptext 'dbo.Table1'  
  9. SELECT OBJECT_DEFINITION(OBJECT_ID('dbo.Table1''U'))  
When executing sp_helptext, we will get the following error:

Msg 15197, Level 16, State 1, Procedure sp_helptext, Line 107
There is no text for object 'dbo.Table1'.

Under the same conditions, a system function OBJECT_DEFINITION returns NULL.

Fetching from sys.sql_modules will also not solve the problem, since the same old OBJECT_DEFINITION function call is used inside this system view:
  1. CREATE VIEW sys.sql_modules AS  
  2.     SELECT object_id = o.id,  
  3.         definition = object_definition(o.id),  
  4.         ...  
  5.     FROM sys.sysschobjs o  
Such behavior is rather disappointing. Sometimes it is neecessary to retrieve a script description of a table for some scripts. Well, let’s look at system views and create an OBJECT_DEFINITION function analogue for working with table objects.

To start, let's create a test table, in order for the process of script writing to be more clear: 
  1. IF OBJECT_ID('dbo.WorkOut''U'IS NOT NULL  
  2.     DROP TABLE dbo.WorkOut  
  3. GO  
  4.   
  5. CREATE TABLE dbo.WorkOut   
  6. (  
  7.     WorkOutID BIGINT IDENTITY(1,1) NOT NULL,  
  8.     TimeSheetDate AS DATEADD(DAY, -(DAY(DateOut) - 1), DateOut),  
  9.     DateOut DATETIME NOT NULL,  
  10.     EmployeeID INT NOT NULL,  
  11.     IsMainWorkPlace BIT NOT NULL DEFAULT 1,  
  12.     DepartmentUID UNIQUEIDENTIFIER NOT NULL,  
  13.     WorkShiftCD NVARCHAR(10) NULL,  
  14.     WorkHours REAL NULL,  
  15.     AbsenceCode VARCHAR(25) NULL,  
  16.     PaymentType CHAR(2) NULL,  
  17.     CONSTRAINT PK_WorkOut PRIMARY KEY CLUSTERED (WorkOutID)  
  18. )  
  19. GO  
And proceed to the first step, getting a list of columns and their properties.

Essentially, the list of columns can be obtained by simply referencing one of the several system views. Thus, it is important to fetch from the simplest system views, in order for the query execution time to be minimal.

Here are a few examples along with their execution plans made in dbForge Studio for SQL Server

  1. --#1  
  2. SELECT *  
  3. FROM INFORMATION_SCHEMA.COLUMNS c  
  4. WHERE c.TABLE_SCHEMA = 'dbo'  
  5.     AND c.TABLE_NAME = 'WorkOut'  
 
  
  1. --#2  
  2. SELECT c.*   
  3. FROM sys.columns c WITH(NOLOCK)  
  4. JOIN sys.tables t WITH(NOLOCK) ON c.[object_id] = t.[object_id]  
  5. JOIN sys.schemas s WITH(NOLOCK) ON t.[schema_id] = s.[schema_id]  
  6. WHERE t.name = 'WorkOut'  
  7.     AND s.name = 'dbo'   
 
  1. --#3  
  2. SELECT *   
  3. FROM sys.columns c WITH(NOLOCK)  
  4. WHERE OBJECT_NAME(c.[object_id]) = 'WorkOut'  
  5.     AND OBJECT_SCHEMA_NAME(c.[object_id]) = 'dbo'  
 
  1. --#4  
  2. SELECT *  
  3. FROM sys.columns c WITH(NOLOCK)  
  4. WHERE c.[object_id] = OBJECT_ID('dbo.WorkOut''U')  
 

The presented plans show that the #1 and #2 approaches contain excessive amount of connections that will increase the query execution time, while the #3 approach leads to the complete scan of the index, making it the least efficient of all.

In terms of performance, the #4 approach remains the most attractive to me.

However, data contained in sys.columns (as well as in INFORMATION_SCHEMA.COLUMNS) is not enough to completely describe the table structure. This forces joins to other system views to be established as in the following:

  1. SELECT   
  2.       c.name  
  3.     , [type_name] = tp.name  
  4.     , type_schema_name = s.name  
  5.     , c.max_length  
  6.     , c.[precision]  
  7.     , c.scale  
  8.     , c.collation_name  
  9.     , c.is_nullable  
  10.     , c.is_identity  
  11.     , ic.seed_value  
  12.     , ic.increment_value      
  13.     , computed_definition = cc.[definition]  
  14.     , default_definition = dc.[definition]  
  15. FROM sys.columns c WITH(NOLOCK)  
  16. JOIN sys.types tp WITH(NOLOCK) ON c.user_type_id = tp.user_type_id  
  17. JOIN sys.schemas s WITH(NOLOCK) ON tp.[schema_id] = s.[schema_id]  
  18. LEFT JOIN sys.computed_columns cc WITH(NOLOCK) ON   
  19.         c.[object_id] = cc.[object_id]   
  20.     AND c.column_id = cc.column_id  
  21. LEFT JOIN sys.identity_columns ic WITH(NOLOCK) ON   
  22.         c.[object_id] = ic.[object_id]   
  23.     AND c.column_id = ic.column_id  
  24. LEFT JOIN sys.default_constraints dc WITH(NOLOCK) ON dc.[object_id] = c.default_object_id  
  25. WHERE c.[object_id] = OBJECT_ID('dbo.WorkOut''U')  
Accordingly, the execution plan will look not so optimistic, as before. Note that the column list is even read out 3 times:
 
 

Have a look inside sys.default_constraints:

  1. ALTER VIEW sys.default_constraints AS  
  2.     SELECT name, object_id, parent_object_id,  
  3.         ...  
  4.         object_definition(object_id) AS definition,  
  5.         is_system_named  
  6.     FROM sys.objects$  
  7.     WHERE type = 'D ' AND parent_object_id > 0  

There is an OBJECT_DEFINITION call inside the system view. So, to retrieve the description of the default constraint, we don’t need to establish joining.

OBJECT_DEFINITION is still used in sys.computed_columns:

  1. ALTER VIEW sys.computed_columns AS  
  2.     SELECT object_id = id,  
  3.         name = name,  
  4.         column_id = colid,  
  5.         system_type_id = xtype,  
  6.         user_type_id = utype,  
  7.         ...  
  8.         definition = object_definition(id, colid),  
  9.         ...  
  10.     FROM sys.syscolpars  
  11.     WHERE number = 0  
  12.         AND (status & 16) = 16 -- CPM_COMPUTED  
  13.         AND has_access('CO', id) = 1  
We seem to have already avoided 2 joins. The case with sys.identity_columns is more curious:  
  1. ALTER VIEW sys.identity_columns AS  
  2.     SELECT object_id = id,  
  3.         name = name,  
  4.         column_id = colid,  
  5.         system_type_id = xtype,  
  6.         user_type_id = utype,  
  7.         ...  
  8.         seed_value = IdentityProperty(id, 'SeedValue'),  
  9.         increment_value = IdentityProperty(id, 'IncrementValue'),  
  10.         last_value = IdentityProperty(id, 'LastValue'),  
  11.         ...  
  12.     FROM sys.syscolpars  
  13.     WHERE number = 0 -- SOC_COLUMN  
  14.         AND (status & 4) = 4 -- CPM_IDENTCOL  
  15.         AND has_access('CO', id) = 1  

To retrieve information about IDENTITY properties, an undocumented property IDENTITYPROPERTY is used. After a check, its unchanging behavior on SQL Server 2005 and higher was ascertained.

As a result of calling these functions directly, the column list obtaining query becomes significantly simplified:

  1. SELECT   
  2.       c.name  
  3.     , [type_name] = tp.name  
  4.     , type_schema_name = s.name  
  5.     , c.max_length  
  6.     , c.[precision]  
  7.     , c.scale  
  8.     , c.collation_name  
  9.     , c.is_nullable  
  10.     , c.is_identity  
  11.     , seed_value = CASE WHEN c.is_identity = 1 THEN IDENTITYPROPERTY(c.[object_id], 'SeedValue'END  
  12.     , increment_value = CASE WHEN c.is_identity = 1 THEN IDENTITYPROPERTY(c.[object_id], 'IncrementValue'END    
  13.     , computed_definition = OBJECT_DEFINITION(c.[object_id], c.column_id)  
  14.     , default_definition = OBJECT_DEFINITION(c.default_object_id)  
  15. FROM sys.columns c WITH(NOLOCK)  
  16. JOIN sys.types tp WITH(NOLOCK) ON c.user_type_id = tp.user_type_id  
  17. JOIN sys.schemas s WITH(NOLOCK) ON tp.[schema_id] = s.[schema_id]  
  18. WHERE c.[object_id] = OBJECT_ID('dbo.WorkOut''U')  

And the execution plan becomes more efficient:

 
 

Finally, instead of joining to sys.schemas, the SCHEMA_NAME system function can be called, that triggers much more faster than JOIN. This is true, provided that the number of schemes does not exceed the number of user objects. And since such a situation is unlikely, it can be neglected.

Next, get a list of columns included in the primary key. The most obvious approach is to use sys.key_constraints

  1. SELECT   
  2.       pk_name = kc.name  
  3.     , column_name = c.name  
  4.     , ic.is_descending_key   
  5. FROM sys.key_constraints kc WITH(NOLOCK)  
  6. JOIN sys.index_columns ic WITH(NOLOCK) ON   
  7.         kc.parent_object_id = ic.object_id   
  8.     AND ic.index_id = kc.unique_index_id  
  9. JOIN sys.columns c WITH(NOLOCK) ON   
  10.         ic.[object_id] = c.[object_id]  
  11.     AND ic.column_id = c.column_id  
  12. WHERE kc.parent_object_id = OBJECT_ID('dbo.WorkOut''U')  
  13.     AND kc.[type] = 'PK'  
 
 
 
In most cases, PRIMARY KEY is a clustered index and the Unique constraint.

At the metadata level, SQL Server sets index_id to 1 for all clustered indexes, so we can make a selection from sys.indexes filtering by is_primary_key = 1 or by index_id = 1 (not recommended).

Additionally, to avoid joining to sys.columns, the COL_NAME system function can be used:

  1. SELECT   
  2.       pk_name = i.name  
  3.     , column_name = COL_NAME(ic.[object_id], ic.column_id)  
  4.     , ic.is_descending_key  
  5. FROM sys.indexes i WITH(NOLOCK)  
  6. JOIN sys.index_columns ic WITH(NOLOCK) ON   
  7.         i.[object_id] = ic.[object_id]   
  8.     AND i.index_id = ic.index_id  
  9. WHERE i.is_primary_key = 1  
  10.     AND i.[object_id] = object_id('dbo.WorkOut''U')  
 
 

Now combine the obtained queries into one query to get the following final query:

  1. DECLARE  
  2.       @object_name SYSNAME  
  3.     , @object_id INT  
  4.     , @SQL NVARCHAR(MAX)  
  5.   
  6. SELECT  
  7.       @object_name = '[' + OBJECT_SCHEMA_NAME(o.[object_id]) + '].[' + OBJECT_NAME([object_id]) + ']'  
  8.     , @object_id = [object_id]  
  9. FROM (SELECT [object_id] = OBJECT_ID('dbo.WorkOut''U')) o  
  10.   
  11. SELECT @SQL = 'CREATE TABLE ' + @object_name + CHAR(13) + '(' + CHAR(13) + STUFF((  
  12.     SELECT CHAR(13) + '    , [' + c.name + '] ' +   
  13.         CASE WHEN c.is_computed = 1  
  14.             THEN 'AS ' + OBJECT_DEFINITION(c.[object_id], c.column_id)  
  15.             ELSE   
  16.                 CASE WHEN c.system_type_id != c.user_type_id   
  17.                     THEN '[' + SCHEMA_NAME(tp.[schema_id]) + '].[' + tp.name + ']'   
  18.                     ELSE '[' + UPPER(tp.name) + ']'   
  19.                 END  +   
  20.                 CASE   
  21.                     WHEN tp.name IN ('varchar''char''varbinary''binary')  
  22.                         THEN '(' + CASE WHEN c.max_length = -1   
  23.                                         THEN 'MAX'   
  24.                                         ELSE CAST(c.max_length AS VARCHAR(5))   
  25.                                     END + ')'  
  26.                     WHEN tp.name IN ('nvarchar''nchar')  
  27.                         THEN '(' + CASE WHEN c.max_length = -1   
  28.                                         THEN 'MAX'   
  29.                                         ELSE CAST(c.max_length / 2 AS VARCHAR(5))   
  30.                                     END + ')'  
  31.                     WHEN tp.name IN ('datetime2''time2''datetimeoffset')   
  32.                         THEN '(' + CAST(c.scale AS VARCHAR(5)) + ')'  
  33.                     WHEN tp.name = 'decimal'  
  34.                         THEN '(' + CAST(c.[precisionAS VARCHAR(5)) + ',' + CAST(c.scale AS VARCHAR(5)) + ')'  
  35.                     ELSE ''  
  36.                 END +  
  37.                 CASE WHEN c.collation_name IS NOT NULL AND c.system_type_id = c.user_type_id   
  38.                     THEN ' COLLATE ' + c.collation_name  
  39.                     ELSE ''  
  40.                 END +  
  41.                 CASE WHEN c.is_nullable = 1   
  42.                     THEN ' NULL'  
  43.                     ELSE ' NOT NULL'  
  44.                 END +  
  45.                 CASE WHEN c.default_object_id != 0   
  46.                     THEN ' CONSTRAINT [' + OBJECT_NAME(c.default_object_id) + ']' +   
  47.                          ' DEFAULT ' + OBJECT_DEFINITION(c.default_object_id)  
  48.                     ELSE ''  
  49.                 END +   
  50.                 CASE WHEN cc.[object_id] IS NOT NULL   
  51.                     THEN ' CONSTRAINT [' + cc.name + '] CHECK ' + cc.[definition]  
  52.                     ELSE ''  
  53.                 END +  
  54.                 CASE WHEN c.is_identity = 1   
  55.                     THEN ' IDENTITY(' + CAST(IDENTITYPROPERTY(c.[object_id], 'SeedValue'AS VARCHAR(5)) + ',' +   
  56.                                     CAST(IDENTITYPROPERTY(c.[object_id], 'IncrementValue'AS VARCHAR(5)) + ')'   
  57.                     ELSE ''   
  58.                 END   
  59.         END  
  60.     FROM sys.columns c WITH(NOLOCK)  
  61.     JOIN sys.types tp WITH(NOLOCK) ON c.user_type_id = tp.user_type_id  
  62.     LEFT JOIN sys.check_constraints cc WITH(NOLOCK)   
  63.          ON c.[object_id] = cc.parent_object_id   
  64.         AND cc.parent_column_id = c.column_id  
  65.     WHERE c.[object_id] = @object_id  
  66.     ORDER BY c.column_id  
  67.     FOR XML PATH(''), TYPE).value('.''NVARCHAR(MAX)'), 1, 7, '      ') +   
  68.     ISNULL((SELECT '  
  69.     , CONSTRAINT [' + i.name + 'PRIMARY KEY ' +   
  70.     CASE WHEN i.index_id = 1   
  71.         THEN 'CLUSTERED'   
  72.         ELSE 'NONCLUSTERED'   
  73.     END +' (' + (  
  74.     SELECT STUFF(CAST((  
  75.         SELECT ', [' + COL_NAME(ic.[object_id], ic.column_id) + ']' +  
  76.                 CASE WHEN ic.is_descending_key = 1  
  77.                     THEN ' DESC'  
  78.                     ELSE ''  
  79.                 END  
  80.         FROM sys.index_columns ic WITH(NOLOCK)  
  81.         WHERE i.[object_id] = ic.[object_id]  
  82.             AND i.index_id = ic.index_id  
  83.         FOR XML PATH(N''), TYPE) AS NVARCHAR(MAX)), 1, 2, '')) + ')'  
  84.     FROM sys.indexes i WITH(NOLOCK)  
  85.     WHERE i.[object_id] = @object_id  
  86.         AND i.is_primary_key = 1), '') + CHAR(13) + ');'  
  87.   
  88. PRINT @SQL  

Which, when executed, will generate the following script for the test table:

  1. CREATE TABLE [dbo].[WorkOut]  
  2. (  
  3.       [WorkOutID] [BIGINTNOT NULL IDENTITY(1,1)  
  4.     , [TimeSheetDate] AS (dateadd(day, -(datepart(day,[DateOut])-(1)),[DateOut]))  
  5.     , [DateOut] [DATETIME] NOT NULL  
  6.     , [EmployeeID] [INTNOT NULL  
  7.     , [IsMainWorkPlace] [BITNOT NULL CONSTRAINT [DF__WorkOut__IsMainW__52442E1F] DEFAULT ((1))  
  8.     , [DepartmentUID] [UNIQUEIDENTIFIER] NOT NULL  
  9.     , [WorkShiftCD] [NVARCHAR](10) COLLATE Cyrillic_General_CI_AS NULL  
  10.     , [WorkHours] [REALNULL  
  11.     , [AbsenceCode] [VARCHAR](25) COLLATE Cyrillic_General_CI_AS NULL  
  12.     , [PaymentType] [CHAR](2) COLLATE Cyrillic_General_CI_AS NULL  
  13.     , CONSTRAINT [PK_WorkOut] PRIMARY KEY CLUSTERED ([WorkOutID])  
  14. );  

As you can see, the topic is too broad and it is not limited to a column list and primary key.

That's why generation of indexes, foreign keys, and other statements are planned to be revealed in the next part of this topic.