Combination Of Id-ParentId And HierarchyId Approaches To Hierarchical Data

Introduction to Hierarchical and alphabetical sorting

To sort the data hierarchically and alphabetically means sorting a tree using depth-first search algorithm and selecting nodes with the same parent in the alphabetical order of their names.

The hierarchical sorting along with the level of nesting is a good way to emulate an expanded tree or a table of a book's contents using just a flat list of objects without the necessity to build a tree of objects.

Just fetch the records, sort them hierachically and alphabetically, indent each record n times ( where n is the level of nesting), and you'll get the expanded tree for UI and reports.

Two approaches

There are several approaches to structure a relational database to store the hierarchical data. The one, which is being used more frequently, is the Id-ParentId approach.

The Id-ParentId approach has a few drawbacks that make it necessary to seek alternative methods of storing hierarchical data.

The most difficult and expensive tasks in Id-ParentId approach are those which require recursive methods. For example -

  • find all descendants,
  • find all ancestors,
  • calculate the level of nesting,
  • sort flat records hierarchically and then alphabetically.

As a response to these difficulties, Microsoft added the HierarchyId data type to SQL Server since version 2008.

The HierarchyId data type,

  • helps find descendants and ancestors without recursion,
  • has fast GetLevel() method,
  • contains the information about a node position in hierarchy,
  • contains the information about a node order within its parent children.

If  you want to build a clustered index on HierarchyId, then all the node-records will physically be stored in the order in which they are most frequently fetched.

But the HierarchyId data type still has drawbacks, which are the trade-offs for its useful properties,

  • Impossibility to use HierarchyId as a foreign key to a parent, i.e. self-referencing; (actially a foreign key to a parent can be substituted with PERSISTED REFERENCES as described in this article, but all the pros and cons of such a solution are waiting for their researchers)
  • Difficult ways to generate HierarchyId for added and changed records to keep required order, SQL Server does not do this automatically, it is up to programmer to manage it.
  • Difficult use of the HierarchyId as a primary and foreign key: it may regularly change,
  • It has no corresponding type in JavaScript (for example).

So what to choose: Id-ParentId or HierarchyId?

An acceptable solution to this dilemma of choice may be the combination of two approaches: Id-ParentId and HierarchyId.

This combination will allow us to,

  • Use Id as an integer, ever increasing and unchangeable primary key,
  • Use ParentId as a self-referencing foreign key,
  • Use HierrarchyId to keep the records physically sorted in hierarchical and alphabetical order.

This article describes the way to unite the both approaches and ensure their normal operation.

Some Facts about HierarchyId

Column and field names in this article and source code,

  • Hid — column of HierarchyId data type
  • HidPath — string column for human-readable path of HierarchyId
  • HidCode — a binary presentation of HierarchyId converted to a string and without leading 0x.

To human-readable format

The HierarchyId data type is a materialized path encoded into binary format. It can be decoded to human-readable string path with ToString() method,

  1. select Hid, Hid.ToString() as HidPath, Hid.GetLevel() as Level from Folders order by Hid;

  2. -- Note: ToString() and GetLevel() are case sensitive!!!
Results----------------------------------- Hid HidPath Level ---------- ------------- -------- 0x / 0
0x58 /1/ 1
0x5AC0 /1/1/ 2
0x5AD6 /1/1/1/ 3
0x5AD6B0 /1/1/1/1/ 4
0x5AD6B580 /1/1/1/1/1/ 5
0x5AD6B680 /1/1/1/1/2/ 5
0x5AD6B780 /1/1/1/1/3/ 5
0x5AD6D0 /1/1/1/2/ 4
0x5AD6D580 /1/1/1/2/1/ 5
0x5AD6D680 /1/1/1/2/2/ 5
0x5AD6D780 /1/1/1/2/3/ 5
...

From human-readable format

The opposite operation from string path to binary format works also.

  1. declare @Hid hierarchyid = '/1/1/1/1/3/';  
  2. select @Hid  

Results
------------
0x5AD6B780

The above is the implicit conversion. HierarchyId has the explicit Parse() method, although a call format is a little strange.

  1. declare @Hid hierarchyid = hierarchyid::Parse('/1/1/1/1/3/'); 
  2. -- Parse() is case sensitive!!!  
  3. select @Hid  

Results
------------
0x5AD6B780

Tricky numbering system of paths

The interesting thing about the HierarchyId is the way how it manages the numeration of a node inserted between two other nodes.

To get a Hid for a new node, HierarchyId uses GetDescendant() method from parent Hid and uses two Hids of nodes between which the new has to be inserted.

  1. declare @Parent hierarchyid = 0x;  
  2.   
  3. print @Parent.GetDescendant('/1/''/2/').ToString()     -- /1.1/  
  4. print @Parent.GetDescendant('/1/''/1.1/').ToString()   -- /1.0/  
  5. print @Parent.GetDescendant('/1/''/1.0/').ToString()   -- /1.-1/  
  6. print @Parent.GetDescendant('/1/''/1.-1/').ToString()  -- /1.-2/  
  7.   
  8. print @Parent.GetDescendant('/1.1/''/2/').ToString()   -- /1.2/  
  9. print @Parent.GetDescendant('/1.2/''/2/').ToString()   -- /1.3/  
  10. print @Parent.GetDescendant('/1.3/''/2/').ToString()   -- /1.4/  
  11.   
  12. print @Parent.GetDescendant('/1.3/''/1.4/').ToString() -- /1.3.1/  
  13.   
  14. print @Parent.GetDescendant('/1.2.3.4.5.6.7.8/''/1.2.3.4.5.6.7.9/').ToString() 
    -- /1.2.3.4.5.6.7.8.1/  
  15.   
  16.   
  17. -- by the way  
  18.   
  19. declare @Hid hierarchyid = '/1.2.3.4.5.6.7.8.1/';  
  20. select @Hid; -- 0x63A08A49A85258  
  21.   
  22. declare @Hid hierarchyid = '/-1.-2.-3.-4.-5.-6.-7.-8.-1234567890/';  
  23. select @Hid; -- 0x41F8F87A3C1D8E87216D9A81A73A  
  24.   
  25.   
  26. -- special cases with null  
  27.   
  28. print @Parent.GetDescendant(nullnull).ToString()    -- /1/  
  29. print @Parent.GetDescendant('/1/'null).ToString()   -- /2/  
  30. print @Parent.GetDescendant(null'/1/').ToString()   -- /0/  

Why do we need binary encoding?

If we can build a string path, why do we need this HierarchyId binary encoding?

Could we just sort the hierarchical data by this string path?

Look at this example,

  1. select hierarchyid::Parse('/1/'as Hid, '/1/' as HidPath union all  
  2. select hierarchyid::Parse('/2/')       , '/2/'            union all  
  3. select hierarchyid::Parse('/10/')      , '/10/'   
  4. order by HidPath;  

Results
--------------
Hid HidPath
----- --------
0x58 /1/
0xAA /10/
0x68 /2/

The same query but ordered by Hid makes the right sorting:

Results
--------------
Hid HidPath ----- -------- 0x58 /1/ 0x68 /2/ 0xAA /10/

Microsoft uses the sophisticated algorithm of HierarchyId to encode the string path so that it can be used for hierarchical sorting just by the Hid column value.

The Solution

Master and slave

So, let's combine Id-ParentId self referencing approach with HierarchyId.

The Id-ParentId part will be responsible for Id primary key, self-reference ParentId foreign key and be the leading, or master. The HierarchyId part will be a calculated field, or slave.

Of course, Hid as a persistent calculated field is the denormalization. But this is a conscious step and a compromise for the advantages of HierarchyId.

The best place and moment to calculate the HierarchyId and keep it in coordinated state is a stored procedure and while saving the node.

User Catalog for experiments

Let's imagine we have a multi-user application where each of user keeps its own Catalog. This Catalog keeps information in hierarchical Folders.

The table for these Folders might look like this,

  1. create table dbo.Folders  
  2. (      
  3.     UserId    int           not null    ,  
  4.     Hid       hierarchyid   not null    ,  
  5.     Id        int           not null    identity,  
  6.     ParentId  int               null    ,  
  7.     Name      nvarchar(50)  not null    ,  
  8.   
  9.     constraint CU_Folders unique clustered (UserId asc, Hid asc),  
  10.   
  11.     constraint PK_Folders primary key nonclustered (Id asc),      
  12.   
  13.     constraint FK_Folders_UserId foreign key (UserId) references dbo.Users (Id),  
  14.   
  15.     constraint FK_Folders_ParentId foreign key (ParentId) references dbo.Folders (Id)  
  16.       
  17.     constraint CH_Folders_ParentId check (Hid = 0x and ParentId is null or Hid <> 0x and ParentId is not null)  
  18. );  

Note, that the clustered index is build on UserId and Hid, but the primary key is build on Id column.

This allows to keep all the records of the user's folders hierarchically sorted physically. And, at the same time, use the integer primary key as a foreign key for another tables and DTOs.

It is well known, that the hierarchy built on HierarchyId may have one root only.

Because the clustered key is the combination of UserId and Hid columns, a table dbo.Folders may have one root for each user.

The root node is mandatory, every user's catalog must have one, so the root is mostly a system record that is not edited by users.

Once the root node is a system record, every other nodes must have a parent and ParentId must be not null.

The best place to make calculation

As Microsoft states it here: "It is up to the application to generate and assign HierarchyId values in such a way that the desired relationship between rows is reflected in the values."

The best way to create and keep integrity and consistency of hierarchical data in our case is always use the same stored procedure to insert and update a Folder node.

A user-defined table type that represents saving Folder may serve as a parameter to the SaveFolder stored procedure,

  1. create type dbo.FolderTableType as table  
  2. (  
  3.     Id        int           not null  primary key clustered,  
  4.     ParentId  int           not null  ,  
  5.     Name      nvarchar(50)  not null  
  6. );  

And here is the SaveFolder stored procedure itself,

  1. create procedure dbo.SaveFolder  
  2.     @Folder dbo.FolderTableType readonly  
  3. as  
  4. begin  
  5.     set nocount on;  
  6.           
  7.     begin -- variable declaration   
  8.       
  9.         declare   
  10.             @ParamId         int         ,  
  11.             @ParentId        int         ,  
  12.             @UserId          int         ,  
  13.             @ParentHid       hierarchyid ,  
  14.   
  15.             @StartTranCount  int         ,  
  16.             @OldHid          hierarchyid ,  
  17.             @NewHid          hierarchyid ;          
  18.   
  19.         declare @FolderIds table (   
  20.             InsertedId  int        not null,  
  21.             OldHid      hierarchyid    null,  
  22.             NewHid      hierarchyid    null      
  23.         );  
  24.   
  25.     end;  
  26.   
  27.     begin try  
  28.         set @StartTranCount = @@trancount;  
  29.   
  30.         if @StartTranCount = 0   
  31.         begin  
  32.             set transaction isolation level serializable;  
  33.             begin transaction;  
  34.         end;  
  35.   
  36.   
  37.         begin -- init variables and lock parent for update   
  38.   
  39.             select   
  40.                 @ParamId = Id,   
  41.                 @ParentId = ParentId   
  42.             from   
  43.                 @Folder;  
  44.   
  45.               
  46.             select  
  47.                 @UserId    = UserId,  
  48.                 @ParentHid = Hid        
  49.             from  
  50.                 dbo.Folders  
  51.             where  
  52.                 Id = @ParentId;  
  53.   
  54.         end;  
  55.               
  56.   
  57.         begin -- save the @Folder   
  58.   
  59.             merge into dbo.Folders as target  
  60.                 using   
  61.                 (  
  62.                     -- full join in this 'select' allows to see picture as if
  63.                     -- the change already applied, 
  64.                     -- thus coalesce(t.Id, f.Id) take new value
  65.                     -- if t.Id exists and old value if not  
  66.   
  67.                     select    
  68.   
  69.                         -- LAG and LEAD functions help to find the previous   
  70.                         -- and next Hid values if to sort by Name  
  71.   
  72.                         Hid      =  @ParentHid.GetDescendant (  
  73.   
  74.                                         LAG (case when t.Id is null then f.Hid end)   
  75.                                              over(order by coalesce(t.Name, f.Name)),  
  76.   
  77.                                         LEAD(case when t.Id is null then f.Hid end)   
  78.                                              over(order by coalesce(t.Name, f.Name))  
  79.                                     ),  
  80.                         Id       =  coalesce( t.Id       , f.Id       ),  
  81.                         ParentId =  coalesce( t.ParentId , f.ParentId ),  
  82.                         Name     =  coalesce( t.Name     , f.Name     )  
  83.                     from  
  84.                         (select * from dbo.Folders where ParentId = @ParentId)  f  
  85.                         full join @Folder t on t.Id = f.Id  
  86.                 )   
  87.                    
  88.                 -- for LAG and LEAD functions we need all children of the @ParentId  
  89.                 -- but for merge we need the Folder where source.Id = @ParamId   
  90.                    
  91.                 as source on source.Id = @ParamId and source.Id = target.Id   
  92.   
  93.             -- target.UserId = @UserId here just to make sure 
  94.             -- that we do not reparent a Folder to another user  
  95.   
  96.             when matched and target.UserId = @UserId then  
  97.                 update set  
  98.                     Hid       =  source.Hid  ,  
  99.                     Name      =  source.Name ,  
  100.                     ParentId  =  source.ParentId  
  101.   
  102.             -- source.Id = 0 here just to make sure that we insert a new Folder   
  103.             -- and not already deleted one  
  104.             -- not matched Id can be deleted by another user and we can try to add it again  
  105.   
  106.             when not matched by target and source.Id = 0 then  
  107.                 insert (  
  108.                     UserId   ,  
  109.                     Hid      ,  
  110.                     ParentId ,  
  111.                     Name     )  
  112.                 values (  
  113.                     @UserId         ,  
  114.                     source.Hid      ,  
  115.                     source.ParentId ,  
  116.                     source.Name     )  
  117.   
  118.             output    
  119.                 inserted.Id  ,  
  120.                 deleted.Hid  ,  
  121.                 inserted.Hid  
  122.   
  123.             into @FolderIds (  
  124.                 InsertedId  ,  
  125.                 OldHid      ,  
  126.                 NewHid      );  
  127.   
  128.         end;  
  129.   
  130.   
  131.         begin -- reparent children   
  132.   
  133.             -- saving folder might change its ParentId  
  134.             -- in that case Hid of all its children should reflect the change also   
  135.   
  136.             select top 1 @OldHid = OldHid, @NewHid = NewHid from @FolderIds;  
  137.   
  138.   
  139.             if @OldHid <> @NewHid   
  140.                 update dbo.Folders set  
  141.                     Hid = Hid.GetReparentedValue(@OldHid, @NewHid)   
  142.                 where  
  143.                     UserId = @UserId  
  144.                     and Hid.IsDescendantOf(@OldHid) = 1;  
  145.         end;  
  146.   
  147.   
  148.         if @StartTranCount = 0 commit transaction;  
  149.   
  150.     end try  
  151.     begin catch  
  152.         if xact_state() <> 0 and @StartTranCount = 0 rollback transaction;  
  153.           
  154.         declare @ErrorMessage nvarchar(4000) = dbo.GetErrorMessage();  
  155.         raiserror (@ErrorMessage, 16, 1);  
  156.         return;  
  157.     end catch;  
  158.   
  159. end;  

If you want to save each Folder by the above stored procedure only, then the Hid column will always have actual and consistent information.

Fetch a Folder with its descendants in hierarchical order

Thus the following stored procedure fetches a Folder with all its descendants and sorts them hierarchically and then alphabetically as if they are an expanded tree,

  1. create procedure dbo.GetFolderWithSubFolders  
  2.     @FolderId int  
  3. as  
  4. begin  
  5.     set nocount on;  
  6.   
  7.     select   
  8.         d.Id       ,  
  9.         d.ParentId ,  
  10.         d.Name     ,  
  11.         [Level]    =  d.Hid.GetLevel(),  
  12.         HidCode    =  dbo.GetHidCode(d.Hid),  
  13.         HidPath    =  d.Hid.ToString()  
  14.     from   
  15.         dbo.Folders f  
  16.         inner join dbo.Folders d on d.UserId = f.UserId and d.Hid.IsDescendantOf(f.Hid) = 1  
  17.     where  
  18.         f.Id = @FolderId  
  19.     order by  
  20.         d.Hid;  
  21. end;  

For example

  1. exec dbo.GetFolderWithSubFolders @FolderId = 40   

Results ---------------------------------------------------- Id ParentId Name Level HidCode HidPath
----- --------- ----- ------ ---------- ------------
40 13 3A 3 5AD6 /1/1/1/
121 40 4A 4 5AD6B0 /1/1/1/1/
364 121 5A 5 5AD6B580 /1/1/1/1/1/
365 121 5B 5 5AD6B680 /1/1/1/1/2/
366 121 5C 5 5AD6B780 /1/1/1/1/3/
122 40 4B 4 5AD6D0 /1/1/1/2/
367 122 5A 5 5AD6D580 /1/1/1/2/1/
368 122 5B 5 5AD6D680 /1/1/1/2/2/
369 122 5C 5 5AD6D780 /1/1/1/2/3/
123 40 4C 4 5AD6F0 /1/1/1/3/
370 123 5A 5 5AD6F580 /1/1/1/3/1/
371 123 5B 5 5AD6F680 /1/1/1/3/2/
372 123 5C 5 5AD6F780 /1/1/1/3/3/

What is the use of HidCode?

As it stated above, HidCode is a binary presentation of HierarchyId converted to varchar and without leading 0x.

Here is the dbo.GetHidCode scalar SQL function,

  1. create function dbo.GetHidCode  
  2. (  
  3.     @Hid hierarchyid  
  4. )  
  5. returns varchar(1000)  
  6. as  
  7. begin  
  8.     return replace(convert(varchar(1000), cast(@Hid as varbinary(892)), 2), '0x''');  
  9. end;  

The HidCode string can be used on a client side to sort a flat list of Folders hierarchically.

Chapter for the perfectionists

The above method of Hid calculation works well, but it has a tiny drawback.

imagine you have two sibling folders  
Id ParentId Name Level HidCode HidPath
----- --------- ----- ------ ---------- ------------
364 121 5A 5 5AD6B580 /1/1/1/1/1/
365 121 5B 5 5AD6B680 /1/1/1/1/2/  
and you want to insert a new folder
 Id ParentId Name
----- --------- -----
0 121 5Aaa
 the result will be
 Id ParentId Name Level HidCode HidPath
----- --------- ----- ------ ---------- --------------
364 121 5A 5 5AD6B580 /1/1/1/1/1/
1234 121 5Aaa 5 5AD6B62C /1/1/1/1/1.1/
365 121 5B 5 5AD6B680 /1/1/1/1/2/

See this sequence in HidPath: 1 -> 1.1 -> 2 instead of 1 -> 2 -> 3

In a time this ugly sequence may become more uglier.

If you OK with that then just skip this chapter.

But for the others I would suggest one more option for keeping the hierarchical data in actual state.

Here is the stored procedure which recalculates the HierarchyId so the path consists of integer numbers in a continuous sequence.

  1. create procedure dbo.SaveFolderWithHidReculc  
  2.     @Folder dbo.FolderTableType readonly  
  3. as  
  4. begin  
  5.     set nocount on;  
  6.           
  7.     begin -- variable declaration   
  8.       
  9.         declare   
  10.             @ParentId        int            ,  
  11.             @UserId          int            ,  
  12.             @ParentHid       hierarchyid    ,  
  13.             @ParentHidStr    varchar(1000)  ,  
  14.             @StartTranCount  int            ,  
  15.             @OldParentId     int            ,  
  16.             @OldParentHid    hierarchyid    ;  
  17.   
  18.         declare @FolderIds table (   
  19.             InsertedId    int        not null,  
  20.             OldParentId   int            null,  
  21.             OldParentHid  hierarchyid    null  
  22.         );  
  23.   
  24.     end;  
  25.       
  26.     begin try  
  27.         set @StartTranCount = @@trancount;  
  28.         if @StartTranCount = 0   
  29.         begin  
  30.             set transaction isolation level serializable;  
  31.             begin transaction;  
  32.         end;  
  33.   
  34.         begin -- init variables and lock parent for update   
  35.   
  36.             select @ParentId = ParentId from @Folder;  
  37.               
  38.             select  
  39.                 @UserId        = UserId ,  
  40.                 @ParentHid     = Hid    ,  
  41.                 @ParentHidStr  = cast(Hid as varchar(1000))          
  42.             from  
  43.                 dbo.Folders  
  44.             where  
  45.                 Id = @ParentId;  
  46.         end;  
  47.   
  48.         begin -- merge calculated hierarchical data with existing folders   
  49.   
  50.             merge into dbo.Folders as target  
  51.                 using   
  52.                 (  
  53.                     select   
  54.                         Hid      =    cast(concat(@ParentHidStr, -1, '/'as varchar(1000)),  
  55.                         Id       ,  
  56.                         ParentId ,  
  57.                         Name  
  58.                     from   
  59.                         @Folder  
  60.                 )   
  61.                 as source on source.Id = target.Id   
  62.   
  63.             when matched and target.UserId = @UserId then  
  64.                 update set  
  65.                     ParentId = source.ParentId ,  
  66.                     Name     = source.Name  
  67.   
  68.             when not matched by target and source.Id = 0  then                                                           
  69.                 insert (  
  70.                     UserId   ,  
  71.                     Hid      ,  
  72.                     ParentId ,  
  73.                     Name     )  
  74.                 values (  
  75.                     @UserId         ,  
  76.                     source.Hid      ,  
  77.                     source.ParentId ,  
  78.                     source.Name     )  
  79.             output  
  80.                 inserted.Id,  
  81.                 deleted.ParentId,  
  82.                 deleted.Hid.GetAncestor(1)                  
  83.             into   
  84.                 @FolderIds (  
  85.                     InsertedId   ,  
  86.                     OldParentId  ,  
  87.                     OldParentHid );  
  88.         end  
  89.   
  90.         begin -- reculculate SubFolder Hids  
  91.   
  92.             select top 1  
  93.                 @OldParentId  = OldParentId ,  
  94.                 @OldParentHid = OldParentHid  
  95.             from   
  96.                 @FolderIds;  
  97.   
  98.             exec dbo.ReculcSubFolderHids @UserId, @ParentId, @ParentHid, @OldParentId, @OldParentHid ;  
  99.   
  100.         end;  
  101.   
  102.         if @StartTranCount = 0 commit transaction;  
  103.   
  104.     end try  
  105.     begin catch  
  106.         if xact_state() <> 0 and @StartTranCount = 0 rollback transaction;  
  107.           
  108.         declare @ErrorMessage nvarchar(4000) = dbo.GetErrorMessage();  
  109.         raiserror (@ErrorMessage, 16, 1);  
  110.         return;  
  111.     end catch;  
  112.   
  113. end;  

The trick here is in call of ReculcSubFolderHids stored procedure after the @Folder save,

  1. create procedure dbo.ReculcSubFolderHids  
  2.     @UserId        int          ,  
  3.     @ParentId      int          ,  
  4.     @ParentHid     hierarchyid  ,  
  5.     @OldParentId   int          =  null,  
  6.     @OldParentHid  hierarchyid  =  null  
  7. as   
  8. begin  
  9.   
  10.     declare @ParentHidStr varchar(1000)    = cast(@ParentHid as varchar(1000));  
  11.     declare @OldParentHidStr varchar(1000) = cast(@OldParentId as varchar(1000));  
  12.   
  13.     with Recursion as  
  14.     (  
  15.         select  
  16.             Id       ,  
  17.             ParentId ,  
  18.             Name     ,  
  19.             OldHid   =  cast(Hid as varchar(1000)),  
  20.             NewHid   =  cast(  
  21.                             concat(   
  22.                                 case when ParentId = @ParentId then @ParentHidStr else @OldParentHidStr end,  
  23.                                 row_number() over (order by Name, Id),  
  24.                                 '/'  
  25.                             )   
  26.                             as varchar(1000)  
  27.                         )  
  28.         from  
  29.             dbo.Folders  
  30.         where  
  31.             ParentId in (@ParentId, @OldParentId)  
  32.   
  33.         union all   
  34.   
  35.         select  
  36.             Id        =  f.Id       ,  
  37.             ParentId  =  f.ParentId ,          
  38.             Name      =  f.Name     ,  
  39.             OldHid    =  cast(f.Hid as varchar(1000)),  
  40.             NewHid    =  cast(  
  41.                              concat(  
  42.                                  r.NewHid,   
  43.                                  row_number() over (partition by f.ParentId order by f.Name, f.Id),   
  44.                                  '/'  
  45.                              )   
  46.                              as varchar(1000)  
  47.                          )                      
  48.         from   
  49.             Recursion r          
  50.             inner join dbo.Folders f on f.ParentId = r.Id   
  51.         where  
  52.             r.OldHid <> r.NewHid  
  53.     )  
  54.     update f set   
  55.         Hid = r.NewHid  
  56.     from  
  57.         dbo.Folders f  
  58.         inner join Recursion r on r.Id = f.Id and f.Hid <> r.NewHid  
  59.     where  
  60.         f.UserId = @UserId;  
  61.       
  62. end;  

The above stored procedure calculated the paths which consists of integer numbers in a continuous sequence.

  So the result of the example in the section beginning with SaveFolderWithHidReculc will be

Id ParentId Name Level HidCode HidPath
----- --------- ----- ------ ---------- ------------
364 121 5A 5 5AD6B580 /1/1/1/1/1/
1234 121 5Aaa 5 5AD6B680 /1/1/1/1/2/
365 121 5B 5 5AD6B780 /1/1/1/1/3/

This method has a drawback also. It works well until the recalculated branch consists of up to about 10,000 nodes, when the time for calculation does not exceed a second. 100,000 nodes' recalculation may take up to 15 seconds or more.

Read a Tree in C#

The implemented Id-ParentId reference allows to build a tree of Folders in C#.

Moreover, once the Folders are sorted hierarchically, the tree can be built for one iteration through the flat list of Folders.

Suppose we have a Folder class in C#, 

  1. public class Folder  
  2. {  
  3.     public Int32 Id { get; set; }  
  4.   
  5.     public Int32? ParentId { get; set; }  
  6.   
  7.     public String Name { get; set; }          
  8.   
  9.     public IList<Folder> SubFolders { get; set; }  
  10. }  

So, the following method builds a tree for one pass through the hierarchically sorted Folder list,

  1. public static IList<Folder> ConvertHierarchicallySortedFolderListToTrees(IEnumerable<Folder> folders)  
  2. {  
  3.     var parentStack = new Stack<Folder>();  
  4.     var parent = default(Folder);  
  5.     var prevNode = default(Folder);  
  6.     var rootNodes = new List<Folder>();  
  7.   
  8.     foreach (var folder in folders)  
  9.     {  
  10.         if (parent == null || folder.ParentId == null)  
  11.         {  
  12.             rootNodes.Add(folder);  
  13.   
  14.             parent = folder;  
  15.         }  
  16.         else if (folder.ParentId == parent.Id)  
  17.         {  
  18.             if (parent.SubFolders == null)  
  19.                 parent.SubFolders = new List<Folder>();  
  20.   
  21.             parent.SubFolders.Add(folder);  
  22.         }  
  23.         else if (folder.ParentId == prevNode.Id)  
  24.         {  
  25.             parentStack.Push(parent);  
  26.   
  27.             parent = prevNode;  
  28.   
  29.             if (parent.SubFolders == null)  
  30.                 parent.SubFolders = new List<Folder>();  
  31.   
  32.             parent.SubFolders.Add(folder);  
  33.         }  
  34.         else  
  35.         {  
  36.             var parentFound = false;  
  37.   
  38.             while(parentStack.Count > 0 && parentFound == false)  
  39.             {  
  40.                 parent = parentStack.Pop();  
  41.   
  42.                 if (folder.ParentId != null && folder.ParentId.Value == parent.Id)  
  43.                 {  
  44.                     parent.SubFolders.Add(folder);  
  45.                     parentFound = true;  
  46.                 }  
  47.             }  
  48.   
  49.             if (parentFound == false)  
  50.             {  
  51.                 rootNodes.Add(folder);  
  52.   
  53.                 parent = folder;  
  54.             }  
  55.         }  
  56.                   
  57.         prevNode = folder;  
  58.     }  
  59.   
  60.     return rootNodes;  
  61. }  

The above method works two times faster than the following method for an unsorted Folder list,

  1. public static IList<Folder> ConvertHierarchicallyUnsortedFolderListToTrees(IEnumerable<Folder> folders)  
  2. {  
  3.     var dictionary = folders.ToDictionary(n => n.Id, n => n);  
  4.     var rootFolders = new List<Folder>();  
  5.   
  6.     foreach (var folder in dictionary.Select(item => item.Value))  
  7.     {  
  8.         Folder parent;  
  9.   
  10.         if (folder.ParentId.HasValue && dictionary.TryGetValue(folder.ParentId.Value, out parent))  
  11.         {  
  12.             if (parent.SubFolders == null)  
  13.                 parent.SubFolders = new List<Folder>();  
  14.   
  15.             parent.SubFolders.Add(folder);  
  16.         }  
  17.         else  
  18.         {  
  19.             rootFolders.Add(folder);  
  20.         }  
  21.     }  
  22.   
  23.     return rootFolders;  
  24. }  

Both methods can build multiple roots, so the return type is IList<Folder>. That is useful when you need to fetch several bunches detached from the tree or several independent trees.

About source code

The attached archive contains the Hierarchy solution, created in Visual Studio 2015, which consists of two projects,

  • DB — SSDT project to create database for SQL Server 2016,
  • Tests — Test project to call the repository methods and see the result.

The solution contains the examples of,

  • How to fetch a Folder by Id with HidCode, HidPath and full path from the root Folder;
  • How to fetch immediate SubFolders of a Folder;
  • How to fetch a Folder with all its descendants as a flat list and as a tree;
  • How to fetch a Folder with all its parents as a flat list and as a tree;
  • How to add a new Folder with Hid calculation on the fly;
  • How to edit a Folder keeping right hierarchical and alphabetical order;
  • How to reparent a Folder, assign a Folder to another Parent in edit method;
  • How to delete a Folder with all its descendants;
  • How to find Folders by Name and build branches from the found Folders to the root as it realized in Visual Studio is Search Solution Explorer.

In order to install the database and run the tests, change the connection string in file Hierarchy.DB.publish.xml and in App.config to yours.

The Database project contains the dbo.GenerateFolders stored procedure which generates hierarchical data for tests. This generation occurs automatically during the database publishing phase.


Similar Articles