Homemade Deserializer for XML

Creating a deserializer is reinventing the wheel, but it is also a rewarding exercise. Unless the deserializer is written for one specific class, it has to use reflection, and the implementation runs into many difficulties and interesting problems.

At first, implementing a simple deserializer that can populate a simple class from an XML document seems easy, as long as the class contains only simple value-type properties or very simple class-type properties. The difficulties begin with collection or interface-typed properties. I also wanted to rely as little as possible on the System.Xml assembly.

The basic idea is to use a simple XML document with the following structure:

<tag>value</tag>

or, for an unused tag:

<tag/>

CONCEPT

An XML document is a tree with a single root element and any number of branches. A leaf element holds a value, and every element on the path to it is a branch.

The class that we want to populate from the XML has a member for everything in the XML, but there is no guarantee that the XML contains everything that is in the class.

This means that we have to traverse the XML tree and instantiate the objects in the class accordingly.

I chose pre-order traversal, which visits each node before its children and then processes the children from left to right. On a sample binary tree it yields the following order:

Pre-order: F, B, A, D, C, E, G, I, H.

Although the example is a binary tree, the same approach works on our non-binary XML tree as well.
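
As a rough illustration, here is a minimal sketch of pre-order traversal on a tree whose nodes can have any number of children. The TreeNode and PreOrder names are made up for this example (they are not the Node class used later), and the sample tree is only one possible shape that produces the order listed above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal tree node, used only for this illustration.
public class TreeNode
{
    public string Label { get; }
    public List<TreeNode> Children { get; } = new List<TreeNode>();

    public TreeNode(string label, params TreeNode[] children)
    {
        Label = label;
        Children.AddRange(children);
    }
}

public static class PreOrder
{
    // Visit the node itself first, then each child subtree from left to right.
    private static void Visit(TreeNode node, List<string> output)
    {
        output.Add(node.Label);
        foreach (TreeNode child in node.Children)
        {
            Visit(child, output);
        }
    }

    public static List<string> Traverse(TreeNode root)
    {
        var output = new List<string>();
        Visit(root, output);
        return output;
    }

    // One tree shape (an assumption) whose pre-order is F, B, A, D, C, E, G, I, H.
    public static TreeNode SampleTree()
    {
        return new TreeNode("F",
            new TreeNode("B",
                new TreeNode("A"),
                new TreeNode("D", new TreeNode("C"), new TreeNode("E"))),
            new TreeNode("G",
                new TreeNode("I", new TreeNode("H"))));
    }
}
```

Nothing in Visit depends on a node having exactly two children, which is why the same idea carries over to an XML tree.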

IMPLEMENTATION

For the sake of simplicity, we first create a tree object from the XML and work with that object from then on.

Each node of the tree stores the tag name (which identifies the matching member of the class), an optional value, and any child nodes:

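    private class Node
    {
        //Depth of the node in the tree
        public int level { get; set; }
        //1-based position of the node among its siblings
        public int index { get; set; }
        //Name of the XML tag
        public string tag { get; set; }
        //Text value of the tag (null for branch nodes)
        public string value { get; set; }
        //Child nodes
        public List<Node> nodes { get; set; }

        public Node()
        {
            nodes = new List<Node>();
        }
    }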

Next, read the text of the XML file and flatten it into a single string, removing any namespaces and commented-out parts:

    //Requires: System.Linq, System.Text.RegularExpressions, System.Xml, System.Xml.Linq
    private string GetTextFromXml(XDocument doc)
    {
        //Remove namespace declarations from the tags
        doc.Descendants().Attributes().Where(x => x.IsNamespaceDeclaration).Remove();

        foreach (var elem in doc.Descendants())
            elem.Name = elem.Name.LocalName;

        var xmlDocument = new XmlDocument();
        xmlDocument.Load(doc.CreateReader());

        string text = xmlDocument.OuterXml;

        //Remove line breaks
        text = text.Replace("\n", "");
        //Remove whitespace (note: this also strips spaces inside values)
        text = Regex.Replace(text, @"\s+", "");
        //Remove commented-out tags
        text = Regex.Replace(text, @"(<!--)(.*?)(-->)", "");

        return text;
    }

Once the flattened string exists, we can build the tree recursively, using a regular expression to extract each element along with its value.

    private void GetTag(string tag, Node n)
    {
        //If the text doesn't contain a closing tag '</' then the end of the branch is reached
        if (!tag.Contains("</")) { return; }

        //Get each complete XML element along with its content
        foreach (Match match in Regex.Matches(tag, @"<([^>]+)>(.*?)</\1>"))
        {
            Node node = new Node();

            //Next level of the tree
            node.level = n.level + 1;

            //Name of the tag
            node.tag = match.Groups[1].ToString();
            //Value of the tag (stays null if the content contains further tags)
            if (!match.Groups[2].Value.Contains("/")) node.value = match.Groups[2].Value;
            n.nodes.Add(node);
            //1-based index on the current level
            node.index = n.nodes.Count;

            //Recursion on the content of the element
            GetTag(match.Groups[2].Value, node);
        }
    }
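
To see what the regular expression in GetTag captures, here is a small standalone sketch. The TagMatchDemo class and its method name are hypothetical; only the pattern itself comes from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagMatchDemo
{
    // Returns (tag name, inner text) pairs for the outermost elements of a flattened XML string.
    // Group 1 is the tag name; \1 requires the closing tag to repeat exactly that name.
    public static List<KeyValuePair<string, string>> TopLevelElements(string xml)
    {
        var result = new List<KeyValuePair<string, string>>();
        foreach (Match m in Regex.Matches(xml, @"<([^>]+)>(.*?)</\1>"))
        {
            result.Add(new KeyValuePair<string, string>(m.Groups[1].Value, m.Groups[2].Value));
        }
        return result;
    }
}
```

Calling TopLevelElements on "<person><name>Ann</name><age>30</age></person>" gives a single pair whose key is "person" and whose value is the inner markup; calling it again on that value gives the "name" and "age" pairs. Because the match is lazy, an element nested inside another element with the same tag name ends at the first closing tag, so this approach assumes that tag names are not nested inside themselves.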

Once we have a Node object, which is a tree containing the values, we can traverse the tree and populate the class.

The instantiation procedure is simple in principle but not straightforward in many cases, such as arrays or interfaces.

DIFFICULTIES

The class may contain arrays.

Instantiation of an array is easy if we know its length.

    Array.CreateInstance(typeof(Int32), 5);
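
A slightly fuller sketch of the same idea: create an array whose element type is only known at run time and fill it from string values. ArrayDemo and FromStrings are names invented for this example, and the conversion uses TypeConverter in the same spirit as the traversal code later in the article.

```csharp
using System;
using System.ComponentModel;

public static class ArrayDemo
{
    // Creates an array of the given element type and fills it by converting each string.
    public static Array FromStrings(Type elementType, string[] values)
    {
        TypeConverter converter = TypeDescriptor.GetConverter(elementType);
        Array array = Array.CreateInstance(elementType, values.Length);

        for (int i = 0; i < values.Length; i++)
        {
            // SetValue takes a boxed object, so no compile-time element type is needed.
            array.SetValue(converter.ConvertFromInvariantString(values[i]), i);
        }

        return array;
    }
}
```

For example, ArrayDemo.FromStrings(typeof(int), new[] { "1", "2", "3" }) returns an object that can be cast to int[].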

The class may contain interface-typed properties. Obviously, we can't instantiate an interface, so we have to find a class that implements it.
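
One possible way to find such a class is to scan an assembly for a concrete type that implements the interface and has a parameterless constructor. The sketch below assumes that any such type is acceptable; IShape, Circle and InterfaceResolver are hypothetical names used only here.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical interface and implementation, for illustration only.
public interface IShape { string Name { get; } }
public class Circle : IShape { public string Name { get { return "Circle"; } } }

public static class InterfaceResolver
{
    // Returns an instance of some concrete class in the assembly that implements
    // the interface and has a public parameterless constructor, or null if none exists.
    public static object CreateImplementation(Type interfaceType, Assembly assembly)
    {
        Type implementation = assembly.GetTypes().FirstOrDefault(t =>
            t.IsClass && !t.IsAbstract && !t.ContainsGenericParameters &&
            interfaceType.IsAssignableFrom(t) &&
            t.GetConstructor(Type.EmptyTypes) != null);

        return implementation == null ? null : Activator.CreateInstance(implementation);
    }
}
```

If several classes implement the interface, this simply takes whichever one GetTypes returns first, so it is only suitable when the choice does not matter.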

How to check if the property is a list or array

One way is the check below, but we have to be careful because it is also true for the String type, since a string is an enumerable sequence of characters.

    typeof(IEnumerable).IsAssignableFrom(propInfo.PropertyType)
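
A minimal sketch of the check with the String case excluded, wrapped in a helper. TypeChecks and IsCollection are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections;

public static class TypeChecks
{
    // True for arrays, lists and other enumerables, but not for string,
    // which also implements IEnumerable.
    public static bool IsCollection(Type type)
    {
        return type != typeof(string) && typeof(IEnumerable).IsAssignableFrom(type);
    }
}
```

With this helper, IsCollection(typeof(int[])) and IsCollection(typeof(System.Collections.Generic.List<int>)) are true, while IsCollection(typeof(string)) and IsCollection(typeof(int)) are false.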

How to get the generic type of a list's items

The PropertyType of the PropertyInfo is the List type itself, not the type of its items, so the item type has to be read from the generic arguments. In addition, we can't instantiate a string with the Activator because String has no parameterless constructor:

    Activator.CreateInstance(propInfo.PropertyType) //string -> Error!
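
Both points can be sketched in a few lines. The ItemTypes class and its method names are hypothetical, and GetItemType assumes a single-argument generic collection such as List<T>.

```csharp
using System;
using System.Collections.Generic;

public static class ItemTypes
{
    // Returns the item type of an array or of a generic collection such as List<T>.
    // Assumption: for generic types the first type argument is the item type.
    public static Type GetItemType(Type collectionType)
    {
        if (collectionType.IsArray)
            return collectionType.GetElementType();
        if (collectionType.IsGenericType)
            return collectionType.GetGenericArguments()[0];
        return null;
    }

    // String has no parameterless constructor, so it is special-cased;
    // other types are created through Activator.
    public static object CreateItem(Type itemType)
    {
        return itemType == typeof(string)
            ? string.Empty
            : Activator.CreateInstance(itemType);
    }
}
```

For example, GetItemType(typeof(List<string>)) returns typeof(string), and CreateItem(typeof(string)) returns an empty string instead of throwing.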

SOLUTION

The complete traversal method, with explanations in the comments:

    private void Traverse(Node n, object o, Helper helperObj)
    {
        //All properties of the current object
        PropertyInfo[] propInfos = o.GetType().GetProperties();
        PropertyInfo propInfo = null;

        //If there are no more children then the end of the branch is reached
        if (n.nodes.Count == 0)
        {
            return;
        }

        //Looping on the child nodes of the current object
        foreach (Node node in n.nodes)
        {
            object instance = null;
            object obj = null;

            //Get the actual property of the current node.
            //Its name may be defined in an XmlArrayItemAttribute or an XmlElementAttribute
            propInfo = propInfos.FirstOrDefault(x => x.Name == node.tag) ??
                propInfos.FirstOrDefault(p => p.GetCustomAttributes(false).Any(a =>
                    (a is XmlArrayItemAttribute arrayItem && arrayItem.ElementName == node.tag) ||
                    (a is XmlElementAttribute element && element.ElementName == node.tag)));

            if (propInfo != null)
            {
                //If the property is an IEnumerable and not a String
                //then create a generic list with the type of the property
                if (typeof(IEnumerable).IsAssignableFrom(propInfo.PropertyType) &&
                    (propInfo.PropertyType.Name != "String"))
                {
                    var listType = typeof(List<>);

                    //Get the proper generic type
                    var genericType = propInfo.PropertyType.IsArray ?
                        listType.MakeGenericType(propInfo.PropertyType.GetElementType()) :
                        listType.MakeGenericType(propInfo.PropertyType.GetGenericArguments()[0]);

                    //Create the generic instance
                    instance = Activator.CreateInstance(genericType);

                    //Type of the items in the list
                    var itemType = instance.GetType().GetGenericArguments().Single();

                    //Create the concrete item instance
                    object itemInstance;
                    if (itemType == typeof(string)) //If string
                        itemInstance = new String(new Char[] { ' ' });
                    else if (itemType.IsInterface) //If interface
                        itemInstance = Activator.CreateInstance(Assembly.GetExecutingAssembly().GetTypes().First(
                            x => x.GetInterfaces().Contains(itemType) &&
                                 x.GetConstructor(Type.EmptyTypes) != null));
                    else //If other
                        itemInstance = Activator.CreateInstance(itemType);

                    //If the object already exists then we don't need a new instance
                    object temp = propInfo.GetValue(o, null);

                    //If the collection already exists
                    if (temp != null)
                    {
                        //and it's an array then fill the created temporary list
                        //with its existing items
                        if (propInfo.PropertyType.IsArray)
                        {
                            foreach (object item in ((Array)temp))
                            {
                                instance.GetType().GetMethod("Add").Invoke(instance, new[] { item });
                            }
                        }
                        //or set the created instance to the existing one
                        else
                            instance = temp;
                    }

                    //Add the created item instance to the generic list
                    instance.GetType().GetMethod("Add").Invoke(instance, new[] { itemInstance });

                    helperObj.HelperObject = instance;

                    //If the property is an array then loop through
                    //the list and fill a new array with the items
                    if (propInfo.PropertyType.IsArray)
                    {
                        //Initialize the length of the array in advance
                        //by the number of the current child nodes.
                        //The first one will be set and the others are null for the time being
                        var CountofItem = node.nodes.Count;

                        var array = Array.CreateInstance(itemInstance.GetType(), CountofItem);

                        for (int j = 0; j < ((IList)instance).Count; j++)
                        {
                            array.SetValue(((IList)instance)[j], j);
                        }

                        //Finally set the array to the instance
                        instance = array;
                        helperObj.ItemIndex = 0;
                        helperObj.HelperObject = instance;
                    }

                    obj = itemInstance;
                }
                else //If property is NOT Enumerable
                {
                    //If the property is a value type
                    if (propInfo.PropertyType.IsValueType)
                    {
                        TypeConverter tc = TypeDescriptor.GetConverter(propInfo.PropertyType);
                        instance = tc.ConvertFromString(node.value);
                    } //or it's a String
                    else if (propInfo.PropertyType.Name == "String")
                    {
                        instance = new String(node.value.ToCharArray());
                    } //or it's a class
                    else if (propInfo.PropertyType.IsClass)
                    {
                        instance = Activator.CreateInstance(propInfo.PropertyType);
                    } //or it's an interface
                    else if (propInfo.PropertyType.IsInterface)
                    {
                        //Find the implementation of the interface
                        //Get all types of the executing assembly
                        Type[] types = Assembly.GetExecutingAssembly().GetTypes();
                        //Get the implemented type
                        Type implementedType = types.First(x =>
                            x.GetInterfaces().Contains(propInfo.PropertyType) &&
                            x.GetConstructor(Type.EmptyTypes) != null);

                        instance = Activator.CreateInstance(implementedType);
                    }

                    obj = instance;
                }

                //Finally set the property with the created object
                propInfo.SetValue(o, instance, null);
            }
            else
            {
                //If the current node index is 1 then it is the first item of the collection
                //therefore we use the existing first item and don't create a new one
                if (node.index == 1)
                {
                    obj = o;
                } //Otherwise create a new object
                else if (o.GetType().Name != "String")
                    obj = Activator.CreateInstance(o.GetType());
                else //Except if it is a String object
                    obj = new String(new char[] { });

                //If the Node has a value then set the value of the object
                if (node.value != null)
                {
                    if (obj.GetType().IsValueType)
                    {
                        TypeConverter tc = TypeDescriptor.GetConverter(obj.GetType());
                        obj = tc.ConvertFromString(node.value);
                    } //or it's a String
                    else if (obj.GetType().Name == "String")
                    {
                        obj = new String(node.value.ToCharArray());
                    }
                }

                //If the HelperObject is a List or Array
                if (helperObj.HelperObject != null)
                {
                    //If the collection is an array
                    if (helperObj.HelperObject.GetType().IsArray)
                    {
                        //Set the created new object to the next item by using
                        //the ItemIndex of the Helper
                        if (((Array)helperObj.HelperObject).GetValue(0).GetType() == obj.GetType())
                        {
                            ((Array)helperObj.HelperObject).SetValue(obj, helperObj.ItemIndex);
                            helperObj.ItemIndex++;
                        }
                        else
                            throw new Exception("Not possible to set this <" + node.tag + "> into the class object!");
                    }
                    //If the collection is a generic list
                    else if (typeof(IEnumerable).IsAssignableFrom(helperObj.HelperObject.GetType()) &&
                             (helperObj.HelperObject.GetType().Name != "String"))
                    {
                        //If the current node index is 1 then it is the first item of the collection
                        //therefore we set this item with the object
                        if (node.index == 1)
                        {
                            ((IList)helperObj.HelperObject)[0] = obj;
                        } //Otherwise add as a new item
                        else
                            ((IList)helperObj.HelperObject).Add(obj);
                    }
                }
                //If the HelperObject is null then there is no object
                //for the current node
                else
                    throw new Exception("Not possible to set this <" + node.tag + "> into the class object!");
            }
            //Recursion
            Traverse(node, obj, helperObj);
        }
    }

A helper object is needed to keep track of the collection being filled. ItemIndex is the position of the next item to set in the collection, and HelperObject is the collection object itself.

    class Helper
    {
        public int ItemIndex { get; set; }
        public object HelperObject { get; set; }
    }

Finally, the caller, which is the constructor of the Deserializator class.

The object parameter is the instance of the class that we want to populate.

    public Deserializator(string path, object obj)
    {
        if (File.Exists(path))
        {
            string[] lines = File.ReadAllLines(path);
            XDocument doc = XDocument.Parse(String.Join("", lines));

            string text = GetTextFromXml(doc);

            Node n = new Node() { tag = "root", index = 0 };

            GetTag(text, n);

            Helper helperObj = new Helper();

            Traverse(n.nodes[0], obj, helperObj);
        }
    }

CONCLUSION

Although this solution is unnecessary, since deserializers for XML already exist, it was an interesting challenge to implement. It revealed some special cases of reflection and tree traversal. It would be interesting to support other property types as well. Please let me know if you have any constructive ideas.