XML Manipulation In C#

Introduction

XML stands for Extensible Markup Language. It is a text-based format used to create common information formats and to share both the format and the data on the World Wide Web, intranets, and elsewhere.

It has the advantages given below.

  1. Human and machine-readable format.
  2. It is platform independent.
  3. Its self-documenting format describes the structure and field names as well as specific values.
  4. XML is widely used to store and share data.
  5. XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers, without losing data.
  6. It has strict syntax rules: every node must be closed, and node names are case-sensitive.

In this article, we will discuss XML manipulation in C#. We cover the points given below.

  1. Add node/XML value to the existing XML.
  2. Edit/Update XML data.
  3. Remove the node from XML data.
  4. Select the node value from XML.
  5. XML Serialization.

Using code

We will mostly use the XDocument and XmlDocument classes to manipulate XML data. The following is the LINQ to XML (XDocument) class hierarchy, which helps to understand how the classes relate to each other.

LINQ to XML(XDocument) class hierarchy
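Before going further, a minimal sketch of how the two APIs are brought into scope may help. It assumes a console application; the class name LoadDemo and the two-project sample string are made up for this illustration and are not part of the article's demo code. It loads the same XML with both classes and counts the Project nodes.

```csharp
using System;
using System.Linq;
using System.Xml;       // XmlDocument (DOM-style API)
using System.Xml.Linq;  // XDocument (LINQ to XML)

public static class LoadDemo
{
    // Hypothetical sample data used only in this sketch.
    public const string SampleXml =
        "<Projects>" +
        "<Project ID='1' Name='project1' />" +
        "<Project ID='2' Name='project2' />" +
        "</Projects>";

    // Counts <Project> nodes using the DOM-style API and an XPath query.
    public static int CountWithXmlDocument(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes("/Projects/Project").Count;
    }

    // Counts <Project> nodes using LINQ to XML.
    public static int CountWithXDocument(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root.Elements("Project").Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountWithXmlDocument(SampleXml)); // 2
        Console.WriteLine(CountWithXDocument(SampleXml));   // 2
    }
}
```

Both calls see the same two nodes; the difference is only in how the document is queried (XPath strings versus LINQ operators).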

Add node/XML value to existing XML

Below is the sample XML data, which we will use in our demonstration: 

string tempXml = @"<Projects>  
<Project ID='1' Name='project1' />  
<Project ID='2' Name='project2' />  
<Project ID='3' Name='project3' />  
<Project ID='4' Name='project4' />  
<Project ID='5' Name='project5' />  
</Projects>";  

In the demonstration, we show several different ways to add a node, using the XmlDocument and XDocument classes. The output is shown in Figure 2.

Using XmlDocument

// Option1: Using InsertAfter()
// Adding Node to XML  
XmlDocument doc3 = new XmlDocument();  
doc3.LoadXml(tempXml);  
XmlNode root1 = doc3.DocumentElement;  
//Create a new element and its ID attribute.
XmlElement elem = doc3.CreateElement("Project");
XmlAttribute attr = doc3.CreateAttribute("ID");
attr.Value = "6";
elem.Attributes.Append(attr);
//Create the Name attribute.
XmlAttribute attr2 = doc3.CreateAttribute("Name");  
attr2.Value = "Project6";  
elem.Attributes.Append(attr2);  
//Add the node to the document.  
root1.InsertAfter(elem, root1.LastChild);  
doc3.Save(Console.Out);  
Console.WriteLine();  

// Option2: Using AppendChild()
XmlDocument doc4 = new XmlDocument();  
doc4.LoadXml(tempXml);  
XmlElement XEle = doc4.CreateElement("Project");  
XEle.SetAttribute("Name", "Project6");  
XEle.SetAttribute("ID", "6");  
doc4.DocumentElement.AppendChild(XEle.Clone());  
doc4.Save(Console.Out);  
Console.WriteLine();

Figure 2: Output after adding a new node

Using XDocument

// Option1: Using AddAfterSelf()
XDocument xdoc = XDocument.Parse(tempXml);  
var cust = xdoc.Descendants("Project")  
                .First(rec => rec.Attribute("ID").Value == "5");  
cust.AddAfterSelf(new XElement("Project", new XAttribute("ID", "6")));  
xdoc.Save(Console.Out);  
Console.WriteLine();  

// Option2: Using Add() method 
XDocument doc = XDocument.Parse(tempXml);  
XElement root = new XElement("Project");  
root.Add(new XAttribute("ID", "6"));  
root.Add(new XAttribute("Name", "Project6"));  
doc.Element("Projects").Add(root);  
doc.Save(Console.Out);  
Console.WriteLine();  
  
// When the XML contains a namespace (see http://stackoverflow.com/questions/2013165/add-an-element-to-xml-file)
string tempXmlNamespace = @"<Projects xmlns='http://schemas.microsoft.com/developer/msbuild/2003'>  
                    <Project ID='1' Name='project1' />  
                    <Project ID='2' Name='project2' />  
                    <Project ID='3' Name='project3' />  
                    <Project ID='4' Name='project4' />  
                    <Project ID='5' Name='project5' />  
                    </Projects>";  
XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";  
XDocument xDoc = XDocument.Parse(tempXmlNamespace);  
  
var b = xDoc.Descendants(ns + "Project").Last();  
  
b.Parent.Add(  
    new XElement(ns + "Project",  
        new XAttribute("ID", "6"), new XAttribute("Name", "Project6")  
    )  
);  
  
xDoc.Save(Console.Out);  
Console.WriteLine(); 
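The reason the namespace has to be prepended (ns + "Project") is that a default xmlns on the root also applies to its children, so their full name is no longer just "Project". The following sketch illustrates this under the assumption of a one-project document; the class and method names are hypothetical and exist only for this example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NamespaceDemo
{
    // Hypothetical sample: the root declares a default namespace.
    public const string Xml =
        "<Projects xmlns='http://schemas.microsoft.com/developer/msbuild/2003'>" +
        "<Project ID='1' Name='project1' />" +
        "</Projects>";

    // An unqualified name does not match elements that live in a namespace.
    public static int CountUnqualified()
    {
        return XDocument.Parse(Xml).Descendants("Project").Count();
    }

    // A namespace-qualified name matches them.
    public static int CountQualified()
    {
        XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";
        return XDocument.Parse(Xml).Descendants(ns + "Project").Count();
    }

    // Adding a child WITHOUT the namespace puts it in the empty namespace,
    // which the writer has to mark with an xmlns="" declaration.
    public static string AddUnqualifiedChild()
    {
        XDocument doc = XDocument.Parse(Xml);
        doc.Root.Add(new XElement("Project", new XAttribute("ID", "2")));
        return doc.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        Console.WriteLine(CountUnqualified()); // 0
        Console.WriteLine(CountQualified());   // 1
        Console.WriteLine(AddUnqualifiedChild());
    }
}
```

If an unexpected xmlns="" shows up in saved output, a missing namespace on the newly created element is a likely cause.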

Edit/Update XML data

Sometimes, we need to change or update the value of an XML node. For instance, we have a Project node whose ID is 2 and we want to update its Name attribute. The following code samples show how to do this:

Using XDocument

// Option1: Using SetAttributeValue()
XDocument xmlDoc = XDocument.Parse(tempXml);  
// Update Element value  
var items = from item in xmlDoc.Descendants("Project")  
            where item.Attribute("ID").Value == "2"  
            select item;  
  
foreach (XElement itemElement in items)  
{  
    itemElement.SetAttributeValue("Name", "Project2_Update");  
}  
  
xmlDoc.Save(Console.Out);  
Console.WriteLine();  
  
// Option2: Using the Attribute().Value property
var doc = XElement.Parse(tempXml);  
var target = doc.Elements("Project")  
        .Where(e => e.Attribute("ID").Value == "2")  
        .Single();  
  
target.Attribute("Name").Value = "Project2_Update";  
doc.Save(Console.Out);  
Console.WriteLine();  
  
// Option3: Using ReplaceWith()   
XDocument xmlDoc1 = XDocument.Parse(tempXml);  
XElement xObj = xmlDoc1.Root.Descendants("Project").FirstOrDefault();  
xObj.ReplaceWith(new XElement("Project", new XAttribute("ID", "1"),  
    new XAttribute("Name", "Project1_Update")));  
  
xmlDoc1.Save(Console.Out);  
Console.WriteLine(); 

Figure 3: Update XML Node value

Using XmlDocument

int nodeId = 2;  
XmlDocument xmlDoc2 = new XmlDocument();  
xmlDoc2.LoadXml(tempXml);  
XmlNode node = xmlDoc2.SelectSingleNode("/Projects/Project[@ID=" + nodeId + "]");
node.Attributes["Name"].Value = "Project2_Update";

// Save the document that was just modified (xmlDoc2).
xmlDoc2.Save(Console.Out);
Console.WriteLine(); 

Remove node from XML data

To remove a node from existing XML data, we can use either the XmlDocument or the XDocument class.

Using XmlDocument Class

// Option1: Remove using SelectSingleNode()
int nodeId = 1;  
XmlDocument xmlDoc = new XmlDocument();  
xmlDoc.LoadXml(tempXml);  
XmlNode nodeToDelete = xmlDoc.SelectSingleNode("/Projects/Project[@ID=" + nodeId + "]");  
if (nodeToDelete != null)  
{  
    nodeToDelete.ParentNode.RemoveChild(nodeToDelete);  
}  
//xmlDoc.Save("XMLFileName.xml");  
xmlDoc.Save(Console.Out);  
Console.WriteLine();  
  
// Option2: Remove XML node using Tag name  
XmlDocument doc2 = new XmlDocument();  
doc2.Load(@"D:\ConsoleApplication4\MyData.xml");  
XmlNodeList nodes = doc2.GetElementsByTagName("Project");  
XmlNode node = nodes[0]; // Getting first node  
node.ParentNode.RemoveChild(node);  
doc2.Save(Console.Out);  
Console.WriteLine();  
  
// Option3: Remove one node/child element  
XmlDocument doc1 = new XmlDocument();  
doc1.LoadXml("<book genre='novel' ISBN='1-2-3'>" +  
            "<title>XML Manipulation</title>" +  
            "</book>");  
XmlNode root = doc1.DocumentElement;  
//Remove the title element.  
root.RemoveChild(root.FirstChild);  
doc1.Save(Console.Out);  
Console.WriteLine(); 

Figure 4: Output after deleting a node

Using XDocument Class

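// Using LINQ to XML query syntax
XDocument xdoc1 = XDocument.Parse(tempXml);
// ToList() takes a snapshot of the matches, so removing nodes
// does not disturb the lazy query while it is being enumerated.
var elementsToRemove = (from element in xdoc1.Elements("Projects").Elements("Project")
                        where element.Attribute("Name").Value == "project1"
                        select element).ToList();
foreach (var e in elementsToRemove)
{
    e.Remove();
}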
  
// Using Lambda expression  
XDocument doc = XDocument.Load(@"D:\ConsoleApplication4\MyData.xml");  
doc.Descendants("Project").Where(rec => rec.Attribute("Name").Value == "project2").Remove();  
//doc.Save(@"D:\ConsoleApplication4\MyData_Update.xml");  
doc.Save(Console.Out);  
Console.WriteLine();  
  
// Using XPathSelectElement() method (requires: using System.Xml.XPath;)
XDocument xdoc = XDocument.Parse(tempXml);  
xdoc.XPathSelectElement("Projects/Project[@Name = 'project1']").Remove();  
xdoc.Save(Console.Out);  
Console.WriteLine();  
  
// Remove specific node or remove all  
XElement root2 = XElement.Parse(@"<Root>    
                    <Child1>    
                        <GrandChild1/>    
                        <GrandChild2/>    
                    </Child1>    
                    <Child2>    
                        <GrandChild3/>    
                        <GrandChild4/>  
                    </Child2>   
                </Root>");  
  
// Remove specific node   
root2.Element("Child1").Element("GrandChild1").Remove();  
root2.Element("Child2").Elements().Remove();  // Remove all elements  
root2.Save(Console.Out);  
Console.WriteLine();
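One point worth illustrating when removing several nodes in a foreach loop: LINQ to XML queries are lazy, so removing nodes while the query is still being enumerated can end the loop early. A common way around this is to take a snapshot with ToList() first (the Remove() extension method used in the lambda example above does this internally). The sketch below assumes a three-project document; RemoveDemo and its method name are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class RemoveDemo
{
    // Removes every project whose ID is 2 or higher and
    // returns how many <Project> elements are left.
    public static int RemainingAfterRemove()
    {
        XElement root = XElement.Parse(
            "<Projects>" +
            "<Project ID='1' Name='project1' />" +
            "<Project ID='2' Name='project2' />" +
            "<Project ID='3' Name='project3' />" +
            "</Projects>");

        // Materialize the matches before modifying the tree.
        var toRemove = root.Elements("Project")
                           .Where(p => (int)p.Attribute("ID") >= 2)
                           .ToList();

        foreach (XElement p in toRemove)
        {
            p.Remove();
        }

        return root.Elements("Project").Count();
    }

    public static void Main()
    {
        Console.WriteLine(RemainingAfterRemove()); // 1
    }
}
```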

Select node value from XML

When working with XML data, we often want to fetch data based on a node or attribute value. For example, we need the name of the project whose ID is 2. We can use the XmlDocument class or XDocument (System.Xml.Linq namespace).

XmlDocument xmldoc = new XmlDocument();  
xmldoc.LoadXml(tempXml);  
  
int nodeId = 2;  
XmlNode nodeObj = xmldoc.SelectSingleNode("/Projects/Project[@ID=" + nodeId + "]");  
//string id = nodeObj["Project"].InnerText; // For inner text  
string pName = nodeObj.Attributes["Name"].Value;  
  
// Select Node based on XPath  
XmlNodeList xnList = xmldoc.SelectNodes("/Projects/Project");  
foreach (XmlNode xn in xnList)  
{  
    string projectName = xn.Attributes["Name"].Value;  
}  
  
// Select nodes by TagName  
XmlNodeList nodeList = xmldoc.GetElementsByTagName("Project");  
foreach (XmlNode node in nodeList)  
{  
    var ID = node.Attributes["ID"].Value;  
    var Name = node.Attributes["Name"].Value;  
}

Using XDocument 

string tempXmlData = @"<Projects>  
                    <Project ID='1' Name='project1' />  
                    <Project ID='2' Name='Not' />  
                    <Project ID='3' Name='project3' />  
                    <Project ID='4' Name='Test' />  
                    <Project ID='5' Name='project5' />  
                    </Projects>";  
  
XDocument doc = XDocument.Parse(tempXmlData);  
IEnumerable<Project> result = from rec in doc.Descendants("Project")  
                                where rec.Attribute("Name").Value.Contains("project")  
                                select new Project()  
                                {  
                                    ID = (int)rec.Attribute("ID"),  
                                    Name = (string)rec.Attribute("Name")  
                                };  
foreach (Project p in result)  
{  
    Console.WriteLine("ID:" + p.ID + ", Name: " + p.Name);  
}

Figure 5: Result after applying the filter
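The queries above read Attribute("...").Value directly, which throws a NullReferenceException if an element lacks that attribute. As a hedged alternative, the explicit casts that LINQ to XML defines on XAttribute return null for a missing attribute. The sketch below assumes that behavior is what we want; SelectDemo, FindName, and the sample data (including the Project without an ID) are invented for this illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class SelectDemo
{
    // Hypothetical sample; the last Project deliberately has no ID attribute.
    public const string Xml =
        "<Projects>" +
        "<Project ID='1' Name='project1' />" +
        "<Project ID='2' Name='project2' />" +
        "<Project Name='noId' />" +
        "</Projects>";

    // Returns the Name of the project with the given ID, or null if none matches.
    public static string FindName(string xml, int id)
    {
        XDocument doc = XDocument.Parse(xml);

        // (int?) yields null when the ID attribute is missing,
        // so the comparison is simply false for that element.
        XElement match = doc.Descendants("Project")
                            .FirstOrDefault(p => (int?)p.Attribute("ID") == id);

        // (string) on a null XAttribute also yields null.
        return (string)match?.Attribute("Name");
    }

    public static void Main()
    {
        Console.WriteLine(FindName(Xml, 2));         // project2
        Console.WriteLine(FindName(Xml, 9) == null); // True
    }
}
```

Note that the cast still throws a FormatException if the ID attribute is present but not a valid integer.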

We can generate XML data from C# objects. For instance, we have a list of projects and we want it in XML format. The code sample is given below.

// Generate XML data from C# objects  
List<Project> projects = new List<Project>()  
{  
    new Project{ID = 1, Name="Project1"},  
    new Project{ID = 2, Name="Project2"},  
    new Project{ID = 3, Name="Project3"},  
    new Project{ID = 4, Name="Project4"},  
    new Project{ID = 5, Name="Project5"}  
};  
  
string tempStr = SerializeObject<List<Project>>(projects);  
List<Project> tempProjects = DeserializeObject<List<Project>>(tempStr);  
  
XDocument xDocument = new XDocument(  
    new XDeclaration("1.0", "utf-8", "yes"),  
    new XComment("LINQ To XML Demo"),  
    new XElement("Projects",  
    from project in projects  
    select new XElement("Project", new XAttribute("ID", project.ID),  
                                    new XAttribute("Name", project.Name))));  
  
xDocument.Save(Console.Out);  
Console.WriteLine();

Figure 6: After converting objects to XML

XML Serialization/Deserialization

Serialization and deserialization are important features when an application needs to send data to, or receive data from, other applications. Serialization is the process of converting an object into another format, such as XML or binary. Deserialization is the reverse process: converting a byte array or XML data back into objects.

Keep the following points in mind when designing a class for XML serialization.

  • XML serialization only serializes public fields and properties.
  • XML serialization does not include any type information.
  • The class needs a public default (parameterless) constructor in order to be serialized.
  • Read-only properties are not serialized.

Below are some important attributes that control XML serialization.

  • XmlRoot: represents the XML document's root element.
  • XmlElement: the field or property is serialized as an XML element.
  • XmlAttribute: the field or property is serialized as an XML attribute.
  • XmlIgnore: the field or property is ignored during serialization.
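To illustrate the attributes listed above, here is a small sketch that applies all four to one class and serializes it. The EmployeeSketch class, its members, and the AttributeDemo helper are hypothetical and unrelated to the Project entity used in the rest of the article.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("Employee")]                 // name of the root element
public class EmployeeSketch
{
    [XmlAttribute("ID")]              // written as an attribute: ID="7"
    public int Id { get; set; }

    [XmlElement("FullName")]          // written as a child element
    public string Name { get; set; }

    [XmlIgnore]                       // never written
    public string Password { get; set; }

    // Read-only property (no setter): skipped by the serializer.
    public string Display
    {
        get { return Id + ":" + Name; }
    }
}

public static class AttributeDemo
{
    public static string Serialize(EmployeeSketch employee)
    {
        var serializer = new XmlSerializer(typeof(EmployeeSketch));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, employee);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var employee = new EmployeeSketch { Id = 7, Name = "Ada", Password = "secret" };
        // The output has an <Employee ...> root with an ID attribute and a
        // <FullName> child; Password and Display do not appear.
        Console.WriteLine(Serialize(employee));
    }
}
```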

Let's design a Project entity for serialization.

[XmlRoot("Projects")]  
public class Project  
{  
    [XmlAttributeAttribute("ID")]  
    public int ID { get; set; }  
  
    [XmlAttributeAttribute("Name")]  
    public string Name { get; set; }  
}

After designing the entity, we have a DeserializeObject() method, which takes XML data as a parameter and returns an object. Likewise, the SerializeObject() method takes an object as a parameter and returns the data in XML format.

public static T DeserializeObject<T>(string xml)  
{  
    var serializer = new XmlSerializer(typeof(T));  
    using (var tr = new StringReader(xml))  
    {  
        return (T)serializer.Deserialize(tr);  
    }  
}  
  
public static string SerializeObject<T>(T obj)  
{  
    var serializer = new XmlSerializer(typeof(T));  
  
    XmlWriterSettings settings = new XmlWriterSettings();  
    settings.Encoding = new UnicodeEncoding(true, true);  
    settings.Indent = true;  
    //settings.OmitXmlDeclaration = true;  
  
    XmlSerializerNamespaces ns = new XmlSerializerNamespaces();  
    ns.Add("", "");  
  
    using (StringWriter textWriter = new StringWriter())  
    {  
        using (XmlWriter xmlWriter = XmlWriter.Create(textWriter, settings))  
        {  
            serializer.Serialize(xmlWriter, obj, ns);  
        }  
  
        return textWriter.ToString(); //This is the output as a string  
    }  
} 

string tempStr = SerializeObject<List<Project>>(projects);
List<Project> tempProjects = DeserializeObject<List<Project>>(tempStr); 

Figure 7: Result after Serialization
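One detail to be aware of: when a List<Project> is serialized, the root element is named after the list type (ArrayOfProject by default), not after the XmlRoot attribute on the Project class. If a root named Projects is wanted, one option is to pass an XmlRootAttribute to the XmlSerializer constructor. The sketch below assumes that is the desired output; it uses a trimmed copy of the Project class so that it compiles on its own, and ListRootDemo is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Trimmed copy of the article's entity so the sketch is self-contained.
public class Project
{
    [XmlAttribute("ID")]
    public int ID { get; set; }

    [XmlAttribute("Name")]
    public string Name { get; set; }
}

public static class ListRootDemo
{
    // Override the root element name for the list as a whole.
    // The instance is created once and reused.
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(List<Project>), new XmlRootAttribute("Projects"));

    public static string Serialize(List<Project> projects)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, projects);
            return writer.ToString();
        }
    }

    public static List<Project> Deserialize(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (List<Project>)Serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var projects = new List<Project>
        {
            new Project { ID = 1, Name = "Project1" },
            new Project { ID = 2, Name = "Project2" }
        };

        string xml = Serialize(projects);
        Console.WriteLine(xml);

        List<Project> roundTrip = Deserialize(xml);
        Console.WriteLine(roundTrip.Count); // 2
    }
}
```

The serializer is kept in a static field on purpose: serializers created through this constructor overload are not cached by the framework, so creating one per call can steadily increase memory use.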

XmlDocument vs XDocument

We mostly use the XmlDocument or XDocument class to manipulate XML data. However, there are some differences between them:

  • XDocument (System.Xml.Linq namespace) is from the LINQ to XML API, and XmlDocument (System.Xml namespace) is the standard DOM-style API for XML.
  • If you're using .NET version 3.0 or lower, you have to use XmlDocument, the classic DOM API. XDocument was introduced in .NET 3.5; from that version onwards both are available, and XDocument is usually the more convenient choice for new code.
  • Performance-wise, XDocument is generally regarded as faster than XmlDocument, since it is a newer API designed to work well with LINQ. With XDocument it is also much simpler to create documents and process them.
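To make the usability difference more concrete, the following sketch builds the same one-project document with both APIs. It is only an illustration; BuildDemo and its method names are hypothetical.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class BuildDemo
{
    // DOM style: create each node through the document, then attach it.
    public static string BuildWithXmlDocument()
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("Projects");
        doc.AppendChild(root);

        XmlElement project = doc.CreateElement("Project");
        project.SetAttribute("ID", "1");
        project.SetAttribute("Name", "project1");
        root.AppendChild(project);

        return doc.OuterXml;
    }

    // LINQ to XML: the nested constructors mirror the shape of the XML.
    public static string BuildWithXDocument()
    {
        var doc = new XDocument(
            new XElement("Projects",
                new XElement("Project",
                    new XAttribute("ID", 1),
                    new XAttribute("Name", "project1"))));

        return doc.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        Console.WriteLine(BuildWithXmlDocument());
        Console.WriteLine(BuildWithXDocument());
    }
}
```

With XDocument the whole tree is declared in a single expression, whereas XmlDocument needs a create-then-append step for every node.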

Conclusion

We learned how to use LINQ to XML to manipulate XML: loading XML, reading the data within it, and writing data back out. We also discussed how to use both XmlDocument and XDocument. XDocument offers a more convenient, LINQ-friendly API and is available if you are using .NET Framework 3.5 onwards.

Hope this helps. 

