Bulk Data Transactions using OpenXML

OPENXML is a new function added to SQL Server 2000 that provides a rowset view over an XML document. Since a rowset is simply a set of rows that contain columns of data, OPENXML is the function that allows an XML document to be treated in the familiar relational database format. It allows for the passing of an XML document to a T-SQL stored procedure for updating the data.

OpenXML - to summarize

  • It extends the SQL Language
  • It is used within T-SQL Stored Procedures
    • XML Document passed as parameter
  • It uses row and column selectors utilizing XPath
  • It supports the following:
    • Attribute and element-centric mappings.
    • Edge table rowset.
    • XML annotation/overflow column.
    • Hierarchy support.

OpenXML - Syntax

  • OPENXML(idoc, rowpattern [, flags])
    [WITH (SchemaDecl | Tablename)]
  • Parameters
  • idoc
    Is the document handle of the internal representation of an XML document. The internal representation of an XML document is created by calling sp_xml_preparedocument.

sp_xml_preparedocument - Reads the Extensible Markup Language (XML) text provided as input, then parses the text using the MSXML parser (Msxml2.dll), and provides the parsed document in a state ready for consumption. This parsed document is a tree representation of the various nodes (elements, attributes, text, comments, and so on) in the XML document.

sp_xml_preparedocument returns a handle that can be used to access the newly created internal representation of the XML document.

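  • Rowpattern
    Is the XPath pattern used to identify the nodes (in the XML document whose handle is passed in the idoc parameter) to be processed as rows.
  • Flags
    Indicates the mapping that should be used between the XML data and the relational rowset, and how the spill-over column should be filled. flags is an optional input parameter, and can be one of these values:
    • 0 - defaults to attribute-centric mapping.
    • 1 - attribute-centric mapping.
    • 2 - element-centric mapping.
    • 8 - can be combined with 1 or 2; consumed data is not copied to the spill-over column.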
  • SchemaDecl
    In-line metadata for the relational view, in the form (column_name1 column_type1 [colpattern1], …, column_namej column_typej [colpatternj]).
  • Tablename
    An existing table from which to obtain the metadata for the relational view.
  • Edge table
    The format in which the rowset is returned if neither SchemaDecl nor Tablename is specified.
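
To make the syntax concrete, here is a minimal sketch that uses each part of it once. The sample document, its element and attribute names, and the column types are assumptions made for illustration only.

```sql
-- Hypothetical sample document; names and values are illustrative.
DECLARE @hDoc int
DECLARE @xml nvarchar(1000)
SET @xml = N'<root>
  <Employee eid="1" fname="Ann" lname="Lee"/>
  <Employee eid="2" fname="Raj" lname="Patel"/>
</root>'

-- Parse the text and obtain a document handle (the idoc argument).
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml

-- rowpattern selects one row per <Employee> node; flags = 1 requests
-- attribute-centric mapping; the WITH clause is an in-line SchemaDecl.
SELECT eid, fname, lname
FROM OPENXML(@hDoc, '/root/Employee', 1)
WITH (eid int, fname varchar(20), lname varchar(20))

-- With no WITH clause, OPENXML returns the rowset in edge table format.
SELECT id, parentid, nodetype, localname, [text]
FROM OPENXML(@hDoc, '/root/Employee')

-- Release the memory held by the parsed document.
EXEC sp_xml_removedocument @hDoc
```

The first query returns one row per employee with typed columns, while the second returns one row per XML node, which is mostly useful for inspecting a document whose structure is not known in advance.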

Architecture:

[Figure: BulkDataTransactionsUsingOpenXML.jpg]

The rest of this article describes how to use the OPENXML function to insert multiple rows of data in a single database call. This can be an effective alternative to looping through an array and calling a stored procedure to insert one row at a time.

The example provided inserts 10 rows into a table, so the OPENXML approach cuts the number of database calls from 10 to 1. Minimizing database calls can translate into significant performance and scalability benefits. Every database call consumes network and database resources, and the more you demand of those resources, the more likely your application's performance will degrade. OPENXML lets you package the data together in a single call (as XML), map it to a rowset view, and execute all of the inserts within that same call, which keeps the use of these resources to a minimum.

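CREATE PROC sp_insert_employee @empdata ntext
AS
DECLARE @hDoc int
-- Parse the XML text and get a handle to its internal representation
EXEC sp_xml_preparedocument @hDoc OUTPUT, @empdata
-- One row per <Employee> node; the column layout comes from the Employee table.
-- The row pattern assumes the <Employee> nodes sit under a <root> element,
-- as in the update example; adjust it to match your document.
INSERT INTO Employee
SELECT *
FROM OPENXML(@hDoc, '/root/Employee')
WITH Employee
-- Free the memory held by the parsed document
EXEC sp_xml_removedocument @hDoc
GO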

The only parameter passed to the procedure is the XML document, declared here as ntext. Depending on the size of the XML string you are working with, the input parameter can be (n)char, (n)varchar, or (n)text. The @hDoc variable is required by sp_xml_preparedocument as an output parameter. sp_xml_preparedocument is a SQL Server system stored procedure that creates an internal representation of the XML document passed to it and returns a handle to that representation in @hDoc.

The OPENXML function accepts three arguments, the first two of which are required. The first argument is the document handle that you created by calling sp_xml_preparedocument. This tells OPENXML which XML document you are working with. The second argument is an XPath (XML Path Language) pattern used to identify the nodes in the XML document.

Each node identified by the XPath pattern corresponds to a single row in the rowset generated by OPENXML. In our example, there are 10 <Employee> nodes, each representing a row in the rowset. The third argument is optional and specifies how the mapping should occur between the rowset created by OPENXML and the XML document. The default is attribute-centric, which means that an XML attribute of a given name is stored in the rowset column with the same name.
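
The effect of the third argument is easiest to see on a document that mixes both styles. The sketch below assumes a hypothetical <Employee> node that carries eid as an attribute and the names as child elements; the document and column types are illustrative.

```sql
DECLARE @hDoc int
DECLARE @xml nvarchar(1000)
-- Hypothetical document: eid is an attribute, fname and lname are elements.
SET @xml = N'<root>
  <Employee eid="1"><fname>Ann</fname><lname>Lee</lname></Employee>
</root>'
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml

-- flags = 1 (attribute-centric): eid is filled, fname and lname are NULL.
SELECT * FROM OPENXML(@hDoc, '/root/Employee', 1)
WITH (eid int, fname varchar(20), lname varchar(20))

-- flags = 2 (element-centric): fname and lname are filled, eid is NULL.
SELECT * FROM OPENXML(@hDoc, '/root/Employee', 2)
WITH (eid int, fname varchar(20), lname varchar(20))

-- flags = 3 (1 + 2): attributes are mapped first, then elements are used
-- for the columns that are still unmapped, so all three columns are filled.
SELECT * FROM OPENXML(@hDoc, '/root/Employee', 3)
WITH (eid int, fname varchar(20), lname varchar(20))

EXEC sp_xml_removedocument @hDoc
```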

The WITH clause allows you to specify a Schema declaration (to specify additional mapping between a column in the rowset and a value in the XML document) or the table name if the table already exists with the desired schema.
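
As a sketch of the schema declaration form, the query below gives each column an explicit column pattern, so the rowset column names no longer have to match the XML names. The <Dept> wrapper, the column names, and the types are hypothetical.

```sql
DECLARE @hDoc int
DECLARE @xml nvarchar(1000)
-- Hypothetical document with employees nested inside a department.
SET @xml = N'<root>
  <Dept name="Sales">
    <Employee eid="1"><fname>Ann</fname></Employee>
    <Employee eid="2"><fname>Raj</fname></Employee>
  </Dept>
</root>'
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml

-- Each column pattern is an XPath expression relative to the row node.
SELECT *
FROM OPENXML(@hDoc, '/root/Dept/Employee')
WITH (EmployeeID int         '@eid',      -- attribute of the row node
      FirstName  varchar(20) 'fname',     -- child element of the row node
      Department varchar(20) '../@name')  -- attribute of the parent node

EXEC sp_xml_removedocument @hDoc
```

An explicit column pattern takes precedence over the mapping selected by the flags argument, which is why attributes and elements can be mixed here without setting flags.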

The example does a simple insert into the Employee table, and since the XML document was created specifically to insert multiple rows into the Employee table, it is sufficient to specify the table name Employee in our WITH clause.
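
A call to the procedure might look like the sketch below. The table definition is an assumption (the column names are taken from the update example later in this article and the types are guesses), and the <root> wrapper assumes the row pattern inside the procedure is written to match it, as in '/root/Employee'.

```sql
-- Assumed table shape; create it before calling the procedure.
CREATE TABLE Employee (eid int, fname varchar(20), lname varchar(20))
GO

-- One database call inserts every <Employee> node in the document.
-- Only three rows are shown here; the article's scenario uses ten.
EXEC sp_insert_employee N'<root>
  <Employee eid="1" fname="Ann" lname="Lee"/>
  <Employee eid="2" fname="Raj" lname="Patel"/>
  <Employee eid="3" fname="Mei" lname="Chen"/>
</root>'
```

Because the default mapping is attribute-centric, each attribute name has to match a column name in the Employee table.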

The last statement, EXEC sp_xml_removedocument @hDoc, is called to remove the XML document from its storage location in the internal cache of SQL Server.

In summary, the new OPENXML function in SQL Server 2000 can be useful for processing multiple inserts into a table within a single database call. Mapping a specified portion of an XML document to a rowset inside a stored procedure makes repetitive inserts considerably more efficient.

You can also update and delete rows with XML using OPENXML. Without going into specifics, the process is basically:

  1. Create an internal representation of the XML document with SP_XML_PREPAREDOCUMENT
  2. Perform the UPDATE / DELETE using the FROM OPENXML () WITH ... syntax, referencing the internal representation of the XML document
  3. Destroy the internal representation of the XML document with SP_XML_REMOVEDOCUMENT

Examples of how to use OPENXML for updating and deleting records are given below:

CREATE PROC sp_update_employee @empdata ntext
AS
DECLARE @hDoc int
EXEC sp_xml_preparedocument @hDoc OUTPUT, @empdata
-- Match table rows to XML rows on eid and copy the names across
UPDATE Employee
SET Employee.fname = XMLEmployee.fname,
    Employee.lname = XMLEmployee.lname
FROM OPENXML(@hDoc, '/root/Employee')
WITH Employee XMLEmployee
WHERE Employee.eid = XMLEmployee.eid
EXEC sp_xml_removedocument @hDoc
-- Return the updated table as XML
SELECT *
FROM Employee
FOR XML AUTO
GO
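
A hypothetical call to the update procedure could look like this. The eid values are assumed to exist in the Employee table already, and the sample names are illustrative.

```sql
-- Rows are matched on eid; fname and lname are overwritten from the XML.
EXEC sp_update_employee N'<root>
  <Employee eid="1" fname="Anne" lname="Lee"/>
  <Employee eid="2" fname="Raj" lname="Patil"/>
</root>'
```

Employee rows whose eid does not appear in the document are left unchanged, because the WHERE clause only pairs rows that exist on both sides.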

 

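CREATE PROC sp_delete_data @empdata ntext
AS
DECLARE @hDoc int
EXEC sp_xml_preparedocument @hDoc OUTPUT, @empdata
-- Delete every customer whose CustomerID appears in the XML rowset.
-- Each column pattern is relative to the <OrderDetail> row node; if an
-- attribute lives on a parent element, prefix its pattern with '../'.
DELETE Customers
FROM OPENXML(@hDoc, '/ROOT/Customer/Order/OrderDetail', 2)
WITH (OrderID    int         '@OrderID',
      CustomerID varchar(10) '@CustomerID',
      OrderDate  datetime    '@OrderDate',
      ProdID     int         '@ProductID',
      Qty        int         '@Quantity') b
WHERE Customers.CustomerID = b.CustomerID
EXEC sp_xml_removedocument @hDoc
GO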
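
Before running a delete of this kind, it can help to preview the rowset with a plain SELECT over the same OPENXML expression. The sketch below assumes a hypothetical document in which the order-level attributes sit on the parent <Order> element, so those column patterns navigate up with '../'. The document and its values are illustrative.

```sql
DECLARE @idoc int
DECLARE @doc nvarchar(1000)
-- Hypothetical document; structure and values are illustrative.
SET @doc = N'<ROOT>
  <Customer CustomerID="VINET">
    <Order OrderID="10248" CustomerID="VINET" OrderDate="1996-07-04T00:00:00">
      <OrderDetail ProductID="11" Quantity="12"/>
      <OrderDetail ProductID="42" Quantity="10"/>
    </Order>
  </Customer>
</ROOT>'
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

-- Preview: one row per <OrderDetail>, with order-level values read
-- from the parent <Order> element.
SELECT *
FROM OPENXML(@idoc, '/ROOT/Customer/Order/OrderDetail', 2)
WITH (OrderID    int         '../@OrderID',
      CustomerID varchar(10) '../@CustomerID',
      OrderDate  datetime    '../@OrderDate',
      ProdID     int         '@ProductID',
      Qty        int         '@Quantity')

EXEC sp_xml_removedocument @idoc
```

If the preview shows NULL in the CustomerID column, the column pattern does not match the document, and a DELETE joined on that column would remove nothing.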

Summary

  • Leverages existing relational model for use with XML
  • Provides:
    • A mechanism for updating database with data in XML format
    • Multi-row updates in single stored procedure call
    • Multi-table updates leverage XML hierarchy
    • Queries that join existing tables with XML data
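
As a sketch of the last point, an OPENXML rowset can be joined to an existing table like any other table source. The example assumes the Employee table used earlier in this article and a hypothetical document that lists the eid values of interest.

```sql
DECLARE @hDoc int
DECLARE @xml nvarchar(1000)
-- Hypothetical document listing the employees to look up.
SET @xml = N'<root><Employee eid="1"/><Employee eid="3"/></root>'
EXEC sp_xml_preparedocument @hDoc OUTPUT, @xml

-- Return only the Employee rows whose eid is listed in the XML document.
SELECT e.eid, e.fname, e.lname
FROM Employee e
JOIN OPENXML(@hDoc, '/root/Employee', 1) WITH (eid int) x
  ON e.eid = x.eid

EXEC sp_xml_removedocument @hDoc
```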