Reading XML with Namespaces using LINQ

Introduction

On a recent .NET project I had to hook up with a web service that used non-standard headers in the request. After struggling with the configuration settings and the sparse Windows Communication Foundation (WCF) documentation for creating custom headers, I finally decided to skip WCF altogether, create the request as straight XML, and manually process the XML in the response. It turned out to be straightforward, but I had a little trouble processing the returned XML using LINQ. The problem? The returned XML had namespaces, which is common enough in the real world, but the .NET documentation and various blog posts showing how to process XML with LINQ don’t give much attention to namespaces.

XmlNamespaceManager

The key to querying XML with namespaces via XPath, whether from LINQ or not, is to pass an XML Namespace Manager along with the query. The XML Namespace Manager requires a Name Table, and a convenient place to get one is the XML Reader that loads the document. Here are the associated classes:

  • System.Xml.XmlReader – used to load XML from a file or a stream (or a string, via a StringReader); also delivers the XML’s Name Table
  • System.Xml.XmlNamespaceManager – represents an XML Namespace Manager, which is included with any call that parses the XML
  • System.Xml.XmlNameTable – represents an XML Name Table, which is used to create an XML Namespace Manager

The next sections will show how to populate the XML Namespace Manager. It’s not that hard, but it’s not the kind of thing you’d just stumble across reading the .NET documentation – at least I didn’t.
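For what it's worth, the Name Table does not have to come from a reader. The sketch below assumes it's acceptable to create a standalone System.Xml.NameTable and hand that to the XML Namespace Manager; the class and method names are hypothetical, and the two namespace URIs are the ones from the sample XML used later in this post.

```csharp
using System;
using System.Xml;

static class NamespaceManagerSketch {

    // Builds a namespace manager on top of a fresh NameTable instead of
    // borrowing the one that belongs to an XmlReader.
    public static XmlNamespaceManager Build() {
        XmlNamespaceManager manager = new XmlNamespaceManager(new NameTable());
        manager.AddNamespace("env", "http://schemas.xmlSOAP.org/SOAP/envelope/");
        manager.AddNamespace("aaa", "http://lakenine.com/playerInfo");
        return manager;
    }

    static void Main() {
        // LookupNamespace maps a prefix back to the URI registered for it.
        Console.WriteLine(Build().LookupNamespace("aaa"));
    }
}
```

Either way, the prefixes registered here only need to match the namespace URIs in the document; they don't have to be the same prefixes the document itself uses.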

Sample XML

The code example will pull player names from this fake web service response:

<env:Envelope xmlns:env="http://schemas.xmlSOAP.org/SOAP/envelope/"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <env:Body>
  <aaa:PlayerList xmlns:aaa="http://lakenine.com/playerInfo">
   <Player id="778-2">
    <Name>Fred Jones</Name>
   </Player>
   <Player id="463-9">
    <Name>David Arias</Name>
   </Player>
  </aaa:PlayerList>
 </env:Body>
</env:Envelope>

Two Ways to Parse: XElement and XDocument

As far as LINQ queries go, there’s only one difference between using System.Xml.Linq.XElement and System.Xml.Linq.XDocument:

  • An instance of XElement loaded from the XML will represent the root element of the document. In the case of the sample XML, that’s the env:Envelope element.
  • An instance of XDocument will represent the entire document. To get the root element, you’ll have to select it.

Getting ahead of myself for just a bit, to select the env:Body element when using XElement, the XPath is env:Body, but when using XDocument it’s env:Envelope/env:Body.
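Here's a minimal sketch of that difference, assuming a cut-down envelope with an empty body. The class and method names are hypothetical; the only point is which XPath finds env:Body from each starting node.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

static class ContextNodeSketch {

    // A trimmed version of the sample envelope, just enough to query.
    const string Xml =
        "<env:Envelope xmlns:env=\"http://schemas.xmlSOAP.org/SOAP/envelope/\">" +
        "<env:Body/>" +
        "</env:Envelope>";

    static XmlNamespaceManager Namespaces() {
        XmlNamespaceManager manager = new XmlNamespaceManager(new NameTable());
        manager.AddNamespace("env", "http://schemas.xmlSOAP.org/SOAP/envelope/");
        return manager;
    }

    // Starting at the root element, env:Body is a direct child.
    public static bool FoundFromElement() {
        XElement root = XElement.Parse(Xml);
        return root.XPathSelectElement("env:Body", Namespaces()) != null;
    }

    // Starting at the document, the path has to go through env:Envelope.
    public static bool FoundFromDocument(string xpath) {
        XDocument doc = XDocument.Parse(Xml);
        return doc.XPathSelectElement(xpath, Namespaces()) != null;
    }

    static void Main() {
        Console.WriteLine(FoundFromElement());
        Console.WriteLine(FoundFromDocument("env:Envelope/env:Body"));
        Console.WriteLine(FoundFromDocument("env:Body"));
    }
}
```

The last call comes back empty because, seen from the document node, the only child element is env:Envelope.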

Parsing XML using XElement

This example pulls player names from the sample XML using System.Xml.Linq.XElement. It’s fully functional, so feel free to copy, build, and try it:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Added these "using" directives
using System.Xml;       // XmlReader, XmlNamespaceManager, and others
using System.Xml.Linq;  // LINQ for XML
using System.Xml.XPath; // XPath extensions
using System.IO;        // StringReader; wraps a TextReader around a string

namespace XMLWithLinq {

    class GetNamesWithXElement {

        static void Main(string[] args) {

            // The sample XML
            string sampleXML = String.Concat(
                "<env:Envelope xmlns:env=\"http://schemas.xmlSOAP.org/SOAP/envelope/\"",
                "     xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\"",
                "     xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">",
                " <env:Body>",
                "  <aaa:PlayerList xmlns:aaa=\"http://lakenine.com/playerInfo\">",
                "   <Player id=\"778-2\">",
                "    <Name>Fred Jones</Name>",
                "   </Player>",
                "   <Player id=\"463-9\">",
                "    <Name>David Arias</Name>",
                "   </Player>",
                "  </aaa:PlayerList>",
                " </env:Body>",
                "</env:Envelope>");

            // Open an XML reader over the sample XML. The sample XML is in
            // a string, and XmlReader.Create accepts a TextReader, so wrap
            // the string in a StringReader.
            XmlReader reader = XmlReader.Create(new StringReader(sampleXML));

            // Get an XML Name Table. The easiest way is to start with the
            // one the XmlReader gives us.
            System.Xml.XmlNameTable nameTable = reader.NameTable;

            // Create an XML Namespace Manager. It needs a name table.
            System.Xml.XmlNamespaceManager namespaceManager = new System.Xml.XmlNamespaceManager(nameTable);

            // Add the "env" and "aaa" namespaces to the XML Namespace Manager.
            namespaceManager.AddNamespace("env", "http://schemas.xmlSOAP.org/SOAP/envelope/");
            namespaceManager.AddNamespace("aaa", "http://lakenine.com/playerInfo");

            // Load an XElement from the XML Reader; this gives us the
            // root element of the document (env:Envelope).
            XElement doc = XElement.Load(reader);

            // Finally, we're ready to select the player names.
            var playerNames = from pn
                              in doc.XPathSelectElements("env:Body/aaa:PlayerList/Player/Name", namespaceManager)
                              select (string)pn;
            foreach (string pn in playerNames) {
                Console.WriteLine("Player name: " + pn);
            }
        }
    }
}

The output from this program is as follows:

Player name: Fred Jones
Player name: David Arias

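The statements from reader.NameTable through the two AddNamespace calls assemble the XML Namespace Manager. Once that’s done, the XPath expression passed to XPathSelectElements in the LINQ query can resolve the env and aaa prefixes.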

Also note that the player name is retrieved by casting the selected XML element to a string. The XPathSelectElements method returns a collection of XElement objects, and XElement defines explicit conversion operators that return the element’s content as a string or as one of several primitive and nullable types. See the XElement documentation for a complete list of supported data types.
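To make the casting concrete, here is a small sketch. The Age element is hypothetical (the sample XML has no such element), as are the class and method names; it's only there to show a non-string cast and what happens when an element is missing.

```csharp
using System;
using System.Xml.Linq;

static class CastSketch {

    public static string Describe() {
        // Hypothetical element, loosely modeled on the sample's Player.
        XElement player = XElement.Parse(
            "<Player id=\"778-2\"><Name>Fred Jones</Name><Age>34</Age></Player>");

        string name = (string)player.Element("Name");
        int age = (int)player.Element("Age");

        // There is no Rank element, so Element returns null. Casting that
        // to a nullable type yields null instead of throwing.
        int? rank = (int?)player.Element("Rank");

        // Attributes support the same style of cast.
        string id = (string)player.Attribute("id");

        return name + ", " + age + ", " + id + ", rank known: " + rank.HasValue;
    }

    static void Main() {
        Console.WriteLine(Describe());
    }
}
```

Casting a missing element to a non-nullable type such as int throws, so the nullable casts are the safer choice whenever an element is optional.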

Parsing XML using XDocument

The XDocument approach differs in only two places:

  1. Change XElement doc = XElement.Load(reader) to XDocument doc = XDocument.Load(reader).
  2. In the LINQ query, add env:Envelope/ to the beginning of the XPath expression; the rest of the XPath is identical:
    • XElement : env:Body/aaa:PlayerList/Player/Name
    • XDocument: env:Envelope/env:Body/aaa:PlayerList/Player/Name
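With both changes applied, the XDocument version might look roughly like this. It's a condensed sketch rather than a line-for-line copy of the XElement program: the class and method names are hypothetical, the query uses method syntax, and the reader is wrapped in a using block so it gets disposed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

static class GetNamesWithXDocument {

    // The sample XML, minus the unused xsd and xsi declarations.
    const string SampleXml =
        "<env:Envelope xmlns:env=\"http://schemas.xmlSOAP.org/SOAP/envelope/\">" +
        " <env:Body>" +
        "  <aaa:PlayerList xmlns:aaa=\"http://lakenine.com/playerInfo\">" +
        "   <Player id=\"778-2\"><Name>Fred Jones</Name></Player>" +
        "   <Player id=\"463-9\"><Name>David Arias</Name></Player>" +
        "  </aaa:PlayerList>" +
        " </env:Body>" +
        "</env:Envelope>";

    public static List<string> PlayerNames() {
        using (XmlReader reader = XmlReader.Create(new StringReader(SampleXml))) {
            XmlNamespaceManager namespaceManager = new XmlNamespaceManager(reader.NameTable);
            namespaceManager.AddNamespace("env", "http://schemas.xmlSOAP.org/SOAP/envelope/");
            namespaceManager.AddNamespace("aaa", "http://lakenine.com/playerInfo");

            // XDocument represents the whole document, so the XPath
            // starts at env:Envelope rather than env:Body.
            XDocument doc = XDocument.Load(reader);
            return doc
                .XPathSelectElements("env:Envelope/env:Body/aaa:PlayerList/Player/Name", namespaceManager)
                .Select(pn => (string)pn)
                .ToList();
        }
    }

    static void Main() {
        foreach (string pn in PlayerNames()) {
            Console.WriteLine("Player name: " + pn);
        }
    }
}
```

The output should match the XElement version: one "Player name:" line per player.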
