XML Tutorial: Transforming XML Documents

 

XSLT – Part 1

 

Transforming XML Documents

 

The tree of specifications for transforming XML documents from one form to another is shown below.

 

 

Cascading Style Sheets (CSS) is a specification for adding rules to HTML documents that control the appearance of the rendered document: fonts, spacing, style, margins and so on. We will not cover CSS in this tutorial, but the original recommendations, along with the browsers and authoring tools that support them, are available at:

 

http://www.w3.org/Style/CSS/

 

 

eXtensible Stylesheet Language (XSL) is a family of specifications, comparable in purpose to CSS, aimed at working with general XML documents. There are two key parts:

•  XSL Transformations (XSLT): a language for specifying transformations of XML documents. In this chapter and the next we will focus on XSLT in detail. We will also see a complete example in action thereafter.

•  XSL Formatting Objects (XSL-FO): a specification of formatting semantics. Although we will not formally discuss XSL-FO, we will see an example in a later chapter.

 

The original specification on XSL is available here:

 

http://www.w3.org/Style/XSL/

 

Before we move on to the nuts and bolts of XSLT, you may want to download an editor that supports XSLT so that you can try out the examples yourself. One option is XMLSpy Home Edition (from www.altova.com), which is free.

 

In all our examples, both in this chapter and the next, we will use the base XML file shown below as data, which we will then transform using XSLT stylesheets. I recommend that you copy this file into your chosen editor and run the transformations as we go through them (in the XMLSpy editor, this is done from the menu XSL -> XSL Transformation).

 

As shown in the XML file below, to render an XML document using an XSLT stylesheet, the stylesheet is named just after the XML declaration (the first line of the XML file), using the following processing instruction:

<?xml-stylesheet type="text/xsl" href="xslt-url"?>

where xslt-url is the URL of the XSLT file. You may want to change this depending on the location of your XSLT file.
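
As a hedged illustration, the fragment below assumes a stylesheet named articles.xslt stored in the same folder as the XML file, so that a relative URL is enough. The file name, its location and the shortened content are assumptions made for this sketch; only the processing instruction matters here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumption: articles.xslt sits next to this XML file, so a relative URL resolves -->
<?xml-stylesheet type="text/xsl" href="articles.xslt"?>
<ARTICLES>
    <ARTICLE>
        <ARTICLEDATA>
            <TITLE>XSLT Basics</TITLE>
        </ARTICLEDATA>
    </ARTICLE>
</ARTICLES>
```

For a stylesheet at a fixed location on disk, a file URL such as file:///D:/articles.xslt is usually a more portable way to write the path than a bare drive path.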

 

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="D:\articles.xslt"?>
<ARTICLES>
    <ARTICLE>
        <ARTICLEDATA>
            <TITLE>XSLT Basics</TITLE>
            <AUTHOR>Jaidev</AUTHOR>
            <ABSTRACT>An XSLent document</ABSTRACT>
            <BODY>All about XSLTs</BODY>
        </ARTICLEDATA>
    </ARTICLE>
    <ARTICLE>
        <ARTICLEDATA>
            <TITLE>CSharp Basics</TITLE>
            <AUTHOR>Aleksey N</AUTHOR>
            <ABSTRACT>See its sharp!!</ABSTRACT>
            <BODY>All about sharp sightedness</BODY>
        </ARTICLEDATA>
    </ARTICLE>
    <ARTICLE>
        <ARTICLEDATA>
            <TITLE>XML Revisited</TITLE>
            <AUTHOR>Visitor X M L</AUTHOR>
            <ABSTRACT>The XML text for you</ABSTRACT>
            <BODY>All about XML is in here - time and again</BODY>
        </ARTICLEDATA>
    </ARTICLE>
</ARTICLES>

XSLT Syntax

Basics

 

There are two key things to note about an XSLT document.

 

•  An XSLT document is just another XML document

•  An XSLT document’s contents lie within the xsl:stylesheet element

 

With these in mind, you can already create your first and most basic XSLT document as below!

 

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

</xsl:stylesheet>

That of course was a blank stylesheet. So now let us see what goes into the space between the xsl:stylesheet start and end tags – essentially the content of the transformation.

 

Templates

 

Stylesheets operate on XML documents by matching nodes against template rules, each of which is specified by an xsl:template element. Template rules are children of the xsl:stylesheet element, which is the root of the stylesheet.

 

Consider the following XSLT used to transform our standard XML document (shown above):

 

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="html" encoding="UTF-8" indent="yes"/>
    <!-- Template rule applied to the root of the source document -->
    <xsl:template match="/">
        <html>
            <body>
                <h2>List of books</h2>
                <table border="2">
                    <!-- Header row: th cells are rendered bold by default -->
                    <tr>
                        <th>Title</th>
                        <th>Author</th>
                        <th>Abstract</th>
                    </tr>
                    <!-- Placeholder row, to be filled with real data later -->
                    <tr>
                        <td>Content here</td>
                        <td>Content here</td>
                        <td>Content here</td>
                    </tr>
                </table>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>

The output is an HTML file that is rendered as below:

List of books

Title           Author          Abstract
Content here    Content here    Content here

 

The key statement to note in the XSLT file is:

 

<xsl:template match="/">

 

This is the basic template. It matches the root of the XML document (the “/” represents the root in XPath syntax) and outputs what is within it, which here is an HTML page containing the table. The match value could be any other valid XPath pattern. We will learn about XPath (XML paths) in a later chapter.
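
To illustrate that other match values work, here is a sketch that matches the ARTICLES element instead of the root. It assumes the processor's built-in template rules are in effect, which is the XSLT 1.0 default: the built-in rule for the root node passes processing on to its children, so the ARTICLES template is applied once. The shortened HTML body is an illustrative choice, not part of the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="html" encoding="UTF-8" indent="yes"/>
    <!-- Matches the ARTICLES document element rather than the root node "/" -->
    <xsl:template match="ARTICLES">
        <html>
            <body>
                <h2>List of books</h2>
            </body>
        </html>
    </xsl:template>
</xsl:stylesheet>
```

Because this template contains no instruction to process the children of ARTICLES, the individual articles are not visited and only the heading is output.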

 

Now let us see how to extend the above XSLT so that some meaningful output appears in place of the “Content here” cells.

 

Values and Iterations

 

To retrieve the text content of an XML element, use the xsl:value-of element:

<xsl:value-of select="xpath-address"/>

where xpath-address is the XPath expression to evaluate. In the completed XSLT at the end of this section, the content of the “TITLE” element is retrieved this way, as is the content of “AUTHOR” and “ABSTRACT”.
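
A minimal sketch of xsl:value-of on its own, assuming the sample articles file from earlier as input. Plain-text output is a choice made for this illustration. When the select expression matches several nodes, XSLT 1.0 uses only the first one in document order, which is why iteration is needed to reach the other articles.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" encoding="UTF-8"/>
    <xsl:template match="/">
        <!-- The path matches three TITLE elements; value-of returns the text of the first -->
        <xsl:value-of select="ARTICLES/ARTICLE/ARTICLEDATA/TITLE"/>
    </xsl:template>
</xsl:stylesheet>
```

Run against the sample file, this should produce the single title XSLT Basics.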

 

We also need to iterate over the elements of the XML file (three articles in our example) so that we can populate the rows of the table, one row per article. The element for this is:

<xsl:for-each select="xpath-address">

where, once again, xpath-address is the XPath expression to evaluate; the content of the xsl:for-each element is applied once to each node that the expression selects. In the completed XSLT, all “ARTICLEDATA” elements (identified by the complete XPath address “ARTICLES/ARTICLE/ARTICLEDATA”) are iterated over in this way.
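
Before the full HTML version, here is a reduced sketch of the same iteration that writes plain text instead of table rows. The text output method and the "title by author" line format are assumptions made for illustration; the input is the sample articles file from earlier.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text" encoding="UTF-8"/>
    <xsl:template match="/">
        <!-- Visit each ARTICLEDATA element in document order -->
        <xsl:for-each select="ARTICLES/ARTICLE/ARTICLEDATA">
            <!-- Inside the loop, paths are relative to the current ARTICLEDATA -->
            <xsl:value-of select="TITLE"/>
            <xsl:text> by </xsl:text>
            <xsl:value-of select="AUTHOR"/>
            <!-- &#10; is a line feed, so each article lands on its own line -->
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

With the sample data this should list three lines, starting with "XSLT Basics by Jaidev". The point to note is that select="TITLE" is resolved relative to the ARTICLEDATA element currently being visited, not to the document root.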

 

Let us now see the completed XSLT and the output.