The eXtensible Markup Language (XML) is a generic markup language used to represent arbitrary structured data.

  • designed to be simple and general
  • human and machine readable
  • comprehensive Unicode support

Applications of XML

  • RSS
  • SVG
  • ODF
  • AJAX
  • XHTML

Structure of XML documents

  • Similar to HTML, as both derive from SGML (HTML as an SGML application, XML as a simplified subset)
  • Documents consist of elements
  • Elements are defined using tags
    • Elements contain content
    • Elements contain attributes
  • Comments in XML are defined via <!-- -->

Unlike in HTML, XML’s elements and attributes have no predefined meaning; each application defines its own vocabulary

Syntax

  • All elements must be defined via properly nested start/end tags or an empty tag (<foo/>)
  • Element/attribute names are case-sensitive
  • Every attribute must have a value, and the value must be enclosed in single or double quotation marks
  • The document must have exactly one root element
  • Documents should contain an initial XML declaration:
<?xml version="1.0" encoding="UTF-8"?>
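
The nesting, case-sensitivity, and root-element rules listed above can be sketched as a toy checker. This is only an illustration, not a real XML parser: the function name checkNesting is hypothetical, and the sketch assumes the input contains no comments, no CDATA sections, and no ">" characters inside attribute values.

```javascript
// Toy check of tag nesting and the single-root rule.
// Assumptions: no comments, no CDATA sections, no ">" inside attribute values.
function checkNesting(xml) {
  // Captures: optional "/" (end tag), the element name, optional "/" (empty tag).
  const tagPattern = /<(\/?)([A-Za-z_:][\w:.-]*)[^>]*?(\/?)>/g;
  const stack = [];   // names of elements that are currently open
  let rootCount = 0;  // number of top-level elements seen so far
  let match;
  while ((match = tagPattern.exec(xml)) !== null) {
    const [, closing, name, selfClosing] = match;
    if (closing) {
      // An end tag must match the most recently opened element (case-sensitive).
      if (stack.pop() !== name) return false;
    } else {
      if (stack.length === 0) rootCount += 1;
      if (!selfClosing) stack.push(name);
    }
  }
  return stack.length === 0 && rootCount === 1;
}

console.log(checkNesting("<a><b/><c>text</c></a>")); // properly nested
console.log(checkNesting("<a><b></a></b>"));         // overlapping tags
console.log(checkNesting("<Foo></foo>"));            // names differ in case
```

The XML declaration and DOCTYPE lines are skipped by the pattern because they do not start with an element name.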

Schema

XML schemas are often defined via a DTD (Document Type Definition), which declares the allowed elements and attributes

Defining an element and attributes

<!ELEMENT name content>
<!ATTLIST element name type value>

content can be one of:

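  • (#PCDATA) - text content only (parsed character data), no child elements
  • EMPTY - an empty element
  • ANY - any mix of text and elements declared in the DTD
  • (regex) - where the regex is a regular expression over child element names, built from , (sequence), | (choice) and ? * + (repetition)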

DTD Example

<!ELEMENT firstName (#PCDATA)>
<!ELEMENT lastName (#PCDATA)>
<!ELEMENT name (firstName, lastName)>

The following XML is valid against this DTD:

<name>
    <firstName>John</firstName>
    <lastName>Smith</lastName>
</name>
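
To make the "regular expression over child names" idea concrete, the content model (firstName, lastName) can be translated into an ordinary JavaScript regular expression and matched against the sequence of child element names. This is a minimal sketch under stated assumptions: contentModelToRegExp and childrenMatch are hypothetical helper names, and the model may contain only element names, parentheses, and the operators , | ? * + (no #PCDATA).

```javascript
// Translate an element-only DTD content model into a RegExp over child names.
// Assumption: the model contains only names, parentheses, and , | ? * +
function contentModelToRegExp(model) {
  const body = model
    .replace(/\s+/g, "")                    // drop whitespace
    .replace(/[A-Za-z_]\w*/g, "(?:$& )")    // each name matches "name "
    .replace(/,/g, "");                     // a sequence is plain concatenation
  return new RegExp(`^${body}$`);
}

// childNames: the names of an element's children, in document order.
function childrenMatch(model, childNames) {
  const sequence = childNames.map((name) => `${name} `).join("");
  return contentModelToRegExp(model).test(sequence);
}

console.log(childrenMatch("(firstName, lastName)", ["firstName", "lastName"]));
console.log(childrenMatch("(firstName, lastName)", ["lastName", "firstName"]));
```

Each child name is followed by a single space so that names such as "name" and "nameSuffix" cannot be confused when the sequence is matched as one string.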

Defining Attributes

<!ATTLIST element name type value>

The attribute type can be:

  • CDATA - where the value is an arbitrary string
  • (val|val2|val3) - where the value must be val, val2 or val3
  • ID - the attribute value is an identifier and must be unique among the values of all ID attributes in the document
  • IDREF - the value must match the value of an ID attribute used elsewhere in the document
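
The ID and IDREF constraints amount to a uniqueness check followed by a lookup. The sketch below illustrates this over a flat list of attribute records; the record shape { type, value } and the function name checkIds are assumptions made for the example, not part of any XML API.

```javascript
// Check ID uniqueness and IDREF targets.
// Assumption: records is an array of { type: "ID" | "IDREF", value: string },
// already extracted from a parsed document (hypothetical shape).
function checkIds(records) {
  const ids = new Set();
  const errors = [];
  // First pass: collect IDs and report duplicates.
  for (const { type, value } of records) {
    if (type !== "ID") continue;
    if (ids.has(value)) errors.push(`duplicate ID: ${value}`);
    ids.add(value);
  }
  // Second pass: every IDREF must point at a collected ID.
  for (const { type, value } of records) {
    if (type === "IDREF" && !ids.has(value)) {
      errors.push(`dangling IDREF: ${value}`);
    }
  }
  return errors;
}

console.log(checkIds([
  { type: "IDREF", value: "p1" }, // may refer to an ID that appears later
  { type: "ID", value: "p1" },
]));
```

Two passes are used because an IDREF is allowed to refer to an ID that appears later in the document.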

The attribute value can be:

  • "value" - which specifies a default value when the attribute is not present
  • #IMPLIED - specifies the attribute is optional
  • #REQUIRED - specifies the attribute must be present
  • #FIXED "value" - the attribute always has the given value; if it is written in the document, it must use exactly that value
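
The four kinds of attribute value can be sketched as a function that applies one declaration to the attributes written on an element. The declaration shape { name, mode, value } and the name applyDeclaration are hypothetical, chosen only for this illustration. In this sketch an omitted #FIXED attribute is filled in with the fixed value, which is how the XML specification treats it.

```javascript
// Apply one attribute declaration to the attributes written on an element.
// Assumption: decl is { name, mode, value } with mode one of
// "DEFAULT", "IMPLIED", "REQUIRED", "FIXED" (hypothetical shape).
function applyDeclaration(attributes, decl) {
  const present = Object.prototype.hasOwnProperty.call(attributes, decl.name);
  switch (decl.mode) {
    case "REQUIRED":
      if (!present) throw new Error(`missing required attribute: ${decl.name}`);
      return attributes;
    case "IMPLIED":
      return attributes; // optional, and no default is supplied
    case "FIXED":
      if (present && attributes[decl.name] !== decl.value) {
        throw new Error(`attribute ${decl.name} must be "${decl.value}"`);
      }
      return { ...attributes, [decl.name]: decl.value };
    case "DEFAULT":
      // Use the declared default only when the attribute was not written.
      return present ? attributes : { ...attributes, [decl.name]: decl.value };
    default:
      throw new Error(`unknown mode: ${decl.mode}`);
  }
}

console.log(applyDeclaration({}, { name: "lang", mode: "DEFAULT", value: "en" }));
console.log(applyDeclaration({ lang: "de" }, { name: "lang", mode: "DEFAULT", value: "en" }));
```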

Linking DTD and XML documents

<!DOCTYPE foobar SYSTEM "foobar.dtd">

Displaying XML with CSS

<?xml-stylesheet type="text/css" href="foobar.css"?>

XML Parsing in JS

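// DOMParser is provided by browsers; create an instance before parsing
const parser = new DOMParser();
const svgString = '<circle cx="50" cy="50" r="50"/>';
const doc2 = parser.parseFromString(svgString, "image/svg+xml");
console.log(doc2.contentType); // "image/svg+xml"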