Getting started¶

For the following examples will use this dtd content:

<!ELEMENT movies (movie*)>
<!ELEMENT movie (title, date?, realisator, characters, (good-comment|bad-comment)*, (bad|good|awesome)?)>
<!ATTLIST movie idmovie ID #IMPLIED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT realisator (#PCDATA)>
<!ELEMENT characters (character+)>
<!ELEMENT character (#PCDATA)>
<!ATTLIST character idcharacter ID #IMPLIED>
<!ELEMENT good-comment (#PCDATA)>
<!ELEMENT bad-comment (#PCDATA)>
<!ELEMENT bad (#PCDATA)>
<!ELEMENT good (#PCDATA)>
<!ELEMENT awesome (#PCDATA)>

Creating a XML file¶

To create a XML we just need to call the method create like in the following example:

>>> import xmltool
>>> dtd_url = 'examples/movies.dtd'
>>> movies = xmltool.create('movies', dtd_url=dtd_url)
>>> print movies
<movies/>

>>> movies.write('examples/movies.xml')

You can see the content of the generated file and note that the required tags have been added automatically to make a valid XML file.

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE movies SYSTEM "examples/movies.dtd">
<movies/>

Loading a XML file¶

For this example we load the file previously created:

>>> import xmltool
>>> filename = 'examples/movies.xml'
>>> movies = xmltool.load(filename)
>>> print movies
<movies/>

Updating a XML file¶

Now we will see that updating an XML is very easy:

>>> import xmltool
>>> filename = 'examples/movies.xml'
>>> movies = xmltool.load(filename)
>>> print movies
<movies/>

>>> movie = movies.add('movie')
>>> title = movie.add('title', 'Full Metal Jacket')
>>> print title
<title>Full Metal Jacket</title>

>>> print movies
<movies>
  <movie>
    <title>Full Metal Jacket</title>
    <realisator></realisator>
    <characters>
      <character></character>
    </characters>
  </movie>
</movies>

>>> movies.write()

Accessing¶

To access a property of the XML object you have to use the list and dictionnary syntax.

>>> import xmltool
>>> filename = 'examples/movies.xml'
>>> movies = xmltool.load(filename)
>>> # As you can see in the dtd, movies has only one child movie
>>> # which is a repeated element.
>>> # Here is the syntax to get the first movie
>>> print movies['movie'][0]
<movie>
  <title>Full Metal Jacket</title>
  <realisator></realisator>
  <characters>
    <character></character>
  </characters>
</movie>

>>> # You have the text property to access to the value of a tag
>>> print movies['movie'][0]['title'].text
Full Metal Jacket

To check if a XML property exists you can use if ... in ... or .get():

>>> import xmltool
>>> filename = 'examples/movies.xml'
>>> movies = xmltool.load(filename)
>>> movie1 = movies['movie'][0]
>>> print 'date' in movie1
False
>>> print movie1.get('date')
None

There is also a method to get or add element

>>> import xmltool
>>> filename = 'examples/movies.xml'
>>> movies = xmltool.load(filename)
>>> movie1 = movies['movie'][0]
>>> # Long version
>>> if 'date' not in movie1:
...     date = movie1.add('date')
>>> # Short version
>>> date = movie1.get_or_add('date')

We can also access to the attributes

>>> filename = 'examples/movies.xml'
>>> movies = xmltool.load(filename)
>>> movie1 = movies['movie'][0]
>>> movie1.add_attribute('idmovie', 'myid')
>>> print movie1
<movie idmovie="myid">
  <title>Full Metal Jacket</title>
  <realisator></realisator>
  <characters>
    <character></character>
  </characters>
</movie>

>>> print movie1.attributes['idmovie']
myid

Traversing¶

To find all elements from a tagname use findall

>>> import xmltool
>>> filename = 'examples/movies.xml'
>>> movies = xmltool.load(filename)
>>> # Add 2 new movies to improve this example
>>> for i in range(2):
...     movie = movies.add('movie')
...     title = movie.add('title', 'Title %i' % i)
>>> titles = movies.findall('title')
>>> titles_str = [t.text for t in titles]
>>> print titles_str
['Full Metal Jacket', 'Title 0', 'Title 1']

You can also go through all the elements by using walk

>>> import xmltool
>>> filename = 'examples/movies.xml'
>>> movies = xmltool.load(filename)
>>>
>>> tagnames = [elt.tagname for elt in movies.walk()]
>>> print tagnames
['movie', 'title', 'realisator', 'characters', 'character']