Skip to main content

A small python library for quickly traversing XML data.

Project description

## Basic Usage

import drill
doc = drill.parse(path_or_url_or_string)

# Drill down to a specific element.
print unicode(

# Iterate through all "title" tags in the document.
for t in doc.iter('title'):
print t.attrs,

# Find all "bar" nodes with a "baz" child and a "foo" parent.
q = doc.find('//foo/bar[baz]')
# Easily access the first and last elements of matching results.
print q.first(), q.last()
# Iterate over results.
for e in q:

# Parse only elements matching some path
for e in drill.iterparse(filelike, xpath='root/*/something'):
print e.tagname,

## Features

* Runnable test suite
* Python 3 support

## Advantages

* Pure python
* Faster, more efficient parsing than ElementTree
* Using ElementTree, a ~150 MB XML file (~3,000,000 elements) took ~46 seconds to parse, consuming ~1.3 GB of RAM
* Parsing the same file using drill took ~24 seconds and consumed ~950 MB of RAM
* Very unscientific benchmarks performed on a Core i5 @ 2.8 GHz, running Windows 7. YMMV.
* Lots of convenience methods for accessing elements quickly:
* root[2].child['attr']
* first/last/prev/next methods for traversing siblings

Project details

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for drill, version 1.2.0
Filename, size File type Python version Upload date Hashes
Filename, size drill-1.2.0-py2.py3-none-any.whl (8.4 kB) File type Wheel Python version py2.py3 Upload date Hashes View hashes
Filename, size drill-1.2.0.tar.gz (7.5 kB) File type Source Python version None Upload date Hashes View hashes

Supported by

Elastic Elastic Search Pingdom Pingdom Monitoring Google Google BigQuery Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page