While lxml has some excellent benchmarks about the speed of lxml.etree vs. ElementTree, I wanted to run some tests that were as close as possible to my own use case (fairly simple multi-megabyte XML files).
Here are the results of my little test script lxml-v-etree.py (times are in milliseconds):
name generate | tostring | total | write | parse | find | total ------------------------+----------+-------+-------+-------+------+------ xml.cElementTree 132 | 2430 | 2562 | 2433 | 158 | 58 | 216 xml.cElementTree 112 | 2384 | 2497 | 2387 | 158 | 25 | 183 xml.cElementTree 113 | 2393 | 2507 | 2396 | 161 | 25 | 187 xml.ElementTree 591 | 2571 | 3163 | 2574 | 3613 | 25 | 3638 xml.ElementTree 619 | 2567 | 3187 | 2570 | 3589 | 55 | 3644 xml.ElementTree 609 | 2578 | 3188 | 2581 | 3564 | 55 | 3619 lxml 333 | 75 | 409 | 82 | 200 | 0 | 201 lxml 355 | 93 | 448 | 95 | 182 | 32 | 214 lxml 310 | 94 | 404 | 96 | 156 | 56 | 213 ------------------------+----------+-------+-------+-------+------+------ name generate | tostring | total | write | parse | find | total ------------------------+----------+-------+-------+-------+------+------
Note that the first “total” is “generate + tostring” while the second “total” is for the 2 parsing related tests (previous 2 columns summed).
My parsing tests are basically “etree.parse” and then running “Element.getchildren()” 3 times, which is ridiculously simplistic and should probably be ignored. My writing tests are far more thorough/realistic.
I’m running Python 2.6.2 with lxml 2.1.5 and libxml2 2.6.32 on Ubuntu 9.04 x86_64.
