@@ -63,6 +63,7 @@ ElementTree_.
  7.2 Why doesn't ``findall()`` support full XPath expressions?
  7.3 How can I find out which namespace prefixes are used in a document?
  7.4 How can I specify a default namespace for XPath expressions?
+ 7.5 How can I modify the tree during iteration?
 
 
 The code examples below use the ``lxml.etree`` module:
@@ -1241,3 +1242,38 @@ How can I specify a default namespace for XPath expressions?
 You can't. In XPath, there is no such thing as a default namespace. Just use
 an arbitrary prefix and let the namespace dictionary of the XPath evaluators
 map it to your namespace. See also the question above.
+
+
+How can I modify the tree during iteration?
+-------------------------------------------
+
+lxml's iterators need to hold on to an element in the tree in order to remember
+their current position. Therefore, tree modifications between two calls into the
+iterator can lead to surprising results if such an element is deleted or moved
+around, for example.
+
+If your code risks modifying elements that the iterator might still need, and
+you know that the number of elements returned by the iterator is small, then just
+read them all into a list (or use ``.findall()``), and iterate over that list.
+
+If the number of elements can be larger and you really want to process the tree
+incrementally, you can often use a read-ahead generator to make the iterator
+advance beyond the critical point before touching the tree structure.
+
+For example:
+
+.. sourcecode:: python
+
+    from itertools import islice
+    from collections import deque
+
+    def readahead(iterator, count=1):
+        iterator = iter(iterator)  # allow iterables as well
+        elements = deque(islice(iterator, 0, count))
+        for element in iterator:
+            elements.append(element)
+            yield elements.popleft()
+        yield from elements
+
+    for element in readahead(root.iterfind("path/to/children")):
+        element.getparent().remove(element)
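
Editorial sketch, not part of the commit: the snippet below is one way to try out the two strategies from the new FAQ entry. It repeats `readahead()` from the diff and feeds it a hypothetical `tracking()` wrapper, introduced here only to record how far the underlying iterator has advanced when each item is handed out. The list-snapshot part uses a made-up `<root>` document. It falls back to the standard library's `xml.etree.ElementTree` if lxml is not installed, on the assumption that a snapshot taken with `list()` behaves the same in both.

```python
from collections import deque
from itertools import islice

try:
    from lxml import etree
except ImportError:  # assumption: stdlib fallback is enough for the snapshot demo
    import xml.etree.ElementTree as etree


def readahead(iterator, count=1):
    iterator = iter(iterator)  # allow iterables as well
    elements = deque(islice(iterator, 0, count))
    for element in iterator:
        elements.append(element)
        yield elements.popleft()
    yield from elements


# Hypothetical helper: records every item pulled from the wrapped iterable.
consumed = []

def tracking(iterable):
    for item in iterable:
        consumed.append(item)
        yield item


# Pair each yielded item with the latest item the source iterator has produced.
lag = [(item, consumed[-1]) for item in readahead(tracking(range(5)), count=1)]

# Snapshot strategy: copy the matches into a list, then modify the tree freely.
root = etree.fromstring("<root><item/><item/><keep/><item/></root>")
for child in list(root.iterfind("item")):
    root.remove(child)
remaining = [el.tag for el in root]
```

With `count=1`, the source iterator stays one item ahead of the consumer until it runs out, which is why the loop in the diff can remove the element it just received. A larger `count` may be needed if the loop body also touches following elements. The snapshot variant trades memory for simplicity, since it no longer depends on iterator state at all.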