htmlcxx is a simple non-validating CSS1 and HTML parser for C++.
Although there are several other HTML parsers available, htmlcxx has some
characteristics that make it unique:

- STL-like navigation of the DOM tree, using the excellent tree.hh library
from Kasper Peeters
- It is possible to reproduce exactly, character by character, the original
document from the parse tree
- Bundled CSS parser
- Optional parsing of attributes
- C++ code that looks like C++ (not so true anymore)
- Offsets of tags/elements in the original document are stored in the nodes
of the DOM tree
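
The offset and exact-reproduction points above can be illustrated with a
small standalone sketch. It does not use htmlcxx and is not how htmlcxx is
implemented; Segment, splitSegments and rebuild are hypothetical names made
up for this example. It only shows the idea: if every piece of the parse
records its offset and length in the original document, concatenating those
slices gives back the input character by character.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parse-tree node (NOT htmlcxx's node type).
struct Segment {
    std::size_t offset;  // start position in the original document
    std::size_t length;  // number of characters covered
    bool isTag;          // true for "<...>", false for text between tags
};

// Split the input into consecutive tag/text segments, recording offsets.
// This is a toy tokenizer: it does not handle comments, scripts or quoted '>'.
std::vector<Segment> splitSegments(const std::string& html) {
    std::vector<Segment> out;
    std::size_t pos = 0;
    while (pos < html.size()) {
        std::size_t end = html.size();
        bool isTag = false;
        if (html[pos] == '<') {
            std::size_t close = html.find('>', pos);
            if (close != std::string::npos) {
                end = close + 1;  // include the closing '>'
                isTag = true;
            }
            // An unterminated '<' is kept as text up to the end of input.
        } else {
            std::size_t next = html.find('<', pos);
            if (next != std::string::npos) {
                end = next;       // text runs up to the next tag
            }
        }
        out.push_back({pos, end - pos, isTag});
        pos = end;
    }
    return out;
}

// Rebuild the document purely from the stored offsets and lengths.
std::string rebuild(const std::string& html, const std::vector<Segment>& segs) {
    std::string out;
    for (const Segment& s : segs) {
        out.append(html, s.offset, s.length);
    }
    return out;
}
```

Calling rebuild(doc, splitSegments(doc)) returns a string equal to doc,
because the segments are contiguous and cover the whole input, including
whitespace and malformed markup.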

