From 14c633de5f2a4870b9ab713ed227522eb3f8473c Mon Sep 17 00:00:00 2001
From: Vicent Marti <tanoku@gmail.com>
Date: Tue, 9 Jun 2015 12:13:11 +0200
Subject: [PATCH 2/4] parser: Skip non-HTML elements when resetting insertion

Although the HTML standard doesn't state it explicitly, when "resetting
the insertion mode appropriately", the nodes in the stack of open
elements are supposed to be nodes in the HTML namespace.

As an example, consider the following document:

    <table>
      <svg>
        <select>
          <title>
            <select>
            </select>
          </title>
        </select>
      </svg>
    </table>

Here, both `svg` and the first `select` are inserted as foreign objects
in the SVG namespace. The `title` tag, however, is an HTML integration
point, so the second `select` is inserted in the HTML namespace.

When closing the deeply nested `select`, we reset the insertion mode. If
we consider the outer `select` as a valid element when resetting, it
will leave us in the "in select in table" insertion mode, which is not a
valid state because it forces us to "close current select" one more
time: this call cannot succeed because there are no open HTML `select`
elements left in the stack of open elements.

To fix the issue, we simply skip non-HTML elements in
`get_appropriate_insertion_mode`; this brings the logic in line with the
rest of the functions in the parser that expect to find specific
elements in the stack of open elements.
---
 src/parser.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/parser.c b/src/parser.c
index 39901bc..dc692b3 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -572,6 +572,10 @@ static GumboInsertionMode get_appropriate_insertion_mode(
   }
 
   assert(node->type == GUMBO_NODE_ELEMENT || node->type == GUMBO_NODE_TEMPLATE);
+  if (node->v.element.tag_namespace != GUMBO_NAMESPACE_HTML)
+    return is_last ?
+      GUMBO_INSERTION_MODE_IN_BODY : GUMBO_INSERTION_MODE_INITIAL;
+
   switch (node->v.element.tag) {
     case GUMBO_TAG_SELECT: {
       if (is_last) {
-- 
2.21.0
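
For illustration only, the effect of the namespace check can be sketched
with a simplified, standalone model of the stack of open elements. None
of the names below (`Element`, `Mode`, `reset_insertion_mode`,
`example_stack`, `check_example`) come from the parser; they are
hypothetical stand-ins. The sketch also assumes that returning
`GUMBO_INSERTION_MODE_INITIAL` for a non-last element means "keep
looking further down the stack", and models that with `continue`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the parser's real types. */
typedef enum { NS_HTML, NS_SVG } Namespace;
typedef enum { TAG_TABLE, TAG_SVG, TAG_SELECT, TAG_TITLE } Tag;
typedef enum {
  MODE_IN_BODY,
  MODE_IN_TABLE,
  MODE_IN_SELECT,
  MODE_IN_SELECT_IN_TABLE
} Mode;

typedef struct {
  Tag tag;
  Namespace ns;
} Element;

/* Walk the stack from the most recently opened element downwards and
 * return the mode chosen by the first element that decides it.
 * With skip_foreign != 0, elements outside the HTML namespace are
 * ignored, which is the behaviour the patch introduces. */
static Mode reset_insertion_mode(const Element *stack, size_t len,
                                 int skip_foreign) {
  for (size_t i = len; i-- > 0;) {
    const Element *el = &stack[i];
    if (skip_foreign && el->ns != NS_HTML)
      continue; /* foreign element: keep looking */
    switch (el->tag) {
      case TAG_SELECT:
        /* A select nested in an HTML table selects a different mode. */
        for (size_t j = i; j-- > 0;) {
          if (stack[j].tag == TAG_TABLE && stack[j].ns == NS_HTML)
            return MODE_IN_SELECT_IN_TABLE;
        }
        return MODE_IN_SELECT;
      case TAG_TABLE:
        return MODE_IN_TABLE;
      default:
        break; /* this element does not decide the mode */
    }
  }
  return MODE_IN_BODY; /* fallback once the stack is exhausted */
}

/* Stack of open elements right after the inner HTML select is closed. */
static const Element example_stack[] = {
    {TAG_TABLE, NS_HTML},
    {TAG_SVG, NS_SVG},
    {TAG_SELECT, NS_SVG}, /* foreign select, must not count */
    {TAG_TITLE, NS_SVG},  /* HTML integration point */
};

static void check_example(void) {
  /* Old behaviour: the SVG select is mistaken for an HTML select. */
  assert(reset_insertion_mode(example_stack, 4, 0) == MODE_IN_SELECT_IN_TABLE);
  /* New behaviour: foreign elements are skipped, the table decides. */
  assert(reset_insertion_mode(example_stack, 4, 1) == MODE_IN_TABLE);
}
```

In this model the only HTML element left on the stack is `table`, so
skipping foreign elements lands in the table mode instead of a select
mode that has no HTML `select` to close.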