Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

While a GLR parser has the same worst-time complexity O(n^3), it usually performs better on nearly deterministic or deterministic grammars (though Earley can be modified to work better on deterministic grammars). Apart from that the algorithms give quite similar results, and can both handle any context-free grammar. GLR might have the advantage that it doesn't require any "tricks" to handle right-recursion, and that the algorithm itself is very compact.

I haven't implemented an Earley parser yet (should probably give that a try), but I suspect it's not more difficult than implementing a GLR parser (for a great GLR reference implementation, check out the Elkhound paper: http://scottmcpeak.com/elkhound/sources/elkhound/algorithm.h...).

In any case time complexity is not always the best performance measure in the real world, as it's usually the constant overhead that makes up for the largest performance differences, at least when parsing programming languages (which often are fully or almost fully mostly deterministic and can be handled in linear time even by recursive-descent parsers given the right grammar). Here, simple shift-reduce parsers really shine, as they do not do any backtracking and work with a simple rule table and a heap/stack for the tokens they emit. Also, the (optional but often useful) lexing phase of parsing should not be underestimated, as it can be as tricky as the token-based parsing that follows. Python, for example, cannot be lexed with a context-free grammar as the indentation is stateful (and newlines are treated differently depending whether they occur inside a bracket/parens expression or not, which requires a grammar to keep track of the nesting)

The main advantage of "traditional" tools like yacc or bison is that they are highly optimized, and produce parsers that can process > 100 kloc / second, which is hard to achieve with most other frameworks (I couldn't find any benchmarks on the JS parser).



Correction: If you want the parse tree, GLR and Earley can have O(n^4) time complexity it seems:

http://cstheory.stackexchange.com/questions/22621/complexity...




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: