
Regular expressions are not the way to do syntax highlighting, yet many popular editors do it this way. Create a huge string in Vim, TextMate or Sublime Text and watch it blow up. Then try it in Xcode or IntelliJ and see that there is no problem. Xcode and IntelliJ highlight quickly because they do not use regular expressions; they use a lexer. They examine the text one character at a time with some amount of forward- and backward-looking state, and build the syntax highlighting and smart completion features that way. As long as your editor uses regular expressions for syntax highlighting, this will be a problem for you.
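To make "one character at a time" concrete, here is a minimal sketch in Python of a hand-written lexer for a made-up toy language. The token kinds and the `lex` function are my own assumptions for illustration; this is not how Xcode or IntelliJ actually implement it.

```python
def lex(text):
    """Yield (kind, lexeme) pairs by scanning the text one character at a time."""
    i, n = 0, len(text)
    while i < n:
        start = i
        ch = text[i]
        if ch.isspace():
            while i < n and text[i].isspace():
                i += 1
            kind = "space"
        elif ch.isalpha() or ch == "_":
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            kind = "ident"
        elif ch.isdigit():
            while i < n and text[i].isdigit():
                i += 1
            kind = "number"
        elif ch == '"':
            i += 1  # consume the opening quote
            while i < n and text[i] != '"':
                # A backslash escapes the next character, so skip both.
                i += 2 if text[i] == "\\" else 1
            i = min(i + 1, n)  # consume the closing quote if there is one
            kind = "string"
        else:
            i += 1
            kind = "punct"
        yield kind, text[start:i]
```

Each character is visited a bounded number of times, so a single enormous string literal costs time proportional to its length and comes out as one token.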


I believe cordite is saying that Atom's syntax highlighting fails when your code contains a huge regular expression.

I don't believe cordite is claiming that Atom's syntax highlighting fails because it uses regular expressions internally.

(I don't know if Atom's syntax highlighting even works that way, FYI. Though I would guess that it does.)


This is the correct interpretation of what I said.

I don't know if it is also related to long strings in general on one line (I have not tested).


Lexing IS regular expression matching (except for some esoteric features like Haskell's nested comments).

Difference is in the handling.
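As a rough illustration of the point that token classes are themselves regular expressions, here is a sketch using Python's `re` module. The toy token set and the names `TOKEN_SPEC`, `MASTER` and `tokenize` are assumptions made up for this example, not taken from any editor.

```python
import re

# Each token class of the toy language is described by one regular expression.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("ident", r"[A-Za-z_]\w*"),
    ("string", r'"(?:[^"\\]|\\.)*"'),
    ("space", r"\s+"),
    ("punct", r"."),
]

# Combine them into a single alternation with one named group per class.
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (kind, lexeme) pairs; m.lastgroup names the class that matched."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(text)]
```

The patterns describe what a token is; how they get matched (a backtracking engine run over the buffer versus a single left-to-right scan) is the handling part.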

As a matter of fact, Visual Studio uses semantic analysis: classes are shown differently from other identifiers.


It can be, but it can also be a context-free grammar, which is 'higher' (more expressive) than a state machine (or regular expression).


But there are also higher complexity bounds on CFGs (since they include the regular languages, the bounds certainly can't be lower).

In any case, I've never heard of anyone using non-regular CFGs for lexing. Are there cases?


I mentioned Haskell above. It has nested block comments, which must be balanced. Thus a CFG.
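A small sketch of why balancing is the sticking point, written in Python rather than Haskell: skipping a `{- ... -}` comment needs a depth counter that can grow without bound, which a finite state machine cannot keep. The function name `skip_block_comment` is hypothetical, and this ignores everything else about Haskell's lexical syntax.

```python
def skip_block_comment(text, start):
    """Return the index just past the nested comment opening at `start`.

    Returns -1 if the comment is never closed.
    """
    if not text.startswith("{-", start):
        raise ValueError("no block comment at this position")
    depth = 0
    i = start
    n = len(text)
    while i < n:
        if text.startswith("{-", i):
            depth += 1  # one more level that has to be closed
            i += 2
        elif text.startswith("-}", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i  # the outermost comment is now closed
        else:
            i += 1
    return -1  # ran out of input with comments still open
```

In practice a lexer can stay regular everywhere else and just bolt on a counter like this for comments.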



