
So what do you use instead of regular expressions for such tasks?


IMO the biggest problem with regexes as seen in the wild is a lack of composability. If you need some kind of pattern like "[setA][setA+setB]{0,n}" then you'll copy-paste the definition of setA in both places. If you need to re-use that entire regex you'll copy-paste it again and construct a monstrous string with a really well-defined structure that isn't even slightly apparent without a reverse engineering session.

Up to a point you can solve that by just giving names to relevant sub-expressions, using a regex builder, etc., but in my experience if I'm going to write even a moderately complicated regex I'll probably be better served by something like parsec (a Python implementation here [0]) in whichever language I'm currently using.
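A minimal sketch of the "give names to sub-expressions" approach in Python, using the "[setA][setA+setB]{0,n}" shape from above. The concrete character sets, the limit of 31, and the assignment pattern are all made up for illustration.

```python
import re

# Hypothetical stand-ins for setA and setB.
SET_A = "A-Za-z_"
SET_B = "0-9"
MAX_TAIL = 31  # the "n" in {0,n}; an arbitrary choice

# Each piece is defined once and reused by name.
HEAD = f"[{SET_A}]"
TAIL = f"[{SET_A}{SET_B}]{{0,{MAX_TAIL}}}"
IDENTIFIER = f"{HEAD}{TAIL}"

# A larger pattern reuses IDENTIFIER instead of copy-pasting its text.
ASSIGNMENT = re.compile(
    rf"(?P<name>{IDENTIFIER})\s*=\s*(?P<value>{IDENTIFIER})"
)

def parse_assignment(text):
    """Return (name, value) if text is 'name = value', else None."""
    m = ASSIGNMENT.fullmatch(text)
    return (m["name"], m["value"]) if m else None
```

This keeps the structure visible in the source, but the final pattern is still one flat string, so the names are gone once you print it or hit a syntax error.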

That isn't to say that regexes don't have their place -- anything quick and dirty, situations where you need to handle unsanitized input (mind you, the builtin regex engine is probably vulnerable to exponential runtime attacks) and don't want to execute a Turing-complete language, etc. I just think they have bad ergonomics for any parser you might use more than once, and I haven't yet regretted using parsec in situations where a complex regex would have sufficed.
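To make the exponential runtime remark concrete, here is a rough sketch against Python's builtin re module, which is a backtracking engine. The pattern is a textbook pathological case, not one taken from real code, and the timings will vary by machine, so none are quoted here.

```python
import re
import time

# Nested quantifiers: the inner a+ and the outer (...)+ can split a run of
# "a"s in exponentially many ways, and a backtracking engine may try them
# all before concluding that there is no match.
EVIL = re.compile(r"(a+)+$")

def time_failed_match(n):
    """Match against n 'a's followed by a 'b'; return (result, seconds)."""
    text = "a" * n + "b"  # the trailing "b" guarantees the match fails
    start = time.perf_counter()
    result = EVIL.match(text)
    return result, time.perf_counter() - start

if __name__ == "__main__":
    # Kept small on purpose; each extra "a" should roughly double the time.
    for n in (12, 14, 16, 18):
        _, seconds = time_failed_match(n)
        print(n, f"{seconds:.4f}s")
```

Engines built on finite automata (RE2 and similar) avoid this by design, at the cost of not supporting backreferences.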

[0] https://pythonhosted.org/parsec/
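For comparison, a hand-rolled sketch of the combinator style. This is not the API of the parsec library linked above; every name here (char_in, seq, many, fmap) is made up for illustration, and the character sets are again hypothetical.

```python
import string

# A parser is a function (text, pos) -> (value, new_pos), or None on failure.

def char_in(chars):
    """Parse one character drawn from chars."""
    def parse(text, pos):
        if pos < len(text) and text[pos] in chars:
            return text[pos], pos + 1
        return None
    return parse

def seq(*parsers):
    """Run parsers one after another, collecting their values in a list."""
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def many(parser, max_count):
    """Apply parser zero to max_count times (greedy, never fails)."""
    def parse(text, pos):
        values = []
        while len(values) < max_count:
            result = parser(text, pos)
            if result is None:
                break
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def fmap(fn, parser):
    """Transform the value produced by parser."""
    def parse(text, pos):
        result = parser(text, pos)
        if result is None:
            return None
        value, pos = result
        return fn(value), pos
    return parse

# The "[setA][setA+setB]{0,n}" shape, built from named, reusable parts.
set_a = char_in(string.ascii_letters + "_")
set_ab = char_in(string.ascii_letters + "_" + string.digits)
identifier = fmap(
    lambda parts: parts[0] + "".join(parts[1]),
    seq(set_a, many(set_ab, 31)),
)
```

Here set_a, set_ab and identifier are ordinary values, so they can be reused, tested on their own, and passed to other combinators without any string pasting.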


Perl is great for this. It’s been a long time since I’ve written any, but with the right flags and use of qr// you can write extremely readable perlre.
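For readers who don't write Perl: the rough Python analogue of the /x flag is re.VERBOSE, with reusable fragments kept as strings and interpolated, which is loosely what qr// objects give you. A sketch, with a made-up IPv4 example:

```python
import re

# A reusable fragment: one decimal octet, 0-255.
OCTET = r"(?: 25[0-5] | 2[0-4]\d | 1\d\d | [1-9]?\d )"

# re.VERBOSE ignores unescaped whitespace and allows # comments in the pattern.
IPV4 = re.compile(
    rf"""
    {OCTET}                 # first octet
    (?: \. {OCTET} ){{3}}   # three more, each preceded by a literal dot
    """,
    re.VERBOSE,
)

def is_ipv4(text):
    """True if text is a dotted-quad IPv4 address."""
    return IPV4.fullmatch(text) is not None
```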


That kind of depends on the language I am using, as well as on how performance requirements trade off against readability.


What I like about regexes is they are typically both the most readable and the best performing solution to the kind of problems they are suited for.

Just don't use them for validating email addresses or determining if a number is a prime.
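For anyone who hasn't seen the prime trick: it is a well-known backreference hack that works on numbers written in unary. A sketch in Python, only sensible for small n since the backtracking cost grows quickly:

```python
import re

# Matches a string of n ones when n is NOT prime: either zero or one "1",
# or some group of two or more ones repeated at least twice.
NOT_PRIME = re.compile(r"1?|(11+?)\1+")

def is_prime_by_regex(n):
    """Primality test via regex on the unary representation of n."""
    return NOT_PRIME.fullmatch("1" * n) is None
```

It relies on backreferences, so it is not a regular expression in the formal sense, and trial division is both clearer and faster.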



