
Regular expressions in formal language theory do not have captures anyway. The name collision is unfortunate, but we have already established that regexes in practice mean a pattern language largely modelled after theoretical regular expressions, not the theoretical regular expressions themselves. At the very least the writing could have mentioned this discrepancy.


Captures are a red herring here. They don't fundamentally alter the nature of what a regex does, which is to recognize regular languages. Pointing to them as if they're some kind of justification is like calling pilots "drivers" because drivers originally drove wagons, and wagons didn't have rubber tires like cars and planes anyway. It completely misses that the point of the distinction between a plane and a wagon has always been land vs. air travel, not modern features like tires or infotainment systems or what have you.

But yes I guess it'd have been better for the writing to mention the discrepancy in any case.


On a related note, lookaheads (and lookbehinds) do somewhat change the fundamental expressive power, I believe.


> They don't fundamentally alter the nature of what a regex does, which is to recognize regular languages.

It's a bit subjective, but captures are harder than recognition. Russ Cox once noted [1] that extraction has to be run as a separate step after recognition and that a fast DFA can't always be used for it, suggesting they are related but different problems.

[1] https://swtch.com/~rsc/regexp/regexp3.html
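
To make the recognition/extraction gap concrete, here is a minimal sketch using Python's standard `re` module (a backtracking engine, not the DFA-based approach the linked article discusses; the patterns and helper names are made up for illustration). Both patterns accept exactly the same strings, so a pure recognizer cannot tell them apart, yet they report different capture boundaries:

```python
import re

# Two hypothetical patterns that recognize the same language:
# strings made of zero or more 'a' characters.
GREEDY = re.compile(r"(a*)(a*)")
LAZY = re.compile(r"(a*?)(a*)")

def recognize(pattern, text):
    """Recognition: a yes/no answer only."""
    return pattern.fullmatch(text) is not None

def extract(pattern, text):
    """Extraction: also report what each group matched."""
    m = pattern.fullmatch(text)
    return m.groups() if m else None

print(recognize(GREEDY, "aaa"), recognize(LAZY, "aaa"))  # True True
print(extract(GREEDY, "aaa"))  # ('aaa', '')
print(extract(LAZY, "aaa"))    # ('', 'aaa')
```

The accept/reject answer is identical for both; only the extraction step has to know which path through the pattern was taken.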


Well, if you allow an arbitrary depth of capture-group nesting, then that may be so, but it seems beside the point here. It is not clear to me that this article makes any point about extraction that is relevant to this discussion.


> At the very least the writing could have mentioned this discrepancy.

The original meaning of 'regular expression' is very specific, and has some significant implications which are lost with the now-common and less well-defined usage. Therefore, if anything needed to mention this discrepancy, it was your original statement "it is patently false that regular expressions cannot be recursive", as this is an issue where the distinction is crucial. It is good to see that you have now done so, though the way you have done it suggests there is nothing of practical interest in the formal definition, which, I suggest, would be patently false.


I intentionally used the term "regex" elsewhere for that reason, but I later realized that the indirect quotation can still be problematic.



