I love Rust. I've lost track of the number of times I've stared at a chunk of C++ code, considered how much nicer it would look in Rust, and sighed. At the beginning of this project, we did consider whether it was the right place to use Rust.
There are no existing JS-compatible regexp crates. BurntSushi's regex crate is great, but finite automata don't do backreferences, which is a dealbreaker for JS. After this code landed, somebody brought https://github.com/ridiculousfish/regress to my attention. It looks promising, but still has a long way to go before it's production-ready, and it didn't exist when we made our decision.
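To make the backreference point concrete, here is a minimal TypeScript sketch. The `doubledWord` pattern and `hasDoubledWord` helper are made up for illustration and are not taken from SpiderMonkey or Irregexp. `\1` has to match whatever text group 1 captured, so the matcher has to remember an arbitrary substring, which is exactly what a pure finite automaton can't do.

```typescript
// Hypothetical example: detect a word that is immediately repeated.
// \1 is a backreference: it must match the exact text captured by group 1.
const doubledWord: RegExp = /\b(\w+) \1\b/;

function hasDoubledWord(text: string): boolean {
  // No `g` flag on the pattern, so test() carries no state between calls.
  return doubledWord.test(text);
}

console.log(hasDoubledWord("it was the the best of times")); // true
console.log(hasDoubledWord("it was the best of times"));     // false
```

Any JS-compatible engine has to accept patterns like this one, which is why a purely automata-based crate wasn't an option.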
If we had written our own replacement, it would likely have been Rust. SpiderMonkey has a lot of cross-cutting issues (GC especially) that make it hard to replace individual C++ components with Rust, but the regexp engine has a pretty clean API boundary. It's the same reason that it was feasible to swap in Irregexp in the first place.
Ultimately we decided that writing a new engine wasn't the best use of time. A regexp engine is a complicated beast. Writing a new high-performance, JS-compatible engine in Rust could have been person-years of effort, with a long tail of corner cases and performance issues. We haven't had many memory safety bugs in regexp code. As sstangl points out in a sibling comment (hi Sean!), doing JIT compilation undermines some of Rust's safety guarantees.
When it comes down to it, the regexp engine is not a place where SpiderMonkey is looking to push the state of the art. We have to be reasonably fast and feature-complete, but beyond that nobody is going to notice marginal gains. There are higher-leverage opportunities elsewhere.
Another up-and-coming crate in this space is Raph Levien's fancy-regex [0], which uses a hybrid model to support backreferences. No idea how compatible its syntax is with JS regexps.