Overall there are a ton of these logit-based guidance systems. The reason they don't get much traction is that the SOTA models sit behind REST APIs that don't expose this kind of fine-grained, logit-level access.
Those models perform so much better that people generally settle for just re-requesting until they get the correct format (and with GPT-4 that ends up being a fairly rare occurrence, in my experience).
Thanks for bringing clownfish and relm to my attention! AFAIK other libraries loop over the entire vocabulary at every step of generation. We, on the other hand, build an index at initialization by looping over the vocabulary once; generation is then just as fast as standard generation.
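Concretely, the trick is to pay the vocabulary scan once up front: for every state of the automaton compiled from the target regex, precompute which tokens can be consumed without leaving the language. Here's a minimal sketch of that idea in Python; `dfa.states`, `dfa.step`, and the `{token_id: token_str}` vocab dict are illustrative stand-ins, not the library's actual API:

    from collections import defaultdict

    def build_index(dfa, vocab):
        # For each DFA state, collect the token ids whose characters the
        # automaton can consume without hitting a dead end. This single
        # pass over the vocabulary happens once, at initialization.
        index = defaultdict(set)
        for token_id, token_str in vocab.items():
            for state in dfa.states:
                s = state
                for char in token_str:
                    s = dfa.step(s, char)  # None means a dead transition
                    if s is None:
                        break
                if s is not None:
                    index[state].add(token_id)
        return index

    # At decoding time, masking is just a dictionary lookup, so constrained
    # generation costs about the same as unconstrained generation:
    #   allowed = index[current_state]
    #   disallowed = [t for t in range(vocab_size) if t not in allowed]
    #   logits[disallowed] = float("-inf")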
torch-grammar generates a mask per PDA stack; we don't try to compute all the possible stacks. I'm sure there's something smarter that could be done here, and you've probably figured it out (though IIRC regular languages don't have the arbitrarily recursive stack problem you run into with context-free languages?). Anyway, in practice we spend a few milliseconds on the first few requests building caches, and after that we just apply masks straight from the caches.
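For what it's worth, that caching scheme can be sketched in a few lines. This is a hedged illustration rather than torch-grammar's actual code: `token_ok` (a check that simulates the PDA on a token's characters starting from a given stack) and the tuple stack representation are hypothetical stand-ins:

    import torch

    # Cache keyed by the parser stack, frozen as a tuple. The first request
    # that sees a given stack pays a full vocabulary scan; every later
    # request with the same stack reuses the stored mask.
    _mask_cache = {}

    def stack_mask(stack, vocab_size, token_ok):
        key = tuple(stack)
        if key not in _mask_cache:
            allowed = [token_ok(key, t) for t in range(vocab_size)]
            _mask_cache[key] = torch.tensor(allowed, dtype=torch.bool)
        return _mask_cache[key]

    # Applying it at a decoding step:
    #   mask = stack_mask(parser.stack, logits.shape[-1], token_ok)
    #   logits = logits.masked_fill(~mask, float("-inf"))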
https://github.com/1rgs/jsonformer
or
https://github.com/newhouseb/clownfish
or
https://github.com/mkuchnik/relm
or
https://github.com/ggerganov/llama.cpp/pull/1773
or
https://github.com/Shopify/torch-grammar