
Lately I've been wondering about an analogous question in NLP. Could modern NLP techniques learn what software source code does?


A neural network doesn't understand anything, but sure, it could predict a program's output sequence from its input sequence. Would such a technology have any useful application?


There's no technological use for neuroscience analysis tools to model a microchip, nor for NLP models of source code, but that isn't the point. It's a potentially useful target for honing the methodology itself. I actually don't think something like BERT would do well at learning to essentially execute code.

Here's a counterexample: take a transformer with a context width of N tokens and ask it to model "def f(x) { ... }; <at least N more tokens here>; def g(x) { y = f(x) ...};". When evaluating "g(4)", f appears only as the literal token "f" and nothing more. To learn to evaluate g, the model would need to perform (at least) distant coreference resolution.
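A minimal sketch of the situation, in Python rather than the pseudo-syntax above (the function names and filler are hypothetical): evaluating g(4) requires resolving the distant definition of f, which may lie outside the model's context window.

```python
def f(x):
    # The callee's body, far from the call site
    return x * 2

# ... imagine at least N unrelated tokens of code here,
# pushing f's definition out of the context window ...

def g(x):
    y = f(x)  # "f" is just a literal token here; its body is distant
    return y + 1

print(g(4))  # prints 9
```

An interpreter resolves the name f via an environment lookup; a fixed-context transformer has no such mechanism unless it learns one.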


Maybe for simple programs. An arbitrarily deep neural net can, of course, emulate any compiler (a function from plain text to machine-instruction sequences). But predicting a program's output runs into the halting problem and computational-complexity limits.
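A sketch of why output prediction is hard in general (the function and input are a hypothetical illustration): for Collatz-style loops, no one knows how to predict termination for arbitrary n without just running the loop, so a learned model can't shortcut it either.

```python
def collatz_steps(n):
    """Count iterations until the Collatz map reaches 1.

    Whether this terminates for every n is an open problem, so
    predicting its output in general means simulating it.
    """
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

print(collatz_steps(27))  # prints 111
```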


Whether a neural net could be trained to emulate a compiler seems like the more pertinent question.

I've had the displeasure of trying to explain classical learning hardness results to someone who kept repeating back to me "but RNNs are Turing complete!" If anything, the expressive power of the model you're trying to train pushes back against your efforts to train it to do something useful.


Could it?


GPT-2 doesn't do a _great_ job at this, but it does seem to have learned, to some extent, structural relationships in text across different programming languages.

For example, while toying around with it, I gave it a JavaScript function that referenced an undefined callback, and GPT-2 gave me back a function with almost-correct syntax and mostly random logic, but with the correct name and call signature of the callback.


These sorts of models do well at generating text, whether it's code or natural language. That's what you're describing. They can also perform the "execution" to some degree for natural language (e.g. the SQuAD tasks). But Q&A and text generation are very different tasks. I haven't seen anyone apply a transformer to Q&A tasks over source code.


>modern NLP techniques

They are Turing complete, so sure.



