
Why does pre-training (or the lack of it) matter for the ISPD 2023 paper? The circuit_training repo, as the ISPD 2023 authors note in their response to the rebuttal, claims training from scratch is "comparable or better" than fine-tuning the pre-trained model. So whatever your opinion on the importance of the pre-training step, this result isn't replicable, at which point the ball is in Google's court to release code/checkpoints to show otherwise.


The quick-start guide in the repo says you don't have to pre-train for the sample test case, meaning you can validate your setup without pre-training. That does not mean you don't need to pre-train! Again, the paper talks at length about the importance of pre-training.


This is what the repo says:

> Results
> Ariane RISC-V CPU
> View the full details of the Ariane experiment on our details page. With this code we are able to get comparable or better results training from scratch as fine-tuning a pre-trained model.

The paper includes a graph showing that it takes longer for Ariane to train without pre-training; however, the results in the end are the same.


See their ISPD 2022 paper, which goes into more detail about the value of pre-training (e.g. Figure 7): https://dl.acm.org/doi/pdf/10.1145/3505170.3511478

Sometimes training from scratch is able to match the results of pre-training, given ~5X more time to converge. Other times, though, it never does as well as a pre-trained model, converging to a worse final result.

This isn't too surprising -- the whole point of the method is to be able to learn from experience.
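For concreteness, here is a minimal sketch of what "fine-tuning a pre-trained model" versus "training from scratch" means operationally. It's a generic PyTorch illustration under assumed names (make_policy, train, and "pretrained.pt" are all hypothetical placeholders), not circuit_training's actual code:

    # Generic illustration of "fine-tune a pre-trained policy" vs. "train
    # from scratch". The network, objective, and checkpoint path are
    # hypothetical placeholders, not the circuit_training API.
    import torch
    import torch.nn as nn

    def make_policy():
        # Tiny stand-in for a placement policy network.
        return nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 4))

    def train(policy, steps=1000):
        opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
        for _ in range(steps):
            # Placeholder objective standing in for the RL update; the real
            # method optimizes proxy wirelength/congestion/density rewards.
            x = torch.randn(32, 64)
            loss = policy(x).pow(2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
        return policy

    # "Pre-training": train once (on other designs, in the real setting)
    # and save the resulting weights.
    torch.save(train(make_policy()).state_dict(), "pretrained.pt")

    # Fine-tuning: reload those weights, then keep training on the target design.
    finetuned = make_policy()
    finetuned.load_state_dict(torch.load("pretrained.pt"))
    finetuned = train(finetuned)

    # From scratch: random initialization, same loop. The debate above is
    # whether this needs more steps and whether it reaches the same quality.
    scratch = train(make_policy())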


That does not mean you need to pre-train either. Common sense, no?



