GPT 3.5 vs. Llama 2 fine-tuning: A Comprehensive Comparison

todd3834 · on Sept 18, 2023

What has me excited about Llama: I've built some tools that I think would make sense to offer for an affordable "lifetime price" but they currently rely on OpenAI api / GPT4. I cannot get myself to offer lifetime memberships to something with an ongoing cost. Lately I've been considering building Electron apps with Llama for code embedded targeted toward Apple Silicon devices. I think with this stack I wouldn't incur any ongoing costs and these utilities could exist for a one time fee.

kromem · on Sept 18, 2023

Lifetime fees don't make a lot of sense for products that will likely have rather short obsolescence cycles.

Larrikin · on Sept 18, 2023

Unless work is paying the cost, getting another subscription for a cloud service and not being able to run the software after some arbitrary period doesn't make a lot of sense to me. I'm willing to pay for updates, I'm tired of every company wanting to turn me into a renter

theironhammer · on Sept 18, 2023

This rent seeking trend in business is like a cancer.

fomine3 · on Sept 19, 2023

Cloud computing is a practical idea in theory. What's bad is that big techs start rent seeking thanks to lock-in. It's okay to utilize cloud LLMs unless they start lock-in IMO, though they will.

ebiester · on Sept 18, 2023

The expectations of the market will keep improving. Even if you don't have ongoing costs, it would behoove you to think about your upgrade plan up front.

svapnil · on Sept 18, 2023

This is really cool, nice work!

Quick question - what would the cost of inference be, at scale, between a fine-tuned 3.5 and Llama 2 fine-tuned? Surely that's another factor that should be considered in this case, right?

lukev · on Sept 18, 2023

I'm curious about the terminology for the "functional representation" dataset.

Is this a well-defined term? I've been thinking about similar approaches for getting more structured propositional knowledge into and out of LLMs, and the examples in the Viggo data set are the closest thing so far to someone thinking the same way I am.

However, Google doesn't turn up many results that use the term in this way. I'd love any more resources or information on the topic.

thewataccount · on Sept 18, 2023

I've been struggling with figuring out a good dataset for fine-tuning. Most of the ones that exist were purpose made for finetuning/training a model already.

Does anyone have any tips for creating sufficient datasets for finetuning specific workloads?

BoorishBears · on Sept 19, 2023

Is there a notebook that shows a reproducible evaluation procedure? You linked to some general eval, not your actual evaluation.

Tepix · on Sept 19, 2023

Would using two RTX 3090s instead of the A40 have been an option?

jerpint · on Sept 19, 2023

I’d be very interested in a similar comparison for RAG style tasks