
Is psychic vicuna a special model, or just regular vicuna wearing a bejeweled turban?


Vicuna v1.5 7B


I noticed that the app is listed as ~3 GB, while the Vicuna 7B model is ~13 GB. What did you do to compress it? Same question for memory: I think it needs 30 GB? And for CUDA or GPU support: how does that work, or is it just running on the CPU?


The model is compiled for Apple Silicon/Metal with 4-bit quantization using mlc-llm (https://mlc.ai/mlc-llm/), which uses TVM Unity (https://tvm.apache.org/) under the hood.
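To see why 4-bit quantization roughly accounts for the size difference, here is a back-of-the-envelope sketch. It assumes a round 7 billion parameters (the real count differs slightly) and counts only the raw weights, ignoring quantization scales, activations, and the KV cache. The function name and constants are illustrative, not part of mlc-llm.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumption: "7B" is taken as a round 7 billion parameters.
N_PARAMS = 7e9

fp16_gb = weight_size_gb(N_PARAMS, 16)  # 16-bit weights, the usual download format
int4_gb = weight_size_gb(N_PARAMS, 4)   # 4-bit quantized weights

print(f"fp16:  {fp16_gb:.1f} GB")
print(f"4-bit: {int4_gb:.1f} GB")
```

Under these assumptions the 16-bit weights come to about 14 GB and the 4-bit weights to about 3.5 GB, which is in the same ballpark as the ~13 GB and ~3 GB figures mentioned above. Actual files will differ somewhat because of per-group scale factors and any layers kept at higher precision.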



