Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

The Qwen family of models are REALLY impressive. I would encourage anyone who hasn't paid them any attention to at least add them to your mental list of LLMs worth knowing about.

Qwen2-VL is a decent vision model. You can try it out online here: https://huggingface.co/spaces/GanymedeNil/Qwen2-VL-7B - I got great results from it for OCR against handwritten text: https://simonwillison.net/2024/Sep/4/qwen2-vl/

Qwen2.5-Coder-32B is an excellent (I'd say even GPT-4 class) model at generating code which I can run on a 64GB M2 MacBook Pro: https://simonwillison.net/2024/Nov/12/qwen25-coder/

QwQ is the Qwen team's exploration of the o1-style of model that has built in chain-of-thought. It's absolutely fascinating, partly because if you ask it a question in English it will often think in Chinese before spitting out an answer in English. My notes on that one here: https://simonwillison.net/2024/Nov/27/qwq/

Most of the Qwen models are Apache 2 licensed, which makes them more open than many of the other open weights models (Llama etc).

(Unsurprisingly they all get quite stubborn if you ask them about topics like Tiananmen Square)



Thanks for the summary. I have been testing QwQ on my M1 (via ollama). I tried a couple double-slit quantum thought experiments, and also found the reasoning mode absolutely fascinating. Occasionally a few logographs appear, but they so far they were not in the way.

The funniest was asking for an ascii graphics depiction of a minecraft watch recipe, and I was actually feeling quite sorry for it, 'wait that can't be right' 'let me try' 'still not right' round and round it went, at least a few pages at which point it decided to try the second recipe I'd asked about to see if that helped with the first.

I didnt know about the other models, 'coder' is downloading now, and fingers crossed it fits in 32GB and knows a bit about Zig.

It sounds like you got the vision one running locally on your M2, nice. I'm running Asahi Linux and not tried anything AI/SD/graphical orientated yet. But nice that you got some SVG out of coder, I never thought of using a coding model in that way.


QwQ often spits out Chinese characters smack dab in the middle of a sentence. Weirdly, it doesn’t break up the coherence or logic, there’s just symbols added.


Do you think training in multiple languages could act as a regularization? Just as polyglots are smarter in real life?


I haven’t seen the architecture of QwQ but I just assumed it learns languages insofar as to pick up relationships between words. It must mean it picks up logic across languages. Huh


I thought too so. But then o1 thinks in english and Qwen thinks in chinese. Is there advantage in thinking in different languages?


> Unsurprisingly they all get quite stubborn if you ask them about topics like Tiananmen Square

Has anyone made a political censorship eval yet?


This would be a great one. Censorship / political alignment compass.


Unfortunately, real political alignment doesnt exist. Most people dont use ideologies to determine alignment.


Alignment might be the wrong goal. Labeling and scoring, something like a multi-dimensional Ground News.


Is it possible to build similar to anthropic computer use feature with Qwen vision model.

Someone open sourced it with langchain

https://x.com/1littlecoder/status/1856397375704576399


Browser use is very easy. Can even do that headless. That way, you can also do bulk processing. For a client, I did some 16k websites with a simple LLM agent. With “computer use” how long would that take, and what would it cost? For me, it was ~$20 (I used Gemini for this task).


Agree. It is amazing that you can run an o1 style model on a Mac. I was able to run QwQ on my 24GB M3 MacBook Air, though results on complex reasoning on domain specific tasks did not work well, and I saw the Chinese 'thinking' too (they don't work well in o1 either). It opens up experimentation which is great, and the reasoning traces for domain specific tasks for RL is where the next improvements are going to come from


I recently fine-tuned the Qwen-Coder-7b-Instuct model to generate Milkdrop presets. Pretty amazing to see what can be done locally. https://huggingface.co/InferenceIllusionist/MilkDropLM-7b-v0...


Why does this model think in Chinese and o1 think in English? Is this because chain-of-thought is achieved by training these models on examples of what “thinking” looks like, which have been constructed by their respective model developers, as opposed to being a more generic feature?


We don’t know o1 thinks in English. What we see is a summary of the thinking process.

My guess is it “thinks” in non-eligible tokens


The o1 release blog post contains 8 full examples of o1 chains of thought (not the summarized versions visible to users). They're English.

https://openai.com/index/learning-to-reason-with-llms/#chain...

I have seen the summaries dip into completely random languages like Thai, so it might switch between languages occasionally.


Did you mean to write "non-legible" ?


> often think in Chinese

I noticed that too, but I haven't seen it think in numbers in Chinese like most bilingual chinese speakers prefers. Or at least I haven't been able to trigger it.


Recently I was scrolling through HF to try a very small model. Fired up Qwen 0.5 B and surprisingly for my purposes it did between that even Lllama 2 7B. That was very surprising to me.


Have you tried query extraction over tabular data? Is there any free models which are comparable to amazon textract for that?


It's pretty good for handwritten maths too - I just tried that demo. Do you know any other open models good at maths notation?


https://huggingface.co/datasets/TIGER-Lab/MathInstruct

Works with +700 year old books w some tweaks. took like $400 to train. can't share more because i don't know more.


That seems to be just for LLMs, not visual. I'm wanting to go from images of maths notation (photos, scans, digital handwriting) to formulas in Latex or MathML or something. Qwen2-VL can do it, but it's pretty heavyweight for just that.


Are we at a point where we could run nonquantized models from QwQ/Qwen series on a 128GB Macbook Pro?


I think so. Are bf16 models nonquantized? There's an MLX one here that should fit on that machine: https://huggingface.co/mlx-community/QwQ-32B-Preview-bf16


Do you have an opinion on Mini CPM 2.6 in comparison to Qwen2-VL?


I haven't tried that Mini CPM model yet.


Just tested it with a meme, and it nailed it.


> (Unsurprisingly they all get quite stubborn if you ask them about topics like Tiananmen Square)

I wonder how the abliterated variants respond to this query.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: