It looks like the 'old-school' RL to me, which makes me wonder why it took so lo... | Hacker News


suraci on Jan 25, 2025 | on: TinyZero: Reproduction of DeepSeek R1 Zero in coun...

It looks like 'old-school' RL to me, which makes me wonder why it took so long to get here.

vixen99 on Jan 25, 2025

Nothing like acronyms to make me feel dumb and ill-informed.

basementcat on Jan 25, 2025

Reinforcement Learning

https://en.m.wikipedia.org/wiki/Reinforcement_learning
