Hacker News
suraci on Jan 25, 2025 | on: TinyZero: Reproduction of DeepSeek R1 Zero in coun...
It looks like 'old-school' RL to me, which makes me wonder why it took so long to get here.
vixen99 on Jan 25, 2025
Nothing like acronyms to make me feel dumb and ill-informed.
basementcat on Jan 25, 2025
Reinforcement Learning
https://en.m.wikipedia.org/wiki/Reinforcement_learning