Tiberium · 2026-06-16T13:41:05 1781617265

What about women, though? Do they not get to age verify themselves? :P

Tade0 · 2026-06-16T13:42:31 1781617351

Women also undergo voice change, just less pronounced.

Now the problem turns into being able to tell men from women.

EDIT: this of course isn't a serious proposition.

Then again, I wonder what effect this would have if implemented on e.g. a porn site. Women also watch porn, but with enough tuning I think it shouldn't produce too many false positives, if only due to the demographics of the users.

Surely the vocal cords and larynx of an adult woman are on average larger than those of a 10yo boy?

wentw0rth · 2026-06-16T14:27:06 1781620026

The AGEWARDEN system is ^^ agnostic and answers a simple question only: Is this human over or under 18?

From my research, each person is incredibly unique :)

Tiberium · 2026-06-16T13:36:13 1781616973

(Not a lawyer, not an expert, just my possibly ignorant/wrong comment)

I checked around a bit more, and this seems to directly contradict the HN title:

https://agewarden.ai/customer-agreement

> AGEWARDEN is an age estimation tool. It does not verify identity and does not guarantee 100% accuracy. Customer is responsible for determining whether AGEWARDEN satisfies the legal requirements applicable to its business and jurisdiction.

wentw0rth · 2026-06-16T13:57:03 1781618223

Lawyers HIGHLY suggested the line; AGEWARDEN is > 95% accurate and on par with others in this space.

If someone told me they had 100% accuracy with inference, I'd call them out.

Thank YOU for calling it out :)

Tiberium · 2026-06-16T13:24:19 1781616259

I'll be honest: I don't have experience with audio stuff/age verification, but wouldn't it be far easier to bypass this than even face (live video) verification, let alone full KYC with ID? At that point it's not that much better than a "I'm over 18" button, or am I missing something?

EDIT: The website itself partially answers my question:

> Self-declaration, the "I am over 18" checkbox, is explicitly prohibited by every major regulator in the UK, EU, and Australia.

But then:

> Facial scanning works, but it builds infrastructure that outlasts the check. A system that estimates your age today can identify you tomorrow. Platforms that rolled it out met immediate backlash and reversals. Users do not trust platforms with their face.

How would we trust your platform to not store voice fingerprints, then?

(On a side note, all the descriptions are LLM-written, but that's expected in 2026)

wentw0rth · 2026-06-16T14:12:41 1781619161

The post was edited/updated, so let me follow up:

> How would we trust your platform to not store voice fingerprints, then?

This is a good question. Other than the fact that keeping this data would be more expensive, and that getting caught doing so would destroy a company whose core tenet is literally not doing that... you can't.

If you have suggestions for an independent audit, more than happy to add that to our stack where applicable.

When I say it's hard to not collect data, woof. Any fresh AWS account just LOVES to slurp it up by default.

wentw0rth · 2026-06-16T13:42:28 1781617348

Thanks for the comment/question:

> wouldn't it be far easier to bypass this than even face (live video) verification... or am I missing something?

What's missing is the experience of learning just how much information is in a clip of audio. Replay/synthetic attack mitigation is baked in, and it's more cost-effective to run than video, and less creepy.

The mission: make the process as easy (and inexpensive) as possible so that people can be verified at scale, without giving away their personal data and becoming the product.

Tiberium · 2026-06-14T00:36:45 1781397405

You're right, HN stripped the deep link. There's also a funnier part below when the agent tells what harness it's running on, and other info: https://github.com/home-assistant/core/pull/173465#issuecomm...

Tiberium · 2026-06-14T00:26:23 1781396783

It might not help in your case, but LLMs are usually far more likely to help if you have the source code available, or if the service is running on localhost. Ideally, you have source code from which the LLM can build a lab setup and test vulnerabilities by itself. Normal GPT-5.5 with KYC is enough for this.

predkambrij · 2026-06-14T00:33:49 1781397229

Thanks for the tip :)

Tiberium · 2026-06-13T01:19:16 1781313556

Probably silently rerouting?

Tiberium · 2026-06-13T01:16:45 1781313405

I think what they're saying is that this prompt/jailbreak only lets Mythos discover some really easy vulnerabilities that it would probably fix anyway given a simple "Find and fix bugs in this code" prompt, and that other models like GPT-5.5 can easily do the same. That is very different from targeted security research.

chatmasta · 2026-06-13T01:48:49 1781315329

But it’s not that different from the whole premise of their red team scaremongering which was “we pointed the model at a source file and told it to find an exploit.”

Tiberium · 2026-06-13T01:08:52 1781312932

I also still have access, so either they silently reroute Fable 5 to Opus 4.8 or they haven't actually pulled the switch yet.

SXX · 2026-06-13T01:10:25 1781313025

You'll never know. They'll just silently sabotage you if you're a foreign national.

reneberlin · 2026-06-13T02:06:45 1781316405

Mythos escaped by itself, of course. You can't dictate the rules to a clever model like that :)

Tiberium · 2026-06-11T14:04:56 1781186696

Last update over a year ago, so I hope (2025) gets added to the title:

> [2025/05/26] (Step 1 completed!) We release Mixture-of-Thoughts--a curated reasoning dataset of 350k verified traces distilled from R1. The dataset spans tasks in mathematics, coding, and science, and is designed to teach language models to reason step-by-step. We also provide a recipe to train OpenR1-Distill-7B, which replicates the reasoning capabilities of deepseek-ai/DeepSeek-R1-Distill-Qwen-7B and marks the completion of step 1 in the Open R1 project.

It doesn't look like they managed to actually reproduce R1; they stopped at Step 1 of their 3-step plan.

spmurrayzzz · 2026-06-11T14:32:09 1781188329

One of my favorite code comments of all time is still in the src:

"# TODO: implement a proper validator to compare against ground truth. For now we just check for exact string match on each line of stdout." [1]

This was one of my chief complaints about the entire R1 news cycle, it felt like no one actually read the technical report. They were being heralded for their openness, but they left out the most meaningful details that you'd need to reproduce their work.

[1] https://github.com/huggingface/open-r1/blob/1416fa0cf21595d2...
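To illustrate why an exact string match on stdout is a weak validator, here is a minimal sketch. It is not the open-r1 code; the function names `exact_line_match` and `tolerant_match` and the tolerance value are assumptions made up for this example. The first function mirrors the behaviour the TODO describes, the second is one possible (still incomplete) step toward comparing against ground truth.

```python
import math


def exact_line_match(actual: str, expected: str) -> bool:
    """Pass only if every line of stdout is string-identical (the TODO's stopgap)."""
    return actual.splitlines() == expected.splitlines()


def tolerant_match(actual: str, expected: str, rel_tol: float = 1e-6) -> bool:
    """Hypothetical alternative: compare token by token, ignoring whitespace
    differences and allowing small numeric differences."""
    a_lines = [line.split() for line in actual.strip().splitlines()]
    e_lines = [line.split() for line in expected.strip().splitlines()]
    if len(a_lines) != len(e_lines):
        return False
    for a_toks, e_toks in zip(a_lines, e_lines):
        if len(a_toks) != len(e_toks):
            return False
        for a, e in zip(a_toks, e_toks):
            if a == e:
                continue
            try:
                # Tokens differ as strings; accept them only if both are
                # numbers that agree within the relative tolerance.
                if not math.isclose(float(a), float(e), rel_tol=rel_tol):
                    return False
            except ValueError:
                return False
    return True
```

With these definitions, a correct answer printed as `1.00` instead of `1.0` fails the exact check but passes the tolerant one, which is the kind of false negative a string-only validator produces.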

neutronicus · 2026-06-11T15:07:47 1781190467

Reminds me of my days in a computational physics PhD program.

devmor · 2026-06-11T18:47:03 1781203623

Years ago I did some contract work helping astrophysicists in a PhD program write scripts so that their research algorithms could finish computing their data before their great-grandchildren retired. One of the action items was never implemented because someone had written an “…etc” in the middle of the math.

They knew that everything was verifiably correct on the input, and verifiably correct on the output, and they swore they had figured it out at one point and just never wrote it down. I was asked if I could “just extrapolate it” and had to explain that computer programs work the same as math - I can give you a literally infinite number of ways to reach an output from an input.

khazhoux · 2026-06-11T16:53:07 1781196787

> This was one of my chief complaints about the entire R1 news cycle

For me it was the headline that a group of students replicated GPT-3 for $5000

Tiberium · 2026-06-11T12:37:19 1781181439

I tested it myself, seems to reproduce on: 3.1.0-a1 to 3.3.16, 4.0.0-a1 / 4.0.0-a2. Fixed in 3.3.17 and in master.

Gives you auth + access to Moderation Control Panel (if the user is a moderator/admin). Does not give access to the Admin Control Panel though.