Hacker News | spudlyo's comments

"Fog everywhere. Fog up the river, where it flows among green aits and meadows; fog down the river, where it rolls defiled among the tiers of shipping and the waterside pollutions of a great (and dirty) city. Fog on the Essex marshes, fog on the Kentish heights. Fog creeping into the cabooses of collier-brigs; fog lying out on the yards and hovering in the rigging of great ships; fog drooping on the gunwales of barges and small boats. Fog in the eyes and throats of ancient Greenwich pensioners, wheezing by the firesides of their wards; fog in the stem and bowl of the afternoon pipe of the wrathful skipper, down in his close cabin; fog cruelly pinching the toes and fingers of his shivering little ’prentice boy on deck. Chance people on the bridges peeping over the parapets into a nether sky of fog, with fog all round them, as if they were up in a balloon and hanging in the misty clouds."

To me that all sounds lovely and evocative, hmm. Maybe an inspiration for some of the vibe in the game Little Inferno?

> It's a thermonuclear ADHD amplifier and I have seen the same effect in every single one of my adult friends.

You make this sound like a bad thing. ADHD isn't always about attention deficit, although it is right there in the name. It's more about attention dysregulation. For those of us prone to hyperfocus, working with AI can provide the kinds of stimulation we crave. I can hardly remember a time when I've felt more engaged with my work, more productive, and more badass.

I actually enjoy the collaborative programming process, and was pair programming with folks before the term was coined. At the end of the day I have the satisfaction of browsing the pretty, readable, DRY, maintainable code we end up with after rounds of refactoring and back and forth. I have always employed linters and code formatters, and this is no different, and my standards are still the same. I yell at the clanker about code duplication, hard-coded assumptions, tightly coupled logic, and in the end, while I don't understand the details of every algorithm, I really understand what we've built and the architecture we've designed.


Absolutely. I can't tell you how many times I've been in a conversation and halfway through a sentence I need to whip out AI to scratch the mental itch so I can continue with the conversation.

But prior to this I would rabbit hole. I would try desperately to remember some nuance, or I would not be able to move off a point until I got the validation I was looking for.

The worst is when speaking a foreign language and I hit some complex word in my native language that isn't present in my foreign lexicon. My brain just halts. It wants THAT word or phrase, not a 3 minute detour describing a whole concept.

AI has empowered me to move past these unnecessarily difficult speed bumps in my thinking.


> I actually enjoy the collaborative programming process, and was pair programming with folks before the term was coined

Yep, same here, I'm a longtime pair programming enjoyer, but I'd like to point out that in the context of pair programming, collaboration usually means working with another human being, and prompting an agent to execute a task is nothing like that.


Prompting an agent to execute a task assumes you know what the task should be, have done some research on available options, weighed the pros and cons of various approaches, bounced your ideas off a colleague, have written a few test programs to validate your assumptions, considered how the new code will integrate with existing systems, figured out the parts that you should have tests for, and have generally charted a path forward that gives you a reasonable chance of success.

For me it's been useful as an idea categorizer: "oh well, that turned out to be a crap idea."

It's allowed me to clear out some long-standing brush on the forest floor. And burn it down once or twice.


Won't somebody please think of the copyright holders!?

"This material is valuable enough for me to steal, but not valuable enough to care about there being an incentive to create in the first place!"

Totally makes sense /s


I use it with Pi and with Gptel and I'm extremely happy about the price. The speed of deepseek-v4-pro though leaves something to be desired. I do love how detailed its chain of thought reasoning is, and it's pretty wild watching it think at ~2400 baud. It's much more transparent than Gemini 3.5 flash in that regard, but maybe 4-5x slower? For my Latin language morphology and linguistic tasks it seems to be up to the job, and on the plus side I can analyze a handful of sentences in parallel without worrying about breaking the bank.


Emacs has been a viable option for going on a half century now. The GNU Emacs 31 branch[0] was cut recently and is barreling towards a new release. It might be time to give it another look.

I'm not saying its package ecosystem isn't vulnerable to these kinds of attacks, it is, but it's at least developed by folks with very different goals and ambitions than Microsoft.

[0]: https://github.com/emacs-mirror/emacs/blob/master/etc/NEWS


So, this project consists of a ~175 line README and a ~500 line Python program that glues yt-dlp and Kroko together. Neat.

I guess if it encourages you to install and figure out how to use ffmpeg, yt-dlp, kroko, numpy, and onnx that's a good thing. Sometimes just knowing a thing is possible is a huge benefit.


I see the value as a centralized anti-content-blocker.

This repo is now a good way to centralize hacks around the sure-to-come blockers those platforms will add to prevent download.

Just like uBlockOrigin was a way to centralize all the "just run this greasemonkey script" comments, I can see this getting a huge following for people who really value transcriptions.


I appreciate the perspective! That's a higher ceiling than I'd put on it, but if it gets there, awesome. PRs welcome!


Thank you. You nailed the actual value, that's right. The real win is just knowing you can do this on a laptop CPU, offline, no GPU or cloud bill. There are tiny done-for-you details, like rescaling token timestamps back to real time after the atempo speedup so --timestamps doesn't lie to you, but they are minor.
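
For anyone curious what that rescaling amounts to, here's a minimal sketch. It is not the project's actual code: `rescale_timestamps` and the `(text, start_seconds)` token shape are made up for illustration. The only assumption is that the audio was sped up by a constant factor (as ffmpeg's `atempo` filter does), so a time measured in the sped-up audio maps back to real time by multiplying by that factor.

```python
def rescale_timestamps(tokens, tempo):
    """Map token start times from sped-up audio back to original time.

    tokens: list of (text, start_seconds) pairs measured in the sped-up audio
            (hypothetical shape, chosen for this example).
    tempo:  the constant speedup factor that was applied, e.g. 2.0 for atempo=2.0.
    """
    if tempo <= 0:
        raise ValueError("tempo must be positive")
    # One second of sped-up audio covers `tempo` seconds of the original.
    return [(text, start * tempo) for text, start in tokens]


if __name__ == "__main__":
    sped_up = [("hello", 1.0), ("world", 2.5)]
    print(rescale_timestamps(sped_up, 2.0))
```

If the rescale is skipped, every timestamp comes out early by a factor of `tempo`, which is the "lying" part.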


Why the choice of Kroko over something like parakeet-tdt-0.6b-v3, which is also faster than realtime on CPU?


Kroko models are more accurate, and they're only about a hundred megabytes, compared to 2.5 gigabytes for Parakeet in its default fp32.


Do you have a link to results confirming this? Kroko does not seem to be on the Open ASR Leaderboard. Parakeet has an average WER of 6.32 across several common datasets.


Kroko's website says benchmarks aren't formalized yet. FWIW, this URL says 5% WER for English [0], though it doesn't specify the dataset, so it's not directly comparable to Parakeet's 6.32 on the Open ASR Leaderboard.

The best way to judge is to try it on your own audio.

[0] https://huggingface.co/hudaiapa88/sherpa-stt-onnx


I was surprised how hard it was to stop the Python transformers library from phoning home to Hugging Face. I set HF_HUB_DISABLE_TELEMETRY=1, and when I called Wav2Vec2CTCTokenizer.from_pretrained I explicitly passed local_files_only=True, but I still got a warning about not having a valid HF_TOKEN. It wasn't until I stumbled upon HF_HUB_OFFLINE=1 that I became somewhat confident that I'm not making outgoing connections to HF every time I load a wav2vec2 model from disk.

I wouldn't have realized this was happening at all if it weren't for the obnoxious HF_TOKEN warning.
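
A rough sketch of the setup described above, in case it saves someone the hunt. The environment variables and `local_files_only=True` are the ones named in this comment; `load_tokenizer_offline` and `model_dir` are hypothetical names for illustration. As far as I know the variables need to be set before transformers is imported, so the sketch sets them first and imports lazily. Treat it as a starting point, not a guarantee that nothing goes out over the wire.

```python
import os

# Set these before transformers / huggingface_hub are imported.
os.environ["HF_HUB_OFFLINE"] = "1"            # ask the hub client not to make network calls
os.environ["HF_HUB_DISABLE_TELEMETRY"] = "1"  # opt out of telemetry as well


def load_tokenizer_offline(model_dir):
    """Load a wav2vec2 tokenizer from a local directory only.

    model_dir is a hypothetical path to a model already saved on disk.
    """
    # Imported here so the environment variables above are already in place.
    from transformers import Wav2Vec2CTCTokenizer

    return Wav2Vec2CTCTokenizer.from_pretrained(model_dir, local_files_only=True)
```

Exporting the same variables in the shell before launching Python should have the same effect and avoids any import-order concerns.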


HF is notorious for making it difficult to work offline (or at least not waste time trying to connect when everything needed is offline) and is constantly changing how it is being handled. Previously, there was TRANSFORMERS_OFFLINE, HF_DATASETS_OFFLINE, etc.


Does something like Little Snitch catch these to help find the things doing hidden shenanigans?


Yes, it would flag an outbound connection from the Python process.
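
If you don't have a firewall tool handy, a crude in-process check is also possible. This is only a sketch under stated assumptions: `forbid_network` is a made-up helper that temporarily replaces `socket.socket.connect`, so it only catches connections made through Python's `socket` module via `connect` (not `connect_ex`, and not native code that opens sockets on its own).

```python
import socket
from contextlib import contextmanager


@contextmanager
def forbid_network():
    """Make any socket.connect() call inside the block raise instead of connecting."""
    original_connect = socket.socket.connect

    def guarded_connect(self, address):
        raise RuntimeError(f"blocked outbound connection to {address!r}")

    socket.socket.connect = guarded_connect
    try:
        yield
    finally:
        # Always restore the real method, even if the block raised.
        socket.socket.connect = original_connect


if __name__ == "__main__":
    with forbid_network():
        s = socket.socket()
        try:
            s.connect(("127.0.0.1", 9))
        except RuntimeError as exc:
            print(exc)
        finally:
            s.close()
```

Wrapping a model load in `with forbid_network():` turns a silent phone-home attempt into a loud traceback pointing at the caller.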


While I think it's a compelling idea that playing speech in your target language while you sleep can help, I don't think it's ever been demonstrated to work.

Having said that, sleep is incredibly important for learning anything! I practice my language learning during the day, a little bit every day, and I prioritize getting good sleep. This is mostly just trying to go to bed at the same time every night, avoiding alcohol, and giving myself an hour before bed with low lights to read and calm my mind. When you sleep, memories are consolidated, organized, and tagged for long-term storage. I will sometimes wake up in the middle of the night and bouncing around in my mind are echoes of phrases and words from my target language. I figure it's working.


I am really loving working on a fun Elisp project with pi, a minimal and very extensible agent. I have the agent use emacsclient to control my session, showing me code, running magit ediff for me, testing, formatting, reloading -- it's all working great.

I'm still exploring all the ways the agent and I can collaborate using Emacs as a shared medium, but at the moment am super optimistic about it.


The messaging around what is and isn't allowed with the various Claude plans has been so very muddled as of late. Add to that declining model performance, changes to default reasoning efforts, expanded token usage, caching bugs, corporate denials and gaslighting -- I don't think it's overstating matters to say they've suffered some major self-inflicted reputational damage.

As it stands now, there is so much FUD surrounding their offerings, I'm not sure what they could do in the short term to turn things around.


It's just an organizational maturity thing.

They need to start shifting from "move fast and break things" to "move faster by slowing down". Their public communication, feature set, and organization as a whole need to start matching the scale and level they're competing at. They won many hearts and minds by being better and are losing them by being chaotic. Different outcomes from the same internal behavior because they needed to change gears and haven't.

