> The second you turn your head though, your fellow teammates will conspire to replatform onto Go or Rust or NodeJS or GitHub Actions and make everything miserable again.

Curious how you would use Smalltalk in place of GitHub Actions, assuming you need a GitHub-integrated CI runner?


Any build toolkit is just automation over bash. You can make your own. GitHub integration need not be any more than the most trivial thing that works. Your coworkers, naturally, won't be disciplined enough to keep the integration trivial and will build super complicated crap that's really hard to troubleshoot, because they can.

I have a hard time conceptualizing lossy text compression, but I've recently started to think of the "reasoning"/output as just a byproduct of lossy compression, with the weights tending towards an average of the information "around" the main topic of the prompt. What I've found easier is thinking of it like lossy image compression: generating more output tokens via "reasoning" is like subdividing nearby pixels and filling in the gaps with values the model has seen there before. Taking the analogy a bit too far, you can also think of the vocabulary as the pixel bit depth.
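
To make the analogy concrete (a toy sketch only, nothing to do with how LLMs are actually implemented): upscaling a lossily stored image fills in the missing pixels from the values around them, which is roughly the picture I have in mind for filling in tokens from the statistics "around" a topic.

    # Toy illustration of the analogy only: a 2x2 "compressed" image is
    # expanded to 4x4 by filling each missing pixel from the stored values
    # around it, the way lossy codecs reconstruct detail from neighbours.
    import numpy as np

    compressed = np.array([[10.0,  50.0],
                           [90.0, 130.0]])      # the few values actually kept

    def upscale_bilinear(img, factor):
        """Fill the gaps between stored pixels by linear interpolation."""
        rows = np.linspace(0, img.shape[0] - 1, img.shape[0] * factor)
        cols = np.linspace(0, img.shape[1] - 1, img.shape[1] * factor)
        tmp = np.array([np.interp(cols, np.arange(img.shape[1]), r) for r in img])
        return np.array([np.interp(rows, np.arange(img.shape[0]), c) for c in tmp.T]).T

    print(upscale_bilinear(compressed, 2))
    # Every new pixel is a plausible blend of its neighbours, not information
    # that was ever in the original.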

I definitely agree that replacing "AI" or "LLMs" with "X driven by compressed training data" makes things a lot clearer, and it's a useful shortcut.


You're right about "reasoning". It's just trying to steer the conversation in a more relevant direction in vector space, hopefully to generate more relevant output tokens. I find it easier to conceptualize this in three dimensions. 3blue1brown has a good video series which covers the overall concept of LLM vectors in machine learning: https://youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_...

To give a concrete example, say we're generating the next token from the word "queen". Is this the monarch, the bee, the playing card, the drag entertainer? By adding more relevant tokens (honey, worker, hive, beeswax) we steer the token generation to the place in the "word cloud" where our next token is more likely to exist.
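
A toy sketch of that steering (the vectors below are entirely made up for illustration; real embeddings have hundreds or thousands of dimensions): averaging a few context embeddings with "queen" pulls the query toward the bee region of the space.

    # Toy illustration only: hand-made 3-D "embeddings" standing in for the
    # real high-dimensional ones, to show how context shifts a query vector.
    import numpy as np

    emb = {
        # made-up axes: [royalty-ness, insect-ness, card-ness]
        "queen":   np.array([0.6, 0.5, 0.5]),   # ambiguous on its own
        "monarch": np.array([1.0, 0.0, 0.1]),
        "bee":     np.array([0.1, 1.0, 0.0]),
        "ace":     np.array([0.1, 0.0, 1.0]),
        # context tokens leaning toward the insect sense
        "honey":   np.array([0.0, 0.9, 0.0]),
        "hive":    np.array([0.0, 1.0, 0.1]),
        "worker":  np.array([0.2, 0.8, 0.1]),
    }

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def nearest(query, candidates):
        return max(candidates, key=lambda w: cosine(query, emb[w]))

    senses = ["monarch", "bee", "ace"]
    print(nearest(emb["queen"], senses))        # "queen" alone leans to "monarch"
    steered = np.mean([emb[w] for w in ["queen", "honey", "hive", "worker"]], axis=0)
    print(nearest(steered, senses))             # with bee-ish context -> "bee"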

I don't see LLMs as "lossy compression" of text. To me that implies retrieval, and Transformers are a prediction device, not a retrieval device. If one needs retrieval then use a database.


> You're right about "reasoning". It's just trying to steer the conversation in a more relevant direction in vector space, hopefully to generate more relevant output tokens.

I like to frame it as a theater script cycling through the LLM. The "reasoning" difference is just changing the style so that each character has film noir monologues. The underlying process hasn't really changed, and the monologue text isn't fundamentally different from dialogue or stage direction... but more data still means more guidance for each improv cycle.

> say we're generating the next token from the word "queen". Is this the monarch, the bee, the playing card, the drag entertainer?

I'd like to point out that this scheme can result in things that look better to humans in the end... even when the "clarifying" choice is entirely arbitrary and irrational.

In other words, we should be alert to the difference between "explaining what you were thinking" versus "picking a firm direction so future improv makes nicer rationalizations."


It makes sense if you think of the LLM as building a data-aware model that compresses the noisy data by parsimony (the principle that the simplest explanation that fits is best). Typical text compression algorithms are not data-aware and not robust to noise.

In lossy compression the compression itself is the goal. In prediction, compression is the road that leads to parsimonious models.


The way I visualize it is imagining clipping the high frequency details of concepts and facts. These things operate on a different plane of abstraction than simple strings of characters or tokens. They operate on ideas and concepts. To compress, you take out all the deep details and leave only the broad strokes.

One day people will say "we used to think the devil is in the details, but now we know it is in their removal".

It is not a useful shortcut because you don't know what the training data is, nothing requires it to be an "average" of anything, and post-training arbitrarily re-weights all of its existing distributions anyway.

> In general once you start thinking about scaling data to larger capacities is when you start considering the cloud

What kind of capacities, as a rule of thumb, would you use? You can fit an awful lot of storage and compute in a single rack, and the cost of large DBs on AWS and others is extremely high, so the savings are larger as well.


Well, if you want proper DR you really need an off-site backup, disk failover/recovery, etc. And if you don't want to be manually maintaining individual drives, then you're looking at one of the big, expensive storage solutions with enterprise-grade hardware, and those will easily cost some large multiple of whatever 2U DB server you end up putting in front of them.

Same setup here. One game I've hit a problem with, though it will be a rare case, is StarCraft: Remastered. Wine has an issue with audio processing that I can't seem to configure my way out of: it pegs all 32 threads and still stutters. Thankfully the game can likely run on an actual potato, so I have a separate mini PC running Windows for when I want to get my ass kicked on Battle.net.


Working at IT shops in the late 2000s, it was still pretty commonplace for there to be server rooms. Even for a large org with multiple sites hundreds of kilometres apart, you could manage it with a pretty small team. And from what I remember, it is a lot easier to build resilient applications now than it was back then.

Cloud costs are getting large enough that I know I’ve got one foot out the door and a long term plan to move back to having our own servers and spend the money we save on people. I can only see cloud getting even more expensive, not less.


There is currently a bit of an early shift back to physical infra. Some of this is driven by costs(1), some by geopolitical concerns, and some by performance. However, dealing with physical equipment does reintroduce a different (old-fashioned, but somewhat atrophied) set of skills and costs that companies need to deal with.

(1) It is shocking how much of the move to the cloud was driven by accountants who wanted opex instead of capex, but who are now concerned with actual cash flow and are thinking of going back. The cloud is really good at serving web content and storing gobs of data, but once you start wanting to crunch numbers or move that data, it gets expensive fast.


In some orgs the move to the cloud was driven by accountants. In my org it was driven by lawyers. With GDPR on the horizon and murmurs of other data privacy laws that might (but didn't) require data to be stored in the customer's jurisdiction, we needed to host in additional regions.

We had a couple of rather large datacenters, but both were in the US. The only infrastructure we had in the EU was one small server closet. We had no hosting capacity in Brazil, China, etc. Multi-region availability drove us to the cloud, just not in the "high availability" sense of the term.


> I can only see cloud getting even more expensive, not less.

When you have three major hyperscalers competing for your dollars, this is basically not true and not how markets work... unless they start colluding on prices.

We've already seen reductions in web-services prices across the three major providers due to this competition.


And it’ll be so good and cheap that you’ll figure “hell, I could sell our excess compute resources for a fraction of AWS.” And then I’ll buy them, you’ll be the new cloud. And then more people will, and eventually this server infrastructure business will dwarf your actual business. And then some person in 10 years will complain about your IOPS pricing, and start their own server room.


I discovered this project recently and used it for the Himawari Standard Data format, and it made the job so much easier. Definitely recommend it if you need to create binary readers for uncommon formats.


Exactly, and consumer tech is wildly faster now. E.g., a Ryzen 5825U mini PC with 16 GB of memory and a 512 GB NVMe drive is ~$250 USD. That thing will outperform a 14-core Xeon from ~2016 on multicore workloads and absolutely thrash it in single thread. Yes, the lack of ECC is not good for any serious workload, but it's great for lower environments/testing/prototyping, and it sips power at ~50W full tilt.


Curiously, RAM sizes haven't gone up much for consumer tech.

As an example: my Macbook Pro from 2015 had 16 GiB RAM, and that's what my MacBook Air from 2025 also has.


Ehhh, MacBook Pros can be configured with up to 128 GB now; IIRC 16 GB was the max back then. But I guess the baseline hasn't moved as much.


Yes, there has been some movement. But even an 8-fold increase (128/16) over a decade is nothing compared to what we used to see in the past.

Oh, and the new machine has unified RAM. The old machine had a bit of extra RAM in the GPU that I'm not counting here.

As far as I can tell, the new RAM is a lot faster. That counts for something. And presumably also uses less power.
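
To put that 8-fold figure in numbers (a rough back-of-the-envelope, assuming smooth exponential growth):

    import math

    # Rough arithmetic: 16 GB -> 128 GB over roughly a decade.
    growth = 128 / 16                                  # 8x
    years = 10
    doubling_time = years / math.log2(growth)
    print(f"~{doubling_time:.1f} years per doubling")  # ~3.3 years

    # For comparison, RAM capacities used to double roughly every 1.5-2
    # years back in the day, so this really is a big slowdown.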


> $50 for a dyno with 1 GB of ram in 2025 is robbery

AWS isn't much better, honestly. $50/month gets you an m7a.medium, which is 1 vCPU (not a core) and 4 GB of RAM. Yes, that's more memory, but is it any wonder AWS is making money hand over fist?


Not sure if it's an apples-to-apples comparison with Heroku's $50 Standard-2X dyno, but an Amazon Lightsail instance with 1GB of RAM and 2 vCPUs is $7/month.


AWS certainly also commits daylight robbery. In the AWS model the normal virtual servers are overpriced, but not super overpriced.

Where they get you is all the ancillary shit: you buy some database/backup/storage/managed service/whatever, and it's priced in dollars per boogaloo, you also have to pay water tax on top, and of course if you use more than the provisioned amount of hafnias, the excess ones cost 10x as much.

Most customers have no idea how little compute they are actually buying with those services.


That is assuming you need that 1 core 24/7. You can get 2 cores / 8 GB for $43, which will most likely fit 90% of workloads (steady traffic with spikes, or a 9-to-5 cadence).

If you reserve that instance you can get it for 40% cheaper, or get 4 cores instead.

Yes, it's more expensive than OVH, but you also get everything AWS has to offer.


This, plus moving from Heroku to AWS as a backup plan wouldn't necessarily solve the problem, at least with our infra. When us-east-1 went down this week, so did Heroku for us.


m7a doesn't use HyperThreading; 1 vCPU is a full dedicated core.

To compare with Heroku's standard dynos (which are shared hosting), you want the t3a family, which is also shared and much cheaper.


I must be confused; my understanding was that m7a is 4th-generation Epyc (Genoa, Bergamo and Siena), which I believe all have 2 threads per core, no?


You're not confused: AWS either gets custom chips without SMT, or they disable it; I'm not sure which. Here's where AWS talks about it: https://aws.amazon.com/ec2/instance-types/m7a/

> One of the major differences between M7a instances and the previous generations of instances, such as M6a instances, is their vCPU to physical processor core mapping. Every vCPU on a M7a instance is a physical CPU core. This means there is no Simultaneous Multi-Threading (SMT). By contrast, every vCPU on prior generations such as M6a instances is a thread of a CPU core.

My wild guess is they're disabling it. For Intel instance families they loudly praise their custom Intel processors, but this page does not contain the word "custom" anywhere.
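
If you want to check what you actually got on a running instance, something like this should settle it (a minimal Linux-only sketch of my own; it just compares logical CPUs against the distinct physical cores listed in /proc/cpuinfo):

    # Minimal Linux-only check: if logical CPUs == physical cores, no
    # SMT/Hyper-Threading is exposed to the guest (as AWS claims for m7a).
    import os

    def physical_cores():
        cores = set()
        physical_id = core_id = None
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("physical id"):
                    physical_id = line.split(":")[1].strip()
                elif line.startswith("core id"):
                    core_id = line.split(":")[1].strip()
                elif not line.strip():          # blank line ends one CPU entry
                    if physical_id is not None and core_id is not None:
                        cores.add((physical_id, core_id))
                    physical_id = core_id = None
        if physical_id is not None and core_id is not None:
            cores.add((physical_id, core_id))   # in case there's no trailing blank line
        return len(cores)

    logical = os.cpu_count()
    physical = physical_cores()
    print(f"logical CPUs: {logical}, physical cores: {physical}")
    print("SMT exposed" if logical > physical else "no SMT (1 thread per core)")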


Decoder for people reading:

PM - Product Manager

FS - Fullstack developer

FE - Frontend developer

BE - Backend developer


Interestingly, the French version is completely different.

https://github.com/microsoft/edgeai-for-beginners/blob/main/...

