Why not use both? I just built a pipeline for document data extraction that uses PaddleOCR, then Gemini 3 to check and fix errors. It gets close to 99.9% on extraction from financial statements, finally on par with humans.
I did the opposite: Tesseract to get bboxes, words, and chars, and then Mistral on the clips with some reasonable reflow to preserve geometry. Paddle wasn't working on my local machine (until I found RapidOCR). Surya was also very good, but because you can't really tweak any knobs, when it failed it just kinda failed. Overall: Surya > Rapid w/ Paddle > DocTr > Tesseract, though the latter gave me the most granularity when I needed it.
Edit: Gemini 2.0 was good enough for VLM cleanup, and now 2.5 or above with structured output makes reconstruction even easier.
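The "reasonable reflow to preserve geometry" step can be sketched in a few lines. This is a hypothetical illustration (the `reflow` function and its tolerance are mine, not from the original pipeline), assuming word-level boxes in the shape Tesseract's TSV output provides:

```python
# Hypothetical sketch of the "reflow" step: given word-level boxes
# (as Tesseract's TSV output would provide), group words into lines
# by vertical proximity and order them left-to-right, so the text
# sent to the VLM preserves the page geometry.

def reflow(words, line_tol=10):
    """words: list of (text, left, top, width, height) tuples."""
    lines = []  # each entry: (top, [(left, text), ...])
    for text, left, top, w, h in sorted(words, key=lambda t: t[2]):
        for line in lines:
            if abs(line[0] - top) <= line_tol:
                line[1].append((left, text))
                break
        else:
            lines.append((top, [(left, text)]))
    return "\n".join(
        " ".join(t for _, t in sorted(line[1])) for line in sorted(lines)
    )

boxes = [
    ("Total", 10, 100, 50, 12), ("$1,234", 200, 102, 60, 12),
    ("Revenue", 10, 60, 70, 12), ("$9,876", 200, 61, 60, 12),
]
print(reflow(boxes))
# Revenue $9,876
# Total $1,234
```

Real pages need something smarter (column detection, vertical-overlap ratios instead of a fixed tolerance), but even this crude grouping keeps label/value pairs on the same line for the LLM.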
Setting aside safety for a moment, consider just hygiene: BART is shockingly dirty. Which suggests mismanagement, above and beyond just a lack of deterrence of criminality.
As for safety -- firing squads are probably not in the cards, but would jailing the violent be too much to hope for?
No. It is pretty typical for anything gov to be pretty bad. Most don't work there due to how bureaucratic it is rather than the comp. This is what my friends who work in gov say, at least.
There is a strong correlation between hiring low end people and being or becoming ever more bureaucratic. Bureaucracy like everything else is there for a reason.
BART is a government organization and all California government employee pay is public. You can see that BART has about 40 software engineers and they earn about 70% of the market rate:
It seems to me that they are overworked & underpaid and are doing a good job given the circumstances.
NIMBYs have blocked BART in Silicon Valley. BART doesn't reach Menlo Park, Palo Alto, Stanford, Mountain View, Sunnyvale, Los Altos, Santa Clara, or Cupertino. A few years ago, it finally reached San Jose.
A separate train (CalTrain) goes from SF through Silicon Valley. Last year they switched to electric trains which are faster and run more frequently. The SF CalTrain station is inconvenient (20-mins walk from downtown, under a highway), but they are working to extend CalTrain to the central SF station: https://en.wikipedia.org/wiki/Salesforce_Transit_Center#Futu... .
So Silicon Valley transit is getting better, slowly.
BART barely goes into Silicon Valley. Fremont was the closest stop up until 2017. Now it gets to North San Jose. Even if it were funded, any further extension wouldn't be complete for over a decade.
I'll bite: Silicon Valley isn't known for good infrastructure, we are just able to roll back changes very easily. The cost of getting software wrong for BART is far higher than if my div is padded incorrectly.
Not as big a deal when Q8 quantization is already considered overkill and cuts size down to 50% (with a flat 2x speed boost and no additional compute overhead, mind you), and the more common Q4_K_M is more like 30%. Definitely interesting if it can be added on top of existing quantization, but K-quants already use different precision levels for different layers depending on general perplexity impact, which is similar to this entropy metric they use, e.g. Q6 using a mix of 4-bit and 8-bit. And that's not even considering calibrated imatrix, which does something conceptually similar to FFT to compress even further.
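The size figures above can be sanity-checked with back-of-the-envelope arithmetic. The bits-per-weight values below are rough community estimates for llama.cpp quant formats (including the overhead of block scales), not exact numbers:

```python
# Rough effective bits/weight for common llama.cpp quant formats,
# compared against an fp16 baseline. Values are approximate community
# figures including scale overhead, not exact format specs.

FP16_BITS = 16.0
approx_bits = {
    "Q8_0":   8.5,
    "Q6_K":   6.6,
    "Q4_K_M": 4.8,
}

def size_gb(n_params_billion, bits):
    # billions of params * bits/weight -> gigabytes (1 byte = 8 bits)
    return n_params_billion * bits / 8

for name, bits in approx_bits.items():
    pct = bits / FP16_BITS * 100
    print(f"{name}: ~{pct:.0f}% of fp16; 70B model ~= {size_gb(70, bits):.0f} GB")
```

This is where the "Q8 is ~50%, Q4_K_M is ~30%" figures in the comment come from: 8.5/16 and 4.8/16 of the fp16 footprint.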
I do? I spend a ton of time post-training models for creative tasks.
The effects of model quantization are usually quantified in terms of performance on benchmaxxed tasks with strong logit probabilities, temp 0, and a "right" answer the model has to pick. Or, even worse, they're measured on metrics that don't map to anything except themselves, like perplexity (https://arxiv.org/pdf/2407.09141)
I agree Q8 is strong, but I also think the effects of quantization are consistently underappreciated. People often talk about how these models perform while fundamentally using 10+ variants of a single model, each with a distinct performance profile.
If you're trying to snarkily refer to the article on Dynamic Quants 2.0 and how carefully developed they were: they're comparing their quants against the methodology 99.99% of quants out there use.
The problem is not that people are making quants "haphazardly", it's that people keep parroting that various quants are "practically lossless" when they actually have absolutely no clue how lossy they are given how application specific the concept is for something as multidimensional as an LLM.
The moment anyone tries a little harder to quantify how lossy they are, we repeatedly find that the answer is "not by any reasonable definition of lossless". Even their example where Q4 is <1% away on MMLU 5-shot is probably massively helped by a calibration dataset that maps to MMLU-style tasks really well, just like constantly using WikiText massively helps models that were trained on... tons of text from Wikipedia.
So unless you're doing your own calibrated quantization with your own dataset (which is not impossible, but also nowhere near common), even their "non-haphazard" method could have a noticeable impact on performance.
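Why the calibration set matters can be shown with a toy example. This is not imatrix itself, just a minimal round-to-nearest quantizer whose scale is fit on a "calibration" sample; error stays small on data that looks like the calibration set and blows up on data that doesn't:

```python
# Toy illustration of calibration-set dependence: fit a 4-bit
# quantization scale on one distribution, then measure error on
# in-distribution vs out-of-distribution values.

def quantize(xs, scale, bits=4):
    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit
    # round to the nearest representable level, clamping to the range
    return [max(-qmax - 1, min(qmax, round(x / scale))) * scale for x in xs]

def mse(xs, ys):
    return sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

calib = [x / 100 for x in range(-100, 101)]   # "calibration" values in [-1, 1]
scale = max(abs(x) for x in calib) / 7        # scale fit on calibration data

in_dist  = [0.3, -0.7, 0.95]                  # looks like the calibration set
out_dist = [2.0, -3.5, 5.0]                   # larger than anything calibrated

print(mse(in_dist, quantize(in_dist, scale)))    # small rounding error
print(mse(out_dist, quantize(out_dist, scale)))  # clipping -> large error
```

Real calibration (imatrix, GPTQ-style) weights the error per activation channel rather than clipping like this, but the failure mode is the same: quality measured on data resembling the calibration set flatters the quant.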
You are saying that people are using quantized models haphazardly and talking about them haphazardly. I'll grant it's not the exact same thing as making them haphazardly, but I think you took the point.
The terms shouldn't be used here. They aren't helpful. You are either getting good results or you are not. It shouldn't be treated differently from further training on dataset d. The weights changed - how much better or worse at task Y did it just get?
The term is perfectly fine to use here because choosing a quantization strategy to deploy already has enough variables:
- quality for your specific application
- time to first token
- inter-token latency
- memory usage (varies even for a given bits per weight)
- generation of hardware required to run
Of those, the hardest to measure is consistently "quality for your specific application".
It's so hard to measure robustly that many will take significantly worse performance on the other fronts just to not have to try to measure it... which is how you end up with full precision deployments of a 405b parameter model: https://openrouter.ai/meta-llama/llama-3.1-405b-instruct/pro...
When people are paying multiples more for compute to side-step a problem, language and technology that allows you to erase it from the equation is valid.
And when you consider that the usual final step in the pipeline is that a sampler goes ham on the probabilities and just picks some random nonsense, the tolerance for lossy compression is fairly high.
In fact, there's a funny occurrence where Q4 models on occasion perform better than their fp16 counterparts on benchmarks run with top_k=1, since the outputs are slightly more random and they can less deterministically blunder past the local maximum into a more correct solution.
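The mechanism is easy to see in miniature. With top_k=1 (greedy decoding) the sampler always takes the argmax, so a tiny logit perturbation from quantization can permanently flip which token wins; with temperature sampling the same perturbation barely shifts the distribution. The logit values below are made up for illustration:

```python
# Greedy decoding (top_k=1) vs sampling, with a small quantization-style
# perturbation on the logits. Logit values are invented for illustration.
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def greedy(logits):
    return max(range(len(logits)), key=lambda i: logits[i])

fp16_logits = [2.00, 1.98, 0.50]   # token 0 barely wins
q4_logits   = [1.97, 2.01, 0.50]   # quantization noise flips the order

print(greedy(fp16_logits))  # 0
print(greedy(q4_logits))    # 1 -- a different "deterministic" answer

# Under sampling, both distributions give tokens 0 and 1 roughly equal
# probability, so the perturbation hardly changes behavior:
print(softmax(fp16_logits)[:2])  # both close to ~0.45
print(softmax(q4_logits)[:2])
```

So when the top two logits are nearly tied, greedy decoding amplifies quantization noise into a hard behavioral difference, which is how a Q4 model can stumble onto the "right" token the fp16 model deterministically misses.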
"strict" means something. People, including yourself, only care if there is a practical difference in performance. "this is lossless and that isn't lossless" is a completely useless statement in this realm. In many domains lossy compression is either not tolerated, not legal or not practical.
This paper is basically statistical mechanics with a quantum veneer. Two major issues:
1. Scale: They're simulating just 13 qubits with QuTiP and making grand claims about quantum thermodynamics. The computational complexity they're glossing over here is astronomical. Anyone who's actually worked with quantum systems knows you can't just handwave away the scaling problems.
2. Measurement Problem: Their whole argument about instantaneous vs time-averaged measurements is just repackaging the quantum measurement problem without actually solving anything. They're doing the same philosophical shell game that every "breakthrough" quantum paper does by moving around where they put the observer and pretending they've discovered something profound.
1. The main underpinning of this article is the analytical theory they come up with independent of their simulation. The fact that it explains a few qubits well is exactly why this is interesting. If you were to scale up their model (a spin-1/2 Ising model), you would effectively get a classical magnet, which is obviously well described by classical thermodynamics. It's in the limit of small systems that quantum mechanics makes thermodynamics tricky.
2. Their time averaging is just to remove fluctuations in the state, not to avoid the measurement problem. They're looking at time averages of the density matrix, which still yields a quantum object that will collapse upon measurement. And as their mathematical model points out, this holds for arbitrary time-averaging windows; the bounds just change accordingly, since smaller averaging windows allow for larger fluctuations. There's nothing being swept under the rug here.
As long as they are isolated, their state is a superposition of all possible states and evolves deterministically, with the amplitude of each of these "sub-states" evolving perfectly deterministically. If you want to perform a measurement, you choose a possible decomposition of the superposition state and measure along that axis, and you'll get one of the values along that axis with a probability equal to the squared modulus of that value's (complex) amplitude.
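The rule described above (the Born rule) fits in a few lines; here's a minimal sketch for a single qubit in an equal superposition:

```python
# Born rule for a single qubit: measurement probabilities are the
# squared moduli of the complex amplitudes in the chosen basis.
import math

# State (|0> + i|1>) / sqrt(2): equal-weight superposition,
# with a relative phase of i on |1>.
state = [complex(1, 0) / math.sqrt(2), complex(0, 1) / math.sqrt(2)]

probs = [abs(a) ** 2 for a in state]
print(probs)  # ~[0.5, 0.5]: either outcome is equally likely

# Probabilities over any full decomposition must sum to 1
# (the state is normalized):
assert abs(sum(probs) - 1) < 1e-12
```

Note that the phase `i` on the second amplitude doesn't affect these probabilities at all; it only matters if you measure in a different basis, which is exactly the "choose a decomposition" step in the comment.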
What a nice comment!! This has been a big failing of my mental model. I always believed if I was smart enough I should understand things without effort. Still trying to unlearn this....
Unfortunately you must look closely at the details to deeply understand how something works. Even when I already have a decent mental heuristic about how an algorithm works, I get a much richer understanding by calculating the output of an algorithm by hand.
At least for me, I don't really understand something until I can see all of the moving parts and figure out how they work together. Until then, I just see a black box that does surprising things when poked.
It's also important to learn how to "teach yourself".
Understanding transformers will be really hard if you don't understand basic fully connected feedforward networks (multilayer perceptrons). And learning those is a bit challenging if you don't understand a single unit perceptron.
Transformers have the additional challenge of somewhat odd terminology. Keys, queries, and values kinda make sense coming from the traditional information-retrieval literature, but they're more of a metaphor in the attention mechanism. "Attention" and other mentalistic/anthropomorphic terminology can also easily mislead intuitions.
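One way to deflate the metaphor is to write the attention step out; it's just three projections fed through a softmax-weighted average. A minimal single-head sketch (plain lists, no batching or masking, toy numbers):

```python
# Minimal scaled dot-product attention over plain Python lists,
# to show Q/K/V are nothing more exotic than a weighted average.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Q, K, V: lists of vectors (one per token)."""
    d = len(K[0])
    out = []
    for q in Q:
        # similarity of this query to every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # softmax-weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query that matches the first key far more strongly than the second,
# so the output lands very close to the first value vector:
Q = [[1.0, 0.0]]
K = [[10.0, 0.0], [0.0, 10.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(attention(Q, K, V))  # roughly [1.0, 2.0]
```

In a real transformer Q, K, and V are learned linear projections of the same token embeddings, but the "retrieval" is only ever this soft mixing; nothing is looked up discretely.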
Getting a good "learning path" is usually a teacher's main task, but you can learn to figure those by yourself by trying to find some part of the thing you can get a grasp of.
Most complicated seeming things (especially in tech) aren't really that complicated "to get". You just have to know a lot of stuff that the thing builds on.
99% perspiration, 1% inspiration, as the adage goes... and I completely agree.
The frustration for the curious is that there is more than you can ever learn. You encounter something new and exciting, but then you realize that to really get to the spot where you can contribute will take at least a year or six, and that will require dropping other priorities.