cdavid · 2026-06-07T05:10:15 1780809015

That's debatable. We can't go back in history, but if it were not for ML/data science, I believe python 3 would have killed python. At that time web dev / CLI utilities were major use cases, and that was the time golang became mainstream.

Data science, and then ofc DL, being done through python just when python 3 was kinda usable (around 3.3/3.4) was a stroke of luck timing-wise.

cdavid · 2026-06-06T01:03:27 1780707807

I wanted to understand the implementation of some numerical algorithms, and the tech reports were not enough.

I cloned the repo of said library, gave it to Claude, and asked it to write a new technical report in math notation, but annotated with links to the code so that I could pick up the details. It basically one-shotted the full report, and that helped me re-implement it in "pure python + numpy", "manually".

cdavid · 2026-06-04T12:56:55 1780577815

Partially a mix of a strong hacker culture in Germany in the 90s + Berlin being a major place for electronic music in that decade.

For example, Ableton was famously co-created by the members of Monolake, a pioneer of minimalist techno in the 90s. Some history there: https://www.roberthenke.com/interviews/ableton.html

coldtea · 2026-06-05T13:03:35 1780664615

Ableton yes - and Bitwig which is ex Ableton people.

But Cubase, Logic, etc weren't focused on electronic music in particular. They catered to general studio production.

cdavid · 2026-06-06T01:07:39 1780708059

Also NI, etc. were closely linked to the scene from the early days.

Cubase, etc. have no such link that I know of, but there was still the strong hacker culture around Atari and, to a lesser degree, Amiga (vs PC), when the PC was just not usable for anything low latency in the mid 90s.

cdavid · 2026-06-03T22:54:34 1780527274

no because it does not come from the same budget

colonelspace · 2026-06-04T00:23:04 1780532584

Money spent is money spent.

cdavid · 2026-05-23T12:55:45 1779540945

Not really. The many cores of a GPU, at least for fp32, give you 2 to 4 orders of magnitude compared to a high-speed CPU.

The rest will be from going from "python float" (i.e. not numpy) to C, which already gives you a 2 to 3 orders of magnitude difference, and then another 2 to 3 from plain C to optimized SIMD.

See e.g. https://github.com/Avafly/optimize-gemm for how you can get 2 to 3 orders of magnitude just from C.

p1esk · 2026-05-23T14:37:36 1779547056

Theoretical FP32 performance of AMD EPYC 9965 is double that of A100: 41.2 TFLOP/s vs 19.5 TFLOP/s

fc417fc802 · 2026-05-23T22:15:44 1779574544

Isn't that because the A100 is optimizing for memory bandwidth per TF?

cdavid · 2026-05-22T12:53:43 1779454423

scilab is not based on numpy/etc. However, matlab was certainly an inspiration for the scientific python stack in the early 2000s. I myself started contributing to numpy and matplotlib in 2006 or so, adding missing features I needed to move away from matlab.

cdavid · 2026-05-18T02:39:58 1779071998

A fifth edition came out recently: https://shop.elsevier.com/books/programming-massively-parall...

I started learning about GPU and CUDA from this book recently, and I agree the writing is confusing and the code examples have errors. However, it is still a nice reference about many types of algorithms for heterogeneous memory devices. It even helped me better understand some patterns for CPUs.

cdavid · 2026-05-10T05:27:31 1778390851

Did not know of the "thinkism" expression. When I was studying at an engineering school in France, I called that "the mythe du cerveau" (literally "the brain myth", though it does not roll off the tongue as well).

It is a guaranteed failure mode of large orgs. Curious to hear about more references on how to fight this at an organization level, besides the one given in the OT.

qsera · 2026-05-10T07:08:48 1778396928

Yea, we just name things that we want to see destroyed...

Not everything needs to be made so easy to refer to, like using three or four words instead of one..

kang · 2026-05-10T07:30:09 1778398209

try replacing the word with 'thinking'

cdavid · 2026-05-09T04:28:00 1778300880

The main point of The Mythical Man-Month was that communication cost across people is the main cost as projects grow in complexity.

So increasing individual output by itself is not enough to affect the argument. It could, if you also reduce the number of people needed for a project, where people means everyone involved in the project, not just SWE. But there are strong forces in large orgs that pull toward larger project sizes: budgeting overhead and other similar "large orgs optimize for legibility" kinds of arguments.

IMO the only way this will change is when new companies challenge the existing big guys. I think AI will help achieve this (e.g. agentic e-commerce challenging the existing players), but it will take time.

cdavid · 2026-04-25T03:18:44 1777087124

Indeed. I would add a third factor to compute and datasets: the lego-like aspect of NN that enabled scalable OSS DL frameworks.

I did some ML in the mid 2000s, and it was a PITA to reuse other people's code (when available at all). You had some well known libraries for SVM, for HMM you had to use HTK that had a weird license, and otherwise looking at experiments required you to reimplement stuff yourself.

Late 2000s had a lot of practical innovation that democratized ML: theano and then tf/keras/pytorch for DL, scikit-learn for ML, etc. That ended up being important because you need a lot of tricks to make this work on top of a "textbook" implementation. E.g. if you implement the EM algo for GMM, you need to do it in log space to avoid underflow. Same for DL (Glorot and co. initialization, etc.).
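
To make the log space point concrete, here is a minimal sketch of the E-step for a single point of a 1-D GMM, in pure python. The function names (`logsumexp`, `log_gaussian`, `responsibilities`) are made up for illustration and not taken from any library. The idea is to keep everything as log probabilities and only exponentiate after subtracting the max.

```python
import math

def logsumexp(values):
    """Return log(sum(exp(v))) computed without underflow."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    # Shifting by the max makes the largest term exp(0) = 1.
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def responsibilities(x, weights, means, variances):
    """E-step for one point: posterior probability of each component."""
    log_joint = [
        math.log(w) + log_gaussian(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    log_norm = logsumexp(log_joint)
    return [math.exp(lj - log_norm) for lj in log_joint]

# A point far from both components: each density underflows to 0.0 in
# floating point, so the textbook formula would compute 0 / 0.
x = 100.0
print(math.exp(log_gaussian(x, 0.0, 1.0)))  # underflows to 0.0
print(responsibilities(x, [0.5, 0.5], [0.0, 1.0], [1.0, 1.0]))
```

With the textbook formula both densities are exactly 0.0 here, while the log space version still assigns essentially all the mass to the closer component.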

jesseab · 2026-04-25T03:41:02 1777088462

Remember watching Alec Radford's Theano tutorial and feeling like I had found literal gold.

alasdair_ · 2026-04-25T04:28:04 1777091284

I think your post may have more acronyms than any other post I have ever read on hn. Do you have a guide to which specific things you are talking about with each acronym? Deep Learning and Machine Learning are obvious but some of the others I can’t follow at all - they could be so many different things.

AgentMatt · 2026-04-25T07:21:21 1777101681

NN - neural networks
OSS DL frameworks - open source deep learning frameworks

PITA - pain in the ass

SVM - support vector machines
HMM - hidden Markov model
EM - expectation maximization
GMM - gaussian mixture model
HTK - hidden Markov model tool kit

ButlerianJihad · 2026-04-25T04:29:53 1777091393

I think he maintains pinball machines and jukeboxes for a chain of Greek restaurants

cdavid · 2026-04-26T08:12:07 1777191127

Fair, somebody else already clarified!