Hacker News | gojomo's comments

Plot these against the rate of similar attempts (or successes) over the past century if you want to convince others of anything other than your own subjective presentist perspective.

The Hortman murder was the first assassination of a sitting legislator at the state or federal level in my lifetime, which feels pretty significant.

I'm looking at https://en.wikipedia.org/wiki/List_of_assassinated_American_... and I stand by my "subjective perspective" as remaining pretty reasonable. Let me know what, specifically, you wanted me to plot.


The Wikipedia page is useful. And since you've identified the 2025 murder of MN Representative Hortman as the "first assassination of a sitting legislator at the state or federal level in my lifetime" – not counting the 2015 murder of SC Senator Pinckney – is it safe to assume you're a precociously-posting 10-year-old?

I was born in 1970; per your reference, there've been a bunch of state & federal legislators (or recently-former legislators) killed for political (or pseudo-political deranged) motives "in my lifetime" – and far more in the 1970s than in the last 10 years.

In my lifetime, one sitting President was shot at & missed (Ford in 1975), and one was shot at & hit by a ricochet (Reagan in 1981) – again, more in the past than the shots that grazed candidate Trump in 2024.

The Wikipedia-listed murders of other officeholders, like mayors or judges, are also more frequent in the past than recently – especially going before either of our lifetimes.

So trend impressions are very subject to frames of reference & familiarity with history.

I suspect if people in general had a deeper & broader sense of how common political violence has been, both in US history & worldwide, they'd be, on the one hand, less prone to panic over recent events & rhetoric (even though it is concerning), and, on the other hand, more appreciative of the relative peace of recent decades (even with the last few years' events).


> not counting the 2015 murder of SC Senator Pinckney

Fair enough. Not sure how I skipped over that one.

> So trend impressions are very subject to frames of reference & familiarity with history.

I don't disagree with this. But nonetheless, in my lifetime (< 30 years) I have mostly lived through only the "relative peace of recent decades", so the increase in political violence over the last few years is very scary.


Lessons are repeated until learned. And again, after those lessons are forgotten.

Not sure you can judge whether these modern models do well on the 'arithmetic analogization' task based on absolute similarity values – & especially L2 distances.

That it ever worked was simply that, among the universe of candidate answers, the right answer was closer to the arithmetic-result-point than other candidates – not necessarily close on any absolute scale. Especially in higher dimensions, everything gets very angularly far from everything else – the "curse of dimensionality".

But the relative differences may still be just as useful/effective. So the real evaluation of effectiveness can't be done with the raw value diff(king-man+woman, queen) alone. It needs to check if that value is less than that for every other alternative to 'queen'.

(Also: canonically these exercises were done as cosine-similarities, not Euclidean/L2 distance. Rank orders will be roughly the same if all vectors normalized to the unit sphere before arithmetic & comparisons, but if you didn't do that, it would also make these raw 'distance' values less meaningful for evaluating this particular effect. The L2 distance could be arbitrarily high for two vectors with 0.0 cosine-difference!)
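A minimal sketch of that ranking-based check, in numpy – `vecs` is a hypothetical dict of word -> vector, and `candidates` is whatever vocabulary you want to rank over (all names here are placeholders, not anyone's actual API):

    import numpy as np

    def unit(v):
        return v / np.linalg.norm(v)  # project onto the unit sphere

    def analogy_rank(vecs, a, b, c, candidates):
        # rank candidates by cosine-similarity to unit(b) - unit(a) + unit(c)
        target = unit(unit(vecs[b]) - unit(vecs[a]) + unit(vecs[c]))
        scored = [(w, float(np.dot(unit(vecs[w]), target)))
                  for w in candidates if w not in (a, b, c)]  # disqualify query words
        return sorted(scored, key=lambda s: -s[1])

    # The test isn't whether the 'queen' similarity is large in absolute terms,
    # only whether 'queen' outranks every other candidate:
    #   analogy_rank(vecs, 'man', 'king', 'woman', vocab)[0][0] == 'queen'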


> It needs to check if that value is less than that for every other alternative to 'queen'.

There you go: closest 3 words (by L2) to the output vector for the following models, out of the 2265 most common spoken English words (a list which also includes "queen"):

    voyage-3-large:             king (0.46), woman (0.47), young (0.52), ... queen (0.56)
    ollama-qwen3-embedding:4b:  king (0.68), queen (0.71), woman (0.81)
    text-embedding-3-large:     king (0.93), woman (1.08), queen (1.13)
All embeddings are normalized to unit length, so the L2 distances are directly comparable across models.


Thanks!

So of those 3, despite the superficially "large" distances, 2 of the 3 are just as good at this particular analogy as Google's 2013 word2vec vectors, in that 'queen' is the closest word to the target, when query-words ('king', 'woman', 'man') are disqualified by rule.

But also: to really mimic the original vector-math and comparison using L2 distances, I believe you might need to leave the word-vectors unnormalized before the 'king'-'man'+'woman' calculation – to reflect that the word-vectors' varied unnormalized magnitudes may have relevant translational impact – but then ensure the comparison of the target-vector to all candidates is between unit-vectors (so that L2 distances match the rank ordering of cosine-distances). Or, just copy the original `word2vec.c` code's cosine-similarity-based calculations exactly.
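For concreteness, a rough sketch of that procedure – raw vectors for the arithmetic, unit vectors only for the final comparison – again with `vecs` as a hypothetical word -> raw-vector dict:

    import numpy as np

    def unit(v):
        return v / np.linalg.norm(v)

    def analogy_raw_arithmetic(vecs, a, b, c, candidates):
        # arithmetic on the raw (unnormalized) vectors, so magnitudes can matter...
        target = unit(vecs[b] - vecs[a] + vecs[c])
        # ...but each comparison is unit-vector vs unit-vector, so L2 and cosine
        # give the same rank order (||u - v||^2 == 2 - 2*cos(u, v) on the unit sphere)
        return sorted((w for w in candidates if w not in (a, b, c)),
                      key=lambda w: np.linalg.norm(unit(vecs[w]) - target))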

Another wrinkle worth considering, for those who really care about this particular analogical-arithmetic exercise, is that some papers proposed simple changes that could make word2vec-era (shallow neural network) vectors better for that task, and the same tricks might give a lift to larger-model single-word vectors as well.

For example:

- Levy & Goldberg's "Linguistic Regularities in Sparse and Explicit Word Representations" (2014), suggesting a different vector-combination ("3CosMul")

- Mu, Bhat & Viswanath's "All-but-the-Top: Simple and Effective Postprocessing for Word Representations" (2017), suggesting recentering the space & removing some dominant components


Interesting papers, thanks.

> you might need to leave the word-vectors unnormalized before the 'king'-'man'+'woman' calculation – to reflect that the word-vectors' varied unnormalized magnitudes may have relevant translational impact

I believe translation should be scale-invariant, and scale should not affect rank ordering


> I believe translation should be scale-invariant, and scale should not affect rank ordering

I don't believe this is true with regard to the resulting angles after addition steps between vectors of varying magnitudes.

Imagine just in 2D: vector A at 90° & magnitude 1.0, vector B at 0° & magnitude 0.5, and vector B' at 0° but normalized to magnitude 1.0.

The vectors (A+B) and (A+B') will be at both different magnitudes and different directions.

Thus, cossim(A,(A+B')) will be notably less than cossim(A,(A+B)), and more generally, if imagining the whole unit circle as filled with candidate nearest-neighbors, (A+B) and (A+B') may have notably different ranked lists of cosine-similarity nearest-neighbors.
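Working that 2D example through numerically:

    import numpy as np

    def cossim(u, v):
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    A  = np.array([0.0, 1.0])   # 90 degrees, magnitude 1.0
    B  = np.array([0.5, 0.0])   # 0 degrees,  magnitude 0.5
    Bn = np.array([1.0, 0.0])   # 0 degrees,  normalized to magnitude 1.0

    print(cossim(A, A + B))    # ~0.894
    print(cossim(A, A + Bn))   # ~0.707 – a noticeably different direction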


You are totally right of course!

It had slipped my (tired) mind that vector magnitudes are actually discarded in embedding model training.


If by 'doc2vec' you mean the word2vec-like 'Paragraph Vectors' technique: even though that's a far simpler approach than the transformer embeddings, it usually works pretty well for coarse document similarity. Even the famous word2vec vector-addition operations kinda worked, as illustrated by some examples in the follow-up 'Paragraph Vector' paper in 2015: https://arxiv.org/abs/1507.07998

So if for you the resulting doc-to-doc similarities seemed nonsensical, there was likely some process error in model training or application.
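For reference, a bare-bones Paragraph Vectors run with gensim's Doc2Vec – the toy corpus and parameters below are placeholders, just to show the shape of a sane train/infer loop:

    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    # stand-in corpus; real use needs many documents & proper tokenization
    corpus = [TaggedDocument(words=text.lower().split(), tags=[i])
              for i, text in enumerate(["the cat sat on the mat",
                                        "dogs chase cats in the yard",
                                        "stock markets fell sharply today"])]

    model = Doc2Vec(corpus, vector_size=50, window=5, min_count=1, epochs=40)

    print(model.dv.most_similar(0))                     # similar trained doc-vectors
    inferred = model.infer_vector("a cat on a mat".split())
    print(model.dv.most_similar([inferred]))            # similar docs for unseen text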


When you describe a tax that is "paid annually and only if you win", that's plain generic income tax.

Those aren't the gambling-activity-specific taxes that Stoller's article discusses – which are typically applied to gambling businesses' revenues, not to bet winners specifically.


Huh? What is Loeb's 'captive audience'?


UFO nuts desperate for anyone with a shred of academic credibility.

Why would someone "with his standing at Harvard" be expected to avoid "wild, and public, hypothetical[s]"?

Does everyone at any prestigious institution have some duty to remain conventionally mundane in all their musings?

Is there any reason to think such hypotheticals are, on net, more harmful than helpful?

Isn't tenure (like Loeb's) designed to encourage a fearlessness around topics & speech?


What is Loeb pushing as fact with no evidence? Can you provide a representative quote?


You can easily answer this with a Google search.


No, I can't, because all I've seen from Loeb is pretty clear about what is fact, and what is speculation.

I don't know, and there's no way to find via "Google search", what HN user ~dylan604 is specifically alleging has been improperly "pushed as fact".

If it's clear to you, can you share a representative quote from Loeb? He's got a lot of writing to choose from!


WTF is 'SFP'?


Small Form-factor Pluggable, a common optics format for 1 to 25Gbps networks. See the Wikipedia page: https://en.wikipedia.org/wiki/Small_Form-factor_Pluggable


Yes, though it's possible a more-general core model, further enhanced with some other ways to bring those texts-of-interest into the working context, might perform better.

Those other ways to integrate the texts might be some form of RAG or other ideas like Apple's recent 'hierarchical memories' (https://arxiv.org/abs/2510.02375).
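A minimal sketch of the simplest such integration – embed the texts-of-interest, retrieve the nearest few to a query, and prepend them to the prompt. Every name here (`embed`, `generate`, `chunks`) is a placeholder for whatever embedding model and LLM are actually in use:

    import numpy as np

    def retrieve(query, chunks, embed, k=3):
        # embed() is assumed to return unit-length vectors, so dot == cosine
        q = embed(query)
        ranked = sorted(chunks, key=lambda c: -float(np.dot(embed(c), q)))
        return ranked[:k]

    def answer_with_context(query, chunks, embed, generate):
        context = "\n\n".join(retrieve(query, chunks, embed))
        prompt = f"Using only the context below, answer the question.\n\n{context}\n\nQ: {query}\nA:"
        return generate(prompt)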

