
> Sending a full-resolution screenshot every five seconds gets expensive fast.

For now.
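Back-of-envelope on that cost, with illustrative (assumed) token counts and pricing:

    # Hypothetical cost of feeding a screenshot every 5 s to a vision LLM.
    # Token and price figures below are illustrative assumptions, not quotes.
    tokens_per_image = 1500            # assumed vision-token cost per screenshot
    usd_per_mtok = 3.00                # assumed price per million input tokens
    shots_per_day = 24 * 3600 / 5      # one screenshot every five seconds
    daily_usd = shots_per_day * tokens_per_image * usd_per_mtok / 1e6
    print(f"{shots_per_day:.0f} shots/day -> ${daily_usd:.2f}/day per user")
    # -> 17280 shots/day -> $77.76/day per user: "expensive fast,"
    #    until prices fall.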


These are best understood as scheduled tasks for the AI instead of tasks for the user.


Yeah: the value of (good) OKRs is as proof of work for the process of aligning teams' vectors with business objectives and with each other.


This integration is far more limited and higher-friction. Whereas with search Apple has fully outsourced, and queries go straight to your third-party default, Siri escalates to GPT only for certain queries and only with one-off permission. They seem to be calculating that their cross-app context, custom silicon, and privacy branding give them a still-worthwhile shot at winning the Assistant War. I think this is reasonable, especially if open-source AI continues to keep pace with the frontier.


The trendline is definitely toward increasing dynamic routing, but I suspect MoE/MoD/MoDE do more to let models embed additional facts with less superposition in their weights than to enable deeper reasoning. Instead I expect deeper reasoning to come through token-wise rather than layer-wise dynamism -- e.g., this recent Quiet-STaR paper, in which the model emits throwaway rationale tokens before each prediction: https://arxiv.org/abs/2403.09629
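A minimal sketch of that token-wise idea, with a hypothetical next_token() sampler standing in for a real LM (the actual Quiet-STaR training objective and mixing head are more involved):

    import random

    VOCAB = ["the", "cat", "sat", "on", "mat"]

    def next_token(context):
        # Hypothetical stand-in for a real LM sampler; random here.
        return random.choice(VOCAB)

    def generate(prompt, n_visible=8, n_thought=4):
        context, visible = list(prompt), []
        while len(visible) < n_visible:
            # 1) Emit a throwaway rationale between <think> markers. These
            #    tokens condition the next prediction but are never shown.
            thought = ["<think>"]
            for _ in range(n_thought):
                thought.append(next_token(context + thought))
            thought.append("</think>")
            # 2) Predict the next visible token with the rationale in context,
            tok = next_token(context + thought)
            visible.append(tok)
            # 3) then drop the rationale; only the visible token persists.
            context.append(tok)
        return " ".join(visible)

    print(generate(["the"]))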


I think it's reasonable as an early-ish stage startup to say to a candidate that ~"eventually, with scale, there will be cool and impactful ML opportunities here" as long as you're realistic and upfront about the facts that ~"right now most of the impact is in simple but foundational analyses" and ~"there'll be some amount of fires to put out and rote work to automate".


Seems like permanent remote-optional is rapidly becoming the competitive equilibrium in tech.

As tooling improves, there is some point at which the gains from an expanded recruiting pool plus real-estate savings outweigh the coordination/culture costs of remote. Beyond this point it will be irrational to force physical presence (for many roles, at least) and there'll be no going back.


It will also drive down wages across the board long term.


Approximately once per week I observe a comment on HN saying approximately "search doesn't work anymore." Why is this? Why do I only observe such complaints here?


Lots of people here were, and remember being, "good at search": able to make exactly the thing they were looking for shoot into the top three results, and able to craft a series of searches such that, if they all failed, one could be pretty damn sure the information wasn't online or was at least hidden from the Google bot. Neither is really possible anymore. The 100th time one attempts to find something on Google that one knows is there but cannot get to show up, no matter how many rare keywords one uses, one concludes Google has lost some pretty significant utility it used to have, and starts to wonder what one is missing when one doesn't know the exact page one is trying to find.

(Of course it may be better for lots of other things, but for a fairly large set of "finding things on the Web" tasks it's far worse than it used to be, to the point of being nearly useless. In part I think this is because, sometime around 2008-2010, they stopped trying to fight webspammers, choosing instead to embrace some set of them provided they played by Google's rules, and to downrank hard anything that wasn't "well-behaved" webspam or a well-known site.)


I definitely remember a time when Google was much, much better at returning relevant results for technical content. My feeling is that there are (at least) two factors at play:

1) Google and other search engines revised their algorithms more than a decade ago. Instead of showing you results for what you searched for, they now show you what they _think you meant_ based on your search history and the search histories of millions of other users. This means searching for uncommon topics and phrases gets you useless results. And you can't disable this behavior, because it's baked into the core of how Google indexes and categorizes the content it slurps up from the web.

2) Blogspam authors and e-commerce sites have gotten SEO down to a science, to the point where searching for nearly _anything_ not super specific no longer gets you _information_ about that thing; it gets you blogspam articles filled with fluff and affiliate links, or half-broken e-commerce sites trying to sell you something vaguely related to what you searched for. This is not technically Google's fault, and there is a lot they could do to curb it, but all that ad revenue on those sites is how they earn that sweet, sweet lucre.


> This is not technically Google's fault, and there is a lot they could do to curb it, but all that ad revenue on those sites is how they earn that sweet, sweet lucre.

That had been going on for years, but there were clear times when Google got ahead of it and results would get much better for a while, and because search was so much more precise it was possible to work around the spam. That those good times stopped happening, and that results now carry a consistent, fairly high level of spam from content that's looked like the same sort of crap for years, leads me to conclude they stopped trying. IIRC, right around then they also dropped the "no no, our ads are different and good, they're just text and always formatted the same way so it's easy to tell what they are" stance and became just another banner-ad slinger.

[EDIT] Just mined Slashdot for that last bit; it looks like that happened around the last half of '07, which roughly checks out with my recollection of Google search abruptly getting much worse around '08-'09 and then never getting better again.


+1

As with so many things, incentives seem to be at the root of this problem: manufacturers want to sell more food, and adding sugar (whose addictive properties likely increase both consumption and retention) is an easy tactic to pursue.

What should be done to counter this? Tax sugar and it will just be replaced with some other unhealthy, manipulative ingredient. Perhaps only shifting preferences via culture and education (such that sweetened products sell relatively worse) can work?


Sure, not releasing the full trained model probably delays it, but sooner or later a bad actor will do their own scraping and train their own model and share it around and the genie will be out of the bottle. Then what?

I think we need to be conducting AI research (and building software generally) under the assumption that all of it will eventually be repurposed by bad actors. How would our practices be different if we consistently and cautiously did this?

Here's a thought experiment: how would the Manhattan project have been different if it were carried out in the open and its products were instantaneously and infinitely reproducible? What is the MAD equilibrium of AI research? I think the impact potential is similar even before AGI.


A lot of the advancements that made the Manhattan project were published by the Germans. On hearing about Hiroshima, Otto Hahn was "'shattered,' and went on to say that he felt 'personally responsible for the deaths of hundreds of thousands of people,' believing that his discovery had made the bomb possible." (https://www.washingtonpost.com/archive/opinions/1992/03/01/t...)


Wasn't that the point of this whole OpenAI thing? They didn't like the idea of there being a club with just Google in it that had the resources and funding to collect and train on massive datasets, so they were going to be the "bad actors" who would do their own scraping, train their own models, and share them around.

Isn't it supposed to be called OPENai?

They don't want to share the data because they don't want to throw away the edge they've gained by collecting it. :)

Computer programs that generate human-like text aren't dangerous; the internet is full of human-like text that is mostly bullshit anyway.


I have had the same impression regarding their work on Dota. They got a lot of publicity with it, but their work is not open at all. They have released neither the code that runs the bots in Dota 2, nor their training code, nor the final model. All we have is video recordings of a few games against humans.


> faint praise for soccer champion who apparently keeps winning games through poor performance.

I can’t disagree.


CommonCrawl already has an open, petabyte-scale dataset ready on AWS. Even if it didn't exist, scraping 80GB of data on AWS is trivial. I am surprised the authors considered this such a big deal. Also notice that performance is nowhere close to human. It sort of works, and it's astonishing that it does, but there's a long way to go before we have to fear weaponized text generation.
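For a sense of how low the barrier is, here's a minimal sketch of pulling listings from the public CommonCrawl S3 bucket with boto3; the crawl prefix is just an example:

    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    # The bucket is public; no AWS credentials are needed.
    s3 = boto3.client("s3", region_name="us-east-1",
                      config=Config(signature_version=UNSIGNED))

    # List a few objects from one crawl (example prefix).
    resp = s3.list_objects_v2(Bucket="commoncrawl",
                              Prefix="crawl-data/CC-MAIN-2019-09/",
                              MaxKeys=5)
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"])

    # Grab the manifest of WARC files for that crawl.
    s3.download_file("commoncrawl",
                     "crawl-data/CC-MAIN-2019-09/warc.paths.gz",
                     "warc.paths.gz")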


I think the big deal is the size of the model: BERT-large is ~340M params, and this one is 1.5B. BERT was trained on a pod of 64 TPUs, and this model requires an even larger GPU/TPU cluster. There is no way an indie, underfunded researcher can train such a model.
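Rough arithmetic on why (the 3x optimizer-state multiplier is an assumption about an Adam-style setup):

    # Rough training-memory estimate for a 1.5B-parameter model in fp32.
    params = 1.5e9
    bytes_per_param = 4                       # fp32
    weights_gb = params * bytes_per_param / 1e9
    # Adam-style optimizers roughly add gradients plus two moment buffers,
    # i.e. ~3x the weight memory again (activations not counted here).
    training_gb = weights_gb * (1 + 3)
    print(f"weights: {weights_gb:.0f} GB, training state: ~{training_gb:.0f} GB")
    # -> weights: 6 GB, training state: ~24 GB before activations --
    #    well beyond any single 2019-era consumer GPU.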

