More

beardedwizard · 2025-06-13T23:38:02 1749857882

The ninja party was not even great.

firefax · 2025-06-14T18:55:51 1749927351

>The ninja party was not even great.

I believe you.

By the time I was cool enough to have my pick of parties, I ended up having a massive panic attack during the crowd crush at the bar at some Rapid7 event and ending up pissing off the person who got me the ticket by leaving after 30 minutes to go buy a Manhattan at some side bar in the casino rather than wait in line 45 minutes for a beer, then spend another 45 minutes trying to wiggle away, only to start wiggling back.

beardedwizard · 2025-06-09T13:47:25 1749476845

Apple recently published a paper that seems to disagree and plainly states it's just pattern matching along with tests to prove it.

https://machinelearning.apple.com/research/illusion-of-think...

martindbp · 2025-06-10T05:48:10 1749534490

Anthropic has done much more in depth research actually introspecting the circuits: https://transformer-circuits.pub/2025/attribution-graphs/bio...

stuckinhell · 2025-06-09T14:10:22 1749478222

I'm having a hard time taking apple seriously, when they have don't even have a great llm.

https://www.techrepublic.com/article/news-anthropic-ceo-ai-i... Anthropic CEO: “We Do Not Understand How Our Own AI Creations Work”. I'm going to lean with Anthropic on this one.

beardedwizard · 2025-06-09T16:38:49 1749487129

I guess I prefer to look at empirical evidence over feelings and arbitrary statements. AI ceos are notoriously full of crap and make statements with perverse financial incentives.

otabdeveloper4 · 2025-06-09T14:13:29 1749478409

> I have a hard time taking your claim about rotten eggs seriously when you're not even a chicken.

beardedwizard · 2025-06-06T05:13:22 1749186802

That's really not true. Context is one strategy to keep a models output constrained, and tool calling allows dynamic updates to context. Mcp is a convenience layer around tool calls and the systems they integrate with

beardedwizard · 2025-05-30T15:37:40 1748619460

I think you are drawing the wrong conclusion - users cannot be mindful of clicks, we should live in a world where click is assumed and go from there.

cmeacham98 · 2025-05-30T17:16:19 1748625379

> users cannot be mindful of clicks

Why not?

esseph · 2025-05-30T18:36:09 1748630169

https://gist.github.com/StevenACoffman/a5f6f682d94e38ed80418...

https://daniel.haxx.se/blog/2025/05/16/detecting-malicious-u...

Because deception is easy

beardedwizard · 2025-05-24T16:42:44 1748104964

A master class on how to say exactly nothing.

beardedwizard · 2025-05-22T15:35:03 1747928103

This is the primary failure of data platforms from my perspective. You need too many 3rd parties/partners to actually get anything done with your data and costs become unbearable.

beardedwizard · 2025-05-12T16:31:07 1747067467

The bummer about lots of supply chain work is that it does not address the attacks we see in the wild like xz where malicious code was added at the source, and attested all the way through.

There are gains to be had through these approaches, like inventory, but nobody has a good approach to stopping malicious code entering the ecosystem through the front door and attackers find this much easier than tampering with artifacts after the fact.

kuruczgy · 2025-05-12T18:20:21 1747074021

Actually this is not quite true, in the xz hack part of the malicious code was in generated files only present in the release tarball.

When I personally package stuff using Nix, I go out of my way to build everything from source as much as possible. E.g. if some repo contains checked in generated files, I prefer to delete and regenerate them. It's nice that Nix makes adding extra build steps like this easy. I think most of the time the motivation for having generated files in repos (or release tarballs) is the limitations of various build systems.

XorNot · 2025-05-12T21:40:52 1747086052

The xz attack did hit nix though. The problem is no one is inspecting the source code. Which is still true with nix, because everyone writes auto bump scripts for their projects).

If anyone was serious about this issue, we'd see way more focus on code signing and trust systems which are transferable: i.e. GitHub has no provision to let anyone sign specific lines of a diff or a file to say "I am staking my reputation that I inspected this with my own eyeballs".

zelphirkalt · 2025-05-13T02:02:27 1747101747

Is it really stacking ones reputation? Think about it: If everyone is doing it all the time, an overlooked something is quickly dismissed as a mistake that is bound to happen sooner or later. Person X is reviewing so much code and does such a great job usually, but now they overlooked that one thing. And they even admitted their mistake. Surely they are not bad.

I think it would quickly fade out. What are we going to do, if even some organization for professional code reviews signs off the code but after 5y in the business they make 1 mistake? Are we no longer going to trust them from that day on?

I think besides signing code, there need to be multiple pairs of eyeballs looking at it independently. And even then nothing is really safe. People get lazy all the time. Someone else surely has already properly reviewed this code. Let's just sign it and move on! Management is breathing down our necks and we gotta hit those KPI improvements ... besides, I gotta pick up the kids a bit earlier today ...

Don't let perfect be the enemy of good. There is surely some benefit, but one can probably never be 100% sure, unless one goes into mathematical proofs and understands them oneself.

0xDEAFBEAD · 2025-05-13T05:53:27 1747115607

It's unlikely that multiple highly-regarded reviewers would all make the same mistake simultaneously (unless all their dev machines got compromised).

Ultimately it's about making the attacker's life difficult. You want to raise the cost of planting these vulnerabilities, so attackers can pull it off once every few decades, instead of once every few years.

jrockway · 2025-05-12T23:30:51 1747092651

Yeah, the more I read through actual package definitions in nixpkgs, the more questions I have about selling this as some security thing. nixpkgs is very convenient, I'll give it that. But a lot of packages have A LOT of unreviewed (by upstream) patches applied to them at nix build time. This burned Debian once, so I expect it to burn nixpkgs someday too. It's inevitable.

I do think reproducible builds is important. It lets people that DO review the source code trust upstream binaries, which is often convenient. I made this work at my last job... if you "bazel build //oci/whatever-image" you end up with a docker manifest that has the same sha256 as what we pushed to Docker Hub. You can then read all the code and know that at least that's the code you're running in production. It's neat, but it's only one piece of the security puzzle.

p1necone · 2025-05-12T23:15:18 1747091718

(Effectively) nobody will ever be serious about this issue unless it were somehow mandated for everyone. Anyone who was serious about it would take 3x as long to develop anything compared to their competitors, which is not a viable option.

0xDEAFBEAD · 2025-05-13T05:56:00 1747115760

Yeah ultimately it's a public goods problem.

I wonder if a "dominant assurance contract" could solve this: https://foresight.org/summary/dominant-assurance-contracts-a...

spookie · 2025-05-13T04:30:03 1747110603

This is why distros with long release cycles are better. Usually more time for eyeballs to parse things.

Take Debian for example, the commit never made it to stable.

transpute · 2025-05-12T22:20:05 1747088405

> provision to let anyone sign specific lines of a diff

Good idea that should be implemented by git itself, for use by any software forge like github, gitlab, codeberg, etc.

https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work

justinrubek · 2025-05-13T13:38:24 1747143504

While the code itself did get to nix, the exploit was not functional specifically due to how nix works. That doesn't mean that a more sophisticated attack couldn't succeed though. It was mostly luck that kept it from affecting NixOS.

throwawayqqq11 · 2025-05-12T19:19:52 1747077592

Your preference to compile your backdoors does not really fix the problem of malicious code supply.

I have this vague idea to fingerprint the relevant AST down to all syscalls and store it in a lock file to have a better chance of detection. But this isnt a true fix either.

kuruczgy · 2025-05-12T20:01:08 1747080068

Yes you are right, what I am proposing is not a solution by itself, it's just a way to be reasonably confident that _if you audit the code_, that's going to be the actual logic running on your computer.

(I don't get the value of your AST checksumming idea over just checksumming the source text, which is what almost all distro packages do. I think the number of changes that change the code but not the AST are negligible. If the code (and AST) is changed, you have to audit the diff no matter what.)

The more interesting question that does not have a single good answer is how to do the auditing. In almost all cases right now the only metric you have is "how much you trust upstream", in very few cases is actually reading through all the code and changes viable. I like to look at how upstream does their auditing of changes, e.g. how they do code review and how clean is their VCS history (so that _if_ you discover something fishy in the code, there is a clean audit trail of where that piece of code came from).

vasco · 2025-05-13T04:16:25 1747109785

> it's just a way to be reasonably confident that _if you audit the code_

Why do we pretend this is easy many times in conversation about dependencies? It's as if security bugs in dependencies were calling out at us, like a house inspector looking at a huge hole on the floor of the house. But it's not like that at all, most people would inspect 99.9% of CVEs and read the vulnerable code and accept it. As did the reviewers in the open-source project, who know that codebase much more than someone who's adding a dependency because they want to do X faster. And they missed it or the CVE wouldn't be there, but somehow a random dev looking at it for the first time will find it?

In fact, if to use dependencies I have to read and understand the code and validate it, the number of dependencies I'd use would go to zero. And many things I would be locked out of doing, because I'm too dumb to understand them, so I can't audit the code, which means I'm definitely too dumb to replicate the library myself.

Asking people to audit the code in hopes of finding a security bug is a big crapshoot. The industry needs better tools.

SOLAR_FIELDS · 2025-05-13T06:59:31 1747119571

This makes perfect sense on a beefy super powered dev laptop with the disk space upgrade on an unsaturated symmetrical gig connection.

I’m only exaggerating a little bit here. Nix purism is for those who can afford the machines to utilize it. Doing the same on old hardware is so slow it’s basically nontenable

justinrubek · 2025-05-13T13:36:42 1747143402

This shows only a surface level understanding of what nix provides here.

One of the biggest benefits is the binary cache mechanism, which allows you to skip building something but have the effective result of the build pulled from the cache. It's classical distributions that make building from source only possible for those who can afford the infrastructure, nix is what enables the rest of us to do so.

zeec123 · 2025-05-13T07:29:17 1747121357

The nix cache exists for a reason.

turboponyy · 2025-05-13T11:25:23 1747135523

Glossing over some details, the build artifact and build definition are equivalent in Nix. If you know the build definition, you can pull the artifact from the cache and be assured that you have the same result.

0xDEAFBEAD · 2025-05-13T05:50:05 1747115405

>When I personally package stuff using Nix, I go out of my way to build everything from source as much as possible. E.g. if some repo contains checked in generated files, I prefer to delete and regenerate them. It's nice that Nix makes adding extra build steps like this easy. I think most of the time the motivation for having generated files in repos (or release tarballs) is the limitations of various build systems.

You know what would be really sweet?

Imagine if every time a user opted to build themselves from source, a build report was by default generated and sent to a server alongside the resulting hashes etc. And a diff report gets printed to your console.

So not only are builds reproducible, they're continuously being reproduced and monitored around the world, in the background.

Even absent reproducibility, this could be a useful way to collect distribution data on various hashes, esp. in combination w/ system config info, to make targeted attacks more difficult.

yencabulator · 2025-05-12T16:38:14 1747067894

I think a big part of the push is just being able to easily & conclusively answer "are we vulnerable or not" when a new attack is discovered. Exhaustive inventory already is huge.

tough · 2025-05-12T17:46:21 1747071981

i read somewhere go has a great package for this that checks statically typed usage of the vuln specific functions not whole package deps

yencabulator · 2025-05-12T18:01:29 1747072889

https://pkg.go.dev/golang.org/x/vuln/cmd/govulncheck

tough · 2025-05-12T19:14:30 1747077270

ty ty exactly what I was thinking

does something like this exist for other languages like rust, python or js?

yencabulator · 2025-05-12T19:50:22 1747079422

I don't think the Rust ecosystem has that at this time. They're annotating the vulnerabilities with affected functions, but as far as I know nobody's written the static analysis side of it.

https://github.com/rustsec/rustsec/issues/21

Python and JS might be so dynamic that such static analysis just isn't as useful.

dwattttt · 2025-05-12T19:51:14 1747079474

For Rust, the advisory database cargo-audit uses (https://github.com/RustSec/advisory-db/) does track which functions are affected by a cve (if provided). I'm not sure if the tool uses them though.

XiZhao · 2025-05-12T16:39:34 1747067974

I run a sw supply chain company (fossa.com) -- agree that there's a lot of low hanging gains like inventory still around. There is a shocking amount of very basic but invisible surface area that leads to downstream attack vectors.

From a company's PoV -- I think you'd have to just assume all 3rd party code is popped and install some kind of control step given that assumption. I like the idea of reviewing all 3rd party code as if its your own which is now possible with some scalable code review tools.

nyrikki · 2025-05-12T18:52:57 1747075977

Those projects seem to devolve into a boil the ocean style projects and tend to be viewed as intractable and thus ignorable.

In the days everything was http I use to set a proxy variable and have the proxy save all downloaded assets to compair later, today I would probably blacklist the public CAs and do an intercept, just for the data of what is grabbing what.

Fedramp was defunded and is moving forward with a GOA style agile model. If you have the resources I would highly encourage you to participate in conversations.

The timelines are tight and they are trying to move fast, so look into their GitHub discussions and see if you can move it forward.

There is a chance to make real changes but they need feedback now.

https://github.com/FedRAMP

beardedwizard · 2025-05-13T15:04:51 1747148691

+1, I think you have to assume owned as well and start defending from there. Companies like edera are betting on that, but sandbox isn't panacea, you really need some way to know expected behavior.

TZubiri · 2025-05-13T00:09:48 1747094988

When you have so many dependencies that you need to create complex systems to manage and "secure" the dependencies. The problem is that you have too many dependencies, you are relying on too many volunteer work, and you are demanding too many features while paying for too little.

The professional solution is to PAY for your Operating System, and rely on them to secure it. Whether it be to Microsoft or to Red Hat. You KNOW it's the right thing to do, and this is overintellectualizing your need to have a gratis operating system and charge non-gratis prices to your clients in turn.

lmm · 2025-05-13T06:55:10 1747119310

How does that solve the problem? Both Microsoft and IBM/Red Hat have shipped backdoored code in the past and will no doubt do so again. At most you might be able to sue them for a refund of what you paid them, at which point you're no better off than if you'd used a free system from the start.

justinrubek · 2025-05-13T13:39:45 1747143585

I couldn't disagree more with everything you've said.

beardedwizard · 2025-05-07T05:38:32 1746596312

But this is the solution the most cutting edge llm research has yielded, how do you explain that? Are they just willfully ignorant at OpenAI and anthropic? If fine tuning is the answer why aren't the best doing it?

mcintyre1994 · 2025-05-07T09:32:39 1746610359

I'd guess the benefit is that it's quicker/easier to experiment with the prompt? Claude has prompt caching, I'm not sure how efficient that is but they offer a discount on requests that make use of it. So it might be that that's efficient enough that it's worth the tradeoff for them?

Also I don't think much of this prompt is used in the API, and a bunch of it is enabling specific UI features like Artifacts. So if they re-use the same model for the API (I'm guessing they do but I don't know) then I guess they're limited in terms of fine tuning.

int_19h · 2025-05-07T20:16:27 1746648987

Prompt caching is functionally identical to snapshotting the model after it processed the prompt. And you need the KV cache for inference in any case so it doesn't even cost extra memory to keep it around, if every single inference task is going to have the same prompt suffix.

beardedwizard · 2025-05-07T00:08:41 1746576521

But copilot is bundled and is free, and it's still losing to cursor

lolinder · 2025-05-07T00:15:53 1746576953

Define losing? My company pays for Copilot but not for Cursor, and it's not at all clear to me that we're the exception rather than the norm. What numbers and data are you working with?

sdwr · 2025-05-07T02:46:14 1746585974

Copilot has every incumbent advantage, so if Cursor is doing halfway decent in the market (which it is), it's winning by default

lolinder · 2025-05-07T03:17:42 1746587862

That's not actually how unseating an incumbent works. The incumbent can adapt to the threat for quite a while if they act on it, they just have to not be Blockbuster. Copilot is showing every sign of making up ground feature-wise, which is bad news for the runners up.

walthamstow · 2025-05-07T14:19:21 1746627561

Incumbent advantage of being in VS Code already? Thing is, Cursor is basically just VS Code, there's hardly any barrier to switching, so it's quite a weak advantage.

milkshakes · 2025-05-07T20:16:18 1746648978

the incumbent advantage is the default distribution.

defaults matter

starfezzy · 2025-05-07T02:53:51 1746586431

In brand velocity maybe, but copilot is rapidly reaching feature parity with cursor and will invariably overtake it—while costing less to users.

Same with Google vs OpenAI. I tend to agree with the sentiment that I most frequently hear which is that OpenAI is the currently popular brand, but that can only carry them so far against what will eventually be a better offering for cheaper.

beardedwizard · 2025-05-06T13:27:28 1746538048

GitHub has been failing upward for more than 5 years. They could have totally dominated software development and security - failed. Could have been the undisputed champion of code hosting - failed. Should have dominated development co-pilots - failed.

I actually find it a little reassuring that they can't seem to get out of their own way.

stevage · 2025-05-06T13:34:35 1746538475

They're not the champion of code hosting?

beardedwizard · 2025-05-06T16:43:17 1746549797

It's a close call - I make this based on the fact that GitHub is viewed as an anti-choice by some in the community, a huge change from the "you don't use GitHub?!?!" energy they had pre-acquisition.

The MS acquisition traded the developer community to briefly appeal to enterprises, then quickly let both down.

ctkhn · 2025-05-06T17:45:55 1746553555

Both the startups I worked at and the mega corps are all on github or moving there from bitbucket. They are in a bit of autpilot mode in terms of useful new features aside from actions but I can't think of any new bitbucket feature since I graduated and started working.

beardedwizard · 2025-05-06T23:00:37 1746572437

Bitbucket is not a player, as you said there are only people leaving. Gitlab has a better enterprise posture than GitHub and can be deployed more securely. Most developers aren't unhappy with GitHub, but IT and security teams are.

kyawzazaw · 2025-05-06T22:04:59 1746569099

i concur

stevage · 2025-05-08T01:18:29 1746667109

I dunno, for me Github is better than it was pre-acquisition. Sure, the rate of improvement has slowed a lot, but they did fix some old annoyances. But come to think of it, I can't really think of any ways that it has enshittified. I don't use any of the CI/Actions stuff though.

MassiveQuasar · 2025-05-06T13:37:40 1746538660

They were before they got acquired by Microsoft.

The fact that they are is not the results of the Microsoft takeover.

stevage · 2025-05-06T14:14:41 1746540881

Then I don't understand the inclusion in the list above.

sofixa · 2025-05-06T14:09:46 1746540586

To be fair, they have been behind the competition for many years. Gitlab had extremely good CI, security scanning, organisational concepts, etc. for years before GitHub introduced their ones (and Actions still has a worse UX, and GitHub still doesn't have anything below an organisation).

no_wizard · 2025-05-06T18:16:51 1746555411

GitLab UI is inferior IMO, and I've used both quite extensively.

I don't like that GitLab lets you nest organizations and such, it makes it so painful to find things over time. I appreciate GitHub doesn't do this, I view it as a plus

I also disagree about GitLab CI, not that it wasn't smart for them to include alot sooner than GitHub, but Actions is really good and really easy to get up and moving with. I find they run faster, have better features - like they can annotate a PR with lint errors and test failures - with very little comparative configuration.

GitLab CI yaml is a mess by comparison. GitHub was smart to push things to the runner level once a certain complexity threshold is hit.

This has been my experience of course, and so much of it is really subjective admittedly, but I don't think GitLab is truly ahead at this point.

sofixa · 2025-05-07T09:51:36 1746611496

> I don't like that GitLab lets you nest organizations and such, it makes it so painful to find things over time. I appreciate GitHub doesn't do this, I view it as a plus

Nah, I hate that. At my job we have a few different orgs, with terrible SSO boundaries (having to auth multiple times to GitHub because I work on repositories from different GitHub orgs). Allowing you to have a proper structure with nestedness, while still having good search, is great. You can also easily move projects and namespaces around, so if the structure doesn't work, it can evolve.

Why would you have the 50 library repositories you've had to fork as top level projects polluting your org? You also can't really do shared variable, environment, CI configs between repos of the same project/type.

mdaniel · 2025-05-06T14:58:39 1746543519

And it being open core (MIT) means spinning up a version to test something is incredibly easy. Not exactly resource cheap, as it's still a rails app with multiple servers "smuggled" in the docker image, but it is easy

And I have long held that they are hungry, shipping like clockwork on or about the 20th of every month, showing up with actual improvements all the time https://about.gitlab.com/releases/ It seems this month brings 18.0 with it, for whatever that version bump happens to include

They also have a pretty good track record of "liberating" some premium features into the MIT side of things; I think it's luck of the draw, but it's not zero and it doesn't seem to be tied to any underhanded reason that I can spot

beardedwizard · 2025-05-06T16:44:32 1746549872

Why gitlab hasn't been able to capitalize on GitHub's many failures is almost as interesting as GitHub's fall.

I think the GitHub brand is still stronger and people just don't "care" about gitlab.

mdaniel · 2025-05-07T02:09:18 1746583758

Yeah, it's almost certainly the network effect. Although poor GitLab isn't doing themselves any favors by picking what seems to be the slowest web framework one can possibly imagine

But, anytime I am empowered to pick, I'm going to pick GitLab 100% of the time because it has every feature that I care about and "being popular" isn't a feature that I care about

twodave · 2025-05-06T18:00:33 1746554433

Well you’re right (especially wrt things like security scanning), but you sort of have to include Azure DevOps in the conversation nowadays. I think the end goal for Microsoft is to get the larger organizations into ADO, either cross-pollinate pipelines and actions or just replace actions with pipelines at some point, and leave GitHub for simpler project structures and public codebases.

That’s why you won’t see a ton of work go into e.g. issues/projects on GitHub. Those features all already exist and are very robust in ADO, so if you need those kinds of things (and the reporting an enterprise would want to be able to run on that data), then you belong on ADO.

filmgirlcw · 2025-05-06T19:59:01 1746561541

I can say with a high level of confidence that the goal is definitely not to push larger orgs to ADO over GitHub. ADO is and will continue to be supported and you’re right that its project management features are much more advanced than GitHub, but the mission is not to push people off of ADO and into GitHub.

twodave · 2025-05-07T19:24:40 1746645880

Your opening and closing statements aren’t mutually exclusive, but I can’t tell if one is a typo (or if so, which one it is).

I didn’t mean to imply that MS wanted to migrate anyone, just that the different offerings serve different kinds of customers, so you can’t really just compare GitLab to GitHub and say MS is lacking in serving some group of them.

filmgirlcw · 2025-05-08T16:10:47 1746720647

Yeah I had a typo -- the statement should have been the mission is to push people from ADO to GitHub -- sorry.

The official guidance from Microsoft since probably 2019 has been to encourage all greenfield projects to GitHub, as opposed to ADO.