That's fair. He's a prominent figure on Twitter who posts a lot of charts and graphs on different and often benign subjects which get shared around a lot. It's not until you follow him that you realize his core obsession is promoting ideas about race and IQ in broad strokes. For example, singling out ethnic groups as intellectually inferior or more likely to commit crime. He has a history of posting misleading or falsified charts or data if they're convenient to his agenda.
If you go deeper, his old Reddit account under a pseudonym was discovered to be a little more mask-off than his Twitter personality. From Wikipedia:
> between 2014 and 2016 Lasker had made many anti-Semitic and racist posts on Reddit under the pseudonym Faliceer.[7] In 2016, the account Faliceer self-identified as a "Jewish White Supremacist Nazi". He also wished Adolf Hitler a happy birthday, promoted eugenics and attacked interracial relationships.
I don't know about "falsified results", that's a pretty strong phrasing, but here's an interesting story about research Lasker presented on Twitter, by David Bessis:
None of you are evidencing your claims that he's unreliable. You just keep repeating it, even after being asked for specifics that other people cannot find.
It seems more likely from these posts that Lasker is actually accurate but you can't handle true claims about IQ and race, so you have unilaterally decided those must be fake because they undermine a left-wing world view. But group differences in IQ are a very strong and well replicated result. There is nothing unreliable about those. Until you bring receipts, we have to assume you are engaging in ad hominem for ideological reasons.
If only we lived in an era of instantly-available information easily accessed through ubiquitous supercomputers that we have in our homes and in our pockets.
Systems are self-perpetuating entities that exist independently of the purpose any one person places on them, and all the different people inside and outside of a system place different purposes on it.
That phrase the article is complaining about means that talking about a system's purpose is meaningless babble. Only people have purposes; systems don't.
That phrase has an author, who has made it patently clear what he means by it.
The fact that the article's author spends the entire time reacting to random people on Twitter instead of going after the author, and even dismisses the one person pointing to the original meaning with a non-answer, doesn't make anything better.
I might be mistaken, but wouldn't a git signature already be signing trusted things (i.e. the person making the original signature is trusted), making any attack enabled by the input hash function a second preimage attack (i.e. an attacker only knows the trusted input, not anything private like the signing key)?
Hash collisions mean you can't trust signatures from _untrusted_ sources, but git signatures don't seem to fit that situation.
As you pointed out, signatures make content trusted, but only to the degree of the algorithm's attack resistance. I think it's also important to define trust; for our purposes this means: authenticity (the signer deliberately signed the input) and integrity (the input wasn't tampered with).
If an algorithm is collision resistant a signature guarantees both authenticity and integrity. If it's just second preimage resistant, signing may only guarantee authenticity.
Now, the issue with Git using SHA-1 is that an attacker may submit a new patch to a project, rather than attack an existing commit. In that case they are in control of both halves of the collision, and they just need the benign half to be useful enough to get merged.
Any future commits with the file untouched would allow the attacker to swap it for their malicious half, while claiming integrity thanks to the maintainers' signatures. They could do this by either breaching the official server or setting up a mirror.
One interesting thing to note though: in the case of human readable input such as source code, this attack breaks down as soon as you verify the repo contents. Therefore it's only feasible longer term when using binary or obfuscated formats.
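To make the swap concrete, here's a toy sketch in Python. It uses a deliberately truncated hash (16 bits of SHA-1) as a stand-in for a broken one, so a collision is findable in milliseconds; real chosen-prefix SHA-1 collisions take enormous effort, but the resulting swap works the same way: anyone who only checks the hash can't tell the two halves apart.

```python
import hashlib
from itertools import count

def weak_hash(data: bytes) -> str:
    # Deliberately weakened to 16 bits so a birthday collision is cheap.
    return hashlib.sha1(data).hexdigest()[:4]

def find_collision():
    """Brute-force two distinct inputs with the same weak hash."""
    seen = {}
    for i in count():
        msg = b"patch-variant-%d" % i
        h = weak_hash(msg)
        if h in seen:
            return seen[h], msg  # the "benign" and "malicious" halves
        seen[h] = msg

benign, malicious = find_collision()
assert benign != malicious
assert weak_hash(benign) == weak_hash(malicious)
```

With a real hash the attacker would embed the colliding difference in a binary or obfuscated file, get the benign half merged, and later serve the malicious half from a mirror; the object hashes (and any signatures over them) match either half.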
- SSH3 is a bad name: this isn't a successor to SSHv2 and will only cause confusion
- The authors don't seem to understand that SSHv2 predates all of their chosen technologies, and provides "robust and time-tested mechanisms" they claim to be adding
- How is "hiding your server behind a secret link" a feature? This is, at best, security through obscurity, which can be layered on any network protocol (e.g. https://en.wikipedia.org/wiki/Port_knocking); this implies that the authors don't have much of a security background...?
I concur. They seem to have reinvented a part of the protocol without actually addressing many of the issues of SSH. The paper also doesn't bother to go into detail on any of the advancements that have been made to SSH since the original RFC, such as keyboard-interactive, GSSAPI, etc.
> Some SSH implementations such as OpenSSH or Tectia support other ways to authenticate users. Among them is the certificate-based user authentication: only users in possession of a certificate signed by a trusted certificate authority (CA) can gain access to the remote server [12]. Available for more than 10 years, this authentication method requires setting up a CA and distributing the certificates to new users and is still not commonly used nowadays.
Somebody had an agenda to make SSH look as bad as possible. You can implement OIDC authentication with keyboard-interactive, no need for HTTP/3 for that. However, it gets very tricky if you want automated / script access, so it doesn't solve the authentication problem.
As an aside, Tatu Ylonen, the original author of the SSH protocol, published a paper in 2019 titled "SSH Key Management Challenges and Requirements"[1], which is an interesting read. It would seem the authors of this paper should have at least read it.
> This is, at best, security through obscurity, which can be layered on any network protocol (e.g. https://en.wikipedia.org/wiki/Port_knocking); this implies that the authors don't have much of a security background...?
This isn't security through obscurity. The url would be a secret. This is a form of capability security, where to connect to the server you must be able to name the server.
A URL with a secret is, in my opinion, far more sane than port knocking, and will be much more efficient as well.
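A minimal sketch of that capability check, assuming the server just compares the request path against an unguessable secret path (the names here are illustrative, not from the SSH3 codebase):

```python
import hmac
import secrets

# The capability: an unguessable path segment acts as the bearer secret.
SECRET_PATH = "/" + secrets.token_urlsafe(32)

def authorized(request_path: str) -> bool:
    # Constant-time comparison, so response timing doesn't leak the secret.
    return hmac.compare_digest(request_path, SECRET_PATH)
```

Unlike port knocking, this rides on ordinary HTTPS traffic, so an unauthorized prober just sees a 404 from what looks like any other web server.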
Your points are great, but SSH is extensible, so OpenID Connect support doesn't mean much since you can do it with existing SSH.
"Security by obscurity" is only a thing if you're relying on that mechanism for security. People already configure SSH port knocking as you noted. It can be considered attack surface reduction and is a good feature given they're not using a secret link for any security control.
One benefit of their approach might be that you can use TLS PKI instead of setting up SSH CAs. Potentially you would need to manage less PKI.
But a criticism I have is that HTTP has many more vulns and new attack techniques being developed all the time, unlike SSH. I can imagine LFI or request smuggling on the same HTTP/2 web server causing RCE via their protocol.
I'd agree with you. The readme calls out "Significantly faster session establishment" and goes into greater detail later on.
> Establishing a new session with SSHv2 can take 5 to 7 network round-trip times, which can easily be noticed by the user. SSH3 only needs 3 round-trip times. The keystroke latency in a running session is unchanged.
I, for one, can say that sometimes session establishment can take a little while but not to the extent that it would be a selling point (so to speak) for me to adopt SSH3.
So if you want to execute uptime on a remote machine, the session will only be open for a few ms, and those extra RTTs are a problem. (Yes, I know about OpenSSH ControlMaster...)
SSH over HTTP/(url) is a killer feature if you're working on hostile networks that block SSH and go even as far as to try and detect the protocol over the wire.
Git isn't relying on collision-resistance, it's relying on second-preimage[0] resistance, which is to say: in order to sneak a hash collision in to a git repository, you have to sneak _something else_ that's already trusted (e.g. via code review) into the repository; collisions can't (yet) be generated for arbitrary hashes.
I haven't heard of any second-preimage attacks against MD5, much less SHA-1, so mlindner was correct in asserting that MD5 would be fine (assuming 128 bits are enough). See also the analysis in [1].
More to the point, if you're able to sneak something into a repository in the first place (e.g. a benign file that generates a collision with a malicious file), then you're probably able to sneak in something more directly (e.g. [2]) that won't rely on both getting something in a trusted repository and then cloning from a different, untrusted source.
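For concreteness, this is what actually gets hashed (and therefore what a signature ultimately pins down): git hashes each object as `"<type> <size>\0<content>"`, and a signed commit covers a tree of these object hashes rather than the raw file bytes. A small Python sketch of the blob case:

```python
import hashlib

def git_blob_hash(content: bytes) -> str:
    # Git's object hash: SHA-1 over a "blob <size>\0" header plus the content.
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The well-known hash of the empty blob:
print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

A second-preimage attack would need to produce a *different* content with this exact hash, which is far harder than finding a collision between two attacker-chosen inputs.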
> if you're able to sneak something into a repository in the first place (e.g. a benign file that generates a collision with a malicious file), then you're probably able to sneak in something more directly
Could you imagine using an implementation of TLS that "probably" authenticated your network traffic though? I think there are two separate reasons we prefer to make strong guarantees in cryptography:
1. That's often really what I need. If I'm downloading e.g. software updates over the network, I really need those to be authentic.
2. Even when I arguably don't need strong authenticity, like just reading some news articles, I want to use the same strong tools, because I don't want to have to study and understand (much less teach) the situations where some weaker tool fails. Inevitably I'll get that wrong or just forget, and I'll end up using the weak tool in some case where I should've used the strong one.
In this case, if I imagine teaching how commit signing works with a weak hash function, it sounds like "Signing commits means that no one can sneak malicious content into your repository, unless they first steal your secret signing key, or else you ever committed (or allowed anyone else to commit) a non-text file that they created." Actually writing that second part out makes it feel really bad to me.
> "Signing commits means that no one can sneak malicious content into your repository
Signing commits does not mean that even when using a cryptographically secure hash function. All it means is that you put your signature over a particular state of the repo (and, by extension, its parent states). It has nothing to do with preventing "sneaking things in" - although it could be a (small) part of the whole set of measures taken to prevent someone from doing that.
> All it means is that you put your signature over a particular state of the repo (and, by extensions, its parent states).
That's technically true. Though in practice I think the implied social contract is that signing of a commit means you signal some kind of approval for the diff between the signed commit and its immediate predecessor(s).
I'm not 100% sure I understand your point, but it sounds like you're concerned about signing something using a weak hash function (i.e. where the hash of something is what actually gets signed)?
If that's the case, then my point is pretty simple: yes, SHA-1 is broken for signing untrusted input (due to weak collision resistance), but it is not broken (so far) for signing trusted input (due to strong preimage resistance).
My point earlier was primarily that the contents of a repository are generally trusted (via mechanisms like code review), and signing trusted content still works even with SHA-1.
Note that certificate signing vulnerabilities (which I assume is why TLS was mentioned?) usually rely on a malicious actor presenting one certificate and then presenting a different cert later; they can't arbitrarily fake existing certs from somebody else.
The analogous scenario for git repositories would be to have a malicious actor make a commit (or blob, tree, etc.) that could be swapped out for another. But if you already have malicious actors able to make commits in your repository, then the hash function doesn't matter: they can cause damage in many, many other ways.
> The analogous scenario for git repositories would be to have a malicious actor make a commit (or blob, tree, etc.) that could be swapped out for another. But if you already have malicious actors able to make commits in your repository, then the hash function doesn't matter: they can cause damage in many, many other ways.
The malicious actor can pose as a good-faith contributor and submit Pull Requests to your repository.
You review the code in the PR, and perhaps even prove it correct. Later on, the malicious actor can do the swapping trick. (Eg by running a mirroring service for your repository.)
> You review the code in the PR, and perhaps even prove it correct. Later on, the malicious actor can do the swapping trick. (Eg by running a mirroring service for your repository.)
Having a copy of code that is reviewable and then searching for a malicious collision is a preimage attack; extending two chosen prefixes (e.g. one "valid" and one "malicious") until they meet at a hash collision is how most practical (?) collision attacks work. The latter scenario produces large junk sections in the results, which should be obvious under even mild scrutiny.
If the reviewer misses the kilobytes of garbage in the middle of a file they're reviewing, then an attacker can just sneak malicious code in directly without requiring a hash collision.
If the project relies on an effectively unreviewable binary file that could hold kilobytes of junk (like some YAML files I've seen...), then that's already breaking the review process without requiring a hash collision.
Ignoring all of that, anybody grabbing code from an untrusted source is already vulnerable to whatever attacks that untrusted source wants to employ, with "exploiting hash collision" being one of the higher-effort attacks that could be mounted.
Essentially, any repository that would be vulnerable to any of the known hash collision attacks (via bad review, untrusted upstream, etc.) would be vulnerable to more mundane, easier attacks against the same weaknesses that do not depend on hash collisions.
> Having a copy of code that is reviewable and then searching for a malicious collision is a preimage attack;
No, it's not. You can sneak extra entropy in via minor formatting choices, variable names, or exactly what you write in your commit messages. Or probably even the ordering of files in your directories. (I don't think the git protocol enforces that files have to be in, e.g., alphabetical order.)
> Ignoring all of that, anybody grabbing code from an untrusted source is already vulnerable to whatever attacks that untrusted source wants to employ, with "exploiting hash collision" being one of the higher-effort attacks that could be mounted.
I'm not sure. If your hash works fine, as long as someone trusted gives you the commit hash, anyone untrusted can give you the actual source.
And if you mean accepting PRs: accepting PRs from the untrusted internet is basically how open source works.
First - if git really didn't care about collision resistance, there wouldn't have been a need to switch to SHA1DC as the hash function. They switched because they care enough that they were willing to accept the performance penalty.
Second - imagine this scenario: a user creates two commits with the same hash, one with a valid change and the second with a malicious one. The collision could be created by playing around with some data in a binary file - so this is a collision attack, not a 2nd pre-image attack. The user then submits the change to the upstream and gets it approved. The user maintains a mirror of the upstream repo into which they place the malicious commit. Anyone that pulls from this mirror will think they have the same code as the upstream, even if they compare hashes.
So don't use an untrusted mirror? I guess - but that is something that should be possible with a strong hash. And if git really didn't want you to do that, it would provide for better ways of tracking where objects were actually pulled from.
Anyway, collision attacks are real and can impact git. They just aren't as bad as a 2nd pre-image attack.
> First - if git really didn't care about collision resistance, there wouldn't have been a need to switch to SHA1DC as the hash function. They switched because they care enough that they were willing to accept the performance penalty.
Git didn't _need_ to switch to SHA1DC, but they did because the cost was minimal and it's still a good idea to defend against known attacks.
> Second - imagine this scenario: a user creates two commits with the same hash, one with a valid change and the second with a malicious one. The collision could be created by playing around with some data in a binary file - so, this is a collision attack not 2nd pre-image. The user then submits the change to the upstream and gets it approved.
This is a general problem with binary files: they're hard to properly review. Having unreviewable files in a repository (binaries, machine-generated configs, etc.) is already a security problem; hash collisions would just be one (very difficult) way of exploiting that problem.
> The user maintains a mirror of the upstream repo into which they place the malicious commit. Anyone that pulls from this mirror will think they have the same code as the upstream, even if they compare hashes.
Having people pull data from an attacker-controlled source is a security issue, regardless of hash collisions.
> So don't use an untrusted mirror? I guess - but that is something that should be possible with a strong hash. And if git really didn't want you to do that, it would provide for better ways of tracking where objects were actually pulled from.
Git was designed for collaboration between trusted parties; collaboration between untrusted parties (e.g. pulling changes from untrusted sources) is a much harder problem that git doesn't pretend to solve.
> Anyway, collision attacks are real and can impact git. They just aren't as bad as a 2nd pre-image attack.
Collision attacks are real, but they have yet to impact git (beyond adopting SHA1DC, I guess), despite how big of a target popular git repositories are.
> Git didn't _need_ to switch to SHA1DC, but they did because the cost was minimal and it's still a good idea to defend against known attacks.
I'm confused with how a SHA1 collision being found is an "attack" if git truly doesn't care about collision resistance.
> This is a general problem with binary files: they're hard to properly review. Having unreviewable files in a repository (binaries, machine-generated configs, etc.) is already a security problem; hash collisions would just be one (very difficult) way of exploiting that problem.
I don't think you can ignore the use case - people do check binaries into git with the expectation that git will keep track of them.
> Git was designed for collaboration between trusted parties; collaboration between untrusted parties (e.g. pulling changes from untrusted sources) is a much harder problem that git doesn't pretend to solve.
Maybe that is how git was designed. But it's not how git is used. People do pull from repos that they don't fully trust. Maybe just to examine a change before throwing it away. What people don't expect is that by pulling from such a source that an unexpected file could get into their repository due to a collision attack. That is why git switched to SHA1DC - if git truly didn't support that use case, they wouldn't have needed to.
> Collision attacks are real, but they have yet to impact git (beyond adopting SHA1DC, I guess), despite how big of a target popular git repositories are.
I agree that collision attacks are real but aren't a practical issue yet. What I was responding to was your comment:
> I haven't heard of any second-preimage attacks against MD5, much less SHA-1, so mlindner was correct in asserting that MD5 would be fine (assuming 128 bits are enough). See also the analysis in [1].
In that comment, it seems that you were saying that collision attacks weren't a problem at all. But it seems like you are saying in your more recent comment that "collision attacks are real"?
> This is a general problem with binary files: they're hard to properly review. Having unreviewable files in a repository (binaries, machine-generated configs, etc.) is already a security problem; hash collisions would just be one (very difficult) way of exploiting that problem.
That's not a problem in general. Eg having a binary bmp in your repository is fine as far as reviews go.
> Git isn't relying on collision-resistance, it's relying on second-preimage[0] resistance, which is to say: in order to sneak a hash collision in to a git repository, you have to sneak _something else_ that's already trusted (e.g. via code review) into the repository; collisions can't (yet) be generated for arbitrary hashes.
Yes, I know. I was arguing the more general point that "the use of SHA-1 in Git is not for security purposes".
Of course, for anything crypto-related we go by the maxim "guilty until proven innocent". MD5 might not have a published second-preimage attack yet, but it's broken enough that you shouldn't rely on it for anything anymore: it's not an acceptable crypto hash, and if you don't need a crypto hash, you can use something simpler like a CRC instead.
For anybody running into a similar problem: most git hosting services set up a subdomain that will allow SSH traffic over port 443, e.g. ssh.github.com, altssh.bitbucket.org, altssh.gitlab.com, etc.
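For example, GitHub documents the ssh.github.com fallback; a `~/.ssh/config` entry along these lines transparently routes git's SSH traffic over port 443:

```
Host github.com
    HostName ssh.github.com
    Port 443
    User git
```

After that, `git clone git@github.com:user/repo.git` works unchanged on networks that block outbound port 22.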
This question reminded me of Project Wonderful[1], which I used on several tiny projects in an attempt to cover hosting costs: $5 per month. I never had enough traffic to make anything close to that, but it was pretty fun trying.