
The problem with fibre isn't the sensitivity. It's that most endpoints have a 1Gbps copper port on them, so Cat6A ports can serve the common devices while still letting you add or relocate 10Gbps devices without rewiring the building again.

However, unlike copper twisted pair, the bandwidth that current fiber media can carry is limited by almost nothing but the optics at each end.

That doesn't solve the chicken and egg problem.

What probably would solve it is something like PCIe and USB to 1Gbps fiber adapters that cost $5.


You've been able to get Intel X520 NICs [0], with transceivers included for ~40USD on Newegg for a long time. This is a little more than double the price of Newegg's cheapest single-port 10/100/1000 copper card, but even the cheapest available such card is three times your "chicken and egg"-solving price point.

I suspect that the absence of cheap-o all-in-one AP/router combo boxes with any SFP+ cages, combined with fiber cabling's reputation for being extremely fragile, has much more to do with its scarcity at the extremely low end of networking gear than anything else.

[0] This is a two-port SFP+ PCI Express card


You can get copper ones for $5.99 (quality may vary):

https://www.amazon.com/1000Mbps-Network-Performance-Gigabit-...

https://www.amazon.com/SALAN-Ethernet-Portable-Internet-Conv...

But it's not competing with those; it's competing with the copper port that's already built into most devices.

Another thing that would work is something like this (also $5.99), but with one of the ports as fibre:

https://www.amazon.com/Gigabit-Ethernet-Splitter-1000Mbps-In...

The point being you need some cheap way to plug in existing copper devices if you run fibre to the endpoints.

This plus $5 for a transceiver is pretty close at $15:

https://www.amazon.com/Gigabit-Ethernet-Converter-Auto-Negot...

But +$15 and an extra wall outlet per endpoint is still an inconvenience, and if a two-port device with its own power supply can be made for $15 then where is the PCIe/USB to fibre adapter for <$10?


> (quality may vary):

Yep. Good NICs last for approximately forever, life's way too short to deal with maybe-flaky NICs, and the price difference between the Amazon Special and something that's going to be reliable is, what, two big boxes of Cheerios? Two dozen eggs? Not worth it.

> But it's not competing with those, it's competing with the copper port which is already built into most devices.

Correct! That's part of why I was so very surprised to see you suggesting that extremely cheap PCI Express and USB adapters would "solve the chicken and egg problem".

> The point being you need some cheap way to plug in existing copper devices if you run fibre to the endpoints.

That's called a multi-port switch. Netgear sells five-port gigabit ones for like 20 USD. Switches that have two SFP+ cages and eight copper gigabit ports [0] are six times the price of a cheap-o Netgear switch, but are something that's going to last at least a decade. It's also pretty uncommon to find SOHO switches that have SFP+ cages and don't have at least one fixed copper port.

> This plus $5 for a transceiver is pretty close at $15:

If you're connecting a single device, why the hell would you use that when you could slap a copper SFP or SFP+ module in the switch's cage and run a cable? If you're connecting multiple devices, then either install multiple copper modules and run multiple cables, run multiple copper cables from fixed copper ports on the switch, or put a switch where the existing copper devices are.

[0] <https://mikrotik.com/product/css610_8g_2s_in>


> If you're connecting a single device, why the hell would you use that when you could slap a copper SFP or SFP+ module in the switch's cage and run a cable?

The problem to be solved is that you want to be able to put fibre inside the walls of the building instead of copper. Running a new cable to the switch closet is the thing to be prevented.

But if the wall jacks are fibre then you need some economical way of hooking them up to every printer and single-purpose device with a network port. If you have to buy another $100+ switch just to get from fibre to copper even when there is only one device near that jack, people aren't going to go for that.


In practice, though, 10G via copper requires pretty perfect terminations. The slightest error leads to crosstalk issues.

Ymmv. I've got a mix of cheap premade patch cables and some I crimped from solid core, all cat5e, all holding 10gbe totally happily. I suspect that only works because they're a meter or two long but that reaches across the rack.

> If we can get that to raise a red flag with people (and agents), people won’t be trying to put control instructions alongside user content (without considering safeguards) as much.

At a basic level there is no avoiding this. There is only one network interface in most machines and both the in-band and out-of-band data are getting serialized into it one way or another. See also WiFi preamble injection.

These things are inherently recursive. You can't even really have a single place where all the serialization happens. It's user data in JSON in an HTTP stream in a TLS record in a TCP stream in an IP packet in an ethernet frame. Then it goes into a SQL query which goes into a B-tree node which goes into a filesystem extent which goes into a RAID stripe which goes into a logical block mapped to a physical block etc. All of those have control data in the same stream under the hood.

The actual mistake is leaving people to construct the combined data stream manually rather than programmatically. Manually is concatenating the user data directly into the SQL query, programmatically is parameterized queries.
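
For illustration, here's roughly what the programmatic version looks like with SQLite's C API; a minimal sketch, assuming a hypothetical `users` table. The query text and the user data are handed to the library separately, so the data can never be parsed as SQL no matter what bytes it contains:

    #include <sqlite3.h>

    /* Sketch: insert untrusted input via a bound parameter instead of
       concatenating it into the query string. */
    int insert_name(sqlite3 *db, const char *untrusted_name)
    {
        sqlite3_stmt *stmt;
        int rc = sqlite3_prepare_v2(db,
            "INSERT INTO users (name) VALUES (?)", -1, &stmt, NULL);
        if (rc != SQLITE_OK)
            return rc;

        /* SQLITE_TRANSIENT tells SQLite to make its own copy. */
        sqlite3_bind_text(stmt, 1, untrusted_name, -1, SQLITE_TRANSIENT);

        rc = sqlite3_step(stmt);
        sqlite3_finalize(stmt);
        return rc == SQLITE_DONE ? SQLITE_OK : rc;
    }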


>All of those have control data in the same stream under the hood.

Not true. For most binary protocols, you have something like <Header> <Length of payload> <Payload>. On magnetic media, sector headers used a special pattern that couldn't be produced by regular data [1] -- and I'm sure SSDs don't interpret file contents as control information either!

There may be some broken protocols, but in most cases this kind of problem only happens when all the data is a stream of text that is simply concatenated together.

[1] e.g. https://en.wikipedia.org/wiki/Modified_frequency_modulation#...


The header and length of the payload are control data. It's still being concatenated even if it's binary. A common way to screw that one up is to measure the "length of payload" in two different ways, for example by using the return value of strlen or strnlen when setting the length of the payload but the return value of read(2) or std::string size() when sending/writing it, or vice versa.

If the data unexpectedly contains an interior NULL, or was expected to be NULL terminated and isn't, strnlen will return a different value than the amount of data read into the send buffer. Then the receiver may interpret user data after the interior NULL as the next header or, when they're reversed, interpret the next header as user data from the first message and user data from the next message as the next header.
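
To make that concrete, a hedged sketch of the bug (hypothetical framing code, not from any particular protocol; error handling omitted):

    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>

    /* BUG sketch: the length field and the bytes written disagree.
       read(2) filled the buffer with nread bytes, but the header length
       is computed with strlen(), which stops at the first interior NULL
       (and overreads if there is no NULL at all). */
    void send_record_buggy(int sock, int fd)
    {
        char buf[512];
        ssize_t nread = read(fd, buf, sizeof(buf)); /* e.g. 16 bytes */
        uint32_t len = (uint32_t)strlen(buf);       /* e.g. 7 if buf[7] == 0 */

        write(sock, &len, sizeof(len));  /* header claims 7 bytes... */
        write(sock, buf, (size_t)nread); /* ...but 16 follow, so the receiver
                                            parses payload bytes 8..11 as the
                                            next header */
    }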

Another fun one there is that if you copy data containing an interior NULL to a buffer using snprintf and only check the return value for errors but not an unexpectedly short length, it may have copied less data into the buffer than you expect. At which point sending the entire buffer will be sending uninitialized memory.

Likewise if the user data in a specific context is required to be a specific length, so you hard-code the "length of payload" for those messages without checking that the user data is actually the required length.

This is why it needs to be programmatic. You don't declare a struct with header fields and a payload length and then leave it for the user to fill in; you make the same function copy N bytes of data into the payload buffer and increment the payload length field by N, make the payload buffer and length field modifiable only via that function, and have the send/write function use the payload length from the header instead of taking it as an argument. Or take the length argument but then error out without writing the data if it doesn't match the one in the header.
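
A minimal sketch of that kind of builder (hypothetical names, error handling trimmed):

    #include <stdint.h>
    #include <string.h>
    #include <unistd.h>

    #define MAX_PAYLOAD 4096

    struct message {
        uint32_t payload_len;          /* only ever updated by msg_append() */
        uint8_t  payload[MAX_PAYLOAD];
    };

    /* The only way to put bytes into the message: copies exactly n bytes
       and advances the length by exactly n, so they can never disagree. */
    int msg_append(struct message *m, const void *data, size_t n)
    {
        if (n > MAX_PAYLOAD - m->payload_len)
            return -1;                 /* refuse rather than truncate */
        memcpy(m->payload + m->payload_len, data, n);
        m->payload_len += (uint32_t)n;
        return 0;
    }

    /* The writer takes its length from the header, not from the caller. */
    ssize_t msg_send(int sock, const struct message *m)
    {
        uint32_t len = m->payload_len;
        if (write(sock, &len, sizeof(len)) != (ssize_t)sizeof(len))
            return -1;
        return write(sock, m->payload, m->payload_len);
    }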


From your previous post:

>It's user data in JSON in an HTTP stream in a TLS record in a TCP stream in an IP packet in an ethernet frame. Then it goes into a SQL query which goes into a B-tree node which goes into a filesystem extent which goes into a RAID stripe which goes into a logical block mapped to a physical block etc. All of those have control data in the same stream under the hood.

It's true that a lot of code out there has bugs with escape sequences or field lengths, and some protocols may be designed so badly that it may be impossible to avoid such bugs. But what you are suggesting is greatly exaggerated, especially when we get to the lower layers. There is almost certainly no way that writing a "magic" byte sequence to a file will cause the storage device to misinterpret it as control data and change the mapping of logical to physical blocks. They've figured out how to separate this information reliably back when we were using floppy disks.

That the bits which control the block mapping are stored on the same device as a record in an SQL database doesn't mean that both are "the same stream".


> There is almost certainly no way that writing a "magic" byte sequence to a file will cause the storage device to misinterpret it as control data and change the mapping of logical to physical blocks.

Which is also the guarantee you get if you use parameterized SQL queries. And which stops holding when one of the lower layers has a bug, like Heartbleed.

There have also been several disk firmware bugs over the years, in various models, where writing a specific data pattern results in corruption because the drive interprets it as an internal sequence.


I distinctly remember bugs with non-Hayes modems where they would treat `+++ATH0` coming over the wire as a command, leading to BBS messages that could forcibly disconnect the unlucky user who read them.

In this particular case, IIRC Hayes had patented the known approach for detecting this and avoiding the disconnect, so rival modem makers were somewhat powerless to do anything better. I wonder if such a patent would still hold today...


https://en.wikipedia.org/wiki/+++ATH0#Hayes'_solution

What was patented was the technique of checking for a delay of about a second to separate the command from any data. It still had to be sent from the local side of the connection, so the exploit needed some way to get it echoed back (like ICMP).

More relevant to this bug: https://en.wikipedia.org/wiki/ANSI_bomb#Keyboard_remapping

DOS had a driver ANSI.SYS for interpreting terminal escape sequences, and it included a non-standard one for redefining keys. So if that driver was installed, 'type'ing a text file could potentially remap any key to something like "format C: <Return> Y <Return>".


You expect the files to still be accessible using relative paths. What do you expect to happen if your cloud storage file path is 50 characters long and is mounted in a folder which is 4050 characters long when PATH_MAX is 4096?

The sync application itself can handle this using openat(2) or similar and should probably be using that regardless to avoid races.
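
A rough sketch of what that looks like, descending one component at a time so no single syscall ever sees a path anywhere near PATH_MAX (real code would also want O_NOFOLLOW to guard against symlink races):

    #include <fcntl.h>
    #include <stddef.h>
    #include <unistd.h>

    /* Walk into a directory tree one component at a time with openat(2).
       The total depth can exceed PATH_MAX without any single call failing
       with ENAMETOOLONG. Returns an fd for the last component, or -1. */
    int open_deep(int dirfd, const char *const components[], size_t n)
    {
        int fd = dirfd;
        for (size_t i = 0; i < n; i++) {
            int flags = (i + 1 < n) ? (O_RDONLY | O_DIRECTORY) : O_RDONLY;
            int next = openat(fd, components[i], flags);
            if (fd != dirfd)
                close(fd);
            if (next < 0)
                return -1;
            fd = next;
        }
        return fd;
    }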


Ah, I forgot that the maximum path length is usually limited by PATH_MAX; it's the individual path segment that's usually limited by the filesystem.

Point taken, although I still think it's better for cloud storage services to err on the side of compatibility, i.e. what's the lowest common denominator between Linux, macOS, Android, iOS from 10 years ago and Windows 7?


That's not a great idea, for three different reasons: filesystems have to do ugly things when they're almost full, like splitting files into many small blocks and storing more metadata to keep track of them all; SSDs get slower and have compromised wear leveling when they're almost full; and it makes you more likely to subject yourself to the perils of fully running out, which can cause random non-temporary problems even if it only happens temporarily.


A good way to do this is to create a swap file, both because then you can use it as a swap file until you need to delete it and because swap files are required to not be sparse.


I'm not at a machine; Linux doesn't zero the swap "file", does it? If I set vm.swappiness = 0, will your "trick" work if I never hit memory pressure?


If you create the file with 'mkswap --file' it allocates the blocks. Trying to use 'swapon' with an existing sparse file won't remove the holes for you but does notice them and then refuse to use it.
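
If you'd rather do the allocation yourself, posix_fallocate(3) requests real backing blocks the same way; a minimal sketch (the path and size are just placeholders):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Reserve 1 GiB of actual disk blocks (not a sparse file). Deleting
       the file later gives the space back, as with the swap-file trick. */
    int main(void)
    {
        int fd = open("/var/tmp/reserve.img", O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("open"); return 1; }

        /* Unlike ftruncate(2), posix_fallocate() allocates blocks and
           cannot leave holes in the file. Returns an errno value. */
        int err = posix_fallocate(fd, 0, 1024L * 1024 * 1024);
        if (err != 0) { fprintf(stderr, "posix_fallocate: %d\n", err); return 1; }

        close(fd);
        return 0;
    }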


There are different governments, and different subdivisions within any given government. The only thing you need in order to get a government that had been pushing Chat Control to do some trust busting is more votes.


> C and C++ are usually stuck in that antiquated thinking that you should build a module, package it into some libraries, install/export the library binaries and associated assets, then import those in other projects. That makes everything slow, inefficient, and widely dangerous.

It seems to me the "convenient" options are the dangerous ones.

The traditional method is for third-party code to have a stable API. Newer versions add functions or fix bugs, but existing functions continue to work as before. API mistakes get deprecated and alternatives offered, but newly-deprecated functions remain available for 10+ years. The result is that you can link all applications against any sufficiently recent version of the library, e.g. the latest stable release, which can then be installed via the system package manager and carry a manageable maintenance burden, because only one version needs to be maintained.
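
For what it's worth, compilers make that deprecation cycle cheap to express; a small sketch with made-up function names (GCC/Clang attribute syntax):

    /* New interface. */
    int parse_config_ex(const char *path, int flags);

    /* Old name keeps working for existing callers, but every use gets
       flagged at compile time so new code migrates before it's removed. */
    __attribute__((deprecated("use parse_config_ex() instead")))
    static inline int parse_config(const char *path)
    {
        return parse_config_ex(path, 0);
    }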

Language package managers have a tendency to facilitate breaking changes. You "don't have to worry" about removing functions without deprecating them because anyone can just pull in the older version of the code. Except the older version is no longer maintained.

Then you're using a version of the code from a few years ago because you didn't need any of the newer features and it hadn't had any problems, until it picks up a CVE. Suddenly you have vulnerable code running in production but fixing it isn't just a matter of "apt upgrade" because no one else is going to patch the version only you were using, and the current version has several breaking changes so you can't switch to it until you integrate them into your code.


This is all wishful thinking disconnected from practicalities.

First you confuse API and ABI.

Second there is no practical difference between first and third-party for any sufficiently complex project.

Third you cannot have multiple versions of the same thing in the same program without very careful isolation and engineering. It's a bad idea and a recipe for ODR violations.

In any non-trivial project there will be complex dependency webs across different files and subprojects, and humans are notoriously bad at packaging pieces of code into sensible modules, libraries or packages with well-defined and maintained boundaries. Maintaining ABI compatibility, deprecating things while introducing replacements, etc. is massive engineering work, and it simply makes people much less likely to change the way things are done, even when they are broken or not ideal. That's an effort you'll make for a kernel (and only on specific boundaries) but not for the average program.


> First you confuse API and ABI.

I'm not confusing API with ABI. If you don't have a stable ABI then you essentially forfeit the traditional method of having every program on the system use the same copy (and therefore version) of that library, which in turn encourages them to each use a different version and facilitates API instability by making the bad thing easier.

> Second there is no practical difference between first and third-party for any sufficiently complex project.

Even when you have a large project, making use of curl or sqlite or openssl does not imply that you would like to start maintaining a private fork.

There are also many projects that are not large enough to absorb the maintenance burden of all of their external dependencies.

> Third you cannot have multiple versions of the same thing in the same program without very careful isolation and engineering.

Which is all the more reason to encourage every program on the system to use the same copy by maintaining a stable ABI. What do you do after you've encouraged everyone to include their own copy of their dependencies and therefore not care if there are many other incompatible versions, and then two of your dependencies each require a different version of a third?

> In any non-trivial project there will be complex dependency webs across different files and subprojects, and humans are notoriously bad at packaging pieces of code into sensible modules, libraries or packages, with well-defined and maintained boundaries.

This feels like arguing that people are bad at writing documentation so we should reduce their incentive to write it, instead of coming up with ways to make doing the good thing easier.


> Use HTTP (secure is not the way to decentralize).

This doesn't seem like useful advice. If you're going to use HTTP at all there is essentially zero practical advantage in not using Let's Encrypt.

The better alternative would be to use new protocols that support alternative methods of key distribution (e.g. QR codes, trust on first use) instead of none.
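
Trust on first use, in particular, is almost trivial to implement; a sketch (hypothetical key-store layout, with the fingerprint assumed to come from the protocol handshake):

    #include <stdio.h>
    #include <string.h>

    /* TOFU: remember the peer's key fingerprint the first time we see
       it, and refuse to proceed if it ever changes afterwards. */
    int tofu_check(const char *store_path, const char *fingerprint)
    {
        char known[256] = {0};
        FILE *f = fopen(store_path, "r");

        if (f == NULL) {               /* first contact: pin the key */
            f = fopen(store_path, "w");
            if (f == NULL)
                return -1;
            fputs(fingerprint, f);
            fclose(f);
            return 0;
        }

        if (fgets(known, sizeof(known), f) == NULL)
            known[0] = '\0';
        fclose(f);

        /* A mismatch here is treated as a possible MITM. */
        return strcmp(known, fingerprint) == 0 ? 0 : -1;
    }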

> Selfhost DNS server (hard to scale in practice).

This is actually very easy to do.


Let's Encrypt is not our friend here.

DNS is easy to run for yourself, but if you host it for others (1000+ people) and it needs to have all the domains in the world, then it becomes a struggle.


Let's Encrypt is a non-profit that defeated the certificate cartel. The main thing you get from using HTTP without it is bad security.

DNS can answer thousands of queries per second on a Raspberry Pi and crazy numbers on a single piece of old server hardware that costs less than $500.


No root certificate is decentralized.

If your DNS port is closed by your ISP, you can't have people use your DNS server from the outside, and then you need Google or Amazon, which are not decentralized.

Also to be selfhosted you can't just forward what root DNS servers say, you need to store all domains and their IPs in a huge database.


> No root certificate is decentralized.

The root certificates are pretty decentralized. There isn't just one and you can use whichever one you like for your certificate. The browsers or other clients then themselves choose which roots to trust.

The main thing that isn't very decentralized here is Google/Chrome being the one to de facto choose who gets to be root CA for the web, but then it seems like your beef should be with people using Chrome rather than people using Let's Encrypt.

> If your DNS port is closed by your ISP, you can't have people use your DNS server from the outside and then you need Google or Amazon which are not decentralized.

It's pretty uncommon for ISPs to close the DNS port and even if they did, you could then use any VPS on any hosting provider.

> Also to be selfhosted you can't just forward what root DNS servers say, you need to store all domains and their IPs in a huge database.

I suspect you're not familiar with how DNS works.

Authoritative DNS servers are only required to have a database of their own domains. If your personal domain is example.com then you only need to store the DNS records for example.com. Even if you were hosting a thousand personal domains, the database would generally be measured in megabytes.

Recursive DNS servers (like 1.1.1.1 or 8.8.8.8) aren't strictly required to store anything except for the root hints file, which is tiny. In practice they will cache responses to queries for the TTL (typically up to a day) so they can answer queries from the cache instead of needing to make another recursive query for each client request, but they aren't required to cache any specific number of records.

A lot of DNS caches are designed to have a fixed-size cache and LRU evict records when it gets full. A recursive DNS server with a 1GB cache will have reasonable performance even under high load because the most commonly accessed records will be in it, and the least commonly accessed records are likely to have expired before they're requested again anyway. A much larger cache gets you only a small performance improvement.

DNS records are small so storing a very large number of them can be done on a machine with few resources. A DNS RRset is usually going to be under 100 bytes. You can fit tens of millions of them in RAM on a 4GB Raspberry Pi.
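
That kind of cache is simple enough to sketch; a toy fixed-size table with LRU eviction (hypothetical, ignoring TTL expiry and real hashing for brevity):

    #include <stdio.h>
    #include <string.h>

    #define CACHE_SLOTS 4096

    /* Toy DNS-style cache: lookups refresh a logical clock, and when
       the table is full, insertion overwrites the stalest slot. */
    struct entry {
        char name[64];            /* query name; empty string = free */
        char rdata[100];          /* the answer; RRsets are ~100 bytes */
        unsigned long last_used;
    };

    static struct entry cache[CACHE_SLOTS];
    static unsigned long clock_now;

    const char *cache_lookup(const char *name)
    {
        for (int i = 0; i < CACHE_SLOTS; i++)
            if (strcmp(cache[i].name, name) == 0) {
                cache[i].last_used = ++clock_now;  /* recently used */
                return cache[i].rdata;
            }
        return NULL;
    }

    void cache_insert(const char *name, const char *rdata)
    {
        int victim = 0;
        for (int i = 0; i < CACHE_SLOTS; i++) {   /* find stalest slot */
            if (cache[i].name[0] == '\0') { victim = i; break; }
            if (cache[i].last_used < cache[victim].last_used)
                victim = i;
        }
        snprintf(cache[victim].name, sizeof(cache[victim].name), "%s", name);
        snprintf(cache[victim].rdata, sizeof(cache[victim].rdata), "%s", rdata);
        cache[victim].last_used = ++clock_now;
    }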


I have two younger brothers. They have the same last name, first initial, a history of having lived at the same address, and the same birth date, because they're twins.

Every time one of them goes to a particular medical facility, he has to explicitly decline having them merge their charts.


