
Writing C for eBPF is cumbersome and you'd like to avoid it. Okay, that's reasonable. But I don't think it would be a good idea to write a compiler that emits eBPF binary from (a tiny subset of) Python. Why not just write code in pseudo-Python (or whatever language you're comfortable with) and have it translated by an LLM, and paste it in the source code? That would be much better because there would be fewer layers and a significant reduction in runtime cost.


I don't understand...

So, instead of having a defined and documented subset of Python that compiles to eBPF in a deterministic way... use an undefined pseudo language and let the LLM have fun with it, without knowing whether the resulting C is correct?

What would be the advantage?


The behavior of CPython and a few other implementations of Python (such as PyPy) is well documented and well understood. The semantics of the tiny subset of Python that this Python-to-eBPF compiler understands is not. For example, inferring from the fact that it statically compiles Python-ish AST to LLVM IR, you can have a rough idea that dynamic elements of Python semantics are unlikely to be compiled, but you cannot know exactly which elements without carefully reading the documentation or source code of the compiler. You can guess globals() or locals() won't work, maybe .__dict__ won't as well, but how about type() or isinstance()? You don't know without digging into the documentation (which may be lacking), because the subset of Python this compiler understands is rather arbitrary.
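To make that ambiguity concrete, here is a purely illustrative sketch. The dividing line shown is hypothetical; only the compiler's docs or source can tell you where it actually falls:

```python
# Hypothetical examples of what a static Python-to-eBPF compiler
# might accept vs. reject. The split shown here is illustrative only.

def packet_filter(port: int) -> int:
    # Static types, simple arithmetic and branching: the sort of code
    # a static AST-to-LLVM-IR compiler could plausibly handle.
    if port == 22 or port == 443:
        return 1
    return 0

def introspect(obj) -> str:
    # Runtime introspection (type(), __dict__, globals()) has no eBPF
    # counterpart, so code like this is almost certainly rejected --
    # but you can't know for sure without reading the compiler.
    return type(obj).__name__
```

Both functions are, of course, perfectly valid CPython, which is exactly the problem: valid Python tells you nothing about what the subset accepts.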

And also, having an LLM translate Python-ish pseudo code into C does not imply that you cannot examine it before putting it into a program. You can manually review it and make modifications as you want. It just reduces time spent compared with writing C code by hand.


But then we have to write the pseudocode anyway (which my IDE can't check, so I don't know if I have pseudomistakes [sorry for the pun]), have the LLM 'transpile' it (a process that isn't understood at all), and review the resulting C code anyway, so you still have to know eBPF code really well.

Would that represent a time advantage?


Are you seriously asking why someone might want to do something guaranteed to behave exactly as they defined it, when they could have an LLM hallucinate code that touches the core of their system, instead?

Why would anyone go with the inaccurate option?


LLMs will never be able to write eBPF code.

eBPF is a weird, formally validated secure subset of C. No "normal" C program will ever pass the eBPF validation checks.


LLMs can easily already write eBPF code. Try it.


> tell me how you never actually developed an eBPF program without telling me you never actually developed an eBPF program


Just try it. Here’s an example that I know it will work flawlessly for, because I used it for this: at $formerjob, all laptops come with a piece of malware called “connections”, which obnoxiously pops up at some point during the day (stealing window/mouse focus) and asks you some asinine survey question about morale on your team and/or the company values. There are a few good ways to solve this: apparmor/selinux (but this runs the risk of your config file conflicting with the rules shipped by IT), a simple bash script that runs pkill every 5 seconds (too slow and it still steals focus, too fast and your laptop fans start spinning), etc. A better way is to use a bpf hook on execve.

Ask an LLM to write a simple ebpf program which kills any program with a specific name/path. Even crappy local models can handle this with ease.

If you’re talking about more complicated map-based programs, you’re probably right that it will struggle a bit, but it will still figure it out. The eBPF API is not very different from any other C API at the end of the day. It will do fine without the standard library, if you ask it to.
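For flavor, a hedged sketch of the kind of program described above: a tracepoint on exec that kills any process whose name matches a target. The target name and details here are illustrative, not the commenter's actual code; it assumes libbpf headers, a kernel >= 5.3 (for `bpf_send_signal`), and root/CAP_BPF to load:

```c
// Minimal sketch of a "kill process by name on exec" BPF program.
// Compile with: clang -O2 -target bpf -c kill_target.bpf.c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

#define TARGET "connections"  /* hypothetical process name */

SEC("tracepoint/sched/sched_process_exec")
int kill_target(void *ctx)
{
    char comm[16];
    bpf_get_current_comm(&comm, sizeof(comm));

    // Manual byte compare: no libc string functions inside BPF.
    // sizeof(TARGET) includes the NUL, so this is an exact match.
    const char target[] = TARGET;
    for (int i = 0; i < sizeof(target); i++) {
        if (comm[i] != target[i])
            return 0;
    }

    // Deliver SIGKILL to the current task, i.e. the process
    // that just exec'd. GPL license required for this helper.
    bpf_send_signal(9);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

Loading it is a few lines of libbpf (or `bpftool prog load`); kernel-side, there is nothing a verifier would balk at here, which is partly why this is an easy case for an LLM.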


By eBPF I mean things like XDP network filters.

The issue here is the static formal validation the kernel does before loading your eBPF program.

(Even humans don't really know how it works. You need to use specific byte width types and access memory in specific patterns or the validation will fail.)


Respectfully, you don’t know what you’re talking about.

1. If you meant XDP, you should have said XDP, not eBPF.

2. The kernel does that validation on all ebpf code that it loads, regardless of whether XDP is involved.

3. Humans know how it works.


"translated by an llm"

smh my head


If you look at the code, you'll be (unpleasantly) surprised, I think. The author does not seem to have known what Y combinator is.


If it helps, you will find the Y-combinator described (indeed, derived) in the first edition (https://cs.brown.edu/~sk/Publications/Books/ProgLangs/2007-0...) of the author's programming languages book (https://www.plai.org/). (Page 228, if that helps, though the derivation begins on page 223.)

For added fun, the day he teaches it in class, he wears a t-shirt from Y-combinator the startup accelerator (and explains what its name means).

Now that we've gotten that out of the way, it remains unclear what is surprising or unpleasantly surprising about the code.


This reminds me of when John Nagle showed up in a thread about his algorithm on here.


Shriram invoking Shriram ... (λx.(x x) λx.(x x)) forever \m/ :)


In addition to the general sibling comments, I can personally attest that Shriram knows what the Y combinator is and has been teaching students about it for at least 25 years. My own lecture notes from one of his classes about the lambda calculus and the Y combinator were for a long time on the front page of google results for info about either!


I'm pretty sure Shriram Krishnamurthi understands the Y combinator...



Don't see Y-combinator mentioned anywhere on that page.


But I do see that page mentioned on Y Combinator's page.

The joke can go on forever...


Somebody forgot to add a base case.


No need... Shriram is already based.


These are my favorite types of comments on hn


The HN guidelines suggest assuming the strongest interpretation of what someone said, so obviously the commenter was making a joke. :)


lmao Google him


This reads like GPT-5 output. Anyone familiar with the model will recognize its distinctive style. While using LLM-generated content isn't inherently wrong, why not share the prompts? It's like presenting a book summary without naming the book.


It is: Claude, and it boiled down to this.

My original idea was to have the bank sign a thing containing your IP address and user agent, have the bank add in an age claim, and copy/paste it to the RP.

I figured it would produce a document a little more on point.

This setup with webauthn feels like overkill; but with banks and regs - it feels more beefy without adding a substantial amount of complexity.


Has anyone used OpenTelemetry for long-running batch jobs? OTel seems designed for web apps where spans last seconds/minutes, but batch jobs run for hours or days. Since spans are only submitted after completion, there's no way to track progress during execution, making OTel nearly unusable for batch workloads.

I have a similar issue with Prometheus -- not great for batch job metrics either. It's frustrating how many otherwise excellent OSS tools are optimized for web applications but fall short for batch processing use cases.


I’ve implemented OTEL for background jobs: async jobs that get picked up from the DB, where I store the trace context in the DB and pass it along to multiple async jobs. Some jobs that fail and retry with a backoff strategy can take many hours, and we can see the traces fine in Grafana. Each job creates its own span, but they are all within the same trace.

Works well for us, I’m not sure I understand the issue you’re facing?


OK, after re-reading, I think you have issues with long-running spans; I think you should break your spans down into smaller chunks. But a trace can take many hours or days, and be analysed even when it’s not finished.


Nothing running for days, but sometimes a half hour or so. When the process kicks off it starts a trace, but individual steps of the process create separate spans within that trace (and sometimes further nested spans) that don't run the entire length of the job. As the job progresses, the spans and their related events, logs, etc all appear.

I think this does highlight, to me, the biggest weakness of OTel--the actual documentation and examples for "how to solve problems with this" really suck.


You could use span links for this. The idea is you have a bunch of discrete traces that indicate they are downstream or upstream of some other trace. You’d just have to bend it a bit to work in your probably single process batch executor !


> I have a similar issue with Prometheus -- not great for batch job metrics either.

How do you mean? The metrics are available for 15 days by default. What exactly are you missing?


Hm from what I’ve seen it emits metrics at a regular interval just like Prometheus. Maybe I’m thinking of something else though.


Many companies seem to be using Apache Iceberg, but the ecosystem feels immature outside of Java. For instance, iceberg-rust doesn't even support HDFS. (Though admittedly, Iceberg's tendency to create many small files makes it a poor fit for HDFS anyway.)


Seems like this is going to be a permanent issue, no? Library level storage APIs are complex and often quite leaky. That's based on looking at the innards of MySQL and ClickHouse for a while.

It seems quite possible that there will be maybe three libraries that can write to Iceberg (Java, Python, Rust, maybe Golang), while the rest at best will offer read access only. And those language choices will condition and be conditioned by the languages that developers use to write applications that manage Iceberg data.


This was the same with arrow/parquet libraries as well. It takes a long time for all implementations to catch up


Regarding the stubborn and narcissistic personality of LLMs (especially reasoning models), I suspect that attempts to make them jailbreak-resistant might be a factor. To prevent users from gaslighting the LLM, trainers might have inadvertently made the LLMs prone to gaslighting users.


I share the author's sentiment completely. At my day job, I manage multiple Kubernetes clusters running dozens of microservices with relative ease. However, for my hobby projects—which generate no revenue and thus have minimal budgets—I find myself in a frustrating position: desperately wanting to use Kubernetes but unable to due to its resource requirements. Kubernetes is simply too resource-intensive to run on a $10/month VPS with just 1 shared vCPU and 2GB of RAM.

This limitation creates numerous headaches. Instead of Deployments, I'm stuck with manual docker compose up/down commands over SSH. Rather than using Ingress, I have to rely on Traefik's container discovery functionality. Recently, I even wrote a small script to manage crontab idempotently because I can't use CronJobs. I'm constantly reinventing solutions to problems that Kubernetes already solves—just less efficiently.

What I really wish for is a lightweight alternative offering a Kubernetes-compatible API that runs well on inexpensive VPS instances. The gap between enterprise-grade container orchestration and affordable hobby hosting remains frustratingly wide.


> What I really wish for is a lightweight alternative offering a Kubernetes-compatible API that runs well on inexpensive VPS instances. The gap between enterprise-grade container orchestration and affordable hobby hosting remains frustratingly wide.

Depending on how much of the Kube API you need, Podman is that. It can generate containers and pods from Kubernetes manifests [0]. Kind of works like docker compose but with Kubernetes manifests.

This even works with systemd units, similar to how it's outlined in the article.

Podman also supports most (all?) of the Docker API, so docker compose works; you can also connect to remote sockets through ssh etc. to do things.

[0] https://docs.podman.io/en/latest/markdown/podman-kube-play.1...

[1] https://docs.podman.io/en/latest/markdown/podman-systemd.uni...
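To sketch that workflow, here's a minimal (made-up) manifest of the sort `podman kube play` accepts; names and ports are illustrative:

```yaml
# pod.yaml -- a plain Kubernetes Pod manifest, no cluster required.
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  containers:
    - name: nginx
      image: docker.io/library/nginx:stable
      ports:
        - containerPort: 80
          hostPort: 8080
```

Then `podman kube play pod.yaml` brings it up and `podman kube down pod.yaml` tears it down, much like `docker compose up`/`down`, but with a manifest you could later apply to a real cluster.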


The docs don't make it clear, can it do "zero downtime" deployments? Meaning it first creates the new pod, waits for it to be healthy using the defined health checks and then removes the old one? Somehow integrating this with service/ingress/whatever so network traffic only goes to the healthy one?


I can't speak on its capabilities, but I feel like I have to ask: for what conceivable reason would you even want that extra error potential with migrations etc?

It means you're forced to make everything always compatible between versions etc.

For a deployment that isn't even making money and is running on a single node droplet with basically no performance... Why?


> I can't speak on it's capabilities, but I feel like I have to ask: for what conceivable reason would you even want that extra error potential with migrations etc?

It's the default behavior of a kubernetes deployment which we're comparing things to.

> It means you're forced to make everything always compatible between versions etc.

For stateless services, not at all. The outside world just keeps talking to the previous version while the new version is starting up. For stateful services, it depends. Often there are software changes without changes to the schema.

> For a deployment that isn't even making money

I don't like looking at 504 gateway errors

> and is running on a single node droplet with basically no performance

I'm running this stuff on a server in my home, it has plenty of performance. Still don't want to waste it on kubernetes overhead, though. But even for a droplet, running the same application 2x isn't usually a big ask.


GP talks about personal websites on 1vCPU, there's no point in zero downtime then. Apples to oranges.


Zero downtime doesn't mean redundancy here. It means that no request gets lost or interrupted due to a container upgrade.

The new container spins up while the old container is still answering requests and only when the new container is running and all requests to the old container are done, then the old container gets discarded.


You can use firecracker !


Have you seen k0s or k3s? Lots of stories about folks using these to great success on a tiny scale, e.g. https://news.ycombinator.com/item?id=43593269


I tried k3s but even on an immutable system dealing with charts and all the other kubernetes stuff adds a new layer of mutability and hence maintenance, update, manual management steps that only really make sense on a cluster, not a single server.

If you're planning to eventually move to a cluster or you're trying to learn k8s, maybe, but if you're just hosting a single node project it's a massive effort, just because that's not what k8s is for.


I use k3s. With more than one master node, it's still a resource hog, and when one master node goes down, all of them tend to follow. 2GB of RAM is not enough, especially if you also use longhorn for distributed storage. A single master node is fine and I haven't had it crash on me yet. In terms of scale, I'm able to use raspberry pis and such as agents so I only have to rent a single €4/month vps.


I'm laughing because I clicked your link thinking I agreed and had posted similar things and it's my comment.

Still on k3s, still love it.

My cluster is currently hosting 94 pods across 55 deployments. Using 500m cpu (half a core) average, spiking to 3cores under moderate load, and 25gb ram. Biggest ram hog is Jellyfin (which appears to have a slow leak, and gets restarted when it hits 16gb, although it's currently streaming to 5 family members).

The cluster is exclusively recycled old hardware (4 machines), mostly old gaming machines. The most recent is 5 years old, the oldest is nearing 15 years old.

The nodes are bare Arch linux installs - which are wonderfully slim, easy to configure, and light on resources.

It burns 450Watts on average, which is higher than I'd like, but mostly because I have jellyfin and whisper/willow (self hosted home automation via voice control) as GPU accelerated loads - so I'm running an old nvidia 1060 and 2080.

Everything is plain old yaml, I explicitly avoid absolutely anything more complicated (including things like helm and kustomize - with very few exceptions) and it's... wonderful.

It's by far the least amount of "dev-ops" I've had to do for self hosting. Things work, it's simple, spinning up new service is a new folder and 3 new yaml files (0-namespace.yaml, 1-deployment.yaml, 2-ingress.yaml) which are just copied and edited each time.
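As a sketch of that three-file pattern (all names here are invented placeholders, not the commenter's actual manifests):

```yaml
# 0-namespace.yaml -- everything for the service lives in one namespace
apiVersion: v1
kind: Namespace
metadata:
  name: myapp
---
# 1-deployment.yaml -- the Deployment plus its ClusterIP Service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: myapp
spec:
  replicas: 1
  selector:
    matchLabels: { app: myapp }
  template:
    metadata:
      labels: { app: myapp }
    spec:
      containers:
        - name: myapp
          image: registry.example/myapp:latest
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: myapp
spec:
  selector: { app: myapp }
  ports:
    - port: 80
      targetPort: 8080
---
# 2-ingress.yaml -- route a hostname to the Service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  namespace: myapp
spec:
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port: { number: 80 }
```

Copy the folder, search-and-replace the name, `kubectl apply -f .`, done.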

Any three machines can go down and the cluster stays up (metalLB is really, really cool - ARP/NDP announcements mean any machine can announce as the primary load balancer and take the configured IP). Sometimes services take a minute to reallocate (and jellyfin gets priority over willow if I lose a gpu, and can also deploy with cpu-only transcoding as a fallback), and I haven't tried to be clever getting 100% uptime because I mostly don't care. If I'm down for 3 minutes, it's not the end of the world. I have a couple of commercial services in there, but it's free hosting for family businesses, they can also afford to be down an hour or two a year.

Overall - I'm not going back. It's great. Strongly, STRONGLY recommend k3s over microk8s. Definitely don't want to go back to single machine wrangling. The learning curve is steeper for this... but man do I spend very little time thinking about it at this point.

I've streamed video from it as far away as literally the other side of the world (GA, USA -> Taiwan). Amazon/Google/Microsoft have everyone convinced you can't host things yourself. Even for tiny projects people default to VPS's on a cloud. It's a ripoff. Put an old laptop in your basement - faster machine for free. At GCP prices... I have 30k/year worth of cloud compute in my basement, because GCP is a god damned rip off. My costs are $32/month in power, and a network connection I already have to have, and it's replaced hundreds of dollars/month in subscription costs.

For personal use-cases... basement cloud is where it's at.


> It burns 450Watts on average

To put that into perspective, that's more than my entire household including my server that has an old GPU in it

Water heating is electric yet we still don't use 450W×year≈4MWh of electricity. In winter we just about reach that as a daily average (as a household) because we need resistive heating to supplement the gas system. Constantly 450W is a huge amount of energy for flipping some toggles at home with voice control and streaming video files


That's also only four and a half incandescent lightbulbs. Not enough to heat your house ;)


Remember that modern heating and hot water systems have a >1 COP, meaning basically they provide more heat than the input power. Air-sourced heat pumps can have a COP of 2-4, and ground source can have 4-5, meaning you can get around 1800W of heat out of that 450W of power. That's ignoring places like Iceland where geothermal heat can give you effectively free heat. Ditto for water heating, 2-4.5 COP.

Modern construction techniques including super insulated walls and tight building envelops, heat exchangers, can dramatically reduce heating and cooling loads.

Just saying it's not as outrageous as it might seem.


> Remember that modern heating and hot water systems have a >1 COP, meaning basically they provide more heat than the input power.

Oh for sure! Otherwise we'd be heating our homes directly with electricity.

Thanks for putting concrete numbers on it!


And yet it's far more economical for me than paying for streaming services. A single $30/m bill vs nearly $100/m saved after ditching all the streaming services. And that's not counting the other saas products it replaced... just streaming.

Additionally - it's actually not that hard to put this entire load on solar.

4x350watt panels, 1 small inverter/mppt charger combo and a 12v/24v battery or two will do you just fine in the under $1k range. Higher up front cost - but if power is super expensive it's a one time expense that will last a decade or two, and you get to feel all nice and eco-conscious at the same time.

Or you can just not run the GPUs, in which case my usage falls back to ~100w. You can drive lower still - but it's just not worth my time. It's only barely worth thinking about at 450W for me.


I'm not saying it should be cheaper to run this elsewhere, I'm saying that this is a super high power draw for the utility it provides

My own server doesn't run voice recognition so I can't speak to that (I can only opine that it can't be worth a constant draw of 430W to get rid of hardware switches and buttons), but my server also does streaming video and replaces SaaS services, so similar to what you mention, at around 20W


Found the European :) With power as cheap as it is in the US, some of us just haven't had to worry about this as much as we maybe should. My rack is currently pulling 800W and is mostly idle. I have a couple projects in the works to bring this down, but I really like mucking around with old enterprise gear and that stuff is very power hungry.

Dell R720 - 125W

Primary NAS - 175W

Friend's Backup NAS - 100W

Old i5 Home Server - 100W

Cisco 2921 VoIP router - 80W

Brocade 10G switch - 120W

Various other old telecom gear - 100W


I care about the cost far less than the environmental impact. I guess that's also a European tell?


Perhaps. Many people in America also claim to care about the environmental impact of a number of things. I think many more people care performatively than transformatively. Personally, I don't worry too much about it. It feels like a lost cause and my personal impact is likely negligible in the end.


Then offsetting that cost to a cloud provider isn't any better.

450W just isn't that much power as far as "environmental costs" go. It's also super trivial to put on solar (actually my current project - although I had to scale the solar system way up to make ROI make sense because power is cheap in my region). But seriously, panels are cheap, LFP batteries are cheap, inverters/mppts are cheap. Even in my region with the cheap power, moving my house to solar has returns in the <15 years range.


> Then offsetting that cost to a cloud provider isn't any better.

Nobody made that claim

> 450W just isn't that much power as far as "environmental costs" go

It's a quarter of one's fair share per the philosophy of https://en.wikipedia.org/wiki/2000-watt_society

If you provide for yourself (e.g. run your IT farm on solar), by all means, make use of it and enjoy it. Or if the consumption serves others by doing wind forecasts for battery operators or hosts geographic data that rescue workers use in remote places or whatnot: of course, continue to do these things. In general though, most people's home IT will fulfil mostly their own needs (controlling the lights from a GPU-based voice assistant). The USA and western Europe have similarly rich lifestyles but one has a more than twice as great impact on other people's environment for some reason (as measured by CO2-equivalents per capita). We can choose for ourselves what role we want to play, but we should at least be aware that our choices make a difference


> My rack is currently pulling 800W and _is mostly idle_.

Emphasis mine. I have a rack that draws 200w continuously and I don't feel great about it, even though I have 4.8kW of panels to offset it.


It absolutely is. Americans dgaf, they're driving gas guzzles on subsidized gas and cry when it comes close to half the cost of normal countries.


In America, taxes account for about a fifth of the price of a unit of gas. In Europe, it varies around half.

The remaining difference in cost is boosted by the cost of ethanol, which is much cheaper in the US due to abundance of feedstock and heavy subsidies on ethanol production.

The petrol and diesel account for a relatively small fraction on both continents. The "normal" prices in Europe aren't reflective of the cost of the fossil fuel itself. In point of fact, countries in Europe often have lower tax rates on diesel, despite being generally worse for the environment.


Good ol 'murica bad' strawmen.

Americans drive larger vehicles because our politicians stupidly decided mandating fuel economy standards was better than a carbon tax. The standards are much laxer for larger vehicles. As a result, our vehicles are huge.

Also, Americans have to drive much further distances than Europeans, both in and between cities. Thus gas prices that would be cheap to you are expensive to them.

Things are the way they are because basic geography, population density, and automotive industry captured regulatory and zoning interests. You really can't blame the average American for this; they're merely responding to perverse incentives.


How is this in any way relevant to what I said? You're just making excuses, but that doesn't change the fact that americans don't give a fuck about the climate, and they objectively pollute far more than those in normal countries.


If you can't see how what I said was relevant, perhaps you should work on your reading comprehension. At least half of Americans do care about the climate and the other half would gladly buy small trucks (for example) if those were available.

It's lazy to dunk on America as a whole, go look at the list of countries that have met their climate commitments and you'll see it's a pretty small list. Germany reopening coal production was not on my bingo card.


I run a similar number of services on a very different setup. Administratively, it’s not idempotent but Proxmox is a delight to work with. I have 4 nodes, with a 14900K CPU with 24 cores being the workhorse. It runs a Windows server with RDP terminal (so multiple users can get access windows through RDP and literally any device), Jellyfin, several Linux VMs, a pi-hole cluster (3 replicas), just to name a few services. I have vGPU passthrough working (granted, this bit is a little clunky).

It is not as fancy/reliable/reproducible as k3s, but with a bunch of manual backups and a ZFS (or BTRFS) storage cluster (managed by a virtualized TrueNAS instance), you can get away with it. Anytime a disk fails, just replace and resilver it and you’re good. You could configure certain VMs for HA (high availability) where they will be replicated to other nodes that can take over in the event of a failure.

Also I’ve got tailscale and pi-hole running as LXC containers. Tailscale makes the entire setup accessible remotely.

It’s a different paradigm that also just works once it’s setup properly.


I have a question if you don't mind answering. If I understand correctly, MetalLB in Layer 2 mode essentially fills the same role as something like Keepalived would, though without VRRP.

So, can you use it to give your whole cluster _one_ external IP that makes it accessible from the outside, regardless of whether any node is down?

Imo this part is what can be confusing to beginners in self hosted setups. It would be easy and convenient if they could just point DNS records of their domain to a single IP for the cluster and do all the rest from within K3s.


Yes. I have configured metalLB with a range of IP addresses on my local LAN outside the range distributed by my DHCP server.

Ex - DHCP owns 10.0.0.2-10.0.0.200, metalLB is assigned 10.0.0.201-10.0.0.250.

When a service requests a loadbalancer, metallb spins up a service on any given node, then uses ARP to announce to my LAN that that node's mac address is now that loadbalancer's ip. Internal traffic intended for that IP will now resolve to the node's mac address at the link layer, and get routed appropriately.

If that node goes down, metalLB will spin up again on a remaining node, and announce again with that node's mac address instead, and traffic will cut over.

It's not instant, so you're going to drop traffic for a couple seconds, but it's very quick, all things considered.

It also means that from the point of view of my networking - I can assign a single IP address as my "service" and not care at all which node is running it. Ex - if I want to expose a service publicly, I can port forward from my router to the configured metalLB loadbalancer IP, and things just work - regardless of which nodes are actually up.
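In recent MetalLB versions (which configure via CRDs rather than the older ConfigMap), the setup described above boils down to roughly this; the names and the address range matching the example are assumptions:

```yaml
# Reserve 10.0.0.201-250 for LoadBalancer services,
# outside the DHCP range (10.0.0.2-200).
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lan-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.0.0.201-10.0.0.250
---
# Announce those IPs on the local LAN via ARP/NDP (Layer 2 mode).
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lan-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - lan-pool
```

Any `Service` of `type: LoadBalancer` then gets an IP from the pool automatically.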

---

Note - this whole thing works with external IPs as well, assuming you want to pay for them from your provider, or IPV6 addresses. But I'm cheap and I don't pay for them because it requires getting a much more expensive business line than I currently use. Functionally - I mostly just forward 80/443 to an internal IP and call it done.


Thank you so much for the detailed explanation!

That sounds so interesting and useful that you've convinced me to try it out :)


450W is ~£100 monthly. It's a luxury budget to host hobby stuff in a cloud.


It’s $30 in my part of the US. Less of a luxury.


We used to pay AU$30 for the entire house, which included everything except cooking, and it did include a 10-year-old 1RU rackmount server. Electricity isn't particularly cheap here.


How do you deal with persistent volumes for configuration, state, etc? That’s the bit that has kept me away from k3s (I’m running Proxmox and LXC for low overhead but easy state management and backups).


Longhorn.io is great.


Yeah, but you have to have some actual storage for it, and that may not be feasible across all nodes in the right amounts.

Also, replicated volumes are great for configuration, but "big" volume data typically lives on a NAS or similar, and you do need to get stuff off the replicated volumes for backup, so things like replicated block storage do need to expose a normal filesystem interface as well (tacking on an SMB container to a volume just to be able to back it up is just weird).


Sure - none of that changes that longhorn.io is great.

I run both an external NAS as an NFS service and longhorn. I'd probably just use longhorn at this point, if I were doing it over again. My nodes have plenty of sata capacity, and any new storage is going into them for longhorn at this point.

I back up to an external provider (backblaze/wasabi/s3/etc). I'm usually paying less than a dollar a month for backups, but I'm also fairly judicious in what I back up.

Yes - it's a little weird to spin up a container to read the disk of a longhorn volume at first, but most times you can just use the longhorn dashboard to manage volume snapshots and backup scheduling as needed. Ex - if you're not actually trying to pull content off the disk, you don't ever need to do it.

If you are trying to pull content off the volume, I keep a tiny ssh/scp container & deployment hanging around, and I just add the target volume real fast, spin it up, read the content I need (or more often scp it to my desktop/laptop) and then remove it.


Do you have documentation somewhere that you can share?


I do things somewhat similarly but still rely on Helm/Kustomize/ArgoCD as it's what I know best. I don't have documentation to offer, but I do have all of it publicly at https://gitlab.com/lama-corp/infra/infrastructure It's probably a bit more involved than the OP's setup as I operate my own AS, but hopefully you'll find some interesting things in there.


You should look into FluxCD; it makes a lot of this stuff even simpler.


"Basement Cloud" sounds like either a dank cannabis strain, or an alternative British rock emo grunge post-hardcore song. As in "My basement cloud runs k420s, dude."

https://www.youtube.com/watch?v=K-HzQEgj-nU


Or microk8s. I'm curious what it is about k8s that is sucking up all these resources. Surely the control plane is mostly idle when you aren't doing things with it?


There are 3 components to "the control plane", and realistically only one of them is what you meant by idle. The node-local kubelet (which reports in the state of affairs and asks if there is any work) is a constantly active thing, as one would expect from such a polling setup. etcd, or its replacement, is constantly(?) firing off watch notifications or reconciliation notifications based on the inputs from the aforementioned kubelet updates. Only the actual kube-apiserver is conceptually idle, as I'm not aware of any compute it does itself except in response to requests made of it.

Put another way, in my experience running clusters, $(ps auwx) or its $(top) friend always shows etcd or sqlite generating all of the "WHAT are you doing?!" load, and those also represent the actual risk to running Kubernetes, since the apiserver is mostly stateless[1]

1: but holy cow watch out for mTLS because cert expiry will ruin your day across all of the components


I've noticed that etcd seems to do an awful lot of disk writes, even on an "idle" cluster. Nothing is changing. What is it actually doing with all those writes?


Almost certainly it's the propagation of the kubelet checkins rippling through etcd's accounting system[1]. Every time these discussions come up I'm always left wondering "I wonder if Valkey would behave the same?" or Consul (back when it was sanely licensed). But I am now convinced after 31 releases that the pluggable KV ship has sailed and they're just not interested. I, similarly, am not yet curious enough to pull a k0s and fork it just to find out

1: relatedly, if you haven't ever tried to run a cluster bigger than about 450 Nodes, that's actually the whole reason kube-apiserver --etcd-servers-overrides exists: the torrent of Node status updates will knock over the primary etcd, so one has to offload /events into its own etcd
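For reference, a sketch of what that offload looks like in a kubeadm ClusterConfiguration; the events-etcd endpoint here is a made-up placeholder, not something from the comment above:

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    # route the high-churn /events prefix to a dedicated etcd cluster
    # (hypothetical endpoint; substitute your own events etcd)
    etcd-servers-overrides: "/events#https://etcd-events.example.internal:2379"
```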


How hard is it to host a Postgres server on one node and access it from another?


I deployed CNPG (https://cloudnative-pg.io/ ) on my basement k3s cluster, and was very impressed with how easily I could host a PG instance for a service outside the cluster, as well as with its good practices for hosting DB clusters inside the cluster.

Oh, and it handles replication, failover, backups, and a litany of other useful features to make running a stateful database, like postgres, work reliably in a cluster.
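To give a feel for how little it takes, here's a minimal sketch of a CNPG Cluster manifest; the name and storage size are placeholders, and the operator fills in replication and failover from the instance count:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-demo          # hypothetical name
spec:
  instances: 3           # one primary + two replicas; failover is handled for you
  storage:
    size: 10Gi
```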


It’s Kubernetes, out of the box.


> Kubernetes is simply too resource-intensive to run on a $10/month VPS with just 1 shared vCPU and 2GB of RAM

I hate sounding like an Oracle shill, but Oracle Cloud's Free Tier is hands-down the most generous. It can support running quite a bit, including a small k8s cluster[1]. Their k8s backplane service is also free.

They'll give you 4 x ARM64 cores and 24GB of ram for free. You can split this into 1-4 nodes, depending on what you want.

[1] https://www.oracle.com/cloud/free/


One thing to watch out for is that you pick your "home region" when you create your account. This cannot be changed later, and your "Always Free" instances can only be created in your home region (the non-free tier doesn't have that restriction).

So choose your home region carefully. Also, note that some regions have multiple availability domains (OCI-speak for availability zones) but some only have one AD. Though if you're only running one free instance then ADs don't really matter.


A bit of a nitpick. You get monthly credit for 4c/24gb on ARM, no matter the region. So even if you chose your home region poorly, you can run those instances in any region and only be on the hook for the disk cost. I found this all out the hard way, so I'm paying $2/month to oracle for my disks.


I don't know the details, but I know I made this mistake and I still have my Free Tier instances hosted in a different region than my home. It's charged me $1 for a month already, so I'm pretty sure it's working.


the catch is: no commercial usage, and half the time you try to spin up an instance it'll tell you there's no room left


That limitation (spinning up an instance) only exists if you don't put a payment card in. If you put a payment card in, it goes away immediately. You don't have to actually pay anything, you can provision the always free resources, but obviously in this regard you have to ensure that you don't accidentally provision something with cost. I used terraform to make my little kube cluster on there and have not had a cost event at all in over 1.5 years. I think at one point I accidentally provisioned a volume or something and it cost me like one cent.


> no commercial usage

I think that's if you are literally on their free tier, vs. having a billable account which doesn't accumulate enough charges to be billed.

Similar to the sibling comment - you add a credit card and set yourself up to be billed (which removes you from the "free tier"), but you are still granted the resources monthly for free. If you exceed your allocation, they bill the difference.


Honestly I’m surprised they even let you provision the resources without a payment card. Seems ripe for abuse


A credit card is required for sign up, but it won't be set up as a billing card until you add it. One curious thing they do, though: the free trial is the only way to create a new cloud account. You can't become a nonfree customer from the get-go. This is weird because their free trial signup is horrible. The free trial is in very high demand, so understandably they refuse a lot of accounts that they would probably like to have as nonfree customers.


I would presume account sign up is a loss leader in order to get ~spam~ marketing leads, and that they don't accept mailinator domains


They also, like many other cloud providers, need a real physical payment card. No privacy.com stuff. No virtual cards. Of course they don’t tell you this outright, because obscurity fraud blah blah blah, but if you try to use any type of virtual card it’s gonna get rejected. And if your naïve ass thought you could pay with the virtual card you’ll get a nice lesson in how cloud providers deal with fraud. They’ll never tell you that virtual cards aren’t allowed, because something something fraud, your payment will just mysteriously fail and you’ll get no guidance as to what went wrong and you have to basically guess it out.

This is basically any cloud provider by the way, not specific to Oracle. Ran into this with GCP recently. Insane experience. Pay with card. Get payment rejected by fraud team after several months of successful same amount payments on the same card and they won’t tell what the problem is. They ask for verification. Provide all sorts of verification. On the sixth attempt, send a picture of a physical card and all holds removed immediately

It’s such a perfect microcosm capturing of dealing with megacorps today. During that whole ordeal it was painfully obvious that the fraud team on the other side were telling me to recite the correct incantation to pass their filters, but they weren’t allowed to tell me what the incantation was. Only the signals they sent me and some educated guesswork were able to get me over the hurdle


> send a picture of a physical card and all holds removed immediately

So you're saying there's a chance to use a prepaid card if you can copy its digits onto a real-looking plastic card? Lol


Unironically yes. The (real) physical card I provided was a very cheap looking one. They didn’t seem to care much about its look but rather the physicality of it


Using AWS with virtual debit cards works all right for me. Revolut cards work fine. What may also be a differentiator: the phone number used for registration is also registered to an account with an established track record, and has a physical card for payments. (just guessing)


>No privacy.com stuff. No virtual cards.

I used a privacy.com Mastercard linked to my bank account for Oracle's payment method to upgrade to PAYG. It may have changed, this was a few months ago. Set limit to 100, they charged and reverted $100.


There are tons of horror stories about OCI's free tier (check r/oraclecloud on reddit, tl;dr: your account may get terminated at any moment and you will lose access to all data with no recovery options). I wouldn't suggest putting anything serious on it.


They will not even bother sending you an email explaining why, and you will not be able to ask, because the system will just say your password is incorrect when you try to log in or reset it.

If you are on the free tier, they have nothing to lose; only you do. So be particularly mindful of making a calendar note for changing your CC before expiration and things like that.

It’s worth paying for another company just for the peace of mind of knowing they will try to persuade you to pay before deleting your data.


Are all of those stories related to people who use it without putting any payment card in? I’ve been happily siphoning Larry Ellisons jet fuel pennies for a good year and a half now and have none of these issues because I put a payment card in



Good call out. I used the machines defined here and have never had any sort of issue like those links describe: https://github.com/jpetazzo/ampernetacle


Nope, my payment method was already entered.


IME, the vast majority of those horror stories end up being from people who stay in the "trial" tier and don't sign up for pay-as-you-go (one extra, easy step), and Oracle's ToS make it clear that trial accounts and resources can and do get terminated at any time. And at least some of those people admitted, with some prodding, that they were also trying to run torrents or VPNs to get around geographical restrictions.

But yes, you should always have good backups and a plan B with any hosting/cloud provider you choose.


Can confirm (old comment of mine saying the same https://news.ycombinator.com/item?id=43215430)


I recently wrote a guide on how to create a free 3-node cluster in Oracle Cloud: https://macgain.net/posts/free-k8-cluster . The guide currently uses kubeadm to create a 3-node (1 control plane, 2 worker) cluster.


Just do it like the olden days, use ansible or similar.

I have a couple of dedicated servers I fully manage with ansible. It's docker compose on steroids. Use traefik and labeling to handle reverse proxy and TLS certs in a generic way, with authelia as a simple auth provider. There are a lot of example projects on GitHub.

A weekend of setup and you have a pretty easy to manage system.


What is the advantage of traefik over oldschool Nginx?


Traefik has some nice labeling for Docker that allows you to colocate your reverse proxy config with your container definition. It's slightly more convenient than NGINX for that use case with compose; it effectively saves you a dedicated virtualhost conf by setting some labels.

One can read more here: https://doc.traefik.io/traefik/routing/providers/docker/

This obviously has some limits and becomes significantly less useful when one requires more complex proxy rules.
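To make the label idea concrete, a minimal compose sketch; the hostname and image tag here are placeholders rather than anything from the comment above:

```yaml
services:
  traefik:
    image: traefik:v3.1
    command:
      - --providers.docker=true
      - --entrypoints.web.address=:80
    ports:
      - "80:80"
    volumes:
      # Traefik watches the Docker socket for container labels
      - /var/run/docker.sock:/var/run/docker.sock:ro

  whoami:
    image: traefik/whoami
    labels:
      # this one label replaces a whole virtualhost conf
      - traefik.http.routers.whoami.rule=Host(`whoami.example.com`)
```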


Basically what c0balt said.

It's zero config and super easy to set everything up. Just run the traefik image, and add docker labels to your other containers. Traefik inspects the labels and configures reverse proxy for each. It even handles generating TLS certs for you using letsencrypt or zerossl.


I thought this context was outside of Docker, because they used ansible as docker compose alternative. But maybe I misunderstood.


Ah yeah, I guess I wasn't clear. I meant use ansible with the docker_container module. It's essentially docker compose - I believe they both use docker.py under the hood.
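For anyone curious, a sketch of such a task (the module lives in the community.docker collection; the image and ports are hypothetical):

```yaml
# hypothetical playbook task; names and image are placeholders
- name: Run the app container
  community.docker.docker_container:
    name: myapp
    image: ghcr.io/example/myapp:latest
    restart_policy: unless-stopped
    published_ports:
      - "8080:8080"
```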


Ah yes, makes much more sense.


I created a script that reads compose annotations and creates config for cloudflare tunnel and zero trust apps. Allows me to reach my services on any device without VPN and without exposing them on the internet.


There's very little advantage IMO. I've used both. I always end up back at Nginx. Traefik was just another configuration layer that got in the way of things.


Traefik is waaay simpler - 0 config, just use docker container labels. There is absolutely no reason to use nginx these days.

I should know, as I spent years building and maintaining a production ingress controller for nginx at scale, and I'd choose Traefik every day over that.


> I'm constantly reinventing solutions to problems that Kubernetes already solves—just less efficiently.

But you've already said yourself that the cost of using K8s is too high. In one sense, you're solving those solutions more efficiently, it just depends on the axis you use to measure things.


The original statement is ambiguous. I read it as "problems that k8s already solves -- but k8s is less efficient, so can't be used".


That picture with the almost-empty truck seems to be the situation that he describes. He wants the 18 wheeler truck, but it is too expensive for just a suitcase.


> Kubernetes is simply too resource-intensive to run on a $10/month VPS with just 1 shared vCPU and 2GB of RAM.

That's more than what I'm paying for far fewer resources than Hetzner. I'm paying about $8 a month for 4 vCPUs and 8GB of RAM: https://www.hetzner.com/cloud

Note that the really affordable ARM servers are German only, so if you're in the US you'll have to deal with higher latency to save that money, but I think it's worth it.


I recently set up an arm64 VPS at netcup: https://www.netcup.com/en/server/arm-server Got it with no location fee (and 2x storage) during the easter sale but normally US is the cheapest.


That's pretty cheap. I have 4 vCPUs, 8GB RAM, 80GB disk, and 20TB traffic for €6. NetCup looks like it has 6VCPU, 8GB RAM, 256 GB, and what looks like maybe unlimited traffic for €5.26. That's really good. And it's in the US, where I am, so SSH would be less painful. I'll have to think about possibly switching. Thanks for the heads up.


Thank you for sharing this. Do you have a referral link we can use to give you a little credit for informing us?


Sure, if you still want it: https://hetzner.cloud/?ref=WwByfoEfJJdv

I guess it gives you 20 euros in credit, too. That's nice.


I've been using Docker swarm for internal & lightweight production workloads for 5+ years with zero issues. FD: it's a single node cluster on a reasonably powerful machine, but if anything, it's over-specced for what it does.

Which I guess makes it more than good enough for hobby stuff - I'm playing with a multi-node cluster in my homelab and it's also working fine.


I think Docker Swarm makes a lot of sense for situations where K8s is too heavyweight. "Heavyweight" either in resource consumption, or just being too complex for a simple use case.


The only problem is Docker Swarm is essentially abandonware after Docker was acquired by Mirantis in 2019. Core features still work but there is a ton of open issues and PRs which are ignored. It's fine if it works but no one cares if you found a bug or have ideas on how to improve something, even worse if you want to contribute.


Yep it's unfortunate, "it works for me" until it doesn't.

OTOH it's not a moving target. Docker historically has been quite infamous for that, we were talking about half-lives for features, as if they were unstable isotopes. It took initiatives like OCI to get things to settle.

K8s tries to solve the most complex problems, at the expense of leaving simple things stranded. If we had something like OCI for clustering, it would most likely take the same shape.


Podman is a fairly nice bridge. If you are familiar with Kubernetes yaml, it is relatively easy to do docker-compose like things except using more familiar (for me) K8s yaml.

In terms of the cloud, I think Digital Ocean costs about $12 / month for their control plane + a small instance.


I found k3s to be a happy medium. It feels very lean and works well even on a Pi, and scales OK to a few-node cluster if needed. You can even host the database on a remote MySQL server if local SQLite is too much I/O.
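If you go that route, k3s accepts its server flags from a config file; a sketch, where the host and credentials are placeholders:

```yaml
# /etc/rancher/k3s/config.yaml -- CLI flags can be given as YAML keys.
# Hypothetical endpoint; omit this key and k3s falls back to embedded SQLite.
datastore-endpoint: "mysql://k3s:password@tcp(db.example.internal:3306)/k3s"
```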


NixOS works really well for me. I used to write these kinds of idempotent scripts too but they are usually irrelevant in NixOS where that's the default behavior.


And regarding this part of the article

> Particularly with GitOps and Flux, making changes was a breeze.

i'm writing comin [1] which is GitOps for NixOS machines: you Git push your changes and your machines fetch and deploy them automatically.

[1] https://github.com/nlewo/comin


This is exactly why I built https://canine.sh -- basically for indie hackers to have the full experience of Heroku with the power and portability of Kubernetes.

For single server setups, it uses k3s, which takes up ~200MB of memory on your host machine. It's not ideal, but the pain of trying to wrangle Docker deployments, and the cheapness of Hetzner, made it worth it.


How does it compare to Coolify and Dokploy?


Neither of those uses Kubernetes, unfortunately. The tool has kind of a bad rap, but every company I've worked at has eventually migrated onto Kubernetes.


Sure, I'm looking for more of a personal project use case where it doesn't much matter to me whether it uses Kubernetes or not, I'm more interested in concrete differences.


Ah yeah, then I'd say the biggest difference is that it can use Helm to install basically anything in the world on your cluster


I run my private stuff on a hosted vultr k8s cluster with 1 node for $10-$20 a month. All my hobby stuff is running on that "personal cluster" and it is that perfect sweetspot for me that you're talking about

I don't use ingresses or loadbalancers because those cost extra, and either have the services exposed through tailscale (with tailscale operator) for stuff I only use myself, or through cloudflare argo tunnels for stuff I want internet accessible

(Once a project graduates and becomes more serious, I migrate the container off this cluster and into a proper container runner)


It’s been a couple of years since I’ve last used it, but if you want container orchestration with a relatively small footprint, maybe Hashicorp Nomad (perhaps in conjunction with Consul and Traefik) is still an option. These were all single binary tools. I did not personally run them on 2G mem VPSes, but it might still be worthwhile for you to take a look.

It looks like Nomad has a driver to run software via isolated fork/exec, as well, in addition to Docker containers.


The solution to this is to not solve all the problems a billion dollar tech does on a personnal project.

Let it not be idempotent. Let it crash sometimes.

We lived without k8s for years and the web was OK. Your users will survive.


Yeah, unless you're doing k8s for the purpose of learning job skills, it's way overkill. Just run a container with docker, or a web server outside a container if it's a website. Way easier and it will work just fine.


I’ve been using https://www.coolify.io/ self hosted. It’s a good middle ground between full blown k8s and systemd services. I have a home lab where I host most of my hobby projects though. So take that into account. You can also use their cloud offering to connect to VPSs


> I'm stuck with manual docker compose up/down commands over SSH

Out of curiosity, what is so bad about this for smaller projects?


Just go with a cloud provider that offers free control plane and shove a bunch of side projects into 1 node. I end up around $50 a month on GCP (was a bit cheaper at DO) once you include things like private docker registry etc.

The marginal cost of an additional project on the cluster is essentially $0


I've run K3s on a couple of Raspberry Pis as a homelab in the past. It's lightweight and ran nicely for a few years, but even so, one Pi was always dedicated as the controller, which seemed like a waste.

Recently I switched my entire setup (few Pi's, NAS and VM's) to NixOS. With Colmena[0] I can manage/update all hosts from one directory with a single command.

Kubernetes was a lot of fun, especially the declarative nature of it. But for small setups, where you are still managing the plumbing (OS, networking, firewall, hardening, etc) yourself, you still need some configuration management. Might as well put the rest of your stuff in there also.

[0] https://colmena.cli.rs/unstable/


$6/m will likely bring you peace of mind - Netcup VPS 1000 ARM G11

    6 vCore (ARM64)
    8 GB RAM
    256 GB NVMe


They also have regular promotions that offer e.g. double the disk space.

There you get

    6 vCore (ARM64)
    8 GB RAM
    512 GB NVMe
for $6/m - traffic inclusive. You can choose between "6 vCore ARM64, 8 GB RAM" and "4 vCore x86, 8 GB ECC RAM" for the same price. And much more, of course.

https://www.netcup.com/en/server/vps


I'm a cheapskate too, but at some point, the time you spend researching cheap hosting, signing up and getting deployed is not worth the hassle of paying a few more $ on bigger boxes.


Have you tried nixOS? I feel like it solves the functional aspect you're looking for.


I am curious why your no-revenue projects need the complexity, features and benefits of something like Kubernetes. Why can't you just do it the archaic way: compile your app, copy the files to a folder, run it there and never touch it for the next 5 years? If it is a dev environment with many changes, it's on a local computer, not on a VPS, I guess. Just curious by nature, I am.


The thing is, most of those enterprise-grade container orchestrations probably don't need k8s either.

The more I look into it, the more I think of k8s as a way to "move to micro services" without actually moving to micro services. Loosely coupled micro services shouldn't need that level of coordination if they're truly loosely coupled.


> Kubernetes is simply too resource-intensive to run on a $10/month VPS with just 1 shared vCPU and 2GB of RAM

To put this in perspective, that's less compute than a phone released in 2013, 12 years ago: the Samsung Galaxy S4. To find this level of performance in a computer, we have to go back even further.

The main issue is that Kubernetes has created good API and primitives for managing cloud stuff, and managing a single server is still kinda crap despite decades of effort.

I had K3s on my server, but replaced it with Docker + Traefik + Portainer - it's not great, but there's less idle CPU use and fewer moving parts


I believe that Kubernetes is something you want to use if you have 1+ full-time SRE on your team. I actually got tired of the complexity of Kubernetes, AWS ECS and Docker as well, and just built a tool to deploy apps natively on the host. What's wrong with using Linux native primitives - systemd, crontab, the native postgresql or redis packages? Those work as intended; you don't need them in a container.


SSH up/down can be scripted.

Or maybe look into Kamal?

Or use the Digital Ocean app service. Git integration, cheap, just run a container. But get your postgres from a cheaper VC-funded shop :)


Why not just use something like Cloud Run? If you're only running a microVM deploying it there will probably be at or near free.


I really like `DOCKER_HOST=ssh://... docker compose up -d`, what do you miss about Deployments?


I developed a tiny wrapper around docker compose which works for my use case: https://github.com/daitangio/misterio

It can manage multiple machines with just SSH access and Docker installed.


Please try https://github.com/skateco/skate, this is pretty much the exact same reason why I built it!


Virtual Kubelet is one step forward towards Kubernetes as an API

https://github.com/virtual-kubelet/virtual-kubelet


Why not minikube or one of the other resource-constrained k8s variants?

https://minikube.sigs.k8s.io/


I use Caprover to run about 26 services for personal projects on a Hetzner box. I like its simplicity. Worth it just for the one-click https cert management.


Have you tried k3s? I think it would run on a tiny vps like that and is a full stack. Instead of etcd it has sqlite embedded.


> I'm constantly reinventing solutions to problems that Kubernetes already solves

Another way to look at this is that Kubernetes created solutions to problems that were already solved at a lower scale. Crontabs, HTTP proxies, etc. were already solved at the individual server level. If you're used to running large coordinated clusters, then yes - it can seem like you're reinventing the wheel.


For $10 you can buy a VPS with a lot more resources than that on both Contabo and OVH


I've used caprover a bunch


What about Portainer? I deploy my compose files via git using it.


As a relatively well-educated Japanese native speaker, I too experience this problem when writing Japanese on paper - being unable to write many kanji characters by hand. I am no exception among Japanese native speakers. While the author seems to interpret this problem as something crucial, I question whether it truly is.

The orthography of Mandarin and Japanese includes a character inventory of thousands of characters, many of which comprise dozens of strokes. Although East Asian people have higher IQ scores on average, we are not superhuman - our memory capacity is bound by human limits, and the decreased frequency of actually writing kanji on paper has naturally resulted in our forgetting how to write many of them. Is this surprising?

Furthermore, orthography is not part of language in a fundamental sense - it's merely a useful tool that accompanies a language. Therefore, I do not see the writing system becoming less stable as a significant issue. Consider Korea as an example: they used to use kanji in their orthography but have almost completely eliminated it with virtually no adverse effects. While laypeople often assume orthography is an integral part of a language, this is just not the case from the linguistic perspective.


If you consider that a lot of people using the Latin alphabet do use cellphone autocomplete to check how to write an infrequently used word...

So I would say this text is biased by the "Western" view of the writer, something that could be categorized as "Orientalism". A study of this phenomenon is valid, even important. But this post is not a good study.


But autocomplete even for basic words? My wife is Chinese. I'll never forget when she was helping her family write some formal letter in Chinese in Microsoft Word and she simply could not input the numbers 1, 2, and 3 in Chinese because she forgot how. And I know this may be apples and oranges because this is keyboard input versus writing on paper but as a programmer who can type at a moderate pace since I was a kid (~120wpm) this was perplexing for me! And similar to the article, she's an Ivy league grad. Similarly, when she's communicating with her family via WeChat half the time she simply sends audio messages instead of text messages. I'm pretty surprised this method is so popular instead of some voice-to-text google assistant type system.


I think there may be some confusion. The standard Chinese characters for 1, 2, and 3 (一, 二, 三) are among the simplest characters in Chinese: literally just one, two and three horizontal strokes. These would be extremely difficult to forget! What your wife was likely trying to write were the special variants (壹, 贰, 叁) that are used on checks, official documents, etc. These were specifically designed to be hard to alter or forge (think the difference between writing "100" versus "ONE HUNDRED" on a check). Even highly educated Chinese people might need to look these up since they are specialized characters not used in everyday writing.


That explains it. Yup these were some sort of official / govt documents. Thanks for the explanation!

Edit. I should have realized that. I just came back from China and my kids were watching a children's show with the following subtitles: "一二一二一二一二一二一二一二一二一二一二". Took me a while to realize the subtitles were not broken. The characters were marching chanting "one two one two..." :)


I think this is specifically more an IME (input method software) issue than a typing one. Japanese has similar "official" numbers (壱, 弐, 参, maybe some of the few cases where modern Japanese is more simplified than Simplified Chinese). These numbers couldn't be easier to type. I just type 1, 2, 3 (i.e. the digit keys on top of my keyboard), hit the convert key and select the right character (I also get offered 三, ③, 3⃣,³ and several other options to choose from). That's it.

I tried the same with Google's IME and I couldn't use digits as input, like the Japanese IMEs let you do. I could find the character for 叁 quickly enough, but 壹 was only on the second or third page. Still, I suck at Chinese and I found it.

Now, writing these characters is an entirely different story. I think any character that's rarely written and appears only in one common word runs the risk of being forgotten, even if that word is quite simple and used on a day-to-day basis. A word like 喷嚏 (sneeze) in Chinese or 薔薇 (rose) in Japanese fits the bill.

The Japanese fallback, in case you forgot the character is quite simple: you'd just use either Katakana or Hiragana with different connotations[1]. I'm not quite sure what the fallback would be in Chinese, but I guess that would often be picking another character with a close or same pronunciation, as Chinese speakers often do on purpose as a sort of pun.

I also expect there are still fewer cases of "character amnesia" in China than Japan, since the fallback mechanism is simpler and more standardized in Japan, and children are taught far less Kanji in school than their counterparts in Mainland China, Hong Kong or Taiwan.

[1] While Hiragana gives a familiar connotation, writing the word as バラ in Katakana is "more official", if anything, since names of flora and fauna are conventionally written using Katakana in official contexts, especially when you want to use the exact scientific name. This is the equivalent of using Latin names in Western countries, e.g. Rosa hirtula would be サンショウバラ.


>The standard Chinese characters for 1, 2, and 3 (一, 二, 三) are among the simplest characters in Chinese: literally just one, two and three horizontal strokes.

Does that work for larger numbers, keep adding strokes?


No, 4 is 四. Numbers are simple characters, but only 1,2,3 are made by adding strokes.


I am not from Asia, so I would trust what your wife has to say more than me. But I would argue that it is common for people living in a country whose language differs from their native one to forget how to write or even say some simple words. Learning a new language takes a good deal of active effort.


It might be surprising but, in terms of written words, sneeze (喷嚏) is not "basic".


That's very much the impression I get. I've never seen pinyin used in Chinese writing, and the Chinese friends I've met have said they've never seen it either (they said they'd probably just look up the character or write a homonym instead, but even then it's pretty rare that it comes to that).

That's not to say it's never done, but it feels like an outlier. As if a friend found a word too hard to understand and drew a picture instead, and then the author wrote an article about how spelling is so difficult that it leads English speakers to draw words instead of writing them.

But the thing that struck me the most was just how confused people were when I asked them about it. It just didn't seem to be anything that was an actual issue for them.


> "This is such a gratifying experience, in fact, that I have actually kept a list of characters that I have observed Chinese people forget how to write. (A sick, obsessive activity, I know.) I have seen highly literate Chinese people forget how to write certain characters in common words like "tin can", "knee", "screwdriver", "snap" (as in "to snap one's fingers"), "elbow", "ginger", "cushion", "firecracker", and so on. And when I say "forget", I mean that they often cannot even put the first stroke down on the paper. Can you imagine a well-educated native English speaker totally forgetting how to write a word like "knee" or "tin can"? Or even a rarely-seen word like "scabbard" or "ragamuffin"? I was once at a luncheon with three Ph.D. students in the Chinese Department at Peking University, all native Chinese (one from Hong Kong). I happened to have a cold that day, and was trying to write a brief note to a friend canceling an appointment that day. I found that I couldn't remember how to write the character 嚔, as in da penti 打喷嚔 "to sneeze". I asked my three friends how to write the character, and to my surprise, all three of them simply shrugged in sheepish embarrassment. Not one of them could correctly produce the character. Now, Peking University is usually considered the "Harvard of China". Can you imagine three Ph.D. students in English at Harvard forgetting how to write the English word "sneeze"?? Yet this state of affairs is by no means uncommon in China. English is simply orders of magnitude easier to write and remember. No matter how low-frequency the word is, or how unorthodox the spelling, the English speaker can always come up with something, simply because there has to be some correspondence between sound and spelling. One might forget whether "abracadabra" is hyphenated or not, or get the last few letters wrong on "rhinoceros", but even the poorest of spellers can make a reasonable stab at almost anything. 
By contrast, often even the most well-educated Chinese have no recourse but to throw up their hands and ask someone else in the room how to write some particularly elusive character."

- https://pinyin.info/readings/texts/moser.html


Not at all - forgetting kanji just isn't similar to forgetting how to spell English words, as I think TFA made fairly clear. It's the simplest analogy to make, but it's not near enough to draw conclusions from.

The analogy I've used in the past is, you read kanji with your mind but you write them with your hand, so being unable to remember a kanji is more akin to forgetting a guitar chord or a keyboard shortcut - if your hands stop making the motions, you'll eventually forget them.


Most people cannot accurately draw a bicycle.


Yeah - the other analogy I've used is that everyone can recognize a Starbucks logo, but even if you went to the trouble of learning to accurately draw one, you'd forget if you didn't practice.


I am Italian and was taught cursive in elementary school and I can barely remember upper case cursive letters[0] thirty years later.

In my experience, most people of my generation have generally forgotten it and usually just write "lower case letters but big" or block letters.

So yeah, I don't think there's anything inherently Chinese about forgetting how to write things you don't use.

[0] https://www.genitorialmente.it/2016/10/alfabeto-corsivo-maiu...


I'm studying Japanese at the moment, and what struck me is how important context is, particularly in reading. You often need to read 1-3 characters ahead to know how a word should be read and interpreted. That's not really a thing in English - a word is a word, and the individual letters it's composed of are almost always pronounced the same way.

I think digital is a big crutch for Japanese/Chinese because you have input methods that help you write what you want to say, so you don't actually need to remember how to write kanji as much in daily life.


> You need to know where to read 1-3 letters ahead to read a word and interpret it. That's not really a thing in English

It happens in English too: you see a chunk of letters and mis-predict which word they represent in a way that affects its meaning [0], and sometimes that will also affect pronunciation. [1]

An example from the link:

> "The complex houses married and single soldiers and their families."

A reader linearly scanning along doesn't know whether "complex" is an adjective or a noun, and then whether "houses" is a noun or a verb. I'm pretty sure all human languages have similar problems where a certain amount of look-ahead or backtracking is necessary.

For another example to highlight pronunciation changes, consider the ambiguity of:

"I saw the rhino live in the zoo."

That could mean that the rhino was doing the verb of living, in which case it rhymes with "give", or it could mean that the speaker was seeing it in person, in which case it rhymes with "drive".

[0] https://en.wikipedia.org/wiki/Garden-path_sentence

[1] https://en.wikipedia.org/wiki/Heteronym_(linguistics)


seems like an opportune time to also talk about buffalo buffalo buffalo buffalo buffalo buffalo buffalo buffalo.

https://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffal...


The Chinese equivalent would be the "The story of Mr. Shi Eating Lions":

https://www.yellowbridge.com/onlinelit/stonelion.php

Both rely on intonation (in addition to volume and pauses) for disambiguation, but the fun trick is that in the Chinese version the intonation is an integral part of the lexeme (i.e. it distinguishes between "words").

But I have to say, these kinds of sentences (and full-fledged poems) are quite a different beast from simple cases of garden path sentences or syntactic ambiguity[1]. The lion-eating-poet poem and the "buffalo buffalo buffalo..." sentence are both highly contrived and unlikely to be understood correctly on the first few goes even with perfect prosody. They are cool "language hacks", but they do not occur in daily language, and I personally believe (although I guess die-hard generative linguists would disagree) that they don't teach us very much about the language itself (except for the cool artistic possibilities it opens up).

[1] https://en.wikipedia.org/wiki/Syntactic_ambiguity


Incorrect capitalization.


When this happens in English, teachers will label this as "bad English" and ask you to rewrite. That's how the formal language deals with this problem.


If anything, isn't that an informal solution? It relies on other people to complain that they dislike the sentence, without being able to point to any hard-and-fast rule.


The hard and fast rule is that repeating a word right next to itself is generally frowned upon. It comes up with “that” a lot, like “he said that, that led to something else”. Sometimes people are doing something clever with the words, but it’s usually just poor English.


Honestly, it rarely happens in English other than in contrived examples used to demonstrate the concept.


Yes, this happens in English too, but to find examples like this you have to go to Wikipedia, or wrack your brain and see if you remember one. In Japanese, almost every other word is like this.

I went to the first link in your comment ( https://en.wikipedia.org/wiki/Garden-path_sentence ), selected the Japanese version of the article, and took the first sentence:

> 袋小路文(ふくろこうじぶん)とは、文法的には正しいけれども、誤読が生じやすい書き出しで始まる文のことである。

As is usual for Japanese, this sentence contains a mix of Chinese(-origin) ("kanji", e.g. 袋 小 路 文 法 的) as well as Japanese phonetic ("kana", e.g. ふくろこうじぶん) characters. Usually, when in a multi-kanji word, kanji are pronounced with (a time-changed version of) Chinese pronunciation. For example, 文法 is "bun-pou", not "fumi-nori" or something else. However, the first character of the article title (fukurokoujibun), 袋, is "fukuro" here despite being in a four-kanji word. Further, 小 is "kou" here, which is nonstandard enough that its dictionary entry does not even list it as a possible pronunciation! [1] Then 路文 are both in Chinese pronunciation (ji-bun), but this does not necessarily make sense because the word is not split in two down the middle, but instead as 袋-小路-文 (bag-lane-sentence, where bag-lane is English cul-de-sac / blind alley). [2]

Now fukurokoujibun is a bit of a specialised word, so it might not be a great example. But in the rest of the sentence, we find 文, which is always pronounced "bun" (sentence) here, even when appearing separately, but could also (though more rarely) have been "fumi" (letter) — nothing but semantic context helps distinguish. Then we have 正しい "tada-shi-i", where 正 could have been "sei" as in 正確 "sei-kaku" (accurate) or "shou" as in 正直 "shou-jiki" (honest), but it isn't, just because しい comes after. Similarly, 生 in 生じやすい is "shou"(-ji-ya-su-i), which is conjugated from the base form 生じる "shou-ji-ru" and could have been "u" (生まれる "u-ma-re-ru") or "sei" (先生 "sen-sei") or "i" (生きる "i-ki-ru") or more (生 is somewhat infamous for having many readings). And I could go on: 書 could be "sho" (文書 "bun-sho") but is "ka" here (書き出し "ka-ki-da-shi", from 書く "ka-ku").

This is a bit like the comments elsewhere here noting that the Chinese word for "sneeze" is a bad example because it happens to have so uncommon characters in it — and then people point to examples like "onomatopoeia" and "diarrhoea" as similar tricky examples in English. I can't comment on Chinese, but existence does not necessarily say much about frequency.

[1]: https://jisho.org/search/%E5%B0%8F%20%23kanji — Kun are the Japanese readings (chiisai, ko, o, sa), and On are the Chinese readings (only "shou" in this case)

[2]: This analysis of 袋小路文 is not completely etymologically honest. By the etymology ( https://en.wiktionary.org/wiki/%E5%B0%8F%E8%B7%AF#Etymology_... ), we see that the "kouji" pronunciation of 小路 is really a corruption of ancient "ko-michi", which is a consistent Japanese-Japanese reading of the two characters. However, because "ji" is also an (uncommon) Chinese reading of 路, if you don't know the etymology of the word, the re-analysis is appropriate in the context of how hard it is to read the written language.


> However, because "ji" is also an (uncommon) Chinese reading of 路,

It's not a Chinese reading at all (as you can tell because it's ... wildly out of place with the actual Chinese-derived readings ろ・る; onyomi are supposed to have semi-regular correspondences with each other and with the readings in Chinese itself). It's really just rendaku of ち, the basic root of the fossilized compound みち (with the still-salient "honorific" prefix み).

But most importantly, you never really see either 袋 or 小路 and expect them to have any other readings; maybe you'd expect しょうろ if you don't know the latter, but unless you're already literate in a Chinese language or are blindly memorizing kanji tables, the other reading of 袋 (たい) probably isn't even salient, because it's one of those kanji that almost always takes its kunyomi even in compounds.

Side note, the line about u-onbin kind of buries the implication that this is a loanword from western Japanese, which is the culprit of several quasi-systematic but unevenly distributed divergences from regular sound changes.


I stand corrected, you clearly know more about this than I do. :) (I'm only an intermediate learner.)

So perhaps my analysis of 袋小路文 wasn't very accurate at all. Yet I hope my point about 正, 生, 書, etc. stands.


It's only, oh, just about the worst writing system since the Hittites or so, yeah.


> "The complex houses married and single soldiers and their families."

Wow.. I had to read that sentence three times before I got it right.


Maybe because I've seen a similar example used before, but I immediately read it correctly the first time. Honestly these sort of 'problems' only ever seem to occur when specifically created to demonstrate this problem and almost never happen in regular writing.


"I saw the rhino live in the zoo"

Might also mean; "Noted native-American zoologist 'I Saw The Rhino' lives at the zoo"


No it couldn't.


Given shenanigans like "thee stallion" as part of a name... sure it could.


Completely irrelevant. It couldn't because "live" and "lives" are different words.


I agree that Chinese/Japanese has it worse, but any language where "Spelling Bee" is a thing cannot be considered phonetic in a conventional sense.


And yet, given the definition and language of origin, most high-level spelling bee participants can make a pretty good guess at spelling a word they may have never seen before.

English is phonetic, it just borrows its pronunciation rules from many differing (and sometimes directly opposed) other languages.


Very true - and every demonstration of “English is hard to spell/pronounce” focuses directly on the exceptions, which exaggerates the problem. One analysis I’ve seen[1] finds that with a single set of rules, 59% of a sample corpus of 5000 English words can be pronounced perfectly from the spelling (of course, there will be regional accent and dialect differences, so that percentage will be a bit different for each one) and up to 85% can be pretty close, with only slight errors.

Then there’s a percentage where they’re just direct borrowings from other languages and you need to have an idea of how that language pronounces words (especially French), so really only 10-15% or so of English words end up being true exceptions.

1. https://www.zompist.com/spell.html


> a single set of rules, 59% of a sample corpus of 5000 English words can be pronounced perfectly from the spelling

To do this you need to know 56(!) rules.

I think this actually demonstrates how complex English pronunciation actually is.


And you still only get 59% of the way to the correct pronunciation.

As a non native speaker of English, and a native speaker of a phonetic language, I strongly object to the notion that it's easy to guess English word pronunciation by just reading it.


And that's another reason why there are so many English speakers who don't know how to read properly. It is so much harder to read compared to more sensible languages like German (and many others).


Those numbers are very bad, given that proper phonemic orthographies can give you a 90+% confidence with far fewer rules.

There's a simple and consistent way to compare languages in this way, too: train a neural net to map spelling to pronunciation on one half of the dictionary, then test it on the other half. The more complicated and less consistent the orthography is, the more mistakes it'll make. People have in fact done this exact experiment, and English scores extremely poorly in it; for spelling, closer to Chinese, in fact, than many other European languages: https://aclanthology.org/2021.sigtyp-1.1/
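As a toy illustration of that train/test methodology (the mini-lexicons below are invented purely for illustration; a real run like the linked paper uses full pronunciation dictionaries and sequence models rather than this naive 1:1 letter-to-sound alignment):

```python
# Toy train/test orthography experiment: learn letter -> sound mappings
# from half of a (tiny, invented) lexicon, then score predictions on the
# held-out half. A consistent orthography survives the split; an
# inconsistent one (same letter, several sounds) does not.
from collections import Counter, defaultdict

def train(lexicon):
    # Majority-vote letter -> sound table (assumes 1:1 alignment).
    votes = defaultdict(Counter)
    for spelling, sounds in lexicon:
        for letter, sound in zip(spelling, sounds):
            votes[letter][sound] += 1
    return {l: c.most_common(1)[0][0] for l, c in votes.items()}

def accuracy(table, lexicon):
    # Fraction of held-out words whose predicted pronunciation is exact.
    hits = sum("".join(table.get(l, "?") for l in spelling) == sounds
               for spelling, sounds in lexicon)
    return hits / len(lexicon)

# Invented "consistent" orthography: each letter always maps to one sound.
consistent = [("pato", "pato"), ("tapa", "tapa"), ("pata", "pata"),
              ("topo", "topo"), ("ota", "ota"), ("tato", "tato")]
# Invented "inconsistent" orthography: 'a' maps to two different sounds.
inconsistent = [("pat", "pæt"), ("tap", "tæp"), ("par", "pɑr"),
                ("tar", "tɑr"), ("rat", "ræt"), ("art", "ɑrt")]

results = {}
for name, lex in [("consistent", consistent), ("inconsistent", inconsistent)]:
    half = len(lex) // 2
    results[name] = accuracy(train(lex[:half]), lex[half:])
    print(name, round(results[name], 2))  # consistent 1.0, inconsistent 0.33
```

The held-out score drops as soon as a letter's sound depends on context the simple table can't see, which is the effect the paper measures at scale.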


Maybe it's the right time to once again quote this poem :

https://jochenenglish.de/misc/dearest_creature.pdf

The joy of English pronunciation

Gerard Nolst Trenité (1870–1946)


Dearest creature in creation

Studying English pronunciation,

I will teach you in my verse

Sounds like corpse, corps, horse and worse.

I will keep you, Susy, busy,

Make your head with heat grow dizzy;

Tear in eye, your dress you’ll tear;

Queer, fair seer, hear my prayer.

Pray, console your loving poet,

Make my coat look new, dear, sew it!

Just compare heart, hear and heard,

Dies and diet, lord and word.

Sword and sward, retain and Britain

(Mind the latter how it’s written).

Made has not the sound of bade,

Say—said, pay—paid, laid but plaid.

Now I surely will not plague you

With such words as vague and ague,

But be careful how you speak,

Say: gush, bush, steak, streak, break, bleak,

Previous, precious, fuchsia, via,

Recipe, pipe, studding-sail, choir;

Woven, oven, how and low,

Script, receipt, shoe, poem, toe.

Say, expecting fraud and trickery:


Daughter, laughter and Terpsichore,

Branch, ranch, measles, topsails, aisles,

Missiles, similes, reviles.

... (7 pages of pain follow) ...

followed by the Oxford and US pronunciations (as they were at the time; they have changed since) in phonetic transcription.


Huge difference is: English is pretty much THE language that you can butcher and still have people perfectly understand (and hopefully politely correct) you. Even other European (stay mad) languages don't hold up to just how flexible English is in this regard.


Well yes, that's (I believe) the reason English actually works as an international language, despite being horrible in so many respects (pronunciation, tons of exceptions, etc etc): It also has so much redundancy that even if you get all the grammar wrong the meaning is still there. "I is strongs". When someone knows a tiny bit of English it's often easier to communicate in English than in that person's language, even if you're studying said language. Unfortunately, kind of, but that's how it is.


Yeah exactly. "Me arms big power" would make me go "Oh yeah you do have mighty biceps my dude".

And to the latter point I got that all the time in Japan, but I think main reasons are: they wanna practice, but even more they wanna practice with a native English speaker bc it's a novel experience for em!


Oh hurrah, I think that link is what I've been looking for for nearly a decade. I ran across it, or something like it, a long time ago and could never find it again. I don't remember all the special syntax; I think the one I found was written more in plain English with more examples (and I don't think the one I found back then mentioned ghoti either), but I can't be sure, it's been so long - maybe it was just that page and I don't remember it. It does have around the same number of rules I remember, though.


This is satire, right? 56 rules to get 59% correct pronunciation on a corpus of 5000 words? And these rules don't even include the base sounds - it doesn't tell you how to actually pronounce "m", or "e". So in fact there are more than 70 rules required to get to a base pronunciation (you need to add at least one rule for each letter).


"ough" has at least 9 different possible pronunciations, how is that phonetic?


>"ough" has at least 9 different possible pronunciations, how is that phonetic?

Does a language stop being phonetic when you have to include other information provided by the rest of the word? I'm not a linguist by any means, but "ough" being pronounced a couple different ways depending how it's used doesn't seem like it'd preclude the language from being considered phonetic in general.


9 is not a couple, unless you're in a very open relationship - which English words might be - but a language stops being phonetic at the point that the mappings between symbols and sounds are no longer clear and reliable. The most phonetic languages have one-to-one mappings with very few exceptions e.g. Japanese, Spanish, Italian, Finnish.

English, on the other hand, has silent letters, inconsistent mappings even within the same word, exceptions, irregularities, and sounds that are represented by multiple letters and spellings.

English is not a phonetic language except in the sense that it does have mappings between sounds and characters, which would make sense if one were to compare it to a wholly written language like Python, but not any human language.


Fruit flies like a banana. English has its own ambiguity, so it isn’t really that different.

I can only write Chinese via an IME these days. For one, I’m left handed so writing characters was always a struggle since stroke order worked against me, but it’s mostly how I only use Chinese anyways.

I told my wife our kid should learn to write via an IME as well and she was just horrified about that, though. None of the teaching material really supports it.


Time flies. I can't; they're too fast.


I've been (very) casually learning Japanese for a couple years, and almost every time I think I find something "weird" that Japanese does, I almost immediately think of a very similar example in English.

The alphabet is a pretty awesome invention (alphabet > kana-style syllabary > kanji-style logography) but English writing is at least as complex as JP writing, just in different dimensions.

JP's phonetics, for example, are dead simple compared to English's, but they do a good job making up for it by having a few thousand Kanji.


> JP's phonetics, for example, are dead simple compared to English's

I'm not so sure about that. Do you know about pitch accent?

https://en.wikipedia.org/wiki/Japanese_pitch_accent


I'm not a native English speaker, so I don't really know why, or if, there's a problem for native English speakers to learn or "get" pitch accent. For speakers of many other European languages Japanese pitch accent is not tricky. You listen, and then you speak. Just as you would listen to English, and repeat it the same way.

Japanese, despite being extremely logical and so beautiful in so many ways, is still hard to learn for me, and of course learning the writing system is not done in the blink of an eye (unlike the Latin-based writing system we use), but pitch accent isn't really the problem here.


Is that any more complicated than English stress, though? And regardless, Japanese has a very small number of phonemes (compared to English) and extremely restricted phonotactics.


Yeah, but I don't expect this to be substantively harder than learning most regional accents (could be wrong), and afaik it's also not critical for legibility.


In English you have to know a word in order to pronounce it.

The “ou” diphthong in “hound” and “double” or “would” is pronounced differently. Or “ieu” in “lieutenant” vs “lieu”. Or “oo” in “poor” vs “root” Or “berry” in “berry” vs “strawberry”

I could go on forever. There’s no other western language I know of that behaves like that.


English is a quasi-phonetic language in that most words can be mostly pronounced how they're written, but in some cases it inherits the pronunciation of the language the word came from. I'd imagine many English speakers would consider this an undesirable quirk, though.

Indeed, there has been a tendency over the centuries, particularly in the US, to move towards writing words how they sound or pronouncing words how they're written. Lieutenant is an interesting example, since in the UK we pronounce that "lef-tenant" traditionally, but the US moved to the (IMO superior) "lieu-tenant". Nowadays, most young people would probably use the US pronunciation.

I do take some slight umbrage at the implication some people seem to be making in this thread that language features can't be criticised, or that one language can't be better than another. I don't see why this would necessarily be true, even with spoken languages. There are a ton of annoying aspects to English that simply aren't issues in other languages, and I think it's fair to criticise other languages for their failings too. This is especially true of writing systems, which are human inventions rather than something we learn intuitively.

Logographic/logo-syllabic orthographies are harder to learn and remain proficient at than alphabets/abjads, for native speakers and second language learners alike. Alphabets are an innovation that improved on ancient orthographies and enabled a wider range of people to be able to communicate as easily by writing as they do by speaking. Besides the issue mentioned in the article, the writing systems in China/Japan are associated with other issues we rarely see here. Even dictionaries are a non-obvious challenge with logographic languages, which has resulted in several competing ways to sort words.


I don't think one can reasonably claim that in English "words are mostly pronounced how they're written". I mean, "i" can stand for /i/, /ɪ/, or /aɪ/, for example (and also for /ə/ if you don't count "ir" as a distinct grapheme). Although vowels at least (mostly) follow some predictable patterns based on syllables - but e.g. it's impossible to say whether "ch" stands for /k/, /tʃ/, /ʃ/, or /x/ without knowing the etymology of the word.


Americans pronounce "lieutenant" closer to the native French pronunciation.


> “ieu” in “lieutenant” vs “lieu”

> “berry” in “berry” vs “strawberry”

Am I misunderstanding the point you are making or is my pronunciation just off? I would pronounce both parts of both examples the same.



Strawberry is often pronounced as "strawbry", so the 'e' becomes silent. And "lieutenant" as "lutenant" (or "leftenant" in Britain).


>Strawberry is often pronounced as Strawbry.

Only in some dialects, not in the standard form.


French can be pretty bad. Not as bad as English for reading, but it's much worse for writing because there are so many spelling options for the same thing.


You are right, but you can read French words without knowing the language, because a written word has a unique correct pronunciation.


You have the right idea on “ou” but your other examples don’t make sense.


>That's not really a thing in English - a word is a word, and the individual letters that it's composed of are almost always pronounced the same way.

https://en.m.wikipedia.org/wiki/Ough_(orthography)


> in English - a word is a word, and the individual letters that it's composed of are almost always pronounced the same way

Are you sure about that?

https://en.wikipedia.org/wiki/Ghoti


Posted up above, here's a collection of English pronunciation rules that English speakers have internalized so well they can't generally explain them: https://www.zompist.com/spell.html

"Ghoti" is mentioned a few times there, but basically "fish" is a nonsensical pronunciation that breaks several rules. There's a reason (well, a few reasons) why if you ask English speakers how to pronounce "ghoti" and they've never seen it before, they'll probably all guess some variation of "go-tee" or "go-tie".


That's such a dumb example because it claims to follow english rules for those letters while ignoring the actual rules. It makes a somewhat humorous joke, but people pretending that it means anything linguistically are either ignorant or intentionally trying to confuse people.


shure!


reads like it would be pronounced with an aspirated -s- not sh.


Not so much in terms of meaning but in terms of pronunciation, sometimes you also need to read ahead in English to know how a certain word is pronounced. For example: "I read a book yesterday." and "I read a book every night." Depending on the context that follows, "read" is pronounced differently. The same thing happens for "present" and "record". Admittedly, these are exceptions to the rule.


When teaching reading and English, learning about context clues is one of the ways students are taught to figure out the meaning of words.


> in English - a word is a word, and the individual letters that it's composed of are almost always pronounced the same way

Some context-dependent examples: "read": /ɹid/ vs. /ɹɛd/; "lead": /lid/ vs. /lɛd/ (plumbum); "desert": /ˈdɛz.ɚt/ vs. /dɪˈzɝt/.



I think you’ll find all of those things are true of English too.


People find value in the tradition of writing. If Japanese were to ditch kanji as Korea did, I think there would be some complaints.


People love to complain about how much work other people can do in order to slightly convenience themselves. And the media loves to air their complaints, because they are snappy, and photogenic, and easy to pitch as "feel-good stories about how much I care for the old ways, unlike those lazy sloppy people over there"... even if "I" also find myself forgetting how to write kanji.

It doesn't matter. It won't be a top-down decision. It'll just be a long, slow progression of people slowly realizing that writing in kanji for this character is annoying, so maybe I'll just write it phonetically, and then that character, and then there will be a year or two where there's a phase change and suddenly it's everywhere, even though nobody decided.

And people will complain and whine and moan about the "beauty" of the kanji disappearing. And even though they have a point, it won't matter because the kanji will still be there as much as they ever were, and all one has to do is go study them... but they won't. Because complaining about how other people should keep doing something hard is easy, but actually doing the hard thing yourself is hard, and the vast, vast majority of the complainers won't actually do anything about it other than complain, but take the easier options themselves, just maybe a year or two later than others.

I have no beef with the people taking the easier option. Life is full of things to spend effort on and we can't give maximum effort to all of them. I am annoyed at people who complain about how other people can do vast, vast quantities of work so they can briefly feel slightly better about themselves in some way.


You'd be surprised how much staying power these things have. The nation and its language are two concepts inherently intertwined. Take the case of Welsh in Wales. It was an almost dead language that no one spoke, but as soon as the Welsh got the ability to self-govern, they enacted laws to mandate all documents and road signs were available in Welsh, required it to be taught in schools, etc. It's very difficult to kill a language in a democratic state because it's a very bad look to oppose laws that "protect the nation's culture". The people who want these things are endlessly pandered to as a result.


Welsh is generally highlighted as the example of a successful language revitalization movement, but it's also one of the rare examples of such movements succeeding. By contrast, you can look at Irish--where the need for a language that wasn't English was seen as absolutely essential to the (successful) revolution and independence movement--and see that language revitalization there is more or less a failure. A century after independence, the number of L1 speakers of Irish has gone down, and I believe the Irish government still conducts most of its business in English (despite English officially being the lesser of the official languages), since so few members of government are sufficiently proficient in Irish.


Kanji is never going away. I struggle to believe anyone who fully mastered kanji would say something like this.

Kanji is not just “harder”. It’s better.


I've studied kanji to some degree. I'm not a "master", but I am aware of the way it resolves a lot of ambiguities in Japanese.

But that does not on its own mean that Japanese couldn't evolve out of Kanji. It is not the case that if Kanji goes away, the entire rest of the language MUST stay static. It in fact would not. It would begin a multi-decade process of adjustment to the new issues.

It has happened before in other contexts, and it will happen again. There's a lot of signs that Chinese is on the verge of such a change (on a decadal time scale), which carries somewhat different baggage, but roughly the same amount of it.

What really throws the wrench into the whole thing is computers, and I don't just mean that they will simply speed up or slow down such a change, but that they could send all of this flying out in an entirely new direction. If we're all wearing augmented reality goggles full time in 20 years, what will happen to ideograms if every ideogram you see comes with floating pronunciation guides, and your goggles can also translate phonetic spellings transparently in real time back into kanji/ideograms? Could languages like English start growing something like ideograms, presumably descended from modern-day emoji, if computers erase the disadvantages of emoji that caused languages to largely go alphabetic thousands of years ago?

What I absolutely do know is this: In 50 years, no language will be the same as it is today. Guessing what the changes will be, especially in a rapidly evolving novel landscape, is really hard. I don't think kanji/ideograms being seriously diminished is off the table.


Why is this if you don't mind me asking? I thought that hiragana could already write all the words. What makes kanji so much better than that?


In addition to the phoneme problem, it's about readability. Yes, really. The first time I saw わたし written as 私 I just about instantly remembered the latter (it is, after all, used constantly in writing). That kanji is much easier and faster to read than the corresponding hiragana, and it was like that from way back when I had just started learning Japanese. I still have a way to go.. learning a language at my age turns out to be quite a bit slower than when I was younger.. but everything is just easier to read, as soon as one's able to read something in kanji instead of hiragana. The latter is hard and slow to read, even though it's such a simple character system to learn.


Nah as someone that learnt it for 3 years, did a 6 month exchange and then stopped after that I totally disagree.

Not only are kanji needlessly complex because of history, there's also extra work like stroke order (another needlessly "important" thing).

Hira/kata is so much easier, but I ended up giving up the language after I both realised that I wouldn't live there and that they're just romanising so much anyways.


This is equivalent of saying you studied engineering for 6 months and turns out arches are useless, you can just get rid of arches in all bridges and nothing bad will happen.


Japanese is very syllable-poor and so there are a colossal number of homonyms and homophones. In speech a lot of these are distinguished by tone and pronunciation, but in writing kanji is the only way to tell them apart. Reading kana-only Japanese is not impossible, but it's a fast path to a headache and leads to huge numbers of ambiguities even in the best case.


This just indicates that kana orthography is not phonemic enough; but there's no reason why it couldn't be improved to cover tones etc.


The issue is not the writing system. Japanese phonetics are extremely simple. There’s nothing you can do about that.


Japanese doesn't have tones, it has pitch accent, and pitch accent applies to words, not phonemes. You would have to invent a system where pitch accent could be indicated for each word. Take the difference between 橋 (bridge) and 箸 (chopsticks): the pitch accent is slightly different, but both are written the same in hiragana, はし. So there would have to be something (a wavy line above the text?) to indicate pitch accent; I'm not sure how that should be done. And then there are the words with little or no pitch accent difference, distinguished only by context. In kanji they're different but would be the same in hiragana, so how do you encode that? Compromises would have to be made. I'm sure people have tried to come up with something, somewhere. Maybe.

But then again.. it's that other problem: Reading when there's kanji is much faster. Even for beginners. If you don't understand a word in kanji then it doesn't work, but as soon as you understand it it's way easier and faster to read.


> You would have to invent a system where pitch accent could be indicated for each word

Really not hard to do. A symbol on the syllable bearing the pitch accent would solve the issue

> And then there are the words with little or no pitch accent difference, only context

What's happened is that effectively a written "shorthand" has emerged that has evolved somewhat separately from how people speak. Losing kanji would mean losing this shorthand, in favor of writing more closely akin to the way people actually speak, but this is how the vast majority of written languages work. Preserving this shorthand seems like thin gruel to justify the complexity of kanji.


Pitch accent is not accent as in English; there isn't a single stressed syllable. If you've ever seen any of those videos about it, you'll see down-up-flat patterns over the whole multi-syllable word: from high to low, from low to high, or low to flat, plus other variations.

I wouldn't compare kanji to shorthand. Shorthand is typically not easy to read, normal writing is easier. Reading written, fully-spelled English is fast. Reading hiragana is slow (and I've been reading hiragana for a long time)- it's slow, and mentally much harder than reading with kanji. The only issue (and that is of course an issue, but tiny compared to Chinese) is that there's a lot to learn before everything can be read fluently. But reading only hiragana is just.. too hard, for any serious amount of text. It's not hiragana per se, it's the language itself with its limited set of phonemes which contributes to the difficulty.


Pitch accent in Japanese is deterministic based on the mora that is "accented". While it's true the effect of this accent "spreads" across the entire word, you only need to mark a single mora to know the effects word-wide.
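The deterministic rule just described can be sketched mechanically. Here is a minimal illustration for Tokyo-dialect pitch; the H/L patterns and the three はし examples follow the standard textbook description, while the function itself is only a toy:

```python
def pitch_pattern(n_moras, accent):
    """Tokyo-dialect pitch per mora, plus a following particle.

    accent: 1-based index of the accented mora, or None for an
    accentless (heiban) word.
    """
    if accent is None:      # heiban: LHHH..., particle stays high
        return "L" + "H" * (n_moras - 1) + "-H"
    if accent == 1:         # atamadaka: HLLL..., particle low
        return "H" + "L" * (n_moras - 1) + "-L"
    # nakadaka/odaka: low start, high through the accented mora, then low
    return "L" + "H" * (accent - 1) + "L" * (n_moras - accent) + "-L"

# The three words all written はし in hiragana:
print(pitch_pattern(2, 1))     # "HL-L"  箸 (chopsticks)
print(pitch_pattern(2, 2))     # "LH-L"  橋 (bridge)
print(pitch_pattern(2, None))  # "LH-H"  端 (edge)
```

Marking just the accented mora (one digit or diacritic per word) is therefore enough to recover the whole contour.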

> Reading hiragana is slow (and I've been reading hiragana for a long time)- it's slow, and mentally much harder than reading with kanji.

What's the ratio of hiragana-only text that you read compared to kanji? And does the hiragana text use spaces between words? My strong suspicion is "low" and "no", respectively. Familiarity breeds comfort with any writing system, and word breaks are a fabulous ergonomic tool for easing reading.


When I started Japanese a long time ago I would read (small) children's books because all I could read was hiragana. With spaces, for the smallest children. And that was all I read and could read. And yet.. as soon as I could read various words with kanji, the reading got easier and faster.


>And yet.. as soon as I could read various words with kanji, the reading got easier and faster.

Could part of that be due to the fact that your vocabulary was also increasing at that time?


No, it wasn't because of vocabulary, which has only very slowly increased over time. The reading difference is instant and very noticeable. I can't read hiragana fast enough (matching speech) to follow subtitles which are all in hiragana, for example, while I can if there's kanji (though only if I can read it, there's still lots I can't read). This can be switched back and forth and tested with sites like Animelon, for example.


> I'm sure people have tried to come up with something, somewhere.

Perhaps related is the abjad used in Arabic and Farsi. Vowels are written with diacritics above or below the main part of the character, which represents a consonant. However, in modern Arabic, the vowels are rarely written and are inferred from context.

The bigger problem for Japanese is the absence of spacing between words. Even if you write everything in hiragana with spacing, it's significantly slower to read than when kanji is present without spacing. The mixing of kana and kanji usually provides a hint as to where word boundaries are, because there are few cases where kana is followed by kanji in the same word (e.g. お and ご), and kana that follow kanji are most often a continuation of the word (okurigana) or a particle. Some words are usually written in kana despite having kanji available, and their presence can sometimes make it more difficult to read because they might look ambiguous with a particle or okurigana, and you have to figure out from context what was intended, which slows down reading slightly.


I can't help but feel these languages are just silly, or at least very badly designed. Maybe in the future, when AI is good enough to translate everything in real time, we will just find a language that is best and teach children that instead. It would save a lot of headaches, and probably also cure dyslexia.


To call a language silly is.. silly. I don't know Chinese. But for a person like me, Japanese is incredible. It's so extremely logical. Exceptions are almost non-existing. Sentences are modular. Etc. I love it, as a person with a programmer's mind. It has very few phonemes and that's one reason it's hard to "fix" the writing system, but that's also one of its good points, for someone learning the language.

As for "translate in real time", that won't happen because from Japanese to English it would mean to translate before the sentence is done, knowing the intention of the speaker before the speaker says anything. For the simple reason that in Japanese the verb comes at the end while in languages like English it's typically the second word. Using an AI wouldn't be any better than when I used to translate for my wife and the other way around. It works but is hardly satisfactory for anything more than occasionally (speak, wait to hear the translation, speak back, ditto).

A Star Trek universal transparent real-time translator will not happen.

As for dyslexia.. I don't see the connection. Dyslexia is a problem of reading and writing, and it exists independent of the language and also of the writing system. (It has sometimes been claimed that Japanese children are less affected by dyslexia than people learning Latin-based languages, and for some time I kind of thought so too, but I have since seen multiple cases of dyslexia related to Japanese as well; it's the exact same problem.)


> it exists independent of the language

Rates of dyslexia are much higher in countries with less phonetic spelling systems. The general conclusion from this is that, while dyslexia may exist at equal rates in countries with phonetic spelling, its effects are diminished to the point where many individuals with it can read unimpaired.

> A Star Trek universal transparent real-time translator will not happen.

I never claimed it would. A delay of a few seconds between speech and translation is acceptable, much the same way actual translators do it.


I would like to see actual research into dyslexia vs spelling systems, because I've tried to find it and I haven't been able to. Instead I see only claims like the above, which, so far, appear to be based on "common sense", which doesn't actually work here. Common sense says that languages with complicated spelling rules (English, French) should affect dyslexics more than straightforward languages like Italian and Finnish, but it doesn't, to any noticeable effect.

As an individual I only have anecdotal "evidence", but for what it's worth: I already mentioned that I've seen dyslexia in Japanese children, but not only that, I've also seen that dyslexic bilingual children have dyslexia both in Japanese and in their European language.

Unless I see real evidence I'll continue to assume that dyslexia is simply under-reported in e.g. Japan. As has been the case for so many other things - nobody speaks of lactose intolerance in Japan, though it obviously exists.


Yes I was interested in this myself so, before posting what I just wrote, I looked into it and went through the sources on a few papers. I ended up at this fairly authoritative-sounding book which made the claim, though I don't remember the source they cited and I can't be bothered to find it again. The claim made was not that dyslexia wasn't present in other languages, but that its effects were reduced in phonetic ones. The same way that someone in a wheelchair still has broken legs, but can benefit greatly from the installation of ramps.


This is not a reason for Japanese people to keep Kanji, but Chinese tourists can read Japanese at about 50% comprehension level just due to Kanji without knowing at all how the words are pronounced in Japanese.


Another commenter pointed out the ambiguity in Japanese phonetics which is very true.

Imo, the biggest efficiency gain from kanji comes from reading. Meaning is grasped instantly because you don’t need to worry about phonetics. Pronunciation follows a general set of rules, such that even when encountering new words you can guess at how they’re pronounced, while grasping meaning at a glance.

To compare it to latin languages, the difference is like going from reading everything out loud to reading silently.


How does pronunciation follow any rules? There are none that I know of: a given kanji can have several readings, completely independent of one another; there is no structure there.

I'd agree with you if you'd said Korean, where the makeup of the character has direct rules for pronouncing it, if you learn the simple rules then you can read any Korean character - this is the middle ground they should drop kanji for, imo


The main radical in a character usually dictates how it’s read. General language familiarity tells you which of the readings to use. That’s accurate most of the time, and when it isn’t there’s furigana on the word.

For example, 青 is read as “sei”, and characters that use it as a radical are either read as “sei” or “jou”, such as in 情熱(jounetsu) or 清潔(seiketsu). So when you run into a rare character in a word that uses this same radical, you can assume that it uses a standard reading. For example, the word for fairy, 精霊, isn’t one you run into very often, but when you do you can assume that it’s read as “seirei” based on the radicals, and you’d be correct.

I’m explaining this in length here but with native level proficiency this process happens instantly, as you’re reading.

Japanese should not drop kanji. The only people that think that are foreigners that failed at learning the language. This is not a shared sentiment among japanese speakers.


Sure, and there were complaints in Korea, too. Lest we forget, Hangul was developed in the 15th century, and was promptly condemned by the educated elites while being enthusiastically adopted by the underclasses. But the elite pushback, going as far as outright bans in some periods, meant that it wouldn't become the standard orthography until the 1900s.

I don't think anyone today would seriously argue that Hanja is preferable, though. In retrospect, it's clear that the benefits of easily accessible universal literacy are too substantial to ignore for the sake of tradition.


> I don't think anyone today would seriously argue that Hanja is preferable

It's necessary to use Hanja today in educated contexts because Hangul has too many homophones, and most educated (technical, literary, scientific) vocabulary has a Sinitic origin and therefore are more homophonic than typical Korean words.


Sure, and lawyers in English-speaking countries similarly use Latin and Old French jargon to reduce ambiguity. But this is a fairly narrow use case that is really more of a specialized notation - it's not used day-to-day even by people who regularly use Hanja professionally.


Hanja still get used in some contexts --- had to memorize ~500 of them when I was studying Korean.


AFAIK (maybe someone can correct or confirm) it is essential for studying law in Korea. To avoid ambiguity with identically sounding words, Chinese characters are used in law.


This is the reason Chinese characters are not going away. They are essential to comprehending written documents, because Chinese (and similar languages) have too many words that sound the same or very similar. So, if they abolish the characters and use something purely phonetic, they'll have to reinvent the whole language to be understandable, especially for anything that is not colloquial.


This is not a problem in other languages. The word "set" in English has 7 different meanings, yet you rarely struggle to tell which is intended. If the language can be understood when spoken, it can be understood when written phonetically.


Other languages are not Chinese. In Chinese a lot of the meaning in the spoken language is conveyed through tones and other conversational cues.


Tones can be written, and all human spoken communication involves conversational cues, Chinese is not special in this respect.


If this was so easy, pinyin (a standardized writing) would have replaced characters decades ago!


But it is that easy. Pinyin has a standard notation for the tones of words. Your position on this matter cannot seriously be "if it were possible to write Chinese phonetically, Chinese characters would no longer exist".
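For what it's worth, pinyin's tone notation is mechanical enough to sketch in a few lines. The placement rule (mark a or e if present, otherwise the o of "ou", otherwise the last vowel) is the standard one; the converter itself is just a toy illustration:

```python
# Index 0 is the bare vowel; indices 1-4 carry tone marks 1-4.
TONES = {"a": "aāáǎà", "e": "eēéěè", "i": "iīíǐì",
         "o": "oōóǒò", "u": "uūúǔù", "ü": "üǖǘǚǜ"}

def mark_tone(syllable):
    """Convert numbered pinyin ('ma3') to diacritic form ('mǎ')."""
    base, tone = syllable[:-1].replace("v", "ü"), int(syllable[-1])
    if tone == 5:                      # neutral tone: no mark
        return base
    for target in ("a", "e"):          # a/e take the mark if present
        if target in base:
            return base.replace(target, TONES[target][tone], 1)
    if "ou" in base:                   # in 'ou', the o takes the mark
        return base.replace("o", TONES["o"][tone], 1)
    for i in range(len(base) - 1, -1, -1):  # otherwise the last vowel
        if base[i] in TONES:
            return base[:i] + TONES[base[i]][tone] + base[i + 1:]
    return base

print(mark_tone("ma1"), mark_tone("ma2"), mark_tone("ma3"), mark_tone("ma4"))
# mā má mǎ mà
```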


Their position is "since it is possible to write Chinese phonetically and yet characters didn't go away, there might be more to the story" (than self-proclaimed language experts on HN think)


This is incorrect and shows a basic lack of reading comprehension. Neither I nor anyone else claimed to be a language expert. The post in question said:

> If this [writing tones] was so easy, pinyin (a standardized writing) would have replaced characters decades ago!

Which is saying that the reason Pinyin has not replaced traditional characters is that it cannot accurately transcribe Chinese speech.


As the previous comment says, you're the one with reading comprehension problems. The topic of discussion is the replacement of Chinese characters with a phonetic writing. I said that pinyin already exists and it has not replaced characters, so this cannot be as easy as you imagine (just writing down the phonetics of the language).


Which is your perspective, and distinct from the argument just made: that if it were possible to write Chinese tones, the traditional characters would have been replaced. It's obvious that the characters have not been replaced due to cultural factors, rather than any inability to come up with a system that can transcribe Chinese speech.


I'd say nationalism is really the answer there.


Can you quantify this? From what I understand, Chinese speakers can understand Pinyin text even without the tone marks.


nd nglsh spkrs cn ndrstnd nglsh wtht wrttn vwls.

Easy to understand for a fluent speaker, but a learner might struggle.

We saw back when we had keypad phones, the youth would write "txt" speak because it was faster to type with 10 digits. I'm pretty sure there was a decline in literacy rate around this time, the youth struggled with spelling because they wrote rarely, but texted frequently. Smartphones fixed that problem, because they provide the full keyboard and auto-correct.

My guess is, if you took the tones out of pinyin, then a generation or two later there would be less literacy. Children would struggle to add the tones even though they know how to speak the word. Writing already contains far less information than speech. Over several more generations, the speech could even change because the written word has lost the tonal information. Compared to the past, we read far more, speak less, and write even less, and most writing has been replaced by typing.


Most importantly, you can always pick a simple, predictable sentence, or one with enough redundancy to "prove" that point. Some everyday simple sentences might work in pinyin even without the accents for tones. Try an excerpt from a patent application and I'm sure even with tones you'll fail.

> mchncl lmt xsts t th wdth f sngl xhst prt, t bt 62% f th br dmtr fr rsnbl pstn rng lf.

> Th rd vlv s smpl bt ffctv frm f chck vlv

That's just from a Wikipedia page I have open from earlier. Already quite a bit harder to decipher.


> the youth struggled with spelling because they wrote rarely, but texted frequently

I wonder if there is any evidence of that other than boomers complaining about it.

>Over several more generations, the speech could even change because the written word has lost the tonal information.

That happens automatically with every language already. It's not like a race to the bottom where suddenly no one knows how to communicate though.

>Compared to the past, we read far more, speak less, and write even less, and most writing has been replaced by typing.

That's not necessarily a bad thing.


>This is not a problem in other languages.

I don't have a dog in this fight one way or the other, but it really is surprising that all these pro-kanji comments seem to ignore the concept of context altogether. It's very circular reasoning being used to try and explain why kanji are necessary.


> they'll have to reinvent the whole language to be understandable

Frankly, the whole language seems like such a mess that maybe they should?


Good luck convincing 1.5 billion people that they need to reinvent a language they have used for thousands of years in order to satisfy somebody else...


In the Chinese style of government, people don't necessarily need to be convinced of something for it to be implemented.


It seems there's room for "legal innovation" there, by providing definitions early on in various texts to disambiguate, and then sticking to them throughout the text!?

I assume it's already done anyway for some terms. Why isn't this more widespread?


Innovation is quite often resisted by those who have mastered the hard way of doing things, though I have no idea whether this is the case here.


I suppose for the same reason that law in English-speaking countries still uses so much Latin?


But that's the opposite of innovation? Basically, instead of describing things in detail drafters opt to use shortcuts, but that's how people end up getting fucked in court by some "technicality".

Innovation would be to just put in the verbiage, precisely define terms, fuck tradition.


The Latin (and Old French) words don't require a complicated arrangement to type them on a keyboard.


For what it's worth, writing Japanese on a normal keyboard is easy, even for me. Fast too. And my wife is super-fast. I have no knowledge of how this is done in Chinese, or Korean for that matter (not to mention other non-Latin languages like Arabic, Thai or Hebrew), but for Japanese it's easy. There are two main ways of doing it, some prefer one, some the other.
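To give a flavor of the more common of the two input methods (romaji input): an IME essentially does greedy longest-match lookup on what you type, converting it to kana before offering kanji candidates. A deliberately tiny sketch with a handful of mappings; a real IME covers the full syllabary, sokuon, candidate ranking, and much more:

```python
# Toy romaji -> hiragana table; real IMEs map the whole syllabary.
ROMAJI = {"wa": "わ", "ta": "た", "shi": "し", "ka": "か",
          "na": "な", "ha": "は", "a": "あ", "i": "い", "n": "ん"}

def to_hiragana(text):
    out, i = "", 0
    while i < len(text):
        for size in (3, 2, 1):          # longest match first
            chunk = text[i:i + size]
            if chunk in ROMAJI:
                out += ROMAJI[chunk]
                i += size
                break
        else:
            out += text[i]              # pass unknown characters through
            i += 1
    return out

print(to_hiragana("watashi"))  # わたし
```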


Yeah, because Japanese without Kanji is at least 2x harder to read.


"Although East Asian people have higher IQ scores on average, we are not superhuman" weird explicit racism as the highest voted comment


Before calling it racist we could ask him for a reference. After all, it might be a truthful fact.

But yes, it stood out to me too, and I'm confused how you're the only person commenting on it.

I wonder if these people justify having a shitty writing mechanism by being smart. "It's so needlessly complex, but we're smart so we can afford it" when in reality if you're smart you want it as simple as possible.

And then he comes on HN and rationalizes the fact that he can't spell. Ironic.


IQ has been shown to be an inaccurate evaluation of intelligence, even for problem solving, so asking for a scientific reference for this is like asking for a scientific reference for how a certain locally dominant ethnicity has better chakras than another one. They don't; it's just racist.


The absolute state of orange reddit


I found the "we are not superhuman" bit annoying and condescending - replacing with different groups:

"While men are stronger than women on average, we are not superhuman" <- To me, this doesn't seem condescending. Not 100% sure why - perhaps because the difference in the trait (between men and women) is much larger?

"While left-leaning people are smarter than right-leaning on average, we're not superhuman" <- This definitely does seem condescending - perhaps because the skill in question is intelligence, and the statement reads as "You're stupider than me, but please strain your brain to understand".

"While rich people have higher IQ scores on average, we're not superhuman" <- I find this a bit less condescending than the previous one, not sure why. But still annoying for the same reason as previous.

"While whites are stronger than east asians on average, we're not superhuman" <- Again condescending and annoying, but I still think the intelligence statements are more grating.

My conclusion - the statement is irritating because it carries with it an implication people would be surprised Asian people can forget things too. Additionally, it gives the intention that because other groups are stupider, it needs spelling out in simple terms that they're not godly intellects.


Maybe it's less of a factor since the standardization of Mandarin, but the difference between kanji and an alphabet like the ones Korean and Vietnamese have moved to is that writing with an alphabet leaves an artifact understood only by speakers of the same language, whereas kanji can carry the same meaning across entirely different spoken words. Cultures can thus communicate through written edicts without totally erasing linguistic differences through standardization. So you're right that the individual language/culture doesn't suffer from alphabetization or pinyinification, but I would submit there is a change at the level of multicultural interaction, decreasing the mutual intelligibility between cultures, for better or worse.


> Although East Asian people have higher IQ scores on average

Citation needed.


Maybe parent was referring to the studies by Lynn, a self-declared "scientific racist".

I think that despite lower IQ scores on average, South Korea has been consistently beating Japan at go in recent years, and more importantly they got rid of hanja (the Korean version of kanji) from their writing system.


[flagged]


Ignore the SPLC's own editorializing --- I don't take SPLC's analyses all that seriously either --- and just read the quotes on this page.

https://www.splcenter.org/fighting-hate/extremist-files/indi...

Nobody here is defaming Lynn. You can disagree with the appellation (though I find a number of fact-checked publications claiming that he does describe himself that way), but I don't think it's reasonable to call the argument libelous.


Calling someone a racist is one of the most defamatory statements one can make. People get cancelled for less. Instead of name calling, people should focus on the empirical question (is there a positive correlation between East Asian ancestry and IQ?) not on trying to undermine the reputation of someone.


While I don't agree with any of that, at all, we don't even reach the question, because the point of my comment is that he calls himself that. The SPLC has him in primary source quotes.

I don't think you're going to have an easy time of un-cancelling Richard Lynn. As it stands, your argument comes across more as trying to launder his most inflammatory claims back into the conversation. I'm not interested in debating phrenology, only the more specific question of what terms are and are not reasonably to apply to this person. That's a question we can actually answer empirically with sources available to us.


> While I don't agree with any of that, at all, we don't even reach the question, because the point of my comment is that he calls himself that.

Apparently he does not. You haven't produced a quote to show that he does.

> I'm not interested in debating phrenology

Correlations between ancestry and IQ have nothing to do with phrenology. If they exist, they are not racist. Facts just are what they are.


For whatever it's worth, that was literally what David Duke said. I'm not saying you're David Duke, just that the rhetoric you're deploying isn't persuasive.


And it's a fact that Lynn is a self-proclaimed "scientific racist", and that most "race scientists" are racist.


> Apparently he does not. You haven't produced a quote to show that he does.

That's not how it works.


You accuse someone of saying X, but can't provide evidence they ever said X, then there is no reason to believe you. The burden of proof is on your side.


"Apparently he does not" is a positive claim ... it does not follow even if you were right about not providing evidence--but you're not. And they can and did provide evidence, you simply ignored it. There are plenty reasons for honest people to believe it. Meanwhile, it's an empirical fact that Lynn is a racist, as well as most "race scientists" and their defenders. I won't comment on this further.


> And they can and did provide evidence, you simply ignored it.

That's simply a lie. They didn't. Nowhere on that website was Lynn quoted as describing himself as a "scientific racist".

> Meanwhile, it's an empirical fact that Lynn is a racist

What would be the evidence for this claim? There could in any case be an association between IQ and East Asian ancestry. Whether there is, is an empirical question, and an empirical hypothesis can't be racist, it is just true or false, or supported/unsupported by the evidence. In the source I provided, Lynn references statistical evidence that supports it.


Calling someone a racist is actually nearly never defamatory, since it’s almost always an opinion based either on disclosed facts or an opinion based on nothing.


Of course it's defamatory, as many cases of cancellation show.


He wasn't called a racist here ... the claim was that he labels himself as one.

OTOH there are those who protest too much that "it's just science".


Do you expect him to tattoo the word "racist" on his forehead (in kanji)?

That you even dare to quote this pseudo scientific crook is just mind boggling.

Relevant quote:

What is called for here is not genocide, the killing off of the populations of incompetent cultures. But we do need to think realistically in terms of "phasing out" of such peoples.

Calling his output research is abhorrent.


What is the source of this quote?


The SPLC link above that you didn't read. None of the (sourced) quotes it gives from Lynn are scientific.


Here is one citation: https://www.sciencedirect.com/science/article/abs/pii/S10416...

But also in most other IQ tests, Ashkenazi Jewish and East Asian people tend to score the highest.


a) Bad science. b) Unsupported claim. c) Who cares, and why?

As an Ashkenazi Jew with an IQ 3 SD above the mean myself, I focus my attention on that question and have a good grasp of the answer. I also have particular insight into why some Jews have high scores, and how the people who care so much about the average IQs of various populations draw all the wrong conclusions from them because of their ideology. (I would also note that many of those who care so much have lower IQs than a very large fraction of the populations they disparage.)


"While the author seems to interpret this problem as something crucial"

Does he? Read his last paragraph.


My wife is college educated and native Korean, so these are just my observations of her and her friend group's engagement with Chinese characters (Hanja).

Hanja, in daily life, has largely disappeared from colloquial Korean for those under 40 or so. It's still preserved in some formal settings like medicine and law, and is used to appeal to older generations. I've been with my wife long enough to remember when Hanja was still very common to see on newspapers.

There are some small vestigial problems with eliminating it from daily life: the large number of monosyllabic Chinese-origin loan words in modern Korean can sometimes create ambiguity when written in Hangul. Native Korean speakers will sometimes disambiguate these words by referring to the Hanja, but that's largely disappearing as a habit as well.

Younger Korean generations still learn it in K-12, but it's mostly wasted class time in an already overly crammed education. The kids who focus on it are really geared towards becoming lawyers, and certain kinds of doctors (mostly traditional medicine). STEM focused kids will focus on English instead. As a result there's an active linguistic process occurring where English loan words are slowly replacing Chinese-origin words and concepts in active and modern Korean.

I don't know too much about Japanese, but I do have a sense from native speakers that writing the same words in the four major writing systems offers some sense of nuance to how close a reader might be to a concept, or how they might consider it in various ways. From visits there, I did notice the expectation that native speakers could seamlessly read and jump between the systems, often within the same sentence. But I also understand that the pronunciation of Kanji is somewhat nonstandard, and it's not immediately clear how to say something written purely in Kanji (sometimes this is supported by providing explanatory superscripts in another system next to the Kanji). Why persist with this? I suppose it's the nuance that's being conveyed, and this nuance is still prized among native Japanese speakers.

I do get the sense that China has no particular plans on moving away from the system, as it's a unifying source of national identity (and has been for centuries). And they really have very few other options. The main problem is that China is a highly linguistically diverse country, and Chinese offers the ability to transmit ideas instead of sounds which allows speakers of non-mutually-intelligible "dialects" to communicate. Moving to a Latinate system or even to Zhuyin Fuhao (Bopomofo) encodes sounds, not ideas, and risks fracturing the state. It would only become possible if there was a concerted effort, maybe over a couple generations, to Mandarinize and discourage the use of local dialects, but that would also be highly disruptive. Koreans, Japanese (and other adjacent non-Sino languages like Vietnamese, etc.) escaped this either through a higher level of linguistic uniformity, or strong efforts to standardize or teach a national dialect that the writing system (Hangul, Chữ Quốc ngữ, Hiragana, etc.) could amplify.


While I acknowledge that JSON Patch can be useful in certain contexts, I find that there are far better alternatives, particularly for the scenarios I encounter. Specifically, when using kustomization.yaml to generate slightly different Kubernetes manifests for various environments (dev/staging/production), tools like jsonnet offer superior functionality and flexibility.
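For comparison, here is what the JSON Patch side of that trade-off looks like: a minimal toy applier in Python, supporting only the `add` and `replace` operations of RFC 6902 (a real setup would use kustomize itself or an off-the-shelf library such as `jsonpatch`). The manifest and image names are made up for illustration:

```python
def apply_patch(doc, ops):
    """Toy JSON Patch (RFC 6902) applier: 'add' and 'replace' only."""
    for op in ops:
        # Split the JSON Pointer and undo the ~1 / ~0 escapes.
        parts = [p.replace("~1", "/").replace("~0", "~")
                 for p in op["path"].lstrip("/").split("/")]
        *parents, key = parts
        target = doc
        for p in parents:
            target = target[int(p)] if isinstance(target, list) else target[p]
        if isinstance(target, list):
            idx = len(target) if key == "-" else int(key)
            if op["op"] == "add":
                target.insert(idx, op["value"])
            else:
                target[idx] = op["value"]
        else:
            target[key] = op["value"]
    return doc

# Hypothetical dev manifest patched for production.
manifest = {"spec": {"replicas": 1, "template": {"spec": {"containers": [
    {"name": "app", "image": "app:dev"}]}}}}
patched = apply_patch(manifest, [
    {"op": "replace", "path": "/spec/replicas", "value": 3},
    {"op": "replace",
     "path": "/spec/template/spec/containers/0/image", "value": "app:1.2.3"},
])
print(patched["spec"]["replicas"])  # 3
```

The positional list paths ("/spec/template/spec/containers/0/image") are exactly what templating languages like jsonnet let you avoid, since they address fields by name rather than by index.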

