It's also that whenever an hour is poured into Haskell, Scheme, or OCaml, there are hundreds of hours poured into JavaScript, Python, C, Java, ...
Eventually that lets you optimize things more at the compiler level, because you also have the extra time to maintain those optimizations.
My primary argument is that a more restrictive language, if designed correctly, has programs which are resistant to bit-rot over time. You can take a program from 15 years ago and compile it. It'll run faster due to compiler optimizations and hardware improvements. And the program will still be the "right thing to do."
In contrast, if you take C code that's older, you'd shudder and immediately begin identifying things left and right which need to be redone, because the modern machine is different from the aged machine. Part of the efficiency comes from the fact that we write the C code so it has affordances for efficient execution on current hardware. But when the hardware changes, the code needs amending to keep up. Meanwhile, your SQL statement is the same because it's written declaratively.
The definition has to do with certain classes of spatial and temporal memory errors. I.e., the ability to access memory outside the bounds of an array would be an example of a spatial memory error. Use-after-free would be an example of a temporal one.
The violation occurs if the program keeps running after having violated a memory safety property. If the program terminates, then it can still be memory safe in the definition.
Segfaults have nothing to do with these properties. There are some languages or contexts in which segfaults are part of the discussion, but in general the theory doesn't care about segfaults.
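To make the distinction concrete, here is a minimal Go sketch (my own, with made-up values): the out-of-bounds index is a spatial error in the making, but the bounds check turns it into a panic that stops the program before any unsafe access happens, which still counts as memory safe under the definition above.

```go
package main

import "fmt"

func main() {
	xs := []int{1, 2, 3}
	i := 7 // pretend this index came from untrusted input
	// The bounds check fires here: the program panics and terminates
	// instead of reading past the end of the slice.
	fmt.Println(xs[i])
}
```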
> The violation occurs if the program keeps running after having violated a memory safety property. If the program terminates, then it can still be memory safe in the definition.
I don't know what you're trying to say here. C would also be memory-safe if the program simply stopped after violating memory safety, but it doesn't necessarily do that, so it's not memory safe. And neither is Go.
The point is that a segfault is not an indication of memory unsafety. It is the opposite: the OS stops some unsafe access. The problem with C implementations is that the segfault often comes too late and does not stop a prior unsafe read or write. But this is also an implementation property; you can implement C in a memory-safe way, as many have shown. Rust has, unfortunately, changed the narrative so that people now believe memory safety is a property of the language, when it is one of the implementation. (There are, of course, language properties that make it harder to implement C in a memory safe way without sacrificing performance and/or breaking ABI.)
(EDIT: removed the first part since I realized you were replying to some comment further up, not my example.)
> Rust has, unfortunately, changed the narrative so that people now believe memory safety is a property of the language, when it is one of the implementation.
I am not sure I agree with that (the concept of memory-safe languages looong predates Rust), but you can just define a memory-safe language as one where all conforming implementations are memory-safe -- making it a feature of the language itself, not just a feature of a particular implementation.
The segfault seen here is not a property of the language implementation, it's just a consequence of the address chosen by the attacker: 42. If you replicated this code in C you would get the same result, and if you used an address pointing to mapped memory in Go then the program would continue executing like in similar exploits in C.
The only reason this isn't a more critical issue is because data races are hard to exploit and there aren't a lot of concurrent Go programs/system libraries that accept a lot of attacker-controlled input.
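For anyone who hasn't seen the mechanics, here is a rough sketch of the kind of race being described, with invented type names and no guarantee of crashing on any particular run. A Go interface value is two words (type pointer and data pointer), and an unsynchronized assignment isn't atomic, so a reader can observe one type's method table paired with the other type's data word and end up dereferencing the integer 42 as though it were a pointer.

```go
package main

// An interface value is two words: a type/method-table pointer and a data
// pointer. The unsynchronized writes below are not atomic, so the reader
// can observe a torn value: the method table of withPointer paired with
// the data word of plainInt, whose first word is the integer 42.
type item interface{ deref() int }

type plainInt struct{ n int } // first word holds the plain number 42

func (p *plainInt) deref() int { return p.n }

type withPointer struct{ p *int } // first word holds a real pointer

func (w *withPointer) deref() int { return *w.p }

func main() {
	n := 7
	var shared item = &plainInt{n: 42}

	go func() { // writer 1: keeps storing the integer-carrying variant
		for {
			shared = &plainInt{n: 42}
		}
	}()
	go func() { // writer 2: keeps storing the pointer-carrying variant
		for {
			shared = &withPointer{p: &n}
		}
	}()
	for { // reader: a torn read can end up dereferencing address 42
		_ = shared.deref()
	}
}
```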
Whether or not you get a segfault when you access an out-of-bounds address is part of the language implementation. An implementation that guarantees a segfault for out-of-bounds accesses is memory safe.
You can't really guarantee that all out-of-bounds accesses will segfault, because memory protection mechanisms are not that granular. (And actual memory segmentation, that did have the required granularity, has fallen out of use - though CHERI is an attempt to revive it.) That's why a segfault is treated as something to be avoided altogether, not as a reliable error mechanism.
What you can say though (and the point I made upthread) is that if a language manages to provably never segfault, then it must have some sort of true language-enforced safety because the difference between segfaulting or not is really just a matter of granularity.
You are using a narrower definition than me. The language implementation builds on the functionality of a larger system. An implementation can utilize the functionality of the overall system and close the loopholes. For example, using a sanitizer you can turn out-of-bounds accesses to arrays into traps. That raises SIGILL rather than a segmentation fault, but it builds on the same trapping mechanism to achieve bounds safety (if you limit yourself to arrays).
Both spatial and temporal memory unsafety can lead to segfaults, because that's how memory protection is intended to work in the first place. I don't believe it's feasible to write a language that manages to provably never trip a memory protection fault in your typical real-world system, yet still fails to be memory safe, at least in some loose sense. For example, such a language could never be made to execute arbitrary code, because arbitrary code can just trip a segfault. You'd be left with the sort of type confusion logical error that happens all the time anyway in all sorts of "weakly typed" languages - that's not what "memory safety" is about.
The core of a properly built, resilient/robust system is that you have compartmentalized code into different small Erlang processes. They work together to solve a problem. A bug in one is isolated to that particular process and can't take the whole system down. Rather, the rest of the system detects the problem, then restarts the faulty process.
The reason this is a sound strategy is that in larger systems, there will be bugs. And some of those bugs will have to do with concurrency. This means a retry is very likely to succeed if the bug only occurs relatively rarely. In a sense, it's the observation that it is easier to detect a concurrency bug than it is to fix it. Any larger system is safe because there's this onion-layered protection approach in place, so a single error won't always become fatal to your system.
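Erlang/OTP expresses this natively with supervisors and isolated processes. As a rough analogue only (the names below are invented, Go goroutines share a heap, and an unrecovered panic elsewhere still kills the whole program), here is a sketch of the detect-contain-restart loop for a single worker:

```go
package main

import (
	"fmt"
	"time"
)

// runIsolated contains a panic to this one worker and reports it as an error.
func runIsolated(work func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic: %v", r)
		}
	}()
	return work()
}

// supervise restarts the worker whenever it fails, with a small backoff.
func supervise(name string, work func() error) {
	go func() {
		for {
			err := runIsolated(work)
			if err == nil {
				return // worker finished normally
			}
			fmt.Println(name, "crashed:", err, "- restarting")
			time.Sleep(100 * time.Millisecond)
		}
	}()
}

func main() {
	supervise("flaky-worker", func() error {
		// stand-in for a rare, timing-dependent bug: a retry often succeeds
		panic("intermittent concurrency bug")
	})
	time.Sleep(time.Second) // let the supervisor run a few restart cycles
}
```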
It's not really about types. It's about concurrency and also distribution. Type systems help eradicate bugs, but it's a different class of bugs those systems tend to be great at mitigating.
However, if you do ship a bug to a customer, it's often the case that you don't have to fix said bug right away, because it doesn't bring down the rest of the application, so no other customer is affected. In many cases you can wait until the weekend is over, then triage the worst bugs top-down when you have time to do so.
A good way to gauge whether "asynchrony" is a term we need is to test whether it is useful in contexts beyond a single language or a single concurrency design.
If it's needed to reason correctly in a wide set of concurrency models, then I'd say it's going to be a useful addition. If not, then I'd say it's not really worth using in the grander scheme of things.
I.e., does this make any sense in Haskell, Erlang, OCaml, Scheme, Rust, Go, ...? (assuming we pick one of the many concurrency models available in Haskell, Rust, and OCaml)
More generally: if things are cooperatively scheduled, then you need to pay attention to additional details, because it's much easier for a bad piece of code to affect the system as a whole by locking it up or generating latency problems. In a preemptively scheduled world, a large group of problems disappears instantly, since you can't lock up the system in the same way.
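A small illustration of the preemptive case, assuming a Go toolchain with asynchronous preemption (1.14 or later): the spin loop below never yields voluntarily, yet it cannot lock up the single scheduler thread we confine it to.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

var counter int // package-level so the busy loop has something to touch

// spin never yields voluntarily: no channel operations, no blocking calls.
// Under purely cooperative scheduling it could monopolize the one OS thread
// allowed below; Go's asynchronous preemption interrupts it so the loop in
// main still gets scheduled.
func spin() {
	for {
		counter++
	}
}

func main() {
	runtime.GOMAXPROCS(1) // confine everything to a single scheduler thread
	go spin()
	for i := 0; i < 3; i++ {
		time.Sleep(100 * time.Millisecond)
		fmt.Println("still responsive:", i)
	}
}
```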
You can make a programming language where cycles are impossible. Erlang is a prime example.
Region inference is another strategy in this space. It can limit the need for full-blown garbage collection in many cases, but also comes with its own set of added trade-offs.
Reference counting is just a different kind of garbage collection, really. It acts like a dual construction to a tracing GC in many cases. If you start optimizing both, you tend to converge on the same ideas over time. Refcounting isn't free of latency problems either: if I have a long linked list and snip the last pointer, then we have to collect all of that list. That's going to take O(n) time in the size of the list. For that reason, you'd have to delay collecting the large list rather than doing it right away, which means you are converging toward a tracing GC that can work concurrently with the mutator. See e.g. Go's garbage collector.
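A hand-rolled sketch of that cascade (the node type and release function are invented; Go itself uses a tracing GC): dropping the last reference to the head frees the head, which drops the last reference to the next node, and so on through the whole list.

```go
package main

import "fmt"

// node and release are invented for the sketch; Go itself is traced.
type node struct {
	refs int
	next *node
}

// release drops one reference. When a node hits zero it dies, which drops
// the reference it held on its successor, and so on: snipping the last
// pointer to a long list walks and reclaims the entire list right there.
func release(n *node) {
	for n != nil {
		n.refs--
		if n.refs > 0 {
			return
		}
		next := n.next
		n.next = nil
		n = next
	}
}

func main() {
	var head *node
	for i := 0; i < 1_000_000; i++ {
		head = &node{refs: 1, next: head}
	}
	release(head) // the whole million-node walk is paid for right here
	fmt.Println("list released")
}
```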
> latency problems either: if I have a long linked list and snip the last pointer, then we have to collect all of that list. That's going to take O(n) time in the size of the list. For that reason, you'd have to delay collecting the large list rather than doing it right away
These latency issues are inherent to deterministic destruction, which is an often desirable feature otherwise; they have little to do with reference counting itself. In principle, they can be addressed by "parking" objects for which delayed disposal is non-problematic onto a separate, lower-priority task.
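One hedged sketch of that "parking" idea, with invented names (disposals, dispose): teardown work is queued to a background goroutine instead of running on the latency-critical path, falling back to inline disposal if the queue is full.

```go
package main

import (
	"fmt"
	"time"
)

// disposals and dispose are invented names for the sketch: cleanup work is
// handed to a background goroutine instead of running on the hot path.
var disposals = make(chan func(), 1024)

func init() {
	go func() {
		for cleanup := range disposals {
			cleanup() // lower-priority teardown, off the critical path
		}
	}()
}

func dispose(cleanup func()) {
	select {
	case disposals <- cleanup:
	default:
		cleanup() // queue full: degrade to inline disposal
	}
}

func main() {
	big := make([]byte, 64<<20) // stand-in for an expensive-to-tear-down object
	dispose(func() {
		fmt.Printf("releasing %d bytes off the hot path\n", len(big))
		big = nil // drop the reference; real teardown work would go here
	})
	time.Sleep(100 * time.Millisecond) // give the background goroutine time to run
}
```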
> It acts like a dual construction to a tracing GC in many cases
Yeah, one of the most helpful realizations I've read is that tracing and ref counting are essentially two formulations of the same problem: one finds the objects that are alive (by tracing), and the other finds the ones that are dead (i.e. their ref counts reach zero). And of course, every object is either dead or alive!
It's a useful realization, but the follow-on (unfortunately rather popular) claim that this inverse relation makes them the same thing is clearly wrong. They exhibit entirely different performance characteristics in the places where it matters.
It's not very novel. There are far better ways of solving this than allowing a random string to be embedded as aux information on a struct field. Examples: F# type providers, or OCaml's PPX system for extending the language in a well-defined way. Macro rewriting systems also allow for better safety in this area.
This allows you to derive a safe parser from the structural data, and you can make said parser be really strict. See e.g., Wuffs or Langsec for examples of approaches here.
I'm not disagreeing that there are better ways to solve this, given how other languages have implemented theirs. But considering the constraints the Go team had at the time they designed this, it allowed them to implement marshaling fairly easily and leaves it open for extensions by the community.
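For context, the mechanism in question is Go's struct tags: uninterpreted string literals attached to fields, which packages like encoding/json read back via reflection. A minimal example of both the convenience and the safety gap (the compiler never checks the tag's contents):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Struct tags are plain string literals that encoding/json inspects via
// reflection at runtime. They're easy to use, but a misspelled key or
// option compiles fine and fails silently.
type user struct {
	Name  string `json:"name"`
	Email string `json:"email,omitempty"`
}

func main() {
	out, err := json.Marshal(user{Name: "ada"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // {"name":"ada"}
}
```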