I made extensive use of Sage during a recent research project. The most useful thing about Sage, for me, is that it cleanly integrates many mathematical libraries, making it easy to pass results from one tool to another. The visualization tools are a bonus, and they made debugging quite easy. In my case, Sage neatly connected graph isomorphism tools with group theory libraries, saving me a tremendous amount of work.
The main downside for me is the performance overhead of using Sage. It is great for quickly prototyping ideas, but I found that some components were not designed with performance in mind. For instance, I used the Sage interface to GAP for much of my group theory computation, and I found that this interface was incomplete (it does not natively support the product replacement algorithm, for example) and quite leaky (I needed to manually trigger garbage collection to keep from running out of memory).
Thanks for this, the reviews for this paper were very interesting.
I do wish that they provided the meta-review, however, which summarizes the ultimate decision. Meta-reviews are written by area chairs, who are the final arbiters of whether or not the paper is accepted. Seeing their perspective on this paper would be very interesting.
I would like to point out before this discussion moves too far along that UCLA does in fact have an introductory programming course which is taught using Python: http://web.cs.ucla.edu/classes/enroll/97/
Students who are unfamiliar with programming are encouraged to take this course before the C++ course.
piemonkey said that they're encouraged to take the Python one first, meaning they still have to take the C++ course as well. The Python one is optional and intended for those who would have trouble jumping straight into C++.
Can someone with more knowledge than me contextualize this result? It appears to be making progress on the Riemann hypothesis, but why is this particular effort getting attention on HN?
While I agree that there are a lot of structural problems with universities and the way funding is allocated, I disagree with the premise that taxing graduate students is the correct place to start. Granted, as a current PhD student at a University of California campus, I have some skin in this game, which colors my viewpoint, so take what I say with a grain of salt.
In particular, state schools have a lot less flexibility in terms of how we allocate funding to graduate students; our pay rates are set by California state law, and the salary of every single employee of the university is subject to public scrutiny [0]. We cannot simply dip into our endowments to correspondingly raise wages, since the amount which can be withdrawn each year is strictly capped as a portion of the total, to ensure the long-term growth of the fund (see [1], "Endowment Spending Policy").
It is true that colleges and universities have grown tremendously in size and scope. This is a popular talking point (often repeated without data), but consider this: a university has become a mini-society in a lot of ways. Universities offer an unbelievable number of services and drive their own mini-economies. They are often among the largest employers in their area, even in thriving metropolitan centers [2]. They provide health insurance, mental health care, their own police forces, job-searching centers, support groups, etc. As our broader American society scales back its social services, universities are increasingly on the hook to provide these services to students.
I agree that universities have grown too large, and I resent the extremely large, unnecessary salaries (for sports coaches, for example), but I also acknowledge that such waste exists in any sufficiently large and complex system, and limiting its effects is a constant and active process. Condemning the whole system on this account seems like cutting off one's nose to spite the face.
Rust macros are extremely impressive. This reminds me of Will Crichton's Lia library[0], which embeds a convenient matrix manipulation domain specific language inside of Rust. This seems like a very promising direction for fusing the low-level performance of Rust with the high-level readability and usability of a managed language. Exciting times.
One amusing thing about this epic tale is that Serious Industrial OCaml User disregarded direct, relatively easy-to-implement, very sound advice from Xavier Leroy about how to debug their system! I would like to think that, were I in a similar situation being advised by an expert of that calibre, I would at least humor his suggestions.
Why seek out the expert if not for his advice? It brings to mind people disregarding doctors who give them inconvenient medical advice.
Not saying this is what happened in this circumstance, but even an expert has to explain the 'why' of things. "Disable HT and try again" comes with a hint of "please go away", whereas "disable HT so we can eliminate a potential cause" smells more like "I really want to understand this thing."
I know nothing of OCaml culture or why the author is deemed worthy of having his name italicized, but the doctor comparison is upsetting. If the guy who wrote this:
I was tired of this problem, didn't know how to report those things (Intel doesn't have a public issue tracker like the rest of us), and suspected it was a problem with the specific machines at SIOU (e.g. a batch of flaky chips that got put in the wrong speed bin by accident).
were a doctor, he'd be guilty of malpractice. This bug went unreported eight months longer than it needed to. Am I misreading all this somehow?
His name is italicized because he is the primary author of OCaml (and a plethora of other great tools, like CompCert, the first fully-verified compiler). Overall, an exceedingly competent and productive programmer and scientist.
The doctor metaphor isn't perfect; what I was going for is: when you seek out an expert's advice and then ignore it, why did you go to see the expert in the first place?
Thanks, that makes sense. I don't disagree about taking expert advice! I was just really disappointed to read the part of the article where the author doesn't bother to figure out how to navigate Intel support and report the bug.
Bug reports are gifts to vendors from inconvenienced users, who have just done free work on the vendor's behalf. There is absolutely no obligation on the part of a user to report bugs.
I am shocked by this attitude. Maybe I shouldn't be, and I'll certainly give this all some thought.
There's at least one obvious error in your statement: if an inconvenienced user's bug report results in less downtime for other users, it is a "gift" to other users, as well as a "gift" to the vendor.
But it says something about our profession if we regard putting flags down to mark the landmines we find a mere courtesy (a gift!) instead of an obligation. I guess that's a debate for a different time and place.
Well, guess what: you're as entitled to your opinion as I am to mine.
But just as a user has no obligation to 'mark the landmines' for vendors, they also have no such obligation towards other users. They do have a right to receive bug-free software in the first place; alas, our industry is utterly incapable of delivering it, which has lowered our expectations to the point where you feel that we, as users, have an actual obligation to become part of the debugging process.
That is not going to make our lives better.
What will make our lives better is if software producers accept liability for the crap they put out, and if they are unable to opt out of such liability through their software licenses and other legal trickery.
You're just a small step away from making it an obligation rather than an optional thing for users to report bugs; the only difference is that for you the obligation is a moral one rather than a legal one. I really do not subscribe to that. When I pay for something, I expect it to work, and I expect the vendor (and definitely not the other users) to work as hard as they can to find and fix bugs before the users do.
But we're 'moving fast and breaking shit' in the name of progress, and part of that appears to extend to being in perpetual beta-test mode. That's not how software should be built, and I refuse to subscribe to this new world order where the end user is also the guinea pig.
Keep in mind that users have their own work to do, are not on the payroll of the vendors, have usually forked over cold hard cash to be able to use the code (OK, not in the case of open source), and tend to be less knowledgeable about this stuff than the vendors. They really should not have a role in this, other than that they may, at their option, upgrade their software from time to time when told very explicitly what the changes are (and hopefully without pulling in a boatload of things that are good for the vendor but not for them).
> You're just a small step away from making it an obligation rather than an optional thing for users to report bugs; the only difference is that for you the obligation is a moral one rather than a legal one.
I'd argue that a person does have that obligation in some circumstances, yes. And yes, I am thinking in moral rather than legal terms. The legal picture is pretty far outside my expertise, and the professional ethics of software engineering (which would in turn inform the legal picture) seems to be woefully opt-in. As you say, 'moving fast and breaking shit,' perpetual beta test mode, etc. So I'd put the legal stuff aside for now.
For me, the key is that "user" is a deceptive term here. A mere user cannot point to a small piece inside a much larger machine and say "that will blow up occasionally, and I know exactly when." We are talking about engineers. Or at least, I was thinking of the professional obligations of engineers - on the user side of the fence and the vendor side of the fence - and that was informing my comments.
> Keep in mind that users have their own work to do, are not on the payroll of the vendors, have usually forked over cold hard cash to be able to use the code (OK, not in the case of open source), and tend to be less knowledgeable about this stuff than the vendors.
Yeah, and I don't think I disagree with you in the "user" case. I really think a software engineer finding a CPU bug is a different case. It seems to me that if we're in possession of knowledge of something as serious and wide-reaching as a CPU bug, we have a reproducible test case, and we don't do anything with it (I mean, at least a tweet or something, for the love of God), we are part of the problem with our profession.
I'm on board with a reporting duty if such a thing will always result in:
(1) a payment from the vendor to the reporter compensating them for time and effort spent at getting the bug to be reproduced
and, crucially,
(2) a requirement for all vendors of software and hardware to respond to bug reports in a timely fashion and to have a standardized reporting process.
In that case I can see how such a shared responsibility would work. But as it is, the companies get the benefits and the users get the hardship, with a good portion of reported bugs (sometimes including a solution) going unfixed; that's not a fair situation.
Case in point: I've reported quite a few bugs to vendors over the years, but I've stopped doing it because in general vendors simply don't care; most of the time bug reports seem to result in a 'won't fix' or 'here is a paid upgrade for you, with your fix in it'.
The security guys seem to be converging on a way of managing these - the compensation of the person reporting the bug and the factors motivating vendors to respond to the bug report in a timely fashion or suffer consequences - with bug bounties. Intel should make it easy to report this stuff, but if everyone understood that finding something genuinely interesting resulted in a serious payday, nobody would skip making the call.
The difference between something like Google's bug bounty (which tops out somewhere over $30k, I think) and a hypothetical bounty for Intel is, well, Intel has a lot more at stake. It's honestly strange that they don't have something in place already. Something like Skylake costs on the order of billions to get out there. It's cool that this Skylake bug was fixable via microcode, but the Pentium FPU bug back in the day cost them half a billion dollars. If such a bug exists, that is the kind of thing Intel should want reported as soon as humanly possible. Even the reputational damage from something milder, like the Skylake bug, would justify a bounty system with very serious payouts.
I genuinely wonder why you think this way. I do not expect my users to report failures. If they do, I am more than pleased (though I'm not sure I would call it a "gift"), because it shows that my product is good enough that they won't just leave out of frustration.
As technologists, we develop tools and services to capture bugs (both server-side and client-side) so that we gain more insight into how our product operates. A user who takes the time to give a well-crafted bug report is rare. Most of the time it tends to be a legitimate gripe that a feature isn't working.
Update: after writing this, I re-read your comment, and now I think we are on the same page.
Since we're communicating via rhetorical questions, are you asserting that it's literally impossible to contact Intel support with a defect, or merely tedious? Does it become easier or more difficult if you have a reproducible test case and the sort of connections a prominent FP compiler person has?
I'm trusting the account of the prominent FP compiler person who had a test case and wrote in the linked article that he found no way to submit the issue. You're the one saying this cannot be true, but you're not saying why you think so.
It's a bad analogy, yes. Under the doctor analogy, I'm not sure who the people are who suffered intermittent system problems for months longer than they would have had he reported the problems when he found them - not patients under his treatment, certainly.
> Doctors are not obligated to write up case studies or submit them to medical publications.
That's true, although I think we'd all agree that a person who has the knowledge to create a lifesaving treatment for a disease and doesn't bother writing it down because, well, writing is boring, is behaving rather unethically.
But this is merely computer science, not medicine.
> K-means is based on some quite wild assumptions - your data follows a specific case of the Gaussian distribution. Plus side is that the algorithm is relatively easy to understand and implement so it is a good starting point into clustering.
K-means is a non-parametric clustering algorithm, which assumes no underlying distributional properties of your data. It is hard to evaluate the quality of a clustering algorithm without some clearly defined objective. In particular, k-means's cousin in the classification community, k-nearest neighbors, is provably a Bayes-optimal decision strategy as k and the number of samples increase, regardless of the underlying distribution of your data (i.e., it is strongly consistent; see [0]).
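To see how distribution-free k-nearest neighbors is, here is a minimal sketch of the classifier in plain Python (the names `knn_predict` and the toy data are my own, not from any library): no model is fit at all; prediction is just a majority vote among the k closest training points. The consistency result mentioned above is asymptotic, of course; this example only illustrates the mechanics.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training points under Euclidean distance."""
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Toy data: two well-separated clusters labeled "a" and "b".
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.9, 5.2), "b")]

print(knn_predict(train, (0.05, 0.1)))  # -> a
print(knn_predict(train, (5.05, 5.0)))  # -> b
```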
> K-means can be levelled up with the EM algorithm
In fact, the k-means algorithm is a special case of fitting a mixture of k Gaussian distributions in which all the Gaussians are isotropic (they have identical covariance matrices that are uniform on the diagonal, i.e., a scaled identity matrix).
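A minimal sketch of Lloyd's algorithm makes the EM connection visible (this is my own illustrative code, not from any particular library): the assignment step is the E-step of an isotropic Gaussian mixture in the limit where the shared variance goes to zero (soft responsibilities collapse to a hard nearest-centroid assignment), and the mean update is the corresponding M-step.

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm on tuples of floats."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # E-step (hard): assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        # M-step: recompute each centroid as the mean of its cluster
        # (the zero-variance limit of the GMM's weighted-mean update).
        for j, cl in enumerate(clusters):
            if cl:
                centroids[j] = tuple(sum(c) / len(cl) for c in zip(*cl))
    return centroids

pts = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
       (4.0, 4.0), (4.2, 3.9), (3.9, 4.1)]
print(sorted(kmeans(pts, 2)))
```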
I think it's important to understand what's meant when someone says some piece of software is "verified".
In this case, when they say LLVM is verified, they mean that they defined a formal correspondence between (1) the semantics of the LLVM virtual machine language and (2) the actual execution of the underlying CPU (in this case, modeled using actual assembly language). Essentially, this amounts to writing two interpreters (one for LLVM assembly and one for low-level assembly), writing a compiler that generates low-level assembly from LLVM assembly, and proving that the interpreter for LLVM assembly does the same thing as the low-level interpreter running on the generated code.
The tricky part here is "proving that they do the same thing"; this means that, for any program written in LLVM, there is a corresponding program in low-level assembly that has equivalent semantics. Reasoning about all possible programs requires reasoning over an infinite family of objects; there are a few tools in mathematics for doing this, but by far the most useful for reasoning about programming languages is induction. Coq, the tool used in the linked page, is a very popular tool for writing inductive proofs on heterogeneous tree structures (aka, an abstract syntax tree).
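To make the shape of this concrete, here is a toy Python analogue (emphatically not the actual Vellvm or CompCert development): a miniature source language, a reference interpreter, a compiler to a stack machine, and the target machine's interpreter. A real proof covers *all* programs by induction on the syntax tree; all we can do here is check the correspondence on a finite enumeration of programs.

```python
import itertools

# A tiny source language: Exp ::= ("lit", n) | ("add", e1, e2)

def eval_exp(e):
    """Reference interpreter: the 'source semantics'."""
    if e[0] == "lit":
        return e[1]
    return eval_exp(e[1]) + eval_exp(e[2])

def compile_exp(e):
    """Compiler to a stack machine: the 'target language'."""
    if e[0] == "lit":
        return [("push", e[1])]
    return compile_exp(e[1]) + compile_exp(e[2]) + [("add",)]

def run_stack(code):
    """Target interpreter: the 'machine semantics'."""
    stack = []
    for op in code:
        if op[0] == "push":
            stack.append(op[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack[-1]

def all_exps(depth):
    """Enumerate every expression up to a given depth, mirroring the
    case structure a proof by structural induction would follow."""
    if depth == 0:
        return [("lit", n) for n in range(3)]
    smaller = all_exps(depth - 1)
    return smaller + [("add", a, b)
                      for a, b in itertools.product(smaller, repeat=2)]

# Finite check of the correctness statement
# eval_exp(e) == run_stack(compile_exp(e)) for all e.
assert all(eval_exp(e) == run_stack(compile_exp(e)) for e in all_exps(2))
```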
Ultimately, they use Coq to define the semantics of LLVM by writing (1) execution rules (for example, adding two integers results in a new int in a certain register) and, (2) in the case of LLVM (but not low-level assembly), typing rules which provide certain guarantees themselves. Coq is a proving environment which provides a functional dependently-typed programming language which can be used to construct and verify constructive proof arguments about programs. For example, you can prove existence (that there always exists an assembly language program for every LLVM program), or type soundness (that an LLVM program that is well-typed never performs an illegal operation according to its type semantics).
Ultimately, the value of proving this correspondence is that we know (1) that LLVM has expressive power equivalent to low-level assembly and (2) that, barring any bugs in the way the CPU executes its assembly, the generated assembly code will perform the same operations as the LLVM program it was compiled from.
Edit: As some follow-up posts point out, this particular project is more concerned with verifying intra-LLVM transformations (i.e., LLVM to LLVM). This is quite different from what I described; my post describes a verified compiler, similar to CompCert.
The focus of this particular project is to verify transformations that work on the LLVM intermediate language. So there is only one interpreter, for the LLVM assembly. This could be used e.g. to prove the correctness of compiler optimization passes.
Of course, the same LLVM semantics could also be used if someone wanted to verify a compiler backend that generates low-level assembly (similar to what CompCert does), but I haven't heard if anyone has any concrete plans to do that. Rather than verifying a completely new back-end, it would probably be quicker to write verified translations between the CompCert intermediate language and LLVM, and slot in the Vellvm verified transformations as an extra pass in CompCert.
> In this case, when they say LLVM is verified, they mean that they designed a formal correspondence between the (1) semantics of the LLVM virtual machine language and (2) the actual execution of the underlying CPU (in this case, modeled using actual assembly language).
I don't think that's accurate. Vellvm is verifying transformations of IR rather than executions on any given CPU.
You are right, perhaps my original post was unclear. The goal is to verify the transformation between LLVM and low-level assembly. I embellished slightly by equating the semantics of low-level assembly with CPU executions.
I'm pretty sure they're talking about IR to IR transformations (e.g. high level IR optimizations) rather than IR to low level assembly transformations (code generator backends). The former is grad school stuff; the latter is hard.