I think comments like these are borderline trolling in a community like HN.
The languages used in various parts of a CS curriculum are one component to the whole. Holding up that component as the reason that the sky is falling is, at best, disingenuous. I can construct terrible CS curriculums that start with C, and terrific ones that start with Java.
If those who construct the curriculum want the beginning courses to focus on algorithmic thinking, I think it's fair to use a language that abstracts away much of the physical machine. The abstractions can be peeled away in later courses.
If, instead, they want the beginning courses to focus on the realities and difficulties of dealing with computer systems, it makes sense to start with something like C. They can then introduce the abstractions that let people manage those difficulties in later courses.
I think both approaches are valid, as long as a student gets a view of the important points of the field. I can even see arguments why one approach might be better than the other. But claiming that one approach represents the failure of our CS academic system is zealotry.
You have to go one level meta. The reason C was taught was that it was close to the machine without actually being assembly language. That's pedagogically useful because it exposes the student to the properties of the underlying machine without having to be too specific about the implementation (not that MIPS assembly isn't also a good choice). Similarly, the reason to teach Smalltalk is to teach OO, and the reason to teach Lisp is to teach the lambda calculus. The only reason Java is taught is because students demand it, because they've heard that if you know Java you can get a good job.
Now, that doesn't imply that Java shouldn't ever be taught. But the reasons for choosing any language should be academic reasons. Particularly, it is bizarre to see academia trailing industry in language adoption.
You claim that "the only reason Java is taught is because students demand it," and it appears your objection rests on this claim. Do you have any support for it beyond your personal belief? My undergrad program made the switch to Java right before I graduated, and it is a counter-example to your claim.
Also, I doubt your claim about C is true. C was used because it was close to the machine without actually being assembly. I suspect C was then taught because it was used everywhere.
I think his claim is based on the belief that Java has no academic merit (in the context of the other languages he mentioned) - it's just useful for development.
One step further: if you think that learning a specific language at university is going to get you a job, you are not studying computer science; you should be going to a trade school.
A CS degree is universal; it should be language-agnostic.
One computer language or another doesn't matter one bit; they're all functionally equivalent. Just as a chef has 30 knives to choose from, you have a palette of languages to choose from to solve a given problem.
If you really understand computers then the languages are just a means to an end.
I disagree with your statement that choice of language "doesn't matter one bit." Some languages are better at some tasks than others. Appealing to functional equivalence ignores the relative cost (in time and characters typed) of expressing the same idea in different languages.
I suspect you're abusing the term "theoretical computer science," which is surprisingly common in this crowd. I assume you actually mean basic computer science concepts relevant to programming.
To address your question, Java abstracts away memory management - not just dynamic memory allocation, but common off-by-one mistakes will result in a runtime exception. It's possible, but unlikely, that you'll get a segfault in C. You probably own the memory just past your array, and you're more likely to get strange errors because you're invisibly overwriting values.
If you want to focus on algorithmic thinking, and not the realities of a computer, this is a win. As another poster pointed out above, I think other languages are better suited for this, but Java is still valid.
"If those who construct the curriculum want the beginning courses to focus on algorithmic thinking, I think it's fair to use a language that abstracts away much of the physical machine. The abstractions can be peeled away in later courses."
I don't know that Java does this all that much better than C. The problem with C for a beginning student isn't so much that you have to manage memory manually--it generally takes a few weeks to even get to malloc() in a C-based introductory course--but that C gets in your way with explicit typing, #includes, etc. Java does away with some of that but introduces its own OO scaffolding to get in your way too. Instead of having to write main()s and #include's, the Java student has to enclose their functions in a class and so forth. Let's compare Hello World in C and Java.
C:
    #include <stdio.h>

    int main(void)
    {
        printf("Hello, World!\n");
        return 0;
    }
Java:
    class HelloWorldApp
    {
        public static void main(String[] args)
        {
            System.out.println("Hello World!"); // Display the string
        }
    }
The Java example is even more cluttered than the C one when it comes to superfluous tokens: it has a class declaration, the method signature is unnecessarily elaborate, and the print statement has three levels of object drill-down in it. With the simple procedural programs a beginning student will write, this mysterious crud stays unexplained for longer. It's not enough to explain typing as you would in C or Pascal; you have to talk about object-oriented programming before you get to problems complex enough to justify that level of abstraction.
If you really want an abstract language to enforce algorithmic thinking, pick one that doesn't have all that extra mental burden when you first approach it.
Perl 5.8:

    print "Hello World!\n"

Perl 5.10:

    say "Hello World!"

Python:

    print "Hello World!"

Ruby:

    print "Hello World!"
The cool thing is that these languages still have subroutines and classes and so forth, but they don't force you to declare a class, declare a subroutine, and call an object method just to code "hello world".
Java has advantages over C. These advantages don't include "letting beginning programmers focus on algorithmic thinking by using high level abstractions". Java's higher level than C in that it protects you from naked pointers and lets you do OOP, but that's not the type of high-level abstraction that helps a beginning programmer, especially not when it comes at the cost of forcing them to put everything in classes and methods.
If those who construct the curriculum want the beginning courses to focus on algorithmic thinking, I think it's fair to use a language that abstracts away as much as possible. We have no shortage of good interpreted languages to accomplish this.
Actually, I'd argue that with C you have to start managing memory manually before you even get to malloc. The abstraction advantage of Java over C (not that I think Java is necessarily a better intro language) is that you can generally explain the syntax in abstract terms and then use it as you'd expect. With C, you're far more likely to encounter scenarios that don't fit a simple mental model.
For example, unless you're concerned with specific performance issues, you're not likely to care how a string is implemented in Java. It's difficult to use strings in C without understanding memory. Without understanding when strings are mutable and when they aren't, what null-terminated means, how "%s" works, and such you will quickly run into some unexpected behavior and will likely just trial and error until you get something that seems to work. When you understand that C is a syntax for allocating and manipulating memory, it tends to make a lot more sense.
I purposefully phrased my statement to allow for dynamic language like Python - which is what I would probably choose for a starting language. I constructed my comment to also address the same arguments that have popped up in the "MIT switched to Python" discussions.
I've taught intro-to-Java labs to undergrads, and I did get questions about the required scaffolding. I answered their questions, but also told them they didn't need to understand it yet. I'd rather not have to do that.
I would argue that Scheme fulfills the same purpose we discussed--getting out of the way and letting students focus on algorithms--except it emphasizes expressing those algorithms in a functional style.
I think the "Java <-> C"-debate is an instance of the problem "Should studying computer science be mindwreckingly hard or should studying computer science make you able to program things?". Plus, you can also bash Java in this special instance, which is always nice coughs.
Other instances include "Compiler construction or not?", "Theoretical computer science or not?", "Assembler or not?".
And then you had to debug something that resulted from corrupted pointers into the stack: partly still-good data, partly nonsense.
>I think the "Java <-> C"-debate is an instance of the problem "Should studying computer science be mindwreckingly hard or should studying computer science make you able to program things?".
Your wording is heavily loaded. In any case, learning computer science should absolutely not be about learning to program.
You make a good point - I was thinking that Java shouldn't be taught in introductory CS classes for people planning to be a CS major, but obviously didn't get that across.
It would be fairly easy for someone to learn Java once they've been taught C. Professors should not have to spend time going over pointers and memory management in an OS class, however.
I think OS classes are the best place to cover pointers and memory management, assuming you're covering how an OS works rather than teaching an intro-to-UNIX class.
Granted in our OS theory class we spent most of our time using / talking about ASM, but C could also work fairly well.
This course has four purposes. First, you will learn about the hierarchy of
abstractions and implementations that comprise a modern computer system. This
will provide a conceptual framework that you can then flesh out with courses such
as compilers, operating systems, networks, and others. The second purpose is to
demystify the machine and the tools that we use to program it. This includes
telling you the little details that students usually have to learn by osmosis. In
combination, these two purposes will give you the background to understand
many different computer systems. The third purpose is to bring you up to speed in
doing systems programming in a low-level language in the Unix environment.
The final purpose is to prepare you for upper-level courses in systems.
This is a learn-by-doing kind of class. You will write pieces of code, compile
them, debug them, disassemble them, measure their performance, optimize them,
etc.
At my school, between an intro class that used C and a class where we programmed microcontrollers in assembly, pointers and bit-twiddling were already well established by the time we got to OS classes.
At my university, Intro to Prog. is in C and C++ (yes, both), and the course that immediately follows is in Scheme, some made up language (we had to build an interpreter for that language in Scheme) and Haskell (for the motivated students).
No. Java and Python and Scala are not improvements. Why? They're too easy. Pointers are hard (relatively). Compiler design is hard. Complexity theory is hard. A loss in any one of these areas is a loss to the degree as a whole. The ACM ICPC helps a little in promoting intelligent problem solving (every CS student should be able to write a program that uses Dijkstra's algorithm in under 20 minutes, from memory), but it's the university's fault in the end. Fortunately, in the United States some of our top universities seem to have (thus far) escaped the treatment the others got. Stanford, Yale, Caltech, MIT, CMU, et al. all continue to teach much the same curriculum they taught intro CS students 10 years ago. Systems is a required course, taught in C. Intro programming is a required course, taught in Scheme/Lisp. Unfortunately, many schools did not fare so well and now teach Java or some such exclusively. I think this is partially to blame for the number of unfortunately bad computer science students with degrees. How to change this, I have no idea.
See? This syntax trouble is already lesson #1 you learn from pointers: The _address_ of some value and the _value itself_.
If you have a pointer (that is, an address), you need to prefix it with * to get at the value and do useful things with it. If you have a value, you need to prefix it with & to get the value's address so you can pass it around more efficiently (at least it will be more efficient if it's some large data blob).
Did anyone ever mention addresses and values and the difference between them when looking at Java from a user's point of view? Not to me, to be honest.
Pointers are easy. I taught several people in our CS program how to use them in ~2 hours. What's hard for most people is thinking abstractly about the gap between what the code looks like and what happens when you run it.
Pointers are simply the first thing that forces most coders to confront that split. A reasonably competent Java developer moving to C can pick them up in little time. The real problem is reading other people's C code that looks more like line noise than structure; pointers are a tiny step along that path.
Sadly, you are not entirely correct. Stanford's intro course uses Java (though learning C is also required) and MIT just switched from Scheme to Python.
I would agree that low-level programming should be taught at some point. But one course is sufficient and it could be one that is taken in second or third year.
Low-level memory issues should be avoided whenever possible by using a higher level language. There's a reason why garbage collection was invented.
Depends on what you're studying. System programming is and should always be done in C/assembly. Personally I am doing virtual machine design in C at the moment.
I'd rather see C taught than Scheme. Hardware matters, and a good CS education should be centered on languages that recognize that.
Software engineering, on the other hand, is too important a subject to be left to the schools. Let them learn Scheme on their own. They'll appreciate it more that way.
They should learn both. Any CS student who does not understand simple hardware concepts such as page faults does not deserve a degree. Similarly, every CS student should learn functional and imperative programming. Period.
But this would produce a generation of students that know how to do things we already know how to do really well. Maybe it would be better if we skipped page faults and taught things that would advance CS. CS has moved on to more interesting areas than simple page faults. Think about how massively parallel systems run, how distributed databases handle consistency, high-volume scaling, NLP, or any of the other areas of CS that are far more interesting than page faults.
Language details are all academic, pointless debates to be had by people who like one over another where the differences are often trivial (C++/Java). If you have a great mind and can understand what languages are doing then either will do you just fine.
I'd rather not teach the next generation of CS undergraduates the same old stuff that I was taught (and yes, that included how the VAX-11/750, one of the first machines to do so if I recall correctly, didn't have to have all of a program's data or code in memory...)
You need to understand what has come before so that you can build upon it effectively. Taking past progress for granted will not help develop the future; if anything, you'll get a lot more reinvented wheels.
Among the higher-level problems you mentioned, how many of them are interrelated? I'd rather see professors cover material that is useful in 80% of cases, even if they are well-trodden topics - students can specialise in the rest. In my CS course (UK) we studied parallel systems, including MapReduce; distributed databases and NLP can be studied optionally along with several other topics of much higher level than mere page faults. That said, not all CS courses are created equal.
You do not need to understand how your compiler works, or how De Morgan's laws are responsible for all those NOT gates in your CPU, to produce useful and meaningful CS.
Page faults illustrate the memory hierarchy, which is one of the most important ideas in computer systems. Programmers who don't understand the memory hierarchy write slow code. And you can never build the massively parallel systems, distributed databases, high-volume scaling, or NLP you list as more interesting without virtual memory, paging, and page faults.
You're confusing what's important to you with what's actually important.
"Programmers who don't understand the memory hierarchy write slow code" is just nonsense.
Most programmers today use languages where they aren't even aware of how and when memory is managed, let alone able to manipulate it. And this trend will continue as the hardware their programs run on is increasingly virtualized.
Let's face it, virtual memory is a done deal; it's now fundamental until we find computer hardware architectures whose address spaces never exceed the physically addressable memory. With virtualized kernels, it's possibly a better solution to boot operating systems that have no concept of constrained memory resources and let the virtualizer do the paging.
C would be my first preference as well, but I also appreciate that Scheme (or another functional language) would provide better opportunities than C or Java for students to learn more "advanced" algorithmic techniques such as recursion.
The problem with Java (and C, to a lesser extent) in basic problem solving is that you can almost brute-force-code and get a working solution that will get full credit, but after writing it a more elegant (and efficient) solution isn't always obvious. In functional languages, that "more correct" solution almost always seems to stand out more, at least to me.
Recursion is only considered an "advanced" algorithmic technique because people are taught to think of it that way. Recursion is actually pretty straightforward once you have a bit of practice using it.
Well, yes, most things do become easier with a bit of practice.
If a functional language was taught in beginner CS classes I think it would be easier for students to see how it works. It certainly "clicked" for me once I'd learned a bit of Scheme in my first AI class.