Another avenue for learning is to start from hard problems. A hard problem is one that is very hard to solve without a known technique or solution; somebody who is unfamiliar with it has no clue how to even start. Solving such a problem adds new techniques to your tool-belt that apply to a whole class of problems.
Problems that I've found worth studying:
- Parsing. This is hard if you don't know the standard techniques: mostly recursive descent parsing with precedence -- or a parser generator, though that won't teach you much unless you write the generator yourself. Moreover, I've found that you quickly run into limitations with parser generators, and they don't make parsing easier in the long run; that's probably why most real-world parsers are hand-written (e.g. gcc, llvm). You can probably hack something together that sorta works, but knowing the right techniques makes it a breeze to write an efficient parser you have high confidence in. I've seen many ad-hoc parsers that would have benefited greatly from proper parsing techniques.
- Constraint solving. Examples are solving Sudoku, the zebra puzzle, SAT solving, optimal register allocation and many others. The three most important techniques are backtracking, constraint propagation and conflict learning. Even professional programmers struggle with these problems if they don't know the techniques (e.g. http://xprogramming.com/articles/oksudoku/).
- Interpretation/Compilation. Besides being a hard problem in its own right, writing an interpreter and a compiler makes you understand how programming languages work. This is sometimes seen as a black art, but it is quite easy once you've got the pattern.
- Numerical algorithms. The speed and simplicity of numerical algorithms is astonishing. Newton's method is particularly amazing: in about 10 lines of code you can solve equations, do state-of-the-art numerical optimization or solve differential equations. It can even handle problems that are usually considered to require specialized algorithms, like max flow and linear programming. It can also perform basic arithmetic operations, like division and square root.
- Machine learning. This gives you a different perspective on problem solving. Many people, when presented with the task of spam filtering or determining the language a snippet of text is written in, will respond with a hard-coded scheme based on a lot of rules. A much better approach is to take a data set of known text, analyze its statistics, and compare those with the snippet in question. The same applies to many other problems.
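To make the parsing bullet concrete, here is a minimal precedence-climbing parser (the core of most hand-written recursive descent parsers). The grammar and token set are a toy of my own choosing:

```python
import re

# Binary operator precedences for a toy arithmetic grammar.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def tokenize(src):
    # Integers, parens, and operators; everything else is silently dropped.
    return re.findall(r"\d+|[()+*/-]", src)

def parse(tokens):
    expr, rest = parse_expr(tokens, 0)
    assert not rest, "trailing input"
    return expr

def parse_expr(tokens, min_prec):
    # Precedence climbing: keep absorbing operators at or above min_prec.
    lhs, tokens = parse_atom(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens[0]
        # Left-associative: parse the right side one level tighter.
        rhs, tokens = parse_expr(tokens[1:], PRECEDENCE[op] + 1)
        lhs = (op, lhs, rhs)
    return lhs, tokens

def parse_atom(tokens):
    tok, rest = tokens[0], tokens[1:]
    if tok == "(":
        expr, rest = parse_expr(rest, 0)
        assert rest and rest[0] == ")", "expected ')'"
        return expr, rest[1:]
    return int(tok), rest

print(parse(tokenize("1+2*3")))   # -> ('+', 1, ('*', 2, 3))
```

Note how precedence falls out of a single integer comparison; the same loop handles any operator table.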
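For the constraint-solving bullet, here is a sketch of a Sudoku solver combining backtracking with two forms of constraint propagation (candidate elimination and most-constrained-cell-first); conflict learning is beyond a short sketch:

```python
def candidates(grid, r, c):
    # Constraint propagation: digits not ruled out by row, column, or box.
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]

def solve(grid):
    # grid is a 9x9 list of lists, 0 for empty; solved in place.
    empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
    if not empties:
        return True
    # Heuristic: branch on the cell with the fewest candidates first.
    r, c = min(empties, key=lambda rc: len(candidates(grid, *rc)))
    for d in candidates(grid, r, c):   # backtracking over remaining choices
        grid[r][c] = d
        if solve(grid):
            return True
        grid[r][c] = 0
    return False

# Demo: take a valid completed grid and knock out a few cells.
grid = [[(i * 3 + i // 3 + j) % 9 + 1 for j in range(9)] for i in range(9)]
grid[0][0] = grid[4][4] = grid[8][8] = 0
solve(grid)
print(grid[0][0], grid[4][4], grid[8][8])   # -> 1 9 8
```

Plain backtracking alone also works, but the propagation prunes the search tree enormously on hard instances.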
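The pattern behind the interpretation bullet is a tree-walking evaluator: recurse over an AST, dispatching on the node type. A sketch over a made-up tuple AST (the node shapes here are my own invention):

```python
def evaluate(node, env):
    """Tree-walking interpreter over nested tuples."""
    if isinstance(node, (int, float)):
        return node                      # literal
    if isinstance(node, str):
        return env[node]                 # variable reference
    op, *args = node
    if op == "+":
        return evaluate(args[0], env) + evaluate(args[1], env)
    if op == "*":
        return evaluate(args[0], env) * evaluate(args[1], env)
    if op == "let":
        # ("let", name, value_expr, body_expr): bind name, evaluate body.
        name, value, body = args
        return evaluate(body, {**env, name: evaluate(value, env)})
    raise ValueError(f"unknown op {op!r}")

# (let x (+ 2 3) (* x x))
print(evaluate(("let", "x", ("+", 2, 3), ("*", "x", "x")), {}))   # -> 25
```

A compiler follows the same recursion but emits code for each node instead of computing a value.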
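The "about 10 lines" claim for Newton's method holds up. A sketch, using a finite-difference derivative so it works on any black-box function (tolerances and step size are arbitrary choices of mine):

```python
def newton(f, x0, eps=1e-12, h=1e-7):
    """Find a root of f near x0 by Newton iteration x -= f(x)/f'(x)."""
    x = x0
    for _ in range(100):
        fx = f(x)
        if abs(fx) < eps:
            break
        dfx = (f(x + h) - fx) / h   # numerical derivative
        x -= fx / dfx
    return x

# Square root of 2, as the root of x^2 - 2:
print(newton(lambda x: x * x - 2, 1.0))   # -> 1.4142135623...
```

Swap in the gradient and Hessian of an objective function and the same iteration does numerical optimization.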
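The statistical approach in the machine-learning bullet can be sketched as language identification from character-bigram counts with naive-Bayes-style scoring. The training data here is a deliberately tiny toy set, and the add-one smoothing is crude:

```python
from collections import Counter
import math

def ngrams(text, n=2):
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def train(samples):
    # samples: language -> training text; model is per-language bigram counts.
    return {lang: Counter(ngrams(text)) for lang, text in samples.items()}

def classify(models, snippet):
    # Score each language by the log-likelihood of the snippet's bigrams,
    # with add-one smoothing so unseen bigrams don't zero out the score.
    def score(counts):
        total = sum(counts.values())
        return sum(math.log((counts[g] + 1) / (total + 1)) for g in ngrams(snippet))
    return max(models, key=lambda lang: score(models[lang]))

models = train({
    "english": "the quick brown fox jumps over the lazy dog and then the other",
    "german":  "der schnelle braune fuchs springt ueber den faulen hund und dann",
})
print(classify(models, "the fox and the hound"))   # -> english
```

No hand-written rules anywhere: the "rules" are whatever statistics the training data happens to contain, which is exactly the shift in perspective.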
I find this list quite striking. Actually I've not seen those particular topics all listed in one place before.
I have to admit I am slightly interested in the psychology of that list. Would this list be immediately on the mind of someone who teaches computer science for a living, would it be made by a self-taught programmer who happens to have studied all of these topics, or is it just received wisdom that these are complex programming tasks?
I think if I knew something about each of these, at least enough to say something non-trivial, I would explore them together in a blog or article. I find the list interesting because in some way it represents where computing is headed in the next few decades, and it should therefore inform the design of languages of the future (yes, I accept that it has informed the design of some of the languages of the past).
Great list. I found this comment more interesting than the OP, personally.
The best addition to the list that I can think of is natural language processing. Extracting information from text, especially. There's a ton of interesting pieces, and most of them are useful for one kind of large-scale internet application or another.
That leaves numerical algorithms and machine learning, which I agree are useful to understand anyhow, and for which different programming languages offer little leverage ;)
Haskell and parser combinators take an enormous bite out of the parsing problem, too. I won't call it "solved", but they bring it down to the point that writing a parser for a custom minilanguage is more like "a day's work" than "a month's work".
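The combinator idea translates outside Haskell as well. A toy sketch in Python (my own minimal design, nowhere near the power of a real combinator library): a parser is a function from input to either a (value, remainder) pair or None.

```python
def one_of(chars):
    # Match a single character from the given set.
    return lambda s: (s[0], s[1:]) if s[:1] and s[0] in chars else None

def seq(p, q):
    # Run p, then q on the remainder; succeed only if both succeed.
    def parser(s):
        if (r := p(s)) is None:
            return None
        v1, s = r
        if (r := q(s)) is None:
            return None
        v2, s = r
        return (v1, v2), s
    return parser

def many(p):
    # Apply p zero or more times, collecting the results.
    def parser(s):
        values = []
        while (r := p(s)) is not None:
            v, s = r
            values.append(v)
        return values, s
    return parser

number = many(one_of("0123456789"))
print(number("42abc"))                    # -> (['4', '2'], 'abc')
print(seq(number, one_of("x"))("7x!"))    # -> ((['7'], 'x'), '!')
```

The payoff is compositionality: grammars are built by gluing small parsers together with ordinary function application, no separate grammar file or code generation step.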
Once a problem is well supported in at least one language, it tends to stop being hard in many other languages too; e.g., `import something` solves many established problems in Python. http://xkcd.com/353/
Mercury (relative of Prolog) greatly simplifies constraint solving and parsing as well. Packrat parsers for PEGs (a kind of unambiguous grammar) are a natural consequence of Mercury's support for DCG notation and memoization. Mercury also has builtin support for user-defined constraint solvers.
What should be added to this list?