I wouldn't call constraint propagation "brute-force". It isn't enumerating every combination of values and backtracking (as either a non-contstraint Prolog program or a C program with up to 81 nested for-loops would). Rather, it's removing every possibility that has already been canceled out (constraint propagation), then saving undo information, making one guess, and seeing if that sets off another chain reaction or reaches a solution.
He acknowledges the possibility of using more "reasoning" in the article, but dismisses it:
"We could try to code more sophisticated strategies. For example, the naked twins strategy looks for two squares in the same unit that both have the same two possible digits. [...] Coding up strategies like this is a possible route, but would require hundreds of lines of code (there are dozens of these strategies), and we'd never be sure if we could solve every puzzle."
He acknowledges the possibility of using more "reasoning" in the article, but dismisses it:
"We could try to code more sophisticated strategies. For example, the naked twins strategy looks for two squares in the same unit that both have the same two possible digits. [...] Coding up strategies like this is a possible route, but would require hundreds of lines of code (there are dozens of these strategies), and we'd never be sure if we could solve every puzzle."
If you want to read more about constraint programming, there is an excellent overview chapter in CTM (http://www.info.ucl.ac.be/~pvr/book.html). "The Art of the Propagator" (http://dspace.mit.edu/handle/1721.1/44215) is also good, though it focuses more on arithmetic value propagation than set propagation problems like sudoku. For more advanced material, look at the clp(FD) papers by Danial Diaz (http://cri-dist.univ-paris1.fr/diaz/publications/cv-short.ht...); familiarity with Prolog terminology will be helpful.