The Programming Paradigm to Find One Solution Among 8,080,104 Candidates

triska · on May 1, 2022

Constraint solvers are a great attraction of many Prolog systems, and in fact they are often an important reason for buying a fast commercial Prolog system. For example, here is a solution of the first puzzle shown in the article, using Scryer Prolog and its CLP(ℤ) solver, i.e., constraint logic programming over integers:

    :- use_module(library(clpz)).

    solution(N) :-
            N in 0..10_000,
            N #= 13*_,
            N #= 19*_,
            N #= 37*_.

Note how seamlessly constraint programming blends into Prolog: We simply state these constraints as relations that must hold for logic variables, just as always in Prolog! In fact, plain Prolog can be regarded as a particular instance of constraint logic programming (CLP), namely as constraint logic programming over Herbrand terms, abbreviated as CLP(H).

Sample query and answer:

    ?- solution(N), indomain(N).
       N = 0
    ;  N = 9139.

We see from this answer that there are in fact two solutions to the puzzle: 0 and 9139. Prolog reports all solutions on backtracking.

A nice feature of the Scryer Prolog toplevel is that we can simply press "a" to enumerate all solutions. For example, if we increase the upper bound to 100_000 as mentioned in the course of the article, we get:

    ?- N in 0..100_000,
       N #= 13*_, N #= 19*_, N #= 37*_,
       indomain(N).
       N = 0
    ;  N = 9139
    ;  N = 18278
    ;  N = 27417
    ;  N = 36556
    ;  N = 45695
    ;  N = 54834
    ;  N = 63973
    ;  N = 73112
    ;  N = 82251
    ;  N = 91390.

So, 11 solutions (not 10).

Hence, the sample code shown in the article should probably change:

    model.NewIntVar(1, ...)

to

   army = model.NewIntVar(0, ...)

in order to correctly model the equation shown before the code.

The knapsack problem shown in the article can also be easily expressed and solved:

    ?- Vs = [Bread,Meat,Beer],
       Vs ins 0..sup,
       Bread + 3*Meat + 7*Beer #=< 19,
       Popularity #= 3*Bread + 10*Meat + 26*Beer,
       labeling([max(Popularity)], Vs).

yielding solutions in decreasing order of popularity:

       Vs = [2,1,2], Bread = 2, Meat = 1, Beer = 2, Popularity = 68
    ;  Vs = [5,0,2], Bread = 5, Meat = 0, Beer = 2, Popularity = 67
    ;  Vs = [0,4,1], Bread = 0, Meat = 4, Beer = 1, Popularity = 66
    ;  Vs = [1,1,2], Bread = 1, Meat = 1, Beer = 2, Popularity = 65
    ;  ... .

This answer shows that the optimal solution with popularity 68 is unique in this case. The program output in the article shows "3712 popularity", yet the text also mentions 68: The number 3712 stems from using 1_000 as the capacity.

It is notable how compactly and conveniently such tasks can be expressed and solved with Prolog.

flobosg · on May 1, 2022

> Hence, the sample code shown in the article should probably change (...) in order to correctly model the equation shown before the code.

The equation should be 1 ≤ army ≤ 10000, since you cannot have an army of zero soldiers.

throwawaygh · on May 1, 2022

Slightly tangential, but I wonder if the massive amount of people using Python makes it easier to "sell" OR/SAT/etc. tools to developers.

I've been working with some young devs who only learned Python and web stuff (?!). One thing I've noticed about these devs is that they have some really bad intuitions about how long it takes to brute force problems like this. Brute forcing the hardest problem in this article requires less than 3 seconds for a C program but requires 3 minutes for a Python program.

Bumping the capacity to 10,000 (so from 8 million options to one trillion options) takes about 40 minutes in C, but would probably take all day in Python. Once you get much past 10,000 the effort of pulling in a library starts to make sense because, even though C brute force is still tractable, you need to start parallelizing things and that's more effort than an import statement.

Also, note to author: there are several mistakes in this article. There are bonds errors in your army examples and you used capacity = 19 instead of capacity = 1000 in one of the questions for the bread/meat/beer problem.

runeblaze · on May 1, 2022

I think it is just not in standard curriculum. The algorithms class deals with theoretical bounds. The programming classes deal with OOP design. Only in competitive programming did I actually learn how to estimate running time.

shoo · on May 1, 2022

i agree that naively trying to do compute heavy stuff in pure python compared to C / cython is likely to be 100x slower, and 1000x slower than a thoughtfully rewritten array-oriented version of the computation in C / cython that allocates slabs of memory up front and then avoids frequent tiny allocations anywhere near hot loops.

but

real world applied optimisation problems (not simple illustrative toy problems as in the article) often have an exponentially large number of candidate or feasible solutions, attempting to brute force the search space in C is not going to be a fruitful exercise. e.g. you might be trying to solve a facility location problem to place `m` depots on nodes in a graph with `n` vertices, where each node can have 0 or 1 depot, so the space of feasible solutions is `2^n`, the powerset of the set of nodes. for a smallish applied problem, n might be `n=5000` . there might be a bunch of set-cover-like constraints that require a bunch of demand nodes on the graph are close enough to a depot node to be "covered", and objective function costs for each depot constructed or the distance from each demand node to the closest depot.

commercial mixed integer solvers such as Gurobi have a whole bag of tricks to reduce the problem into a smaller problem during a presolve phase before the real MIP solver even gets woken up. a presolve phase employs a bunch of heuristics to detect common kinds of problem structure or substructure -- for example in the toy "optimisation and beer" problem it might immediately recognise that the only constraint is a knapsack constraint and palm the entire problem to a specialised knapsack solver which trivially solves the problem. more realistic applied problems are often weirder and messier -- more generally a presolve might be able to eliminate some but not all of the decision variables, or use substructure-specific techniques harvested from the literature to prove and inject additional constraints that make the real MIP solver's search space smaller.

throwawaygh · on May 2, 2022

> allocates slabs of memory up front and then avoids frequent tiny allocations anywhere near hot loops.

My observation was intended primarily as a pedagogic critique. The C solutions to these problems don't require any cleverness and are transliterations of the Python equivalents.

I agree re: optimizers of course. The article would be better if the final example was something where even really cleverly optimized "is this really correct?" enumeration techniques under-performed the optimizer.

vikingerik · on May 1, 2022

Yet if the Python program took 3 fewer minutes to write/test/debug than the C program, then that way does come out ahead.

throwawaygh · on May 1, 2022

The code is identical, modulo minor syntactic sugar. We're talking about 6 ints, 3 for loops, and 3 boolean comparisons between integers.

klyrs · on May 1, 2022

I sat down for 2 minutes and wrote a brute-force solution that ran the 10k scenario in 400ms in python. Hint: you only need two loops.

milansuk · on May 1, 2022

I wrote it in C with just one loop: for(i=0; i <= 10000; i+=37) and one condition: if(i > 0 && i%13==0 && i%19==0) and it takes only 0.000003 sec.

klyrs · on May 1, 2022

I was referring to the beer/meat/bread problem, but yeah. Brute force is ruthlessly effective for small problems.

shoo · on May 1, 2022

if all the inputs are constants and don't depend on program input, it'd be interesting to disassemble that to see if the compiler solved the problem at compile time and generated a "load <constant_precomputed_solution>"

you may not necessarily be timing what you think you are timing

throwawaygh · on May 2, 2022

> brute-force solution... Hint: you only need two loops.

Well, yes, there are all sorts of tricks, including the use of a solver :)

In my earlier post, read "brute-force" as "full enumeration". The comment was about dev's intuitions re: the time required to fully enumerate a space; ie when is it worth thinking about anything other than the stupidest solution possible.

klyrs · on May 1, 2022

Constraint solvers are indeed powerful tools, but 8M cases is just not that many. I don't have a python terminal handy, but I suspect that a brute-force triple for-loop wouldn't take more than a second.

Avshalom · on May 1, 2022

There's 8 million solutions to the constraint 1000 ≥ Bread×1 + Meat×3 + Beer×7.

a brute force triple loop would check a bunch of possibilities that aren't solutions either the very naive 1000×1000×1000 (for one billion) or slightly less naive 1000×333×142 (for 47.3 million).

Not that a billion would take that long either on a modern computer, but if you look at the OR-Tools solution, it's really not any longer or more complicated than loops would be. As long as you have good tools (coughprologcough) it doesn't take very long for constraint solving to come just as naturally as loops.

klyrs · on May 1, 2022

Sure, sure. I wasn't at a computer at the time and didn't pay the closest attention to the problem. Turns out that the brute-force solution involves two for-loops, not three, because you'd be a fool if you don't make bread the innermost loop and optimize it away by filling the remaining capacity.

So the N=1k takes about 6.5ms on my desktop, and the N=10k takes a mere 400ms. Only when I try 100k do I need to actually wait, for an egregious 40s.

And, a more trifling note, their variable bounds are too generous...

   meat = model.NewIntVar(0, 1 + capacity // 3, 'meat')
   beer = model.NewIntVar(0, 1 + capacity // 7, 'beer')

reduces the triple-loop estimate to a mere 47M cases.

deathanatos · on May 1, 2022

So the first part one only has 10k cases, and takes 542µs for Python on my laptop to find the two solutions that exist. (There's the 9,139 soldiers in the article, but "the enemy camp was abandoned", i.e., 0¹ also fits the given constraints.)

Part II only takes 73 µs for the small example, and 6.77s for the large one (the one in the title).

¹ah, I re-read the article and "The lower bound is 1 since we know there's an army" … but that's not ever established in the math parts. Do we know there's an army? Our scouts seem … eccentric. Of course, while the prose claims a lower bound, the math immediately below that paragraph contradicts it with "0 ≤ army ≤ 10000"… (given the waffling in the article on this bound, I suspect the author's initial implementation of a constraint solver probably returned 0, at which point they updates some parts of the article…)

tgv · on May 1, 2022

The first case consists of counts that are relative primes, so it's a multiple of the product. No need for code, just a hand-held calculator, but that's not the point of the post, of course.

djtriptych · on May 1, 2022

Had the same thought reading the title. If N = 8000000 and I only need the answer once, I'd definitely just brute force and like... count to ten.

dragontamer · on May 1, 2022

Modern CPUs are 4GHz processors.

If you need 100 clock cycles per brute-force check, you can brute-force 40 million cases per second.

8-million is actually very small. Modern computers (GPUs really) are on the order of reasonably brute-forcing 2^40-ish search spaces these days (or one-in-a-quadrillion, depending on the complexity per check)

~1GHz clock on GPUs x 4096 shaders == 4-trillion gpu-shader core cycles per second. If you're willing to wait 1-hour (3600 seconds), that's 14-quadrillion gpu-shader clock cycles.

charcircuit · on May 1, 2022

Don't forget that modern cpus have multiple cores so that's another 8 or 16 times speedup.

ceceron · on May 4, 2022

I am a big fan of the Constraint Programming paradigm. Still, there are several distinguishable approaches:

1) Constraint Logic Programming — quite cool as long as you have experience in Prolog, but choice of underlying solvers is limited: Prolog runtime has built-in solver.

2) Answer Set Programming - syntactically similar to Prolog, but later compiled (grounded) to elementary logic solvable with SAT techniques — very efficient.

3) Constraint Programming via Solver API — there are many solvers: Choco, Oscar, OR-tools that are usable programmatically. This approach is advertised in the article.

4) common API - there are libraries, e.g. Numberjack, that generalize API over several solvers

5) Constraint Programming via Modelling Language: there exist languages designed just to define Constraint Programming models. MiniZinc is at the moment the most mature and has the most extensive support from industry.

6) related techniques: SAT, SMT

Normally I start with 5) as it allows me to quickly prototype a solution for the problem. Only when it's not enough, I switch to 3).

OMGWTF · on May 1, 2022

  $ python3 -c 'import math; print(math.lcm(13, 19, 37))'
  9139

lqr · on May 1, 2022

If the author is here: There is a mistake in this article. In Section II, the text discusses a wagon capacity of 19, but the code and printed output are for a wagon capacity of 1000.

ogig · on May 1, 2022

Solvers use is one of those skills managers love, a lot.