Give me a break; we're talking about vectors here.
I like Clojure; I spent some time messing around with it a couple of years ago and will one day actually use it for something, probably involving complex configuration, where code-as-data really shines, along with concurrency/performance.
But if you're talking about working over a ho-hum vector with 100-10k entries, a linear scan over a mutable, contiguous array will typically be faster than the most clever multithreaded code you can come up with, and take up less of the CPU while it's working. 10 cores are a Bad Idea for that kind of work.
Amdahl's law tells us we should look at larger units of concurrency in our architecture rather than getting all excited about some auto-parallelized map function. At that point, it starts being important how fast the (single-threaded!) individual tasks run.
Well, no. A linear scan over a large memory array is going to crap all over the CPU caches if you have to do it more than once.
Break the work into blocks smaller than the CPU cache and perform multiple stages on each block.
Having all that handy control-flow stuff makes it easier to get the block-oriented behavior you need to maximize performance, which in these cases is all about memory bandwidth.