In the analysis he counts instruction set features like the number of registers, the number of addressing modes, and the number of memory accesses per instruction. He compares over a dozen architectures.
ARM comes out as the least RISCy RISC, but definitely on that side of the line, and x86 as the least CISCy CISC. (This was before amd64.)
It's for sure a hybrid, given that they were microcoded on early ARM cores. But it was a really, really useful halfway point: those early ARM cores lacked caches, unlike prototypical RISC chips, so these instructions would otherwise have been competing with the memory transfers themselves if they didn't maximize density down to a single aligned instruction.
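Rough back-of-envelope numbers, assuming the instructions in question are the load/store-multiple ones (LDM/STM) and a cacheless core where instruction fetches and data accesses share the same memory bus: moving eight registers with eight single-word loads versus one multi-register transfer looks roughly like

\[
\underbrace{8}_{\text{instr. fetches}} + \underbrace{8}_{\text{data reads}} = 16 \text{ bus cycles}
\qquad\text{vs.}\qquad
\underbrace{1}_{\text{instr. fetch}} + \underbrace{8}_{\text{data reads}} = 9 \text{ bus cycles,}
\]

which is the density argument in a nutshell.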
> A RISC is a computer with a small, highly optimized set of instructions
but later:
> The term "reduced" in that phrase was intended to describe the fact that the amount of work any single instruction accomplishes is reduced—at most a single data memory cycle—compared to the "complex instructions" of CISC CPUs that may require dozens of data memory cycles in order to execute a single instruction
From this[1] piece it seems the original goal was indeed both:
> Cocke and his team reduced the size of the instruction set, eliminating certain instructions that were seldom used. "We knew we wanted a computer with a simple architecture and a set of simple instructions that could be executed in a single machine cycle—making the resulting machine significantly more efficient than possible with other, more complex computer designs," recalled Cocke in 1987.
IIRC the idea is that the measure of complexity is ultimately how directly the instruction set maps to the underlying implementation. For example, VLIW machines follow the same principles but applied to superscalar execution: they favour explicit parallelism encoded in the instruction stream, as opposed to dynamic circuitry implementing instruction reordering and dependency tracking.
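To make that concrete, here's a rough sketch of one loop body as both kinds of machine would see it. The three-slot bundle layout in the comments is invented purely for illustration and doesn't correspond to any real VLIW encoding:

```c
#include <stdio.h>

/* Two independent accumulations in one loop body: the dot-product chain and
 * the plain sum do not depend on each other, only on the loads feeding them. */
void dot_and_sum(const int *a, const int *b, int n, int *dot, int *sum) {
    int d = 0, s = 0;
    for (int i = 0; i < n; i++) {
        d += a[i] * b[i];   /* chain 1: load a[i], load b[i], mul, add */
        s += a[i];          /* chain 2: reuses a[i], independent add   */
    }
    *dot = d;
    *sum = s;
}

/* An out-of-order superscalar core discovers at run time that the two adds can
 * issue in the same cycle.  A VLIW compiler instead bakes that decision into
 * the instruction stream, e.g. one (hypothetical) bundle per cycle with a
 * fixed slot per functional unit:
 *
 *   { load r1=[a+i] | load r2=[b+i] | nop             }
 *   { mul  r3=r1*r2 | add  s=s+r1   | add  i=i+1      }
 *   { add  d=d+r3   | nop           | branch if i < n }
 *
 * Empty slots are explicit nops; the hardware does no reordering or
 * dependency tracking beyond what the bundles already encode. */

int main(void) {
    int a[] = {1, 2, 3, 4}, b[] = {5, 6, 7, 8};
    int dot, sum;
    dot_and_sum(a, b, 4, &dot, &sum);
    printf("dot=%d sum=%d\n", dot, sum);  /* dot=70 sum=10 */
    return 0;
}
```

The point being that in the VLIW case the schedule is frozen at compile time, so the hardware needs no reorder buffer or scoreboard, in the same spirit as RISC pushing work out of the decoder and into the compiler.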