You are talking about implementation; the OP was talking about raising the concept with interested parties and seeing whether it is even worth starting to think about.
They could fork, or they could add threading to some subsystems and roll it out over several versions.
I don't know enough about the code, and of course it is a hard problem, but the solution might be to build it from the ground up as a threaded system, using the skills learned over 30 years and taking the hit on the rebuild instead of reworking what is there.
I am most interested because I didn't realise there was a performance problem in the first place.
Am I going crazy, or has the obvious implementation of such a change been lost on people? If they were proposing taking a multi-threaded app and splitting it into a multi-process one, I would predict they would find a hell of a lot of unexpected or unknown implicit communication between threads, which would be a nightmare to untangle.
Going the other way, there is an extremely well understood interface between all the processes which run in isolation: shared memory. Nearly by definition this must be well coordinated between the processes.
So the first step in moving to a multi-threaded implementation would be to change nearly nothing about each process, and then just run each process in its own pthread, keeping all the shared memory ‘n all.
You would expect performance to be about the same, maybe a little better with the reduced TLB churn, but the architecture is basically unchanged. At that point, you can start to look at which communication/synchronisation mechanisms are more appropriate now that you’re working in the same address space.
I just don’t understand why so many people seem to think this requires an enormous rewrite - having developed as a multi-process system means you’ve had to make so many of the problematic things explicit and control for them, and none of these threads would know anything at all about each other’s internals.
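To make that first step concrete, here is a toy sketch of the idea, not anything taken from the actual codebase. Every name in it (backend_main, run_backends, the single-counter layout) is made up for illustration, and Python's threading module stands in for pthreads. The entry point is written as if it were its own process: it attaches to the shared segment by name and only touches it under an explicit lock. The only thing that changes is the launcher, which starts threads instead of processes.

```python
import struct
import threading
from multiprocessing import shared_memory

# Hypothetical layout: one signed 64-bit counter at offset 0 of the segment.
COUNTER_FMT = "q"


def backend_main(shm_name, lock, iterations):
    """Entry point written as if it were a separate process.

    It attaches to the shared segment by name and knows nothing about
    the other workers, so it does not care whether it was launched as
    a process or as a thread.
    """
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        for _ in range(iterations):
            with lock:  # explicit coordination, as in the multi-process design
                (value,) = struct.unpack_from(COUNTER_FMT, shm.buf, 0)
                struct.pack_into(COUNTER_FMT, shm.buf, 0, value + 1)
    finally:
        shm.close()


def run_backends(n_backends, iterations):
    """Launcher: the one place that changes, starting threads, not processes."""
    shm = shared_memory.SharedMemory(
        create=True, size=struct.calcsize(COUNTER_FMT)
    )
    # Assumption: a plain thread lock replaces whatever cross-process
    # lock the original design would have used.
    lock = threading.Lock()
    try:
        struct.pack_into(COUNTER_FMT, shm.buf, 0, 0)
        workers = [
            threading.Thread(
                target=backend_main, args=(shm.name, lock, iterations)
            )
            for _ in range(n_backends)
        ]
        for worker in workers:
            worker.start()
        for worker in workers:
            worker.join()
        (total,) = struct.unpack_from(COUNTER_FMT, shm.buf, 0)
        return total
    finally:
        shm.close()
        shm.unlink()


if __name__ == "__main__":
    print(run_backends(4, 1000))
```

The workers still go through the named segment even though they now share an address space, which is the "keep all the shared memory" part. Swapping that for direct in-memory structures would be the later step, once everything is known to work as threads.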