I've build columnar OLAP databases and database engines in C++ for work. Now I'm...

I've build columnar OLAP databases and database engines in C++ for work. Now I'm doing it in my free time. Based on my experience the Phi and it's architecture is very exciting for OLAP databases workloads.

Reasons:

- Even in a OLAP database you end up with quite a few places that have very branchy code. Research on GPU friendly algorithms on things like (complex) JOINS and GROUP BY is pretty new. Additionally complex queries will functions and operations that you might not have a good GPU implementation for (like regex matching)

- Compression. You can use input data that compressed in anyway that there is a x86_64 library for. So you can now use LZ4, ZHUFF, GZIP, XZ. You can have 70+ independent threads decompressing input data (it's OLAP so it's pre-partitioned anyways). (Technically branching, again)

- Indexing techniques that cannot efficient implemented on the GPU can be used again. (Again branching)

- If you handle your own processing scheduling well, you will end up with near optimal IO / memory pattern (make sure to schedule the work on the core with local memory) and you not bound PCIe speed of the GPU. With enough PCIe lanes and lots of SSD drives you process as near memory speeds (esp. when we'll have Xpoint memory)

So the bottom line is if can intelligently farm out work in correct size chunks (it's OLAP so it's prob partitioned anyways) the the Phi is fantastic processor.

I'm primarily talking about the bootable package with Omni-Path interconnect (for multiple).