We could definitely expand on the search protocol in our whitepaper, thanks for the input! However, we provide client-side search rather than server-side search, so our servers have no idea what the user is searching for.
The keywords (along with other file metadata) are also encrypted using the user's key, so analyzing the access patterns of the chunks would not be possible for us.
Client-side search is secure if the client downloads all the metadata and then fetches nothing else based on the result. The access pattern is simply which encrypted chunks are selected, and how many. If they are selected based on a search result, there could be a statistical correlation.
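To make the correlation concrete, here is a minimal sketch of what a storage server could observe. Everything in it is hypothetical (the index, the chunk layout, the names `INDEX`, `CHUNKS` and `fetches_for`); it is not how any particular product stores data, it only illustrates the point that opaque chunk ids can still form recognizable patterns.

```python
from collections import Counter

# Hypothetical local index (already decrypted on the client): keyword -> file ids.
INDEX = {"invoice": ["f1", "f2"], "xray": ["f3"]}
# Hypothetical layout: file id -> opaque chunk ids held by the server.
CHUNKS = {"f1": ["c1", "c2"], "f2": ["c3"], "f3": ["c4", "c5", "c6"]}

def fetches_for(keyword):
    """Chunk ids the client requests after a local search for `keyword`."""
    return frozenset(c for f in INDEX.get(keyword, []) for c in CHUNKS[f])

# The server never sees the keyword, only the set of chunk ids per session.
server_log = [fetches_for(k) for k in ["invoice", "xray", "invoice", "invoice"]]

# Without decrypting anything, the server can count recurring fetch sets.
observed = Counter(server_log)
most_common_set, count = observed.most_common(1)[0]
print(sorted(most_common_set), count)
```

The server learns that one particular set of chunks was requested three times out of four sessions. That alone does not reveal the keyword, but combined with side knowledge it is the kind of statistical signal described above.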
Not every leak is critical. For example, if you write chunks when the user uploads a file, the client has leaked that an upload happened and how much data was written. Depending on the use case, this might not expose sensitive information. It might, however: if a hospital has an app that lets you download information about your disease, an adversary could learn which disease you have without compromising the encryption.
The most advanced technique for hiding which chunks are accessed, and whether they are written or read, is called ORAM (oblivious RAM). ORAM makes chunk accesses indistinguishable. However, it has a logarithmic overhead, it fails to hide how many chunks are accessed, and it is hard to design a search protocol on top of it that does not create patterns in the ORAM accesses, which can also be analyzed.
A practical solution is a search protocol that tries to decouple the results from the accesses.
This paper is just an idea. I'm designing a different protocol that is more efficient but requires a persistent client cache, which is only just becoming a reality for web clients.
The encrypted file metadata (and the search indices) are downloaded all at once and then decrypted on the device using the user's key. The user then performs a client-side search to get the relevant file-ids, which are then used to retrieve the files from the decentralized storage.
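As a rough illustration of that flow, here is a minimal sketch. The `Server` class is a stand-in for the decentralized storage, `client_search` and the metadata layout are made up for this example, and `keystream_xor` is a toy placeholder cipher that is NOT secure; a real client would use proper authenticated encryption.

```python
import hashlib
import json

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Placeholder cipher (NOT secure): XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for block in range(0, len(data), 32):
        pad = hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[block:block + 32], pad))
    return bytes(out)

class Server:
    """Hypothetical storage backend that records every request it sees."""
    def __init__(self, metadata_blob, files):
        self.metadata_blob = metadata_blob
        self.files = files
        self.log = []

    def get_metadata(self):
        self.log.append("GET metadata")
        return self.metadata_blob

    def get_file(self, file_id):
        self.log.append(f"GET file {file_id}")
        return self.files[file_id]

def client_search(server, key, keyword):
    # 1. Download all encrypted metadata in one request, decrypt on device.
    metadata = json.loads(keystream_xor(key, server.get_metadata()))
    # 2. Search locally; the keyword never leaves the client.
    hits = [m["id"] for m in metadata if keyword in m["keywords"]]
    # 3. Fetch the matching files by id.
    return hits, [server.get_file(i) for i in hits]

key = b"user-key"
metadata = [{"id": "a1", "keywords": ["tax", "2023"]},
            {"id": "b2", "keywords": ["holiday"]}]
blob = keystream_xor(key, json.dumps(metadata).encode())
server = Server(blob, {"a1": b"<encrypted a1>", "b2": b"<encrypted b2>"})

hits, _ = client_search(server, key, "tax")
print(hits, server.log)
```

In this sketch the server's log contains the bulk metadata download and the file-id fetches, but never the keyword. The file fetches in step 3 remain visible to the server.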