Wirespeed | Founding Engineer | REMOTE (US) | Full-time | https://wirespeed.co
We are looking for a founding engineer to help us continue our mission of building a fully-automated Managed Detection & Response (MDR) platform.
Our ideal engineer has a broad set of experience across frontend, backend, infrastructure, cybersecurity, and data(bases). If you've been responsible for managing production infrastructure, talking to customers, and having full ownership of your features from start to finish, we'd love to talk with you.
We are looking for our first engineering hire to help us continue our mission of building a fully-automated Managed Detection & Response (MDR) platform.
Our ideal engineer has a broad set of experience across frontend, backend, infrastructure, cybersecurity, and data(bases). If you've been responsible for managing production infrastructure, talking to customers, and having full ownership of your features from start to finish, we'd love to talk with you.
What I still struggle to understand with these systems is they seem great for single resource authorization, but how do you perform bulk queries? For example, a user wants to query all blogs they have access to (assuming there are large amounts of them), does that require separate authorization logic in the DB?
Zanzibar in particular is designed to be able to answer the question "what can this user access?", or "who can access resource X?" as well as "can user Y access resource X?".
This article from OSO [1] explains how, with references to tweets from Lea Kissner (one of the authors of the paper and implementors) which are unfortunately less useful now that Twitter threads have been vandalised.
Full disclosure: I'm a maintainer of SpiceDB, the most mature open source project inspired by Zanzibar
For this exact use case, SpiceDB created two APIs not available in Zanzibar: LookupSubjects and LookupResources. For other scenarios, there's also a BulkCheck API for performing many checks with less request overhead. The sibling comment here is correct that there isn't filtering/sorting available in SpiceDB yet.
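To make the difference between a single check, a lookup, and a bulk check concrete, here is a minimal toy sketch in Python. It is not the SpiceDB API: the tuple layout, the `group:eng#member` userset notation, and the function names `check`, `lookup_resources`, and `bulk_check` are all assumptions made up for illustration, and a real system would not scan every candidate resource the way this does.

```python
from collections import defaultdict

# Relationship tuples in a Zanzibar-like style: (resource, relation, subject).
# All names and data here are hypothetical.
TUPLES = [
    ("blog:1", "viewer", "user:alice"),
    ("blog:2", "viewer", "group:eng#member"),  # userset: members of group:eng
    ("blog:3", "viewer", "user:bob"),
    ("group:eng", "member", "user:alice"),
]

def build_index(tuples):
    """Index subjects by (resource, relation) for quick traversal."""
    index = defaultdict(set)
    for resource, relation, subject in tuples:
        index[(resource, relation)].add(subject)
    return index

def check(index, resource, relation, user, _seen=None):
    """Can `user` reach `relation` on `resource`, directly or via a userset?"""
    seen = _seen if _seen is not None else set()
    if (resource, relation) in seen:
        return False  # guard against cycles in the relationship graph
    seen.add((resource, relation))
    for subject in index.get((resource, relation), ()):
        if subject == user:
            return True
        if "#" in subject:  # follow a userset such as group:eng#member
            obj, rel = subject.split("#", 1)
            if check(index, obj, rel, user, seen):
                return True
    return False

def lookup_resources(index, resource_type, relation, user):
    """Answer "what can this user access?" by checking every candidate."""
    prefix = resource_type + ":"
    candidates = {res for (res, rel) in index if rel == relation and res.startswith(prefix)}
    return sorted(res for res in candidates if check(index, res, relation, user))

def bulk_check(index, items, user):
    """Evaluate many (resource, relation) checks in one call."""
    return {(res, rel): check(index, res, rel, user) for res, rel in items}
```

With this data, `lookup_resources(build_index(TUPLES), "blog", "viewer", "user:alice")` returns the two blogs alice can see, one granted directly and one through group membership. Note that the result is just a list of IDs, which is why filtering and ordering still have to happen somewhere else.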
Additionally, there are folks using SpiceDB today by replicating denormalized checks back into their database (e.g. Postgres) or search index (e.g. Elastic) so that you can filter them natively. This is the combination of the aforementioned Lookup APIs with our Watch API. While this strategy requires moving parts, it is necessary beyond a particular scale which is well beyond the point at which policy engines typically fall over.
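As a rough sketch of what that replication strategy buys you, the Python below keeps a denormalized access table next to the application data in SQLite and answers a filtered, ordered, paginated listing with one query. Everything here is an assumption for illustration: the `blogs` and `blog_access` tables, the `("grant" | "revoke", user, blog)` change format, and the idea that some process feeds those changes in from the authorization system. It does not use any SpiceDB client.

```python
import sqlite3

def make_db():
    """Create an in-memory DB with app data plus a denormalized access table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE blogs (id INTEGER PRIMARY KEY, title TEXT, created_at TEXT);
        CREATE TABLE blog_access (
            user_id TEXT NOT NULL,
            blog_id INTEGER NOT NULL,
            PRIMARY KEY (user_id, blog_id)
        );
    """)
    conn.executemany(
        "INSERT INTO blogs (id, title, created_at) VALUES (?, ?, ?)",
        [
            (1, "Intro to authz", "2024-01-01"),
            (2, "Scaling checks", "2024-02-01"),
            (3, "Private notes", "2024-03-01"),
        ],
    )
    return conn

def apply_change(conn, op, user_id, blog_id):
    """Apply one replicated permission change (hypothetical change format)."""
    if op == "grant":
        conn.execute(
            "INSERT OR IGNORE INTO blog_access (user_id, blog_id) VALUES (?, ?)",
            (user_id, blog_id),
        )
    elif op == "revoke":
        conn.execute(
            "DELETE FROM blog_access WHERE user_id = ? AND blog_id = ?",
            (user_id, blog_id),
        )
    else:
        raise ValueError(f"unknown op: {op}")

def list_blogs(conn, user_id, title_like="%", limit=10, offset=0):
    """Filter, order, and paginate natively, restricted to accessible blogs."""
    return conn.execute(
        """
        SELECT b.id, b.title
        FROM blogs AS b
        JOIN blog_access AS a ON a.blog_id = b.id
        WHERE a.user_id = ? AND b.title LIKE ?
        ORDER BY b.created_at DESC, b.id DESC
        LIMIT ? OFFSET ?
        """,
        (user_id, title_like, limit, offset),
    ).fetchall()
```

The trade-off is visible in `apply_change`: the listing query is trivial, but the access table is only as correct as the stream of changes feeding it.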
While I'm biased, I do find this article somewhat misleading when describing Zanzibar-inspired systems; it presents opinion without any evidence or examples to justify the claim and concludes it as fact, but that might be because they're leaning on their previous article. Zanzibar is novel because it is fundamentally designed to be run at the edge and solves the difficult problem of keeping the view of data at the edge consistent. This article conveniently leaves out how other systems get data to the edge while still keeping it consistent for their authorization logic. Latency is also brought up, but we recently managed to scale SpiceDB to >1M requests per second with 100B relationships while maintaining a 5ms p95 measured at the client application[0]. The claim that you absolutely need a service to run a Zanzibar system is a provably false claim based on the number of clusters in the wild running SpiceDB or Ory's Keto project.
> Additionally, there are folks using SpiceDB today by replicating denormalized checks back into their database (e.g. Postgres) or search index (e.g. Elastic) so that you can filter them natively. This is the combination of the aforementioned Lookup APIs with our Watch API. While this strategy requires moving parts, it is necessary beyond a particular scale which is well beyond the point at which policy engines typically fall over.
Would you say that because of this, Zanzibar engines like spicedb only become useful on systems of a certain size / complexity? Fundamentally you run into data synchronization issues whether you are syncing denormalized data back to your db via Watch or whether you write the relationships to both data stores in the first place. This article[0] on the latter topic touches on this, but brushes over some trickier parts of implementing such a thing correctly (e.g. the "2 writes" section only covers inserts, where a ghost write that persists in spicedb is generally less harmful, and not updates or deletes; the streaming updates section brushes over some major footguns).
Granted there's nothing unique to spicedb in this sort of complexity, but by nature of being a db, using spicedb mandates that users must take on the complexity.
Is it then fair to say that it is appropriate to use spicedb once a project reaches a certain size / complexity, or would you expect a startup to adopt it from the beginning?
>Fundamentally you run into data synchronization issues whether you are syncing denormalized data back to your db via Watch
The Watch and Lookup APIs emit revisions so that any replicated data can include revisions to guarantee consistency. The linked article covers replicating data into SpiceDB and not the other way around; this is generally done for brown-field projects and does come with consistency trade-offs.
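One way to picture how a revision attached to replicated data helps is the toy Python sketch below. It assumes revisions are plain increasing integers and that changes arrive in batches tagged with a revision; the `Replica` class, `StaleReplica` error, and the change tuples are all hypothetical and do not model how SpiceDB actually encodes or compares revisions.

```python
from dataclasses import dataclass, field

class StaleReplica(Exception):
    """Raised when the replica has not caught up to the revision a reader needs."""

@dataclass
class Replica:
    revision: int = 0                          # last revision fully applied
    access: set = field(default_factory=set)   # {(user, resource), ...}

    def apply(self, revision, changes):
        """Apply one batch of changes emitted at `revision` (hypothetical format)."""
        if revision <= self.revision:
            return  # already applied; replaying a batch is a no-op
        for op, user, resource in changes:
            if op == "grant":
                self.access.add((user, resource))
            elif op == "revoke":
                self.access.discard((user, resource))
            else:
                raise ValueError(f"unknown op: {op}")
        self.revision = revision

    def can_read(self, user, resource, at_least):
        """Answer from the replica only if it is at least as fresh as `at_least`."""
        if self.revision < at_least:
            raise StaleReplica(f"replica at {self.revision}, need {at_least}")
        return (user, resource) in self.access
```

A caller that just wrote a permission change at revision 2 can pass `at_least=2`; if the replica is still at revision 1 it gets `StaleReplica` and can fall back to asking the authorization service directly instead of trusting stale data.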
It's true that this complexity isn't unique to SpiceDB. The important part is that SpiceDB makes this _possible_ because if you architect a solution where it isn't, you'll find one day you've backed yourself into a corner.
>Is it then fair to say that it is appropriate to use spicedb once a project reaches a certain size / complexity, or would you expect a startup to adopt it from the beginning?
I touch on this subject briefly in this post[0]. Unfortunately, there's no dead simple answer. We do have customers that are startups in various stages, but they all deeply considered the implications of focusing on authorization before they jumped in. IME, startups really need to find product market fit first. Build your MVP using whatever it takes and only move on to thinking about authorization when it becomes critical. When is it critical, but not too late? I think that's once you start noticing that each PR implementing a feature request is also touching authorization code/SQL. There are also other big signals: microservices architecture or enterprise customers are almost certain indicators that your authorization logic isn't going to remain a small library in your monolith.
Jimmy, I truly think you're awesome (and so is SpiceDB), but the irony here stands out:
"it presents opinion without any evidence or examples to justify the claim and concludes it as fact"
You mean stuff like:
1) "SpiceDB, the most mature open source project inspired by Zanzibar" (though I'd vouch for that one)
2) "it is necessary beyond a particular scale which is well beyond the point at which policy engines typically fall over."
3) "Zanzibar is novel because it is fundamentally designed to be run at the edge"
4) "we recently managed to scale SpiceDB to >1M requests per second with 100B relationships while maintaining a 5ms p95 measured at the client application" - to be fair, you should bundle that statement with the caveat that you need to set it up within your own VPC.
5) "The claim that you absolutely need a service to run a Zanzibar system is a provably false claim based on the number of clusters in the wild running SpiceDB or Ory's Keto project" - how many clusters? :)
Re: "This article conveniently leaves out how other systems get data to the edge while still keeping it consistent for their authorization logic"
The article actually does mention OPAL [0]
Your critique of my comment is quite fair; we're both guilty of making claims, but not including all the supporting evidence for brevity's sake. I think we can both agree that everyone working in this space is doing awesome work and bringing authorization the attention that it has sorely needed.
What do you mean by separate authorization logic? There are many layers to auth, and usually they act as interceptors in the request path that run very fast. If you have blanket permission to list, you are able to list the resources you have access to... that's trivial. However, `Blog` resources might have explicit deny policies on them as well, so yes, those are also evaluated. Not sure how else you'd expect it to work, short of caching the current state of resources and access.
Yes, you need to consider authorization at every layer. You can blanket deny a lot of things in a midlayer, but sooner or later you need to start interpreting business logic to do the rest.
Data filtering is implemented differently across the policy engines mentioned. For OPA, custom Rego code could return the allowed data, and the caching mechanism will ensure its consistency and reliability. For Zanzibar, since the policy is derived from the data relations, data filtering is an intrinsic part of the design described in the paper.
I recommend the following article for more information about policy as code and policy as data to understand the context better - https://www.permit.io/blog/zanzibar-vs-opa
This, especially when you combine it with pagination, filtering and ordering requirements.
Zanzibar implementations (e.g. SpiceDB, Keto, etc.) offer functionality for listing resources accessible to a given principal, but as far as I can see none have a coherent solution for filtering and ordering.
The only solution I can see to this is, as you suggest, maintaining a shadow copy of the relationships in your db so you can answer the question with a regular SQL query. This obviously comes with a lot of headaches, and is the sole factor preventing us from adopting one of these systems, so I really hope I'm wrong about this.
From short scans of the papers, at least with Zanzibar, AFAIK you can define entities and relations (think groups of users and directories) and infer rights based on those. I'm assuming Zanzibar backs the actual Google 360 document sharing so presumably it would scale for that use-case.
The Google paper refers to the existence of some 'permissions-aware index' (paraphrasing) that's used to answer range queries like this, but doesn't cover how this index would work.
I know various Zanzibar implementations have exposed APIs to solve this problem, but I still don't have a great intuitive understanding of how they work beyond 'push the ACL logic into the data layer', which brings us back to a pre-Zanzibar world.
I always find myself coming back to it, both for work and personal life. I'm doing the whole "build in public" thing for my SaaS and use the public notion page to warehouse all of our updates and information.
I've found https://getoutline.com to be a pretty solid contender, although with slightly less functionality.
In an environment where your DB may be combined with an Elasticsearch cluster, caches, etc... how do you realistically use branching features in a DB? I love the idea, but idk how I could test if my ES cluster doesn't have a similar branching feature.
Thanks! We just switched to Docusaurus, would absolutely recommend. We started w/ a separate marketing site and docs site, but since our team are basically all engineers, we just decided to use Docusaurus for everything.
We are looking for a founding engineer to help us continue our mission of building a fully-automated Managed Detection & Response (MDR) platform.
Our ideal engineer has a broad set of experience across frontend, backend, infrastructure, cybersecurity, and data(bases). If you've been responsible for managing production infrastructure, talking to customers, and having full ownership of your features from start to finish, we'd love to talk with you.
Tech stack:
We don't care about your resume, tell us why you're awesome: https://wspd.link/founding-engineer