Couchbase as Storage Engine


Opening this issue in case anyone else is interested in using @Couchbase as a backend store with Cayley (I definitely am)

Hello, I would be really interested in having @Couchbase as the backend store for Cayley. Where do we start? :wink:

Definitely interested.

Here’s an IRC convo on #cayley I had with @barakmich about adding Couchbase as a backend (I’ve been busy so still haven’t had the opportunity to dive into this myself, though I’m still very much interested):

PaulCapestany: I haven’t had a chance to really dig into it much… but on a very superficial level, it seems cayley is meant to be backend-agnostic

barakmich: PaulCapestany: Totally meant to be backend-agnostic. Implementing a new backend is implementing two interfaces – it’ll be slow at first, but work – and you can improve things incrementally

barakmich: So if you’re big on Couch, I’m definitely open to having it, if you want to dig into writing it.

PaulCapestany: barakmich: hey, thanks for the response! it’s been a loooong time since I looked into it, but I vaguely remember there being some open issues on how to have Cayley properly deal with scaling up (I forget if it was on the backend storage side, and/or also scaling Cayley nodes themselves as well…)

PaulCapestany: like, replication issues

PaulCapestany: (trying to find the issues on Github right now…)

barakmich: Ah, right. That depends a little on the backend, but not too much.

barakmich: There’s even a hook for it now, the writer interface, if that’s your cup of tea

PaulCapestany: barakmich I think this was one of the threads that made me back off from attempting anything with Couch

barakmich: Right. So similar to Cassandra, one needs a writer interface that can potentially use Cassandra itself as a consensus layer, as necessary. Though, there’s a fair argument it’s not.

barakmich: s/Cassandra/Couch/ for similar things

barakmich: Actually, this part may “just work” already, as there’s some relationship to the way Postgres happens.

PaulCapestany: hmm. I may have to dig into this a bit more now in that case…

barakmich: For example, we’re running a half-dozen Cayley-backed processes against the same RDS

barakmich: Which is more or less the same model with Cassandra/Couch/et al

PaulCapestany: hmmm

PaulCapestany: barakmich: what’s the story when querying a cluster of cayley nodes? (sorry if it’s a dumb question, I haven’t looked at the repo in a while)… are you saying that part “just works” now, as in you can just hit one endpoint and that’s it?

PaulCapestany: and like, i guess you’re saying that cayley nodes are indeed clusterable?

barakmich: You might need a stock HTTP loadbalancer, but yeah

barakmich: And there may be some small changes to make per backend, eg, taking a list of storage endpoints

PaulCapestany: barakmich: the “sharding” story was I think one of the other things that prevented me from really digging into cayley previously as well, just for clarification, that should just work as well?

PaulCapestany: (I’m asking because the last time I checked, Titan seemed to be the only GraphDB that promised to make scaling clusters as painless as possible…)

barakmich: So that’s a hard one in general. Sharding is easy – give it to a sharded backend and that works. Sharding well is tricky – queries slow down a bunch, depending on depth. But it does work.

barakmich: They have the same constraints; nothing new under the sun :slight_smile:
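To make the "queries slow down a bunch, depending on depth" remark from the log concrete, here is a toy model in Go. It is only a sketch under loud assumptions: `shardedGraph`, `shardFor`, `neighbors`, and `traverse` are hypothetical names invented for illustration, plain maps stand in for backend shards, and none of this is Cayley or Couchbase code or a measurement of either. It just counts how many shard requests a naive traversal issues as the depth grows.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardedGraph spreads adjacency lists across shards by hashing the node name.
// Hypothetical type: each map stands in for one shard of a real backend.
type shardedGraph struct {
	shards  []map[string][]string
	lookups int // number of simulated shard requests issued so far
}

func newShardedGraph(n int) *shardedGraph {
	g := &shardedGraph{shards: make([]map[string][]string, n)}
	for i := range g.shards {
		g.shards[i] = make(map[string][]string)
	}
	return g
}

// shardFor picks the shard that owns a node using an FNV-1a hash.
func (g *shardedGraph) shardFor(node string) int {
	h := fnv.New32a()
	h.Write([]byte(node))
	return int(h.Sum32() % uint32(len(g.shards)))
}

func (g *shardedGraph) addEdge(from, to string) {
	s := g.shardFor(from)
	g.shards[s][from] = append(g.shards[s][from], to)
}

// neighbors simulates one request to whichever shard owns the node.
func (g *shardedGraph) neighbors(node string) []string {
	g.lookups++
	return g.shards[g.shardFor(node)][node]
}

// traverse follows outgoing edges for the given number of hops and
// returns the nodes reached on the final hop.
func (g *shardedGraph) traverse(start string, depth int) []string {
	frontier := []string{start}
	for i := 0; i < depth; i++ {
		var next []string
		for _, n := range frontier {
			next = append(next, g.neighbors(n)...)
		}
		frontier = next
	}
	return frontier
}

// buildTree creates a small two-level binary tree rooted at "root".
func buildTree(shards int) *shardedGraph {
	g := newShardedGraph(shards)
	g.addEdge("root", "a")
	g.addEdge("root", "b")
	g.addEdge("a", "a1")
	g.addEdge("a", "a2")
	g.addEdge("b", "b1")
	g.addEdge("b", "b2")
	return g
}

func main() {
	for depth := 1; depth <= 2; depth++ {
		g := buildTree(3)
		reached := g.traverse("root", depth)
		fmt.Printf("depth %d: reached %v using %d shard requests\n", depth, reached, g.lookups)
	}
}
```

In this model every extra hop costs one request per node on the current frontier, so the request count grows with the fan-out at each level. A real backend could batch or cache those requests, which is presumably part of what "sharding well" would involve.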

I can also help get these interfaces working, though I have never tried Couchbase before.
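For anyone wondering what "implementing two interfaces" could look like in practice, here is a minimal sketch in Go. The IRC log does not name the interfaces, so `Store` and `Iterator` below are hypothetical, heavily simplified stand-ins and not Cayley's actual API. Likewise, `memStore` keeps everything in a map; a Couchbase-backed version would replace those map operations with calls to the database.

```go
package main

import "fmt"

// Quad is a simplified subject-predicate-object statement (hypothetical type).
type Quad struct {
	Subject, Predicate, Object string
}

// Iterator walks over a set of quads.
// Hypothetical interface, not Cayley's real one.
type Iterator interface {
	Next() bool   // advances to the next quad; false when exhausted
	Result() Quad // returns the current quad; valid only after Next returned true
}

// Store is what a storage backend would implement in this sketch.
// Hypothetical interface, not Cayley's real one.
type Store interface {
	AddQuad(q Quad) error
	QuadsFrom(subject string) Iterator
}

// memStore keeps quads in a map keyed by subject. A real backend would
// send reads and writes to the database here instead.
type memStore struct {
	bySubject map[string][]Quad
}

func newMemStore() *memStore {
	return &memStore{bySubject: make(map[string][]Quad)}
}

func (s *memStore) AddQuad(q Quad) error {
	s.bySubject[q.Subject] = append(s.bySubject[q.Subject], q)
	return nil
}

func (s *memStore) QuadsFrom(subject string) Iterator {
	return &sliceIterator{quads: s.bySubject[subject], pos: -1}
}

// sliceIterator iterates over an in-memory slice of quads.
type sliceIterator struct {
	quads []Quad
	pos   int
}

func (it *sliceIterator) Next() bool {
	if it.pos+1 >= len(it.quads) {
		return false
	}
	it.pos++
	return true
}

func (it *sliceIterator) Result() Quad { return it.quads[it.pos] }

// collectObjects drains an iterator and returns the object of each quad.
func collectObjects(it Iterator) []string {
	var out []string
	for it.Next() {
		out = append(out, it.Result().Object)
	}
	return out
}

func main() {
	var store Store = newMemStore()
	quads := []Quad{
		{"alice", "follows", "bob"},
		{"alice", "follows", "carol"},
		{"bob", "follows", "carol"},
	}
	for _, q := range quads {
		if err := store.AddQuad(q); err != nil {
			fmt.Println("add failed:", err)
			return
		}
	}
	fmt.Println(collectObjects(store.QuadsFrom("alice")))
}
```

The point of the split is that query code only ever sees `Store` and `Iterator`, so a slow but correct first implementation can be swapped for a faster one later without touching callers, which matches the "slow at first, improve incrementally" advice in the log.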

Hi Paul, Dennwc, I am really sorry, I only just got a chance to look at your messages; I thought I had gotten no response. Please let me know if we can organize something, maybe a research team to work out what it will take. I will be happy to put some code in :wink: … Again, sorry for the delay in answering.



See also Running Cayley in the browser, since a PouchDB driver for the browser would also provide a CouchDB driver for the server.