Cayley Query coverage (support Gremlin Table?)


#1

Hi there,

Sorry for posting so many questions lately. I have used SPARQL for a while and am trying to use Cayley as our main triple store for performance reasons.

Can it query triples together instead of just vertices?

For example, there is a graph like this:
S1 P1 O1
S1 P2 O2.

I would like to get the pairs of subjects and objects that are connected by a certain predicate, here P1. So if I ask “What are the pairs of S and O connected by P1?”, it should return S1 and O1 together. (I know I can combine multiple queries: i) get all subjects related to P1, and ii) find the objects related to each subject via P1. But I wonder whether Cayley can optimize this by itself.)
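
To make the desired result concrete, here is a minimal sketch in plain JavaScript. It does not use Cayley's API at all; the `triples` array and the `pairsByPredicate` helper are made up for illustration and only show the output I am after.

```javascript
// Plain JavaScript, NOT Cayley's API (hypothetical helper for illustration).
// Each triple is a [subject, predicate, object] array.
const triples = [
  ["S1", "P1", "O1"],
  ["S1", "P2", "O2"],
];

// Return every {subject, object} pair connected by the given predicate,
// scanning the triples once instead of issuing one query per subject.
function pairsByPredicate(triples, predicate) {
  return triples
    .filter(([, p]) => p === predicate)
    .map(([s, , o]) => ({ subject: s, object: o }));
}

const pairs = pairsByPredicate(triples, "P1");
console.log(pairs); // a single pair: subject S1 with object O1
```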

I looked at the Gremlin specification, and this can be expressed with “table” in Gremlin, which does not seem to be implemented in Cayley.

Is there another workaround for this?

Additionally, I wonder if there is any plan for Cayley to include SPARQL.

Thanks!!


#2

Hi @jbkoh,

Questions are great, keep going :smiley:

Sure, internally Cayley gets all “directions” of each triple automatically (at least their IDs), and this is exposed via the Go API, but we haven’t exposed it via Gremlin for some reason. Right now it’s covered by #494. A bit later we will abstract it even further, so users will have all current Gremlin functionality available for both triples and their associated vertices.

Regarding SPARQL, yeah, you already asked in the query languages tour, but I was waiting for @barakmich to comment on this in more detail.
As far as I remember, we decided that SPARQL support would be great, but we don’t have time to work on it right now. That said, we’ll gladly accept contributions.


#3

@dennwc
Thanks for the answer! I don’t plan to adopt Go in my project for now, so I will keep an eye on the full Gremlin implementation.

I have another question for query syntax:
In this example graph:
S1 P1 O1
S2 P1 O1
S3 P2 S1
S4 P2 S1
S5 P2 S2
S6 P2 S2

I want to find X satisfying “X P2 Y” and “Y P1 O1”. Basically, everything related to O1 via P2 and then P1. I think I have to use Map, but it does not work as I expected.
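
For reference, the result I expect on this graph can be computed with a small plain JavaScript sketch. This is not Cayley's API; the `inbound` helper is a hypothetical stand-in for following a predicate backwards, the way I expect In() to behave.

```javascript
// Plain JavaScript, NOT Cayley's API (hypothetical helper for illustration).
const triples = [
  ["S1", "P1", "O1"],
  ["S2", "P1", "O1"],
  ["S3", "P2", "S1"],
  ["S4", "P2", "S1"],
  ["S5", "P2", "S2"],
  ["S6", "P2", "S2"],
];

// Follow `predicate` edges backwards: from a set of objects to their subjects.
function inbound(triples, nodes, predicate) {
  const targets = new Set(nodes);
  const found = new Set();
  for (const [s, p, o] of triples) {
    if (p === predicate && targets.has(o)) found.add(s);
  }
  return [...found];
}

const ys = inbound(triples, ["O1"], "P1"); // every Y with "Y P1 O1"
const xs = inbound(triples, ys, "P2");     // every X with "X P2 Y"
console.log(xs);
```

On the example graph, `ys` holds S1 and S2, and `xs` holds S3 through S6.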

I tried two approaches, but neither worked:
1.
var x1 = g.V(O1).In(P1)
var x = x1.In(P2)
x.All()
returns -> {"error": "TypeError: Cannot access member 'All' of null"}

2.
var x1 = g.V(O1).In(P1)
var x = x1.ForEach(function(d) { d.In(P2)})
x.All()
returns -> {"error": "TypeError: Cannot access member 'All' of null"}

Can I get some insight on how to handle these?

Thanks!


#4

The first query should work, actually. Does g.V(O1).All() work at all? Maybe something is wrong with the node names?


#5

Actually I forgot to include a step for my actual question.

I have a list of entities (a found path in Gremlin)

var List1 = g.V().Out(P1).Is(O1)
var List2 = g.V().Out(P1).Is(O2)

Then I want to find the subset of List1 that is connected to any of O2 by P2. What I did is:
List1.Out(P2).Is(List2).All()

It returns:
{"error": "TypeError: Cannot access member 'All' of null"}

What is the correct query for this situation?

Many thanks!


#6

The query is correct. That’s why I asked whether the separate parts of the query work as expected. Does List1 or List2 contain the expected list of nodes?


#7

Some sort of TODO list showing which Gremlin queries are supported right now, which are not, and how difficult the missing ones would be to implement would be helpful, both to get an idea of where Cayley stands as of now and to give newcomers hints on where to start hacking.


#8

We actually made a mistake by naming it Gremlin, because it’s only Gremlin-inspired. We will rename it to Gizmo soon to avoid any further confusion.


#10

Thanks for the responses. I think there must be something wrong in my query or my description of the query because I confirmed that the triples exist in the graph with other queries.

My previous question remains the same:
/////////////////
var List1 = g.V().Out(P1).Is(O1)
var List2 = g.V().Out(P1).Is(O2)

Then I want to find the subset of List1 that is connected to any of O2 by P2. What I did is:
List1.Out(P2).Is(List2).All()
////////////////

Here, I hand-picked one node from List1 and did the following:
S1’ is a handpicked node from List1
S2’ is a node from List2 corresponding to S1’, which means
S1’ P2 S2’ exists.
and I ran:
g.V(S1’).Out(P2).Is(S2’),
which returns S2’ as expected. (The same result was returned without Is(S2’).)

However, if I run the same query with lists as described above, it returns null.

Is a list not supposed to be used this way?

Many thanks for your continued responses.


#11

I think I know what is wrong. Try this:
List1.Out(P2).And(List2).All()
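
As far as I understand it, Is() filters by node names, while And() takes another path and keeps only the nodes found on both paths, which is an intersection. A rough sketch of that difference in plain JavaScript follows; it does not use Cayley's API, and the data and the `out` and `intersect` helpers are hypothetical.

```javascript
// Plain JavaScript, NOT Cayley's API (hypothetical data and helpers).
const triples = [
  ["A1", "P2", "B1"],
  ["A2", "P2", "B2"],
  ["A3", "P2", "C1"],
];
const list1 = ["A1", "A2", "A3"]; // stands in for the nodes on List1
const list2 = ["B1", "B2"];       // stands in for the nodes on List2

// Follow `predicate` edges forward from a set of nodes.
function out(triples, nodes, predicate) {
  const sources = new Set(nodes);
  const found = new Set();
  for (const [s, p, o] of triples) {
    if (p === predicate && sources.has(s)) found.add(o);
  }
  return [...found];
}

// Keep only the nodes present in both lists (set intersection).
function intersect(a, b) {
  const inB = new Set(b);
  return a.filter((n) => inB.has(n));
}

const reached = out(triples, list1, "P2"); // B1, B2, C1
const result = intersect(reached, list2);  // only B1 and B2 remain
console.log(result);
```

Here C1 is dropped because it is reachable from List1 via P2 but is not on List2.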


#12

Thank you so much. This is exactly what I was looking for. I can play with Cayley more now. Thanks. :slight_smile: