7/7/14 :: Question about iterator


#1

https://groups.google.com/forum/#!topic/cayley-users/p8YQ2pu3g-Y


Bsr S

7/7/14

Hello,
I am trying to use cayley as a library to call the Gremlin API natively. I had bit of a hard time understanding the iterators used for each command. Can anybody to briefly explain the architecture around iterator. Specifically, what are sub iterators, when they are invoked?

Thanks.
bsr


barakmich

7/7/14

Other recipients: bsr...@gmail.com

  • show quoted text -
    So a graph query is roughly represented as a tree of iterators – things
    that implement graph.Iterator. An iterator is (loosely) a stand-in for a
    set of things that match a particular portion of the graph.

A query language will, at the appropriate time (and this can be multiple
times, in the case of Gremlin) build an iterator tree from the basic
iterators (and a pointer to the graph.TripleStore, another interface)
and then execute it – namely, pulling matching paths out from the set.

For speed, these trees can be optimized; if the TripleStore has a
special index for a particular set, that iterator can be replaced with a
simpler iterator. Same with intersections with “everything” and so on.

Evaluation is merely calling Next() repeatedly on the iterator at the
top of the tree. Subiterators, then, are the branches and leaves of the
tree.

For some examples of building iterator trees, see
https://github.com/google/cayley/blob/master/graph/memstore/triplestore_test.go
.

Also, I’m looking forward to what Dan proposes for a Go library
interface. :slight_smile: Once that starts to settle, it may make sense to talk
about using that same interface as a wire format (ie, embeddable query
languages in other languages!) but one step at a time.

–Barak


Bsr S

7/7/14

Other recipients: bsr...@gmail.com

  • show quoted text -
    thank you so much for the response. I will go though the test code. If you have a moment can you please explain for an in operation why it build a predicateNodeIterator, link to, and, a hasa iterator. Can’t it just look up in one of the index from TripleDirectionIndex .

      lto := iterator.NewLinksTo(ts, base, in)
      and := iterator.NewAnd()
      and.AddSubIterator(iterator.NewLinksTo(ts, predicateNodeIterator, graph.Predicate))
      and.AddSubIterator(lto)
      return iterator.NewHasA(ts, and, out)
    

say for a query like
g.V(“B”).In(“follows”).All() , can’t it be just filtering the predicate on the TripleDirectionIndex.subject for node B.

I am trying to follow how this simple query is built based on iterators.

thanks again for the work and your time.
bsr.


barakmich

7/7/14

Other recipients: bsr...@gmail.com

  • show quoted text -
    HasA and LinksTo get you between links and nodes. So the tree for your
    example looks something like:

HasA (subject) – gets the things in the subject field for:

  • And – the intersection of:
    ** LinksTo (predicate) links that have the predicate of…:
    *** Fixed iterator containing “follows” – … just the node “follows”.
    ** LinksTo (object) links that have the object field of:
    *** Fixed iterator containing “B” – … just the node “B”

Where each extra * is a level of subiterators – HasA is your top iterator.

These can be optimized against the backend – by calling Optimize(). For
instance those LinksTo iterators can often be replaced by a better index
on the backend. There’s even a simple step for many backends to make the
And optimize away too (which is a simple extension I haven’t done yet,
and will increase speed).

But, the way I describe it here – that’s what I call the “abstract
iterator tree” – in that it doesn’t contain any iterators specific to
the backend (yet).

Hope that helps,

–Barak


Bsr S

7/7/14

Other recipients: bsr...@gmail.com

thanks again :slight_smile: