Implement GraphQL


#1

Please see it:


A quick look shows that it relies on prior knowledge of the schema. To implement this we need, at a minimum, to support sameAs so that any RDF dictionary can be used.

Btw, #353 may be handy for querying the current schema, but that can also be done by hand.


Perhaps dataloader is the new point of integration.


Actually we can make it work by using the parser from graphql-go directly.


From: https://github.com/cayleygraph/cayley/issues/370


Testing Typed Quad values and GraphQL
#2

#3

#4

I’ve implemented initial support for GraphQL recently. For example, we can now query movie data with:

{
  actor(name: "Keanu Reeves") {
    id: _id_
    name
    starring: __rev___film__performance__actor {
      character: __film__performance__character
      film: __rev___film__film__starring {
        name
      }
    }
  }
}

It should really be in this form:

{
  actor(name: "Keanu Reeves") {
    id: _id_
    name
    starring: /rev/film/performance/actor {
      character: /film/performance/character
      film: /rev/film/film/starring {
        name
      }
    }
  }
}

Or with dots, like in dgraph:

{
  actor(name: "Keanu Reeves") {
    id: _id_
    name
    starring: rev.film.performance.actor {
      character: film.performance.character
      film: rev.film.film.starring {
        name
      }
    }
  }
}

The names in the first example are ugly because GraphQL restricts names to letters, digits, and underscores, so I have to replace every / character with __. Later on I will fork the GraphQL parser to allow the extra characters we need.
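The substitution described above can be sketched in a few lines. This is only an illustration, not the code used in Cayley; the function names escape_predicate and unescape_predicate are made up for the example, and the sketch assumes the predicate itself contains no double underscore (otherwise the round trip is ambiguous).

```python
def escape_predicate(predicate: str) -> str:
    """Replace every '/' with '__' so the name fits GraphQL's name grammar."""
    return predicate.replace("/", "__")


def unescape_predicate(name: str) -> str:
    """Undo the escaping. Assumption: the original predicate had no '__' in it."""
    return name.replace("__", "/")


if __name__ == "__main__":
    escaped = escape_predicate("/film/performance/character")
    print(escaped)
    print(unescape_predicate(escaped))
```

The ambiguity around predicates that already contain underscores is one more reason a parser that accepts / directly is the nicer option.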

The result for the query above looks like this:

{
 "data": {
  "actor": {
   "id": "/en/keanu_reeves",
   "name": "Keanu Reeves",
   "starring": [
    {
     "character": "Jonathan Harker",
     "film": {
      "name": "Bram Stoker's Dracula"
     }
    },
    {
     "film": {
      "name": "Bill \u0026 Ted's Bogus Journey"
     }
    },
    {
     "film": {
      "name": "The Gift"
     }
    },
    {
     "film": {
      "name": "The Day the Earth Stood Still"
     }
    },
    {
     "film": {
      "name": "The Night Watchman"
     }
    },
    {
     "film": {
      "name": "My Own Private Idaho"
     }
    },
    {
     "character": "Ted \"Theodore\" Logan",
     "film": {
      "name": "Bill \u0026 Ted's Excellent Adventure"
     }
    }
   ]
  }
 }
}

It means that we have experimental GraphQL support.

For now only a subset of features is implemented, but we already have a distinctive one: every link can be queried in reverse (notice the rev prefix on a few properties).

I will push a PR with this change after types branch lands.

Any thoughts and feedback are welcome.


#5

Pushed a Docker image. You may try it with this command:

docker run --rm -p 64210:64210 dennwc/cayley:graphql

The UI will be served at http://localhost:64210.


#6

Sweet – first party query language party time!


#7

That’s incredible @dennwc! Thanks for destroying my Sunday (:


#8

Progress

OK, now it looks usable to me.

The same query in a new form:

{
  actor(name: "Keanu Reeves") {
    id: _id
    name
    starring: rev./film/performance/actor {
      character: _/film/performance/character
      film: rev./film/film/starring {
        name
      }
    }
  }
}

Almost no strange names now :slight_smile:

Also added pagination support (mid-query!):

{
  actor(name: "Keanu Reeves") {
    id: _id
    starring: rev./film/performance/actor (_limit: 10, _skip: 20) {
      character: _/film/performance/character
      film: rev./film/film/starring {
        name
      }
    }
  }
}

Now, a few things about internals.

Fields

Field names are always expected to be of IRI type. For example, name and /film/performance/actor should be <name> and </film/performance/actor> in the dataset.

You can traverse a field in reverse by adding the rev. prefix to the field name. In the query above, rev./film/performance/actor is interpreted as “traverse </film/performance/actor> in reverse”.

GraphQL requires the first character of a name to be a letter or an underscore. If your predicate starts with any other character, just add _ before the predicate; it will be trimmed automatically. For example, _/film/performance/character will be interpreted as </film/performance/character>.

There are 3 special fields for now:

  • _id - current node value
  • _limit - used in filters to limit the number of results
  • _skip - used to skip a number of results
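To make the field rules above concrete, here is a minimal sketch of how a field name could be interpreted. It is not the actual implementation; parse_field and its return shape are invented for illustration, and it assumes that exactly one leading underscore is trimmed and that the rev. prefix is checked before the underscore.

```python
SPECIAL_FIELDS = {"_id", "_limit", "_skip"}
REVERSE_PREFIX = "rev."


def parse_field(name):
    """Return (kind, value, reverse) for a field name taken from a query.

    kind is "special" for _id/_limit/_skip and "predicate" otherwise.
    For predicates, value is the IRI in angle brackets.
    """
    if name in SPECIAL_FIELDS:
        return ("special", name, False)
    reverse = name.startswith(REVERSE_PREFIX)
    if reverse:
        name = name[len(REVERSE_PREFIX):]
    # Assumption: a single leading underscore is padding and gets trimmed.
    if name.startswith("_"):
        name = name[1:]
    return ("predicate", "<" + name + ">", reverse)


if __name__ == "__main__":
    for field in ("name", "_id", "rev./film/performance/actor",
                  "_/film/performance/character"):
        print(field, "->", parse_field(field))
```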

Values

String values are converted to String type, except <bob> which is interpreted as IRI <bob>, and _:bob which becomes BNode _:bob.

Values without quotes are expected to be IRIs. For example, bob is IRI <bob>, _/bob/ will be IRI </bob/> and so on. As a special case _:bob is interpreted as BNode _:bob, and not as IRI <:bob>.
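The value rules above could be sketched like this. Again, this is only an illustration under assumptions: convert_value and the (kind, text) tuples are hypothetical, the quoted flag is assumed to come from the parser, and a quoted "<>" with nothing inside is treated as a plain string.

```python
def convert_value(text, quoted):
    """Map a value from the query to a (kind, text) pair.

    kind is one of "string", "iri" or "bnode".
    """
    # _:bob is a blank node whether or not it was written in quotes.
    if text.startswith("_:"):
        return ("bnode", text)
    if quoted:
        # A quoted "<bob>" is an IRI; any other quoted value is a String.
        if len(text) > 2 and text.startswith("<") and text.endswith(">"):
            return ("iri", text)
        return ("string", text)
    # Unquoted values are IRIs; a leading underscore is padding and is trimmed.
    if text.startswith("_"):
        text = text[1:]
    return ("iri", "<" + text + ">")


if __name__ == "__main__":
    print(convert_value("Keanu Reeves", quoted=True))
    print(convert_value("<bob>", quoted=True))
    print(convert_value("_/bob/", quoted=False))
    print(convert_value("_:bob", quoted=False))
```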

Structure

The query above will be translated to the following Gremlin analog (pseudo-code):

actor = g.V().Has("<name>","Keanu Reeves").Tag("id").
    SaveOptional("<name>","name").TagValue()

actor.starring = []
g.V(actor.id).In("</film/performance/actor>").Skip(20).Limit(10).ForEach(function(v){
    starring = g.V(v.id).Save("</film/performance/character>","character").TagValue()
    starring.film = g.V(starring.id).In("</film/film/starring>").Save("<name>","name").TagValue()
    actor.starring.append(starring)
})

g.Emit({ "actor": actor })

Actually it can be even more complex, but everything is handled under the hood.

A few things to notice:

  • Top-level names in a query carry no meaning. In the example query, actor only sets the key name in the result map.
  • All properties are saved as optional.
  • We have no schema, so it is not possible to detect whether a given property should be an array. Because of this, the current implementation avoids arrays whenever it can: if no nodes are returned the property is null, if one node is returned the property is a single value, and if several are returned the property is an array.
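The last point is easy to show as code. This is a sketch of the rule only, not the real implementation, and collapse is a made-up name:

```python
def collapse(values):
    """Shape a list of results the way the schema-less rule describes.

    No results  -> None (null in JSON)
    One result  -> the value itself
    Several     -> a list
    """
    if not values:
        return None
    if len(values) == 1:
        return values[0]
    return list(values)


if __name__ == "__main__":
    print(collapse([]))
    print(collapse(["The Gift"]))
    print(collapse(["The Gift", "My Own Private Idaho"]))
```

One consequence worth keeping in mind on the client side: the same property can come back as a single value for one node and as an array for another, so consumers have to handle both shapes.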

PS:

PR updated, Docker image updated.

@oren: Maybe you want one more Sunday to be destroyed? :grin:

@barakmich: What do you think? Can it be merged? Or should I implement a few more features before that?