Finding the shortest path with Gizmo


#1

Given i have vertices with an unweighted distance, what would be the way to find the shortest distance between two nodes?
Those are the Nquads used

<Azha> <is_connected_to> <Kaus Media> .
<Kaus Media> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Kaus Borealis> .
<Kaus Borealis> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Kaus Australis> .
<Kaus Australis> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Fastolf> .
<Fastolf> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Krah'Hor> .
<Krah'Hor> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Vindemiatrix> .
<Vindemiatrix> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Jishui> .
<Jishui> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Todem> .
<Todem> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Helvetios> .
<Helvetios> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Antonehk> .
<Antonehk> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Khitomer> .
<Khitomer> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Zeta Polis> .
<Zeta Polis> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Wasat> .
<Wasat> <is_connected_to> <Azha> .
<Azha> <is_connected_to> <Francihk> .
<Francihk> <is_connected_to> <Azha> .
<Fastolf> <is_connected_to> <Tejat> .
<Tejat> <is_connected_to> <Fastolf> .

Say I want to find the path between the vertices <Vindemiatrix> and <Tejat>. It is relatively easy to find that last hop from the start to the destination:

var m = g.Morphism().Out("<is_connected_to>")
g.V("<Vindemiatrix>").Tag("start").FollowRecursive(m).Has("<is_connected_to>","<Tejat>").All()

However, I banged my head against the wall until now on how I could get the complete path (Vindemiatrix=>Azha=>Fastolf=>Tejat).

I simply have no idea on how to do that in Gremlin. Any help would be highly appreciated, even more so with explanation so that I can wrap my head around this.

Kind regards,

SpicyClam


#2

I think you may need to use .ForEach instead, so you can record the path as it’s being traversed.


#3

I am very sorry, but I do not understand. I am pretty new to this. If I iterate over the vertices that have the predicate <is_connected_to>, how shall I decide wether this vertex is actually closer than another one or even has a connection to my target?


#4

Yeah, sorry, I should have been more specific.

Currently, you run the whole FollowRecursive query that has no way of returning the path.

Instead, you can do something like this:

function recurse(id) {
  var out = g.V(id).Has("<is_connected_to>","<Tejat>").ToArray()
  if (len(out) == 1) {
    // end node, return the path
    return [id, "<Tejat>"]
  }
  var shortest = null;
  g.V(id).Out("<is_connected_to>").ForEach(function(id2){
     var sub = recurse(id2)
     // compare sub to shortest, overwrite, if necessary
     if (sub && (!shortest || shortest.length > sub.length) shortest = sub;
  })
  if (!shortest) return null;
  return [id].push(...shortest);
}
var shortest = recurse("<Vindemiatrix>", "<Tejat>")
g.Emit(shortest)

This emulates the way how FollofRecursive works, but in JS. It picks every node and tries to traverse one step further. Of course, it’s the most naive algorithm, but it may do for the start :slight_smile:

And I haven’t tested it, so it may have some typos :sweat_smile:


#5

@dennwc Thank you so much! So, to get it right, if I wanted to do this in Go, i would have multiple queries to the engine, either way, right? No explicit or implicit built in “here is your shortest path”?


#6

Correct, we don’t have a native short path implementation (yet). So if you will write one in Go, feel free to contribute it :slight_smile:

Having said that, the same query in Go can be much more efficient, because it’s possible to lookup/iterate faster, directly using indexes.


#7

@dennwc Without promising, I guess I will do that. My use case seems to be generic enough to make it feasible to generalize. After a bit of research, breadth-first and depth-first search should be easy enough to implement, too, as a first try. As far as I can tell it is the same basic algorithm, only differing in that one uses a stack where the other uses a queue. Let me dig into it a bit.