Strong connection, weak connection with Gizmo


#1

Hi All!

I would like to query my graph and identify strong/weak connections.
A strong connection between two users exists if user A follows user B and B follows A:

A --> B
B --> A
B --> C

A weak connection exists if user A follows user B but B does not follow A.

A --> B
B --> C
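
To make the two definitions concrete, here is a minimal sketch in plain JavaScript, outside Cayley. The edge list and the names follows, edgeIndex and classify are made up for illustration, and the ids are assumed not to contain the separator "->".

```javascript
// Hypothetical edge list: each pair [source, target] means "source follows target".
var follows = [["A", "B"], ["B", "A"], ["B", "C"]];

// Index every edge by a "source->target" key so the reverse edge can be looked up directly.
var edgeIndex = {};
follows.forEach(function (edge) {
  edgeIndex[edge[0] + "->" + edge[1]] = true;
});

// An edge is strong if the reverse edge also exists, otherwise it is weak.
function classify(source, target) {
  return edgeIndex[target + "->" + source] ? "strong" : "weak";
}

var classified = follows.map(function (edge) {
  return { source: edge[0], target: edge[1], kind: classify(edge[0], edge[1]) };
});
```

Here A and B follow each other, so both of their edges count as strong, while B --> C has no reverse edge and counts as weak.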

To identify strong connections I wrote the following query:

var strongTies = [];

function fStrong(id) {
  // Users that id follows and that also follow id back.
  var strong = g.V(id).out("<follows>").has("<follows>", id).toArray();
  strong.forEach(function (other) {
    // Skip the pair if it was already stored from the other side.
    var found = strongTies.find(function (element) {
      return element.source == other && element.target == id;
    });
    if (!found) {
      strongTies.push({"source": id, "target": other});
    }
  });
}

g.V().has("<isa>","<User>").ForEach(function(user) {
	fStrong(user.id)
})

g.emit(strongTies.length)

Remarks:

  • I used toArray() instead of count() because count() returns null if there is no connection, and instead of all() because all() writes to the output.
  • If there is a mutual connection, the two user ids are saved in the strongTies array.
  • I added a filter so that a mutual pair is not stored twice (once as A --> B and once as B --> A). It uses the JavaScript find function, which needs a polyfill placed before the Gizmo query: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/find#Polyfill
  • With the filter, the query takes about three times as long (16.5 s instead of 5.2 s on my small dataset).
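
If the find-based filter is what slows the query down, one possible alternative is to remember seen pairs in a plain object, so each check is a single property lookup instead of a scan over strongTies. This is only a sketch: pairKey, seenPairs and addStrongTie are hypothetical names, the ids are assumed to be strings that do not contain "|", and I have not measured whether it recovers the lost time.

```javascript
// Hypothetical helper: build an order-independent key for a pair of node ids,
// so that (A, B) and (B, A) map to the same key.
function pairKey(a, b) {
  return a < b ? a + "|" + b : b + "|" + a;
}

var seenPairs = {};   // keys of the pairs recorded so far
var strongTies = [];

// Record a mutual tie once, no matter from which side it is discovered.
// Returns true if the tie was added, false if it was already known.
function addStrongTie(source, target) {
  var key = pairKey(source, target);
  if (seenPairs[key]) {
    return false;
  }
  seenPairs[key] = true;
  strongTies.push({ source: source, target: target });
  return true;
}
```

Inside fStrong, the find/push part could then be replaced by a call such as addStrongTie(id, other). The code is plain ES5, so it should not need the polyfill.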

To implement the weak query, I thought of using something like hasNot/exists, e.g.

function fWeak(id) {
  weak = g.V(id).out("<follows>").hasNot("<follows>", id).toArray();
...

But Gizmo has no hasNot/exists function.

Therefore I implemented the weak connection query this way:

var weakTies = [];

function fWeak(id) {
  g.V(id).out("<follows>").ForEach(function(followee) {
    // Empty if the followed user does not follow id back.
    var followsBack = g.V(followee.id).has("<follows>", id).toArray();
    if (followsBack.length == 0) {
      weakTies.push({"source": id, "target": followee.id});
    }
  });
}

g.V().has("<isa>","<User>").ForEach(function(user) {
	fWeak(user.id)
})

g.emit(weakTies.length)

Finally I get some results.

My questions are:

  • Is there a simpler approach?
  • To filter out duplicates, can I import a JavaScript library (e.g. lodash) to help me with that?
  • Did I make any mistakes?

Thank you,
websta


#2

Hey websta,
Thank you for taking the time to try Cayley and for posting your question in such detail.
I believe the first thing you should do is separate the extraction logic from the transformation logic.
Here is how I would model it:
query.js

// This is sent to Cayley so it must be ES5 and have no dependencies
var users = g.V().has("<isa>","<User>");

users
.tag("user")
.save("<follows>", "follows")
.out("<follows>")
.save("<follows>", "followsFollows")
.getLimit(-1);

transform.js

// This is operating on the Cayley response so it can use ES6+ and external dependencies
var weakTies = [];
var strongTies = [];
response.result.forEach(function (result) {
  var tie = { source: result.user, target: result.follows };
  if (result.user === result.followsFollows) {
    strongTies.push(tie);
  } else {
    weakTies.push(tie);
  }
});
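
One thing to watch for: if a followed user follows several people, the response will probably contain several rows for the same (user, follows) pair, and only one of them can match the user. A possible variant of transform.js groups the rows per pair first and classifies each pair once. This is a sketch under assumptions: the sample response below is made up, the field names are taken from the query above, and ids are assumed not to contain "|". The actual response shape should be checked before relying on it.

```javascript
// Hypothetical response in the shape assumed by transform.js:
// one row per (user, follows, followsFollows) combination.
var response = {
  result: [
    { user: "<A>", follows: "<B>", followsFollows: "<A>" },
    { user: "<A>", follows: "<B>", followsFollows: "<C>" },
    { user: "<B>", follows: "<A>", followsFollows: "<B>" },
    { user: "<B>", follows: "<C>", followsFollows: "<D>" }
  ]
};

// Group rows by (user, follows) and remember whether any row shows a follow-back.
var mutualByPair = {};
response.result.forEach(function (row) {
  var key = row.user + "|" + row.follows;
  var entry = mutualByPair[key] ||
    (mutualByPair[key] = { source: row.user, target: row.follows, mutual: false });
  if (row.followsFollows === row.user) {
    entry.mutual = true;
  }
});

// Classify each pair exactly once.
var strongTies = [];
var weakTies = [];
Object.keys(mutualByPair).forEach(function (key) {
  var entry = mutualByPair[key];
  var tie = { source: entry.source, target: entry.target };
  (entry.mutual ? strongTies : weakTies).push(tie);
});
```

As far as I understand the Gizmo docs, save() also drops nodes that lack the predicate, so a followed user who follows nobody would produce no row at all. saveOpt() may be the better fit for the second save if those ties should show up as weak.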

#3

Hello Iddan!

Thank you for your reply!

I am currently using the Web UI of Cayley. How do I split the query there to use ES5/ES6?
Or are you describing the process of sending the query via HTTP and transforming the response in a JavaScript client?

Best regards,
websta


#4

The first part is meant to be sent to the database as a query. The second part processes the returned data and can be written in any programming language you prefer.