Remove duplicate quads in the Cayley graph


#1

Cayley now ignores duplicated quads by default. My question is: how do I remove duplicate quads?


#2

"Ignores" in this case means that it will not insert them twice and will not complain about them.


#3

Thanks for the quick reply. However, in my experience, if I insert a quad that already exists in the graph with the Gizmo API, there will be two copies of that quad in the graph. Another case: if I load the same .nq file twice, every quad in the file ends up duplicated in the graph. How can I avoid the duplication?


#4

It sounds more like a bug introduced recently. What backend are you using?

And how do you add quads via the Gizmo API? You probably meant the HTTP API for writing quads.


#5

Yes, I should explain that more clearly.

I use the HTTP API (v1) for writing quads and Bolt as the backend.


#6

Thanks, I will check if I can reproduce it and will send a fix in a few days.


#7

Thanks again for your help.


#8

One thing I forgot to mention: I set ignore_duplicates: True and ignore_missing: True in the configuration file.


#9

Thanks for reporting - I’m able to reproduce this.

Opened an issue to track the progress: #675


#10

Recently, I have found that if a data file itself contains duplicate quads, e.g.

<alice> <follows> <bob> .
<alice> <follows> <bob> .

the latest version (0.7.2) of Cayley still retains the duplicated quads after the cayley load command. Is that the expected result?
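
Until this is settled, one possible stopgap is to drop repeated statements from the file before loading it. Below is a minimal sketch in plain Python, not something Cayley provides. The function names dedupe_nquads and dedupe_file are made up for illustration. It assumes one statement per line and treats two lines as duplicates only if they are identical after trimming leading and trailing whitespace.

```python
def dedupe_nquads(lines):
    """Yield each statement once, keeping first-seen order.

    Hypothetical helper: compares lines textually, so two statements
    that differ only in internal spacing are NOT treated as duplicates.
    """
    seen = set()
    for line in lines:
        stmt = line.strip()
        # Skip blank lines and comment lines.
        if not stmt or stmt.startswith("#"):
            continue
        if stmt in seen:
            continue
        seen.add(stmt)
        yield stmt


def dedupe_file(src_path, dst_path):
    """Write a deduplicated copy of src_path to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for stmt in dedupe_nquads(src):
            dst.write(stmt + "\n")
```

For example, dedupe_file("data.nq", "data.dedup.nq") would write a cleaned copy that can then be loaded instead of the original (both file names are placeholders). This only removes duplicates within a single file; it does not help when the same file is loaded twice.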