On 04.04.2016 14:13, Stephen Bonner wrote:
Hi Tiago,
Thank you for your detailed reply; it cleared things up a lot! I would also like to thank you for your incredible work on graph-tool. It's a great package!
Below is the code I now use to count the triangles:
    # number of triangles
    gc = global_clustering(tempG)
    d = tempG.degree_property_map("total")
    num_triangles = gc[0] * (d.a * (d.a - 1) / 2).sum() / 3
My test dataset is the CA-HepPh network from SNAP - http://snap.stanford.edu/data/ca-HepPh.html
However, I am not getting the number of triangles reported for the dataset. According to SNAP it is 3,358,499, but the graph-tool code above gives 13,434,795.
Do you have any idea what might cause the discrepancy between the ground truth and the computed result?
Many SNAP datasets contain parallel edges, but apparently many of their statistics ignore those. Graph-tool (correctly) incorporates them in the triangle counting. If you remove the parallel edges (e.g. with remove_parallel_edges()), you get the same triangle count as reported in SNAP.

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>
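
To illustrate the effect described above, here is a minimal sketch in plain Python. It is not graph-tool code, and the helper names (wedge_count, simple_triangle_count, dedupe) and the toy edge list are hypothetical. It only shows how the degree-based term sum(d * (d - 1) / 2) from the formula in the question grows when every edge is listed twice, while the triangle count of the underlying simple graph stays the same. How global_clustering itself treats parallel edges is not modeled here.

```python
from collections import Counter
from itertools import combinations


def wedge_count(edges):
    """Sum of d*(d-1)/2 over all vertices, counting every listed edge."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return sum(d * (d - 1) // 2 for d in deg.values())


def simple_triangle_count(edges):
    """Triangles after collapsing parallel edges and dropping self-loops."""
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    # Each triangle is found once per corner, so divide by 3.
    closed = sum(
        1
        for nbrs in adj.values()
        for a, b in combinations(sorted(nbrs), 2)
        if b in adj[a]
    )
    return closed // 3


def dedupe(edges):
    """Keep one copy of each undirected edge (toy stand-in for removing parallel edges)."""
    return sorted({tuple(sorted(e)) for e in edges if e[0] != e[1]})


# Toy graph: one triangle (0, 1, 2) plus a pendant vertex 3.
simple = [(0, 1), (1, 2), (0, 2), (2, 3)]
# Same graph with every edge also listed in the reverse direction,
# which yields parallel edges when loaded as an undirected graph.
doubled = simple + [(v, u) for u, v in simple]

print(wedge_count(simple))             # 5
print(wedge_count(doubled))            # 28
print(simple_triangle_count(doubled))  # 1
print(wedge_count(dedupe(doubled)))    # 5
```

In graph-tool itself, the corresponding step is the remove_parallel_edges() call mentioned in the reply, applied to the graph before computing the clustering and degrees.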