Clustering coefficient with parallel edges
Hi Tiago, I was recently comparing the behavior of the clustering coefficient in different graph libraries and I was surprised by the results given by graph-tool with parallel edges: import graph_tool as gt from graph_tool.clustering import local_clustering from graph_tool.stats import remove_parallel_edges num_nodes = 5 edge_list = [(0, 3), (3, 0), (1, 0), (0, 1), (1, 2), (2, 1), (2, 4), (4, 2), (4, 1), (1, 4), (4, 3), (3, 4)] g = gt.Graph() g.add_vertex(num_nodes) g.add_edge_list(edge_list) print(local_clustering(g, undirected=False).a) print(local_clustering(g, undirected=True).a) g.set_directed(False) remove_parallel_edges(g) print(local_clustering(g, undirected=True).a) Gives: [0. 0.33333333 1. 0. 0.33333333] [0. 0.26666667 0.66666667 0. 0.26666667] [0. 0.33333333 1. 0. 0.33333333] And I must say I was expecting the same result for all three ^^ The graph: From what I understand, then, it seems that, when using undirected=True, graph-tool considers reciprocal edges as two different ones and considers the possible triangles between parallel edges as distinct. Is this desired behavior, and if so, why? I know it stems from the mathematical formula, but I think this one was derived from simple graphs and I'm not sure I see the physical meaning of the "extension" to parallel edges, besides the fact that it would perhaps give the same result as weighting the edges by their number. If this is indeed desired, is there a quick way to do the equivalent of setting the graph to undirected and removing the parallel edges in "view-mode" (i.e. making the removal temporary, like a filter)? Thanks in advance for your help and for your work on graph-tool. Best, Tanguy
Am 14.04.20 um 09:32 schrieb Tanguy Fardet:
From what I understand, then, it seems that, when using undirected=True, graph-tool considers reciprocal edges as two different ones and considers the possible triangles between parallel edges as distinct.
Is this desired behavior, and if so, why?> I know it stems from the mathematical formula, but I think this one was derived from simple graphs and I'm not sure I see the physical meaning of the "extension" to parallel edges, besides the fact that it would perhaps give the same result as weighting the edges by their number.
Yes, it's desired behavior. Ignoring parallel edges would have complicated and slowed down the code, and in any case can be easily implemented by the user (see below), who should have something specific in mind when attempting to compute the value for multigraphs.
If this is indeed desired, is there a quick way to do the equivalent of setting the graph to undirected and removing the parallel edges in "view-mode" (i.e. making the removal temporary, like a filter)?
Yes, just use: u = GraphView(g, directed=False) GraphView(u, efilt=label_parallel_edges(u).fa == 0) Best, Tiago -- Tiago de Paula Peixoto <tiago@skewed.de>
Hi Tiago, thanks for the quick and precise answer!
Yes, it's desired behavior.
Ignoring parallel edges would have complicated and slowed down the code, and in any case can be easily implemented by the user (see below), who should have something specific in mind when attempting to compute the value for multigraphs.
OK, do you think it would be relevant to include it in the documentation? It was not very obvious to me...
If this is indeed desired, is there a quick way to do the equivalent of setting the graph to undirected and removing the parallel edges in "view-mode" (i.e. making the removal temporary, like a filter)? Yes, just use:
u = GraphView(g, directed=False) GraphView(u, efilt=label_parallel_edges(u).fa == 0)
Works like a charm, thank you! Best, Tanguy
Am 14.04.20 um 10:50 schrieb Tanguy Fardet:
Yes, it's desired behavior.
Ignoring parallel edges would have complicated and slowed down the code, and in any case can be easily implemented by the user (see below), who should have something specific in mind when attempting to compute the value for multigraphs.
OK, do you think it would be relevant to include it in the documentation? It was not very obvious to me...
But doesn't the function compute exactly what it says in the documentation? -- Tiago de Paula Peixoto <tiago@skewed.de>
But doesn't the function compute exactly what it says in the documentation?
It does, but one needs to know what the undirected=True implies for the graph, otherwise the result is surprising (or just not what one's think one is computing, if not working on a super simple test graph where it is easy to check). I think some people might end up not noticing and therefore would not compute what they think they are computing... Best, Tanguy
Am 14.04.20 um 11:26 schrieb Tanguy Fardet:
But doesn't the function compute exactly what it says in the documentation?
It does, but one needs to know what the undirected=True implies for the graph, otherwise the result is surprising (or just not what one's think one is computing, if not working on a super simple test graph where it is easy to check).
It is consistent with how directionality is implement in graph-tool, which preserves the total number of edges. Furthermore, the concept of multigraph vs simple graph is orthogonal to directed vs undirected graphs. So this is not really about the clustering coefficient computation.
I think some people might end up not noticing and therefore would not compute what they think they are computing...
I don't like second-guessing misinterpretations like this. What is important is to be consistent. The surprise came from not understanding how the undirected/directed filter works. This kind of surprise will appear in almost every function of this kind, and I don't think that putting a warning everywhere is the right approach. Besides, I think the way the undirected filter works is perfectly intuitive. What you seemed to be expecting was for a simple directed graph to become a simple undirected graph. But in this case how would property maps be handled? Suppose the incoming edge had a weight value and the outgoing edge had another, which one should be kept in the simple graph? How would a directed multigraph be handled when converted to undirected? Should it magically become a simple graph? Wouldn't that be a lot more surprising? In graph-tool all graphs are multigraphs per default, which makes decisions such as these easier. Best, Tiago -- Tiago de Paula Peixoto <tiago@skewed.de>
Hi Tiago,
It is consistent with how directionality is implement in graph-tool, which preserves the total number of edges. Furthermore, the concept of multigraph vs simple graph is orthogonal to directed vs undirected graphs.
So this is not really about the clustering coefficient computation.
I understand.
Besides, I think the way the undirected filter works is perfectly intuitive. What you seemed to be expecting was for a simple directed graph to become a simple undirected graph. But in this case how would property maps be handled? Suppose the incoming edge had a weight value and the outgoing edge had another, which one should be kept in the simple graph? How would a directed multigraph be handled when converted to undirected? Should it magically become a simple graph? Wouldn't that be a lot more surprising?
Yes, I was thinking that way because there were no attributes in the current graph, but you are right, of course. For instance igraph provides a keyword for edge coalescence to sum/take the max/other of the edge attributes; is there a similar way of doing this that is already available in graph-tool? Best, Tanguy
Am 14.04.20 um 14:34 schrieb Tanguy Fardet:
Yes, I was thinking that way because there were no attributes in the current graph, but you are right, of course. For instance igraph provides a keyword for edge coalescence to sum/take the max/other of the edge attributes; is there a similar way of doing this that is already available in graph-tool?
Yes, take a look at the condensation_graph() function. -- Tiago de Paula Peixoto <tiago@skewed.de>
Ah, I did not think I could use it that way! I'll try it out, thank you. Best, Tanguy PS: (sorry for the spam, I hit reply again) On 14/04/2020 14:36, Tiago de Paula Peixoto wrote:
Am 14.04.20 um 14:34 schrieb Tanguy Fardet:
Yes, I was thinking that way because there were no attributes in the current graph, but you are right, of course. For instance igraph provides a keyword for edge coalescence to sum/take the max/other of the edge attributes; is there a similar way of doing this that is already available in graph-tool? Yes, take a look at the condensation_graph() function.
Hello Tiago and Tanguy, I'm following your discussion with great interest as I have a similar task. I want to convert a multigraph into a simple graph. This works very nicely with GraphView(u, efilt=label_parallel_edges(u).fa == 0) However, removing the parallel edges this way means that the edge properties of the parallel edges are lost. I have an edge property called "eqntag" which is a string and I want to concatenate all eqntags from the parallel edges into the simple graph. I think this is what Tanguy meant by "edge coalescence". I did not understand how to use condensation_graph() for this purpose, so I've written my own function: def remove_parallel_edges_merge_eqntags(g): # create simple graph g2: g2 = gt.GraphView(g, efilt=gt.label_parallel_edges(g).fa == 0) # loop over the remaining edges in the simple graph: for e in g2.edges(): # create a list with all edges from g that connect the same # vertices as the current edge in g2: edgelist = g.edge(e.source(), e.target(), all_edges=True) # replace eqntag by semicolon-separated merged eqntag: g2.ep.eqntag[e] = ';'.join([g.ep.eqntag[edge] for edge in edgelist]) return g2 Best regards Rolf -- Sent from: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
participants (3)
-
Rolf Sander -
Tanguy Fardet -
Tiago de Paula Peixoto