Hi all, I continue with the probably silly questions.. I have a graph that is relatively large (1.6 million vertices, 2.4 million arcs). I extract the largest strong component, but it is still quite large (~400,000 vertices) and while the "analysis" algos work smoothly, when it comes to representing it, it's very sparsely connected, so essentially a huge cloud of points. So I am trying to extract the second largest strong component, which is way smaller. I do the following: comp, hist = label_components(g) idx = argsort(hist) idx_second_largest = idx[-2] idx_vertices_second_largest = find(comp.a == idx_second_largest) u = GraphView(g, vfilt=idx_vertices_second_largest) To make things easier, I'll paraphrase the example from the doc.
g2=random_graph(100, lambda: (1,1)) comp,hist=label_components(g2) print hist [11 28 41 17 3]
So my second largest strong component should be the one with elements labelled 1.
print comp.a [0 1 1 2 3 2 2 0 0 2 1 3 3 0 3 1 4 0 2 2 1 3 2 1 2 2 2 2 2 1 2 1 3 1 2 2 1 2 3 2 2 0 2 2 4 2 3 2 2 3 1 1 2 1 1 2 1 1 2 4 2 1 0 1 2 1 1 3 2 2 2 1 3 3 1 2 1 1 0 0 2 2 2 3 3 3 2 2 2 2 1 2 3 0 1 1 2 1 0 3]
Using the code above gives, finally
print idx_vertices_second_largest [ 1 2 10 15 20 23 29 31 33 36 50 51 53 54 56 57 61 63 65 66 71 74 76 77 90 94 95 97]
But when I use this as u = GraphView(g2, vfilt=idx_vertices_second_largest), I get an error that ends with File "/usr/local/lib/python2.7/dist-packages/graph_tool/__init__.py", line 527, in __get_set_f_array a[:] = v ValueError: operands could not be broadcast together with shapes (100) (4) So as far as picking up the indices, I seem to be right, but after that, problem.. Where am I being silly? (Coming from classic C, my first reaction is "what should I typecast to?".) Thanks! -- J.
On 06/07/2012 04:26 PM, Julien Arino wrote:
So I am trying to extract the second largest strong component, which is way smaller. I do the following:
comp, hist = label_components(g) idx = argsort(hist) idx_second_largest = idx[-2] idx_vertices_second_largest = find(comp.a == idx_second_largest) u = GraphView(g, vfilt=idx_vertices_second_largest)
To make things easier, I'll paraphrase the example from the doc.
g2=random_graph(100, lambda: (1,1)) comp,hist=label_components(g2) print hist [11 28 41 17 3]
So my second largest strong component should be the one with elements labelled 1.
print comp.a [0 1 1 2 3 2 2 0 0 2 1 3 3 0 3 1 4 0 2 2 1 3 2 1 2 2 2 2 2 1 2 1 3 1 2 2 1 2 3 2 2 0 2 2 4 2 3 2 2 3 1 1 2 1 1 2 1 1 2 4 2 1 0 1 2 1 1 3 2 2 2 1 3 3 1 2 1 1 0 0 2 2 2 3 3 3 2 2 2 2 1 2 3 0 1 1 2 1 0 3]
Using the code above gives, finally
print idx_vertices_second_largest [ 1 2 10 15 20 23 29 31 33 36 50 51 53 54 56 57 61 63 65 66 71 74 76 77 90 <tel:71%2074%2076%2077%2090> 94 95 97]
But when I use this as u = GraphView(g2, vfilt=idx_vertices_second_largest), I get an error that ends with
File "/usr/local/lib/python2.7/dist-packages/graph_tool/__init__.py", line 527, in __get_set_f_array a[:] = v ValueError: operands could not be broadcast together with shapes (100) (4)
So as far as picking up the indices, I seem to be right, but after that, problem.. Where am I being silly? (Coming from classic C, my first reaction is "what should I typecast to?".)
You're not passing the correct information as the vertex filter. The vfilt parameter from GraphView expects either a function or a vertex property map or an array. The vertex property map must be of type "boolean", where the vertices with value 'True' are kept in the graph. Similarly, if you pass an array, it should also be boolean (or integer) valued, and it should have the same size as the number of vertices. You cannot pass a list of vertices you want to be kept... However, if you pass a function, it will be called on each vertex, and the result determines if the vertex is filtered. Thus, you have essentially two options: u = GraphView(g2, vfilt=lambda v: v in idx_vertices_second_largest) or comp, hist = label_components(g) idx = argsort(hist) idx = idx[-2] filter = comp.a == idx u = GraphView(g2, vfilt=filter) The second option should be the fastest. Cheers, Tiago -- Tiago de Paula Peixoto <tiago@skewed.de>
participants (2)
-
Julien Arino -
Tiago de Paula Peixoto