Hi Tiago,
I am trying to calculate the shortest distances of a graph after applying a
filter. My code looks like this:
g = gt.load_graph("myGraph.xml", format="xml")
# for later use
distances = gt.shortest_distance(g)
# extract the components of the graph
comp, hist = gt.label_components(g)
# this splits the graph into several components;
# I want to calculate the shortest distances
# for component 2, for example
filtering = g.new_vertex_property("boolean")
for v in g.vertices():
    if comp[v] == 2:
        filtering[v] = True
    else:
        filtering[v] = False
# set the vertex filter
g.set_vertex_filter(filtering)
distances_comp = gt.shortest_distance(g)
The last line causes a segmentation fault. I have plotted the filtered graph
and it is correct, and I can also calculate the local clustering coefficient
without problems. Am I doing something wrong? Is there another way to filter
the graph and calculate the shortest distances? Or is this a bug?
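In case it helps to compare notes, here is a minimal self-contained sketch of the same filtering step, assuming a reasonably recent graph-tool (the toy graph and the component number are made up). It uses a `GraphView` instead of `set_vertex_filter`, which leaves the original graph untouched, and builds the boolean filter in one vectorized step via the property map's `.a` array:

```python
# Sketch only: assumes graph-tool is installed; the toy graph below
# stands in for the real one loaded from "myGraph.xml".
try:
    import graph_tool.all as gt
    HAVE_GT = True
except ImportError:
    HAVE_GT = False

if HAVE_GT:
    g = gt.Graph(directed=False)
    g.add_vertex(4)
    g.add_edge(0, 1)   # component A
    g.add_edge(2, 3)   # component B
    comp, hist = gt.label_components(g)
    # vectorized filter: True exactly for vertices in component 1
    filtering = g.new_vertex_property("boolean")
    filtering.a = comp.a == 1
    # GraphView filters without mutating g
    sub = gt.GraphView(g, vfilt=filtering)
    distances_comp = gt.shortest_distance(sub)
```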
Thanks so much,
Juan

I want to use graph-tool with the multiprocessing library in Python 2.6, but
I keep running into issues trying to share a full graph object. I can store
the graph in a multiprocessing.Namespace, but it doesn't keep the dicts of
properties. Example:
def initNS(ns):
    _g = Graph(directed=False)
    ns.graph = _g
    ns.edge_properties = {
        'genres': _g.new_edge_property("vector<string>"),
        'movieid': _g.new_edge_property("int"),
    }
    ns.vertex_properties = {
        'personid': _g.new_vertex_property("int32_t")
    }
    # Build property maps for edges and vertices to hold our data.
    # The graph vertices represent actors, whereas movies are the edges.
    # Edges
    _g.edge_properties["genres"] = ns.edge_properties['genres']
    _g.edge_properties["movieid"] = ns.edge_properties['movieid']
    # Vertices
    _g.vertex_properties["personid"] = ns.vertex_properties['personid']
    ns.graph = _g
##########
This initializes 'ns', which is a multiprocessing.Namespace. The problem is
that, for example, ns.edge_properties[ * ] tells me the type isn't
picklable. I tried to skip that and access them through _g.edge_properties
instead, but those dicts aren't carried over to the other processes in the
pool, presumably because they aren't picklable either.
Any thoughts about how to fix this?
(For those interested, I'm attempting to use the IMDbPy library to do some
graph analysis on the relationships among actors and movies. Each process
has its own db connection and tries to populate the graph with actor and
movie information in parallel, since it's a pretty large and dense graph:
somewhere in the neighborhood of 250,000+ vertices for just a depth of three
relationships.)
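One pattern that sidesteps the pickling issue entirely, in case it is useful: have each worker return only plain tuples (which always pickle), and let the parent process alone touch the Graph and its property maps. A rough sketch, where `fetch_movie_edges` is a hypothetical stand-in for a per-process IMDb query, and the plain `map` would become `pool.map` with a real `multiprocessing.Pool`:

```python
# Workers return plain (actor, actor, movieid) tuples -- always
# picklable -- so only the parent ever builds the graph.
def fetch_movie_edges(movie_id):
    # hypothetical per-process db query; pretend each movie
    # links exactly two actors
    return [(2 * movie_id, 2 * movie_id + 1, movie_id)]

movie_ids = [0, 1, 2]
# with multiprocessing this would be pool.map(fetch_movie_edges, movie_ids)
results = map(fetch_movie_edges, movie_ids)

# flatten the per-movie chunks into one edge list
edges = [t for chunk in results for t in chunk]
# then, in the parent only: e = g.add_edge(src, tgt) and
# movieid_prop[e] = mid for each (src, tgt, mid) tuple
```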
Thanks,
--
Derek

Is there some trick needed to get graphviz's HTML labels working from
graph-tool?
I've been making some nice svg plots from graph-tool setting the
label/shape/URL/tooltip vertex properties, but now I want to be able to
have multiple URL links from each node and HTML labels seem the obvious
way to do this.
I have an idea it ought to be as simple as setting the label strings to
something like (simple test case)
<<TABLE><TR><TD>left</TD><TD>right</TD></TR></TABLE>>
(and this seems to work as expected when used in a .dot file), but setting
the same thing from graph-tool just gets me the formatting reproduced
verbatim (extra '<' '>'s included) in the graph labels. Is some sort of
escaping needed (or do I need to bypass some escaping graph-tool is
doing)?
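For comparison, this is what the dot source has to contain: graphviz treats a label as HTML-like only when it is delimited by `<...>` at the dot level, whereas a quoted (and hence escaped) string renders the markup verbatim. A small sketch generating the dot text by hand (the node name is arbitrary), just to show the target output:

```python
def dot_html_node(name, html):
    # HTML-like labels must be emitted as label=<...>; wrapping the
    # markup in a quoted string is exactly what makes graphviz show
    # the tags verbatim, angle brackets and all
    return '%s [label=<%s>];' % (name, html)

label = "<TABLE><TR><TD>left</TD><TD>right</TD></TR></TABLE>"
src = "graph G {\n  %s\n}\n" % dot_html_node("n0", label)
```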
Thanks for any help
Tim
Using graph-tool 2.2.15-1 & graphviz 2.26.3-5 on Debian squeeze.

Hi,
sorry for my naive question. I'd like to import hundreds of large and dense
adjacency matrices, available as numpy arrays, into graph-tool.
The following approach takes very long. Q1: Is there a preferable alternative?
import graph_tool.all as gt
import numpy as np

adj = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1]])
n_vertices = adj.shape[0]
g = gt.Graph(directed=False)
vlist = g.add_vertex(n_vertices)
edge_ids = np.where(adj != 0)
for i in range(edge_ids[0].shape[0]):
    g.add_edge(g.vertex(edge_ids[0][i]), g.vertex(edge_ids[1][i]))
Since I am dealing with undirected graphs without self-loops, a first step
would be to take only one triangle of the symmetric adjacency matrix:
# get ids of the upper triangle (including the diagonal)
upper_tr = np.triu_indices_from(np.zeros((n_vertices, n_vertices)))
adj[upper_tr[0], upper_tr[1]] = 0
...
Q2: is this a valid approach?
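On Q1, for what it's worth, the per-edge Python loop is usually the bottleneck, and the whole edge list can be extracted with numpy first. A sketch of the index manipulation alone; recent graph-tool versions have an `add_edge_list` method that could consume `edges` directly in one pass, though I am not sure which version introduced it:

```python
import numpy as np

adj = np.array([[1, 1, 0],
                [1, 1, 0],
                [0, 0, 1]])

# strictly-upper triangle (k=1): drops self-loops and the duplicate
# lower-triangle entries of the symmetric matrix in one step
src, tgt = np.nonzero(np.triu(adj, k=1))
edges = np.column_stack([src, tgt])
# feed the whole array to the graph at once,
# e.g. g.add_edge_list(edges) where available
```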
Thanks in advance,
Matthias

Hello,
I'm using a csv reader to parse a very big file and create an undirected
graph accordingly.
The file can contain duplicated edges (i.e. A B in one row, B A in another
one), so I'm checking
if g.edge(v1, v2) is None:
    e = g.add_edge(v1, v2)
in order to discard them (v1 and v2 are vertices created from what is read
from the file).
However, the graph contains a lot of edges (a few million) and vertices
(many thousands), with potentially high vertex degrees, and it takes a long
time to process the data.
As far as I read in the source code, the Graph.edge() method scans all the
outgoing edges of the source node, but even when I pick the endpoint with
the lower degree as the source, it takes a lot of time to build the graph.
Is there any other way to remove the duplicated edges? Maybe an edge filter
based on some lambda wizardry?
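One way to avoid the `g.edge(v1, v2)` lookup entirely, if it helps: deduplicate the rows before touching the graph, using a set of order-normalized pairs, which costs O(1) per row instead of O(degree). A sketch with made-up row data:

```python
def unique_undirected(rows):
    # normalize each pair so ("A", "B") and ("B", "A")
    # collide on the same set key
    seen = set()
    out = []
    for a, b in rows:
        key = (a, b) if a <= b else (b, a)
        if key not in seen:
            seen.add(key)
            out.append((a, b))
    return out

rows = [("A", "B"), ("B", "A"), ("A", "C")]
edges = unique_undirected(rows)  # -> [("A", "B"), ("A", "C")]
```

Only the surviving pairs would then be handed to `add_edge`, so the graph is built in a single pass with no per-edge search.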
Thanks in advance,
Giuseppe
--
Researcher at University of Bologna, Italy