On 12/03/2010 05:07 PM, Derek Ditch wrote:
I want to use graph-tool with the multiprocessing library in Python 2.6, but I keep running into issues when trying to share a full graph object. I can store the graph in a multiprocessing.Namespace, but it doesn't keep the dicts of properties. Example:
    def initNS( ns ):
        _g = Graph( directed = False)
        ns.graph = _g
        ns.edge_properties = {
            'genres': _g.new_edge_property("vector<string>"),
            'movieid': _g.new_edge_property("int"),
        }
        ns.vertex_properties = {
            'personid': _g.new_vertex_property("int32_t")
        }

        """
        Build property maps for edges and vertices to hold our data
        The graph vertices represent actors whereas movies represent edges
        """
        # Edges
        _g.edge_properties["genres"] = ns.edge_properties['genres']
        _g.edge_properties["movieid"] = ns.edge_properties['movieid']
        # Vertices
        _g.vertex_properties["personid"] = ns.vertex_properties['personid']
        ns.graph = _g
##########
This initializes 'ns', which is a multiprocessing.Namespace. The problem is that accessing, for example, ns.edge_properties[*] tells me that the type isn't picklable. I tried to skip that and use _g.edge_properties to access the maps instead, but those dicts aren't carried over to the other processes in the pool, presumably because they aren't picklable.
Any thoughts about how to fix this?
The problem is that property maps cannot be pickled independently, since they need internal references to the graph object. However, the entire graph can be pickled, together with its internal properties (i.e. properties stored in g.vertex_properties and g.edge_properties). So, instead of keeping the properties in ns.edge_properties, for instance, you should keep them in ns.graph.edge_properties.
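A minimal sketch of that arrangement, assuming graph-tool is installed and reusing the property names from the question. The helper names build_graph, roundtrip and main, and the id values 1 and 42, are made up for illustration. The property maps are registered only as internal properties, and the graph is then pickled as a whole, which is roughly what happens when it is assigned to a Namespace:

```python
import pickle


def build_graph():
    """Create the graph with its property maps stored internally."""
    # Imported here so this module still loads where graph-tool is absent.
    from graph_tool import Graph

    g = Graph(directed=False)
    # Maps placed in g.edge_properties / g.vertex_properties are "internal",
    # so they travel with the graph when it is pickled.
    g.edge_properties["genres"] = g.new_edge_property("vector<string>")
    g.edge_properties["movieid"] = g.new_edge_property("int")
    g.vertex_properties["personid"] = g.new_vertex_property("int32_t")
    return g


def roundtrip(obj):
    """Pickle and unpickle an object, as happens when crossing processes."""
    return pickle.loads(pickle.dumps(obj))


def main():
    g = build_graph()
    a, b = g.add_vertex(), g.add_vertex()
    e = g.add_edge(a, b)
    g.vertex_properties["personid"][a] = 1  # illustrative ids
    g.edge_properties["movieid"][e] = 42

    g2 = roundtrip(g)
    # The internal maps should come back together with the graph.
    print(sorted(g2.edge_properties.keys()), sorted(g2.vertex_properties.keys()))
```

Calling main() prints the names of the property maps that survived the round trip. In the worker processes the maps would then be reached through ns.graph.edge_properties rather than through a separate dict on the Namespace.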
(For those interested, I'm attempting to use the IMDbPy library to do some graph analysis on the relationships among actors and movies. Each process has its own db connection, and I'm trying to populate the graph with actor and movie information in parallel since it's a pretty large and dense graph: somewhere in the neighborhood of 250,000+ vertices for a depth of just three relationships.)
Note that graph-tool is not thread-safe, so any access or modification to the graph must be protected by a lock.
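As a rough illustration of the locking pattern, here is a sketch using threads from the standard library only. The edge list stands in for the shared graph, and the names edges, graph_lock, add_movie_edges and populate are hypothetical. With multiprocessing the same shape would apply, using a multiprocessing.Lock (or a Manager's Lock) in place of threading.Lock:

```python
import threading

# Stand-in for the shared graph: a plain edge list (hypothetical; a real
# graph-tool Graph would take its place).
edges = []
graph_lock = threading.Lock()


def add_movie_edges(actor_ids, movie_id):
    """Link every pair of actors sharing a movie, one locked update per edge."""
    for i, a in enumerate(actor_ids):
        for b in actor_ids[i + 1:]:
            with graph_lock:  # only one thread touches the graph at a time
                edges.append((a, b, movie_id))


def populate(casts):
    """Run one worker per movie; `casts` maps movie id -> list of actor ids."""
    workers = [
        threading.Thread(target=add_movie_edges, args=(actors, movie))
        for movie, actors in casts.items()
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return len(edges)
```

The slow part (in your case, the database queries) can stay outside the `with` block, so that the lock is held only for the short moment in which the graph itself is touched.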
Cheers, Tiago
-- Tiago de Paula Peixoto tiago@skewed.de