To be comprehensive, I add here the MWE source. Note that I fixed the vertices so that the output is reproducible. However, one could select randomly the vertices and would end with the same behavior.

Bests,
François.
  

import multiprocessing
import graph_tool as gt 
import graph_tool.topology as gtt
import hashlib
import sys

class MyProcess(multiprocessing.Process):
    """
    A process that computes shortest paths and shortest distances in a graph tool graph.
    """
    def __init__(self, graph, test):
        super(MyProcess, self).__init__()
        self.graph = graph
        self.test = test

    def run(self):
        while True:
            # Operation is repeated so that the bug is cristal clear. 
            source, target = self.test
            source = self.graph.vertex(source)
            target = self.graph.vertex(target)
            
            # We start the work.
            print('{} does shortest_distance from {} to {}'.format(self, source, target))
        
            gtt.shortest_distance(self.graph, 
                                  source=source,
                                  weights=self.graph.ep['weight'],
                                  max_dist=1400,
                                  pred_map=True)

            # We end the work.
            print('{} done.'.format(self))


def hash_graphs(*args):
    """
    Provides an edge based graph digest that can be used to invalidate old cache.

    :type args: tuple of :class:`graph_tool.GraphView`
    :param args: the graphs to be hashed.

    :rtype: str
    :return: a hash digest of the input graph.
    """
    graph_hash = hashlib.md5()
    for graph in args:
        graph_hash.update(gt.edge_endpoint_property(graph, graph.vp['id'], "source").a.tobytes())
        graph_hash.update(gt.edge_endpoint_property(graph, graph.vp['id'], "target").a.tobytes())
    return graph_hash.hexdigest()


if __name__ == '__main__':
    
    # Unserialize the graph.
    graph = gt.load_graph('./mwe/graph.gt.gz')
    
    # Bug switch.
    if sys.argv[-1] == 'DO_HASH':
        graph_hash = hash_graphs(graph)
    
    # Repetable inputs.
    tests = [(452946, 391015), 
             (266188, 207342),
             (514127, 290838),
             (439705, 87897),
             (223098, 440593),
             (279880, 368550),
             (108357, 199593),
             (273888, 275937)]

    # Actual work.
    procs = [MyProcess(graph, tests[i]) for i in range(8)]

    for proc in procs:
        proc.start()

    for proc in procs:
        proc.join() 

On Thu, Nov 10, 2016 at 7:24 PM, François Kawala <francois.kawala@gmail.com> wrote:
Hello,

I observe a quite strange bug that involves python's multiprocessing library. I try to use (read only) one graph instance with several Multithreading.Process. The graph is unserialized in the parent process. Each child receives a reference to the graph. Then each child does simple repetitive calls to graph_tool.topology.shortest_distance. Everything great each child process works as fast as it can. However when the main process executes the hash_graphs function presented below, each child process hangs infinitely. The hash_graphs is executed prior to the children start.

def hash_graphs(*args):
    """
    Provides an edge based graph digest that can be used to invalidate old cache.

    :type args: tuple of :class:`graph_tool.GraphView`
    :param args: the graphs to be hashed.

    :rtype: str
    :return: a hash digest of the input graph.
    """
    graph_hash = hashlib.md5()
    for graph in args:
        graph_hash.update(gt.edge_endpoint_property(graph, graph.vp['id'], "source").a.tobytes())
        graph_hash.update(gt.edge_endpoint_property(graph, graph.vp['id'], "target").a.tobytes())
    return graph_hash.hexdigest()

I package a MWE, it is available here : https://drive.google.com/file/d/0B5GhhBKHOKOxVnpfYTBwNDZxODA/view?usp=sharing. To run it simply do :

tar xzf mwe.tar.gz

# run the buggy version
python3 -m mwe DO_HASH

# run as expected
python3 -m mwe

The buggy output looks like : 

<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-4, started)> does shortest_distance from 439705 to 87897
<MyProcess(MyProcess-5, started)> does shortest_distance from 223098 to 440593
<MyProcess(MyProcess-6, started)> does shortest_distance from 279880 to 368550
<MyProcess(MyProcess-7, started)> does shortest_distance from 108357 to 199593
<MyProcess(MyProcess-8, started)> does shortest_distance from 273888 to 275937

The expected output looks like : 

<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-5, started)> does shortest_distance from 223098 to 440593
<MyProcess(MyProcess-6, started)> does shortest_distance from 279880 to 368550
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-2, started)> done.
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-4, started)> does shortest_distance from 439705 to 87897
<MyProcess(MyProcess-7, started)> does shortest_distance from 108357 to 199593
<MyProcess(MyProcess-3, started)> done.
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-8, started)> does shortest_distance from 273888 to 275937
<MyProcess(MyProcess-2, started)> done.
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-3, started)> done.
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-6, started)> done.
<MyProcess(MyProcess-6, started)> does shortest_distance from 279880 to 368550
<MyProcess(MyProcess-4, started)> done.
<MyProcess(MyProcess-4, started)> does shortest_distance from 439705 to 87897
<MyProcess(MyProcess-8, started)> done.
<MyProcess(MyProcess-8, started)> does shortest_distance from 273888 to 275937
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-2, started)> done.
<MyProcess(MyProcess-2, started)> does shortest_distance from 266188 to 207342
<MyProcess(MyProcess-3, started)> done.
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
<MyProcess(MyProcess-5, started)> done.
<MyProcess(MyProcess-5, started)> does shortest_distance from 223098 to 440593
<MyProcess(MyProcess-1, started)> done.
<MyProcess(MyProcess-1, started)> does shortest_distance from 452946 to 391015
<MyProcess(MyProcess-8, started)> done.
<MyProcess(MyProcess-8, started)> does shortest_distance from 273888 to 275937
<MyProcess(MyProcess-7, started)> done.
<MyProcess(MyProcess-7, started)> does shortest_distance from 108357 to 199593
<MyProcess(MyProcess-3, started)> done.
<MyProcess(MyProcess-3, started)> does shortest_distance from 514127 to 290838
...

How could I explain this behavior ? 

Bests,
François.




--
François Kawala