Re: [graph-tool] do_djk_search without distance initialization

28 Jun 2017

      On 28.06.2017 15:00, François Kawala wrote:
...
I try to dig up an old idea that we've discussed previously (the thread is
here
<http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/Shortest-distance-complexity-when-used-with-max-dist-td4026018.html>).
summary: when one runs shortest distance on a graph with several million
vertices the distance initialization cost is high with respect to the actual
Dijsktra cost. We want to circumvent the initialization [see L326 of
src/graph/topology/graph_distance.cc
<https://git.skewed.de/count0/graph-tool/blob/master/src/graph/topology/graph_distance.cc#L326>]
The proposed approach (as per this boost thread
<https://groups.google.com/forum/#%21topic/boost-list/7nCzvkkEXCk>) is to
use the `dijkstra_shortest_paths_no_color_map_no_init`. The distance
initialization is done "on the fly" by the `examine_edge` function. More
precisely, the Dijkstra visitor maintains a set of discovered vertices. This
set is read in the `examine_edge` function to set the distance to infinity
when a the target vertex does not belong to the set of discovered vertices.
The `examine_edge` function is also responsible to update the  set of
discovered vertices.
I've pushed a commit:

   https://git.skewed.de/count0/graph-tool/commit/bf174831

that replaces the search functions by no_init ones. The initialization is
still done, but only once via a single numpy call in the Python side of things.

The speed of the numpy call should be comparable to what we get with
allocation, which is unavoidable.

Please see if the performance improvements you observe with this is acceptable.

Best,
Tiago

-- 
Tiago de Paula Peixoto <tiago@skewed.de>