On 07.03.2015 18:20, François wrote:
Consider two graphs: G=(V,E) and H=(V',E') such that:
- V' \subset V and E' \subset E
- |E| = 66M
- |E'|=2M.
Run:
- X = shortest_distance(G, G.vertex(0), max_dist=x, weights=G.ep['weights'])
- Y = shortest_distance(H, H.vertex(0), max_dist=x, weights=H.ep['weights'])
I verify that : X = Y
The execution took 320ms for (1) and 16ms for (2). However the number of visited vertices should be identical ? I expected to obtain comparable execution times.
This is because of the initialization of the algorithm. The algorithm needs to create and initialize at least two vectors of size |V| to store the distances and book-keeping, regardless of how many vertices are in fact visited. Therefore the algorithm will never be faster than O(|V|), independently of "max_dist".
Here is more or less what you are describing:
In [1]: g = lattice([10, 10])
In [2]: %time d = shortest_distance(g, g.vertex(0), max_dist=2) CPU times: user 3.33 ms, sys: 0 ns, total: 3.33 ms Wall time: 2.43 ms
In [3]: g = lattice([1000, 1000])
In [4]: %time d = shortest_distance(g, g.vertex(0), max_dist=2) CPU times: user 50 ms, sys: 0 ns, total: 50 ms Wall time: 28.6 ms
Note that the second graph is ten thousand times larger than the first one, but the algorithm was only a little more than ten times slower.
Now if we do it for the larger graph without max dist, we have:
In [5]: %time d = shortest_distance(g, g.vertex(0)) CPU times: user 270 ms, sys: 0 ns, total: 270 ms Wall time: 260 ms
Which is around 10 times slower... So thing are indeed faster if max_dist is specified, but the algorithm will still depend on the total size of the graph.
Best, Tiago
-- Tiago de Paula Peixoto tiago@skewed.de