Thanks for these clarifications.
I guess that most of the time is spent in the *copy_property* [link to git https://git.skewed.de/count0/graph-tool/blob/master/src/graph/graph_properties_copy.cc#L33]. Indeed, when I provide a pre-initialized distance vector through the dist_map parameter, the execution of shortest_distance is only 10ms faster.
If I understood well, the *copy_property *is used to build a vector initialized with one single value. If so, one would obtain the same result with python and numpy with
x = np.empty(array_size, dtype=float) x.fill(value)
I did try to time this "numpy way initialization" (altough I'm not sure it corresponds to *copy_property*)
python -m timeit -s "import numpy as np" -n 100 "np.empty(33e6, dtype=int).fill(1)" 100 loops, best of 3: 34.9 msec per loop
These 34.9ms have to be compared to the 300ms (bake of envelope calculus) that takes the *copy_property *function.
Am I right about the way that the copy_property function works ? Could it be improved ?
2015-03-08 20:11 GMT+01:00 Tiago Peixoto [via Main discussion list for the graph-tool project] firstname.lastname@example.org:
On 07.03.2015 18:20, François wrote:
Consider two graphs: G=(V,E) and H=(V',E') such that:
- V' \subset V and E' \subset E
- |E| = 66M
- X = shortest_distance(G, G.vertex(0), max_dist=x,
- Y = shortest_distance(H, H.vertex(0), max_dist=x,
I verify that : X = Y
The execution took 320ms for (1) and 16ms for (2). However the number of visited vertices should be identical ? I expected to obtain comparable execution times.
This is because of the initialization of the algorithm. The algorithm needs to create and initialize at least two vectors of size |V| to store the distances and book-keeping, regardless of how many vertices are in fact visited. Therefore the algorithm will never be faster than O(|V|), independently of "max_dist".
Here is more or less what you are describing:
In : g = lattice([10, 10])
In : %time d = shortest_distance(g, g.vertex(0), max_dist=2) CPU times: user 3.33 ms, sys: 0 ns, total: 3.33 ms Wall time: 2.43 ms
In : g = lattice([1000, 1000])
In : %time d = shortest_distance(g, g.vertex(0), max_dist=2) CPU times: user 50 ms, sys: 0 ns, total: 50 ms Wall time: 28.6 ms
Note that the second graph is ten thousand times larger than the first one, but the algorithm was only a little more than ten times slower.
Now if we do it for the larger graph without max dist, we have:
In : %time d = shortest_distance(g, g.vertex(0)) CPU times: user 270 ms, sys: 0 ns, total: 270 ms Wall time: 260 ms
Which is around 10 times slower... So thing are indeed faster if max_dist is specified, but the algorithm will still depend on the total size of the graph.
-- Tiago de Paula Peixoto <[hidden email] http:///user/SendEmail.jtp?type=node&node=4026027&i=0>
graph-tool mailing list [hidden email] http:///user/SendEmail.jtp?type=node&node=4026027&i=1 http://lists.skewed.de/mailman/listinfo/graph-tool
*signature.asc* (836 bytes) Download Attachment
Tiago de Paula Peixoto email@example.com
If you reply to this email, your message will be added to the discussion below:
http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/... To start a new topic under Main discussion list for the graph-tool project, email firstname.lastname@example.org To unsubscribe from Shortest_distance complexity when used with max_dist, click here http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4026018&code=ZnJhbmNvaXMua2F3YWxhQGdtYWlsLmNvbXw0MDI2MDE4fDIxMTQ0MDk4Nzk= . NAML http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml