On 10.03.2015 15:39, François wrote:
Hello,
Firstly, thanks for pushing out this update so quickly; that's awesome. I tested your fix, but it seems that the vector initialization was not the only thing to blame: with the fix, the execution time is still around 300 ms.
I checked that your fix is used, and "timed" the execution of the *shortest_distance* function up to the call to *libgraph_tool_topology.get_dists*.
In the "sub-sample" graph:
- the initialization takes on average 3 ms;
- the call to *libgraph_tool_topology.get_dists* takes on average 11 ms.
In the full graph:
- the initialization takes on average 100 ms;
- the call to *libgraph_tool_topology.get_dists* takes on average 230 ms.
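For reference, a minimal sketch of this kind of measurement (the graphs from this thread are not available, so a generated lattice stands in for them; the source vertex is arbitrary):

import time
from graph_tool.all import lattice, shortest_distance

g = lattice([1000, 1000])  # hypothetical stand-in for the real graph

t0 = time.time()
# shortest_distance() performs the distance-map initialization internally
# and then calls into libgraph_tool_topology.get_dists()
dist = shortest_distance(g, source=g.vertex(0))
print("shortest_distance: %.0f ms" % ((time.time() - t0) * 1000))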
Do you have OpenMP enabled? If so, it is best to disable it with openmp_set_num_threads(1) before running the code.
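A minimal sketch of that suggestion, assuming a graph-tool build that exposes these helpers:

import graph_tool

# Check for OpenMP support and pin to a single thread before timing, so
# the measurements are not skewed by thread start-up and scheduling.
if graph_tool.openmp_enabled():
    graph_tool.openmp_set_num_threads(1)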
Note, however, that you will always get slower code for a larger graph... a 10x slow-down in each function is expected if your graph is 10x larger.
Note also that the distance initialization is done inside get_dists().
Surprisingly (or not?), when I change the pmap initialization [link to source: https://git.skewed.de/count0/graph-tool/blob/master/src/graph_tool/topology/__init__.py#L1248] to:
vals = numpy.arange(g.num_vertices(), dtype='int')
pmap = g.new_vertex_property("int64_t", vals=vals)
The numpy call takes 36 ms and the creation of the new vertex property takes 103 ms.
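A sketch reproducing this two-step measurement on a stand-in graph (the original full graph is not public, so a lattice is used instead):

import time
import numpy
from graph_tool.all import lattice

g = lattice([1000, 1000])  # hypothetical stand-in for the full graph

t0 = time.time()
vals = numpy.arange(g.num_vertices(), dtype='int')
t1 = time.time()
pmap = g.new_vertex_property("int64_t", vals=vals)
t2 = time.time()

print("numpy.arange:        %.1f ms" % ((t1 - t0) * 1e3))
print("new_vertex_property: %.1f ms" % ((t2 - t1) * 1e3))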
These are very tight loops, so things are very sensitive to cache hits, memory re-allocation issues and so on. For instance, here I get:
In [8]: g = lattice([1000, 10000])

In [9]: g
Out[9]: <Graph object, undirected, with 10000000 vertices and 19989000 edges at 0x7fad583720b8>

In [10]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 16.7 ms, sys: 0 ns, total: 16.7 ms
Wall time: 19.6 ms

In [11]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 13.3 ms, sys: 960 ms, total: 973 ms
Wall time: 976 ms

In [12]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 20 ms, sys: 13.3 ms, total: 33.3 ms
Wall time: 32.6 ms

In [13]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 3.33 ms, sys: 950 ms, total: 953 ms
Wall time: 954 ms

In [14]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 20 ms, sys: 10 ms, total: 30 ms
Wall time: 30 ms

In [15]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 20 ms, sys: 43.3 ms, total: 63.3 ms
Wall time: 62.1 ms

In [16]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 23.3 ms, sys: 6.67 ms, total: 30 ms
Wall time: 28.9 ms

In [17]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 13.3 ms, sys: 173 ms, total: 187 ms
Wall time: 186 ms

In [18]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 20 ms, sys: 13.3 ms, total: 33.3 ms
Wall time: 30.7 ms

In [19]: %time vals = numpy.arange(g.num_vertices(), dtype='int')
CPU times: user 6.67 ms, sys: 853 ms, total: 860 ms
Wall time: 858 ms
Notice how the timing fluctuates wildly, from ~30 ms to ~900 ms!
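One way to factor out such fluctuations (a sketch, not part of the original session; the globals argument requires Python >= 3.5) is to repeat the measurement and take the minimum, which is the estimate least affected by allocation and cache noise:

import timeit
import numpy
from graph_tool.all import lattice

g = lattice([1000, 10000])  # the same lattice as in the session above

# Run the statement 10 times, one execution per measurement, and
# compare the extremes.
times = timeit.repeat("numpy.arange(g.num_vertices(), dtype='int')",
                      globals={"numpy": numpy, "g": g},
                      repeat=10, number=1)
print("min: %.1f ms, max: %.1f ms" % (min(times) * 1e3, max(times) * 1e3))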
With the property map creation, things are similar:
In [20]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
CPU times: user 50 ms, sys: 0 ns, total: 50 ms
Wall time: 48.2 ms

In [21]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
CPU times: user 23.3 ms, sys: 20 ms, total: 43.3 ms
Wall time: 43.5 ms

In [22]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
CPU times: user 40 ms, sys: 3.33 ms, total: 43.3 ms
Wall time: 42.3 ms

In [23]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
CPU times: user 30 ms, sys: 490 ms, total: 520 ms
Wall time: 519 ms

In [24]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
CPU times: user 50 ms, sys: 6.67 ms, total: 56.7 ms
Wall time: 56.6 ms

In [25]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
CPU times: user 26.7 ms, sys: 1.12 s, total: 1.15 s
Wall time: 1.14 s

In [26]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
CPU times: user 43.3 ms, sys: 13.3 ms, total: 56.7 ms
Wall time: 54.6 ms

In [27]: %time pmap = g.new_vertex_property("int64_t", vals=vals)
CPU times: user 20 ms, sys: 1.11 s, total: 1.13 s
Wall time: 1.12 s
These jumps in time are probably just memory re-allocations.
The actual initialization itself is basically the same when using a numpy array or a property map:
In [33]: %time pmap.a[:] = 10
CPU times: user 30 ms, sys: 0 ns, total: 30 ms
Wall time: 20.7 ms

In [34]: %time pmap.a[:] = 10
CPU times: user 26.7 ms, sys: 0 ns, total: 26.7 ms
Wall time: 18.9 ms

In [35]: %time pmap.a[:] = 10
CPU times: user 20 ms, sys: 0 ns, total: 20 ms
Wall time: 20.1 ms

In [36]: %time vals[:] = 10
CPU times: user 16.7 ms, sys: 3.33 ms, total: 20 ms
Wall time: 18.3 ms

In [37]: %time vals[:] = 10
CPU times: user 26.7 ms, sys: 0 ns, total: 26.7 ms
Wall time: 21.7 ms

In [38]: %time vals[:] = 10
CPU times: user 16.7 ms, sys: 0 ns, total: 16.7 ms
Wall time: 15 ms
Therefore I'm not really sure there is any issue with the initialization.
Best,
Tiago