I am currently implementing an existing networkx implementation in graph-tool and observed that the new implementation runs slower than the networkx one. I then found out that a simple node access is already slower. So I have the exact same undirected graph with 1000 nodes and avg. degree of 2 in nx and gt, and then run:
def select(g): for i in range(1000): n = g.vertex(0)
def select_nx(g): for i in range(1000): n = g.nodes(0)
%time select(gt) CPU times: user 75.6 ms, sys: 69 µs, total: 75.6 ms Wall time: 72.6 ms
%time select_nx(g) CPU times: user 3.82 ms, sys: 0 ns, total: 3.82 ms Wall time: 3.46 ms
So I am wondering whether this is to be expected and that I can only see the drastic runtime improvements when working with calculations like shortest paths etc.
-- View this message in context: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/... Sent from the Main discussion list for the graph-tool project mailing list archive at Nabble.com.
On 06.07.2016 09:46, killver wrote:
I am currently implementing an existing networkx implementation in graph-tool and observed that the new implementation runs slower than the networkx one. I then found out that a simple node access is already slower. So I have the exact same undirected graph with 1000 nodes and avg. degree of 2 in nx and gt, and then run:
def select(g): for i in range(1000): n = g.vertex(0)
def select_nx(g): for i in range(1000): n = g.nodes(0)
%time select(gt) CPU times: user 75.6 ms, sys: 69 µs, total: 75.6 ms Wall time: 72.6 ms
%time select_nx(g) CPU times: user 3.82 ms, sys: 0 ns, total: 3.82 ms Wall time: 3.46 ms
So I am wondering whether this is to be expected and that I can only see the drastic runtime improvements when working with calculations like shortest paths etc.
Graph-tool achieves its performance by off-loading central loops and data structures to C++. If your main loops are in Python, it offers no advantage whatsoever.
The same applies for things like ndarray in Numpy, etc. The approach to achieve good performance in Python is to avoid as many loops as possible.
In fact, as you see above, the fact that it depends on foreign data structures may even cause some overhead.
(In graph-tool, when you call Graph.vertex() a new object is generated, unlike in networkx where it just accesses a list. This can be trivially optimized in graph-tool by either working with 'ints' most of the time, or caching the vertex objects, i.e. vertices = list(g.vertices()); v = vertices[i])
Best, Tiago
Thanks Tiago, that totally makes sense.
I have another question. I am currently running some benchmarks for comparing two graph-tool installations: 2.2.44 vs. 2.16
While in general 2.16 is much faster on most operations, it is slower on some standard vertex access operations. For example:
def test(): g = gt.price_network(10000, 10, directed=False) for i in range(10000): rnd = g.vertex(randint(0, g.num_vertices())) %time test()
2.22: CPU times: user 189 ms, sys: 7.15 ms, total: 197 ms Wall time: 196 ms
2.16 CPU times: user 841 ms, sys: 776 µs, total: 841 ms Wall time: 838 ms
Do you have an idea why this might be the case?
Thanks, Philipp
On 06.07.2016 19:23, Tiago de Paula Peixoto wrote:
On 06.07.2016 09:46, killver wrote:
I am currently implementing an existing networkx implementation in graph-tool and observed that the new implementation runs slower than the networkx one. I then found out that a simple node access is already slower. So I have the exact same undirected graph with 1000 nodes and avg. degree of 2 in nx and gt, and then run:
def select(g): for i in range(1000): n = g.vertex(0)
def select_nx(g): for i in range(1000): n = g.nodes(0)
%time select(gt) CPU times: user 75.6 ms, sys: 69 µs, total: 75.6 ms Wall time: 72.6 ms
%time select_nx(g) CPU times: user 3.82 ms, sys: 0 ns, total: 3.82 ms Wall time: 3.46 ms
So I am wondering whether this is to be expected and that I can only see the drastic runtime improvements when working with calculations like shortest paths etc.
Graph-tool achieves its performance by off-loading central loops and data structures to C++. If your main loops are in Python, it offers no advantage whatsoever.
The same applies for things like ndarray in Numpy, etc. The approach to achieve good performance in Python is to avoid as many loops as possible.
In fact, as you see above, the fact that it depends on foreign data structures may even cause some overhead.
(In graph-tool, when you call Graph.vertex() a new object is generated, unlike in networkx where it just accesses a list. This can be trivially optimized in graph-tool by either working with 'ints' most of the time, or caching the vertex objects, i.e. vertices = list(g.vertices()); v = vertices[i])
Best, Tiago
graph-tool mailing list graph-tool@skewed.de https://lists.skewed.de/mailman/listinfo/graph-tool
On 20.07.2016 11:52, Philipp Singer wrote:
While in general 2.16 is much faster on most operations, it is slower on some standard vertex access operations. For example:
def test(): g = gt.price_network(10000, 10, directed=False) for i in range(10000): rnd = g.vertex(randint(0, g.num_vertices())) %time test()
2.22: CPU times: user 189 ms, sys: 7.15 ms, total: 197 ms Wall time: 196 ms
2.16 CPU times: user 841 ms, sys: 776 µs, total: 841 ms Wall time: 838 ms
Do you have an idea why this might be the case?
This looks like a performance regression. Please just open an issue in the website, so I can take care of it.
Best, Tiago