On 06.07.2016 09:46, killver wrote:
I am currently porting an existing networkx implementation to graph-tool and observed that the new implementation runs slower than the networkx one. I then found out that even simple node access is already slower. So I have the exact same undirected graph with 1000 nodes and an average degree of 2 in nx and gt, and then run:
    def select(g):
        for i in range(1000):
            n = g.vertex(0)

    def select_nx(g):
        for i in range(1000):
            n = g.nodes(0)
    %time select(gt)
    CPU times: user 75.6 ms, sys: 69 µs, total: 75.6 ms
    Wall time: 72.6 ms

    %time select_nx(g)
    CPU times: user 3.82 ms, sys: 0 ns, total: 3.82 ms
    Wall time: 3.46 ms
So I am wondering whether this is to be expected, and whether I will only see the drastic runtime improvements for heavier computations such as shortest paths, etc.
Graph-tool achieves its performance by off-loading central loops and data structures to C++. If your main loops are in Python, it offers no advantage whatsoever. The same applies to things like numpy's ndarray, etc. The way to achieve good performance in Python is to avoid as many explicit loops as possible. In fact, as you see above, relying on foreign data structures may even introduce some overhead.

(In graph-tool, each call to Graph.vertex() creates a new Vertex object, unlike in networkx, where it just accesses a list. This can be trivially optimized in graph-tool by either working with plain ints most of the time, or by caching the vertex objects, i.e.

    vertices = list(g.vertices())
    v = vertices[i]
)

Best,
Tiago

-- 
Tiago de Paula Peixoto <tiago@skewed.de>
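[For reference, a minimal sketch of the caching approach described above, assuming graph-tool is installed; the example graph and the function name select_cached are illustrative and not part of the original benchmark:]

    from graph_tool import Graph

    # Build a small undirected example graph (1000 vertices, as in the
    # benchmark above); its structure does not matter here.
    g = Graph(directed=False)
    g.add_vertex(1000)

    def select_cached(g):
        # Create the Vertex wrapper objects once and reuse them, so no new
        # Python object is constructed inside the hot loop.
        vertices = list(g.vertices())
        for i in range(1000):
            n = vertices[0]

    # Alternatively, keep plain integer indices (0, 1, 2, ...) and call
    # g.vertex(i) only at the point where a Vertex object is required.
    select_cached(g)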