I have a directed graph of about half a million nodes and roughly a million
edges that follows scale-free behaviour, with a power-law degree
distribution. To test some of my hypotheses, I would like to generate
smaller random graphs (about 50 to 200 nodes) that are representative of the
big one.
When I use a sample function that samples directly from the real degree
distribution of the big network, I run into the following problems:
- I generate unconnected nodes with both zero in- AND out-degree.
- I generate small sub-parts of a few nodes that are not connected to the
main component.
- If I only sample from nodes with at least degree 1, the generated graph is
coherent, but no longer representative, since I need a large portion of
nodes with either only one in- or one out-degree.
Here is the part of my script I used for that, where samples are drawn from
dictionaries of the degrees:
k_in = in_degrees[a]
g = gt.random_graph(N, lambda: (sample_in(), sample_out()), directed=True)
I also tried sampling from a list of tuples, as you have mentioned before in
the forum, but I never got any results, since the tuples drawn randomly
from my list might not be combinable into a valid graph:
g = gt.random_graph(4, lambda i: degs[i], directed=True)
- Is there any option I could activate that would help me in those cases?
- Is there a better way to create representative small networks?
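One workaround for the "tuples might not be combinable" problem is to resample the whole joint degree sequence until it is realizable: the in- and out-degree totals must match, and no node may end up with zero total degree. A minimal pure-Python sketch (the marginal degree lists and the helper name are made up for illustration; they would come from your degree dictionaries):

```python
import random

def sample_joint_degrees(in_degs, out_degs, n, rng, max_tries=10000):
    """Draw a joint (in, out) degree sequence of length n from the
    empirical marginals, rejecting sequences that cannot form a
    directed graph: totals must match, and no node may be isolated."""
    for _ in range(max_tries):
        seq = [(rng.choice(in_degs), rng.choice(out_degs)) for _ in range(n)]
        # reject sequences containing isolated nodes (0 in AND 0 out)
        if any(i == 0 and o == 0 for i, o in seq):
            continue
        # a directed degree sequence is realizable only if totals match
        if sum(i for i, _ in seq) == sum(o for _, o in seq):
            return seq
    raise RuntimeError("no valid sequence found; relax the constraints")

rng = random.Random(42)
seq = sample_joint_degrees([1, 1, 2, 3], [1, 2, 2, 2], 50, rng)
assert sum(i for i, _ in seq) == sum(o for _, o in seq)
```

The resulting sequence can then be fed to the call from the snippet above, e.g. gt.random_graph(len(seq), lambda i: seq[i], directed=True). Note this only repairs realizability; connectedness is still not guaranteed.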
Any help on that issue will be much appreciated.
Sent from: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
I'm writing a small package that builds on graph-tool, but not on its graphics capabilities (also because I have to represent things other than the graph itself). Still, I could use some of its functions "under the hood" for my purposes.

I have a question about gt.draw.get_hierarchy_control_points(): the function returns the Bézier spline control points for the edges of a given graph, but I'm having difficulties understanding how this information is encoded. For a single edge in the graph, I get dozens of values as control points (a multiple of six, plus 2), so I suspect that all splines going from node A up to the root of the hierarchy and back down to node B are encoded there, and that the control points should be taken 6 by 6 (3x2 by 3x2 coordinates?).

How are the (x, y) control points encoded, then: (x, x, x, y, y, y) or (x, y, x, y, x, y)? What are the 2 additional values I get for each vector? Also, are the values absolute, or relative to one node in particular (A, B, or the root...)?
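For what it's worth, the two candidate encodings are easy to compare side by side; a small pure-Python helper (this is not graph-tool's API, just an illustration with a made-up flat vector):

```python
def as_points_grouped(flat):
    """Interpret [x1, x2, x3, y1, y2, y3] as [(x1, y1), (x2, y2), (x3, y3)]."""
    half = len(flat) // 2
    return list(zip(flat[:half], flat[half:]))

def as_points_interleaved(flat):
    """Interpret [x1, y1, x2, y2, x3, y3] as [(x1, y1), (x2, y2), (x3, y3)]."""
    return list(zip(flat[0::2], flat[1::2]))

flat = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
print(as_points_grouped(flat))      # [(0.0, 10.0), (1.0, 11.0), (2.0, 12.0)]
print(as_points_interleaved(flat))  # [(0.0, 1.0), (2.0, 10.0), (11.0, 12.0)]
```

Plotting both interpretations for one edge next to the actual drawing would quickly reveal which one the library uses.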
I'm a beginner in graph algorithms, and I'm having trouble finding an
efficient solution to my problem.
My goal is to find the X best paths between two vertices: not only the
shortest paths, but also suboptimal paths, ordered by length.
My graph is relatively small (10k vertices, 20k edges) and is undirected.
I don't use weights or vertex/edge attributes.
My current prototype:
I first use all_shortest_paths to find the best paths.
Then I use all_paths with cutoff = length of the shortest path + K, where K
is an arbitrary value passed as an argument to my function.
Finally, I sort the output of all_paths by path length and select the first X.
It works fine when the vertices are close to each other, but when my cutoff
goes over 20 or so, the all_paths function takes forever and my CPU usage
skyrockets. I can't even iterate with next() over the result. I understand
that all_paths is not very efficient in my case, but at the moment this is
my best guess.
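The prototype described above can be sketched in plain Python (this is not graph-tool's implementation; the graph is a toy adjacency dict, and names like best_paths are made up):

```python
from collections import deque

def bfs_dist(adj, s, t):
    """Shortest path length (in edges) from s to t by BFS, unweighted."""
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        if u == t:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return None  # t unreachable from s

def best_paths(adj, s, t, x, k):
    """All simple paths with at most shortest+k edges, sorted by length;
    return the first x. Iterative DFS to avoid recursion limits."""
    cutoff = bfs_dist(adj, s, t) + k
    out, stack = [], [(s, [s])]
    while stack:
        u, path = stack.pop()
        if u == t:
            out.append(path)
            continue
        if len(path) > cutoff:  # a path of n nodes has n - 1 edges
            continue
        for v in adj[u]:
            if v not in path:
                stack.append((v, path + [v]))
    out.sort(key=len)  # stable sort: DFS order preserved within a length
    return out[:x]

adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
print(best_paths(adj, 0, 3, 3, 1))  # [[0, 2, 3], [0, 1, 3], [0, 2, 1, 3]]
```

The exponential blow-up described below the cutoff is inherent to this enumeration, which is why the cutoff of 20-ish becomes intractable regardless of the library.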
I'm reading the graph-tool documentation, but I can't find any other
function fitting my needs. I'm considering using the bfs_search function
with a custom BFSVisitor, but I'm afraid I will run into performance issues
there as well.
Am I missing something?
Anyway, thanks for this library, it's impressive.
Sorry for my poor English.
Sent from: https://nabble.skewed.de/
I can give some feedback on this, since I've also used graph-tool for
extracting paths. My case was slightly different because (1) I wanted to
sample paths without bias, (2) I used a weight on the arcs, (3) my arcs
were directed, and (4) the number of nodes and edges was really big. Sadly,
(1) and (2) made it not very easy to do via graph-tool.
As far as I understand, graph-tool uses a DFS to obtain `all_paths`. When
the number of potential paths is large for a given cutoff but the remaining
number of valid paths is small, it will iterate for a long time (at each
`next()`) until it proves there are no more paths with a length equal to or
smaller than the cutoff.
You could choose to stop the iteration once it starts taking longer than a
certain time to find a path, but even this is no guarantee: there may be
many paths still remaining to be found. You can also play with the number
of paths to return...
Another thing you should take into account, if you want to stop `all_paths`
at a limited number of paths, is that since it does a DFS, two paths that
are close in the returned list will also be "close" in structure (i.e.,
similar). So "sampling" paths this way is not great.
Sorry I cannot be of more help. If you find a better way to do it with
graph-tool, please share it over here!
On Wed, Dec 16, 2020 at 12:00 PM <graph-tool-request(a)skewed.de> wrote:
> Date: Tue, 15 Dec 2020 08:22:24 -0700 (MST)
> From: Ruhm <romain.da-costa-vieira(a)itsfactory.fr>
> Subject: [graph-tool] Find best paths sorted by length with graph-tool
> [...]
If I have two graphs g and h with internal edge property maps, how can I make these the layers of a graph i?
g = Graph(directed=False)
g_weight = g.new_ep('int')
g_layer = g.new_ep('int')
g.add_edge_list([[0, 1, 2, 0], [2, 3, 2, 0]], eprops=[g_weight, g_layer])
g.edge_properties['weight'] = g_weight
g.edge_properties['layer'] = g_layer
h = Graph(directed=False)
h_weight = h.new_ep('int')
h_layer = h.new_ep('int')
h.add_edge_list([[1, 2, 1, 1]], eprops=[h_weight, h_layer])
h.edge_properties['weight'] = h_weight
h.edge_properties['layer'] = h_layer
This works, but it is not nice and gets lengthy when there are many layers:
i = g.copy() # make a deep copy
i.add_edge_list(h.edges()) # this does not add the edge properties
i.ep.weight.a[-h.num_edges():] = h.ep.weight.a # manually copy the edge weights
i.ep.layer.a[-h.num_edges():] = h.ep.layer.a # manually copy the edge layers
Is there a better way?
I'm using gt 2.29 from conda-ostrokach.