Dear Tiago,
I have a directed graph of about half a million nodes and roughly a million
edges that follows scale-free behaviour with a power-law degree
distribution. To test some of my hypotheses, I would like to generate
smaller random graphs (about 50 to 200 nodes) that are representative of the
big one. When I use a sampling function that draws directly from the real
degree distribution of the big network, I run into the following problems:
- I generate disconnected nodes with both 0 in- AND out-degree.
- I generate small components of a few nodes that are not connected to the
main graph.
- If I only sample from nodes with at least degree 1, the generated graph is
connected, but no longer representative, as I need a large portion of nodes
with either only one in- or one out-degree.
Here is the part of my script I used for that; samples are drawn from
dictionaries of the degrees:
def sample_in():
    a = np.random.randint(num)
    k_in = in_degrees[a]
    return k_in

def sample_out():
    if sample_in() == 0:
        b = np.random.randint(num_out)
        # dict views are not indexable in Python 3, hence the list()
        k_out = list(out_zero_zeros.values())[b]
        return k_out
    else:
        b = np.random.randint(num)
        k_out = out_degrees[b]
        return k_out

N = 200
g = gt.random_graph(N, lambda: (sample_in(), sample_out()),
                    model="constrained-configuration", directed=True)
I also tried sampling from a list of tuples, as you have mentioned before on
the forum, but I didn't get any results, as the tuples drawn randomly from
my list might not be combinable:
degs = [(7,1),(4,3),(5,6),(2,4),(6,8),(2,0),(3,5),(0,3),(2,7),(2,1)]
g = gt.random_graph(4, lambda i: degs[i], directed=True)
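One possible way around mismatched in/out samples (a sketch, not from the original post) is to draw the (in, out) pair jointly from the empirical joint distribution of the large graph, so each sampled pair is one that actually occurs on some real node. Here `deg_pairs` is a hypothetical stand-in for the pairs extracted from the real graph:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical empirical (in, out) degree pairs taken from the large
# graph -- in practice these would come from the real in-/out-degree
# arrays of each node.
deg_pairs = np.array([(7, 1), (4, 3), (5, 6), (2, 4), (6, 8),
                      (2, 0), (3, 5), (0, 3), (2, 7), (2, 1)])

def sample_deg():
    """Sample an (in, out) pair jointly, preserving the empirical
    correlation between the in- and out-degree of the same node."""
    i = rng.integers(len(deg_pairs))
    return tuple(deg_pairs[i])

# The sampler could then be passed to graph-tool, e.g.:
# g_small = gt.random_graph(200, sample_deg, directed=True)
```

This avoids ever pairing a 0 in-degree with a 0 out-degree unless such a node exists in the original data.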
- Is there any option I could activate that would help me in the cases
described above?
- Is there a better way to create representative small networks?
Any help on that issue will be much appreciated.
Best wishes,
Jana
--
Sent from: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/

Hi again,
I'm writing a small package that builds on graph-tool, but not on its graphics capabilities (also because I have to represent things other than the graph itself). Still, I could use some of its functions "under the hood" for my purposes. I have a question about gt.draw.get_hierarchy_control_points(): the function returns the Bézier spline control points for the edges of a given graph, but I'm having difficulty understanding how this information is encoded. For a single edge in the graph, I get dozens of values as control points (a multiple of six, plus 2), so I suspect that all the splines going from node A to the root of the hierarchy and back to node B are encoded there, and that the control points should be taken 6 by 6 (3x2 coordinates?). How are the (x, y) control points encoded then: (x, x, x, y, y, y) or (x, y, x, y, x, y)? What are the 2 additional values I have in each vector? Also, are the values absolute, or relative to one node in particular (A, B, or the root)?
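As an illustration only: if the coordinates turn out to be interleaved as (x, y, x, y, ...), which is an assumption here and not confirmed behaviour of get_hierarchy_control_points(), a simple reshape would recover the point list:

```python
import numpy as np

# Hypothetical flat control-point vector for one edge; the interleaved
# (x, y, x, y, ...) layout below is an assumption, not confirmed
# behaviour of get_hierarchy_control_points().
flat = np.array([0.0, 1.0, 0.5, 1.5, 1.0, 2.0, 1.5, 2.5])

# Under that assumption, reshaping recovers the (x, y) points, which
# could then be grouped into cubic Bezier segments:
points = flat.reshape(-1, 2)
```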
Thanks
d

I am curious what is used to calculate the standard deviation of the
average in gt.vertex_average and gt.edge_average.
>>> t2=gt.Graph()
>>> t2.add_vertex(2)
>>> t2.add_edge(t2.vertex(0), t2.vertex(1))
>>> gt.vertex_average(t2, "in")
(0.5, 0.35355339059327373)
Now, shouldn't the std be σ(n) = sqrt(((0-0.5)^2 + (1-0.5)^2)/2) = 0.5?
Also, σ(n-1) = sqrt((0.5^2 + 0.5^2)/(2-1)) ≈ 0.70711.
0.3535 is sqrt(2)/4, which happens to be σ(n-1)/2, so there seems to be some
relation there.
A slightly bigger graph:
>>> t3=gt.Graph()
>>> t3.add_vertex(5)
>>> t3.add_edge(t3.vertex(0), t3.vertex(1))
>>> gt.vertex_average(t3, "in")
(0.2, 0.17888543819998318)
Now, the series of vertex in-degrees should be 0, 1, 0, 0, 0.
The Windows calculator gives σ(n) = 0.4 and σ(n-1) ≈ 0.44721, so where does
0.1788854 come from?
The reason I am asking is that I have a large graph where the average looks
quite alright but the std makes no sense: going by the histogram, the degree
values are distributed quite a bit more widely than the std would indicate.
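For what it's worth, one reading consistent with both of the reported numbers is the standard error of the mean, σ(n)/sqrt(n), rather than the plain standard deviation. A small numpy check (this is only a numerical observation, not a statement of what graph-tool actually computes):

```python
import numpy as np

# In-degree series of the two examples above
deg2 = np.array([0, 1])            # t2: two vertices, one edge
deg5 = np.array([0, 1, 0, 0, 0])   # t3: five vertices, one edge

# Standard error of the mean, sigma(n) / sqrt(n), for each series:
se2 = np.std(deg2) / np.sqrt(len(deg2))   # ~0.3535534
se5 = np.std(deg5) / np.sqrt(len(deg5))   # ~0.1788854
```

Both values match the second entries returned by vertex_average above.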

Hi,
I am using the minimize_blockmodel_dl() function and the LayeredBlockState
class to conduct community detection with metadata. I would like to know
whether it is possible to specify the block number limits separately for the
user clusters and the metadata clusters (instead of a limit on the total
number of blocks) during SBM inference, since we want different granularity
for user clusters and metadata clusters.
(P.S. I tried using the nested SBM and merging the user clusters and the
metadata clusters separately. It worked to some extent, but it seems
impossible to accurately control the final number of clusters, which we
might need to do.)
Thanks!
Yan

Hi, everyone!
I ran into a problem while learning how to use graph-tool. I read the paper
"Network reconstruction and community detection from dynamics", and I am
trying to reproduce its results. When I followed the same settings for real
networks with synthetic dynamics, the similarities were only about 0.2. I
have a question about how to control the number of infection events per
node, a, for the first model, and the number of micro-states, M, for the
second model. The whole process is shown below.
import graph_tool.all as gt
import numpy as np
from matplotlib import cm

g = gt.collection.konect_data["openflights"]  # airport network with SIS dynamics
gt.remove_parallel_edges(g)
g = gt.extract_largest_component(g, prune=False)

# Simulation of an empirical dynamic model.
# The algorithm accepts multiple independent time series for the
# reconstruction. We will generate 100 SIS cascades starting from a
# random node each time, with uniform infection probability beta=0.2.
ss = []
for i in range(100):
    si_state = gt.SISState(g, beta=.2)
    s = [si_state.get_state().copy()]
    for j in range(10):
        si_state.iterate_sync()
        s.append(si_state.get_state().copy())
    # Each time series should be represented as a single vector-valued
    # vertex property map with the states for each node at each time.
    s = gt.group_vector_property(s)
    ss.append(s)

# Prepare the initial state of the reconstruction as an empty graph.
u = g.copy()
u.clear_edges()
ss = [u.own_property(s) for s in ss]  # time-series properties need to be
                                      # owned by graph u

# Create the reconstruction state.
rstate = gt.EpidemicsBlockState(u, s=ss, beta=None, r=1e-6, global_beta=.2,
                                state_args=dict(B=20), nested=False,
                                aE=g.num_edges())

# Now we collect the marginals for exactly 10,000 sweeps, at
# intervals of 10 sweeps:
gm = None
bm = None
betas = []

def collect_marginals(s):
    global gm, bm
    gm = s.collect_marginal(gm)
    b = gt.perfect_prop_hash([s.bstate.b])[0]
    bm = s.bstate.collect_vertex_marginals(bm, b=b)
    betas.append(s.params["global_beta"])

gt.mcmc_equilibrate(rstate, force_niter=1000,
                    mcmc_args=dict(niter=10, xstep=0),
                    callback=collect_marginals)

print("Posterior similarity:",
      gt.similarity(g, gm, g.new_ep("double", 1), gm.ep.eprob))
print("Inferred infection probability: %g ± %g" % (np.mean(betas), np.std(betas)))
##########################################################
g = gt.GraphView(gt.collection.konect_data["maayan-foodweb"],
                 directed=True)  # a food web network with Ising dynamics
gt.remove_parallel_edges(g)

# The algorithm accepts multiple independent time series for the
# reconstruction. We will generate 1000 Ising cascades starting from a
# random initial state each time, with uniform inverse temperature beta=0.1.
ss = []
for i in range(1000):
    si_state = gt.IsingGlauberState(g, beta=.1)
    s = [si_state.get_state().copy()]
    si_state.iterate_async(niter=1000)
    s.append(si_state.get_state().copy())
    # Each time series should be represented as a single vector-valued
    # vertex property map with the states for each node at each time.
    s = gt.group_vector_property(s)
    ss.append(s)

u = g.copy()
u.clear_edges()
ss = [u.own_property(s) for s in ss]

# The reconstruction starts from the empty graph u, which owns the
# time-series properties.
rstate = gt.PseudoIsingBlockState(u, s=ss, beta=0.1, state_args=dict(B=1),
                                  nested=False, aE=g.num_edges())

gm = None
bm = None
betas = []

def collect_marginals(s):
    global gm, bm
    gm = s.collect_marginal(gm)
    b = gt.perfect_prop_hash([s.bstate.b])[0]
    bm = s.bstate.collect_vertex_marginals(bm, b=b)
    betas.append(s.params["beta"])

gt.mcmc_equilibrate(rstate, force_niter=1000,
                    mcmc_args=dict(niter=10, xstep=0),
                    callback=collect_marginals)

print("Posterior similarity:",
      gt.similarity(g, gm, g.new_ep("double", 1), gm.ep.eprob))
print("Inferred inverse temperature: %g ± %g" % (np.mean(betas), np.std(betas)))
Moreover, I also wonder how to do a nested version for the same network.
Please let me know if you need more information on the question; otherwise,
I hope to hear how this can be achieved using graph-tool.
Thanks,
Gege Hou

I followed the installation instructions on the website:
https://git.skewed.de/count0/graph-tool/-/wikis/installation-instructions
for installing graph-tool on Ubuntu 18.04, but when I run "sudo apt-get
install python3-graph-tool" in the terminal, it gives me the following
error:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
 python3-graph-tool : Depends: libboost-context1.67.0 but it is not installable
                      Depends: libboost-iostreams1.67.0 but it is not installable
                      Depends: libboost-python1.67.0 but it is not installable
                      Depends: libboost-python1.67.0-py38 but it is not installable
                      Depends: libboost-regex1.67.0-icu63 but it is not installable
                      Depends: libc6 (>= 2.29) but 2.27-3ubuntu1 is to be installed
                      Depends: libgcc-s1 (>= 3.4) but it is not installable
                      Depends: libgomp1 (>= 9) but 8.4.0-1ubuntu1~18.04 is to be installed
                      Depends: libstdc++6 (>= 9) but 8.4.0-1ubuntu1~18.04 is to be installed
E: Unable to correct problems, you have held broken packages.
Does anyone know how I can fix it?

Hi,
There seems to be a problem with get_edges_prob for the layered SBM. Here is
a minimal example:
import graph_tool.all as gt
import numpy as np

gr = gt.generate_sbm(b=np.array([0]*500 + [1]*500),
                     probs=np.array([[10000, 200], [200, 10000]]))
etype = gr.new_edge_property('int')
gr.ep.etype = etype
t = 0
for e in gr.edges():
    gr.ep.etype[e] = t % 4
    t += 1
state = gt.minimize_nested_blockmodel_dl(gr, deg_corr=True, layers=True,
                                         state_args=dict(ec=gr.ep.etype,
                                                         layers=True))
print(state.get_edges_prob(missing=[[2,32,0]], spurious=[]))
print(state.get_edges_prob(missing=[[2,32,0],[3,4,2]], spurious=[]))
print(state.get_edges_prob(missing=[[2,32,0],[3,4,2],[36,7,0]], spurious=[]))
pr = state.get_edges_prob(missing=[[2,32,0],[3,4,2]],
                          spurious=gr.get_edges([gr.ep.etype])[:3])
Output:
0.0
-7.883180576649465
-7.883180576649465
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-1f73a43d95dd> in <module>
     12 print(state.get_edges_prob(missing=[[2,32,0],[3,4,2]],spurious=[]))
     13 print(state.get_edges_prob(missing=[[2,32,0],[3,4,2],[36,7,0]],spurious=[]))
---> 14 pr=state.get_edges_prob(missing=[[2,32,0],[3,4,2]],spurious=gr.get_edges([gr.ep.etype])[:3])

/usr/lib/python3/dist-packages/graph_tool/inference/nested_blockmodel.py in get_edges_prob(self, missing, spurious, entropy_args)
    499                 lstate._state.clear_egroups()
    500
--> 501             L += lstate.get_edges_prob(missing, spurious, entropy_args=eargs)
    502             if isinstance(self.levels[0], LayeredBlockState):
    503                 missing = [(lstate.b[u], lstate.b[v], l_) for u, v, l_ in missing]

/usr/lib/python3/dist-packages/graph_tool/inference/layered_blockmodel.py in get_edges_prob(self, missing, spurious, entropy_args)
    896             if e is None:
    897                 raise ValueError("edge not found: (%d, %d, %d)" % \
--> 898                                  (int(u), int(v), l[0]))
    899
    900             if state.is_weighted:

ValueError: edge not found: (3, 4, 2)
The error occurs only when spurious edges are included, but even without
spurious edges the outputs above seem inaccurate. I tried to resolve the
issue myself but couldn't make it work.
Best wishes,
Anatol

Hi,
I am computing a weight for each edge in a graph of a few million nodes and
about 50 million edges. Initially it was processing 25,000 edges per minute,
so it should have finished in about 30+ hours. When I checked the server for
results after more than 72 hours, the process was still running, but very
slowly, at fewer than 1000 edges per minute.
Is there anything I am doing wrong, or what could be the reason for such a
slowdown?
Here is the code:
def compute_exclusivities(g):
    count = 0
    excl = g.new_edge_property("double")
    g.edge_properties["Exclusivity"] = excl
    edges = g.get_edges([g.edge_properties["label_id"]])
    for edge in edges:
        edges_of_head = g.get_out_edges(edge[0],
                                        [g.edge_properties["label_id"]])
        count_edges_ofhead_sametype = len(
            np.where(np.where(edges_of_head == edge[2])[1] == 2)[0])
        edges_of_tail = g.get_in_edges(edge[1],
                                       [g.edge_properties["label_id"]])
        count_edges_oftail_sametype = len(
            np.where(np.where(edges_of_tail == edge[2])[1] == 2)[0])
        excl[g.edge(edge[0], edge[1])] = 1 / (count_edges_ofhead_sametype +
                                              count_edges_oftail_sametype - 1)
        count = count + 1
        if count % 1000 == 0:
            print(count, "exclusivities computed")
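For comparison only (a sketch, not the original code): the per-edge get_out_edges/get_in_edges scans and the edge lookup above can, in principle, be replaced by a single counting pass over the edge array. The toy `edges` array below is a hypothetical stand-in for the rows returned by g.get_edges([g.edge_properties["label_id"]]):

```python
import numpy as np
from collections import Counter

# Hypothetical toy (source, target, label) rows standing in for the
# real edge array of the graph.
edges = np.array([(0, 1, 5), (0, 2, 5), (1, 2, 7), (3, 1, 5)])

# Count once how many out-edges of each source, and in-edges of each
# target, carry each label, instead of re-scanning every neighbourhood
# for every edge.
out_counts = Counter(zip(edges[:, 0], edges[:, 2]))
in_counts = Counter(zip(edges[:, 1], edges[:, 2]))

# Exclusivity of each edge, computed in a single pass:
excl = np.array([1.0 / (out_counts[(s, l)] + in_counts[(t, l)] - 1)
                 for s, t, l in edges])
```

If this matches the intended definition, the resulting values could then be copied into the "Exclusivity" edge property in edge order.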
Thanks,
Ioana

Hello Tiago,
Thanks for your reply!
> Are you really using a 32 bit i386 CPU in 2020?
No, I have 64 bit:
dpkg --print-architecture --> amd64
I think I found the solution:
https://unix.stackexchange.com/questions/272908/apt-looking-for-i386-files-…
Now it works fine. If other users face the same problem, maybe you could
mention the following fix for /etc/apt/sources.list:
deb [ arch=amd64 ] http://downloads.skewed.de/apt DISTRIBUTION main
> Xenial is super old and does not compile graph-tool due to a lack of
> compiler with C++17 support.
It's really a pity that you face problems with the C++ compiler. I think
that Ubuntu Xenial is still widely used. It is supported until April
2021:
https://wiki.ubuntu.com/Releases
(I would upgrade if I could, but the hardware on that computer doesn't
allow it.)
Best regards
Rolf
--
-----------------------------------------------------------------------
Rolf Sander phone: [+49] 6131/305-4610
Max-Planck Institute of Chemistry email: rolf.sander(a)mpic.de
PO Box 3060, 55020 Mainz, Germany homepage: www.rolf-sander.net
-----------------------------------------------------------------------
https://www.encyclopedia-of-geosciences.net
https://www.geoscientific-model-development.net
-----------------------------------------------------------------------

Dear Graph-Tool Community,
I am interested in analysing the hierarchical partitions generated by the nested blockmodel. Specifically, after I have generated a nested SBM, I would like to post-process it and calculate measures such as eigenvector centrality for a given hierarchical node, save this as a property, and then, in visualisation, apply either a size or colormap constraint to said node, weighted by its centrality.
Using the collection data:
g = gt.collection.data["celegansneural"]
state = gt.minimize_nested_blockmodel_dl(g)
I can then ascertain what my levels are with:
l1state = state.levels[1].g
l2state = state.levels[2].g
etc.
I can then calculate the eigenvector centrality of a given hierarchical partition as follows:
ee1, x1 = gt.eigenvector(l1state)
ee2, x2 = gt.eigenvector(l2state)
1) This presumably then needs to be saved as an hvprops(?). But I am unclear how to do this, not least in a way where I know for sure that the correct hierarchical vertices within l1state and l2state align with the generated centrality measures x1 and x2, respectively.
2) Furthermore, if/when that is achieved, how can I use this when drawing, for example to size the level-1 hierarchical vertices according to centrality, or the level-2 vertices by another measure, etc.?
Hugely grateful for any solutions!
James