On 05/21/2013 01:37 PM, VaSa wrote:
>> I am curious what is being used to calculate the standard deviation
>> of the average in gt.vertex_average and gt.edge_average.
These functions return the standard deviation of *the mean*, not the
standard deviation of the distribution. The former is given by

    \sigma_a = \sigma / sqrt(N),

where \sigma is the standard deviation of the distribution, and N is the
number of samples.
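As a plain-Python sketch of the two quantities (this is not graph-tool's
actual implementation, and the helper names are mine):

```python
import math

def std_of_distribution(samples):
    """Population standard deviation sigma (divisor N, not N - 1)."""
    n = len(samples)
    mean = sum(samples) / n
    return math.sqrt(sum((x - mean) ** 2 for x in samples) / n)

def std_of_mean(samples):
    """Standard deviation of the mean: sigma / sqrt(N)."""
    return std_of_distribution(samples) / math.sqrt(len(samples))
```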
>> t2.add_edge(t2.vertex(0), t2.vertex(1))
>> gt.vertex_average(t2, "in")
>> Now, shouldn't the std be σ(n) = sqrt(((0-0.5)^2 + (1-0.5)^2)/2) = 0.5?
The standard deviation of the mean is therefore:
0.5 / sqrt(2) = 0.35355339059327373...
which is what you see.
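Checking that arithmetic in plain Python (just the numbers, no
graph-tool):

```python
import math

degrees = [0, 1]                 # in-degrees of the two vertices in t2
n = len(degrees)
mean = sum(degrees) / n          # 0.5
sigma = math.sqrt(sum((d - mean) ** 2 for d in degrees) / n)  # 0.5
sigma_a = sigma / math.sqrt(n)   # ~0.35355, the value vertex_average reports
```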
>> A slightly bigger graph:
>> t3.add_edge(t3.vertex(0), t3.vertex(1))
>> gt.vertex_average(t3, "in")
>> Now we should have the series 0,1,0,0,0 for the vertex in-degrees.
>> Windows Calculator gives σ(n)=0.4 and σ(n-1)≈0.44721, so where does
>> 0.1788854 come from?
Again, 0.4 / sqrt(5) = 0.17888543819998318...
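The same check for the five-vertex series (again plain Python, numbers
only):

```python
import math

degrees = [0, 1, 0, 0, 0]        # in-degrees of the five vertices in t3
n = len(degrees)
mean = sum(degrees) / n          # 0.2
var_n = sum((d - mean) ** 2 for d in degrees) / n
sigma_n = math.sqrt(var_n)                  # 0.4, the sigma(n) value
sigma_n1 = math.sqrt(var_n * n / (n - 1))   # ~0.44721, the sigma(n-1) value
sigma_a = sigma_n / math.sqrt(n)            # ~0.17889, what is reported
```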
>> The reason I am asking is that I have a large graph where the average
>> looks quite alright, but the std makes no sense: going by the
>> histogram, the degree values are spread quite a bit more widely than
>> the std would indicate.
If you want the standard deviation of the distribution, to compare with
the histogram, just multiply the reported value by sqrt(N).
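For the five-vertex example above, recovering the distribution's
deviation from the reported value looks like this (a sketch; the
constant is just the value printed earlier):

```python
import math

n = 5                            # number of vertices
sigma_a = 0.17888543819998318    # std of the mean, as returned
sigma = sigma_a * math.sqrt(n)   # ~0.4, std of the distribution
```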
Tiago de Paula Peixoto <tiago(a)skewed.de>