How are samples drawn for distance_histogram?
When providing a value for "samples" in "distance_histogram", are the samples drawn randomly? That is, can I expect a different set of vertices to be used if I supply the same graph with the same number of samples twice to compute my distance histogram? I am wondering whether I can repeat the computation to get a crude estimate of the uncertainty due to sampling, as opposed to calculating the histogram for all nodes (if the same nodes are used in each run, there is obviously little point in repeating it). I should add that I am using the histogram only to compute the average shortest path length of the network.
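For reference, here is a minimal sketch of turning a distance histogram into an average shortest path length. It assumes the histogram arrives as a sequence of counts plus a sequence of bin edges with unit-width bins starting at 0, so that the lower edge of each bin is the distance itself. The function name `mean_distance` and the example numbers are made up for illustration and are not part of graph-tool.

```python
def mean_distance(counts, bin_edges):
    """Average distance from a histogram (hypothetical helper).

    Assumes unit-width bins starting at 0, so the lower edge of each
    bin equals the integer distance that the bin counts.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty histogram")
    # zip stops at the shorter sequence, so the final upper edge is ignored.
    weighted = sum(c * lower for c, lower in zip(counts, bin_edges))
    return weighted / total


# Made-up histogram: 4 pairs at distance 1, 6 at distance 2, 2 at distance 3.
example_counts = [0, 4, 6, 2]
example_edges = [0, 1, 2, 3, 4]
example_mean = mean_distance(example_counts, example_edges)
print(example_mean)
```

If the bins are wider than 1 or do not start at 0, the lower edge no longer equals the distance and the weighting would need to change.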
On 27.09.2016 15:16, P-M wrote:
When providing a value for "samples" in "distance_histogram" are the samples drawn randomly? Thus, could I expect a different set of vertices being used if I supply the same graph with the same number of samples twice to compute my distance histogram?
Yes, of course. Best, Tiago -- Tiago de Paula Peixoto <tiago@skewed.de>
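The idea of repeating a sampled computation to gauge its sampling uncertainty can be sketched without graph-tool. The code below is a stand-in under stated assumptions, not graph-tool's implementation: it picks source vertices with Python's `random` module, runs a plain breadth-first search from each, and repeats the estimate several times. The helper names (`bfs_distances`, `sampled_mean_distance`, `make_path_graph`) and the toy path graph are hypothetical.

```python
import random
import statistics
from collections import deque


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def sampled_mean_distance(adj, samples, rng):
    """Mean distance from `samples` randomly chosen sources to all other vertices."""
    sources = rng.sample(sorted(adj), samples)  # drawn without replacement
    total = count = 0
    for s in sources:
        for v, d in bfs_distances(adj, s).items():
            if v != s:
                total += d
                count += 1
    return total / count


def make_path_graph(n):
    """Toy graph 0 - 1 - ... - (n-1), stored as an adjacency dict."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


adj = make_path_graph(200)
rng = random.Random(0)  # fixed seed only so that this demo is repeatable
# Each call draws a fresh set of sources from the same generator.
estimates = [sampled_mean_distance(adj, 10, rng) for _ in range(20)]
print("mean of estimates:", statistics.mean(estimates))
print("spread (sample std dev):", statistics.stdev(estimates))
```

In this sketch, re-creating the generator with the same seed before each run reproduces the same sources and therefore the same estimate, while continuing to draw from one generator gives a different sample each time. The spread across runs is the crude uncertainty estimate.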
I am drawing a 3% sample from a network of 12 million vertices. As far as I can tell, though, the histogram and the average path length computed from it are identical between the two runs. If different vertices are sampled in different runs, I would expect to see at least some fluctuation. Or do you think this is an unreasonable assumption? Best, Philipp
-----Original Message-----
From: Tiago de Paula Peixoto [mailto:tiago@skewed.de]
Sent: 29 September 2016 16:00
To: Main discussion list for the graph-tool project <graph-tool@skewed.de>
Subject: Re: [graph-tool] How are samples drawn for distance_histogram?
Yes, of course. Best, Tiago
Please ignore that message, my mistake. The results do indeed differ across runs so everything works fine. Best, Philipp
-----Original Message-----
From: Philipp-Maximilian Jacob [mailto:pmj27@cam.ac.uk]
Sent: 30 September 2016 12:40
To: 'Tiago de Paula Peixoto' <tiago@skewed.de>; 'Main discussion list for the graph-tool project' <graph-tool@skewed.de>
Subject: RE: [graph-tool] How are samples drawn for distance_histogram?
I am drawing a sample of 3% from a network of 12 million vertices. From what I can tell the histogram and the average path length computed for it are identical between the two runs though. [...]
participants (3)
- P-M
- Philipp-Maximilian Jacob
- Tiago de Paula Peixoto