Question about output of `get_edges_prob` for spurious edges
Dear Tiago, I am trying to understand the output of get_edges_prob when using it for spurious edges. What value is the routine returning? Am I getting the "likelihood" of the edge being spurious or the "likelihood" of it "actually" existing? Say, if I have two edges I am assuming to be spurious, "a" and "b", and edge "a" scores a higher likelihood ratio value than edge "b" (having used `s.get_edges_prob([],[a], entropy_args=dict(partition_dl=False))` and `s.get_edges_prob([],[b], entropy_args=dict(partition_dl=False))`). Is "a" more likely to be a spurious edge than "b" or is it the other way round (i.e. "a" being more likely to indeed exist than "b")? Best, Philipp
Hi, If anybody does have any thoughts on this I would be very grateful. With best wishes, Philipp
On 01.09.2017 17:26, Philipp-Maximilian Jacob wrote:
I am trying to understand the output of get_edges_prob when using it for spurious edges. What value is the routine returning? Am I getting the “likelihood” of the edge being spurious or the “likelihood” of it “actually” existing?
The former, i.e. the probability that the graph with the _removed_ edge is generated by the inferred model, and hence that the removed edge does not belong. -- Tiago de Paula Peixoto <tiago@skewed.de>
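For readers following along, here is a minimal sketch of the two kinds of call discussed in this thread. It assumes `state` is a fitted graph-tool block model state exposing `get_edges_prob(missing, spurious, entropy_args)` as used above, and that edges are given as `(source, target)` tuples. The helper name `edge_log_prob` and its `kind` argument are made up for illustration and are not part of graph-tool.

```python
def edge_log_prob(state, edge, kind):
    """Score one hypothesised modification of the observed graph.

    kind="missing":  `edge` is absent from the graph and would be added.
    kind="spurious": `edge` is present in the graph and would be removed.

    The returned value is whatever `state.get_edges_prob` returns, which
    graph-tool documents as a log-probability. For a spurious edge, a
    higher value means the graph *without* the edge is better explained
    by the inferred model, i.e. the edge is more likely to be spurious.
    """
    if kind == "missing":
        missing, spurious = [edge], []
    elif kind == "spurious":
        missing, spurious = [], [edge]
    else:
        raise ValueError("kind must be 'missing' or 'spurious'")
    # Same entropy_args as in the calls quoted in this thread.
    return state.get_edges_prob(missing, spurious,
                                entropy_args=dict(partition_dl=False))


# Hypothetical usage (requires graph-tool and a graph `g` of your own):
#   import graph_tool.all as gt
#   state = gt.minimize_blockmodel_dl(g)
#   lp_a = edge_log_prob(state, a, "spurious")
#   lp_c = edge_log_prob(state, c, "missing")
```

The helper only wraps the call so that the roles of the two list arguments are explicit; it does not change what is computed.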
Hi Tiago, Thank you for that explanation. A quick follow-up: If I calculate likelihoods for both missing and spurious edges, can I expect the outputs to lie on a single, comparable scale? Assume there is an edge “c” which I am assuming to be a missing edge, and I calculate the likelihood ratios by normalising over all three edges (based on `s.get_edges_prob([],[a], entropy_args=dict(partition_dl=False))`, `s.get_edges_prob([],[b], entropy_args=dict(partition_dl=False))` and `s.get_edges_prob([c],[], entropy_args=dict(partition_dl=False))`). If I find \lambda_a > \lambda_c > \lambda_b, can I read this as “a” being more likely to be spurious than “c” is to be missing (which in turn is more likely to be missing than “b” is to be spurious)? Or is such a comparison not really meaningful anyway? Best, Philipp
On 12.09.2017 13:01, Philipp-Maximilian Jacob wrote:
Hi Tiago,
Thank you for that explanation. A quick follow-up:
If I calculate likelihoods for both missing and spurious edges, can I expect the outputs to lie on a single, comparable scale? Assume there is an edge “c” which I am assuming to be a missing edge, and I calculate the likelihood ratios by normalising over all three edges (based on `s.get_edges_prob([],[a], entropy_args=dict(partition_dl=False))`, `s.get_edges_prob([],[b], entropy_args=dict(partition_dl=False))` and `s.get_edges_prob([c],[], entropy_args=dict(partition_dl=False))`). If I find \lambda_a > \lambda_c > \lambda_b, can I read this as “a” being more likely to be spurious than “c” is to be missing (which in turn is more likely to be missing than “b” is to be spurious)? Or is such a comparison not really meaningful anyway?
Yes, this is totally fine. The "spurious" and "missing" edges are arbitrary modifications to the graph, and the probabilistic model does not distinguish between them. Best, Tiago -- Tiago de Paula Peixoto <tiago@skewed.de>
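To illustrate the comparison asked about above, here is a small sketch of turning the three log-probabilities into normalised likelihood ratios \lambda_i = p_i / \sum_j p_j and ranking them. The numeric values are invented purely for illustration; in practice they would come from the three `get_edges_prob` calls quoted in the question. The function name `likelihood_ratios` is likewise hypothetical.

```python
import math


def likelihood_ratios(log_probs):
    """Normalise {label: log-probability} into ratios that sum to 1.

    The maximum is subtracted before exponentiating so that very
    negative log-probabilities do not underflow to zero.
    """
    m = max(log_probs.values())
    weights = {k: math.exp(v - m) for k, v in log_probs.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Made-up log-probabilities for edges a and b (scored as spurious)
# and c (scored as missing); NOT output from a real model.
log_probs = {"a": -10.2, "c": -11.0, "b": -12.5}

ratios = likelihood_ratios(log_probs)

# Because the model treats all three as modifications of the same graph,
# the ratios can be ranked together regardless of missing/spurious.
ranking = sorted(ratios, key=ratios.get, reverse=True)
print(ranking)
```

Since normalisation preserves order, ranking the raw log-probabilities gives the same ordering; the ratios are mainly useful for reading the relative weights.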