Writing a custom binary format (.gt file) in Spark
I've got a dataframe inside of Apache Spark which I'd like to output as a .gt file. The dataframe is a table of edges between vertices like so:

0 9
0 8
9 1

I read the guide on the custom gt file format, but how do I actually write one for my edges dataframe, and how would I do that inside Spark? Any ideas/examples you could point me to? Any help is appreciated, thank you!
On 25.05.2017 19:32, B wrote:
I've got a dataframe inside of Apache Spark which I'd like to output as a .gt file.
The dataframe is a table of edges between vertices like so:
0 9
0 8
9 1
I read the guide on the custom gt file format, but how do I actually write one for my edges dataframe, and how would I do that inside Spark? Any ideas/examples you could point me to?
The format is documented here:

https://graph-tool.skewed.de/static/doc/gt_format.html

But if you have a list of edges, you can save it as a binary numpy array, and load the graph with Graph.add_edge_list(). It should be of comparable speed.

Best,
Tiago

--
Tiago de Paula Peixoto <tiago@skewed.de>
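
The numpy route suggested above could look roughly like the sketch below. It assumes the edge table fits in driver memory and has already been pulled out of Spark as (source, target) pairs, for example with `df.toPandas().to_numpy()`; that step is an assumption about the setup and is not shown in this thread. The helper names `save_edges` and `build_graph` and the file names are hypothetical. `build_graph` needs graph-tool installed and uses `Graph.save()`, which picks the .gt format from the file extension.

```python
import os
import tempfile

import numpy as np


def save_edges(edges, path):
    """Write (source, target) pairs to a binary .npy file and return the array."""
    arr = np.asarray(edges, dtype=np.int64)
    if arr.ndim != 2 or arr.shape[1] != 2:
        raise ValueError("expected an array of (source, target) pairs")
    np.save(path, arr)
    return arr


def build_graph(npy_path, gt_path=None):
    """Load the saved array into a graph; optionally write it out as a .gt file."""
    # Imported lazily so the numpy part works without graph-tool installed.
    from graph_tool import Graph

    g = Graph(directed=True)
    g.add_edge_list(np.load(npy_path))
    if gt_path is not None:
        g.save(gt_path)  # format inferred from the ".gt" extension
    return g


# Hypothetical stand-in for the rows collected from the Spark dataframe.
edges = [(0, 9), (0, 8), (9, 1)]

npy_path = os.path.join(tempfile.mkdtemp(), "edges.npy")
save_edges(edges, npy_path)
loaded = np.load(npy_path)
```

On a machine with graph-tool, `build_graph(npy_path, "edges.gt")` would then produce the .gt file, so the binary format never has to be written by hand from inside Spark.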
participants (2): B, Tiago de Paula Peixoto