Combining Neo4J and Hadoop (part II)
In the previous post Combining Neo4J and Hadoop (part I) we described the way we combine Hadoop and Neo4J and how we are getting the data into Neo4J.
In this second part we will take you through the journey we took to implement a distributed way to create a Neo4J database. The idea is to use our Hadoop cluster for creating the underlying file structure of a Neo4J database.
To do this we must first understand this file-structure. Luckily Chris Gioran has done a great job describing this structure in his blog Neo4J internal file storage
The description was done for version 1.6 but largely still matches the 1.8 file-structure.
First I’ll start with a small recap of the file-structure.