In the previous post, Combining Neo4J and Hadoop (part I), we described how we combine Hadoop and Neo4J and how we get the data into Neo4J.
In this second part we take you through the journey of implementing a distributed way to create a Neo4J database. The idea is to use our Hadoop cluster to create the underlying file structure of a Neo4J database.
To do this we must first understand that file structure. Luckily, Chris Gioran has done a great job describing it in his blog post Neo4J internal file storage.
His description was written for version 1.6, but it still largely matches the 1.8 file structure.
I'll start with a small recap of the file structure.
01 ff ff ff ff ff ff ff ff // root node, no relationships, no properties
01 00 00 00 00 00 00 00 01 // node 1, first relationship 0, first property 1
01 00 00 00 02 00 00 00 04 // node 2, first relationship 2, first property 4
4e 6f 64 65 53 74 6f 72 65 20 76 30 2e 41 2e 30 // NodeStore v0.A.0
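To make the node store dump above a bit more tangible, here is a minimal Python sketch that decodes those records. The layout (one in-use byte followed by two 4-byte big-endian ids) is read off the commented bytes above; the names `NODE_RECORD`, `NO_ID` and `decode_node` are my own labels and not Neo4J identifiers.

```python
import struct

# Assumed layout, read off the dump above: 1 in-use byte, then two
# 4-byte big-endian ids (first relationship, first property).
NODE_RECORD = struct.Struct(">BII")
NO_ID = 0xFFFFFFFF  # hypothetical name for the "ff ff ff ff" marker


def decode_node(hex_bytes):
    """Decode one 9-byte node record given as a hex string."""
    in_use, first_rel, first_prop = NODE_RECORD.unpack(bytes.fromhex(hex_bytes))
    return {
        "in_use": in_use == 1,
        "first_rel": None if first_rel == NO_ID else first_rel,
        "first_prop": None if first_prop == NO_ID else first_prop,
    }


records = [
    "01 ff ff ff ff ff ff ff ff",
    "01 00 00 00 00 00 00 00 01",
    "01 00 00 00 02 00 00 00 04",
]
decoded = [decode_node(r) for r in records]
for node_id, node in enumerate(decoded):
    print(node_id, node)

# The last line of the dump is plain ASCII: the store type and version.
footer = bytes.fromhex("4e 6f 64 65 53 74 6f 72 65 20 76 30 2e 41 2e 30").decode("ascii")
print(footer)
```

Because every record has the same size, the position of a record in the file is simply its node number times nine, which is what makes it feasible to generate the file outside of Neo4J.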
01 00 00 00 02 00 00 00 01 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff // relationship 1, from node 2, to node 1, type 0, no prev, no next,
01 00 00 00 04 00 00 00 03 00 00 00 01 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 00 00 06
01 00 00 00 06 00 00 00 05 00 00 00 01 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 00 00 0a
52 65 6c 61 74 69 6f 6e 73 68 69 70 53 74 6f 72 65 20 76 30 2e 41 2e 30
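The relationship records above can be decoded in the same way. This is only a sketch: the 33-byte size and the first three fields follow from the commented record above, but the exact order of the four chain pointers and the trailing property pointer is my assumption, and all names in the code are illustrative.

```python
import struct

# Assumed layout (33 bytes, big-endian): in-use byte, then eight 4-byte
# fields. The order of the four chain pointers is an assumption.
REL_RECORD = struct.Struct(">B8I")
FIELDS = ("first_node", "second_node", "rel_type",
          "first_prev", "first_next", "second_prev", "second_next",
          "next_prop")
NO_ID = 0xFFFFFFFF  # hypothetical name for the "ff ff ff ff" marker


def decode_relationship(hex_bytes):
    """Decode one 33-byte relationship record given as a hex string."""
    in_use, *values = REL_RECORD.unpack(bytes.fromhex(hex_bytes))
    record = {"in_use": in_use == 1}
    for name, value in zip(FIELDS, values):
        record[name] = None if value == NO_ID else value
    return record


first = decode_relationship(
    "01 00 00 00 02 00 00 00 01 00 00 00 00 "
    "ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff"
)
second = decode_relationship(
    "01 00 00 00 04 00 00 00 03 00 00 00 01 "
    "ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00 00 00 06"
)
print(first)
print(second)
```

The four chain pointers are what the rest of this post is about: each relationship sits in two linked lists, one for each of its end nodes.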
RowNum NodeId Property1 Property2 PropertyN
0 AAA nameOfA amountOfA someAThing
1 BBB nameOfB amountOfB someBThing
2 CCC nameOfC amountOfC someCThing
3 DDD nameOfD amountOfD someDThing
RowNum fromNodeId ToNodeId EdgeProperty1 EdgePropertyN
0 AAA BBB someDate1 someNumber1
1 AAA DDD someDate2 someNumber2
2 BBB DDD someDate3 someNumber3
3 CCC BBB someDate4 someNumber4
4 DDD BBB someDate5 someNumber5
5 DDD CCC someDate6 someNumber6
nodeNum nodeId relNum fromNodeId ToNodeId fromNodeNum
0 AAA 0 AAA BBB 0
0 AAA 1 AAA DDD 0
1 BBB 2 BBB DDD 1
2 CCC 3 CCC BBB 2
3 DDD 4 DDD BBB 3
3 DDD 5 DDD CCC 3
nodeNum nodeId relNum fromNodeId ToNodeId toNodeNum
1 BBB 0 AAA BBB 1
3 DDD 1 AAA DDD 3
3 DDD 2 BBB DDD 3
1 BBB 3 CCC BBB 1
1 BBB 4 DDD BBB 1
2 CCC 5 DDD CCC 2
nodeNum nodeId relNum fromNodeId ToNodeId fromNodeNum toNodeNum
0 AAA 0 AAA BBB 0 1
0 AAA 1 AAA DDD 0 3
1 BBB 2 BBB DDD 1 3
2 CCC 3 CCC BBB 2 1
3 DDD 4 DDD BBB 3 1
3 DDD 5 DDD CCC 3 2
1 BBB 0 AAA BBB 0 1
3 DDD 1 AAA DDD 0 3
3 DDD 2 BBB DDD 1 3
1 BBB 3 CCC BBB 2 1
1 BBB 4 DDD BBB 3 1
2 CCC 5 DDD CCC 3 2
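The joins that lead to the combined table above can be imitated in memory. This is a small Python stand-in, not the Hadoop job itself: a dictionary lookup replaces the join on the node id, and `rows_for` is a hypothetical helper name.

```python
nodes = ["AAA", "BBB", "CCC", "DDD"]  # row number = nodeNum
edges = [("AAA", "BBB"), ("AAA", "DDD"), ("BBB", "DDD"),
         ("CCC", "BBB"), ("DDD", "BBB"), ("DDD", "CCC")]  # row number = relNum

# Lookup table that plays the role of the join on nodeId.
node_num = {node_id: num for num, node_id in enumerate(nodes)}


def rows_for(side):
    """One row per edge, keyed by the node on the given side (0 = from, 1 = to)."""
    rows = []
    for rel_num, edge in enumerate(edges):
        node_id = edge[side]
        rows.append((node_num[node_id], node_id, rel_num,
                     edge[0], edge[1],
                     node_num[edge[0]], node_num[edge[1]]))
    return rows


# Every relationship now appears twice: once for each of its end nodes.
combined = rows_for(0) + rows_for(1)
for row in combined:
    print(*row)
```

On a real cluster the node table would not fit in a dictionary, which is why the tables above do this in two separate joins.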
nodeNum nodeId relNum fromNodeId ToNodeId fromNodeNum toNodeNum
0 AAA 1 AAA DDD 0 3
0 AAA 0 AAA BBB 0 1
1 BBB 4 DDD BBB 3 1
1 BBB 3 CCC BBB 2 1
1 BBB 2 BBB DDD 1 3
1 BBB 0 AAA BBB 0 1
2 CCC 5 DDD CCC 3 2
2 CCC 3 CCC BBB 2 1
3 DDD 5 DDD CCC 3 2
3 DDD 4 DDD BBB 3 1
3 DDD 2 BBB DDD 1 3
3 DDD 1 AAA DDD 0 3
nodeNum nodeId relNum fromNodeId ToNodeId fromNodeNum toNodeNum
0 AAA 1 AAA DDD 0 3
1 BBB 4 DDD BBB 3 1
2 CCC 5 DDD CCC 3 2
3 DDD 5 DDD CCC 3 2
inUse firstRelNum firstPropNum
1 1 0
1 4 4
1 5 8
1 5 12
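The sorting step and the small node table above can be sketched as follows. I read the three columns of that table as in-use flag, first relationship and first property, which is my interpretation based on the node record layout. The property column is copied as-is, since how those ids are derived is not shown here.

```python
import struct
from itertools import groupby

# (nodeNum, relNum) pairs taken from the combined table
pairs = [(0, 0), (0, 1), (1, 2), (2, 3), (3, 4), (3, 5),
         (1, 0), (3, 1), (3, 2), (1, 3), (1, 4), (2, 5)]

# Sort by node number ascending, relationship number descending.
ordered = sorted(pairs, key=lambda p: (p[0], -p[1]))

# The first row of each node group carries that node's first relationship.
first_rel = {node: next(group)[1]
             for node, group in groupby(ordered, key=lambda p: p[0])}

# Copied from the table; the derivation of these ids is not part of this sketch.
first_prop = [0, 4, 8, 12]

node_rows = [(1, first_rel[n], first_prop[n]) for n in sorted(first_rel)]

# Nine bytes per node, written in node number order.
node_store = b"".join(struct.pack(">BII", *row) for row in node_rows)
print(node_rows)
print(node_store.hex(" "))
```

Writing the records in node number order matters, because the position in the file is the only place where the node number is stored.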
nodeNum relNum fromNodeNum toNodeNum
0 1 0 3
0 0 0 1
1 4 3 1
1 3 2 1
1 2 1 3
1 0 0 1
2 5 3 2
2 3 2 1
3 5 3 2
3 4 3 1
3 2 1 3
3 1 0 3
nodeNum relNum fromNodeNum toNodeNum next previous
0 1 0 3 0 x
0 0 0 1 x 1
1 4 3 1 3 x
1 3 2 1 2 4
1 2 1 3 0 3
1 0 0 1 x 2
2 5 3 2 3 x
2 3 2 1 x 5
3 5 3 2 4 x
3 4 3 1 2 5
3 2 1 3 1 4
3 1 0 3 x 2
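The next and previous columns above follow directly from the order of the rows within each node. A minimal sketch, with `None` playing the role of the `x` entries and `link_chains` being a hypothetical helper name:

```python
from itertools import groupby

# (nodeNum, relNum, fromNodeNum, toNodeNum), already sorted per node
rows = [(0, 1, 0, 3), (0, 0, 0, 1),
        (1, 4, 3, 1), (1, 3, 2, 1), (1, 2, 1, 3), (1, 0, 0, 1),
        (2, 5, 3, 2), (2, 3, 2, 1),
        (3, 5, 3, 2), (3, 4, 3, 1), (3, 2, 1, 3), (3, 1, 0, 3)]


def link_chains(sorted_rows):
    """Append (next, previous) relNums to each row; None stands for 'x'."""
    linked = []
    for _, group in groupby(sorted_rows, key=lambda r: r[0]):
        chain = list(group)
        for i, row in enumerate(chain):
            nxt = chain[i + 1][1] if i + 1 < len(chain) else None
            prev = chain[i - 1][1] if i > 0 else None
            linked.append(row + (nxt, prev))
    return linked


linked = link_chains(rows)
for row in linked:
    print(*("x" if v is None else v for v in row))
```

In a MapReduce setting this would presumably run in a reducer that receives all rows of one node in sorted order, but that mapping is my assumption and not something the tables show.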
nodeNum relNum fromNodeNum toNodeNum next previous nodeNum2 relNum2 fromNodeNum2 toNodeNum2 next2 previous2
0 1 0 3 0 x 0 1 0 3 0 x
0 1 0 3 0 x 3 1 0 3 x 2
0 0 0 1 x 1 0 0 0 1 x 1
0 0 0 1 x 1 1 0 0 1 x 2
1 4 3 1 3 x 1 4 3 1 3 x
1 4 3 1 3 x 3 4 3 1 2 5
1 3 2 1 2 4 1 3 2 1 2 4
1 3 2 1 2 4 2 3 2 1 x 5
1 2 1 3 0 3 1 2 1 3 0 3
1 2 1 3 0 3 3 2 1 3 1 4
1 0 0 1 x 2 1 0 0 1 x 2
1 0 0 1 x 2 0 0 0 1 x 1
2 5 3 2 3 x 2 5 3 2 3 x
2 5 3 2 3 x 3 5 3 2 4 x
2 3 2 1 x 5 2 3 2 1 x 5
2 3 2 1 x 5 1 3 2 1 2 4
3 5 3 2 4 x 3 5 3 2 4 x
3 5 3 2 4 x 2 5 3 2 3 x
3 4 3 1 2 5 3 4 3 1 2 5
3 4 3 1 2 5 1 4 3 1 3 x
3 2 1 3 1 4 3 2 1 3 1 4
3 2 1 3 1 4 1 2 1 3 0 3
3 1 0 3 x 2 3 1 0 3 x 2
3 1 0 3 x 2 0 1 0 3 0 x
nodeNum relNum fromNodeNum toNodeNum next previous nodeNum2 relNum2 fromNodeNum2 toNodeNum2 next2 previous2
0 1 0 3 0 x 3 1 0 3 x 2
0 0 0 1 x 1 1 0 0 1 x 2
1 4 3 1 3 x 3 4 3 1 2 5
1 3 2 1 2 4 2 3 2 1 x 5
1 2 1 3 0 3 3 2 1 3 1 4
1 0 0 1 x 2 0 0 0 1 x 1
2 5 3 2 3 x 3 5 3 2 4 x
2 3 2 1 x 5 1 3 2 1 2 4
3 5 3 2 4 x 2 5 3 2 3 x
3 4 3 1 2 5 1 4 3 1 3 x
3 2 1 3 1 4 1 2 1 3 0 3
3 1 0 3 x 2 0 1 0 3 0 x
nodeNum relNum fromNodeNum toNodeNum next previous nodeNum2 relNum2 fromNodeNum2 toNodeNum2 next2 previous2
0 1 0 3 0 x 3 1 0 3 x 2
0 0 0 1 x 1 1 0 0 1 x 2
1 2 1 3 0 3 3 2 1 3 1 4
2 3 2 1 x 5 1 3 2 1 2 4
3 5 3 2 4 x 2 5 3 2 3 x
3 4 3 1 2 5 1 4 3 1 3 x
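The three tables above (self-join, dropping the rows joined with themselves, keeping the start-node side first) can be reproduced with a few list comprehensions. This is an in-memory sketch under the assumption that there are no relationships from a node to itself. The row order of the intermediate join differs slightly from the table, which does not affect the result.

```python
# (nodeNum, relNum, fromNodeNum, toNodeNum, next, previous); None stands for 'x'
linked = [(0, 1, 0, 3, 0, None), (0, 0, 0, 1, None, 1),
          (1, 4, 3, 1, 3, None), (1, 3, 2, 1, 2, 4),
          (1, 2, 1, 3, 0, 3), (1, 0, 0, 1, None, 2),
          (2, 5, 3, 2, 3, None), (2, 3, 2, 1, None, 5),
          (3, 5, 3, 2, 4, None), (3, 4, 3, 1, 2, 5),
          (3, 2, 1, 3, 1, 4), (3, 1, 0, 3, None, 2)]

# Self-join on relNum: every row meets every row of the same relationship.
joined = [a + b for a in linked for b in linked if a[1] == b[1]]

# Drop the pairs where a row was joined with itself (nodeNum == nodeNum2).
both_ends = [r for r in joined if r[0] != r[6]]

# Keep the pair whose left half belongs to the start node of the relationship.
from_first = [r for r in both_ends if r[0] == r[2]]

print(len(joined), len(both_ends), len(from_first))
for row in from_first:
    print(*("x" if v is None else v for v in row))
```

After the last filter there is exactly one row per relationship, with the chain pointers of the start node on the left and those of the end node on the right.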
relNum fromNodeNum toNodeNum fromnext fromprevious tonext toprevious
0 0 1 x 1 x 2
1 0 3 0 x x 2
2 1 3 0 3 1 4
3 2 1 x 5 2 4
4 3 1 2 5 3 x
5 3 2 4 x 3 x
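The final table above is a projection of the six remaining joined rows, sorted by relationship number. The sketch below does that projection and then packs each row into a 33-byte record. The field order inside the record, the relationship type `0` and the empty property pointer are placeholders of mine, since none of them can be read from the table.

```python
import struct

NO_ID = 0xFFFFFFFF  # stands in for the 'x' entries
X = None

# nodeNum relNum from to next previous | nodeNum2 relNum2 from2 to2 next2 previous2
from_first = [
    (0, 1, 0, 3, 0, X, 3, 1, 0, 3, X, 2),
    (0, 0, 0, 1, X, 1, 1, 0, 0, 1, X, 2),
    (1, 2, 1, 3, 0, 3, 3, 2, 1, 3, 1, 4),
    (2, 3, 2, 1, X, 5, 1, 3, 2, 1, 2, 4),
    (3, 5, 3, 2, 4, X, 2, 5, 3, 2, 3, X),
    (3, 4, 3, 1, 2, 5, 1, 4, 3, 1, 3, X),
]

# relNum, fromNodeNum, toNodeNum, fromnext, fromprevious, tonext, toprevious
rel_rows = sorted((r[1], r[2], r[3], r[4], r[5], r[10], r[11]) for r in from_first)


def pack_relationship(row, rel_type=0, next_prop=None):
    """Pack one row as a 33-byte record (field order is an assumption)."""
    _, frm, to, from_next, from_prev, to_next, to_prev = row
    fields = (frm, to, rel_type, from_prev, from_next, to_prev, to_next, next_prop)
    return struct.pack(">B8I", 1, *(NO_ID if f is None else f for f in fields))


# Records are written in relNum order, so the file position encodes the id.
store = b"".join(pack_relationship(r) for r in rel_rows)
print(len(store), "bytes")
print(store[:33].hex(" "))
```

A real store file also ends with the type and version string shown in the dump at the start of this post, which this sketch leaves out.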