Combining Neo4J and Hadoop (part II)

In the previous post, Combining Neo4J and Hadoop (part I), we described how we combine Hadoop and Neo4J and how we get the data into Neo4J.

In this second part we will take you through the journey we took to implement a distributed way of creating a Neo4J database. The idea is to use our Hadoop cluster to create the underlying file structure of a Neo4J database.
To do this we must first understand this file structure. Luckily, Chris Gioran has done a great job describing it in his blog post Neo4J internal file storage.
The description was written for version 1.6 but still largely matches the 1.8 file structure.
I’ll start with a small recap of the file structure.

Read more →

Combining Neo4J and Hadoop (part I)

Why combine these two different things?
Hadoop is good for data crunching, but the end results in flat files don’t present well to the customer. It’s also hard to visualize your network data in Excel.

Neo4J is perfect for working with our networked data. We use it a lot when visualizing our different sets of data.
So we prepare our dataset with Hadoop and import it into Neo4J, the graph database, to be able to query and visualize the data.
We want to look at our dataset in a lot of different ways, so every few days we tend to create a new extract of the data with some new properties to look at.

This blog is about how we combined Hadoop and Neo4J and describes the phases we went through in our search for the optimal solution.
Read more →

Installing a nodejs application without your good old internet

While we were building a little server to enable audit logging on our Hadoop cluster (more on that in a future blog post), we needed a way to distribute our application.
This blog is about the packaging of this application. The application is built with Node.js, and packaging and dependency management are mostly done with npm (the node package manager).

Of course installing this application in the production environment should have been as easy as the setup on our own laptops, right? Wrong! On our laptops it was an easy git clone followed by an npm install, and voila, we had a running application. So how hard could it be to do this on a server at the client? Let me tell you…
Read more →

How the quest for transaction timeouts cost me money

On our project the focus is on making the application stable and controllable. So instead of building cool new features
we are spending our time making sure the application runs stably in the production environment.

After the first few issues the so-called ‘Transaction timeout’ issue reared its ugly head.
Every now and then the application threw an exception due to a transaction timeout.
This was very strange since the timeout was set to 30 seconds and the complete processing of the whole
application was done in less than 2 seconds (spread over more than 1 transaction).
Read more →

Improving web application performance by parallelizing requests

A web application I develop had a performance problem. After a small investigation we found that it was related to the number of requests made to the server.

The application runs in a browser (currently IE7), and browsers are generally limited to no more than 2 parallel requests to the same domain (this has improved a bit in later browser versions). In this post I will describe the quest for solutions.

Read more →

Spring JMS and WebSphere

Using Spring JMS in our application, which needs to run on WebSphere, proved to be somewhat of a challenge. Since googling turned up a lot of information but only a small, easy-to-miss piece of text that puts the pieces together, I decided to write up this blog.

Read more →

Open Source GIS experiences

After being away from the GIS world for a while, I started working on a new project replacing the currently used software with an open source alternative. The first small application that needed to be made was for an emergency phone call center, to show the position of the caller on a map. After that, a few prototypes had to prove that it was feasible to replace the current software stack with open source alternatives.

In this blog I will describe the tools used, a few of the problems I ran into and, of course, the solutions to those problems, which involve coding and communication 😉

The tools used were a Java-based server called GeoServer and a client-side JavaScript library called OpenLayers.

Read more →

Devoxx Antwerp 2008 – Impressions

University

Monday the 8th of December 2008 was the start of a week full of information. After attending the complete conference (including the University sessions) last year, I felt it would be a good thing to do the same this year.

The university sessions give me a chance to get more in-depth knowledge on some of the subjects. For this first day I had chosen the sessions on Scala and Java Power Tools.

Scala

The session about Scala got me really interested in this (for me) new language. The combination of Object Oriented and Functional programming, the tight integration with Java (in the end it’s all Java bytecode) and the conciseness make it worth my while to have a closer look. As Ted Neward mentioned in his talk: ‘Today start with Scala to experiment and prototype, so next year you’ll have the advantage of Scala knowledge to be able to use it in production systems.’

Read more →