Lately I have been busy with graph databases. After reading the free eBook “Graph Databases” I installed Neo4j and played around with it. Later I went as far as to follow the introduction course as well as the advanced graph modeling course at Xebia. This really helped me work with Neo4j in a more structured manner than I had been doing before the course.
I can recommend installing Neo4j and just starting to use it, as it has a great user interface with excellent help manuals. For instance, this is the start screen:
Easy, right?
One of the things that struck me was the ease with which you can access the data from ECMAScript (or JavaScript if you’re very old and soon-to-be obsoleted). Using the REST API you can access the graph in several ways, reading and writing data from and to the database. It’s the standard interface, actually. There’s a whole section in the Neo4j help dedicated to using the REST API, so I’ll leave most of it alone for now.
What’s important is that you can also fire Cypher queries at the database, receiving an answer in either JSON or XML notation, or even as an HTML page. This matters because Cypher queries are *very* easy to write and understand. As an example, the following queries search the sample Movies database that ships with Neo4j, which contains movies and actors.
Suppose we want to show all nodes that are of type Movie. Then the statement would be:
MATCH (m:Movie) RETURN m
A standard query to discover what’s in the database is
MATCH (m) RETURN m LIMIT 100
This is limited to 100 items (nodes and/or relationships), because the query would otherwise return the entire database. The visualization in the user interface is gorgeous, but it becomes sluggish when your result sets get big. Here’s how it looks:
Very nice. But not that useful if we want a particular piece of data. However, if we want to show only the actors that played in movies, we could say:
MATCH (p:Person)-[n:ACTED_IN]->(m:Movie) RETURN p
This returns all nodes of type Person that are related to a node of type Movie through an edge of type ACTED_IN.
While I won’t go into more detail on Cypher, let’s just say it is a very powerful abstraction layer for queries on graphs that would be very hard to do with SQL. It’s not as performant as giving Neo4j explicit commands using the REST API, which you want to do if you build an application where sub-second performance is an issue, but for most day-to-day queries it’s pretty awesome.
So how do we use the REST API? That’s pretty easy, actually. There are two options, and one of them, the old Cypher endpoint, is now deprecated. So we use the new transactional endpoint:
http://localhost:7474/db/data/transaction/commit
This endpoint starts a transaction and immediately commits it. And yes, you can delete and create nodes through this endpoint as well, so it’s highly recommended to not expose the database to the internet, unless you don’t mind everyone using your server as a public litterbox.
You have to POST requests to this endpoint. There are endpoints you can access with GET, like
http://localhost:7474/db/data/node/1
which returns the node with id=1 as an HTML page, but the transactional endpoint is accessed using POST.
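The request for the transactional endpoint can be described without sending anything. The sketch below builds the URL, method, headers and body for a single statement. The helper name buildCommitRequest and its default base URL are assumptions made for illustration; they are not part of the Neo4j API.

```javascript
// Hypothetical helper: describes a POST to the transactional commit endpoint.
// Nothing is sent here; the returned object can be handed to $.ajax or a similar call.
function buildCommitRequest(statement, baseUrl) {
  // Assumption: a local server on the default port, as used in this article.
  var root = baseUrl || 'http://localhost:7474';
  return {
    url: root + '/db/data/transaction/commit',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Accept': 'application/json; charset=UTF-8'
    },
    // The endpoint expects a list of statements, even when there is only one.
    body: JSON.stringify({ statements: [{ statement: statement }] })
  };
}

var request = buildCommitRequest('MATCH (m:Movie) RETURN m');
console.log(request.method + ' ' + request.url);
console.log(request.body);
```

Keeping the request description separate from the actual call makes it easy to inspect what will be sent before pointing it at a real database.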
The easiest way to use a REST API is to start a simple webserver, create a simple HTML page, and add JavaScript to it that responds to user input and calls the Neo4j REST API.
Since we’re going to use JavaScript, be smart and use jQuery as well. It’s pretty much a standard include.
How to proceed:
- First, start the Neo4j software. This opens a screen where you can start the database server, and in the bottom left of the screen you can see a button labeled “Options…”. Click that, then click the “Edit…” button in the Server Configuration section. Disable authentication for now (and make very sure you don’t do this on a server connected to the internet) by changing the setting to the following:
# Require (or disable the requirement of) auth to access Neo4j
dbms.security.auth_enabled=false
This makes sure we don’t have the hassle of authentication for now. Don’t do this on a connected server though.
- Now, we start the Neo4j database. Otherwise we get strange errors.
- Then, proceed to build a new HTML-page (I suggest index.html) on your webserver, that looks like this:
<html>
<head>
  <title>Cypher-test</title>
  <script src="scripts/jquery-2.1.3.js"></script>
</head>
<body>
  <script type="text/javascript">
    function post_cypherquery() {
      $('#messageArea').html('<h3>(loading)</h3>');
      $.ajax({
        url: "http://localhost:7474/db/data/transaction/commit",
        type: 'POST',
        data: JSON.stringify({ "statements": [{ "statement": $('#cypher-in').val() }] }),
        contentType: 'application/json',
        // jQuery has no "accept" option; the Accept header is set through "headers".
        headers: { Accept: 'application/json; charset=UTF-8' }
      }).done(function (data) {
        // Cypher errors arrive in the body of a successful HTTP response, so check for them first.
        if (data.errors && data.errors.length > 0) {
          $('#messageArea').text(data.errors[0].code + ' : ' + data.errors[0].message);
          return;
        }
        $('#messageArea').empty();
        $('#resultsArea').text(JSON.stringify(data));
        /* process data */
        // data.results[0].data holds the records; each record has a "row" array with one value per returned column.
        var htmlString = '<table><tr><td>Columns:</td><td>' + data.results[0].columns + '</td></tr>';
        $.each(data.results[0].data, function (k, v) {
          $.each(v.row, function (k2, v2) {
            htmlString += '<tr>';
            $.each(v2, function (property, nodeval) {
              htmlString += '<td>' + property + ':</td><td>' + nodeval + '</td>';
            });
            htmlString += '</tr>';
          });
        });
        $('#outputArea').html(htmlString + '</table>');
      })
      .fail(function (jqXHR, textStatus, errorThrown) {
        $('#messageArea').html('<h3>' + textStatus + ' : ' + errorThrown + '</h3>');
      });
    }
  </script>
  <h1>Cypher-test</h1>
  <p>
  <div id="messageArea"></div>
  <div id="resultsArea"></div>
  <p>
  <table>
    <tr>
      <td><input name="cypher" id="cypher-in" value="MATCH (n) RETURN n LIMIT 10" /></td>
      <td><button name="post cypher" onclick="post_cypherquery();">execute</button></td>
    </tr>
  </table>
  <p>
  <div id="outputArea"></div>
  <p>
</body>
</html>
Make sure you don’t forget to download jQuery and put the downloaded file in the scripts subdirectory below the directory in which you place this file. If you rename the file or place it somewhere else, change the src attribute of the script tag in the head section accordingly.
While this doesn’t look very pretty, it gets the job done. It executes an AJAX call to Neo4j, using the transactional endpoint. After receiving a success response, it writes the raw answer (JSON) into the resultsArea over the input box. Then it parses the result and writes it to a table in the outputArea.
The result set from Neo4j is returned as a data object that looks like this:
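The table-building step can also be written as a plain function that returns a string, which makes it possible to try it outside the browser. The sketch below is one way to do that, not the page's actual code: the names escapeHtml and resultToTable are hypothetical, and the sketch additionally escapes values and handles scalar results (for example from RETURN n.name), which the loop in the page does not do.

```javascript
// Hypothetical helpers; nothing here touches the DOM or the network.

// Escape text before it is placed inside HTML.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one table row per returned value: objects (nodes) become
// property/value cells, scalars (numbers, strings) become a single cell.
function resultToTable(result) {
  var html = '<table><tr><td>Columns:</td><td>' +
    escapeHtml(result.columns.join(', ')) + '</td></tr>';
  result.data.forEach(function (record) {
    record.row.forEach(function (value) {
      html += '<tr>';
      if (value !== null && typeof value === 'object') {
        Object.keys(value).forEach(function (key) {
          html += '<td>' + escapeHtml(key) + ':</td><td>' + escapeHtml(value[key]) + '</td>';
        });
      } else {
        html += '<td>' + escapeHtml(value) + '</td>';
      }
      html += '</tr>';
    });
  });
  return html + '</table>';
}

var sample = {
  columns: ['n'],
  data: [{ row: [{ name: 'Keanu Reeves', born: 1964 }] }]
};
console.log(resultToTable(sample));
```

In the page, the returned string could be passed to $('#outputArea').html(...) with data.results[0] as the argument.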
{
"results" : [ {
"columns" : [ "n" ],
"data" : [
{"row" : [{"name":"Leslie Zemeckis"}]},
{"row" : [{"title":"The Matrix","released":1999,"tagline":"Welcome to the Real World"}]},
{"row" : [{"name":"Keanu Reeves","born":1964}]}
]
} ],
"errors" : [ ]
}
Note the different row variants. Since we did not limit ourselves to a single type of node, we got both Movie and Person nodes in the result. And even within a single node type, not every node has the same properties. The Neo4j manual has more information about the possible contents of the result set.
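Because the properties differ from node to node, it can help to first collect which property names occur at all, for instance to build table headers. A minimal sketch, using the sample response above; the helper name propertyNames is hypothetical.

```javascript
// Hypothetical helper: collects every property name that occurs in one column.
function propertyNames(result, columnIndex) {
  var seen = {};
  result.data.forEach(function (record) {
    var value = record.row[columnIndex];
    if (value !== null && typeof value === 'object') {
      Object.keys(value).forEach(function (key) {
        seen[key] = true;
      });
    }
  });
  return Object.keys(seen).sort();
}

// The sample response shown above.
var response = {
  results: [{
    columns: ['n'],
    data: [
      { row: [{ name: 'Leslie Zemeckis' }] },
      { row: [{ title: 'The Matrix', released: 1999, tagline: 'Welcome to the Real World' }] },
      { row: [{ name: 'Keanu Reeves', born: 1964 }] }
    ]
  }],
  errors: []
};

console.log(propertyNames(response.results[0], 0).join(', '));
// prints: born, name, released, tagline, title
```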
Please note that ANY valid Cypher-statement will be executed, including CREATE and DELETE statements, so feel free to play around with this.
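If you would rather be warned before a statement changes the graph, a rough keyword check can run before the request is sent. The sketch below is an assumption-laden convenience, not a security measure: the helper name looksLikeWrite is hypothetical, the keyword list is my own selection, and anyone can still POST to the endpoint directly.

```javascript
// Hypothetical helper: a rough check for clauses that modify the graph.
// Assumption: matching keywords is good enough for a warning. It also matches
// these words inside string literals, so it can give false alarms.
function looksLikeWrite(statement) {
  return /\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b/i.test(statement);
}

console.log(looksLikeWrite('MATCH (n) RETURN n LIMIT 10')); // false
console.log(looksLikeWrite('MATCH (n) DELETE n'));          // true
```

In the page, such a check could sit at the top of post_cypherquery() and ask for confirmation before continuing.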
– Ronald Kunenborg.