
ArangoDB Client for nodejs released


Note: Our new official nodejs driver is arangojs. You can also take a look at this blog post about the new driver.

We got a note from Anders Elo from Sweden. He told us that he has released an ArangoDB client for node.js. Awesome! 🙂 You can find it on GitHub at https://github.com/kaerus/arango-client. To install it locally:

npm install git://github.com/kaerus/arango-client

Anders also writes:

  • It’s still lacking features and is prone to changes, but it works quite nicely, at least I think so. 🙂 I haven’t documented this project yet due to lack of time, so I’ll briefly walk you through some examples.

var arango = require('arango.client'), util = require('util');

/* Prepare a connection, defaults {protocol:'http', hostname:'127.0.0.1', port: 8529} */
db = new arango.Connection({name:"testcollection"});

/* we need to first create our test collection */
db.collection.create(function(err,ret){
  console.log("err(%s): ",err, ret);
});

/* create a new document in collection */
db.document.create({a:"test"},function(err,ret){
  if(err) console.log("error(%s): ", err,ret);
  else console.log(util.inspect(ret));
});

/* create a document and a new collection on demand */
db.document.create(true,"newcollection",{a:"test"},function(err,ret){
  if(err) console.log("error(%s): ", err,ret);
  else console.log(util.inspect(ret));
});

/* alternate style utilizing events */
db.document.list().on('result',function(result){
  console.log(util.inspect(result));
}).on('error',function(error){
  console.log("error(%s):", error.code, error.message);
});

The interface to the ArangoDB REST API resides in the lib/api directory. You can find more details there. Anders also includes some tests; just run “npm test” to execute them.


Feature Preview: Using NPM packages for ArangoDB


ArangoDB follows the CommonJS specification for modules. However, it would be very convenient if there were an easy way to install a package like “underscore.js”. Such packages are, for instance, available via NPM. There is a draft specification for packages on CommonJS which seems to be compatible with NPM.

NPM has a neat way of dealing with version conflicts. Basically, it allows multiple versions to exist simultaneously. For example, assume you have four packages A, B, C, and D: A requires B, C, and D, and B requires C. Then the directory layout might be as follows.

node_modules
|
+- A
|  |
|  +- node_modules
|     |
|     +- B
|     |  |
|     |  +- node_modules
|     |     |
|     |     +- C (1.0.0)
|     |
|     +- C (2.0.0)
|
+- D

Package B will see package C in version 1.0.0, while package A sees package C in version 2.0.0.


This behaviour is easy to implement in ArangoDB. In addition to “Module” there is now a “Package”. Each package has its own module cache. When a package requires a module, the package hierarchy is traversed from the current package up to the root (or global) package until the module is found.
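To illustrate the lookup, here is a minimal sketch of that traversal in plain JavaScript. The names Package, moduleCache and parent are purely illustrative and not ArangoDB’s actual internals:

/* sketch only: resolve a module by walking the package hierarchy upwards */
function requireFromPackage (pkg, name) {
  var current = pkg;

  while (current !== null) {
    /* each package has its own module cache */
    if (current.moduleCache.hasOwnProperty(name)) {
      return current.moduleCache[name];
    }
    /* not found here, so try the parent package (the root package has parent === null) */
    current = current.parent;
  }

  throw new Error("cannot locate module '" + name + "'");
}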

In order to use underscore, switch into the package directory and use NPM to install it:

unix> cd /tmp/packages
unix> npm install underscore
npm http GET https://registry.npmjs.org/underscore
npm http 304 https://registry.npmjs.org/underscore
underscore@1.4.4 node_modules/underscore

Now start arangosh with the new “--javascript.package-path” option and enjoy underscore.

unix> arangosh --javascript.package-path /tmp/packages
arangosh> var _ = require("underscore");
arangosh> _.max([1,2,3])
3

FullStack London


I recently had the chance to visit FullStack London, a well-organized conference. Thanks a lot to Skills Matter. FullStack was opened by Douglas Crockford talking about “The Better Parts” of ES6; I cannot wait to start using them. Douglas was followed by Isaac Schlueter talking about open source in companies. Although this talk was not technical, I learned a lot and it was very inspiring.

The remainder of the conference was all about using JavaScript, mostly on the server side with Node.js or in robotics. As robotics is not my kind of topic, I visited the talks about server-side JS. They confirmed my impression of where JS development is heading: microservices.

Microservice Architecture

One thing that has to change in the developer’s mindset is the software architecture. It basically means trashing everything I learned about good software design at university and replacing it with the following structure:

  1. Create a swarm of microservices wrapping around the database(s) as a bottom layer.
  2. Possibly create another layer of microservices accessing the bottom layer and combining their results.
  3. You could create another layer of microservices here but you should keep the architecture as flat as possible.
  4. Create arbitrary many applications using the highest layer of microservices.

From the database engineer’s point of view, points 1 and 3 are of utmost interest: keep the structure as flat as possible, and the lowest layer is just a wrapper around the database API. So why not extend the database API by embedding the microservice directly into it? This gives you several benefits:

  • Increased performance:
    • Skip the communication overhead for one layer.
    • Raw access to the data allows even more efficient filtering.
  • Let the database do the scaling:
    • If you have a large amount of data, you need several servers anyway.
    • If you just have high read throughput, a single server would be a bottleneck.

ArangoDB’s Foxx framework allows you to do exactly this: extend ArangoDB’s API with your own Foxx app (written in JavaScript, so you can reuse most of your NodeJS code). Several of the talks also sparked ideas for further increasing the benefits of creating microservices with Foxx, as well as an idea for a talk I might give in the future, so stay tuned.
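As a rough illustration, a minimal read endpoint implemented directly inside ArangoDB could look like the sketch below. It assumes the legacy org/arangodb/foxx controller API and a users collection; the route and names are made up for this example:

/* sketch only: a Foxx route that wraps the collection directly, no extra service layer */
var Foxx = require("org/arangodb/foxx");
var db = require("org/arangodb").db;
var controller = new Foxx.Controller(applicationContext);

controller.get("/users/:key", function (req, res) {
  /* raw access to the data, executed inside the database server */
  res.json(db.users.document(req.params("key")));
});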

Now I am looking forward to Span Conf and the LJC.

Crawling GITHUB with Promises


The new JavaScript driver no longer imposes any promises implementation. It follows the standard callback pattern, with callbacks receiving err and res.

I wanted to give the new driver a try. A GitHub crawler seemed like a good side project, especially because the node-github driver follows the same conventions as the JavaScript driver.

There are a lot of promise libraries out there. The most popular one – according to NPM – was promises. It should be possible to use any implementation, so I used this one.

The following source code can be found on GitHub.

Pagination with Promises made easy

The github driver has a function to get all followers. However, the result is paginated. With two helper functions and promises, it is straightforward to implement a function that retrieves all followers of a user.

function extractFollowers (name) {
  'use strict';

  return new Promise(function(resolve, reject) {
    github.user.getFollowers({ user: name }, promoteError(reject, function(res) {
      followPages(resolve, reject, [], res);
    }));
  });
}

The followPages function simply extends the result with the next page until the last page is reached.

function followPages (resolve, reject, result, res) {
  'use strict';

  var i;

  for (i = 0;  i < res.length;  ++i) {
    result.push(res[i]);
  }

  if (github.hasNextPage(res)) {
    github.getNextPage(res, promoteError(reject, function(res) {
      followPages(resolve, reject, result, res);
    }));
  }
  else {
    resolve(result);
  }
}

The promoteError helper is a convenience function to bridge callbacks and promises.

function promoteError (reject, resolve) {
  'use strict';

  return function(err, res) {
    if (err) {
      if (err.hasOwnProperty("message") && /rate limit exceeded/.test(err.message)) {
        rateLimitExceeded = true;
      }

      console.error("caught error: %s", err);
      reject(err);
    }
    else {
      resolve(res);
    }
  };
}

I’ve decided to stick to the sequence reject (aka err) followed by resolve (aka res) – just like the callbacks. promoteError can be used for the github callbacks as well as for the ArangoDB driver.

Queues, Queues, Queues

I only needed a very simple job queue, so queue-it is a good choice. It provides a very simple API for handling job queues:

POST /queue/job
POST /queue/worker
DELETE /queue/job/:key

The new JavaScript driver allows access to arbitrary endpoints. First, install a Foxx app implementing the queue microservice in an ArangoDB instance.

foxx-manager install queue-it /queue

Adding a new job from node.js is now easy:

function addJob (data) {
  'use strict';

  return new Promise(function(resolve, reject) {
    db.endpoint("queue").post("job", data, promoteError(reject, resolve));
  });
}
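Registering a worker follows the same pattern. This is just a sketch; it assumes the Foxx queue app exposes the POST /queue/worker route listed above and accepts the worker definition as its request body:

/* sketch: register a worker via the queue endpoint (payload shape assumed) */
function addWorker (data) {
  'use strict';

  return new Promise(function(resolve, reject) {
    db.endpoint("queue").post("worker", data, promoteError(reject, resolve));
  });
}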

Transaction

I wanted to crawl users and their repos. The relations (“follows”, “owns”, “is_member”, “stars”) are stored in an edge collection. I only add an edge if it is not already there. Therefore I check inside a transaction whether the edge exists, and add it if it does not.

createRepoDummy(repo.full_name, data).then(function(dummyData) {
  return db.transaction(
    "relations",
    String(function(params) {
      var me = params[0];
      var you = params[1];
      var type = params[2];
      var db = require("org/arangodb").db;

      if (db.relations.firstExample({ _from: me, _to: you, type: type }) === null) {
        db.relations.save(me, you, { type: type });
      }
    }),
    [ meId, "repos/" + data._key, type ],
    function(err) {
      if (err) {
        throw err;
      }

      return handleDummy(dummyData);
    });
})

Please note that the action function is executed on the server and not in the nodejs client. Therefore we need to pass the relevant data as parameters; it is not possible to use closure variables.

Riding the Beast

Start an ArangoDB instance (e.g. inside a Docker container) and install the simple queue app.

foxx-manager install queue-it /queue

Start arangosh and create the collections users, repos and relations.

arangosh> db._create("users");
arangosh> db.users.ensureHashIndex("name");

arangosh> db._create("repos");
arangosh> db.repos.ensureHashIndex("name");

arangosh> db._createEdgeCollection("relations");

Now everything is initialized. Fire up nodejs and start crawling.

node> var crawler = require("./crawler");
node> crawler.github.authenticate({ type: "basic", username: "username", password: "password" })
node> crawler.addJob({ type:"user", identifier:"username" })
node> crawler.runJobs();

Please keep in mind that this is just an experiment. There is no proper error handling and there are no convenience functions for setup and start. It is also not optimized for performance; for instance, it would easily be possible to avoid nodejs / ArangoDB round trips by using more transactions.


The source code of this example is available from Github: https://github.com/fceller/Foxxmender

If you want to continue with other JavaScript-related resources, you should start with ArangoDB NoSQL and JavaScript.

Building a self-learning game with ArangoDB, io.js & AngularJS in half a day.


With the ArangoDB Foxx Microservice Framework we’ve introduced an easy way to create a Web API right on top of the NoSQL database.

In early January, Max challenged Andreas (AngularJS / NodeJS) to prove that together they could build a full-stack application within half a day.

The web application – in short – is a guessing game, in which the computer tries to guess a thing or animal you think of by asking a series of questions, for which you provide the answers. Here’s a demo:

http://guesser.9hoeffer.de:8000


We use a single collection storing the questions as well as the guesses, which are organized in a binary tree. Each question node has a left and a right child for the two different answers, and each guess is a leaf in the tree. We do not use a graph data model for this since our queries will only be single-document lookups. But of course, with ArangoDB you would have the choice to use graph traversals or to JOIN data from different collections.
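For illustration only, the documents in such a collection might look roughly like this; the field names are made up for this sketch and are not taken from the tutorial:

/* sketch: one question node and two guess leaves, all in the same collection */
{ "_key": "1", "question": "Is it an animal?", "yesChild": "2", "noChild": "3" }
{ "_key": "2", "guess": "cat" }
{ "_key": "3", "guess": "car" }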

Andreas and Max have created a 10-step tutorial on GitHub to illustrate a possible architecture for a web application using AngularJS for the frontend in the browser, io.js as the application server, and ArangoDB as the backend database. All necessary steps are described in detail, and you can follow the evolution of the system by looking at each stage without typing code yourself.

We are particularly focusing on ArangoDB and its Foxx Microservice Framework, and only briefly show the io.js and AngularJS parts. In particular, this is not intended to be an AngularJS or io.js tutorial. To keep the app simple, we even take some shortcuts that one would usually not use in production. Nevertheless, the architecture of the application is in principle suitable as a blueprint for an actual, larger web application.

https://github.com/ArangoDB/guesser

Thanks to Andreas (Freelance NodeJS / AngularJS Developer – https://github.com/m0ppers) and Max (ArangoDB Core – https://github.com/neunhoef) for this tutorial.

If you want to continue with other JavaScript-related resources, you should start with ArangoDB NoSQL and JavaScript.

LoopBack Connector for ArangoDB


ArangoDB can be used as a backend data source for APIs that you compose with the popular open-source LoopBack Node.js framework.


In a recent blog article on StrongLoop, Nicholas Duffy explains how to use his new loopback-connector-arango connector to access ArangoDB:

Getting Started with the Node.js LoopBack Connector for ArangoDB

The tutorial uses the loopback-connector-arango connector, which is available via npm, and a demo application, which is available on GitHub.

npm install --save duffn/loopback-connector-arango

What is LoopBack?

LoopBack is a highly extensible, open-source Node.js framework that enables you to quickly compose scalable APIs. It runs on top of the Express web framework and conforms to the Swagger 2.0 specification.

  • Quickly create dynamic end-to-end REST APIs.
  • Connect devices and browsers to data and services.
  • Use Android, iOS, and AngularJS SDKs to easily create client apps.
  • Add-on components for push, file management, 3rd-party login, and geolocation.
  • Use StrongLoop Arc to visually edit, deploy, and monitor LoopBack apps.
  • The LoopBack API gateway acts as an intermediary between API consumers (clients) and API providers to externalize, secure, and manage APIs.
  • Runs on-premises or in the cloud.

We can’t wait to see how the community will adopt the ArangoDB connector and start building scalable APIs with the LoopBack framework on top of ArangoDB.

Please tell us what you’ve built using LoopBack and ArangoDB!

Running V8 isolates in a multi-threaded ArangoDB database


ArangoDB allows running user-defined JavaScript code in the database. This can be used for more complex, stored-procedure-like database operations. Additionally, ArangoDB’s Foxx framework can be used to make any database functionality available via an HTTP REST API. It’s easy to build data-centric microservices with it, using the scripting functionality for tasks like access control, data validation, sanitization etc.

We often get asked how the scripting functionality is implemented under the hood. Additionally, several people have asked how ArangoDB’s JavaScript functionality relates to node.js.

This post tries to explain that in detail.

The C++ parts

arangosh, the ArangoShell, and arangod, the database server, are written in C++ and shipped as native-code executables. Some parts of both arangosh and arangod are written in JavaScript (more on that later).

The I/O handling in arangod is written in C++ and uses libev (written in C) for the low-level event handling. All the socket I/O, work scheduling and queueing is written in C++, too. These parts require high parallelism, so we want them to run in multiple threads.

All the indexes, the persistence layer and many of the fundamental operations, like the ones for document inserts, updates, deletes and imports, are written in C++ for effective control of memory usage and parallelism. AQL’s query parser is written using the usual combination of Flex and Bison, which generate C files that are compiled to native code. The AQL optimizer, AQL executor and many AQL functions are written in C++ as well.

Some AQL functions, however, are written in JavaScript. And if an AQL query invokes a user-defined function, that function will be a JavaScript function, too.

How ArangoDB uses V8

How is JavaScript code executed in ArangoDB?

Both arangosh and arangod are linked against the V8 JavaScript engine library. V8 (itself written in C++) is the component that runs the JavaScript code in ArangoDB.

V8 requires JavaScript code to run in a so-called isolate (note: I’ll be oversimplifying a bit here – in reality there are isolates and contexts). As the name suggests, isolates are completely isolated from each other. In particular, data cannot be shared or moved across isolates, and each isolate can be used by only one thread at a time.

Let’s look at how arangosh, the ArangoShell, uses V8. All JavaScript commands entered in arangosh will be compiled and executed by V8 immediately. In arangosh, this happens using a single V8 isolate.

On the server side, things are a bit different. In arangod, there are multiple V8 isolates. The number of isolates to create is a startup configuration option (--javascript.v8-contexts). Creating multiple isolates allows running JavaScript code in multiple threads, truly in parallel. Apart from that, arangod has multiple I/O threads (--scheduler.threads configuration option) for handling the communication with client applications.
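For example, a server could be started with both options set explicitly; the values and the database directory below are purely illustrative:

unix> arangod --javascript.v8-contexts 16 --scheduler.threads 8 /var/lib/arangodb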

As mentioned earlier, part of ArangoDB’s codebase itself is written in JavaScript, and this JavaScript code is executed the same way as any user-defined code.

Executing JavaScript code with V8

For executing any JavaScript code (built-in or user-defined), ArangoDB will invoke V8’s JIT compiler to compile the script code into native code and run it.

The JIT compiler in V8 will not try extremely hard to optimize the code on the first invocation. On initial compilation, it will aim for a good balance of optimizations and fast compilation time. If it finds that some code parts are called often, it may automatically re-optimize these parts more aggressively. To make things even more complex, there are different JIT compilers in V8 (i.e. Crankshaft and Turbofan) with different sweet spots. JavaScript modes (e.g. strict mode and strong mode) can also affect the level of optimizations the compilers will carry out.

Now, after the JavaScript code has been compiled to native code, V8 will run it until it returns or fails with an uncaught exception.

But how can the JavaScript code access the database data and server internals? In other words, what actually happens if a JavaScript command such as the following is executed?

/* example JavaScript command */
db.myCollection.save({ _key: "test" });

Accessing server internals from JavaScript

Inside arangod, each V8 isolate is equipped with a global variable named db. This JavaScript variable is a wrapper around database functionality written in C++. When the db object is created, we tell V8 that its methods are C++ callbacks.

Whenever the db object is accessed in JavaScript, the V8 engine will therefore call C++ methods. These provide full access to the server internals, can do whatever is required and return data in the format that V8 requires. V8 then makes the return data accessible to the JavaScript code.

Executing db.myCollection.save(...) is effectively two operations: accessing the property myCollection on the object db and then calling the function save on that property. For the first operation, V8 will invoke the object’s NamedPropertyHandler, which is a C++ function that is responsible for returning the value for the property with the given name (myCollection). In the case of db, we have a C++ function that returns the collection object if it exists, or undefined if not.

The collection object again has C++ bindings in the background, so calling function save on it will call another C++ function. The collection object also has a (hidden) pointer to the C++ collection. When save is called, we will extract that pointer from the this object so we know which C++ data structures to work on. The save function will also get the to-be-inserted document data as its payload. V8 will pass this to the C++ function as well so we can validate it and convert it into our internal data format.

On the server side, there are several objects exposed to JavaScript that have C++ bindings. There are also non-object functions that have C++ bindings. Some of these functions are also bolted onto regular JavaScript objects.

Accessing server internals from ArangoShell

When running the same command in arangosh, things will be completely different. The ArangoShell may run on the same host as the arangod server process, but it may also run on a completely different one. Providing arangosh access to server internals such as pointers will therefore not work in general. Even if arangosh and arangod do run on the same host, they are independent processes with no access to each other’s data. The latter problem could be solved by having a shared memory segment that both arangosh and arangod can use, but why bother with that special case when it provides no help in the general case, in which the shell can be located on any host?

To work in all these situations, the shell uses the HTTP REST API provided by the ArangoDB server to talk to it. For arangod, an ArangoShell client is just another client, with no special treatment or protocol.

As a consequence, all operations on databases and collections run from the ArangoShell are JavaScript wrappers that call their respective server-side HTTP APIs.

Recalling the command example again (db.myCollection.save(...)), the shell will first access the property myCollection of the object db. In the shell, db is a regular JavaScript object with no C++ bindings. When the shell is started, it will make an HTTP call to arangod to retrieve a list of all available collections, and register them as properties in its db object. Calling the save method on one of these objects will trigger an HTTP POST request to the server API at /_api/document?collection=myCollection, with the to-be-inserted data in its request body. Eventually the server will respond and the command will return with the data retrieved from the server.
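For example, the save call from above roughly translates into an HTTP request like this (headers omitted for brevity):

POST /_api/document?collection=myCollection HTTP/1.1

{ "_key": "test" }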

Considerations

Consider running the following JavaScript code:

/* code to insert 1,000 documents */
for (var i = 0; i < 1000; ++i) {
  db.myCollection.save({ _key: "test" + i });
}

When run from inside the ArangoShell, the code will be executed in there. The shell will perform an HTTP request to arangod for each call to save. We’ll end up with 1,000 HTTP requests.

Running the same code inside arangod will trigger no HTTP requests, as the server-side functions are backed by C++ internals and can access the database data directly. It will be a lot faster to run this loop on the server than in arangosh. A while ago I wrote another article about this.

When replacing the ArangoShell with another client application, things are no different. A client application will not have access to the server internals, so all it can do is make requests to the server (by the way, the principle would be no different if we used MySQL or other database servers, only the protocols would vary).

Fortunately, there is a fix for this: making the code run server-side. For example, the above code can be put into a Foxx route. This way it is not only fast, but also accessible via an HTTP REST API, so client applications can call it with a single HTTP request.
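A minimal sketch of such a route, using the legacy Foxx controller API (the route path and response shape are made up for this example):

/* sketch only: run the insert loop server-side, behind a single HTTP endpoint */
var Foxx = require("org/arangodb/foxx");
var db = require("org/arangodb").db;
var controller = new Foxx.Controller(applicationContext);

controller.post("/insert-test-documents", function (req, res) {
  for (var i = 0; i < 1000; ++i) {
    db.myCollection.save({ _key: "test" + i });
  }
  /* one HTTP request from the client, 1,000 document inserts on the server */
  res.json({ inserted: 1000 });
});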

In reality, database operations will be more complex than in the above example. And this is where having a full-featured scripting language like JavaScript helps. It provides all the features that are needed for more complex tasks such as validating and sanitizing input data, access control, executing database queries and postprocessing results.

The differences to node.js

To start with: ArangoDB is not node.js, and vice versa. ArangoDB is not a node.js module either. ArangoDB and node.js are completely independent. But there is a commonality: both ArangoDB and node.js use the V8 engine for running JavaScript code.

Threading

AFAIK, standard node.js only has a single V8 isolate to run all code in. While that makes the implementation easier (no hassle with multi-threading), it also limits node.js to using only a single CPU.

It’s not unusual to see a multi-core server with a node.js instance maxing out one CPU while the other CPUs are sitting idle. In order to max out a multi-core server, people often start multiple node.js instances on a single server. That will work fine, but the node.js instances will be independent, and sharing data between them is not possible in plain JavaScript.

And because a node.js instance is single-threaded, it is also important that code written for node.js is non-blocking. Code that blocks while waiting for some I/O operation would block the only available CPU. Using non-blocking I/O operations allows node.js to queue the operation and execute other code in the meantime, allowing overall progress. This also makes it look as if node.js were executing multiple actions in parallel, while it is actually executing them sequentially.
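As a simple node.js example of this, the callback-based fs API queues the read and lets other code run while the operating system performs the I/O:

/* sketch: non-blocking I/O in node.js */
var fs = require("fs");

fs.readFile("/tmp/data.json", function (err, data) {
  if (err) {
    console.error("read failed: %s", err);
  }
  else {
    console.log("read %d bytes", data.length);
  }
});

/* this line runs before the file has been read */
console.log("kicked off the read, doing other work now");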

In contrast, arangod is a multi-threaded server. It can serve multiple requests in parallel, using multiple CPUs. Because arangod has multiple V8 isolates that can each execute JavaScript code, it can run JavaScript in multiple threads in parallel.

arangosh, the ArangoShell, is single-threaded and provides only a single V8 isolate.

Usage of modules

Both node.js and ArangoDB can load code at runtime so it can be organized into modules or libraries. In both, extra JavaScript modules can be loaded using the require function.

There is often confusion about whether node.js modules can be used in ArangoDB. This is probably because the answer is “it depends!”.

node.js packages can be written in JavaScript, but they can also compile to native code using C++. The latter can be used to extend the functionality of node.js with features that JavaScript alone wouldn’t be capable of. Such modules, however, often depend heavily on a specific V8 version (so they do not necessarily compile with a node.js version that uses a different V8 version) and often rely on node.js internals.

ArangoDB can load modules that are written in pure JavaScript. Modules that depend on non-JavaScript functionality (such as native modules for node.js) or modules that rely on node.js internals cannot be loaded in ArangoDB. As a rule of thumb, any module will run in ArangoDB that is implemented in pure JavaScript, does not access global variables and only requires other modules that obey the same restrictions.

ArangoDB also uses several externally maintained JavaScript-only libraries, such as underscore.js. This module will run everywhere because it conforms to the mentioned restrictions.

ArangoDB also uses several other modules that are maintained on npm.js. An example module is AQB, a query builder for AQL. It is written in pure JavaScript too, so it can be used from a node.js application and from within ArangoDB. If there is an updated version of this module, we use npm to install it in a subdirectory of ArangoDB. As per npm convention, the node.js modules shipped with ArangoDB reside in a directory named node_modules. This is probably what caused some of the confusion.
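As a quick illustration, AQB can be used the same way in both environments. The sketch below assumes the module is available under its npm name aqb and uses its fluent query-builder API:

/* sketch: building an AQL query string with AQB */
var qb = require("aqb");

var query = qb.for("u").in("users").filter(qb.eq("u.active", true)).return("u");
console.log(query.toAQL());
/* prints something like: FOR u IN users FILTER u.active == true RETURN u */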




