Introducing Graphaware Databridge: Graph Data Import Made Simple
- by 7wData
Databridge is a fully-featured ETL tool specifically built for Neo4j, and designed for usability, expressive power and impressive performance.
Until now, users wanting to import data into Neo4j have faced two choices: write Cypher statements in conjunction with Cypher’s LOAD CSV, or use Neo4j’s batch import tool.
Each of these approaches has its strengths and weaknesses. LOAD CSV is very flexible, but it requires learning Cypher, struggles with large data volumes, and is relatively slow.
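As a point of reference, a typical LOAD CSV import looks something like this — a minimal sketch, assuming a `people.csv` file with `id` and `name` columns:

```cypher
// Read each row of people.csv and merge a Person node per row,
// using id as the identity so re-runs don't create duplicates
LOAD CSV WITH HEADERS FROM 'file:///people.csv' AS row
MERGE (p:Person {id: row.id})
SET p.name = row.name;
```

Even this simple case assumes you understand MERGE semantics and Cypher’s row model — and it gets considerably more involved once relationships, type coercion and multiple files enter the picture.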
On the other hand, Neo4j’s batch import tool is extremely efficient at processing large data volumes. You don’t need to know any Cypher, but the input files usually need to be manually generated beforehand. Being a simple CSV loader, it also lacks the expressive power of Cypher.
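For comparison, the batch import tool consumes pre-generated node and relationship CSV files whose headers carry special ID, label and type columns. The file names below are illustrative:

```
persons.csv — node file; the header declares an ID column and a label column
personId:ID,name,:LABEL
1,Alice,Person
2,Bob,Person

knows.csv — relationship file, linking nodes by their IDs
:START_ID,:END_ID,:TYPE
1,2,KNOWS
```

Files in this shape are then fed to the importer (in recent Neo4j versions, roughly `neo4j-admin import --nodes=persons.csv --relationships=knows.csv`). Producing these files from real-world sources is exactly the manual work referred to above.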
Furthermore, many of the issues faced by any reasonably complex data import process can’t easily be solved using the existing tooling. Consequently, people often resort to creating bespoke solutions in code. We know because we’ve done it enough times ourselves.
Databridge
At GraphAware, we didn’t want to keep reinventing the wheel at every new client we went to. So we took a different approach and built Databridge: a fully-featured ETL tool built specifically for Neo4j, and designed for usability, expressive power, and impressive performance. It’s already in use at a number of GraphAware clients, and we think it’s now mature enough to bring to the attention of the wider world.
So, in this blog post, we’ll take a quick tour of Databridge’s main features, to give you an idea of what it can do and to help you get a feel for whether it would be useful for you.
We’ll create a really simple example that you can follow along with as we go.
Declarative Approach
One of the difficulties with the current ETL tools is that they are quite developer-oriented. You either have to learn a lot of Cypher, or you have to be able to manipulate your raw data sources and generate node and relationship files that the batch import tool can use. As noted earlier, when these two options become infeasible, you need to write code.
But in fact, every Neo4j import needs to do exactly the same sorts of things: locate the data sources, know how to transform them into graph objects, link nodes together with relationships, assign labels, index properties and so on. All this pretty much boils down to two questions:
What data do I want?
What do I want it to look like when it’s loaded in the graph?
Databridge tackles these questions by being primarily declarative, instead of programmatic in nature.
It does this by using simple JSON files called schema descriptors in which you define the graph schema you want to build, along with resource descriptors in which you identify the data you want to import, and how to get it. This means you’re able to work directly with your source data exactly as is.
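To make the idea concrete: Databridge’s actual descriptor syntax is documented with the tool itself, so the JSON below is purely a conceptual sketch — every key name here is hypothetical — of what declaring a schema of `Person` nodes connected by `KNOWS` relationships might look like:

```json
{
  "nodes": [
    {
      "label": "Person",
      "source": "people.csv",
      "identity": "id",
      "properties": ["id", "name"]
    }
  ],
  "relationships": [
    {
      "type": "KNOWS",
      "from": "Person",
      "to": "Person"
    }
  ]
}
```

The point is the shape of the approach, not the exact keys: you describe the graph you want and where the data lives, and the tool does the rest.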
If you can create a JSON document, you can use Databridge.