Expert Q&A No. 163: ArangoDB, a Distributed Native Multi-Model Database

For this round of OSCHINA Expert Q&A (July 26 to August 1, 2017), we have invited @JanStücke to answer your questions about the ArangoDB database.

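Jan Stücke currently leads technical communication at ArangoDB. He has worked in the startup world for more than eight years, focusing mainly on product development and project management. He joined ArangoDB two years ago and now works with the engineers to develop and refine the multi-model database concept.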

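ArangoDB is an open-source, distributed, native multi-model database. "Multi-model" means that it combines three data models: graph, document, and key/value. The idea is to use one engine, one query language, and one database technology, together with multiple data models, to model all kinds of complex data flexibly. The aim is to maximize project flexibility, simplify the technology stack, simplify database operations, and lower operating costs.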


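  • Single View: Collecting and correlating all the data from different systems is a major problem for many companies, and a perfect fit for a multi-model database. With its multi-model capabilities, ArangoDB can act as a caching layer and query the data in a natural way, without much remodeling of the underlying datasets.
  • Fraud detection: Fraud detection involves complex pattern matching, which often has to take graph structure into account (for example, an unusual number of connections to a single host or account).
  • Internet of Things: IoT produces large amounts of state data, such as geolocation information and sensor data. At the same time, real-world IoT setups are usually hierarchical.
  • E-commerce systems: E-commerce systems need to store customer and product data (JSON), shopping cart data (key/value), and order and sales data (JSON or graph), and they need to run a large number of queries.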
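
To make the fraud detection case a little more concrete, here is a minimal Python sketch of the "unusual number of connections to a single host" idea. It is not ArangoDB code: the account and host names, the `suspicious_hosts` helper, and the threshold are all made up for illustration, and a real system would run this kind of check as a graph query in the database.

```python
from collections import Counter

# Hypothetical login edges: (account, host). All names are made up.
connections = [
    ("acct1", "hostA"), ("acct2", "hostA"), ("acct3", "hostA"),
    ("acct4", "hostA"), ("acct1", "hostB"), ("acct5", "hostC"),
]


def suspicious_hosts(edges, threshold):
    """Return hosts contacted by more than `threshold` distinct accounts."""
    distinct = set(edges)  # ignore repeated logins by the same account
    in_degree = Counter(host for _, host in distinct)
    return sorted(h for h, n in in_degree.items() if n > threshold)


flagged = suspicious_hosts(connections, threshold=3)
print(flagged)
```

The check is just an in-degree count on a bipartite account/host graph. More involved fraud patterns (rings, shared devices) would need real traversals rather than a single count.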



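  1. Use cases for ArangoDB (for example, how well it supports large-scale data)
  2. Features of ArangoDB
  3. Comparison with similar databases (performance, efficiency, memory footprint, compatibility, and so on)
  4. ArangoDB documentation (for example, getting-started material, whether the tutorials are complete, how hard it is to get started)
  5. ArangoDB case studies
  6. ArangoDB's plans for what comes next
  7. Questions about graph databases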


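P.S. Our guest for this Q&A is from abroad, so if your English is good, feel free to ask your questions in English.

In keeping with the usual style of OSChina Expert Q&A, off-topic discussion and trolling are not welcome.

Please post your ArangoDB questions for @JanStücke as replies to this thread.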


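@GermanWifi There are so many NoSQL databases that I have only skimmed them, and it is dizzying. How do I choose? There are too many to ever learn them all.

--- 3 comments in total ---
GermanWifi @fdanasa: You're very welcome! Here is also a link that explains how to choose the right NoSQL database: https://www.arangodb.com/tech-talks/
fdanasa, replying to @GermanWifi: Thank you very much.
GermanWifi: Hallooo, comments have a length limit and our answers are fairly detailed, so we answered using "quote this answer". Please scroll down or page forward to find them. :D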

@GermanWifi Can ArangoDB replace relational databases such as MySQL? What advantages does it have over MongoDB, Redis, and the like?


@GermanWifi Can ArangoDB replace relational databases such as MySQL?


@GermanWifi How does it differ from OrientDB, which supports SQL? What are its advantages and shortcomings, and how does it perform?


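@GermanWifi How does the graph model in this database differ from the Neo4j database, and what advantages does it have?

--- 1 comment in total ---
GermanWifi: The answer mainly explains the differences in how graph queries are performed. It is on page two.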


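@GermanWifi There are so many NoSQL databases that I have only skimmed them, and it is dizzying. How do I choose? There are too many to ever learn them all.

Hello everyone. Our guest for this Q&A is from abroad, so the original answers are in English. If anything is unclear, you can ask for an answer in Chinese, and OSChina's volunteer open source translators are warmly welcome to help.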


Picking the right database for a project depends on your use case and on the questions you want to answer with your data. Choosing the right NoSQL database can be tricky because most of them serve just a single data model, and you might need more than one to meet the needs of your application or company, now or in the near future. I can only offer some guiding thoughts here on three models (key/value, graph, document), but there are others, such as time series or columnar stores.

  1. The document model makes it easy for developers to store and combine data of any structure without giving up access and indexing functionality. In addition, future changes to the data model can be made without downtime. It is a great fit for user data, product data, and the like.
  2. The key/value approach best suits cases that need fast lookups of simple data. Each lookup is done by key, a unique identifier that points to a value of any kind. This model works well for, e.g., a shopping cart or sensor data that consists of just an identifier and a value.
  3. Another data model, currently very popular, is the graph. With graphs, developers and data scientists can focus on the relationships between values by storing data as nodes, edges, and properties. In this network of data, every element contains a pointer to its adjacent elements, which opens up completely new ways to find patterns in datasets. Graphs are superior in terms of performance when you do not know the depth of your search in advance, e.g., "give me all people who are friends, or friends of friends, of Jan and live in Peking". With graphs, it is very simple to transfer your data model from the whiteboard to your database and to adjust your queries to future needs.
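
For readers who prefer code, the three models above can be sketched side by side in plain Python. This is only an in-memory illustration, not ArangoDB's API: the people, the cart, the friendship edges, and the helper names `neighbors` and `friends_within` are all made up for the example.

```python
from collections import deque

# Document model: each record is a free-form, JSON-like structure.
people = {
    "jan":  {"name": "Jan",  "city": "Cologne"},
    "li":   {"name": "Li",   "city": "Peking"},
    "mei":  {"name": "Mei",  "city": "Peking"},
    "omar": {"name": "Omar", "city": "Berlin"},
}

# Key/value model: an opaque value looked up only by its key.
carts = {"cart:li": ["book", "lamp"]}

# Graph model: edges connect document keys (hypothetical friendships).
friend_edges = [("jan", "li"), ("li", "mei"), ("jan", "omar")]


def neighbors(key):
    """Return the keys directly connected to `key` (edges are undirected)."""
    out = set()
    for a, b in friend_edges:
        if a == key:
            out.add(b)
        elif b == key:
            out.add(a)
    return out


def friends_within(start, max_depth):
    """Breadth-first traversal: everyone reachable in 1..max_depth hops."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        key, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the requested depth
        for nxt in sorted(neighbors(key)):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, depth + 1))
    return found


# Friends and friends of friends of Jan who live in Peking.
result = [k for k in friends_within("jan", 2) if people[k]["city"] == "Peking"]
print(result)
```

The last two lines combine a graph traversal with a filter on document attributes, which is the kind of mixed query the answer has in mind. In a database this would be a single query rather than hand-written loops.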


These are all amazing new data models that solve important problems in modern application development. We think that integrating several single-model databases into your application causes a lot of complexity and cost, not to mention the nearly impossible task of keeping your data in a consistent state across multiple databases. That's why we think you are much better off with a multi-model database that has full ACID transactions and supports, e.g., joins, aggregations, and graph traversals.





@GermanWifi Can ArangoDB replace relational databases such as MySQL? What advantages does it have over MongoDB, Redis, and the like?

Great question, thanks for that!

MongoDB and Redis are great technologies and have been around for a long time. What I especially like is MongoDB's Atlas DBaaS. We also think this is the way to go, but not at this stage of our development.


MongoDB is a document store and supports only one data model. MongoDB is eventually consistent and seems to be rather tricky to scale; at least, this is the feedback we get from users migrating from MongoDB to ArangoDB. What ArangoDB offers in comparison to MongoDB is real joins, ACID transactions, graph capabilities, and a declarative query language, AQL. AQL is an intuitive query language designed to feel more like coding. It supports the functionality people like in SQL, such as aggregation and sorting, and lets you combine all of it in a single query. Furthermore, you can easily scale ArangoDB to a large cluster. We have production cases with 20 machines and have tested ArangoDB on 80 machines, delivering over 1 million writes per second (https://www.arangodb.com/whitepaper-arangodb-cluster-benchmark/). I'm confident in saying that ArangoDB can be a full replacement for MongoDB and adds more ways to view and query your data.
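
To give a feel for what "join, aggregation, and sorting in a single query" means, here is a small sketch. The AQL string is an illustrative query written for this example and has not been run against a server; the `users` and `orders` collections and their attributes are hypothetical. The Python function below it computes the same result over in-memory lists so the logic can be checked without a database.

```python
# Hypothetical AQL for comparison (collection and attribute names are made up,
# and the query is shown for illustration only, it is not executed here).
AQL_QUERY = """
FOR u IN users
  FOR o IN orders
    FILTER o.user == u._key
    COLLECT name = u.name AGGREGATE total = SUM(o.amount)
    SORT total DESC
    RETURN { name: name, total: total }
"""

users = [{"_key": "1", "name": "Jan"}, {"_key": "2", "name": "Li"}]
orders = [
    {"user": "1", "amount": 30},
    {"user": "2", "amount": 50},
    {"user": "1", "amount": 5},
    {"user": "2", "amount": 20},
]


def totals_per_user(users, orders):
    """Join users to orders, sum the amounts per user name, sort descending."""
    totals = {}
    for u in users:                      # outer FOR
        for o in orders:                 # inner FOR
            if o["user"] == u["_key"]:   # FILTER: the join condition
                totals[u["name"]] = totals.get(u["name"], 0) + o["amount"]
    rows = [{"name": n, "total": t} for n, t in totals.items()]
    rows.sort(key=lambda r: r["total"], reverse=True)  # SORT total DESC
    return rows


report = totals_per_user(users, orders)
print(report)
```

The nested loops mirror the two `FOR` clauses, which is roughly what "designed to feel more like coding" refers to. A real database would use indexes rather than a full nested scan.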


Redis is a brilliant technology, and we like its founder, Salvatore, very much: a brilliant guy and fun to discuss the future of storage with. Redis seems to be moving more and more in the direction of a document store, as it has added secondary indexes and other features that enable more query options. Personally, I would recommend Redis for hyper-scale situations with key/value models. ArangoDB is well suited to small and midsize key/value cases: we are optimized for document and graph use cases but can handle many key/value use cases as well.

I hope this clarifies things a bit.