NEO is a distributed, redundant and transactional
storage designed to be an alternative to ZEO and FileStorage. It is developed and maintained by
Nexedi and is used in several applications maintained by Nexedi,
as well as in Nexedi's internal production instances, including all its websites.
NEO implements ZODB's Storage interface, and supports the following standard extensions:
- Revisions (optional in BaseStorage) with MVCC support.
- Undo (optional in BaseStorage).
- Pack (both pruning old object revisions and orphan objects).
- Conflict resolution.
- Iterator (allows exporting storage data).
Note: There is no plan to support the "version" extension in NEO, because of its pending-deprecation state.
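As an illustration of the conflict resolution extension: in ZODB, when two transactions modify the same object, the storage lets the object's class merge the states through the `_p_resolveConflict(oldState, savedState, newState)` hook. The sketch below imitates that hook on a plain Python class so that it runs without ZODB or NEO installed. The `Counter` class and the direct call to the hook are illustrative assumptions, not NEO code.

```python
class Counter(object):
    """Toy counter whose concurrent increments can be merged (hypothetical)."""

    def __init__(self, value=0):
        self.value = value

    def __getstate__(self):
        return {'value': self.value}

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # old_state:   state both transactions started from
        # saved_state: state already committed by the other transaction
        # new_state:   state this transaction is trying to commit
        delta_saved = saved_state['value'] - old_state['value']
        delta_new = new_state['value'] - old_state['value']
        resolved = dict(old_state)
        resolved['value'] = old_state['value'] + delta_saved + delta_new
        return resolved


base = Counter(10).__getstate__()
committed = Counter(13).__getstate__()  # another transaction added 3
pending = Counter(12).__getstate__()    # this transaction added 2
merged = Counter()._p_resolveConflict(base, committed, pending)
print(merged)  # {'value': 15}
```

Both increments survive the merge, which is why counters and similar structures are the usual candidates for this hook.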
NEO adds the following features:
- No central lock: no more storage-cluster-wide commit lock.
- Increased scalability: load balancing over multiple machines.
- Fault tolerance: data replication over multiple machines.
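Load balancing and replication over multiple machines can be pictured with a small partition table. The sketch below is a toy model under stated assumptions: the modulo mapping from object id to partition, the round-robin assignment, the node names and the helper functions are all hypothetical and are not taken from NEO's code.

```python
import struct


def oid_to_int(oid):
    """ZODB object ids are 8-byte big-endian strings."""
    return struct.unpack('>Q', oid)[0]


def build_partition_table(nodes, partitions, replicas):
    """Assign each partition to (replicas + 1) distinct nodes, round-robin."""
    if replicas + 1 > len(nodes):
        raise ValueError('not enough nodes for the requested replicas')
    table = []
    for p in range(partitions):
        table.append([nodes[(p + r) % len(nodes)]
                      for r in range(replicas + 1)])
    return table


def nodes_for(oid, table):
    """Return the nodes holding a copy of the given object (assumed mapping)."""
    return table[oid_to_int(oid) % len(table)]


nodes = ['storage-a', 'storage-b', 'storage-c']
table = build_partition_table(nodes, partitions=6, replicas=1)
oid = struct.pack('>Q', 8)
print(nodes_for(oid, table))  # ['storage-c', 'storage-a']
```

With one replica, every object lives on two different machines, so losing a single node leaves all data readable, while objects are spread evenly over the three nodes.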
Why use NEO?
Architecture and Characteristics
With data volumes growing, and with the lessons learned from managing a petabyte
in hindsight, here are additional characteristics of NEO:
- Thick client, thin server
NEO: designed this way.
- Client-side compression
- Load balancing
NEO: static balancing is done, dynamic is not.
- Local disk cache, data geographical proximity to user
NEO: not implemented; for now NEO isn't
expected to fit long-distance distribution. RAM caching is done, though (both
at NEO and ZODB levels), the NEO level being implemented with a special
intermediate caching algorithm.
- Fine-grained locks are better
NEO: designed this way, and the optimistic
transaction consistency used in ZODB also helps.
- Data flushed to disk only during commit
NEO: designed this way.
- No need for central index updates for object changes
NEO: designed this way.
- Large updates & update pooling
NEO: This is not expected to happen at NEO level, but at application level (ex: Zope).
- Data duplication
NEO: designed this way.
- Sequential storage support (e.g. tapes) the same way as random access storage (e.g. disks)
NEO: not implemented; for now NEO requires
all data (historical and current) to be accessible, to fit the needs of ZODB.
It is unclear whether this can be implemented at all, and it heavily depends on
application-level behaviour (ex: Zope).
NEO: not implemented. NEO focuses on interactive
use (short transactions) rather than heavy data processing (long transactions)
for the moment, so this feature is not a top priority.
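Two of the characteristics listed above, the thick client and client-side compression, can be illustrated together: the client does the CPU-heavy work and the server only stores opaque bytes. The sketch below is a minimal model using the standard `zlib` and `hashlib` modules. The function names, the "keep the raw data when compression does not help" rule and the SHA-1 checksum are assumptions made for the example, not a description of NEO's wire format.

```python
import hashlib
import zlib


def pack_for_storage(data):
    """Client side: return (compressed_flag, payload, checksum)."""
    compressed = zlib.compress(data)
    if len(compressed) < len(data):
        flag, payload = True, compressed
    else:
        # Compression did not help: send the data unchanged.
        flag, payload = False, data
    return flag, payload, hashlib.sha1(payload).hexdigest()


def unpack_from_storage(flag, payload, checksum):
    """Client side: verify what the server returned, then decompress."""
    if hashlib.sha1(payload).hexdigest() != checksum:
        raise ValueError('corrupted payload')
    return zlib.decompress(payload) if flag else payload


record = b'x' * 1000
flag, payload, checksum = pack_for_storage(record)
restored = unpack_from_storage(flag, payload, checksum)
print('compressed: %s, round trip ok: %s' % (flag, restored == record))
```

In this model the server never needs to decompress anything, which keeps it thin and lets compression also reduce network traffic.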
Finally, it seems that the biggest difference between the systems described and
NEO/ZODB lies in the meaning of "transaction" and in the expected application behavior
inside a transaction: NEO provides the same level of isolation as ZODB does, which
is (supposed to be) PL-2+ in the terminology of Atul Adya's thesis (see below),
and which looks stricter than the transaction isolation (briefly) described here.
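To make the notion of a consistent view inside a transaction more concrete, here is a toy multi-version store. It is loosely modelled on the `loadBefore(oid, tid)` method of ZODB storages, but the `RevisionStore` class, its integer transaction ids and its method names are hypothetical, and the sketch does not claim to reproduce the exact isolation level discussed above.

```python
import bisect


class RevisionStore(object):
    """Toy store keeping every committed revision of every object."""

    def __init__(self):
        self._tids = {}   # oid -> sorted list of transaction ids
        self._data = {}   # (oid, tid) -> data
        self.last_tid = 0

    def commit(self, changes):
        """Store a dict {oid: data} as one transaction and return its tid."""
        self.last_tid += 1
        for oid, data in changes.items():
            self._tids.setdefault(oid, []).append(self.last_tid)
            self._data[oid, self.last_tid] = data
        return self.last_tid

    def load_before(self, oid, tid):
        """Return the newest revision of oid committed strictly before tid."""
        tids = self._tids.get(oid, [])
        index = bisect.bisect_left(tids, tid)
        if index == 0:
            return None
        return self._data[oid, tids[index - 1]]


store = RevisionStore()
store.commit({'account': 100})   # tid 1
snapshot = store.last_tid + 1    # a reader starts here and sees tid <= 1
store.commit({'account': 250})   # tid 2, committed while the reader runs
print(store.load_before('account', snapshot))            # 100
print(store.load_before('account', store.last_tid + 1))  # 250
```

The reader keeps seeing the value from its own snapshot even though a newer revision exists, without any lock being held on the object.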
The reliability of a data storage such as NEO is critical. To ensure the
quality of NEO's design, its protocol is in the process of being formally proven.
To ensure code quality, the NEO project relies on automated testing:
- unit tests checking individual method behaviour
- functional tests checking node and cluster behaviour
- standard ZODB test suites
You can get the source code in the following Git repository:
https://lab.nexedi.com/nexedi/neoppod.git (GitHub mirror)
or browse it online.
It is also published on PyPI.
The following software is required:
- Linux 2.6 or later.
Note: the actual requirement is epoll, which was integrated in Linux 2.5.44. There
are plans to add support for other platforms, but this is not implemented yet.
- Python 2.7 (2.7.9 or later for SSL support).
Note: MySQL server is currently used as a backend for NEO, with the InnoDB,
RocksDB or TokuDB storage engine. This was chosen as an early approach to take
advantage of existing features (transactional persistent storage, basically),
and will be replaced with a leaner key/value storage later.
- ZODB >= 3.10.x (Zope 2.13 or later)
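The interpreter and platform requirements above can be checked before installation with a few lines of standard-library Python. This is only a convenience sketch: the `check_requirements` function and its messages are hypothetical, and it covers neither the MySQL backend nor the ZODB version.

```python
import select
import sys


def check_requirements(version_info=sys.version_info,
                       has_epoll=hasattr(select, 'epoll')):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if tuple(version_info[:2]) != (2, 7):
        problems.append('Python 2.7 is required')
    elif tuple(version_info[:3]) < (2, 7, 9):
        problems.append('Python 2.7.9 or later is needed for SSL support')
    if not has_epoll:
        problems.append('epoll is not available on this platform')
    return problems


for problem in check_requirements():
    print(problem)
```

The parameters default to the running interpreter, and can be overridden to check another configuration, for example `check_requirements((2, 7, 5), True)`.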
- Verification of large-scale distributed database systems in the NEOPPOD project, by O. Bertrand, A. Calonne, C. Choppy, S. Hong, K. Klai, F. Kordon, Y. Okuji, E. Paviot-Adet, L. Petrucci, and J.-P. Smets, in Workshop on Petri Nets and Software Engineering (PNSE'09, associated with Petri Nets 2009), poster paper, pages 315–316, 2009.
- The NEO Protocol for Large-Scale Distributed Database Systems: Modelling and Initial Verification, by Christine Choppy, Anna Dedova, Sami Evangelista, Silien Hong, Kais Klai, and Laure Petrucci.
Tips and Tricks
Automated test results are published on www.erp5.com.
Q: How does NEO scale compared to ZODB?
A: For "normal" database use (1+ TB), NEO runs very stably. Topics other than scalability
that are being worked on include the pruning of old data, which is too slow, and the reshaping
of a cluster (NEO moving data when the cluster changes).
We ran a number of scalability tests going up to 150 TB to find bottlenecks. Issues found and being investigated:
- replication is too slow (with a 20 TB database, when one storage becomes OUT_OF_DATE, syncing it takes too long with MariaDB's RocksDB engine)
- read/write throughput (uploading 1.5 TB/day through fluentd) was slower than disk/ethernet speed.
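To put the ingestion figure above in perspective, it helps to convert it to a sustained rate. The short calculation below assumes decimal units (1 TB = 10^12 bytes) and a load spread evenly over the day; both are assumptions, since the test setup is not detailed here.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate(bytes_per_day):
    """Return (megabytes per second, megabits per second)."""
    bytes_per_second = bytes_per_day / float(SECONDS_PER_DAY)
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e6


mb_s, mbit_s = sustained_rate(1.5e12)
print('%.1f MB/s, %.1f Mbit/s' % (mb_s, mbit_s))  # 17.4 MB/s, 138.9 Mbit/s
```

An average of roughly 17 MB/s is the number to compare against the raw disk and network bandwidth of the machines involved.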
The size of one NEO server currently in use and connected to 60 Zope instances is 83724 GB
(stored in the ERP5 data stream module on top of NEO; disk consumption is smaller thanks to compression).
We want to gradually increase these 80 TB stored in ZODB to 1 PB. We also ran one test storing
an array of 4.9 million rows as a wendelin.core
persistent numpy array in ZODB without consuming too much memory. Doing something
similar with ZODB and FileStorage would be difficult in terms of disk space and performance.
NEO is Free Software, licensed under the terms of the GNU GPL v3 (or later). For rationale, please see Nexedi licensing.
Projects directly related to NEO, but not actually touching its core. They may or may not strictly depend on NEO.
Idea: write a FUSE wrapper using NEO as a storage back-end, instead of a hard disk partition, or cd, etc.
Goal: stress-testing with filesystem benchmark suites.
Progress: Started in May 2010, stalled since then. Code to be cleaned up and published when I (Vincent Pelletier) find time. Unstable.
Idea: write a memcached server using NEO as a storage back-end, instead of ram (original memcached) or other back-ends (kumofs, etc).
Goal: benchmark with memcached-oriented tools.
Progress: Not started, assigned.
API: Storage preferred, otherwise ZODB
To the best of our knowledge, there is no other Storage interface implementation offering both scalability and fault tolerance the way NEO does:
- FileStorage - Single-file storage
- RelStorage - Relational database storage
- DirectoryStorage - Multi-file storage
- ZEO - Networked, multi-storage RPC.
- ZeoRaid - Fault-tolerant clustering of Zeo servers
- Ceph - Although not an object database, its design is very close to NEO
Some interesting pages on topics related to NEO, but not written for/about NEO:
The NEO project was initiated in 2005 by Nexedi, a French company that has been developing ERP5,
a Free Software ERP for small to large enterprises implemented on top of Zope, since 2001.
NEO was then endorsed in 2009 by the System@tic competitive cluster, by Paris Region and by the FEDER programme of the European Union.