ZEO
Space |
1 |
Tolerance |
0 |
NEO
Space |
S-R |
Tolerance |
R |
ZEORaid
Space |
1 |
Tolerance |
S-1 |
S = 4
DB-dependent
relStorage
Space |
1 |
Tolerance |
S-1 |
Non-native
invalidations
0 ≤ R ≤ S-1
Read & Write
Replication
What makes NEO different from other storage implementations, and what needs it addresses.
Why not bare FileStorage ? Bare FileStorage doesn't allow concurrent accesses from multiple processes, so (Python) application cannot take advantage of multiple cores/CPUs, even less multiple machines.
Why not ZEO, which addresses this problem ? ZEO doesn't allow data distribution over multiple disks, so application cannot use more storage space than what is available to a single computer, or get better bandwidth.
Why not ZEORaid ? ZEORaid does not allow scaling in data size terms, as it does RAID-1-ish replication (each underlying storage contains the whole database).
Why not relStorage ? relStorage has scalability issues because Relational DataBases do not send invalidation messages, and it must emulate them. Also, depending on the underlying RDB, it might not support master-master replication. And also, it suffers from the same problem as ZEORaid for data size scalability.
NEO's design addresses these problems:
NEO allows tweaking desired fault-tolerance and space efficiency, by providing it with a number of replicates ("R" in above figure), achieving data distribution among machines. Master-master replication is natively implemented in NEO, as well as invalidation handling.