Distributed, redundant and transactional storage for ZODB
NEO is a distributed, redundant and scalable implementation of the ZODB API. NEO stands for Nexedi Enterprise Object.
A NEO cluster is composed of the following types of nodes:
“master” nodes (mandatory, 1 or more)
Takes care of transactionality. Only one master node is active at any given time (it is called the “primary master”); extra masters are spares (called “secondary masters”).
“storage” nodes (mandatory, 1 or more)
Stores data, preserving history. All available storage nodes are in use simultaneously. This offers redundancy and data distribution. Available backends: MySQL (InnoDB or TokuDB), SQLite
“admin” nodes (mandatory for startup, optional after)
Accepts commands from the neoctl tool, transmits them to the primary master, and monitors cluster state.
“client” nodes
Well… Something needing to store/load data in a NEO cluster.
ZODB API is fully implemented except:
- pack: only old revisions of objects are removed for the moment (full implementation is considered)
- blobs: not implemented (not considered yet)
Any ZODB storage such as FileStorage can be converted to NEO instantaneously, which means the database is operational before all data are imported. There’s also a tool to convert back to FileStorage.
See also http://www.neoppod.org/links for more detailed information about features related to scalability.
In addition to the disclaimer contained in the licence this code is released under, please consider the following.
NEO does not implement any authentication mechanism between its nodes, and does not encrypt data exchanged between nodes either. If you want to protect your cluster from malicious nodes, or your data from being snooped, please consider encrypted tunnelling (such as openvpn).
- Linux 2.6 or later
- Python 2.7.x
- For storage nodes using MySQL backend:
- For client nodes: ZODB 3.10.x
NEO can be installed like any other egg (see setup.py). Alternatively, you can simply make the neo directory available for Python to import (for example, by adding its parent directory to the PYTHONPATH environment variable).
Write a neo.conf file like the example provided. If you use MySQL, you’ll also need to create 1 database per storage node.
Start all required nodes:
$ neomaster -f neo.conf
$ neostorage -f neo.conf -s storage1
$ neostorage -f neo.conf -s storage2
$ neoadmin -f neo.conf
Tell the cluster to initialize storage nodes:
$ neoctl -a <admin> start
Clients can connect when the cluster is in RUNNING state:
$ neoctl -a <admin> print cluster
RUNNING
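Waiting for the RUNNING state can be scripted. The following is a minimal sketch, assuming that `neoctl -a <admin> print cluster` writes only the state name to standard output, as the example above suggests. The helper names (`cluster_state`, `wait_until_running`), the timeout values and the injectable `run`/`sleep` parameters are hypothetical and not part of NEO.

```python
import subprocess
import time


def cluster_state(admin, run=subprocess.check_output):
    """Return the state printed by `neoctl -a <admin> print cluster`."""
    out = run(["neoctl", "-a", admin, "print", "cluster"])
    if isinstance(out, bytes):
        out = out.decode("ascii", "replace")
    return out.strip()


def wait_until_running(admin, timeout=60.0, interval=1.0,
                       run=subprocess.check_output, sleep=time.sleep):
    """Poll the cluster state until it is RUNNING or the timeout expires.

    Returns True if RUNNING was observed, False on timeout.
    """
    deadline = time.time() + timeout
    while True:
        if cluster_state(admin, run) == "RUNNING":
            return True
        if time.time() >= deadline:
            return False
        sleep(interval)
```

A client start-up script could call `wait_until_running("<admin>")` with the address of the admin node and only proceed when it returns True.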
See the importer.conf file to import an existing database, or the neoctl command for more administrative tasks.
Alternatively, you can use the neosimple command to quickly set up a cluster for testing.
First make sure Python can import the ‘neo.client’ package.
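One way to check this before editing zope.conf is to attempt the import from the same Python interpreter that runs Zope. This is a minimal sketch; the function name `can_import` and its `extra_path` parameter are hypothetical helpers, not part of NEO.

```python
import importlib
import sys


def can_import(package, extra_path=None):
    """Return True if `package` is importable.

    If `extra_path` is given, it is prepended to sys.path first, which has
    the same effect as listing that directory in PYTHONPATH.
    """
    if extra_path and extra_path not in sys.path:
        sys.path.insert(0, extra_path)
    try:
        importlib.import_module(package)
    except ImportError:
        return False
    return True
```

For example, `can_import("neo.client")` should return True once NEO is installed or its parent directory is on the import path.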
Edit your zope.conf, add a neo import and edit the zodb_db section by replacing its filestorage subsection by a NEOStorage one. It should look like:
%import neo.client
<zodb_db main>
    <NEOStorage>
        master_nodes 127.0.0.1:10000
        name <cluster name>
    </NEOStorage>
    mount-point /
</zodb_db>
Just create the storage object and play with it:
from neo.client.Storage import Storage
s = Storage(master_nodes="127.0.0.1:10010", name="main")
...
The “name” and “master_nodes” parameters have the same meaning as in the configuration file.
Before shutting down NEO, all clients, such as Zope instances, should be stopped so that the cluster becomes idle. This is required for multi-DB setups, to prevent critical failures in the second phase of TPC.
A cluster (i.e. masters+storages+admin) can be stopped gracefully by putting it in STOPPING state using neoctl:
neoctl -a <admin> set cluster STOPPING
This can also be done manually, which helps if your cluster is in bad state:
- Stop all master nodes first with a SIGINT or SIGTERM, so that storage nodes don’t enter the OUT_OF_DATE state.
- Next stop remaining nodes with a SIGINT or SIGTERM.
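The manual procedure above can be sketched as a small script. This is only an illustration under stated assumptions: the PIDs of the master nodes and of the remaining nodes are known to the caller, and the script waits for the masters to exit before signalling the others. The function names, the 30 second timeout and the injectable `kill`/`sleep` parameters are hypothetical.

```python
import errno
import os
import signal
import time


def is_alive(pid, kill=os.kill):
    """Return True while a process with this PID still exists."""
    try:
        kill(pid, 0)  # signal 0 only checks for existence
    except OSError as e:
        # ESRCH means "no such process"; other errors (e.g. EPERM)
        # mean the process exists but cannot be signalled by us.
        return e.errno != errno.ESRCH
    return True


def stop_cluster(master_pids, other_pids, sig=signal.SIGTERM,
                 timeout=30.0, kill=os.kill, sleep=time.sleep):
    """Stop masters first, wait for them to exit, then stop the other nodes."""
    for pid in master_pids:
        kill(pid, sig)
    deadline = time.time() + timeout
    while any(is_alive(pid, kill) for pid in master_pids):
        if time.time() >= deadline:
            raise RuntimeError("master nodes did not exit in time")
        sleep(0.1)
    for pid in other_pids:
        kill(pid, sig)
```

Passing `sig=signal.SIGINT` gives the other variant mentioned in the list.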
This is the recommended way to back up a NEO cluster. Once a cluster with appropriate upstream_cluster & upstream_masters configuration is started, you can switch it into backup mode using:
neoctl -a <admin> set cluster STARTING_BACKUP
It remembers it is in such mode when it is stopped, and it can be put back into normal mode (RUNNING) by setting it into STOPPING_BACKUP state.
Packs are currently not replicated, which means packing should always be done up to a TID that is already fully replicated, so that the backup cluster has a full history (and not random holes).
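The packing constraint can be expressed as a simple clamp. The sketch below assumes TIDs are 8-byte big-endian strings, as in ZODB, so that byte-wise comparison orders them chronologically. How the last fully replicated TID is obtained is not covered here; `last_replicated_tid` is a hypothetical input, and `safe_pack_tid` is not a NEO function.

```python
import struct


def safe_pack_tid(requested_tid, last_replicated_tid):
    """Clamp a pack TID so it never exceeds what the backup has replicated."""
    if len(requested_tid) != 8 or len(last_replicated_tid) != 8:
        raise ValueError("TIDs must be 8 bytes long")
    # Byte-wise min() picks the older of the two big-endian TIDs.
    return min(requested_tid, last_replicated_tid)


# Example with made-up TIDs: the request is newer than what was replicated,
# so packing is limited to the replicated TID.
requested = struct.pack(">Q", 250)
replicated = struct.pack(">Q", 100)
assert safe_pack_tid(requested, replicated) == replicated
```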
NEO has no built-in deployment features such as process daemonization. We use supervisor with a configuration like the one below:
[group:neo]
programs=master_01,storage_01,admin

[program:storage_01]
priority=10
command=neostorage -s storage_01 -f /neo/neo.conf

[program:master_01]
priority=20
command=neomaster -s master_01 -f /neo/neo.conf

[program:admin]
priority=20
command=neoadmin -s admin -f /neo/neo.conf
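For clusters with several storage nodes, a configuration of this shape can be generated instead of written by hand. The following is a rough sketch that only reproduces the layout, priorities and commands shown above; the function name `supervisor_config` and its parameters are hypothetical, and the node names must match the sections of your own neo.conf.

```python
def supervisor_config(storages, conf="/neo/neo.conf"):
    """Render a supervisor group with one master, one admin and N storages."""
    programs = ["master_01"] + list(storages) + ["admin"]
    lines = ["[group:neo]", "programs=" + ",".join(programs), ""]
    # Storage nodes use priority 10, as in the configuration above.
    for name in storages:
        lines += ["[program:%s]" % name,
                  "priority=10",
                  "command=neostorage -s %s -f %s" % (name, conf),
                  ""]
    # Master and admin nodes use priority 20.
    for name, command in (("master_01", "neomaster"), ("admin", "neoadmin")):
        lines += ["[program:%s]" % name,
                  "priority=20",
                  "command=%s -s %s -f %s" % (command, name, conf),
                  ""]
    return "\n".join(lines)
```

Calling `supervisor_config(["storage_01"])` yields the same programs as the example configuration.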
Developers interested in NEO may refer to the NEO Web site and subscribe to the following mailing lists:
This version comes with a change in the SQL tables format, to fix a potential crash of storage nodes when storing values that only differ by the compression flag. See UPGRADE notes if you think your application may be affected by this bug.
- Performance and features:
- ‘Importer’ storage backend has been significantly sped up.
- Support for TokuDB has been added to MySQL storage backend. The engine is still InnoDB by default, and it can be selected via a new ‘neostorage’ option.
- A ‘neomaster’ option has been added to automatically start a new cluster if the number of pending storage nodes is greater than or equal to the specified value.
- Storage crashed when reading empty transactions. We still need to decide whether NEO should:
- continue to store such transactions;
- ignore them on commit, like other ZODB implementations;
- or fail on commit.
- Storage crashed when a client tries to “steal” the UUID of another client.
- Client could get stuck forever on unreadable cells when not connected to the master.
- Client could only instantiate NEOStorage from the main thread, and the RTMIN+2 signal displayed logs for only 1 NEOStorage. Now, RTMIN+2 & RTMIN+3 are set up when the neo.client module is imported.
- Plus fixes and improvements to logging and debugging.
- Version 1.2 added a new ‘Importer’ storage backend but it had 2 bugs.
- An interrupted migration could not be resumed.
- Merging several ZODB only worked if NEO could import all classes used by the application. This has been fixed by repickling without loading any object.
- Logging has been improved for a better integration with the environment:
- RTMIN+1 signal was changed to reopen logs. RTMIN+1 & RTMIN+2 signals, which were previously used for debugging, have been remapped to RTMIN+2 & RTMIN+3
- In Zope, client registers automatically for log rotation (USR2).
- NEO logs are SQLite DBs that are no longer opened with a persistent journal, because this is incompatible with the rename+reopen way to rotate logs, and we want to support logrotate.
- ‘neolog’ can now open gzip/bz2 compressed logs transparently.
- ‘neolog’ does not spam the console anymore when piped to a process that exits prematurely.
- MySQL backend has been updated to work with recent MariaDB (>=10).
- 2 ‘neomaster’ command-line options were added to set upstream cluster/masters.
The most important changes in this version concern the conversion of databases from/to NEO:
A new ‘Importer’ storage backend has been implemented and this is now the recommended way to migrate existing Zope databases. See ‘importer.conf’ example file for more information.
The ‘neomigrate’ command refused to run since version 1.0.
Data serials exported by the NEO iterator were wrong. There are still differences with FileStorage:
- NEO always resolves to original serial, to avoid any indirection (which slightly speeds up undo at the expense of a more complex pack code)
- NEO does not make any difference between object deletion and creation undone (data serial always null in storage)
Apart from that, conversion of database back from NEO should be fixed.
Other changes are:
- A warning was added in ‘neo.conf’ about a possible misuse of replicas.
- Compatibility with Python 2.6 has been dropped.
- Support for recent versions of SQLite has been added.
- A memory leak has been fixed in replication.
- MySQL backend now fails instead of silently reconnecting if there is any pending change, which could cause data loss.
- Optimization and minor bugfixes.
- Client failed at reconnecting properly to master. It could kill the master (during tpc_finish!) or end up with invalid caches (i.e. possible data corruption). Now, connection to master is even optional between transaction.begin() and tpc_begin, as long as partition table contains up-to-date data.
- Compatibility with ZODB 3.9 has been dropped. Only 3.10.x branch is supported.
- checkCurrentSerialInTransaction was not working.
- Optimization and minor bugfixes.
This version mainly comes with stabilized SQL tables format and efficient backup feature, relying on replication, which has been fully reimplemented:
- It is now incremental, instead of being done on whole partitions. The schema of MySQL tables has been changed in order to optimize storage layout, for good partial replication performance.
- It runs at the lowest priority so as not to degrade performance for client nodes.
- A cluster in the new BACKINGUP state is a client to a normal cluster and all its storage nodes are notified of invalidations and replicate from upstream nodes.
Other changes are:
- Compatibility with Python < 2.6 and ZODB < 3.9 has been dropped.
- Cluster is now automatically started when all storage nodes of UP_TO_DATE cells are available, similarly to mdadm assemble --no-degraded behaviour.
- NEO learned to check replicas, to detect data corruption or bugs during replication. When done on a backup cluster, upstream data is used as reference. This is still limited to data indexes (tid & oid/serial).
- NEO logs are now SQLite DBs that always contain all debugging information, including exchanged packets. Records are first kept in RAM, at most 16 MB by default, and they are flushed to disk only upon RTMIN signal or any important record. A ‘neolog’ script has been written to help reading such DBs.
- Master addresses must be separated by spaces. ‘/’ can’t be used anymore.
- Adding and removing master nodes is now easier: unknown incoming master nodes are now accepted instead of rejected, and nodes can be given a path to a file that maintains a list of known master nodes.
- Node UUIDs have been shortened from 16 to 4 bytes, for better performance and easier debugging.
Also contains code clean-ups and bugfixes.
- Client didn’t limit its memory usage when committing big transactions.
- Master failed to disconnect clients when cluster leaves RUNNING state.
- Storage was unable or slow to process large-sized transactions. This required changing the protocol and the MySQL tables format.
- NEO learned to store empty values (although it’s useless when managed by a ZODB Connection).
- storage: a specific socket can be given to MySQL backend
- storage: a ConflictError could happen when client is much faster than master
- ‘verbose’ command line option of ‘neomigrate’ did not work
- client: ZODB monkey-patch randomly raised a NameError
- client: method to retrieve history of persistent objects was incompatible with recent ZODB and needlessly asked all storages systematically.
- neoctl: ‘print node’ command (to get list of all nodes) raised an AssertionError.
- ‘neomigrate’ raised a TypeError when converting NEO DB back to FileStorage.
NEO is considered stable enough to replace existing ZEO setups, except that:
- there’s no backup mechanism (aka efficient snapshotting): there’s only replication and underlying MySQL tools
- MySQL tables format may change in the future