Who needs a relational database anyway? This is not just a theoretical question; it was the key question we had to consider when building our application. A common assumption when architecting a new application is that you need a relational database to persist data. What if we revisited that assumption?
Our application had to capture very diverse data yet scan that data in sub-second time. The idea is simple: memory access is fast. How could we solve a complex business problem with a system that takes advantage of objects in memory and is not constrained in design by traditional relational database interaction? What would the solution entail?
Many of the classic problems whose solutions we had taken for granted needed to be reconsidered. Transactions, fault tolerance, and backups were a few of the hurdles we overcame. Operating system and JVM memory limitations forced us to distribute the data across multiple physical and logical machines.
Using the Spring Framework and other open source tools, we constructed a fault-tolerant, distributed, in-memory “database” framework. This framework provided the application with the traditional CRUD operations (Create, Read, Update, and Delete). Specifically, the following technical areas will be discussed in detail:
- Transactions – How we take advantage of the Spring Transaction Model
- Fault Tolerance – Ensuring the data in memory survives application restarts using XML
- Software/Data Upgrades – How we handle changes to the object model
- Online Backups – Backing up memory without blocking transactions
- Data Distribution – Growing the model beyond OS and JVM memory limitations
- Application Performance – Tuning the object queries and maximizing memory capacity
- Import/Export of Data – Interfacing with external systems
- Testing – Unit and integration testing benefits
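To make the fault-tolerance and transaction ideas concrete, here is a minimal sketch of XML transaction journaling. This is not the actual framework code: the class name `XmlJournalStore`, the string-array command encoding, and the use of the standard `java.beans.XMLEncoder` are all assumptions for illustration. The shape, however, matches the approach described above: every write is recorded as a numbered XML file before it touches memory, and a restart replays the journal to rebuild state.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch, not the framework from the text: an in-memory map
 * whose writes are journaled as numbered XML files before being applied.
 * A restart replays the journal; deleting journal files "goes back in time".
 */
public class XmlJournalStore {
    private final Map<String, String> data = new HashMap<>();
    private final Path journalDir;
    private long sequence = 0;

    public XmlJournalStore(Path journalDir) {
        this.journalDir = journalDir;
        try {
            Files.createDirectories(journalDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        replay();
    }

    /** Journal first, then mutate memory, so the write survives a restart. */
    public void put(String key, String value) {
        writeJournalEntry(new String[] {"put", key, value});
        data.put(key, value);
    }

    public String get(String key) {
        return data.get(key);
    }

    private void writeJournalEntry(String[] command) {
        Path file = journalDir.resolve(String.format("%010d.xml", sequence++));
        try (XMLEncoder encoder = new XMLEncoder(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            encoder.writeObject(command);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Replay every journaled command in sequence order to rebuild state. */
    private void replay() {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                Files.newDirectoryStream(journalDir, "*.xml")) {
            stream.forEach(entries::add);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(entries);
        for (Path entry : entries) {
            try (XMLDecoder decoder = new XMLDecoder(
                    new BufferedInputStream(Files.newInputStream(entry)))) {
                String[] command = (String[]) decoder.readObject();
                if ("put".equals(command[0])) {
                    data.put(command[1], command[2]);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            sequence++;
        }
    }
}
```

Because each transaction is its own XML file in this sketch, a backup process can copy completed journal files while new ones are still being written, and removing the most recent files rolls the store back in time.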
The application was tested with approximately 3.6 million rows of data distributed across 6 physical machines and 7 JVMs. Sub-second response times were maintained by farming out queries to the remote JVMs in parallel. We also constructed extensive monitoring tools that provided the information needed to optimize the system for both memory efficiency and object distribution.
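The query fan-out can be sketched in a few lines. The real system queried remote JVMs; in this illustrative version (all names are assumptions, not taken from the actual application) each “node” is simply a local partition of the data searched on its own thread, with the partial results merged at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

/**
 * Hypothetical scatter-gather sketch: fan a query out to every partition in
 * parallel and merge the hits. In the real system each partition would be a
 * remote JVM; here it is just an in-memory list.
 */
public class ScatterGatherQuery {
    private final List<List<String>> partitions;
    private final ExecutorService pool;

    public ScatterGatherQuery(List<List<String>> partitions) {
        this.partitions = partitions;
        this.pool = Executors.newFixedThreadPool(partitions.size());
    }

    /** Submit the predicate to every partition, then merge the results. */
    public List<String> query(Predicate<String> matches) throws Exception {
        List<Future<List<String>>> futures = new ArrayList<>();
        for (List<String> partition : partitions) {
            futures.add(pool.submit(() -> {
                List<String> hits = new ArrayList<>();
                for (String row : partition) {
                    if (matches.test(row)) {
                        hits.add(row);
                    }
                }
                return hits;
            }));
        }
        List<String> merged = new ArrayList<>();
        for (Future<List<String>> future : futures) {
            merged.addAll(future.get());
        }
        return merged;
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

The design point is that response time is bounded by the slowest partition rather than the sum of all partitions, which is how sub-second scans over millions of rows remain feasible as data is spread across machines.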
Removing the relational database yielded real benefits. Eliminating the O/R mapping layer allowed the objects used in the business layer to be more natural in design: the model was no longer constrained or distorted by complexities in the underlying storage mechanism. This made for the simpler, easier-to-read code that all developers strive for. Developers concerned themselves less with the complexities of the framework and more with the business problems. Ultimately, the application development cycle was shortened, reducing time to market.
There were also some unexpected benefits. Unit and integration tests became easier not only to write but also to run. Persisting the data as XML allowed the home state for a test to be initialized easily, and the data became truly portable between developers, tests, and application installations. The classic “undo” problem of relational databases also went away: the persisted transactions could simply be removed, allowing developers to “go back in time”.
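The testing benefit can be sketched as follows. Because the state is plain XML, a known “home state” can be snapshotted once and reloaded at the start of every test run or shared between developers. The `XmlFixture` class and its method names are hypothetical, and the standard `java.beans.XMLEncoder`/`XMLDecoder` pair stands in for whatever XML mechanism the framework actually used.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of the testing benefit: snapshot a known "home state"
 * to an XML file once, then reload it so every test starts from the same data.
 */
public class XmlFixture {

    /** Save the current state as a portable XML snapshot. */
    public static void snapshot(Map<String, String> state, Path file) {
        try (XMLEncoder encoder = new XMLEncoder(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            encoder.writeObject(new HashMap<>(state));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reload the snapshot, giving a test its home state in one call. */
    @SuppressWarnings("unchecked")
    public static Map<String, String> load(Path file) {
        try (XMLDecoder decoder = new XMLDecoder(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return (Map<String, String>) decoder.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since the snapshot is an ordinary file, the same fixture can be checked into version control and used unchanged by a developer workstation, a test run, or a fresh application installation.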