Since its first open-source release in 2004, MonetDB has metamorphosed from a Linux-focused research project into an end-user-oriented, industry-ready product. Today, I let my most precious kākāpō fly over MonetDB’s ecosystem.
First of all, we have all the basics covered. Today, customers run MonetDB on all three major hardware platforms (i.e. Intel, AMD and ARM) and on all three major operating systems (i.e. Linux, Windows and macOS). For virtual hardware users, MonetDB is also officially released on Docker.
MonetDB started as a server-only database system. But since 2020, the server flavour has had an embedded sibling, MonetDB/e, a lightweight C library that can be included in user applications. MonetDB/e inherits MonetDB’s data analytics power by sharing the same core library with the server-based variant. This design allows both flavours to enjoy new features in the core library immediately, while reducing the effort of maintaining them. Integrations with Python and Java have subsequently been released, while the integration with R is a work in progress.
Offering MonetDB products in the cloud is our ultimate dream. The ServiceNow investment of 2021 gave us a substantial push in this direction. Shortly after that, the first MonetDB product on the AWS Marketplace was released, a Christmas gift for our users. With this release, we have embarked on a multi-year project. First, we will release more free MonetDB products on other cloud platforms in 2022, e.g. Azure, DigitalOcean and Google Cloud Platform. Simultaneously, we are expanding MonetDB’s cloud offerings to paid products with more user-oriented features, with a fully managed serverless platform as the eventual goal.
High-level software architecture
From a bird’s-eye view, MonetDB has a straightforward architecture. Users can interact with a MonetDB server through standard SQL interfaces as illustrated in this figure.
High-performance query engine
One of MonetDB’s design philosophies is that users should not need in-depth DBMS knowledge before they can enjoy the performance of the database system (a.k.a. the zero-knob design principle). Hence, for most (analytics-heavy) workloads, MonetDB performs well out of the box.
The one thing that differentiates MonetDB from other relational database systems is hidden behind the standard interfaces: the query engine. It is only noticeable to the users through query execution time. MonetDB’s query engine achieves its high performance for data analytics through several techniques:
A pure-columnar data storage scheme that is particularly advantageous for complex analytical queries.
Parallel processing of columnar data throughout the whole system.
In-memory optimisation that allows faster follow-up queries.
A multicore architecture that can easily leverage modern hardware acceleration.
Auto-tuned indices that eliminate manual DBA work.
Advanced in-database analytical features through SQL User-Defined Functions implemented in Python, R and C/C++.
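To give an idea of what an in-database Python function looks like, here is a minimal sketch. The function name times_two and the helper install_udf are made up for illustration. The sketch assumes that embedded Python has been enabled on the server, and that the column reaches the function body as a NumPy array, so that i * 2 works on the whole column at once.

```python
# DDL for a hypothetical scalar UDF. The text between the braces is
# Python source that the server runs, not code executed by this script.
TIMES_TWO_DDL = """\
CREATE FUNCTION times_two(i INTEGER) RETURNS INTEGER
LANGUAGE PYTHON {
    return i * 2
};"""


def install_udf(cursor, ddl=TIMES_TWO_DDL):
    """Send the DDL through any DB-API cursor connected to MonetDB.

    Assumption: the server was started with embedded Python enabled.
    """
    cursor.execute(ddl)


print(TIMES_TWO_DDL)
```

Once installed, such a function would be called like any other SQL function, e.g. SELECT times_two(some_column) FROM some_table.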
Input data loading interface
At the bottom, MonetDB supports various ways to import and export text and binary data. Next to the standard SQL command INSERT INTO, MonetDB provides COPY INTO for super-fast bulk loading of large data sets. MonetDB supports standard data formats, including CSV and JSON.
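The difference between the two loading paths is easiest to see side by side. The sketch below writes a small headerless CSV file and builds both statements as strings. The table name fruit and all helper names are hypothetical, and the snippet only composes SQL text. To execute it you would pass the strings to a database cursor, and the server must be able to read the file at the given absolute path. The %s placeholders assume a driver with the pyformat parameter style, such as pymonetdb.

```python
import csv
import tempfile
from pathlib import Path


def write_csv(rows, path):
    """Write rows to a headerless CSV file and return its absolute path."""
    path = Path(path).resolve()
    with path.open("w", newline="") as handle:
        csv.writer(handle, lineterminator="\n").writerows(rows)
    return path


def copy_into_statement(table, csv_path):
    """Bulk load: one statement reads the whole file on the server side."""
    # Delimiters are field separator, record separator, quote character.
    return (
        f"COPY INTO {table} FROM '{csv_path}' "
        "USING DELIMITERS ',', E'\\n', '\"'"
    )


def insert_statement(table, n_columns):
    """Row-at-a-time load: one parameterised statement per row."""
    placeholders = ", ".join(["%s"] * n_columns)
    return f"INSERT INTO {table} VALUES ({placeholders})"


rows = [(1, "apple", 0.50), (2, "banana", 0.25)]
csv_path = write_csv(rows, Path(tempfile.gettempdir()) / "fruit_demo.csv")
copy_sql = copy_into_statement("fruit", csv_path)
insert_sql = insert_statement("fruit", 3)
print(copy_sql)
print(insert_sql)
```

For a handful of rows the INSERT route is fine. COPY INTO pays off when the file holds many rows, because the server parses the file in one go instead of handling one statement per row.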
The data vault is MonetDB’s equivalent of the data lake concept. The MonetDB data vault framework provides a unified interface to load foreign file formats into relational tables to be queried using the standard SQL interfaces. Currently, MonetDB supports SHP (for geospatial data) and FITS (for images) files. The support for NetCDF and GDAL is still experimental. One can easily extend the data vault framework to support other file formats.
Output query interface
At the top, MonetDB supports a broad scope of standard SQL features, such as:
Common SQL features, e.g. keys, joins, views, triggers, stored procedures, user authentication, and data access control,
Full-ACID properties for concurrent transactions,
SQL:1999 ROLLUP, CUBE, GROUPING SETS,
SQL:2003 merge statements,
SQL:2011 window functions, and
PostGIS-compatible geospatial features.
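For readers who have not met ROLLUP before, the following sketch shows a query against a hypothetical sales table together with a plain-Python emulation of the rows such a query returns. The emulation only illustrates the semantics (a total per region and product, a subtotal per region, and a grand total). It is not how MonetDB evaluates the query.

```python
from collections import defaultdict

# Query text for a hypothetical table sales(region, product, amount).
ROLLUP_SQL = """\
SELECT region, product, SUM(amount) AS total
FROM sales
GROUP BY ROLLUP (region, product)"""


def rollup_totals(rows):
    """Emulate GROUP BY ROLLUP (region, product) with SUM(amount).

    None marks a rolled-up column, the way SQL reports NULL there.
    """
    totals = defaultdict(int)
    for region, product, amount in rows:
        totals[(region, product)] += amount  # finest grouping level
        totals[(region, None)] += amount     # subtotal per region
        totals[(None, None)] += amount       # grand total
    return dict(totals)


sales = [("north", "tea", 10), ("north", "coffee", 5), ("south", "tea", 7)]
totals = rollup_totals(sales)
for key, total in totals.items():
    print(key, total)
```

CUBE and GROUPING SETS follow the same idea with a different choice of grouping levels.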
MonetDB provides more than ten drivers so that users can connect to a MonetDB database server from within their favourite programming language environments. In addition, MonetDB supports object-relational mappers, including SQLAlchemy and Hibernate.
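As an example of what connecting from a programming language looks like, here is a minimal sketch using the Python driver pymonetdb. The database name demo and the helper names are hypothetical. The credentials and port are the usual defaults of a fresh installation and should be changed for anything beyond a local test. The driver is imported inside the function so the snippet loads even when pymonetdb is not installed.

```python
def connection_settings(database, hostname="localhost", port=50000,
                        username="monetdb", password="monetdb"):
    """Collect the keyword arguments expected by pymonetdb.connect."""
    return {
        "database": database,
        "hostname": hostname,
        "port": port,
        "username": username,
        "password": password,
    }


def fetch_all(sql, settings):
    """Run one query and return all rows (needs a running server)."""
    import pymonetdb  # third-party driver: pip install pymonetdb

    connection = pymonetdb.connect(**settings)
    try:
        cursor = connection.cursor()
        cursor.execute(sql)
        return cursor.fetchall()
    finally:
        connection.close()


settings = connection_settings("demo")
print(settings)
# With a server running, a query would look like:
# rows = fetch_all("SELECT 1", settings)
```

The driver follows the Python DB-API, so code written against other DB-API drivers usually needs little more than a different connect call.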
Jan. 31, 2022