21 Best Free Open Source Databases
As a developer or DBA, you are probably using some of the widely used databases such as MS SQL Server, MySQL, Oracle, PostgreSQL, or MongoDB. As we all know, MySQL is the most popular free, open source database in use today. Besides MySQL, there are many free and open source databases that you may not know about or may never have used, including PostgreSQL, MongoDB, HBase, Cassandra, Couchbase, Neo4j, Riak, Redis, Firebird, and many more. I am using Firebird in my current project, which is built with Delphi XE4. I have compiled a list of 21 of the best free and open source databases available to us. Let's have a look at them:
The most widely used open source database for Web apps (and many other things) remains MySQL. Support for multiple storage engines, clustering, full-text indexing, and plenty of other professional features has allowed numerous apps, from WordPress to Movable Type, to rely on MySQL as their default database. Graphical front ends, such as phpMyAdmin and Adminer, make using the database far less of a chore. And for those seeking escape from the long shadow of Oracle, there's a community fork named MariaDB, maintained by MySQL's original lead developer, Monty Widenius.
When Oracle acquired MySQL, reduced the development staff, and more or less killed the open source nature of the project, it reopened a market that MySQL had locked down. PostgreSQL has a much nicer set of drivers and supports both standard ANSI SQL and extended features, in many cases better than MySQL. On the downside, its long legacy has left it with a multiprocess design in an era of multithreaded servers. The high-availability and clustering features of PostgreSQL require a lot of elbow grease and leave much to be desired. Yet as organizations look for a community-developed database, one of the oldest starts to look pretty good. Many cloud providers, such as Heroku, have chosen PostgreSQL as their RDBMS storage option as well.
4. Hadoop (HBase)
Hadoop is the name brand in big data. It is also the convergence of "clustered storage" systems like Gluster and Ceph with NoSQL. Hadoop is really a collection of projects to solve large and complex data problems. In fact, there are multiple types of databases and query languages built on the overall Hadoop framework. Hadoop's complexity is as legendary as its capability, and its lack of high-availability features has both held it back and created a commercial add-on ecosystem.
The HBase project aims to host very large tables: billions of rows by millions of columns. It has a RESTful web service gateway that supports XML, Protobuf, and binary data encoding options.
5. Apache Cassandra
Written in Java, this BigTable-based key-value database is getting more popular by the day. Open source and built to integrate with Hadoop, Cassandra offers the column family solution to developers wanting to move away from the relational database model while working with Hadoop. Focusing mainly on getting in very fast writes and providing high availability, Cassandra has slower reads than some alternatives. It is mostly used for logging purposes and real-time analysis.
Cassandra is a highly scalable second-generation distributed database that is used by giants like Facebook, Digg, Twitter, Cisco & more. It aims to provide a consistent, fault-tolerant & highly available environment for storing data.
While Couchbase was a fork of CouchDB, it has become more of a full-fledged data product and less of a framework than CouchDB. Its transition to a document database will give MongoDB a run for its money. It is multithreaded per node, which can be a major scalability benefit -- especially when hosted on custom or bare-metal hardware. With some nice integration features, including with Hadoop, Couchbase is a great choice for an operational data store.
The database for interconnected data, Neo4j provides a reliable Java-based platform for conquering highly interconnected database problems. Available with full ACID transaction compatibility -- rare in a NoSQL database -- Neo4j has a SQL-like query language called Cypher and a scripting language called Gremlin for graph traversals. Best used to accurately and efficiently model highly complex, interconnected networks like network topologies, social networks, and conditional access control problems, it provides indexes on nodes and relationships. Direct path calculations can take hundreds of lines of code in an RDBMS but only a couple of lines in Neo4j.
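To make the idea of a direct path calculation concrete, here is a minimal sketch in plain Python of the kind of traversal a graph database performs for you. It is not Neo4j's API or Cypher. The `shortest_path` function and the `friends` data are hypothetical names invented for this illustration, and the graph is just an in-memory adjacency list.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over an adjacency list.

    Returns the list of nodes on a shortest path, or None if no path exists.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node we reached it from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue  # already visited
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the back-pointers from goal to start, then reverse.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical social network used only for this example.
friends = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],
    "carol": ["alice", "dave"],
    "dave": ["bob", "carol", "erin"],
    "erin": ["dave"],
}

path = shortest_path(friends, "alice", "erin")
print(path)
```

In a graph database the nodes and relationships live on disk and are indexed, so a query like this is expressed declaratively instead of hand-written as above.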
An open source distributed database written in Erlang and C, Riak treats all nodes equally. No node is a master or a slave. Thus, there is no fear a master will be a single point of failure. However, multi-datacenter replication and SNMP monitoring are not available in the open source version. Much simpler than its peers (such as Cassandra), Riak is optimal for places where even seconds of downtime would hurt.
There are many NoSQL databases, but Redis remains close to our heart because it has so many features that some call it a "data structure store." You don't just store numbers and strings -- you can dump in entire hashes, lists, sets, and other complicated structures. Then, to make the deal sweeter, Redis offers replication and persistence.
Redis is an advanced, fast key-value database written in C which can be used like memcached, in front of a traditional database, or on its own. It supports many programming languages and is used by popular projects such as GitHub and Engine Yard. There is also a PHP client named Rediska for managing Redis databases.
Firebird is a relational database that can run on Linux, Windows & various UNIX platforms. It offers high performance and powerful language support for stored procedures and triggers.
Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. It is intended for use in speeding up dynamic web applications by alleviating database load.
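The usual way to relieve database load with such a store is the cache-aside pattern: check the cache first, fall back to the database on a miss, then store the result. The sketch below illustrates it under stated assumptions. `TinyCache` is a hypothetical in-process stand-in (not a real Memcached client), and `slow_database_lookup` is an invented placeholder for an expensive query.

```python
import time


class TinyCache:
    """Hypothetical in-process stand-in for a memcached-style client."""

    def __init__(self):
        self._items = {}  # key -> (value, expiry time or None)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._items[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else time.monotonic() + ttl
        self._items[key] = (value, expires_at)


db_calls = 0  # counts how often the "database" is actually hit


def slow_database_lookup(user_id):
    """Placeholder for an expensive database query (assumption)."""
    global db_calls
    db_calls += 1
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, user_id):
    """Cache-aside read: try the cache, then the database, then fill the cache."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = slow_database_lookup(user_id)
    cache.set(key, row, ttl=60)
    return row


cache = TinyCache()
first = get_user(cache, 42)   # miss: goes to the database
second = get_user(cache, 42)  # hit: served from the cache
print(first == second, db_calls)
```

With a real Memcached deployment the cache lives in a separate process shared by all web servers, but the read path in `get_user` keeps the same shape.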
12. Oracle Berkeley DB
It is an embeddable database engine that provides developers with fast, reliable, local persistence with zero administration. Oracle Berkeley DB is a library that links directly into your application and lets you make simple function calls rather than sending messages to a remote server, which gives better performance.
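The embedded model is easy to see in code: the store is opened like a file and read or written with ordinary function calls, with no server to connect to. The sketch below uses Python's standard `dbm.dumb` module as a stand-in to keep the example self-contained. It is not Berkeley DB and does not use Berkeley DB's API, and the `demo_embedded_store` function and its keys are hypothetical.

```python
import dbm.dumb
import os
import tempfile


def demo_embedded_store():
    """Write and read a small on-disk key-value store inside this process."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "settings")

        # "c" creates the store if it does not exist yet.
        with dbm.dumb.open(path, "c") as db:
            db[b"theme"] = b"dark"
            db[b"language"] = b"en"

        # Reopen read-only to show the data survived the first close.
        with dbm.dumb.open(path, "r") as db:
            return db[b"theme"], sorted(db.keys())


theme, keys = demo_embedded_store()
print(theme, keys)
```

Every operation here is a direct in-process call against local files, which is the property the paragraph above attributes to an embedded engine.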
Hypertable is a high performance distributed data storage system designed to support applications requiring maximum performance, scalability, and reliability. It is modeled after Google's BigTable and mostly focuses on large-scale datasets.
Keyspace is a consistently replicated, fault-tolerant key-value store that runs on Windows. It offers high availability by masking server and network failures and appearing as a single, highly available service.
4store is a database storage and query engine that holds RDF data. It is written in ANSI C99, designed to run on UNIX-like systems & offers a high performance, scalable & stable platform.
MariaDB is a backward-compatible, drop-in replacement branch of the MySQL® Database Server. It includes all major open source storage engines plus the Maria storage engine.
It is a fork of MySQL that focuses on being a reliable database optimized for Cloud and Net applications.
HyperSQL is a SQL relational database engine written in Java. It offers a small, fast database engine with in-memory and disk-based tables and supports embedded and server modes. It also includes tools such as a command-line SQL tool and GUI query apps.
MonetDB is a database system for high-performance applications in data mining, OLAP, GIS, XML Query, text & multimedia retrieval.
eXist-db is built using XML technology. It stores XML data according to the XML data model & features efficient, index-based XQuery processing.