Read this article on how to classify traditional and non-traditional databases based on internal and implementation models.
Thanks to two major trends — DBMS specialization and the NoSQL movement — the database management systems space is generating more interest and more innovation than any time I can remember since the 1980s. Ever since around 1990, when the relational database management system (RDBMS) became firmly established, IT has played DBMSroulette: spin the wheel and use the DBMS on which the needle lands — Oracle, DB2, or SQL Server. (If you think this trivializes things, not so fast: a friend who was the lead DBMS analyst at a major analyst firm once quipped to me that this wheel-spinning was his job, circa 1995.)
Obviously, there was always some rational basis for DBMS selection — IBM shops tended to pick DB2, best-of-breed buyers liked Oracle, performance whizzes and finance types often picked Sybase, and frugal shoppers would choose SQL Server, and later MySQL — but there was no differentiation in the model. All these choices were relational database management systems.
Over time, our minds became dulled to orthogonal dimensions of database differentiation:
The latter is the domain of Brewer’s CAP theorem which I will not discuss today. The former, the database model, will be the subject of this post.
Every DBMS has some native modeling element (NME). For example, in an RDBMS that NME is the relation (or table). Typically that NME is used to store everything in the DBMS. For example, in an RDBMS:
In general, the more naturally the data you’re storing maps to the paradigm (or NME) of the database, the better things will work. For example, you can model XML documents as tables and store them in an RDBMS, or you can model tables in XML and store them as XML documents, but those approaches will tend to be more difficult to implement and less efficient to process than simply storing tables in an RDBMS and XML documents in an XML server (e.g., MarkLogic).
The question is not whether you can model documents as tables or tables as documents. The answer is almost always yes. Thus, the better question is should you? The most famous example of this type of modeling problem is the storage of hierarchical data in an RDBMS. To quote this article on managing hierarchical data in MySQL:
Most users at one time or another have dealt with hierarchical data in a SQL database and no doubt learned that the management of hierarchical data is not what a relational database is intended for.
(Personally, I blame the failure of Microsoft’s WinFS on this root problem — file systems are inherently hierarchical — but that’s a story for a different day.)
I believe the best way to classify DBMSs is by their native modeling element.
The biggest limitation of this approach is that classifying by model fails to capture implementation differences. Some examples:
Nevertheless, I believe that DBMSs should be classified first by model and then sub-classified by implementation optimization. For example, a relational database optimized for big data analytics (Aster). An XML database optimized for large amounts of semi-structured information marked in XML (MarkLogic).
In closing, I’d say that we are seeing increasing numbers of customers coming to Mark Logic saying: “well, I suppose we could have modeled this data relationally, but in our business we think of this information as documents and we’ve decided that it’s easier and more natural to manage it that way, so we decided to give you a call.”
After thinking about this for some time, I have one response: keep calling!
No matter how you want to think about MarkLogic Server — an XML server, an XML database, or an XML document database — dare I say an [XML] [document] server|database — it’s definitely a document-oriented, XML-oriented database management system and a great place to put any information that you think is more naturally modeled as documents.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 License.