October 28, 2009

MySQL vs Oracle vs Microsoft SQL Server

At a previous job I worked as a software develop supporting a set of data migration tools and frameworks. Most of the databases that we worked with were either Oracle or Microsoft SQL Server.We had a few occasions to work with MySQL or PostgreSQL, and even a few times that we were using "databases" like CSV files, and excel. I became somewhat familiar with the strength and features of Oracle and SQL Server, especially in the areas of high performance loading and extracting of data. Some of the starting and/or ending datasets were upwards of 200 GB, and with the tight scheduling constraints of doing data conversion we needed to do the conversions as quickly as possible, to minimize downtime between taking down an old system and starting up a new system.

Recently I've been working more with MySQL as a backend database for a number of different applications. Overall I really like it: it performs well, is easy to maintain, and the price is obviously right. But I have noticed that it takes a little more work to tweak query performance in MySQL than I remember in Oracle or SQL Server. If I remember correctly from my college days, relational algebra provides for a pretty good framework for reworking/optimizing queries. MySQL seems to either not do this at all, or do a really bad job. I have found myself on multiple occasions with a SQL query that performs much more poorly than I would expect. Sometimes there are index issues that require altering the databse in some way, but the end problem in most cases is that the query needs to be reworked to perform better. Switching a
SELECT * from tableA, tableB where tableA.field = tableB.field
to a
SELECT * from tableA join tableB on tableA.field = tableB.field
can make a huge difference. Shouldn't the query engine be able to determine that these are equivalent and adjust accordingly? Oracle and SQL Server seemed to be able to. There was very little I could do to improve query performance in most cases.

There are other cases where I've had to modify a query slightly to something that is functionally equivalent, but dramatically faster. I realize that there are many cases where a query engine/optimizer would not be able to easily find the best or even a better way to execute a query, but MySQL doesn't even seem to try. Does MySQL even have the notion of a query optimizer? Hopefully this is something on the radar for MySQL, because it seems like it wouldn't take to much work to get some pretty substantial wins for performance.

No comments: