MySQL: The Real Grid Database, Mark Callaghan, Chip Turner
A tremendous amount of work also done by Wei Li and Gene Pang.
Google has a large MySQL deployment, and they enhance it as needed.
MySQL@Google: too many queries, transactions, data, and rapid growth. Real workload with OLTP and reporting. Workload at Google is *critical*.
The well known solution is to deploy a “grid database”:
- use many replicas to scale read performance
- shard your data over many masters to scale write performance (vertical partitioning of data)
- sharding is easy, resharding is hard
Large number of small servers, not much capacity lost when a server fails, support as many servers as possible with a few DBAs.
Manageability is important at Google – make all tasks scriptable. Gives you time to solve more interesting problems, and also support hundreds of servers with one DBA.
Google prefers under-utilizing servers – better to have 3 servers at 50%, rather than 2 servers at 75%. Less maintenance, less tuning, load spikes tolerated better.
Monitor everything you can: vmstat, iostat, MySQL error logs, /var/log/messages, SHOW STATUS & SHOW PROCESSLIST output, etc. Archive it for as long as you can. And automate all this as much as possible. Allow to query/visualize data from the archive. There are tools out there, and Google has some internal ones that they use.
They tend to not store the logs in a database. Its more efficient to bzip it or something (I’m thinking the ARCHIVE storage engine might be appropriate for them, possibly).
Many fast queries can be as much of a problem as one slow query.
mypgrep is an open source tool that Google has released.
Changed MySQL to count activity per account, table, and index:
- SHOW USER_STATISTICS – Displays per account activity
- SHOW TABLE_STATISTICS – each table, number of rows fetched/changed
- SHOW INDEX_STATISTICS – rows fetched per index, find indexes that were never used
MySQL High Availability, with even brighter future – with DRBD for instance. Cluster and Replication rock, however they need features right now. Committed to InnoDB and MySQL Replication
Zero transaction loss on failures of a master. Minimal downtime on failures of a master (if they can get downtime within 1-2 minutes, they’ll be happy). Reasonable cost in performance (added latency for a single workload), and dollars.
Readers and writers don’t lock each other.
Failure happens everywhere: OS (kernel OOM or panic), still running a lot of 32-bit servers, mysqld crashes due to code they themselves write, corrupted write (so InnoDB checksums rock), file system becomes inconsistent after an unplanned reboot (they use ext2), bad RAM, people (! rebooting by mistake :) ).
Features they want to see in MySQL: synchronous replication as an option, product that watches a master and initiates a failover, archives of the master’s binlog stored elsewhere (in case the master becomes unreachable), state stored in the filesystem to be consistent after a crash (a more modern FS besides ext2 will make this better).
And everyone’s been pretty much talking about how Google is contributing to MySQL, with some interesting twists to why they’re contributing back and open sourcing their code. google-mysql-tools.
Technorati Tags: mysql, google, mysqlconf, mysqlconf07, architecture, google-mysql-tools, opensource