Posts Tagged ‘MySQL’

Twitter, Facebook MySQL trees online – pushing MySQL forward

Just yesterday, I’m sure many saw Twitter open-source their MySQL implementation. It is based on MySQL 5.5 and the code is on GitHub.

For reference, the database team at Facebook has always been actively blogging and keeping their code available on Launchpad. It’s worth noting that the implementation there is based on MySQL 5.0.84 and 5.1.

At Twitter, almost everything persistent is stored in MySQL – interest graphs, timelines, user data and those precious tweets themselves! At Facebook, it’s pretty similar – user interactions like likes, shares, status updates, requests, etc. are all stored in MySQL (ref).

The media has picked up on it too: a fairly misinformed piece on GigaOm (MySQL has problems, focused on Stonebraker’s “fate worse than death”? Pfft. Facebook wants to move its code to GitHub? Read the reasoning: it’s spam handling on LP.), and a shorter piece on CNET.

Both the Twitter and Facebook code trees mention that it’s what they use in their environments, but that it’s not supported in any way, shape or form. Facebook recommends Percona Server or MariaDB. Facebook also has tools like online schema change in the repository, amongst others like prefetching tools written in Python.

I haven’t had the chance to play with the Twitter release yet, but it looks like this can only push Percona Server and MariaDB forward. Since it is based on 5.5, some of these BSD-licensed features can make it in, and some have already made it in, I’m sure. And what pushes these servers will push MySQL forward (see the many new features in MySQL 5.6).

On a personal note, it is amazing to see some MySQL alumni push this forward. At Twitter, there’s Jeremy Cole and Davi Arnaut. At Facebook, the team includes Domas Mituzas, Harrison Fisk, Yoshinori Matsunobu and Lachlan Mulcahy. Nothing would be complete without mentioning Mark Callaghan (not a MySQL alumnus, but an active MySQL community member), who led a MySQL team at Google and is now at Facebook.

Replication features of 2011 by Sergey Petrunia

Sergey Petrunia of the MariaDB project & Monty Program.

MySQL 5.5 GA at the end of 2010. MariaDB 5.3 RC towards the end of 2011 (beta in June 2011).

MySQL 5.5 is merged into Percona Server 5.5, which includes semi-sync replication, slave fsync options, automatic relay log recovery, RBR slave type conversions (question if this is useful or not), individual log flushing (very useful, but not many using), replication heartbeat, and SHOW RELAYLOG EVENTS. About two-thirds of the audience use MySQL 5.5 in production, with only 2 people using semi-sync replication.

MariaDB 5.3 brings group commit in the binary log, which is merged into Percona Server 5.5, and checksums for binlog events, which are merged from MySQL 5.6. Sergey goes in-depth about the group commit for the binary log. To find out a little more about MariaDB replication changes, see Replication in the Knowledgebase.

There are several implementations of group commit. Facebook started it, followed by MariaDB & Oracle. Percona Server 5.5 is GA so the feature is there, it’s not in MySQL 5.6 (yet?), and MariaDB 5.3 is where it’s at. It seems like the MariaDB implementation is the best so far – refer to the Facebook benchmark performed by Mark Callaghan.
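To make the idea behind group commit concrete, here is a minimal sketch in Python. It is a toy model, not the code of any of the servers mentioned: the `GroupCommitLog` class and its method names are hypothetical, and the "fsync" is just a counter. The point it shows is that transactions which queue up while a sync is in progress can all be made durable with one sync instead of one each.

```python
from typing import List


class GroupCommitLog:
    """Toy binary log that syncs queued transactions as one group (hypothetical model)."""

    def __init__(self) -> None:
        self.pending: List[str] = []   # transactions waiting to be made durable
        self.durable: List[str] = []   # transactions already synced to "disk"
        self.fsync_count = 0           # how many (expensive) syncs were issued

    def enqueue(self, txn: str) -> None:
        # A committing transaction joins the queue instead of syncing on its own.
        self.pending.append(txn)

    def flush_group(self) -> int:
        """Write every queued transaction, then issue a single sync for all of them."""
        if not self.pending:
            return 0
        group, self.pending = self.pending, []
        self.durable.extend(group)     # commit order within the group is preserved
        self.fsync_count += 1          # one sync covers the whole group
        return len(group)


if __name__ == "__main__":
    log = GroupCommitLog()
    for batch in (["t1", "t2", "t3"], ["t4", "t5", "t6"]):
        for txn in batch:
            log.enqueue(txn)
        log.flush_group()
    print(f"{len(log.durable)} transactions, {log.fsync_count} syncs")
```

Without grouping, the six transactions above would need six syncs; with it they need two. The real implementations differ in how the queue is formed and how the binary log and InnoDB commit order are kept consistent, which is where the benchmark differences come from.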

Annotated RBR poses a compatibility problem. MariaDB 5.3 has annotate_rows, while MySQL 5.6 has the rows_query event. They are different events, so you cannot have a MariaDB 5.3 master and a MySQL 5.6 slave at this moment. MySQL 5.6 will have a flag to mark “ignorable” binlog events, which will be merged into MariaDB, and this will make binary logs compatible again.
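A rough sketch of what an “ignorable” flag buys you, written in Python rather than server code. It assumes the 19-byte common event header used by the v4 binlog format (timestamp, type code, server id, event length, next position, flags). The flag value `0x80` and the set of known type codes are my assumptions for illustration, so check them against the server source before relying on them.

```python
import struct

# Common header of a v4 binlog event, little-endian, 19 bytes:
# timestamp (4), type code (1), server id (4), event length (4),
# next position (4), flags (2).
HEADER = struct.Struct("<IBIIIH")

LOG_EVENT_IGNORABLE_F = 0x80             # assumed value of the "ignorable" flag
KNOWN_EVENT_TYPES = {2, 4, 15, 16, 19}   # illustrative subset this slave understands


def should_apply(header_bytes: bytes) -> bool:
    """Return True to apply the event, False to skip it silently.

    Raises ValueError for an unknown event that is not marked ignorable,
    which is the situation that stops replication today.
    """
    _ts, type_code, _server_id, _length, _next_pos, flags = HEADER.unpack(
        header_bytes[:HEADER.size]
    )
    if type_code in KNOWN_EVENT_TYPES:
        return True
    if flags & LOG_EVENT_IGNORABLE_F:
        return False
    raise ValueError(f"unknown binlog event type {type_code}")
```

With this rule, a master can write an annotation event the slave has never heard of, and the slave simply skips it because the master marked it as safe to ignore.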

There is now also optimized RBR for tables with no primary key.

MySQL 5.6 also has a crash-safe slave (replication information stored in tables) and a crash-safe master (binary log recovery if the server starts & sees the binary log is corrupted). Parallel event execution is new in MySQL 5.6, and it is the most important feature for Sergey.
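A small Python sketch of how parallel event execution can keep things safe, assuming the work is partitioned per database (the function names here are hypothetical, and this is a model rather than the server's scheduler). Events for the same database always go to the same worker queue, so their relative order is preserved, while different databases can be applied concurrently.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple
from zlib import crc32

Event = Tuple[str, str]  # (database name, statement)


def assign_worker(database: str, workers: int) -> int:
    # crc32 is used because it is stable across runs, unlike Python's str hash.
    return crc32(database.encode("utf-8")) % workers


def partition_events(events: Iterable[Event], workers: int) -> Dict[int, List[Event]]:
    """Split a serial event stream into per-worker queues, keyed by database."""
    queues: Dict[int, List[Event]] = defaultdict(list)
    for database, statement in events:
        queues[assign_worker(database, workers)].append((database, statement))
    return dict(queues)
```

The obvious limitation of this scheme is that a workload living in a single database gets no parallelism at all.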

Pre-heating: There is mk-slave-prefetch (famous quote: “Please don’t use mk-slave-prefetch on #MySQL unless you are Facebook.”). There is replication booster by Yoshinori Matsunobu. There is a Python version of mk-slave-prefetch that Facebook uses.
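The trick all of these tools share is to read ahead of the slave SQL thread and turn upcoming writes into reads, so the pages are already in the buffer pool when the write is applied. The Python below is only a naive illustration of that rewrite step, not how mk-slave-prefetch or the Facebook tool is implemented. The function name is hypothetical, and the regular expressions only handle simple single-table statements.

```python
import re
from typing import Optional

_UPDATE = re.compile(
    r"^\s*UPDATE\s+(?P<table>\S+)\s+SET\s+.+?\s+WHERE\s+(?P<where>.+?);?\s*$",
    re.IGNORECASE | re.DOTALL,
)
_DELETE = re.compile(
    r"^\s*DELETE\s+FROM\s+(?P<table>\S+)\s+WHERE\s+(?P<where>.+?);?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def to_prefetch_select(statement: str) -> Optional[str]:
    """Rewrite a simple UPDATE or DELETE into a SELECT touching the same rows.

    Returns None for anything it does not recognise (for example INSERT).
    """
    for pattern in (_UPDATE, _DELETE):
        match = pattern.match(statement)
        if match:
            return f"SELECT * FROM {match['table']} WHERE {match['where']}"
    return None
```

A real prefetcher also has to decide how far ahead of the SQL thread to run and when to give up on a statement, which is a large part of why the quote above exists.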

MySQL HA reloaded by Ivan Zoratti

MySQL HA reloaded – old tricks and cool new tools to guarantee high availability to your MySQL Servers by Ivan Zoratti of SkySQL. This talk is a little longer, so check out: HA Reloaded – many ways to provide High Availability. The slides are already online.

Questions to ask: which level of high availability do I need? Do I require no loss of data? Do I need failover or is switchover enough? Can I provide a reasonable service when a component is down? Remember, five nines of availability also means a lot of infrastructure costs.
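To put numbers on the “which level do I need?” question, the downtime budget for a given target is simple arithmetic. This Python snippet assumes a 365-day year, and the function name is mine, not something from the talk.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        budget = downtime_minutes_per_year(target)
        print(f"{target}% -> {budget:.2f} minutes/year")
```

Five nines works out to a little over five minutes of downtime a year, which leaves no room for a manual failover, and that is where the infrastructure costs come from.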

Other things to clarify: availability vs scalability. HA costs. HA for your entire architecture, not just for your database servers. Review your SLAs.

The best high availability solution today is a combination of solutions.

MySQL replication – asynchronous & semi-synchronous (lots of people use MySQL 5.5, about 4 people in the room were on Percona Server – question asked due to semi-sync replication only being available in 5.5 & greater). There are pros & cons of row based replication vs statement based replication.

MySQL Replication via Multi-Master replication Manager (MMM). Features such as monitoring, automatic failover, data backup & resync. Unfortunately, it has some problems with stability & automatic failover. The project is no longer being improved, so there are other solutions that you should consider today.

MySQL Replication with MHA is a preferred solution. Something to consider: --read-only=1 and log-bin on slaves. Master IP failover. Filtering rules. Multi-tier replication.

Tungsten Replicator – open source, heterogeneous replication. Truly multi-master and fan-in with Global ID. Per-schema multi-thread. You can also use it to replicate to PostgreSQL, Oracle and other databases. There is also Tungsten Enterprise.

Synchronous replication with DRBD is typical for active/standby environments. People don’t really like this because they feel that there is a server doing nothing. You can always do it active/passive. It works with InnoDB only.

Synchronous replication with Galera works for InnoDB. It’s multi-master with no SPOF. It’s new/young technology so you may find some issues with it. Application failover must be managed, and the conflict resolution is quite tricky (when you commit a transaction you might be fine, but you may have transactions that are removed due to conflicts).
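The “transactions that are removed due to conflicts” part is easier to see with a toy model. The Python below sketches a first-committer-wins certification check. It is a simplified illustration of certification-based replication in general, not Galera's actual code, and the `Certifier` class, its fields and the key format are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Tuple


@dataclass
class Certifier:
    """Toy certification check: the first committer wins (hypothetical model)."""

    # (global sequence number, keys written) for every certified transaction
    history: List[Tuple[int, FrozenSet[str]]] = field(default_factory=list)
    seqno: int = 0

    def certify(self, snapshot_seqno: int, write_set: FrozenSet[str]) -> bool:
        """Accept a transaction unless a later-certified one wrote the same keys.

        snapshot_seqno is the last sequence number the transaction had seen
        when it started executing on its node.
        """
        for committed_seqno, committed_keys in self.history:
            if committed_seqno > snapshot_seqno and committed_keys & write_set:
                return False  # conflict: this transaction must be rolled back
        self.seqno += 1
        self.history.append((self.seqno, frozenset(write_set)))
        return True


if __name__ == "__main__":
    cluster = Certifier()
    # Two nodes update the same row at the same time, both starting from seqno 0.
    print(cluster.certify(0, frozenset({"users:1"})))  # first one is accepted
    print(cluster.certify(0, frozenset({"users:1"})))  # second one is rejected
```

From the application's point of view the second transaction fails at commit time, so code written against a single master needs to be ready to retry.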

There is the commercial SchoonerSQL, which provides synchronous master-slave replication for InnoDB. It’s defined explicitly as a master-slave solution.

Active/Passive clusters using Shared Storage. Points to consider are the fact that redundancy & replication must be guaranteed by the shared storage. InnoDB only. What about filesystems?

Virtualized environments – data storage, high availability & load balancing are provided and managed by the virtualization software. The faults are handled by the software, not the database.

There is also geographical replication for disaster recovery, where master-master asynchronous replication is used to update the backup data centre. There are also storage snapshots for disaster recovery (not specific to MySQL, it’s based on the storage system, and you should use only InnoDB).

There is also MySQL Cluster, but there is another presentation about this later at FOSDEM. Very nice closing slide, “The absolutely necessary comparison chart”, with which some may disagree, but which Ivan thinks is the best way forward.

MariaDB/MySQL users in Paris & Brussels

I’m about to head to Paris to present at the February meetup of the MySQL User Group in Paris, France. It happens on 1st February from 6-8pm at the Patricks Irish Pub. It’s free to attend, and I understand that SkySQL keeps this event afloat.

I’m also heading to my first FOSDEM right afterwards and will definitely hang out at the MySQL & Friends Devroom. There is an amazing lineup of speakers, with all talks being about 25-30 minutes, so it looks like it is going to be a lot of fun. To boot, Michael “Monty” Widenius will also be there, so expect lots of Salmiakkikossu.

If you want to keep track of where Monty Program folk are going to be to talk about MariaDB, make sure you’re subscribed to our news page, which also includes important release information. Pretty much every conference that we plan to attend (and have attended) is at the conference page.

I am looking forward to meeting & learning from many MariaDB/MySQL users!

SCALE 10x – there’s lots of MySQL there!

I’m just about to get on a plane to head to my inaugural SCALE event. It’s their tenth year running!

In a world filled with NoSQL-related media, it’s kind of nice to see that on Friday January 20 2012, we have a MySQL room right next to the PostgreSQL room (schedule). It is awesome to see that the track will have participation from Oracle, Monty Program Ab, and SkySQL Ab.

On Saturday for the main tracks, I’ve got a talk about the growing MySQL diaspora (just got larger this year in case you haven’t paid attention to the packaged up Galera product!). This one is a constant work in progress and I’m hoping to complete research closer towards March ’12.

Monty Program and SkySQL are also sharing a booth in the expo hall, so come by booth #65 for some interesting schwag (t-shirts, poppers, etc.). Looking at the schedule lineup, I’m surprised I’ve never been to a SCALE before – it looks totally awesome. See you in LAX (well, we’re so close by the Los Angeles Airport :P)

The SkySQL Reference Architecture

I have a bunch of notes from the O’Reilly MySQL Conference & Expo 2011, and I figure it’s about time I started blogging them. These are notes from the panel on the SkySQL Reference Architecture, led by Kaj Arno and Ivan Zoratti. The notes are raw (read their FAQ for more), and I talk a little bit about the SkySQL Configurator at the end (a tool I immediately used, and submitted some bugs/improvements for – 7 at last count, which I hear got fixed in the 0.02 release, which got pushed last night!).

There were 7 panelists. The MySQL world needs:

  • technical support
  • monitoring & administration tools
  • simplified interfaces
  • development & user tools
  • consulting & training

Services & consulting generally are difficult to scale.

The most comprehensive architecture around MySQL: scalable, adaptable and cloud ready.

Implementation:

  • select and test specific components
  • integrate components
  • provision the components in a simple interface
  • simplify monitoring & administration
  • technical services & support
  • validate solutions
  • improvements and new releases can be done
  • knowledge sharing related to the reference architecture

Technologies selected from Webyog, Sphinx, Drizzle, Monty Program, Calpont, Tokutek, ScaleDB, Schooner, Linbit, Zimory, Canonical.

SkySQL Provisioning tools:

  • SkySQL Manager – control and administer the SkySQL/MySQL environment
  • SkySQL Configurator – configure and update SkySQL reference architecture modules
  • SkySQL Tuner – analyse the configuration and prepare the packages

I did a test, and it seemed like I got binaries built in under 5 minutes. Custom configurations with a stock build. You get a 70MB binary. Hosted at http://www.enovance.com/. A lot of people never configure their my.cnf, so I think having a GUI on the web might be a good idea to help people have sensible defaults.

lovegood:skysql byte$ ls
total 143352
drwxr-xr-x    3 byte  staff       102 14 Apr 06:13 ./
drwx------@ 598 byte  staff     20332 14 Apr 06:13 ../
-rw-r--r--@   1 byte  staff  73395132 14 Apr 06:12 SkySQL-mariadb-poboffcfrm5bi054559q8iea74.tar.gz

lovegood:skysql byte$ tar -zxvpf SkySQL-mariadb-poboffcfrm5bi054559q8iea74.tar.gz
x etc/
x etc/my.cnf
x install
x packages/
x packages/xtrabackup-1.4-74.rhel5.x86_64.rpm
x packages/MySQL-client-5.5.10-1.rhel5.x86_64.rpm
x packages/MySQL-server-5.5.10-1.rhel5.x86_64.rpm

SkySQL is also going to have a customer advisory board, and they are starting it this week. (I don’t know any further details about this as of yet.)

The SkySQL Configurator can only get better. I expect it will do custom packages including things like Sphinx/SphinxSE, Drizzle, and other things in due time.

