Archive for the ‘MariaDB’ Category

Optimizing your InnoDB buffer pool usage by Steve Hardy

Steve Hardy of Zarafa.

Work that has been done to make Zarafa better. Why do you optimise your buffer pool? To decrease your I/O load. How can you do it? Buy more RAM, page compression, less (smaller) data, rearrange data.

MariaDB and Percona Server allow you to inspect your buffer pool (I was unsure if this is now available in MySQL 5.6). Giuseppe in the audience says this is available in MySQL 5.6, but Steve used this on MariaDB 5.2.

Strategies to fix it: Make records smaller. Remove indexes if you can use others almost as efficiently. Make records that are accessed around the same time have a higher chance of being on the same page. Use page compression. Buy more RAM. Try Batched Key Access (BKA) in MariaDB 5.3+.
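To make the "same page" strategy concrete, here is a toy model (not Zarafa's actual schema) that counts how many pages one user's mailbox touches under two physical layouts. The page size, the key layout and the use of a shuffle to simulate arrival order are all illustrative assumptions.

```python
import random

PAGE_SIZE = 100  # hypothetical number of records that fit on one page


def pages_touched(layout, wanted):
    """Count the distinct pages that hold the wanted records.

    layout is a list of record keys in physical (primary key) order.
    """
    position = {key: i for i, key in enumerate(layout)}
    return len({position[key] // PAGE_SIZE for key in wanted})


# 50 users with 200 messages each, keyed as (user_id, message_no).
records = [(user, msg) for user in range(50) for msg in range(200)]

# Layout 1: rows stored in arrival order, users interleaved
# (simulated here with a seeded shuffle).
rng = random.Random(42)
arrival_order = records[:]
rng.shuffle(arrival_order)

# Layout 2: rows clustered by (user_id, message_no), as a composite
# primary key would arrange them.
clustered_order = sorted(records)

# One user opens their mailbox: all 200 of their messages are needed.
wanted = [(7, msg) for msg in range(200)]

scattered_pages = pages_touched(arrival_order, wanted)
clustered_pages = pages_touched(clustered_order, wanted)
```

In the clustered layout the 200 messages sit on 2 adjacent pages, while the scattered layout spreads them over many more. Fewer pages touched means fewer pages that have to live in the buffer pool.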

Best to view the presentation since there are specific examples that speak about how Zarafa solves their problems like a user trying to sort their email, etc.

Practical MySQL Indexing guidelines by Stéphane Combaudon

Stéphane Combaudon of Dailymotion.

Index: separate data structure to speed up SELECTs. Think of index in a book. In MySQL, key=index. Consider that indexes are trees.

InnoDB’s clustered index – data is stored with the Primary Key (PK) so PK lookups are fast. Secondary keys hold the PK values. Designing InnoDB PKs with care is critical for performance.

An index can filter and/or sort values. An index can also contain all the fields needed for the query, so you don’t need to go to the table (a covering index).
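A minimal sketch of the covering idea, assuming a hypothetical messages table and index (neither is from the talk). It only checks column availability and relies on the point above that InnoDB secondary keys hold the PK values; the real optimizer considers much more.

```python
def is_covering(index_columns, primary_key_columns, query_columns):
    """Return True if every column the query touches is in the index.

    InnoDB secondary indexes also carry the primary key columns,
    so those count as available too.
    """
    available = set(index_columns) | set(primary_key_columns)
    return set(query_columns) <= available


# Hypothetical table: messages(id PK, user_id, folder, subject, body)
pk = ["id"]
idx_user_folder = ["user_id", "folder"]

# SELECT id, folder FROM messages WHERE user_id = ?
# Every column is in the index, so the table is never read.
covered = is_covering(idx_user_folder, pk, ["id", "folder", "user_id"])

# SELECT subject FROM messages WHERE user_id = ?
# subject is not in the index, so the table must be visited.
not_covered = is_covering(idx_user_folder, pk, ["subject", "user_id"])
```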

MySQL only uses 1 index per table per query (not 100% true – OR clauses), so think of a composite index when you can. Can’t index TEXT fields (use a prefix). Same for BLOBs and long VARCHARs.

Indexes speed up queries, but they increase the size of your dataset and slow down writes. How big is the write slowdown? In a simple test by Stéphane, adding 2 keys made performance 2x worse for in-memory workloads and 40x worse for on-disk workloads. Never neglect the slowdown of your writes when you have an index. There is a graph in the slide deck.

What is a bad index? Unused indexes. Redundant indexes. Duplicate indexes.
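Duplicate and redundant indexes can be spotted from the index definitions alone using the leftmost-prefix rule. The following is a simplified sketch with hypothetical index names; it ignores uniqueness constraints, index types and prefix lengths, which a real tool has to take into account. Unused indexes cannot be found this way, since that needs usage statistics.

```python
def classify_indexes(indexes):
    """Find duplicate and redundant indexes.

    indexes maps an index name to its tuple of columns.
    Returns (duplicates, redundant), each a list of name pairs:
    - duplicates: (a, b) where both have identical column lists
    - redundant:  (a, b) where a is a strict leftmost prefix of b
    """
    duplicates, redundant = [], []
    names = sorted(indexes)
    for a in names:
        for b in names:
            if a == b:
                continue
            cols_a, cols_b = indexes[a], indexes[b]
            if cols_a == cols_b:
                if a < b:  # report each duplicate pair only once
                    duplicates.append((a, b))
            elif cols_b[:len(cols_a)] == cols_a:
                redundant.append((a, b))
    return duplicates, redundant


# Hypothetical index definitions on one table.
indexes = {
    "idx_user": ("user_id",),
    "idx_user_copy": ("user_id",),
    "idx_user_date": ("user_id", "created"),
    "idx_date": ("created",),
}

duplicates, redundant = classify_indexes(indexes)
```

Here idx_user and idx_user_copy are duplicates, and both are redundant with idx_user_date. idx_date is neither, because created is not the leftmost column of idx_user_date.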

Indexing is not an exact science, but guessing is probably not the best way to design indexes. Always check your assumptions – EXPLAIN does not tell you everything, time your queries with different index combinations, SHOW PROFILES is often valuable. Slow query log is a good place to start.

Many slides with examples, so I hope Stéphane posts the deck soon. If possible, try to sort & filter (an index is not always the best for sorting).

InnoDB’s clustered index is always covering. SELECT by PK is the fastest access with InnoDB.

An index can give you 3 benefits: filtering, sorting, covering.

See Userstats v2 – you need Percona Server or MariaDB 5.2+. See also pt-duplicate-key-checker to find redundant indexes easily. See also pt-index-usage to help answer questions not covered by userstats.

MariaDB 5.3 query optimizer by Sergey Petrunia

Sergey Petrunia of the MariaDB project.

What exactly is not working in MySQL? MySQL is poor at decision support/analytics. With large datasets you need special disk access strategies. Problems with complex queries, like insufficient subquery support and big joins, are common in the MySQL world.

DBT-3 is used at scale=30, giving a 75GB database, to run a query for the “average price of item between a range of dates”. The query took some 45 minutes to execute. Why? Run iostat -x to see what is going on. You see that the CPU is mostly idle, so it’s an IO-bound load. Next you run SHOW ENGINE INNODB STATUS and you’ll see how many reads per second are happening. A possible solution is to get more RAM or get an SSD (good to speed up OLTP workloads, but analytics over data is probably not viable since SSDs are small and not cheap).

The MySQL/MariaDB solution to the above problem is improved disk access strategies: multi-range read (MRR) and batched key access (BKA). In MariaDB, MRR/BKA need to be enabled (they are not turned on by default). With them enabled, the query took only 3 minutes 48 seconds, which is some 11.8x faster than the previous 45 minutes. If you look at the EXPLAIN output, it’s almost the same as before, except for the Extra field. iostat -x will now show some CPU load, and svctm is down as well (so it’s no longer doing random disk seeks, which cost some 8ms each on a regular 7,200rpm disk). SHOW ENGINE INNODB STATUS will show some 10,000 reads per second rather than the previous 200.
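The core idea behind MRR can be sketched as a toy model: buffer the row locations that the index scan returns, sort each buffer, and read the rows in that order, so the disk sweeps forward instead of jumping around. This is not MariaDB’s implementation; the row locations, the buffer size and the use of jump distance as a seek-cost proxy are all assumptions made for illustration.

```python
import random


def head_travel(offsets):
    """Sum of absolute jumps between consecutive reads (a crude seek-cost proxy)."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))


def mrr_order(offsets, buffer_size):
    """Fill a buffer with row locations, sort each buffer, emit in that order."""
    ordered = []
    for start in range(0, len(offsets), buffer_size):
        ordered.extend(sorted(offsets[start:start + buffer_size]))
    return ordered


rng = random.Random(1)
# Row locations in the order the index scan returns them:
# effectively random with respect to their position on disk.
index_order = rng.sample(range(1_000_000), 10_000)

plain_cost = head_travel(index_order)
mrr_cost = head_travel(mrr_order(index_order, buffer_size=2_000))
```

The same rows are read either way. Only the order changes, and with it the total distance travelled, which is why a bigger buffer (fewer, longer sorted sweeps) helps.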

If you are on Fedora, check out the Systemtap feature to look at I/O patterns: stap deviceseeks.stp -c "sleep 60".

Subqueries handling in MariaDB 5.3: check out the Subquery Optimizations Map. Only about 10% of the audience use optimizer hints in MySQL.

Sphinx user stories by Stéphane Varoqui

Stéphane Varoqui (Field Services, SkySQL), Vlad Fedorkov (Director of PS, Sphinx Inc), Christophe Gesche (LAMP Expert, Delcampe), Herve Seignole (Web Architect, Groupe Pierre & Vacances Center Parcs) – this is a big talk!

Pros: Filtering takes place on attributes in separate tables. Rely on the optimizer choice. HASH JOIN can help (MariaDB 5.3). Table elimination can help (MariaDB 5.2). Index Condition Pushdown (ICP) can help (MariaDB 5.3/MySQL 5.6). Max 80M documents at Pixmania, where all queries come in at less than 1s using 128GB of RAM (MariaDB 5.2). At PAP.fr, there is 16GB RAM with MariaDB 5.2.

Cons: CPU intensive (replication with many slaves). Need covering indexes to cover various !filter !order. Join & sorting cost on lazy filtering.

The more indexes you have in the system, the more you need to increase the main memory of the server. Keep the Btree’s in memory.

What about denormalized schemas? Not really CPU intensive, just IO. Can go to disk, full partition scan with filtering taking place on record order using covering index. Can shard but not that easy. Use the spider storage engine or shard-query. Can use memory engine for isolation. There are cons like duplicate data, duplicate indexes, missing materialized views, merge index cost, impact on write performance, and it can consume a lot of memory with many indexes.

MySQL can push hardware, so read less/do less/read serialized/map reduce to get better latency. Choose data types wisely, replace strings with numerics, use vertical & horizontal sharding, and use snowflake compression (take a combination of attributes, build a table of the combinations and replace them with an ID). If you are lazy, just use Sphinx!
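The snowflake compression idea can be sketched in a few lines: collect each distinct combination of attributes into a lookup table and store only a small ID on the row. The column names and the combo_id field below are hypothetical, and a real schema would do this with a second table and a foreign key rather than in application code.

```python
def snowflake_compress(rows, attribute_names):
    """Replace a combination of attributes on each row with a single ID.

    Returns (compressed_rows, lookup) where lookup maps each ID back
    to the attribute combination it stands for.
    """
    combo_ids = {}  # attribute combination -> small integer ID
    compressed = []
    for row in rows:
        combo = tuple(row[name] for name in attribute_names)
        combo_id = combo_ids.setdefault(combo, len(combo_ids) + 1)
        slim = {k: v for k, v in row.items() if k not in attribute_names}
        slim["combo_id"] = combo_id
        compressed.append(slim)
    lookup = {
        combo_id: dict(zip(attribute_names, combo))
        for combo, combo_id in combo_ids.items()
    }
    return compressed, lookup


# Hypothetical item rows with three low-cardinality string attributes.
rows = [
    {"id": 1, "country": "FR", "category": "stamps", "condition": "used"},
    {"id": 2, "country": "BE", "category": "coins", "condition": "new"},
    {"id": 3, "country": "FR", "category": "stamps", "condition": "used"},
]

compressed, lookup = snowflake_compress(rows, ["country", "category", "condition"])
```

Rows 1 and 3 share the same combination, so they end up with the same combo_id, and three string columns shrink to one integer per row.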

Sphinx is just another daemon that can serve queries. It’s easy to set up and easy to scale. The storage engine (SphinxSE, in core MariaDB) makes it accessible to current MySQL users, and there is also SphinxQL. SphinxSE is transparent to the application layer of the MySQL protocol.

Demo done using the Employees DB.

Pierre & Vacances – Centerparcs. For free text search, they used MariaDB with a Levenshtein UDF implementation. Went live 01/2011. First implementation of Sphinx (12 indexes). It’s grown, and they use the PHP API. The new goal is to never send an empty result. 1 index per website/market, with a total of 15 million docs. Index built on a standalone server. They use an internal Gearman job schedule to generate the index before the cache is generated. Current monitoring is via Nagios & Perl, but the next step is to use Monyog & the MariaDB INFORMATION_SCHEMA plugin.

Delcampe is an auction website with 45M ‘active’ items. It’s dedicated to collectors. 3 string fields, and 15 attributes. 40-120K new items daily. Started with MySQL fulltext in 2007, moved to Sphinx in 2008. There was a need to have more filters. Now they have 5 Sphinx servers + 1 MySQL server, with HAProxy to load balance.

MariaDB/MySQL users in Paris & Brussels

I’m about to head to Paris to present at the February meetup of the MySQL User Group in Paris, France. It happens on 1st February from 6-8pm at the Patricks Irish Pub. It’s free to attend, and I understand that SkySQL keeps this event afloat.

I’m also heading to my first FOSDEM right afterwards and will definitely hang out at the MySQL & Friends Devroom. There is an amazing lineup of speakers, with all talks being about 25-30 minutes, so it looks like it is going to be a lot of fun. To boot, Michael “Monty” Widenius will also be there, so expect lots of Salmiakkikossu.

If you want to keep track of where Monty Program folk are going to be to talk about MariaDB, make sure you’re subscribed to our news page, which also includes important release information. Pretty much every conference that we plan to attend (and have attended) is at the conference page.

I am looking forward to meeting & learning from many MariaDB/MySQL users!

SCALE 10x – there’s lots of MySQL there!

I’m just about to get on a plane to head to my inaugural SCALE event. It’s their tenth year running!

In a world filled with NoSQL related media, it’s kind of nice to see that on Friday January 20 2012, we have a MySQL room right next to the PostgreSQL room (schedule). It is awesome to see that the track will have participation from Oracle, Monty Program Ab, and SkySQL Ab.

On Saturday for the main tracks, I’ve got a talk about the growing MySQL diaspora (just got larger this year in case you haven’t paid attention to the packaged up Galera product!). This one is a constant work in progress and I’m hoping to complete research closer towards March ’12.

Monty Program and SkySQL are also sharing a booth in the expo hall, so come by booth #65 for some interesting schwag (t-shirts, poppers, etc.). Looking at the schedule lineup, I’m surprised I’ve never ever been to a SCALE before – looks totally awesome. See you in LAX (well, we’re so close-by the Los Angeles Airport :P)

