Big DBA Head!

Database Brain Power!

December 29th, 2008

Putting Butts in the seats for my 2009 UC Sessions….

By now you have probably been looking at the UC’s upcoming schedule. On Tuesday @ 2pm Yves and I are scheduled to present on Waffle Grid in a session entitled “Distributed Innodb Caching with memcached”. When we submitted this topic we had not yet come up with a name for the project, so we are really hoping one of the conference gods allows us to change it to something like “The Waffle Grid Project: Distributed Innodb Caching with memcached”. Speaking of changing something, my other accepted proposal, a talk on Solid State Disk scheduled for Wednesday @ 2pm entitled “SAN Performance on a Internal Disk Budget: The coming Solid State Disk revolution” (http://en.oreilly.com/mysql2009/public/schedule/detail/5991), went horribly wrong in the formatting department when I got lazy and copied the outline of a presentation I gave earlier this year on that topic. I really need to figure out who can change that to something more legible, something like:

 

“We will spend 45 minutes discussing the past, present, and future of Solid State Disk technologies. Specifically we will look at the performance of drives from Mtron, Memoright, and Intel, paying special attention to DBT2, Sysbench, Orion and other disk-based benchmarks. I will also touch on other technologies from Fusion IO, Texas Memory Systems, Violin Memory and others to compare and contrast various companies’ approaches to speeding up the disk IO layer.”


December 22nd, 2008

Waffle Grid: Improving Performance on Ec2

As I mentioned earlier, there are some limitations with Ec2’s setup and configuration that make it difficult to get Waffle Grid to perform at a high level.  One of the items we are hoping will help us overcome the Ec2 limitations is async ( non-blocking ) sets in memcached. Unfortunately the current implementation has some limitations as well, but Brian Aker said they are working on fixing these right now.  Even so, the current async implementation showed a slight performance bump on my test hardware, so I decided to give it a spin on Ec2 again.

       No Waffle    Waffle (No Async)    Waffle (Async)
TPM    1400         1708                 2006

That’s showing a little more improvement!   A 43% boost instead of the 22% we saw earlier.  Hopefully with the new libmemcached async code this will show an even greater boost later on.
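For anyone who wants to check the percentages, here is a minimal sketch of the arithmetic. The TPM numbers come from the table above; the function name `percent_gain` is just an illustrative helper, not part of any benchmark tool.

```python
def percent_gain(baseline: float, measured: float) -> float:
    """Return the improvement of `measured` over `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


# TPM figures from the Ec2 runs reported in this post.
tpm = {"No Waffle": 1400, "Waffle (No Async)": 1708, "Waffle (Async)": 2006}

baseline = tpm["No Waffle"]
for name, value in tpm.items():
    gain = percent_gain(baseline, value)
    print(f"{name}: {value} TPM, {gain:.0f}% over the baseline")
```

The no-async run works out to a 22% gain and the async run to roughly 43%, matching the figures quoted here.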

December 18th, 2008

Waffle Grid: Non-blocking Memcached

So in playing with Waffle using memcached non-blocking ( async ) sets, I noticed a huge spike in get latency compared to previous runs on my local hardware. Take a look:

Memcached set: Block: 0:112195 : Thread : 1091426640 time:14
Memcached set: Block: 0:58107 : Thread : 1091426640 time:26
Memcached set: Block: 0:58105 : Thread : 1091426640 time:25
Memcached set: Block: 0:58103 : Thread : 1091426640 time:29
Memcached set: Block: 0:58075 : Thread : 1091426640 time:25
Memcached set: Block: 0:58073 : Thread : 1091426640 time:28
Memcached set: Block: 0:58069 : Thread : 1091426640 time:38
Memcached get: Block: 0:112547 : Thread : 1082132816 time:4170
Memcached get: Block: 0:58119 : Thread : 1159866704 time:6065
Memcached get: Block: 0:158858 : Thread : 1084148048 time:6462
Memcached get: Block: 0:125743 : Thread : 1159666000 time:7107
Memcached get: Block: 0:219163 : Thread : 1075673424 time:7511
Memcached get: Block: 0:56587 : Thread : 1078278480 time:9503
Memcached get: Block: 0:128459 : Thread : 1077758288 time:10090
Memcached get: Block: 0:192504 : Thread : 1159666000 time:905
Memcached get: Block: 0:206916 : Thread : 1085438288 time:6409
Memcached get: Block: 0:189286 : Thread : 1160268112 time:4700
Memcached get: Block: 0:158097 : Thread : 1160067408 time:703
Memcached get: Block: 0:112607 : Thread : 1160870224 time:9123
Memcached get: Block: 0:159963 : Thread : 1160067408 time:650

Sets are really fast, but the get time is now much much higher in some cases…

So I went searching through the code of libmemcached and found this:

/*
Here is where we pay for the non-block API. We need to remove any data sitting
in the queue before we start our get.

It might be optimum to bounce the connection if count > some number.
*/

Dohhhh!  Before a get happens the queue needs to be cleared. I don’t want/need the data to be cleared out… there is no reason we should not just go to disk if it has not made it to memcached yet.   Need to think about this and what we can do to fix it.
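To make the problem concrete, here is a toy model of the behavior that comment describes. This is not libmemcached code: the class, its methods, and the `get_or_skip` policy are all hypothetical, and it only assumes that a get has to work through every queued set before it can run.

```python
from collections import deque


class ToyNonBlockingClient:
    """Toy model only (not libmemcached): sets are queued, gets drain the queue."""

    def __init__(self, store):
        self.store = store        # stands in for the memcached server
        self.pending = deque()    # sets that were sent but not yet processed

    def set(self, key, value):
        # Non-blocking set: queue the work and return immediately.
        self.pending.append((key, value))

    def drain(self):
        # Work through every queued set; return how many were handled.
        drained = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.store[key] = value
            drained += 1
        return drained

    def get(self, key):
        # The get pays for all outstanding sets before it can run.
        drained = self.drain()
        return self.store.get(key), drained

    def get_or_skip(self, key):
        # One hypothetical alternative: if sets are still queued, report a
        # miss right away so the caller can read the page from disk instead.
        if self.pending:
            return None, 0
        return self.store.get(key), 0


client = ToyNonBlockingClient(store={})
for page in range(7):
    client.set(f"0:{page}", b"page data")

skipped, _ = client.get_or_skip("0:3")   # queue is busy, so this is a miss
value, drained = client.get("0:3")       # this one drains all 7 queued sets
print(skipped, value, drained)
```

In this model the sets cost almost nothing and the first get after a burst of sets absorbs all of the deferred work, which is the same shape as the timings in the log above.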

December 18th, 2008

Waffle Grid: Waffling in the clouds

Cloud computing has been getting lots of buzz over the last few years, and it only seems fair that we talk a little bit about where Waffle Grid can fit in the cloud. One of our key visions for Waffle Grid is to enable people to add capacity & resources on demand ( as needed ) as load increases, without rearchitecting the environment or the application. As I showed previously, given the right setup ( disk bound workload, fast interconnect ) Waffle Grid can yield substantial performance benefits. This makes it a compelling option for cloud computing environments.

Let’s use a simple example for those that may not fully understand the benefits of cloud computing. It being Christmas time, if you run an ecommerce site that is blessed with increased sales this time of year, you probably know that your web infrastructure can become strained. In the past many companies have built their infrastructure to handle this peak load ( you don’t want to lose sales after all ). So you set up and run enough web, application, and database servers to handle your Christmas load. The problem is that the other 11 months of the year the traffic may be only a small fraction of what it is around Christmas time, so for those 11 months your web servers, application servers, and database servers run almost idle. On top of that, next year when your load exceeds your expectations, you’re left scrambling for the resources you need ( try and find new servers, set them up, rearchitect your app, etc ). And that is not to mention the other concerns you may have, i.e. datacenter issues ( power, cooling, etc ).

Following so far? Good. Instead of spending all that money on hardware and software that you may only use 10% of the time, what if you could pay only for what you need when you need it? Using a service like EC2 you can quickly add ( provision ) servers or services as you need them, paying by the hour or by some other metric. For our ecommerce site example, you could add additional web/app/db servers for the Christmas rush and then remove them when the rush is over.


December 17th, 2008

WaffleGrid status and roadmap

Since I didn’t have a blog server available in the last few months, Matt offered to let me use his BigDBAHead and I accepted.  I must admit, I am not a big blogger, but with WaffleGrid there is a need to communicate more.  Since I have done most of the InnoDB hacking, here is a bit of status and roadmap information.

1. Async memcached_set

Believe it or not… all our results up to now have been done with sync IO…  We were discussing the need to do async when I discovered memcached_behavior_set and MEMCACHED_BEHAVIOR_NO_BLOCK. Sometimes, you just feel suddenly tired.  Preliminary results are very interesting: basically, the overhead of the set vanished. We will need to verify that no coherence problems are introduced and redo our tests.

2.  Cache coherence between startup

Following a suggestion from Mark Atwood, the memcached key has been prefixed with the pid of the mysqld process.  This should avoid problems with cache coherence after restarting MySQL.  After some tests, this will be pushed to Launchpad.
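As a rough illustration of why the prefix helps, here is a small sketch. The key layout (`pid:space:page`) and the `make_key` helper are assumptions made for the example; the actual format used by the patch is not shown here.

```python
import os


def make_key(pid: int, space_id: int, page_no: int) -> str:
    """Build a cache key scoped to one mysqld process (hypothetical layout)."""
    return f"{pid}:{space_id}:{page_no}"


# The same InnoDB page seen by two different mysqld runs (made-up pids).
before_restart = make_key(4242, 0, 58107)
after_restart = make_key(4311, 0, 58107)

# The keys differ, so the restarted server can never read a page that the
# previous process left behind in memcached.
print(before_restart, after_restart, before_restart != after_restart)

# In a live process the pid would come from the operating system.
print(make_key(os.getpid(), 0, 58107))
```

Stale entries from the old pid are simply never requested again and age out of memcached on their own.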

3. Dynamic innodb_memcached_servers

Adding a dynamic variable to MySQL is not that difficult, but a dynamic string… Strings need a malloc at some point, so I might need to preallocate a fixed size.  If someone knows how to do this… I am interested.  This variable would make it possible to change the memcached servers list dynamically, let’s say because a peak load is expected.  Of course, this will invalidate the items already cached.

4. Error handling

Error handling in WaffleGrid is minimal; WaffleGrid is still very experimental.  It will need to be addressed.

5. MySQL optional compile option

I would like to have something like “./configure --enable-wafflegrid” as a MySQL build option and add optional compile tags in the code.  I know very little about autoconf, but Monty Taylor offered to help me…  When I have some free time, I will try to grab part of his brain, not too much… I don’t want to overflow.

6.  Drizzle port

I tried to apply the patch to the Drizzle InnoDB plugin and it failed miserably.  The new compression code changes the InnoDB files significantly, but it should not be too hard to create a patch for Drizzle.   The main constraint here is time; being a MySQL consultant doesn’t leave much bandwidth for coding.

That is about all for now. I will try to keep you up to date on what is coming up with WaffleGrid.

Yves

December 17th, 2008

Waffle Grid: A Meeting of the minds

Yves and I had the rare opportunity to share a gig this week, which means we have been spending a lot of time over meals discussing how to improve Waffle Grid.  Yves will be blogging about some of the things we talked about very soon, and I will be publishing my EC2 experiences with Waffle Grid shortly as well.  Stay tuned!

December 9th, 2008

Waffle & SSD Coming to a MySQL User Conference Near You

For those interested in SSD or Waffle Grid, I will be presenting on both topics at the 2009 MySQL User Conference!  I want to keep these fresh, so there will be more than just a rehash of my blog here, but there will be some overlap.  Interested in something I have not talked about yet?  Drop me a line!  Always looking for good ideas.

December 3rd, 2008

Waffle Grid: Remote Buffer Cache -VS- SSD Grudge Match

As one of the co-founders of the Waffle Grid project, I beam with pride every time I get a stellar benchmark or every time I find a new use for the Waffle.  But as a professional I still have to be critical of all solutions I would recommend or deploy.  One of the big goals of Waffle Grid is to replace disk IO, which should be slow, with remote memory, which should be much faster.  But what happens when the disk is no longer slow?  This leads me to ask myself: is Waffle Grid only good for servers with slower disk?  Or is this a solution that can also help systems with fast disk?  So which should you deploy, SSD or Waffle?  Are they competitors?  Or are they complementary technologies?

I am going to say this: in these tests latency is king.  The faster the drives can deliver data, the higher the benchmarks should be.  Basically, if my interconnect can deliver data faster than the drive can serve it up, I should still see Waffle Grid perform better than SSD.  A note: all previous tests were done against 2 striped 10K RPM disks.  So from a latency perspective how does the Intel do?


December 1st, 2008

Waffle Grid: 300% better than before

Maybe I should make the sub-tag line “make your MySQL database run up to 300% faster by using Waffle Grid”…   Ahhh, it’s all about marketing.

Had an interesting holiday, snuck away several times to benchmark and test out the new Waffle Grid release… nothing says turkey day like Waffles and a really fast database  ( Hey! Yes my wife knew she married a complete and utter nerd, she just might not have understood the depths of my nerdom ).  As I mentioned before, with the release of MySQL 5.1.30 I switched all my testing over to this version, and in my tests this combination is running really well.  Additionally I had a hardware issue which compelled me to rerun some of the tests I did last week. So who loves benchmarks!  I DO I DO!!!  I like to go fast, so let’s get right down to it.

I ran a series of 20W DBT2 benchmarks on MySQL 5.1.30 with and without waffle grid ( and with a new patched memcached ), take a look:

DBT2 – Memcached Patch                 TPM
768M Local                             3218.54
768M Local/768 Remote No MC patch      6575.8
768M Local/768 Remote MC patch         9371.54
1500M Local                            16508.54


November 29th, 2008

Waffle Grid: 0.2 Patches are available!

Ok, Yves pushed two patches up onto Launchpad the other day. I wanted to get a chance to fully test out the patches before blogging about them, and as I mentioned in my previous blog post, I have been dealing with some hardware issues. I spent the day testing out these patches, putting them through their paces looking for potential slowdowns or issues, and everything looks good. I am going to limit this post to just the new features, as I am still finalizing a few benchmarks as I write this.

What’s new?

First, the patches work against 5.1.30. I have been running all my recent tests against these without any issues so far ( knock on wood ).

Second, you will notice 2 patches. There is a patch to memcached which will only work in 1.2.5 right now ( 1.2.6 changes how the memcached LRU queue gets updated ). This patch is probably only useful for Waffle Grid deployments, as it changes a get in memcached to not only retrieve an object, but also put the object at the front of the queue for eviction.   This ensures that this useless data will be cleared first when we need space.  I blogged about the why a few days ago if you’re interested.   Everything will still work without the memcached patch, but this improves performance significantly.  Significantly as in this patch increased my DBT2 cache hit ratio from around 65% to 95%, and DBT2 scores were 50% higher with the patched memcached. Benchmarks will follow shortly.
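The idea behind the memcached patch can be sketched with a toy cache. This is not the memcached code, just a minimal model built on an `OrderedDict`, and the class name `DemoteOnGetCache` is made up for the example. The assumption is simply that an object which has just been fetched is the best candidate to throw away next.

```python
from collections import OrderedDict


class DemoteOnGetCache:
    """Toy bounded cache: the first item in the OrderedDict is evicted first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items = OrderedDict()

    def put(self, key, value):
        if key in self.items:
            del self.items[key]
        elif len(self.items) >= self.capacity:
            self.items.popitem(last=False)   # evict whatever is next in line
        self.items[key] = value              # fresh data is evicted last

    def get(self, key):
        if key not in self.items:
            return None
        # Unlike a classic LRU, a hit moves the item to the eviction end.
        self.items.move_to_end(key, last=False)
        return self.items[key]


cache = DemoteOnGetCache(capacity=2)
cache.put("0:58107", "page A")
cache.put("0:58105", "page B")
cache.get("0:58105")              # page B was fetched, so it is now expendable
cache.put("0:58103", "page C")    # page B is evicted, page A survives
print(list(cache.items))
```

With a classic LRU the same sequence would have evicted page A instead, keeping a copy of page B that the database had just pulled back.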

Also you will see a new my.cnf variable called “innodb_memcached_servers”. This can be used to set your memcached servers for Waffle Grid, so no more recompiling with hardcoded servers!

And finally you can access hit/miss stats directly from the database; these are available when you run SHOW INNODB STATUS.

---------
MEMCACHED
---------
Memcached puts 386564
Memcached hits 336566
Memcached misses 39942
--------------
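
If you want to turn those counters into a hit ratio, here is a minimal sketch. It assumes the ratio is hits divided by hits plus misses, which is not spelled out in the status output itself; the `hit_ratio` helper is just for illustration.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of memcached lookups that were hits (0.0 if there were none)."""
    lookups = hits + misses
    return hits / lookups if lookups else 0.0


# Counters copied from the MEMCACHED section of the status output above.
stats = {"puts": 386564, "hits": 336566, "misses": 39942}

ratio = hit_ratio(stats["hits"], stats["misses"])
print(f"lookups: {stats['hits'] + stats['misses']}, hit ratio: {ratio:.1%}")
```

For this particular sample that works out to a little over 89% of lookups being served from memcached.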
I have a slew of benchmarks I will be putting up in the next couple of days including 100Mb-vs-1000Mb-vs-localhost numbers, benchmarks with the memcached patch, and more.