Scale Stack and Database Proxy Prototype

Back in January, when I was between jobs, I had a free weekend for some fun hacking. I decided to start a new open source project that had been brewing in the back of my head, and since then I have been poking at it on weekends and the occasional late night. I called it Scale Stack because it aims to provide a scalable network service stack. This may sound a bit generic and boring, but let me show a graph of a database proxy module I slapped together in the past couple of days:

[Figure: Database Proxy Graph]

I set up MySQL 5.5.2-m2 and ran the sysbench read-only tests against it with 1-8192 threads. I then started the database proxy module built on Scale Stack and routed sysbench through it, and you can see that it copes with concurrency quite a bit better at higher thread counts. The database module doesn’t do much: it simply does connection concentration, mapping M incoming connections to N backend connections, where N is a fixed parameter given at startup.
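To make the M-to-N idea concrete, here is a minimal sketch of the bookkeeping behind connection concentration. It is not the code from the proxy module: the class name, method names, and the `kNone` sentinel are all made up for illustration, and it assumes a backend connection is borrowed for one request at a time and that a single thread drives it.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical sketch; none of these names come from Scale Stack.
const int kNone = -1;  // "no backend slot" / "no waiting client"

class ConnectionConcentrator {
public:
  // backend_count is N: the fixed number of backend connections.
  explicit ConnectionConcentrator(int backend_count) {
    assert(backend_count > 0);
    for (int slot = 0; slot < backend_count; ++slot) free_.push_back(slot);
  }

  // A client has a request ready. Returns the backend slot to use, or
  // kNone if all N slots are busy, in which case the client is queued.
  int acquire(int client_id) {
    if (free_.empty()) {
      waiting_.push_back(client_id);
      return kNone;
    }
    const int slot = free_.front();
    free_.pop_front();
    return slot;
  }

  // A backend slot finished its request. If a client is waiting, the slot
  // is handed straight to it and its id is returned; otherwise the slot
  // goes back on the free list and kNone is returned.
  int release(int slot) {
    if (!waiting_.empty()) {
      const int next_client = waiting_.front();
      waiting_.pop_front();
      return next_client;
    }
    free_.push_back(slot);
    return kNone;
  }

  std::size_t idle() const { return free_.size(); }
  std::size_t queued() const { return waiting_.size(); }

private:
  std::deque<int> free_;     // backend slots not currently in use
  std::deque<int> waiting_;  // clients waiting for a slot, oldest first
};
```

With N = 2, a third client calling `acquire` gets `kNone` and waits; the next `release` returns that client’s id so its request can be sent on the slot that just freed up. A real proxy also has to deal with per-connection session state, which this sketch ignores.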

In this case I always mapped all incoming sysbench connections down to 128 connections between Scale Stack and MySQL. The proxy also uses a fixed number of threads and is entirely non-blocking. As you can see, the max throughput around 64 threads is a bit lower, but I’ve not done much to optimize this yet (there should be some easy improvements in places where I simply stuck in a mutex instead of using a lockless queue). It’s only a simple proof-of-concept module to see how well this would work, but it’s a start to a potentially useful module built on the other Scale Stack components. One other thing to mention is that these tests were run on a single 16-core Intel machine. I’d really like to test this with multiple machines at some point.
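For readers wondering what “stuck in a mutex” looks like in practice, here is a rough sketch of a mutex-guarded queue of the kind a fixed pool of threads might share. The name `LockedQueue`, the capacity limit, and the `try_` methods are assumptions for illustration, not the module’s actual code. Both operations return immediately rather than waiting, which fits a non-blocking design, but every caller still serializes on the one lock; that contention is what a lockless queue would aim to remove.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical sketch; not taken from Scale Stack.
template <typename T>
class LockedQueue {
public:
  explicit LockedQueue(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  // Returns false instead of waiting when the queue is full.
  bool try_push(T item) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (items_.size() >= capacity_) return false;
    items_.push(std::move(item));
    return true;
  }

  // Returns false instead of waiting when the queue is empty.
  bool try_pop(T& out) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (items_.empty()) return false;
    out = std::move(items_.front());
    items_.pop();
    return true;
  }

private:
  const std::size_t capacity_;
  std::mutex mutex_;     // single lock shared by all producers and consumers
  std::queue<T> items_;  // FIFO order
};
```

A worker thread would call `try_pop` in its event loop and go back to polling sockets when it returns false.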

So, what is Scale Stack?

Check out the website for a simple overview of what it is. The goal is to pick up where the operating system kernel leaves off with the network stack. It is written in C++ and is extremely modular, with only the module loader, option parsing, and basic logging in the kernel library. It uses Monty Taylor’s pandora-build autoconf files to provide a sane modular build system, along with some modifications I made so dependency tracking is done between modules. You can actually use it to write modules that do anything; I’m just most interested in network service modules.

The kernel/module loader is also just a library, so you can embed it in existing applications as well. Some of the modules I’ve written for it are a threaded event handling module based on libevent/pthreads and a TCP socket module. There is also an echo server and a simple proxy module I created while testing the event and socket modules. The database proxy module builds on top of the event and socket modules. The code is under the BSD license and is up on Launchpad, so feel free to check it out and contribute. If you need a base to build high-performance network services on, you should definitely take a look and talk with me.

What’s up next?

I have a long list of things I would like to do with this, but first up are still some basics. This includes other socket type modules like TLS/SSL, UDP, and Unix sockets. Next come some more protocol modules such as Drizzle, a real MySQL protocol module, and others like HTTP, Gearman, and memcached. It’s fairly trivial to write these since the socket modules handle all buffering and provide a simple API. As for the DatabaseProxy module, I’d like to rework it so it’s not MySQL protocol specific, integrate other protocol modules, improve performance, add multi-tenancy support for quality-of-service queuing based on account rules, and work through a laundry list of other features I won’t bore you with right now.

I also have plans for other services besides a database proxy, especially one that could combine a number of protocols into a generic URI server with pluggable handlers so you can do some interesting translations between modules (like Apache httpd, but not HTTP-centric). For example, think of the crazy things you can do with Twisted for Python, but now with a fast, threaded C++ kernel. I also still need to experiment with live reloading of modules, but I’m not sure yet whether this will be worthwhile.
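None of this exists yet, but as a rough illustration of what “pluggable handlers” could mean, here is a small sketch that routes a URI to a handler chosen by its scheme. The `UriDispatcher` class and its methods are invented for this example and say nothing about how the real thing would be designed.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical sketch; not part of Scale Stack.
class UriDispatcher {
public:
  // A handler receives everything after "scheme://" and returns a reply.
  using Handler = std::function<std::string(const std::string&)>;

  // Registers (or replaces) the handler for one scheme, e.g. "http".
  void add_handler(const std::string& scheme, Handler handler) {
    assert(!scheme.empty());
    handlers_[scheme] = std::move(handler);
  }

  // Splits "scheme://rest" and passes rest to the matching handler.
  // Returns false if the URI has no "://" or no handler is registered.
  bool dispatch(const std::string& uri, std::string& reply) const {
    const std::string::size_type sep = uri.find("://");
    if (sep == std::string::npos) return false;
    const auto it = handlers_.find(uri.substr(0, sep));
    if (it == handlers_.end()) return false;
    reply = it->second(uri.substr(sep + 3));
    return true;
  }

private:
  std::map<std::string, Handler> handlers_;
};
```

Each protocol module would register its own handler, and a translation between protocols becomes one handler calling into another.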

If any of this sounds interesting, get in touch. I’d love to have some help! I’ll have some blog posts later on how to get started writing modules, but for now just take a look at the existing modules. The EchoServer is a good place to start since it is pretty simple. Also, if you’ll be at the MySQL Conference and Expo next week, I’d be happy to talk more about it then.