Looking for MySQL DBA

July 21st, 2008

Concentric, the company I work for, is looking for a MySQL DBA! The position is in San Jose, CA, and here is a link to the official job posting. To apply, please email your resume and a cover letter to SELECT REVERSE('moc.ox.ws@semuser');. Include “MySQL DBA” in your subject line. Here is the description:

Concentric, an XO Communications Service, is looking to grow its Engineering team with a MySQL Database Administrator. The successful candidate will have great technical depth and breadth to span our applications, server and web service. We want to talk to people who can think out-of-the-box to help us find unique solutions in a fast moving environment.

Responsibilities

  • Design and support the MySQL 5 databases for customer-facing and back-end systems used in a high volume internet services company using high availability techniques. Applications will be but not limited to: Customer application data, back-end billing data, system statistics, data-warehousing, archiving, calendaring, web site, and email data.
  • Ownership of entity relationship designs for development of MySQL 5 databases while working very closely with Software Engineering and Operations.
  • Ownership of all database performance characteristics including query plans, views, stored procedures, triggers and optimizations. This requires working closely with Engineering and Operations to stay on top of performance metrics and pre-determine eventual bottlenecks and prevent, work around or through them.
  • Capability to work closely and effectively with a cross-functional team (product management, software and network engineering, and QA)

Qualifications

  • B.S. in engineering or computer science, or other degree or equivalent professional experience.
  • Seven years of database development or DBA experience along with substantial MySQL experience.
  • Takes ownership, responsible and committed

Desired Experience

  • Knowledge or experience with one other prominent relational database (Oracle, MS SQL Server, Postgres, etc).
  • Programming knowledge with one or more database access APIs
  • Experience on a high traffic and high volume commercial sites (Commerce, Email, Web Data, etc)
  • Experience with VERY LARGE data sets as well as knowledge on how to painlessly grow them and keep them running optimally.
  • Experience in High Availability solutions including NDB and Replication.
  • Experience in implementing and tweaking open source software
  • Experience in Solaris, Linux and Windows production environments.

OSCON

July 20th, 2008

I’m getting geared up for OSCON next week. It looks like there will be quite a few regular sessions, along with a number of Birds of a Feather sessions, focused on MySQL.

For those making it out, I thought I would mention a couple fun things to do around Portland. I’ve been living here about three years now, so I haven’t seen it all, but these are some must-sees.

  • Powell’s Technical Books - I know they are a conference sponsor and will have a booth, but try to make it over to the real store (it’s just a short walk across the bridge from the convention center). Not only is it one of the best technical book selections I’ve seen, they also have a small computer museum with all the classics.
  • Voodoo Doughnut - You can get some late night snacks for hacking here (it’s open 24 hours/day). They have regular and vegan doughnuts, along with wedding services in case you find that special someone.
  • Portland Tram - You can get some great views from up here, and find out whether you’re scared of heights.
  • Forest Park - For the runners/hikers out there, you have to check this place out. The best areas to enter are off of NW Thurman.
  • Vegetarian/Vegan Restaurants - If you happen to be veggie (or not and just want some tasty food), Portland is a great place for it. My personal favorites include Blossoming Lotus and Veggie Thai. I’ll be trekking out to these places from the convention center if anyone would like to join.

I’d be happy to try and answer any other questions that a local might know. Hope to see you there!

Hot off the press

July 1st, 2008

The UPS decided to visit today and left this on the front doorstep:

Woot! If it’s even half as good as the tips and tricks described in the authors’ blogs we should all be in for a treat.

…and while I’m posting pictures, I can’t resist sharing this photo we took while visiting my family in Maine last week:

Yes, that’s a full size riding lawn mower strapped to the top of a small Ford Escort hatchback.

Summer Reading

June 20th, 2008

With the bulk of my classes over, I finally have time to read some hand-picked books (rather than those 1000 page dry volumes required for courses). This is perfect timing since the new “High Performance MySQL” book was just released. I got a little carried away with my order, but felt I should catch up on some books I’ve been eyeing for some time. Anyone else care to share their summer reading list?

By June 30th, I should have:

  • High Performance MySQL: Optimization, Backups, Replication, and More
  • The Art of Multiprocessor Programming
  • The Definitive ANTLR Reference: Building Domain-Specific Languages
  • Programming Erlang: Software for a Concurrent World
  • The Joy of Vegan Baking: The Compassionate Cooks’ Traditional Treats and Sinful Sweets

(The last book is for both Wendy and me; it should help provide tasty treats while reading the others.)

I was also considering “Managing Gigabytes” but chose not to purchase it since it is a bit dated (published in 1999) and it appears that close to half the book is dedicated to image compression. While this would be interesting, I’m mostly interested in the full-text index structures and algorithms portion. Any recommendations for more information on the subject (online docs or books)?

It’s Over (sort of)

June 11th, 2008

I just finished taking my last college final exam. Ever*. No more lectures, homework, or tests. I still have a group project to finish up over the summer before they hand me my shiny piece of paper proving that I know how to use a computer, but this was a big step getting there. :)

* (Yes, grad school is always a possibility, but I have no plans yet)

Pondering MapReduce

June 11th, 2008

I spent this past weekend writing a paper for a project I’ve been playing with. It is a simplified distributed processing system loosely based on Google’s MapReduce, except rather than focusing on larger batch jobs, it prototypes out some common database application uses. The model is currently very basic, but I plan on exploring this further (possibly with a performance-enhanced implementation in C). I’ve also been reading up on other interesting projects like Hadoop, HyperTable, Amazon’s SimpleDB, and of course the DB interface for Google’s AppEngine. I’m wondering how these distributed models can apply to relational databases (the Pig project has some relational constructs).

OSS Business Model?

May 16th, 2008

Let me preface this post by stating that I’m a developer and not a business analyst, but I wanted to point out something that strikes me as a bit contradictory. In recent posts like these by Savio Rodrigues and Matt Asay, there are references that say the commercial plugins MySQL/Sun was planning on releasing were part of an OSS business model. While the plugins would have been tied to an OSS project, the products themselves would have been commercial*. I think these products (and other current products MySQL/Sun offers only as closed source) should actually be considered as commercial product offerings (under a commercial business model), and not associated with a pure OSS business model. To me (and some may disagree) an OSS business model involves only software that provides the source to everyone, not just those who pay to license the source. For example, from a commercial business model perspective, how would those MySQL plugins have differed from a commercial Microsoft Outlook plugin? Sure, the MySQL plugin would have been plugging into an open source product, but the plugin product itself is just as closed (well, I suppose with Microsoft you need to buy the development environment if you wanted to write your own).

Also, I don’t believe “the community” was hurting the OSS business model in any way; on the contrary, it was helping preserve it. As you could see from the many responses to the MySQL announcement, this was only going to push some community members away. Were they hurting the commercial/hybrid business model? Of course! Who wants to see their favorite open source database company venturing further into closed source commercial offerings? The community is going to speak up to keep everything open.

In summary, I’m not sure what the best business model would be for open source or commercial (like I said, I’m a developer), but when talking about “OSS Business Models”, just be clear what is actually open source and what is commercial. Perhaps I’m being a bit naive, but I’m still hoping OSS businesses find a way to thrive while keeping all products 100% open.

* Even if they had given the source code away with the product, it would not have necessarily been “Open Source”, but rather just a source code license for a commercial product. If it really was going to be released under open source, then the first customer could have legally redistributed the source to the community, thus negating the product to begin with.

Asynchronous I/O - How It Could Speed Up Your App

May 6th, 2008

In a previous post I wrote about how I have started implementing asynchronous I/O into the MySQL client library. I plan on contacting and working with other client API maintainers (PHP, Python, Ruby, …) to make sure this functionality gets pushed out to those places too. Any comments or suggestions on how the interfaces should behave are of course welcome, and I’ll get patches posted somewhere for testing once I have the basics working. This is also my first project going through the MySQL Community Contributions Program so it can be included as part of a later release. I sent out the first contact e-mail to MySQL a week ago, but have not received any response yet. Anyone at MySQL listening? :)

For those of you new to the idea of asynchronous clients, check out the Wikipedia page for an introduction. The basic idea is to be able to issue a query, have the function return immediately, do some other work (while the server processes the query), and then check for the query response and process the result (as normal). This may not be much of a gain (if any) for fast queries, but for those potentially sluggish queries that take a few hundred milliseconds, this could be significant (assuming you have something else to process during that time). This is especially so if you need to issue multiple queries that take a little time, since they could all run in parallel on separate connections.

Let me demonstrate with a simple use case, ignoring error checking for brevity:

...
$mysql = mysql_connect("localhost", "myuser", "mypass");
$result = mysql_query("...query that takes 500ms...");
...process result...
$result = mysql_query("...query that takes 500ms...");
...process result...
...

Imagine this is part of a PHP script for our new webpage, and it needs to issue two queries to the MySQL server. Each of these queries takes approximately 500ms, resulting in a total processing time of ~1 second. Now if the code uses asynchronous I/O:

...
$mysql[0] = mysql_connect("localhost", "myuser", "mypass", 1, MYSQL_CLIENT_ASIO);
$mysql[1] = mysql_connect("localhost", "myuser", "mypass", 1, MYSQL_CLIENT_ASIO);
mysql_query_start("...query that takes 500ms...", $mysql[0]);
mysql_query_start("...query that takes 500ms...", $mysql[1]);
$result = mysql_wait($mysql[0]);
...process result...
$result = mysql_wait($mysql[1]);
...process result...
...

(Note: this is pseudocode; it will simply break if you try it currently!)

In this example, the “mysql_query_start” function calls will return immediately (send the packet out on the socket and return). This allows the server to process both of them in parallel, resulting in a total processing time closer to ~500ms. We just cut our page load time in half! (…well, ignoring network latency, but you get the idea)
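To see the effect without a database server, here is a minimal Python sketch that simulates both approaches. It is not the MySQL client API: run_query is a hypothetical stand-in that just sleeps for 500ms, and a thread pool plays the role of the two asynchronous connections. Treat it as an illustration of the timing argument only.

```python
import time
from concurrent.futures import ThreadPoolExecutor

QUERY_SECONDS = 0.5  # stand-in for a query the server needs ~500 ms to answer


def run_query(label: str) -> str:
    """Simulate a blocking round trip to the server (hypothetical stand-in)."""
    time.sleep(QUERY_SECONDS)
    return f"rows for {label}"


def sequential() -> float:
    """Issue two queries back to back, as in the first listing."""
    start = time.perf_counter()
    run_query("query 1")
    run_query("query 2")
    return time.perf_counter() - start


def overlapped() -> float:
    """Start both queries, then wait for each, as in the second listing."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = pool.submit(run_query, "query 1")   # returns immediately
        second = pool.submit(run_query, "query 2")  # returns immediately
        first.result()   # plays the role of mysql_wait($mysql[0])
        second.result()  # plays the role of mysql_wait($mysql[1])
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {sequential():.2f}s")
    print(f"overlapped: {overlapped():.2f}s")
```

On an otherwise idle machine the sequential run should take about twice as long as the overlapped one, which is the same halving described above.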

You can of course expand on this by doing some other time-consuming processing before waiting, say image manipulation. You can also issue more than two SQL queries (possibly to different servers) and then collect all the responses in the end. I also plan on adding a query pool so that you can add multiple pending queries to it and wait for the first one that returns. This way you would not need to order your “mysql_wait” calls by how you *think* they will return; you will always get them in the order in which they finish.
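The query pool idea can be sketched the same way. This is again a Python simulation under stated assumptions, not the planned client interface: run_query and completion_order are hypothetical names, the sleep durations are made up, and the standard library's as_completed stands in for "wait for the first one that returns".

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_query(label: str, seconds: float) -> str:
    """Simulate a query that takes `seconds` to answer (hypothetical stand-in)."""
    time.sleep(seconds)
    return label


def completion_order(queries: dict[str, float]) -> list[str]:
    """Start every query, then collect results in the order they finish."""
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        pending = [pool.submit(run_query, label, secs)
                   for label, secs in queries.items()]
        # as_completed yields each future as soon as its result is ready,
        # regardless of the order in which the queries were submitted.
        return [future.result() for future in as_completed(pending)]


if __name__ == "__main__":
    # The slow query is submitted first, yet it is handled last.
    print(completion_order({"slow": 0.5, "fast": 0.1, "medium": 0.3}))
```

The caller never has to guess which query will finish first; it simply processes results as they arrive.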

Again, this is still in its infancy, so please let me know if you have any comments or suggestions!

Will Sun’s job cuts hit MySQL?

May 2nd, 2008

I just came across this article about Sun’s numbers, and read “Schwartz plans to trim as many as 2,500 workers.” Does anyone know if MySQL will get hit? It seems like they still need help (quite a few job requisitions open), and this obviously won’t help matters much.

Salutations, MySQL Community

April 26th, 2008

For the past couple weeks I’ve been catching up on the Planet MySQL blogs in preparation to start participating in the community. I figure this would be the best way to get to know the “who’s who” of the most active members and to follow the latest features and developments. I’ve also been doing a fair amount of reading/research on MySQL development and databases in general. In particular, I’ve found the new MySQL Forge site very helpful (especially the wiki), along with the “Understanding MySQL Internals” book written by Sasha Pachev. My plan is to become familiar enough with the code base to contribute new features, help fix bugs, and hopefully meet some friends along the way! In particular, I have an interest (and experience) in parallel, clustered, and distributed computing, and trying to get the most out of these multi-core processors. I was happy to see others were interested as well (for example, this post by Keith Murphy).

A few projects in particular that I’m going to start poking at are:

  • Asynchronous I/O support in the client library.
  • Connection pooling in the client library (need ASIO first).
  • Multi-master replication (real multi, not just dual or circular).
  • Synchronous replication for low-latency environments (building on Google’s patches to support ALL nodes).

I’ll be writing additional blog entries for each of these later. I know many (all?) of these are already on the radar within the MySQL development team, but hopefully there are ways to aid in some of this core development without being an employee of MySQL AB/Sun. I already have an email out to Jim Winstead on the ASIO support since I saw him mention it in a previous blog entry.

As a little background on myself, I’m a senior software engineer at Concentric (blog entries have the usual disclaimer: they are my opinions and do not express those of my employer). I have been writing high performance, high availability software in a clustered environment for some time now (including some proprietary clustered database projects). I’ve been a supporter of open source since my first Slackware v3 Linux install, and have been using MySQL since the late 90s. I’ve always been impressed by the stability and progress of the MySQL project, and when I spent some time to evaluate the NDB storage engine last year (very cool stuff), I decided to get myself up to speed so I could participate in the community. So, after front-loading a bunch of research, I’m excited to begin. :)