
Buck Twenty-Nine Fail

About a year ago, Amazon.com raised the prices on some of their MP3 offerings to $1.29. Previously, individual songs were priced in the $0.89 to $0.99 range, and less when buying entire albums.
Apple and other online music sellers made the same move, driven by price hikes and retail price demands from record labels.

I was a big fan of Amazon MP3, and this price hike greatly saddened me. It also changed my music consumption habits, or rather reverted them. I’ll still buy a $0.99 song from Amazon, but if the offering isn’t available at that price point, I will break the law and download the song — often the entire album, because that’s just as easy — for free.

The music industry continues to slowly dig its own grave.

MySQL Conference Day 1

My first day at my first MySQL conference was a riotous success. I attended the “State of the Dolphin” keynote followed by talks given by Tim O’Reilly and Facebook’s own Mark Callaghan, who also won a MySQL Community Member of the Year Award during the opening talks. Congrats to Mark!

After the keynotes, I synced up with other Facebookers at our expo hall booth, and then I went to Domas Mituzas’ talk on “High Concurrency MySQL”. The ballroom couldn’t hold all the people who wanted to watch; there was actually a line of people outside the door listening in on his talk! Although I wouldn’t suggest Domas give up his day job to write slides full-time, he gave a great presentation overall that kept the audience interested and engaged.

Next, I attended a presentation on Sqoop by my two-time TA at the UW and now Cloudera co-founder and presenter extraordinaire, Aaron Kimball. Sqoop is a SQL-to-Hadoop translation layer that automates many of the steps of shuttling data from OLTP stores to HDFS for analytics. It is open source and Aaron is its primary developer. You can check out the code on GitHub, or use it as part of Cloudera’s Hadoop Distribution.

After lunch, I went to a presentation by Lars Thalmann on new MySQL replication features in 5.1 and 5.5. Lead replication developer Mats Kindahl was also there to answer questions. It’s good to see that MySQL is making progress on replication, but it is still woefully limited in a number of ways: it is not crashproof, it is single-threaded, and it is difficult to replicate to non-MySQL data stores. Fixes for all of these are on the roadmap, but from the answers to my questions, I got the impression that these ideas are still mostly bullet points on a slide rather than almost-features in MySQL.

Make no mistake, these features are hard to add (I’ve dabbled around in the area myself), and it took Mark a concerted effort to port rpl_transaction_enabled from our 5.0 patch to Facebook’s 5.1 patch. Still, I hope MySQL takes the rpl_transaction_enabled patch and merges it into 5.1 or 5.5 officially, because in any large deployment, it is incredibly useful not to have to intervene manually when a slave crashes.

After the replication talk, I went back to the expo hall to talk with people, then I hacked on MySQL in the afternoon. Could there possibly be a better venue for this? Two (small) diffs later, and I was back in the expo hall socializing/recruiting for Facebook. The night ended well with a trip to In-N-Out.

MySQL Conference Begins Tomorrow

The 2010 O’Reilly MySQL Conference starts tomorrow in Santa Clara. Facebook’s Database Engineering team (which includes me!) will be there along with some of our Operations team and our one-man Performance team. Each team will be giving a talk at the conference:

On Tuesday, the Database Performance Team will be presenting on “High Concurrency MySQL.” Domas is an interesting, animated fellow, and I imagine that his talk will be quite entertaining as well as informative.

On Wednesday, the Database Operations Team will be speaking about Database Operations at Scale. Our DB Ops guys are some of the best in the business; they keep our database tiers, which are often under enormous pressure from growth and changing requirements, running remarkably well.

On Thursday, Mark Callaghan, Ryan Mack, and I will be presenting our talk on High-throughput MySQL (we claim that Domas stole our title rather than the other way around). Mark Callaghan is one of the leading advocates for MySQL at Facebook and in the MySQL community. Working with him and the original Ryan (as I call Ryan Mack, who preceded me on this team) has been nothing short of an extraordinary opportunity for me to learn from the best.

Uncommunicative Tweets

My friends are occasionally perplexed by my tweets. In response to one recent tweet, my friend Dan responded:

@RyanMcE You need to add more words if you intend your tweets to be communicative.

And he is absolutely correct. In this case, there is nothing private about the tweet in question (“Dubious indeed”). The story was that my friend Maria and I pulled an April Fools’ prank on Facebook by becoming engaged. Enough people fell for it that it was fun, but one of my friends called the timing of the announcement “dubious,” since it did come on the first of April. The tweet was in reference to this comment; probably only those who happen to follow me on Facebook would have had any idea what I was talking about.

So, if there is a tweet you don’t understand, know this: not all my tweets are meant to be communicative to all audiences. Just like with some of my blog posts, some of my tweets are really just markers in time for my future reference. I wrote about something like this before, in a post called Why I Blog, and before Twitter, I would occasionally post a one-liner to this blog. Now those one-liners have simply migrated to Twitter.

On Password Restrictions

Websites should list their password restrictions on their login pages. Sometimes I run into the following problem:

I try to use a password generated by my “standard model”, i.e., a standard prefix depending on the nature of the site and some salt determined by the website itself. However, some sites have stupid rules on their password requirements. In real life, I have encountered a wide variety of password requirements:

  • A requirement of an exactly 6-character password
  • A prohibition on “special characters” like any of !@#$%^&*()+=></?{}[]|\/.
  • A requirement for a special character that happens to be one of !@#$%^&*()
  • A requirement for numbers, uppercase, and lower case in the password
  • A requirement for two sets of letters and numbers in the password, i.e., it must fit the regex /([a-zA-Z]+[0-9]+){2}/
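
To make a few of these rules concrete, here is a minimal Python sketch that reports which rules a candidate password breaks. The function name and rule labels are hypothetical, and it assumes the regex may match anywhere in the password; a real site might anchor the pattern or word its rules differently.

```python
import re

# The regex quoted in the list above: letters then digits, twice in a row.
TWO_SETS = re.compile(r"([a-zA-Z]+[0-9]+){2}")

# The special characters that one site required (from the list above).
REQUIRED_SPECIALS = set("!@#$%^&*()")

# Illustrative rule table: label -> predicate returning True when the rule is met.
RULES = {
    "exactly 6 characters": lambda pw: len(pw) == 6,
    "has a required special character": lambda pw: any(c in REQUIRED_SPECIALS for c in pw),
    "has digit, uppercase, and lowercase": lambda pw: (
        any(c.isdigit() for c in pw)
        and any(c.isupper() for c in pw)
        and any(c.islower() for c in pw)
    ),
    "two sets of letters and numbers": lambda pw: TWO_SETS.search(pw) is not None,
}


def failed_rules(password, rule_names):
    """Return the labels of the rules (from rule_names) that the password breaks."""
    return [name for name in rule_names if not RULES[name](password)]
```

Each site enforces its own subset, so a caller would pass only the labels that apply to that site, for example `failed_rules("abc123", ["two sets of letters and numbers"])`.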

When my standard model password doesn’t fit into one of the more esoteric requirements, I have to modify it to fit. Fortunately, I find that on this subject at least, I tend to think the same way over time, so, given the standard model and a set of constraints, I will usually come up with the same password. However, it is uncommon for websites to list their password constraints on the log-in page. Therefore, I will usually try the standard model password first, and only when that fails twice (in case I mistyped the first time), and I’m down to one more try, do I realize that this website might be “special.”

Then I have to go to the trouble of finding out what the password requirements are. This is not difficult (usually it involves clicking the “sign up” button and reading a little bit), but it does take some time and it is very annoying. Listing the password requirements on the login screen would make for a much better user experience. Since it is so easy to find this information, not displaying it on the login screen can’t be interpreted as a security measure either.

Of course, the real solution is for websites to get rid of their inane password requirements, so I never have to deviate from the standard model.

Safety Agains Reopen

What does this comment in the MySQL source mean? (log.cc, currently line 2295 in 5.1)

{ // Safety agains reopen

I think I understand what it’s supposed to mean — the writer is pointing out that the code is checking again, to be double sure that the log is still open (although, if it can close between this call and the last call to is_open(), I’d be worried about it closing after this call too… note that both checks are after LOCK_log has been acquired).

What I’m more interested in is what the comment, as written, actually means. The grammar is very odd. I’m open to suggestions.

Blog Optimization

In the last two days, I

  1. Changed the storage engine of my blog’s MySQL tables from MyISAM to InnoDB
  2. Installed the WordPress Memcache Plugin to minimize database queries (16-25 queries reduced to 2-7)
  3. Installed APC (Alternative PHP Cache) to reduce PHP bytecode compilation overhead. As a result, all PHP sites on minimus should be faster.

In addition, I did some general cleaning up and upgrading of software on minimus and nexus.

Altogether, these changes reduce the typical Checksum Arcanius page load from 2.5-3.5 seconds to 0.5-1.5 seconds, a 2-7x improvement.
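
As a quick sanity check on that range, here is a small Python sketch of the arithmetic. It assumes the speedup is computed by comparing the endpoints of the two load-time ranges; the function name is hypothetical.

```python
def speedup_range(before, after):
    """Return (worst, best) speedup given (min, max) load times in seconds."""
    before_min, before_max = before
    after_min, after_max = after
    worst = before_min / after_max  # fastest old load vs. slowest new load
    best = before_max / after_min   # slowest old load vs. fastest new load
    return worst, best


worst, best = speedup_range((2.5, 3.5), (0.5, 1.5))
print(f"{worst:.1f}x to {best:.1f}x")  # prints "1.7x to 7.0x"
```

So the low end is closer to 1.7x, which rounds to the 2x quoted above.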

These are very easy steps to take — I would suggest them to anyone running WordPress. Step-by-step directions follow (assuming Ubuntu Linux):

  1. For each table in your blog’s database, execute the following SQL via a mysql client instance, phpMyAdmin, etc:
    ALTER TABLE <tablename> ENGINE = InnoDB;
  2. Install memcached and the PHP memcache extension:
    sudo apt-get install memcached php5-memcache
  3. Download the WordPress Memcache Plugin and place it in your wp-content directory. That is all you have to do to get memcache support in WordPress!
  4. Install APC:
    sudo apt-get install php-apc
  5. Restart Apache:
    sudo /etc/init.d/apache2 restart
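
For step 1, typing the statement once per table gets tedious. Here is a minimal Python sketch that builds the statements from a list of table names. The function name is hypothetical, and the table names shown are only examples using WordPress’s default wp_ prefix; get your real list from SHOW TABLES.

```python
def alter_statements(table_names, engine="InnoDB"):
    """Build one ALTER TABLE ... ENGINE statement per table name."""
    statements = []
    for name in table_names:
        # Backtick-quote the identifier, doubling any embedded backticks.
        quoted = "`" + name.replace("`", "``") + "`"
        statements.append(f"ALTER TABLE {quoted} ENGINE = {engine};")
    return statements


# Example table names only; substitute the output of SHOW TABLES.
for statement in alter_statements(["wp_posts", "wp_comments", "wp_options"]):
    print(statement)
```

The printed statements can be pasted into a mysql client session or phpMyAdmin as in step 1.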

Very simple steps with a very high payoff.