Toru Maesaka
Hackaholic and a Web Addict based in Tokyo

Two Weeks in Review

Fri, 08/27/2010 - 10:58

It’s been almost two weeks since I left mixi, where I had a great time for the past 3 years and 8 months. Here’s what I’ve been up to lately.

  • Having dinner every night with life-long friends that I made through working at mixi.
  • BBQ in the Mountains of Chiba with friends.
  • Riding my new road bike (bicycle).
  • Day trip to Kamakura.
  • Studying relevant technologies for my next gig.
  • Searching for a new apartment to rent in Tokyo.
  • Throwing things out to make moving easier.
  • Cancelled my gym membership due to moving.
  • Writing final evaluation for Djellel on Google Summer of Code.
  • Retired my iPhone 3G that I’ve been using for over two years.

My free time comes from a surprise stack of unconsumed vacation time that I had left at mixi. You see, I hardly used my paid leave while I was at mixi due to working on exciting projects and having fun people around me. Time just flew.

As a replacement for my iPhone 3G, I bought a BlackBerry Bold 9700, which now makes me a BlackBerry + Android user. I don’t have anything against the iPhone as a product (although I have improvement suggestions) but I wanted a change after using it for over two years. I’m also hoping that the new iPod Touch will come with a camera, which would eliminate my desire for an iPhone 4. I still actively use my HTC Magic for mobile web surfing and for tethering over b-mobile in the field.

My opinion on the BlackBerry so far is that it’s a powerful email machine when combined with Google Sync and Google Contacts (now part of Gmail). I’m certain that I’ve been replying to more emails while I’m out than when I was using the iPhone. The web browsing experience on the 9700 is poor but I have an Android to fill that void.

Anyhow, I just wanted to let my readers know that I’m doing fine. My vacation ends next week so until then, please feel free to ping me for a hand on your project, drinking or whatever :)

Categories: Blogs, MySQL

BlitzDB Crash Safety and Auto Recovery

Thu, 07/22/2010 - 10:43

Crash safety is a big deal in the database league. Lack of durability can lead to all sorts of terrible things upon a catastrophic event. Many projects, especially in the so-called NoSQL world, compromise crash safety in return for higher QPS. The argument there is that the availability of the overall system should be accomplished by replication, since a database server can’t be rescued if the physical disk breaks. I happen to agree with this philosophy but I am also aware that this isn’t the correct answer for everyone. So, what will I do with BlitzDB?

Several relational database hackers have pointed out that BlitzDB isn’t any safer than MyISAM since it doesn’t guarantee crash safety. This is currently true but I plan on making BlitzDB much safer than MyISAM by providing the following features:

  1. Auto Recovery Routine (startup option)
  2. Tokyo Cabinet’s Transaction API (table-specific option)

The second feature above would actually make BlitzDB crash safe (especially combined with auto recovery) but I won’t go into depth in this post since this topic deserves a blog post of its own. Let me just state that this feature will be provided in a form like this:

CREATE TABLE t1 (
  a int PRIMARY KEY,
  b varchar(256)
) ENGINE = BLITZDB, CRASH_SAFE;

From here on, I’ll cover how I plan on hacking auto recovery in BlitzDB.

Auto Recovery Challenges

As I blogged a while back, recovering Tokyo Cabinet is relatively simple. However, this is not a sufficient solution in BlitzDB since the data file (the hash database that actually holds the rows) and the index file(s) are independent of each other. That is, the data file and the index file(s) are very likely to be inconsistent after a crash. So, how can we hack on this? Pretty simple.

Indexes aren’t Important at Recovery Phase

Because BlitzDB logically separates the data file and its indexes, index files aren’t that important. If a server crash occurs, BlitzDB can delete the index file(s) and recompute them from the data file. Needless to say, this process would involve a lot of random access and computation, but it would not dominate the system’s overall running time since it’s a one-time cost. This approach has one flaw, however: the index files can’t be recomputed if the data file is broken or unrecoverable.

Therefore to guarantee crash safety, BlitzDB must ensure that the data file is unbreakable. This is precisely where Tokyo Cabinet’s Transaction API comes in. I’m planning on using it to protect the data file from breaking. If the data file is protected, the table can be rescued. Simple!
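To make the rebuild idea concrete, here is a rough sketch of recomputing an index from the data alone. It is not BlitzDB code: the STL containers are in-memory stand-ins for the Tokyo Cabinet files, and Row, DataFile, SecondaryIndex and rebuild_index are hypothetical names made up for this illustration.

```cpp
#include <map>
#include <string>
#include <unordered_map>

// In-memory stand-ins for illustration only. BlitzDB keeps these
// structures in Tokyo Cabinet files, not in STL containers.
struct Row {
  std::string name;  // hypothetical indexed column
};

using DataFile = std::unordered_map<int, Row>;           // primary key -> row
using SecondaryIndex = std::multimap<std::string, int>;  // column value -> primary key

// Throw away whatever the index held and recompute it from the rows.
void rebuild_index(const DataFile &data, SecondaryIndex &index) {
  index.clear();  // the old index is not trusted after a crash
  for (const auto &entry : data) {
    index.emplace(entry.second.name, entry.first);
  }
}
```

The point of the sketch is that the index is purely derived data: as long as every row in the data file survives, any stale or half-written index entry can simply be discarded and regenerated.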

So, that’s what I have in mind for making BlitzDB a safer engine. Unfortunately I can’t start hacking on it immediately since I have several bugs to fix first. Nevertheless I’m looking forward to hacking on it. This challenge should be quite fun to tackle.

Categories: Blogs, MySQL

Extending CREATE TABLE Syntax in Drizzle

Wed, 07/21/2010 - 18:37

The flexibility to add table-specific options for things like compression, encryption and optimization can be useful to storage engine developers as this flexibility can open up new possibilities. Here’s what I’m talking about:

CREATE TABLE t1 (
  ...
) ENGINE = my_engine, MY_OPTION = your_arg;

Supporting this is relatively easy in Drizzle and this API feature (and a bit more) is available in MariaDB as well. Unfortunately Drizzle’s method to do this isn’t documented in the Wiki yet but it should be added when our Storage Engine API becomes stable (as in, no interface changes).

Implement StorageEngine::doValidateTableOptions()

Here’s the actual interface.

bool StorageEngine::doValidateTableOptions(const std::string &key,
                                           const std::string &state);

This function is called for each table option given when a CREATE TABLE statement is executed. The first argument, key, is a const reference to a string that represents the option name. The second argument, state, represents the argument given for that option.

Therefore, given: COMPRESSION = YES_PLEASE, key would be “COMPRESSION” and state would be “YES_PLEASE”. The objective of this function is to check whether the key/state pair makes sense to your storage engine. If this function returns false, Drizzle will return an error for the CREATE TABLE query. Personally I think this interface can be improved to be a bit more developer friendly, for example by making it easier to validate numeric values without forcing the developer to convert the data by hand. That said, given the pace at which Drizzle is growing, this could be improved before we know it.
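As a rough illustration of what such a check might look like, here is a standalone sketch. It is a free function rather than a real StorageEngine override so that it compiles without Drizzle; validate_table_option, is_unsigned_integer, the BLOCK_SIZE option and the NO_THANKS value are all hypothetical, and matching keys by exact, case-sensitive comparison is an assumption.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: true only if state is a non-empty run of digits.
static bool is_unsigned_integer(const std::string &state) {
  if (state.empty()) return false;
  for (unsigned char c : state) {
    if (!std::isdigit(c)) return false;
  }
  return true;
}

// Stand-in for an engine's doValidateTableOptions() logic.
// Returns true when the key/state pair is acceptable to the engine.
bool validate_table_option(const std::string &key, const std::string &state) {
  if (key == "COMPRESSION") {
    // Option with a fixed set of accepted values (made up for this sketch).
    return state == "YES_PLEASE" || state == "NO_THANKS";
  }
  if (key == "BLOCK_SIZE") {
    // Numeric option: the engine has to check the string itself.
    return is_unsigned_integer(state);
  }
  return false;  // unknown option: reject so CREATE TABLE fails
}
```

The numeric branch shows the kind of string handling that the current interface leaves to the engine developer.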

Access Options at StorageEngine::doCreateTable()

Here’s the actual interface for doCreateTable().

int doCreateTable(drizzled::Session &session,
                  drizzled::Table &table_arg,
                  const drizzled::TableIdentifier &identifier,
                  drizzled::message::Table &table_proto);

Given that the options were successfully validated, doCreateTable() is called next. In Drizzle, all information regarding a table (including options) is represented in a Google Protocol Buffer message. A reference to that message object is passed to doCreateTable() as the fourth argument so all you need to do is loop through the options list in the message object and extract what you need. Here’s a minimal example that only takes care of one option.

int n_options = table_proto.engine().options_size();
 
for (int i = 0; i < n_options; i++) {
  if (table_proto.engine().options(i).name() == "my_option_name") {
    // Do whatever you like with this stream.
    std::istringstream stream(table_proto.engine().options(i).state());
  }
}

The above example should be simple to extend to handle multiple options. What’s really important in the above example is that the option name can be accessed with the name() accessor and the state (value) of that option with the state() accessor.
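For what it’s worth, handling several options could look roughly like the sketch below. To keep it compilable without Drizzle or Protocol Buffers, the Option class is a hand-written stand-in that only mirrors the name() and state() accessors; TableSettings, collect_settings and the option names are hypothetical.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Minimal stand-in for the generated Protocol Buffer option class.
// Only name() and state() mirror the accessors used above.
class Option {
public:
  Option(std::string name, std::string state)
      : name_(std::move(name)), state_(std::move(state)) {}
  const std::string &name() const { return name_; }
  const std::string &state() const { return state_; }

private:
  std::string name_;
  std::string state_;
};

// Hypothetical per-table settings an engine might collect.
struct TableSettings {
  bool compression = false;
  int block_size = 0;
};

// Walk the option list once and dispatch on the option name.
// Returns false if a numeric option fails to parse.
bool collect_settings(const std::vector<Option> &options, TableSettings &out) {
  for (const Option &opt : options) {
    if (opt.name() == "COMPRESSION") {
      out.compression = (opt.state() == "YES_PLEASE");
    } else if (opt.name() == "BLOCK_SIZE") {
      std::istringstream stream(opt.state());
      if (!(stream >> out.block_size)) return false;
    }
  }
  return true;
}
```

In a real engine the loop would run over table_proto.engine().options(i) as in the example above; the if/else chain is the only part that grows as options are added.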

So, that’s all I have to cover for now. I hope this feature will help storage engine developers create and provide useful table-specific features for their engines.

Happy Hacking.

Categories: Blogs, MySQL