Couchio - About CouchDB

CouchDB: The Definitive Guide — Redesigned Website, Up To Date Content and All Open Source And Forkable on GitHub

Sat, 08/28/2010 - 23:22

Last week we (Noah, Chris and Jan, with great design help from Kristina Schneider) released our latest work on CouchDB: The Definitive Guide.

The latest update includes:

  • Updated and edited content as it appears in the printed book.
  • All new styles for your reading pleasure.
  • All new open source setup using GitHub for contributions and translations.

It took quite a while to get it all out, but we couldn’t be more proud of what we can offer you now. O’Reilly’s great editorial team went through all our chapters and cleaned up all the nitty-gritty, and we think it turned out great. It feels almost like a real book now ;)

The Long Haul

The bulk of the work went into making sure the book is super easy to work on in the future. We want to be able to improve the book and collaborate with the open source community as much and as efficiently as possible.

Minimal Technology

This includes switching the format of our sources from AsciiDoc to a subset of HTML. This allows us to work with the native web, our primary publication platform, without having to deal with conversion issues left and right (trust us on that one).

The minimal nature of our markup might throw you off (we don’t even close our <p>’s!) but it is really great to write in and keeps everything lean and free from technological cruft that is not really needed. We encourage you to view source at least once.

A small XSLT (yuck, we know ;) transforms our minimal HTML into DocBook which O’Reilly in turn can take and produce a book from every once in a while.

GitHub Goodness

We’ve all been using Git and GitHub for quite some time in other projects and we finally migrated our book repository over (yes, Kristina is an avid Git user :). Halfway through, GitHub introduced Organizations and it is just perfect for our needs.

You can now fork, edit, and contribute back to the book without much hassle. This is a good opportunity to thank the folks at O’Reilly again for their commitment to open source and for allowing us to publish our work under the Creative Commons license.

Aftermath

Getting everything together truly felt like shipping a 1.0 project. Once we agreed on a final date, we cut our todo list in half in favour of getting done in time. There is still work to do, but we managed to get all the rough edges out.

The result is amazing: in the few days we’ve been live, more work on the content has been done than in the past six months. We have already included contributions from third-party contributors, and the German translation is already two chapters in.

And nothing is stopping you from helping out :)

Here’s to the second edition!

— Chris, Noah, Jan & Kristina

Categories: Companies, NoSQL

What’s new in CouchDB 1.0 — Part 4: Security’n stuff: Users, Authentication, Authorisation and Permissions

Sat, 08/28/2010 - 22:22

Welcome to Part 4 of my little mini-series on new features in CouchDB 1.0. Do not miss parts one, two and three.

Today, I get a little help from Rebecca. She’s writing a CouchApp, an application that is served right out of CouchDB and that lives in the browser. It has no middle tier application server in Ruby or Java. The application and display logic is written in JavaScript, the user interface is HTML & CSS, the backend is CouchDB and uses Ajax to shove JSON back and forth.

Rebecca is writing a small todo list app for herself and her friends, but we’ll punt on the actual application for now, so we can concentrate on the security features. Parts two and three of our book CouchDB: The Definitive Guide explain how the rest of the application development works; make sure to read up on them!

Security

Security is a wide field. This article only discusses some of the things you need to know to add authenticated-user features to your CouchDB application (whether it is a CouchApp or a regular application). It does not discuss best practices for securing network servers or defending against cross-site scripting.

View Source is Open Source

The CouchDB security model is based around the premise that Rebecca can control who can create documents of what form in which database inside CouchDB. It does not try to make CouchDB, and all the data she and others put in, absolutely watertight and free of information leaks. Although you can lock CouchDB down as much as you need, open and sharable databases are the default, and that is a good thing.

Traditional applications are built with a database in the back; an application is the only logical “user” of a database, the only entity that accesses the database directly (aside from maybe an administrator). CouchDB happily supports that model (there is nothing wrong with it either). The only case where security is relevant here is shared hosting where you have multiple mutually untrusting parties accessing a single CouchDB instance. The mechanics I describe can be used to make CouchDB useful here, but I won’t describe this specific scenario. Partly because it is a lot simpler to just give every user a separate full instance of CouchDB with “root” access, but mostly because I believe there is a much more interesting deployment scenario:

The idea of standalone CouchApps is that they travel with the data, since they are just some HTML, CSS & JavaScript that CouchDB tacks onto a _design document as attachments. Applications as data allow us to replicate them around just like we do with data. Data ultimately wants to be free and shareable with the people and applications we trust. Why shouldn’t applications do the same?

Since CouchApps run in the browser you can’t hide their implementation anywhere. In that, CouchApps are inherently Open Source and we believe that is a good thing because that is how the web works and that’s a crucial feature of the web. View source allows everyone curious to learn how a website was built.

Yadda, yadda, Open Source zealottery, you can’t hear it anymore, sorry :) — If you like your Rails, Django, PHP or Java, CouchDB won’t prevent you from using them. You can create private, closed source applications, but you’re losing the powerful attribute of native app-shareability.

All that said you can make CouchDB as closed as you need it, but with each barrier to entry you lose a layer of data and application flexibility. You might want to reconsider some of your previous ideas about how to lock down your app in the light of ultra-portable, peer-to-peer shareable applications.

Ok, CouchApps are a big deal, you get that now, I’ll shut up. Back to the nuts and bolts of CouchDB security :)

Terminology

Let’s make sure we talk about the same things.

  • Admin Party: CouchDB by default comes in the admin party mode. Each request made to CouchDB is considered to come from an admin. This means it is extremely easy to get started. Isn’t that terribly insecure, you ask? By default CouchDB will only listen on 127.0.0.1, your localhost IP address. Only users on your computer can access CouchDB. Most of the time that’s just you, so no biggie; but be aware of this when you are on a machine with multiple, possibly untrusting users.

  • Database: A database is a bucket that holds any number of documents in CouchDB. Each CouchDB server can have any number of databases. Each database is self contained and access to CouchDB can be defined on a per-database level. I’ll show you how below.

  • User: A user is identified by a username and matching password that is securely stored inside CouchDB. A user can have one or more roles assigned. A user with an empty name and password is the anonymous user. CouchDB further distinguishes admin users between server admins and database admins.

  • Authentication: The process of a user proving it’s her by providing the correct username / password combination in an authenticated HTTP request.

  • Authorisation: The process of determining whether an authenticated user is allowed to do what she wants to do.

  • Roles: Roles are associated with users; you could also call them “groups”. For example, in Admin Party mode, each request is implicitly authenticated with the anonymous user that in turn implicitly gets assigned the _admin role that allows each request to do anything.

  • Anonymous User: A user with an empty username and password. All unauthenticated requests are implicitly assigned to the anonymous user.

  • Access Control Lists: A list of usernames or roles for a database. CouchDB distinguishes reader-ACLs and admin-ACLs. A database admin can fully access the database. The reader-ACL list defines a list of users or roles that can read from the database. If no reader-ACLs are defined, everybody can read from the database. Note that there is no writer-ACL; see Validation Functions next.

  • Validation Functions: A JavaScript function stored in the validate_doc_update field of a _design document. It gets executed whenever a write request reaches the database. It can decide to allow or deny access to the database based on the document that is being written and the authenticated user or her roles.

  • Stateless HTTP: Each HTTP request stands on its own. Client and server do not expect that any previous requests have been made.

  • Basic Auth: An authentication mechanism for HTTP that uses base64 encoded headers to send a user’s credentials to the server. Most notably known for producing an ugly pop-up window in the browser (although this can be prevented). Base64 encoding is not a form of encryption. It is not a safe transport for user credentials. It is easy for third parties to spy out a user’s password. Basic auth can only reasonably be used in a trusted environment: a local LAN, a VPN or over SSL.

  • Secure Cookie Auth: Unlike Basic Auth, Secure Cookie Auth uses an HMAC-signed cookie for transporting user credentials. It can be used to securely authenticate users over an untrusted connection.

  • OAuth: Lets users allow applications to authenticate as the user to a service. The canonical example is a web application that does something with a user’s private account data on another service. With OAuth, the web application does not have to know the user’s credentials to do its work. Access permissions can be managed and revoked on a per-application basis. OAuth is not limited to web applications though.
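To make the Basic Auth entry above concrete, here is a minimal sketch, assuming a Node.js runtime (for Buffer), of how the Authorization header is built and how easily it is reversed. The function names are ours, chosen for illustration; they are not part of CouchDB.

```javascript
// Build the value of an HTTP Basic Auth "Authorization" header.
function basicAuthHeader(username, password) {
  const token = Buffer.from(username + ":" + password, "utf8").toString("base64");
  return "Basic " + token;
}

// Anyone who sees the header can reverse it; no key or secret is needed.
function decodeBasicAuthHeader(header) {
  const token = header.replace(/^Basic /, "");
  const decoded = Buffer.from(token, "base64").toString("utf8");
  const separator = decoded.indexOf(":");
  return {
    username: decoded.slice(0, separator),
    password: decoded.slice(separator + 1)
  };
}

const header = basicAuthHeader("rebecca", "12345");
console.log(header);
console.log(decodeBasicAuthHeader(header));
```

The round trip needs nothing but the header itself, which is why Basic Auth should only travel over a trusted or encrypted connection.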

Getting Started with Security in Futon

Let’s start with a blank slate. A fresh installation of CouchDB 0.11.0 or later, the admin party and a look at Futon:

In the lower right you should see that in fact, we are having an admin party.

Admin Party!

You should also see a link that asks you to “Fix This”. Click it and you should see a form that asks you to specify a username and password for the first server admin user.

Creating an Admin User

I put in rebecca and 12345; you can choose whatever you like. Just remember it, otherwise you’re locked out of the server (there are ways to get in again, but that’s beyond the scope of this article).

The lower right should now greet you with your username and offer you to create more admins or to log out. Futon uses Secure Cookie Authentication to keep you logged in.

Logged In

At this point, CouchDB no longer runs in Admin Party mode and requires you to be logged in to perform certain actions: creating or deleting a database; creating, updating or deleting a _design document inside a database; reading or updating the _config API; reading _stats or _log; and requesting temporary views.

Under the Hood

Let’s see what goes on under the hood of Futon. We’re following the same steps as above, only we use curl on the command line instead of Futon.

First, see if CouchDB is running:

> curl http://127.0.0.1:5984/
{"couchdb":"Welcome","version":"1.0.0"}

Yes!

Next, let’s create an admin user:

> curl -X PUT http://127.0.0.1:5984/_config/admins/rebecca -d '"12345"'
""

Well that was easy. We created a config-level admin user and that takes CouchDB right out of admin party mode (intentionally left over production note: reference “partypooper” in some way).

Now all administrative requests to CouchDB (see above) need to be authenticated. To make your life a little easier Futon does a little dance for you and automatically logs you in as the newly created user.

What does “logs you in” mean? First, Futon creates a new document in the _users database. It has a special format that you have to follow if you are doing this on your own.

Luckily CouchDB’s built-in client libraries couch.js and jquery.couch.js do all the heavy lifting for you.

> cat rebecca.json 
{
  "_id":"org.couchdb.user:rebecca",
  "name": "rebecca",
  "salt": "68cf5946d9760d19759b5016d90f612c",
  "password_sha": "3588a9b2039e53b674d8da361e4be98f00637f5a",
  "type":"user",
  "roles":["admin"]
}
> curl -X PUT "http://127.0.0.1:5984/_users/org.couchdb.user%3Arebecca" \
   -d@rebecca.json
{
   "ok":true,
   "id":"org.couchdb.user:rebecca",
   "rev":"1-9aa9e9a2c855e81061d6d8553d6adbc5"
}

This is laying foundations for the future. Once you have a user document in the _users database, you can use the _session API to get an encrypted session cookie that authenticates you for the next few requests. By default, a session cookie is valid for 10 minutes.
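If you do want to build a user document like the one above by hand, the following is a minimal sketch, assuming a Node.js runtime. It assumes that password_sha is the hex SHA-1 digest of the password followed by the salt, which is our understanding of what CouchDB 1.0 does; the helper names buildUserDoc and newSalt are hypothetical.

```javascript
const crypto = require("crypto");

// Assemble a user document in the format shown above.
// Assumption: password_sha = SHA-1(password + salt), hex encoded.
function buildUserDoc(name, password, salt) {
  const passwordSha = crypto
    .createHash("sha1")
    .update(password + salt)
    .digest("hex");
  return {
    _id: "org.couchdb.user:" + name,
    name: name,
    salt: salt,
    password_sha: passwordSha,
    type: "user",
    roles: []
  };
}

// A fresh random salt per user; 16 random bytes give 32 hex characters.
function newSalt() {
  return crypto.randomBytes(16).toString("hex");
}

const doc = buildUserDoc("rebecca", "12345", newSalt());
console.log(JSON.stringify(doc, null, 2));
```

The resulting object can be saved with the same PUT request to the _users database shown earlier.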

Showing all the cookie business with curl would be a little tedious, so I’ll jump over to how to do all that in your own code.

Aside: What’s the deal with users or admins created with the _config API vs. the _users database? Admins created through the _config API are persisted to CouchDB’s configuration file ($prefix/etc/couchdb/local.ini by default). Users in the _users database are stored in that database. For some setups, it is required that some external tool is able to create a user for CouchDB without having a user or admin account on that CouchDB, but access to the ini file (think system setup software). In addition, _config users are always automatically server admins, so use them with care.

Using jquery.couch.js in Your Application

jquery.couch.js is the standard JavaScript API that ships with CouchDB. Futon uses it for its snazzy interface. And so can you, or should, really, unless you want to re-do all the work the CouchDB project put into it :)

Let me show you the methods in question. I’m quoting right out of the API docs:

$.couch.signup(user_doc, password, options)
  • Hashes the password
  • Adds an empty roles array to the user_doc when not specified
  • Adds an id, composed of “org.couchdb.user:” and name, to the userdoc when not specified
  • Saves the user_doc with options as parameters in the userDb
  • Performs the success callback on the saved user_doc
$.couch.login(options)
  • Does a POST request to “_session” with username and password, they have to be present in the options hash. Throws a 404 error when the password is wrong or there is no user with that username stored in the userDb.
$.couch.logout(options)
  • Does a DELETE request to “_session”.

Concepts

The _session API provides you with a convenient endpoint to manage authenticated requests to CouchDB. A simple GET /_session returns a JSON object detailing your current session state.

> curl http://127.0.0.1:5984/_session | jsonpretty 
{
  "userCtx": {
    "name": null,
    "roles": [

    ]
  },
  "info": {
    "authentication_handlers": [
      "oauth",
      "cookie",
      "default"
    ],
    "authentication_db": "_users"
  },
  "ok": true
}

userCtx is where all the authentication information is stored. name is your login username and roles is the list of roles assigned to your user.

info has some server-wide information about the authentication system. authentication_handlers are the different ways CouchDB can do the actual authentication process for you. By default CouchDB ships with an OAuth handler, a cookie handler and the default handler (which does HTTP basic auth). The authentication_db is the database that user documents are stored in. The default is _users, but you can change it in the CouchDB configuration settings. Only do so if you have a very good reason.

ok just lets us know our request was a-ok.

We made an unauthenticated request to CouchDB, so we don’t see any values for userCtx.name or userCtx.roles. Let’s make one with user credentials:

> curl http://rebecca:12345@127.0.0.1:5984/_session | jsonpretty 
{
  "userCtx": {
    "name": "rebecca",
    "roles": [
      "_admin"
    ]
  },
  "info": {
    "authentication_handlers": [
      "oauth",
      "cookie",
      "default"
    ],
    "authenticated": "default",
    "authentication_db": "_users"
  },
  "ok": true
}

The result looks very similar, but this time the values inside userCtx are filled out. We used HTTP basic auth. The details of OAuth authentication are out of the scope of this article, but we sure should feature them at some point.
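In application code you will usually want to act on these fields rather than just look at them. A minimal sketch, assuming the response has already been parsed from JSON; describeSession is a hypothetical helper, not part of jquery.couch.js:

```javascript
// Summarise a parsed GET /_session response.
function describeSession(session) {
  const ctx = session.userCtx;
  return {
    // name is null for unauthenticated requests
    loggedIn: ctx.name !== null,
    // the _admin role marks a server admin
    serverAdmin: ctx.roles.indexOf("_admin") !== -1,
    // info.authenticated names the handler that accepted the credentials
    handler: session.info.authenticated || null
  };
}

const anonymous = {
  userCtx: { name: null, roles: [] },
  info: { authentication_handlers: ["oauth", "cookie", "default"], authentication_db: "_users" },
  ok: true
};
console.log(describeSession(anonymous));
```

Note that in Admin Party mode the name is still null while the roles contain _admin, so the two checks are deliberately kept separate.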

Cookie authentication, or Secure Cookie Authentication to use its full name, works by granting access through credentials transported with an HMAC digest and one-time tokens. For convenience, a one-time token is actually valid for 10 minutes by default, but you can adjust that as needed.

We showed you the login() method of jquery.couch.js earlier; use it to log a user into CouchDB with cookie authentication.
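If you are not using jquery.couch.js, the requests it makes are easy to reproduce. The sketch below only describes the requests as plain objects so it can be handed to whatever HTTP client you use. The helper names are hypothetical, and the form-encoded name and password fields reflect our understanding of what the _session endpoint expects.

```javascript
// Describe the login request: POST /_session with form-encoded credentials.
function buildLoginRequest(baseUrl, name, password) {
  return {
    method: "POST",
    url: baseUrl.replace(/\/+$/, "") + "/_session",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: "name=" + encodeURIComponent(name) +
          "&password=" + encodeURIComponent(password)
  };
}

// Describe the logout request: DELETE /_session.
function buildLogoutRequest(baseUrl) {
  return {
    method: "DELETE",
    url: baseUrl.replace(/\/+$/, "") + "/_session"
  };
}

console.log(buildLoginRequest("http://127.0.0.1:5984/", "rebecca", "12345"));
```

On a successful login the server answers with a session cookie, which the browser then sends along with subsequent requests until it expires or you log out.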

Roles

With roles you can group multiple users. We’ll show you in a bit how roles allow you to define permissions on the CouchDB server and individual databases. A role is a simple string that doesn’t start with an underscore. Underscore-roles are reserved for CouchDB. Your roles can be anything, really. The only role that CouchDB prescribes is the _admin role (with an underscore, see?). It grants the user server-wide privileges to do anything.

ACLs, Database Admins & Validation Functions

To allow more fine-grained control over who can read from your databases, CouchDB comes with Access Control Lists (ACLs), Database Admins and Validation Functions.

Each database in CouchDB comes with its own security object. It is not a document, but simply a JSON structure associated with the database. On a newly created database, it looks like this:

{}

The empty object, duh :)

You can set two properties, admins and readers. Each is itself a JSON object with two properties, roles and names, which are lists of roles and usernames respectively.

Here is an example:

{
  "admins": {
    "roles": [],
    "names": ["rebecca"]
  }
}

For this database, our user rebecca is the admin. A database admin has full read access to the database as well as the ability to update the security object. You can add more users:

{
  "admins": {
    "roles": [],
    "names": ["rebecca", "pete"]
  }
}

Or if that starts to get tedious, you can assign roles, and then by adding roles to a user, they automatically inherit the right to administer the database. This assumes “rebecca” and “pete” each have the “local-heroes” role assigned:

{
  "admins": {
    "roles": ["local-heroes"],
    "names": []
  }
}

Readers

Now this is awesome :) — You can specify in the same way a list of usernames or roles to grant read-access to a database. If no readers are specified, everyone can read your database. This is cool, again, public databases make the world a better place.

In case you want only specific authenticated users to be able to read from your database, use the security object:

{
  "admins": {
    "roles": ["local-heroes"],
    "names": ["rebecca", "pete"]
  },
  "readers": {
    "roles": ["lolcat-heroes"],
    "names": ["simon", "ben", "james"]
  }
}

Now simon, ben and james are among your trusted readers as well as all users with the role “lolcat-heroes”.

There is no need to add names or roles from the admins section, since they automatically are also readers.
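The read rules described in this section can be summarised in a few lines. This is a sketch of the rules as stated in this article, written by us for illustration; it is not CouchDB’s actual implementation, and canRead is a hypothetical helper.

```javascript
// Does the user appear in one section ("admins" or "readers") of a security object?
function matches(section, userCtx) {
  if (!section) return false;
  const names = section.names || [];
  const roles = section.roles || [];
  if (names.indexOf(userCtx.name) !== -1) return true;
  return userCtx.roles.some(function (role) {
    return roles.indexOf(role) !== -1;
  });
}

// A section with no names and no roles counts as "not specified".
function isEmpty(section) {
  return !section ||
    ((section.names || []).length === 0 && (section.roles || []).length === 0);
}

function canRead(security, userCtx) {
  if (userCtx.roles.indexOf("_admin") !== -1) return true; // server admins can do anything
  if (isEmpty(security.readers)) return true;              // no readers: public database
  return matches(security.admins, userCtx) ||              // database admins are also readers
         matches(security.readers, userCtx);
}

const security = {
  admins: { roles: ["local-heroes"], names: ["rebecca", "pete"] },
  readers: { roles: ["lolcat-heroes"], names: ["simon", "ben", "james"] }
};
console.log(canRead(security, { name: "simon", roles: [] }));
```

Calling canRead with an empty security object and an anonymous user context returns true, which matches the open-by-default behaviour described above.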

Validation Functions or How to Control Write Access

What about restricting write access, can you just create a new property writers in the security object and do as before? — No, for this, you will be using a validation function.

CouchDB has had validation functions for quite some time, and they have always been the way of restricting write access to your database. The cool thing with validation functions is that they have full access to the document a user is trying to write as well as the user context, i.e. the username and any roles.

This allows a validation function to reject a document write because of both user-authentication (or the lack thereof) and document content or structure.

A validation function is invoked once for every document that is written to the database. It gets passed the document to be written, the previous revision of the document, if it exists, and the user context. To block a document write, the validation function needs to throw an exception. The return value is ignored. If no exceptions are thrown, the document write can proceed.
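To see the contract in action without a running server, here is a minimal sketch that stores a validation function as a string in a design document and then imitates the throw-or-proceed behaviour locally. The design document name _design/todo and the tryWrite helper are hypothetical; tryWrite is only a stand-in for what the server does with the function.

```javascript
// A validation function, written as an anonymous function expression.
const requireLogin = function (newDoc, oldDoc, userCtx) {
  if (!userCtx.name) {
    throw({forbidden: "Please log in first."});
  }
};

// Design documents hold functions as source strings.
const designDoc = {
  _id: "_design/todo", // hypothetical name
  validate_doc_update: requireLogin.toString()
};

// Local stand-in for the server: a thrown exception blocks the write,
// anything else (including any return value) lets it proceed.
function tryWrite(validate, newDoc, oldDoc, userCtx) {
  try {
    validate(newDoc, oldDoc, userCtx);
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err };
  }
}

console.log(tryWrite(requireLogin, { title: "buy milk" }, null, { name: null, roles: [] }));
console.log(tryWrite(requireLogin, { title: "buy milk" }, null, { name: "rebecca", roles: [] }));
```

Exercising a validation function like this before uploading it is a cheap way to catch mistakes such as a misspelled argument name.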

Here are a few examples.

Disallowing anonymous writes:

function(new_doc, old_doc, userCtx) {
  if(!userCtx.name) {
    // CouchDB sets userCtx.name only after a successful authentication
    throw({forbidden: "Please log in first."});
  }
}

Only allow writes to users with a certain role:

function(new_doc, old_doc, userCtx) {
  if(userCtx.roles.indexOf("editors") === -1) {
    // sure lovely that JavaScript doesn’t
    // have an Array.includes() method
    throw({unauthorized: "You are not an editor."});
  }
}

Only allow updates by the author (this assumes that the user sets his or her username as new_doc.name).

function(new_doc, old_doc, userCtx) {
  // compare the name stored on the incoming document with the logged-in user
  if(new_doc.name != userCtx.name) {
    throw({unauthorized: "You are not the author"});
  }
}

Conclusion

Alright, this was a really long post and we should get it wrapped. We hope to have given you a good overview of the security concepts in CouchDB and enough pointers to keep you reading and experimenting.

Don't Reinvent The Wheel by Josh Berkus @ CouchCamp

Wed, 08/18/2010 - 18:20

The talk description that went out in some recent PR about Josh Berkus at CouchCamp wasn’t quite accurate, my bad.

Here is a description of the talk Josh will be giving at CouchCamp that we are all very much looking forward to.

Don’t Reinvent The Wheel

CouchDB, as a new database, is doing a lot of new cool stuff. But bucking the conventional wisdom doesn’t mean that you need to be ignorant of database history; the older databases like PostgreSQL have decades of experience which the Couch community can profitably steal from. This talk will touch on issues like scaling, security, complex queries, data architecture, optimization, upgrades, long-term maintenance, and standards that developers and users of Couch should be thinking of for the future of the database.

Guest blog post from Max Ogden, creator of PDX API

Tue, 08/17/2010 - 07:41

We posted a case study today on PDX API, which is a JSON API that provides access to data from CivicApps, the open data initiative by the City of Portland. Max worked with us to write a blog post that gives some background information about working with government geo data. We really appreciate Max taking the time to help share his use of CouchDB with the community.

Now for Max’s blog post


Portland, PDX API and GeoCouch


In the fall of 2009 the City of Portland graciously hosted the volunteer-organized WhereCamp conference at Metro headquarters. Metro is a regional government organization that, among other things like operating many local parks and the zoo, acts as a data warehouse for the 45+ municipalities in the greater Portland area. WhereCamp is a geo un-conference, where instead of lectures there are group discussions on proposed topics. One of the highlights was a brainstorming session led by Metro employees regarding how Metro can release their datasets to the public.

Since Sam Adams, the City of Portland’s current ‘younger, tech savvy’ mayor, took office, the idea of a city wide open data initiative had been drifting closer to reality. Metro is predictably bureaucratic, and there was apprehension within Metro about releasing data directly onto the internet. They didn’t want to have to spend a significant amount of money to engineer some sort of web infrastructure for hosting their vast collection of geo data. I have found that Metro’s concerns are also echoed throughout the region at all levels of government.

The solution came in the form of a joint effort amongst all of the major civic entities in Portland (such as Portland public schools, Metro, public transportation, etc). They started hosting around 100 raw GIS files of various shapes and sizes from CivicApps (http://www.civicapps.org), a website that they created for the initiative. The datasets themselves range in size from lists of bicycle parking racks to outlines of city parks and neighborhood boundaries.

After downloading some datasets and attempting to interact with the data it quickly became evident that the workflow was incredibly clunky. Most of the datasets are created and released as a Shapefile, which is a proprietary desktop GIS format. Expensive desktop GIS software is capable of analyzing the geo data in a Shapefile in many amazing ways, but it isn’t easy to extract the data for other use cases. Shapefiles are the GIS equivalent of a Word document. Sure, the useful data is in there somewhere, but it’s buried deep within years of vestigial formatting bloat. CivicApps provides raw data downloads of entire datasets, which is a good start, but the same data could be distributed in a more efficient and accessible manner. Public geographic data ought to live on a server, but be accessed à la carte in small, efficient and interesting chunks rather than the entire dataset at a time.

For example, one of the datasets available on CivicApps contains all restaurants in the Portland area. I wanted to see which restaurants were near my house, but in order to find the handful of restaurants in my neighborhood out of the 3300 listed it took countless open source data conversion utilities, hours of reading documentation and many cups of coffee. After going through this process a few times I decided that nobody else should have to dive that deep into GIS-land in order to get at the data in a meaningful way.

There is a definite disconnect between government and open source when it comes to understanding the term ‘accessible data’. Most non-GIS developers aren’t going to want to learn how to work with a Shapefile. A great strategy for gaining adoption in open data initiatives is to build distribution tools that work around the constraints of the existing government data. Portland’s regional government, along with most GIS users at the professional level, uses Shapefiles almost exclusively. You aren’t going to convince an entire region full of career GIS developers to convert their datasets to some random open source format. It is the responsibility of the community to develop tools to convert the government’s raw data into a more usable form.

Government-level open data initiatives are increasing in frequency for a variety of reasons. I believe that they will become successful when a symbiotic relationship forms between the regional government (data suppliers) and the developer community within the region (data consumers). When local developers create applications using government data, governments save money because they no longer have to hire contractors to create the applications themselves, and citizens get to reap the benefits of applications with rich data created from government-maintained datasets.

This means that you need a platform for hosting open data that is built on formats that developers already know.

GeoCouch is CouchDB plus a set of geospatial extensions. A mostly rewritten version was released recently, featuring a super fast R-tree spatial indexing implementation. GeoCouch didn’t take long to set up and populate with GeoJSON. Of the many formats to describe geographic data, the most ubiquitous is perhaps GeoJSON. GeoJSON is a standardized way of representing geographic data in pure JSON, and therefore you can throw any GeoJSON object at any application with a JSON parser. CouchDB uses JSON for transferring data, so GeoCouch naturally has baked-in support for GeoJSON.
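For readers who have not seen GeoJSON before, here is a minimal sketch of a single point feature and its round trip through a plain JSON parser. The cart name and coordinates are made up for illustration, and pointFeature is a hypothetical helper.

```javascript
// Build a GeoJSON Feature for one point.
// GeoJSON orders coordinates as [longitude, latitude].
function pointFeature(name, longitude, latitude) {
  return {
    type: "Feature",
    geometry: { type: "Point", coordinates: [longitude, latitude] },
    properties: { name: name }
  };
}

const cart = pointFeature("Example food cart", -122.6765, 45.5231);

// Because it is plain JSON, any JSON parser can read it back unchanged.
const wire = JSON.stringify(cart);
const parsed = JSON.parse(wire);
console.log(parsed.geometry.coordinates);
```

Nothing in this example is specific to GeoCouch, which is the point: the same object can be stored as a document or consumed by any client.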

I created some utilities to facilitate the conversion workflow from the raw Shapefiles that come from the government to documents in a GeoCouch instance. The overall process involves dumping the Shapefiles into a PostGIS instance, exporting GeoJSON from PostGIS, and bulk importing the GeoJSON into GeoCouch. This is the process that I have found to be the most fault tolerant when performing coordinate transformations against large datasets.

Once the data has been loaded into GeoCouch it instantly turns into a nice, clean REST API for developers who want to run bounding box queries against municipal datasets from Portland. For example, anyone can ask my GeoCouch, PDXAPI (http://www.pdxapi.com), to return a list of bicycle-friendly trails in any rectangular region.
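To illustrate what a bounding box query selects, here is a small client-side sketch over GeoJSON point features. The bbox parameter in west,south,east,north order is a common convention and an assumption on our part about how such an endpoint is parameterised; the helper names and sample data are hypothetical, and this does not reproduce PDXAPI’s actual URLs.

```javascript
// Format a bounding box as a query-string parameter (assumed convention).
function bboxParam(west, south, east, north) {
  return "bbox=" + [west, south, east, north].join(",");
}

// True when a GeoJSON point feature lies inside the rectangle.
function inBoundingBox(feature, west, south, east, north) {
  const lon = feature.geometry.coordinates[0];
  const lat = feature.geometry.coordinates[1];
  return lon >= west && lon <= east && lat >= south && lat <= north;
}

const features = [
  { type: "Feature", geometry: { type: "Point", coordinates: [-122.68, 45.52] }, properties: { name: "inside" } },
  { type: "Feature", geometry: { type: "Point", coordinates: [-122.40, 45.52] }, properties: { name: "too far east" } },
  { type: "Feature", geometry: { type: "Point", coordinates: [-122.68, 45.70] }, properties: { name: "too far north" } }
];

const hits = features.filter(function (f) {
  return inBoundingBox(f, -122.70, 45.50, -122.60, 45.55);
});
console.log(bboxParam(-122.70, 45.50, -122.60, 45.55));
console.log(hits.map(function (f) { return f.properties.name; }));
```

A spatial index answers the same question without scanning every feature, which is what makes the server-side version fast on large datasets.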

Benefits

I initially wrote a simple proximity query server in Ruby that was able to do a proximity query and return objects from a dataset that were closest to a specified point. You could retrieve, for example, the closest 5 bus stops to your current location. The proximity lookup itself was very much brute force and didn’t use any type of spatial indexing. This was okay for a prototype, but definitely wouldn’t have scaled very far. GeoCouch’s spatial indexer, on the other hand, only has to index a particular dataset once and then subsequent lookups are incredibly snappy. GeoCouch’s R-tree is literally thousands of times faster than my own implementation. GeoCouch does the heavy lifting and lets me relax.
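For comparison, a brute-force proximity lookup of the kind described above can be sketched as follows. This is our own illustration in JavaScript, not the original Ruby code: it computes the haversine distance to every point and sorts, so each query touches the whole dataset.

```javascript
// Great-circle distance in kilometres between two {lat, lon} points.
function haversineKm(a, b) {
  const toRad = function (deg) { return deg * Math.PI / 180; };
  const earthRadiusKm = 6371; // mean Earth radius
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * earthRadiusKm * Math.asin(Math.sqrt(h));
}

// Brute force: measure every point, sort by distance, keep the first `count`.
function nearest(points, origin, count) {
  return points
    .map(function (p) { return { point: p, km: haversineKm(origin, p) }; })
    .sort(function (x, y) { return x.km - y.km; })
    .slice(0, count)
    .map(function (entry) { return entry.point; });
}

const stops = [
  { name: "far", lat: 46.00, lon: -122.60 },
  { name: "near", lat: 45.53, lon: -122.67 },
  { name: "mid", lat: 45.60, lon: -122.60 }
];
console.log(nearest(stops, { lat: 45.52, lon: -122.68 }, 2));
```

The cost grows with the size of the dataset on every single query, which is the scaling problem a spatial index avoids.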

Being able to offer read access to large amounts of municipal data is great, but switching to GeoCouch also lets me accept upstream changes from users, or even create entirely new databases. This is a huge win. Being able to edit documents means that you can, with minimal effort, let users edit any data in Couch in a wiki-like fashion.

There are many projects in Portland that are dedicated to sharing quantitative information about local objects and places. Urban Edibles (http://www.urbanedibles.org) lets anyone contribute the locations of publicly accessible fruit bearing trees or other edible plants. TapLister (http://www.taplister.com) lets users contribute to lists of microbrews on tap at Portland bars. PC-PDX (http://www.pc-pdx.com) is a community calendar for live music.

These are all examples of applications embracing the principles of the civic web. Whereas the social web tries to reinvent and replace real life conversation, the civic web simply tries to augment the systems we already use by encouraging efficient and convenient participation in the happenings of your neighborhood. Your neighborhood is where you work, where you raise children, and where you invest time and emotion, so tools that let people more proactively engage in their communities have a more wholesome and positive long term impact on those communities than social media does.

Don Park (@donpdonp) and I have been working on an example CouchApp for manipulating data in GeoCouch. We’ve adapted the CouchApp to be a wiki for food cart and truck information in the Portland area, available at http://www.foodcartpages.com. If someone wants to start a community database of, say, publicly accessible rope swings in front yards around Portland, they can start a new database on my GeoCouch. They can then take the source code from Food Cart Pages (http://github.com/donpdonp/foodcartpages) and adapt it to work with their new rope swing dataset.

Going a step further, I’ve created an example iPhone native application (an Android app is in the works) for manipulating GeoCouch hosted data. The rope swing dataset developer can grab a copy of my iPhone source code (http://github.com/maxogden/pdx-food-carts-mobile) and adapt it to work with their application. Instead of writing in pure Objective-C, I chose to craft the application using Titanium, a JavaScript framework for cross platform native mobile development. This means that anyone who knows JavaScript can jump in and edit the iPhone source code and tailor the application to fit their use case.

When developing the mobile application, I didn’t need a special CouchDB client-side library in order to interact with the remote data stored on GeoCouch. I was able to use the vanilla AJAX library included with Titanium and fully interact with the data stored on GeoCouch. This is a great example of the importance of ubiquitous patterns like REST that are present in the design of CouchDB.
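As a rough illustration of that point, here is a minimal Python sketch (not the Titanium code from the app) that builds the plain HTTP requests needed to read and write a single CouchDB document using only the standard library. The host, the `food_carts` database, the `cart-001` document ID and the `doc_request` helper are all hypothetical names chosen for the example, and the requests are built but never sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def doc_request(base_url, db, doc_id, doc=None):
    """Build (but do not send) an HTTP request for one CouchDB document.

    With no body this is a GET that reads the document; with a dict it is a
    PUT that creates or updates it.
    """
    url = "{}/{}/{}".format(
        base_url.rstrip("/"), quote(db, safe=""), quote(doc_id, safe="")
    )
    if doc is None:
        return Request(url, method="GET")
    body = json.dumps(doc).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, database and document names, for illustration only.
read = doc_request("http://localhost:5984", "food_carts", "cart-001")
write = doc_request(
    "http://localhost:5984",
    "food_carts",
    "cart-001",
    {"name": "Example Cart", "cuisine": "thai"},
)
print(read.get_method(), read.full_url)
print(write.get_method(), write.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a running server. Any HTTP client that can issue GET and PUT with a JSON body would do the same job, which is why no CouchDB-specific library is needed.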

GeoCouch happily acts as the centralized data store for both the web based CouchApp version of Food Cart Pages and the iPhone application. When a user on an iPhone takes a photo of a food cart’s menu and uploads it, anyone else consuming data from GeoCouch will see the new photo. As a bonus for using CouchDB, I get free support for conflict resolution and an easy way to store old revisions of documents.

At the end of the day, GeoCouch has, in a few months’ time, helped me to create an ecosystem for regional government and open source developers to share information with citizens, and for those same citizens to share information back. Crafting architecture for sharing large amounts of information over the web is usually no easy task, but GeoCouch has let me focus on the big picture and not get bogged down in the details. I am hoping to enable other developers in Portland and other cities to also see the big picture and start creating applications that not only their communities enjoy, but that they themselves also enjoy.

Categories: Companies, NoSQL

CouchCamp Early Bird Registration Closes Tomorrow, August 17th

Mon, 08/16/2010 - 11:26

Hey all, this is just a quick reminder that our early bird sales for CouchCamp close tomorrow, August 17th. Trust us, you want to be there :)

CouchCamp is ideal for anyone interested in learning more about CouchDB, including developers, administrators and business users. The three-day camp will include speaking sessions from Damien Katz, creator of CouchDB, Selena Deckelman, founder of Open Source Bridge, Ted Leung, director of advanced technology at Disney and Stuart Langridge of Canonical, makers of Ubuntu Linux. There will also be unconference sessions led by conference participants.

For more information or to register, visit: http://www.couch.io/couchcamp.

See you there!

Categories: Companies, NoSQL

Press Release: Apache CouchDB Now Available on Google Android

Tue, 08/10/2010 - 19:51

Developers can now build web or native mobile applications taking advantage of CouchDB’s peer-to-peer sync 

Oakland, CALIF. – August 10, 2010 – Couchio (http://www.couch.io/), corporate sponsor of the CouchDB post-relational database, today announced that the first release of a CouchDB SDK for Android devices is now available for free download. Designed to take full advantage of CouchDB’s peer-to-peer sync facilities, CouchDB for Android allows developers to build web or native applications that work even if the Internet connection is slow, intermittent or completely down. With continuous access to a local copy of data, developers can leverage their existing knowledge about web technologies to quickly build collaborative business applications on mobile devices.
 

CouchDB for Android allows shared applications to work offline by automatically synchronizing  between platforms, alleviating a common pain point for users. Developers no longer have to develop an application once for the web, once for each mobile platform and then synchronize between the two. 
 

 “Our goal is to provide users with a kick-ass SDK for Android devices to build web and native applications using CouchDB as the device-native data store,” said Damien Katz, creator of CouchDB and CEO of Couchio. “CouchDB now makes sync ubiquitous and part of the mobile computing fabric.”
 

With CouchDB on Android, developers can build applications and access their data freely across devices, desktops or in the cloud, regardless of the network. Palm has already announced that the next version of its webOS will include services for syncing local data with CouchDB.

For more information about CouchDB on Android, or to download it for free, visit http://www.couch.io/android. From an Android device, interested parties can directly install CouchDB through the Android Marketplace.


About Couchio

Couchio (http://www.couch.io/), co-founded by the creator of Apache CouchDB, is the commercial CouchDB company providing services, support, training and hosting. CouchDB is an open source database designed for the reporting and storage of large amounts of semi-structured, document oriented data, unlike SQL databases, which store and report on very structured and correlated data. CouchDB changes the way document-based applications are built, benefiting from the cloud while also keeping data available at the network edges via replication. Couchio has received venture funding from Redpoint Ventures.

Media Contact:

Ray George

Page One PR

650-922-3825

ray@pageonepr.com

Categories: Companies, NoSQL

Because we just couldn’t not do it.

Fri, 08/06/2010 - 19:22


Because we just couldn’t not do it.

Categories: Companies, NoSQL

RelaxBack for Thursday 8/5/2010

Fri, 08/06/2010 - 00:58
Upcoming Events

Tonight! 7pm socal.js Mikeal will be talking about CouchDB and node.js.

CouchCamp tickets are on sale for $500 (all inclusive) until August 17th.

RESTFest is going to be September 17th - 18th in South Carolina.

Recent Happenings

New Case study on Migrating to CouchDB from a Relational Database.

Enzo has another post in his series about using CouchDB with Rails about understanding map/reduce.

Damien and Aaron Miller have erlang running on iOS :)

CouchCamp attendee spotlight on Max Ogden.

Lena Herrmann handed in her Thesis on Realisation of a Distributed Application Using the Document-Oriented Database CouchDB.

jchris wrote up a new description of CouchApps.

Jan started a page for everyone to add their local CouchDB meetups.

New expanded docs on installing CouchDB on Windows.

Jobs

Couchio is hiring a ton of roles! 6 week vacation. I gotta figure out somewhere to travel to.

Categories: Companies, NoSQL

CouchCamp Attendee Spotlight: Max Ogden

Wed, 08/04/2010 - 21:49

Max Ogden is a programmer from Portland, OR. Max is becoming quite well known in the open government applications community with his recent work PDXAPI which won the Civic Apps award for best overall utilization of data. Max was also the first CouchCamp ticket buyer and we’re incredibly excited to have him attending.

What was your first CouchDB project?

Trying to set up the old version of GeoCouch back when it had dependencies on Python and Spatialite. I spent more time trying to get the dependencies to compile than I did actually working with any actual geographic data. When the new GeoCouch came out and it didn’t have any external dependencies I was way excited.

What are you currently working on?

I’m working on PDX API (http://pdxapi.com), a developer interface to civic geo datasets in Portland, OR. It’s a big GeoCouch instance that has a bunch of geographic datasets that mostly come from Portland’s regional government agencies.

What is your favorite part of CouchDB?

The ubiquity of JSON and JavaScript. It’s easy to get people excited about working with Couch, since a lot of developers already think RESTfully and throw JSON objects around all the time. CouchApps are also really exciting.

What are you looking forward to at CouchCamp?

Kickball in Marin county in late summer, seeing what other people are working on.

What is your favorite color?

#FFB901

What drink(s) would you like to see at CouchCamp?

Some fancy draft root beer (I don’t drink drink)

Categories: Companies, NoSQL

Diplom Thesis: Realisation of a Distributed Application Using the Document-Oriented Database CouchDB by Lena Herrmann

Wed, 08/04/2010 - 16:19

Lena Herrmann & Thesis

Major Congrats to Lena Herrmann for handing in her Diplom Thesis on Realisation of a Distributed Application Using the Document-Oriented Database CouchDB.

It’s a whopping 163 pages containing all the nitty-gritty-researchy details on why CouchDB is the number one choice for writing distributed applications in both the small and large scale.

Its review is pending but we’ll give you a shout when the full text is available. The great folks at UPSTREAM, where Lena wrote the thesis, are contributing the text back to the wider community. Thank you Lena & UPSTREAM!

Categories: Companies, NoSQL

RelaxBack for Tuesday 8/3/2010

Wed, 08/04/2010 - 05:50
Upcoming Events

Mikeal is speaking about node.js and CouchDB at the socal.js meetup on Thursday August 5th.

CouchCamp tickets are on sale for $500 (room, food and drink included) until August 17th.

Mathias Meyer is speaking about Couchapps at WebAppDays in September.

Recent Happenings

Geoff Buesing wrote a Rack adapter for CouchDB external processes.

Jason landed Debian support in build-couchdb.

Mikeal did a post about abstracting CouchDB.

Mailing List

On the dev list the request for comment on CouchDB 1.0.1 and a proposal for view server protocol changes are still kicking around, as well as a new thread to get Filipe’s replicator db work into trunk.

The user list has threads about the content of userCtx, compilation errors, view performance testing, couchdb-lucene, info about deleted documents, and multiple view queries.

Categories: Companies, NoSQL

RelaxBack for Friday 7/30/2010

Sat, 07/31/2010 - 04:37
Upcoming Events

CouchCamp tickets are still on sale. The event will take place on September 8th - 10th.

Recent Happenings

Damien wrote up a thorough article about bringing your open source project to 1.0.

Jason’s Linux packages now support Fedora.

Scott Davis gave a well received talk at the Dallas Tech Fest on CouchDB.

Sam Bisbee posted his intro to CouchDB slides.

Simpsons CouchApp :P.

Alexander Lang wrote up a great article on how to handle transactional use cases in CouchDB.

And Max Ogden has started a new GitHub project for all his GeoJSON JavaScript code.

Couchapps

BigBlueHat. Badass content management system.

Jobs

Something in Stuttgart that Jan has a line on.

Categories: Companies, NoSQL

RelaxBack for Wednesday 7/28/2010

Wed, 07/28/2010 - 22:41
Upcoming Events

NYC NoSQL meetup tonight

Recent Happenings

Jason Smith put together install binaries for CouchDB 1.0 on linux 32 and 64bit.

Klaus Trainer posted a good roundup of CouchDB 1.0 retrospectives.

New Stack Overflow question about what use cases CouchDB is best suited for.

CouchDB is not being re-written in C.

Calvin Yu posted a new node script for syncing design doc functions.

Mailing Lists

On the dev list Norman Baker started a thread about putting CouchDB code on GitHub. There were also messages about bounding box queries in GeoCouch and the CouchDB SDK for Android on the HTC Tattoo.

On the user list some people are tracking down beam CPU performance issues that Sivian Greenberg is having.

Couchapps

Henrik Skupin is working on a test results dashboard for Mozilla test automation.

Categories: Companies, NoSQL

RelaxBack for Tuesday 7/27/2010

Tue, 07/27/2010 - 23:05
Upcoming

NYC CouchUp tonight.

NYC NoSQL Meetup tomorrow.

Recent

CouchDB case study of Aptela telecom. Highlight for me: “Reliability has been exceptional”.

Enzo has a good new post up on CouchDB and Rails.

Firefox 4 Beta 2 was released today, which includes support for the latest draft version of the IndexedDatabase specification. IndexedDatabase provides low level transactions, indexing, and object stores to the browser. Mikeal has been working on a full implementation of CouchDB on top of IndexedDatabase and all of its tests pass on Firefox4B2.

Mailing lists

On the dev list Noah Slater put out a request for comment for releasing CouchDB 1.0.1. Jason Smith added to the view protocol changes with a request for better error handling on form POST. Errors on form POST currently can’t return a nice HTML page, which is something we need to fix.

The user list has a new thread about reporting in CouchDB and some talk about issues related to external handlers not being quite in sync with commits. This was brought up in the context of couchdb-clucene but most likely affects other services that use a similar externals method.

Couchapps

Afghan war leaks in a CouchDB CouchApp by Benoit! This is seriously awesome!

Jobs

Couchio is hiring a C / database engineer.

Categories: Companies, NoSQL

Guest blog post from Mahesh Paolini-Subramanya, CTO of Aptela We posted a case study today on...

Tue, 07/27/2010 - 18:08

Guest blog post from Mahesh Paolini-Subramanya, CTO of Aptela

We posted a case study today on Aptela, the leading provider of business phone services for small business and mobile workers nationwide, and how they use CouchDB to scale their application.  Their CTO Mahesh Paolini-Subramanya worked with us to write a guest blog for us to further explain their use of CouchDB.  We really appreciate Mahesh taking the time to help share their use of CouchDB with the community.

Now for Mahesh’s blog post


Aptela Achieves Replication and Scaling with CouchDB

We just launched our next generation calling platform, Aptela v5.0, and I had to make sure that when we rolled it out, we had a solution that would help our new calling platform be massively, yet affordably, scalable, and aid us as we continue to deliver reliable, crystal clear phone service; a mission critical requirement for all our customers. They rely heavily on our service to run their businesses, and we cannot go down.

As a business-class phone service provider with a customer base of more than 17,000 users across 3,000 small businesses nationwide, Aptela handles over 100 million minutes of calls per year. That’s a lot of calling! We needed a way to effectively manage the millions of Call Detail Records (CDRs) generated by those calls on a daily basis, so that we could provide those CDRs to our customers (and internal Aptela folks) instantly. We also needed the data to synchronize across all of our servers, all of the time.

I know, it’s all supposed to be so easy - you put all of your information in a database somewhere and magically, the problem is solved. Come to think of it, that is exactly what we did in the previous iteration of our architecture - A nice Postgres database happily serving up data kept all of our systems in sync. This of course, worked perfectly until the day the database server crashed (Really? The backup generator doesn’t work for more than 10 minutes? Quelle surprise!), and our customers were offline until our (previous!) hosting facility figured out which circuit-breaker to un-trip.

This, naturally, led us to replication, master-master configurations, master-slave setups, new hosting facilities, load-balancers, MySQL, and the next thing you know, it was Yet Another 3 AM crisis with me frantically Googling “repair corrupt MySQL database unknown error 3l33t”.

Our entire server-infrastructure is (and needs to be!) cloud-based, i.e., highly distributed, reliable, scalable, location-independent, fine-grained and with built-in coffee service.   Take incoming calls for example.  In our environment, they go to a randomly chosen server, which figures out what to do with the call based on the called number.  Then the server waits until the Official Data Store (MySQL or Postgres, or Oracle) figures out what to do with the call. The problem? We were spending all of our time figuring out how we could improve our databases to support our application, and not actually spending any time on improving our application! This was clearly not the answer. Side note - There should be some kind of law about this, e.g. Any software operation will eventually stagnate when the Development budget equals the Maintenance budget.

Enter CouchDB, which has worked like a charm for us. If anything, we have only begun to tap into all of the cool things that it does for us.

We are now able to handle our massive amounts of data by dumping it into local instances of CouchDB on each of our telephony nodes.  At this point, a couple of really neat things happen (ok, neat for me, probably not for you):

  • Billing information gets extracted from these CDRs, and replicated over into the billing system
  • Metadata associated with voicemails and recordings get replicated across to the other telephony nodes
  • Metadata associated with the calls get replicated over to the application nodes
  • The CDRs themselves all end up getting replicated to the reporting servers, where all sorts of goofy reports can now get generated off of them.
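The fan-out in the list above maps naturally onto CouchDB's replication endpoint. The following Python sketch builds (without sending) one `POST /_replicate` request per destination. It is only an illustration under stated assumptions: the post does not describe Aptela's actual configuration, so the `cdrs` database name, the example hostnames, the `replication_request` helper and the choice of continuous replication are all hypothetical.

```python
import json
from urllib.request import Request


def replication_request(server_url, source, target, continuous=False):
    """Build (but do not send) a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source, "target": target}
    if continuous:
        # Ask the server to keep replicating as new changes arrive.
        body["continuous"] = True
    return Request(
        server_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local database and remote targets, for illustration only.
LOCAL_NODE = "http://localhost:5984"
TARGETS = [
    "http://billing.example.com:5984/cdrs",
    "http://reports.example.com:5984/cdrs",
]

requests_to_send = [
    replication_request(LOCAL_NODE, "cdrs", target, continuous=True)
    for target in TARGETS
]
for req in requests_to_send:
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Each telephony node would issue requests like these against its own local CouchDB, so writes stay local and the copies elsewhere catch up in the background.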

The free-form nature of CouchDB is tailor-made for reporting and that alone makes it worth the price of admission. Come to think of it, that was pretty much what made us look at CouchDB in the first place! That said, once we started working with it, it became immediately clear that this was the solution to all our data management and maintenance issues. 

CouchDB is written in Erlang and to paraphrase – We love Erlang so much we wrote our entire application in it – which makes it trivially easy to integrate it into our application. It also has an extremely easy to use REST API (Representational State Transfer), which makes integrating it into our back-office systems just about as trivial.

Now, you might be nitpicking that we didn’t really solve the problem as I originally described it (consistency across all the nodes). This is quite ok, since we are actually fairly devout believers in Eventual Consistency, i.e., trading off immediate consistency for high availability.

For example, when a voicemail gets received at one node, we copy the audio and metadata over to the other nodes asynchronously. If, however, the client calls up at one of the other nodes before the audio/metadata gets there, and wants to listen to that voicemail, we tell the caller that “The Audio is still being Processed”, and to “Call Back In Just A Wee Bit”. You could consider this a bit of a cop-out, but it works just fine for everyone involved because the view of voicemails is Eventually Consistent, but we don’t lock anyone out of the system while updates are occurring. For extra credit, we just move the call over to the node where the voicemail was left, so that it is immediately accessible.
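The decision described in that paragraph can be sketched in a few lines of Python. This is a simplified model rather than Aptela's code (which the post says is written in Erlang): the `handle_voicemail_request` function, the dictionary standing in for the node's local CouchDB, and the action names are all assumptions made for illustration.

```python
def handle_voicemail_request(local_docs, voicemail_id, origin_node=None,
                             can_transfer=False):
    """Decide what to do when a caller asks this node for a voicemail.

    local_docs stands in for the node's local CouchDB: a dict mapping
    voicemail IDs to whatever has been replicated here so far.
    """
    if voicemail_id in local_docs:
        # Replication has caught up, so serve the audio from the local copy.
        return ("play", local_docs[voicemail_id])
    if can_transfer and origin_node is not None:
        # The "extra credit" path: move the call to the node that has it.
        return ("transfer", origin_node)
    # Otherwise stay available and ask the caller to try again shortly.
    return ("retry_later", "The audio is still being processed")


# Hypothetical state: vm-1 has replicated to this node, vm-2 has not yet.
local = {"vm-1": {"audio": "vm-1.wav", "from": "555-0100"}}
print(handle_voicemail_request(local, "vm-1"))
print(handle_voicemail_request(local, "vm-2"))
print(handle_voicemail_request(local, "vm-2", origin_node="node-b",
                               can_transfer=True))
```

No branch blocks waiting for other nodes, which is the availability side of the trade-off: the caller always gets an immediate answer, even if that answer is to call back.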

Finding the right tool for the job was the goal, and CouchDB is the perfect match! I know that we have barely begun to fully utilize everything available with CouchDB. Going forward, we plan to use it to help us continue to improve the way we manage our data and I feel confident that it will be able to evolve right along with us. Bravo Damien!

Categories: Companies, NoSQL

RelaxBack for Monday 7/26/2010

Tue, 07/27/2010 - 01:39
Upcoming

There is a CouchDB meetup tomorrow, Tuesday the 27th, in NYC at 7pm at DBA, 41 First Avenue, New York City, NY. 10003. Between 2nd and 3rd Street.

Mikeal Rogers will be speaking on Wednesday the 28th at the NYC NoSQL meetup.

Recent Stuff

Yay Benoit! Benoit has been rockin the couchapp commits. Benoit has pushed 7 major versions of the Python couchapp toolkit so far and hasn’t shown much sign of slowing down. Couchapp also had recent contributions from Henrik Skupin and Geoff Buesing.

David Nolen wrote some awesome clojure code that can write about 5500 documents a second using bulk docs.

Cloudant was at OSCON last week where SETI showed off Open SETIQuest, which hosts all of its metadata in Cloudant.

Mathias wrote up his 10 biggest pet peeves in CouchDB.

Mailing lists

On the dev list Mikeal suggested some changes to the view server architecture and Robert Newson worked out the integration of the latest MochiWeb release into CouchDB, which now supports native SSL.

On the user list there were debates about the best way to structure AND & OR style queries, scheduled tasks, and Simon Metson posted a CouchDB job that has opened up in Bristol, UK.

Couchapps

TweetEater, displays tweets that are stored in CouchDB, source and example.

And Russell pushed his new sofa blog http://chewbranca.com.

Categories: Companies, NoSQL

"CouchDB Everything" talk at NYC NoSQL on Wednesday the 28th

Tue, 07/27/2010 - 01:25

Mikeal Rogers will be speaking on Wednesday the 28th at the NYC NoSQL meetup.

Categories: Companies, NoSQL

NYC CouchUp Tuesday the 27th at DBA

Tue, 07/27/2010 - 01:24

There is a CouchDB meetup tomorrow, Tuesday the 27th, in NYC at 7pm at DBA, 41 First Avenue, New York City, NY. 10003. Between 2nd and 3rd Street.



Categories: Companies, NoSQL

New O’Reilly Book: CouchDB Kurz & Gut

Fri, 07/23/2010 - 10:45

Hey you fellow Germans out there! O’Reilly published a new book on CouchDB just for you! It is Mario Scheliga’s CouchDB Kurz & Gut, roughly translated to CouchDB Short and Good. The Kurz & Gut series consists of compact books with enough content to get you running and that are later useful as a quick reference in day to day work (Jan happily remembers having the LaTeX edition at hand a couple of years back).

From all of us at Couchio:

Mario, awesome job! Thanks a lot for help getting the good word out and providing such a great resource for the German CouchDB community. Rock on! (or relax, either is fine :)

Categories: Companies, NoSQL

CouchDB Relaxback

Thu, 07/22/2010 - 06:29

The last week was a big one for CouchDB. Here is a brief recap, (inspired by Mark Phillips’s great blog post on the Riak Recaps). I can’t get everything in, cause there’s way too much. Leave a comment (or email jchris@couch.io) if I left anything out, or if you have ideas for what to mention next week.

1.0 Released

We’ve been working toward CouchDB 1.0 for five years, so getting that out the door was rather major. There was even a New York Times article that calls CouchDB the first production ready NoSQL database (I heartily concur, we’ve been production worthy since 0.8.0). Now that we are 1.0 there’s no reason not to use CouchDB in your banking, medical, air-traffic control, or other mission-critical applications.

As part of the 1.0 release we asked the community to write retrospective blog posts on how they came to CouchDB. Thanks to Klaus for a collection of links:

We’re still hoping to find more of these, so if you’ve been involved in CouchDB for a while (or even a short time) and you write one, let me know.

Windows Support

Part of the 1.0 release is Windows support. Thanks to Aaron Miller for building a provisional installer kit. Some folks are having trouble with the installer, so we’re waiting on Mark Hammond to get home and build a proper one. Thanks Mark!

CouchCamp

Last year we had CouchHack which was the first time a few of the committers met each other. It was also when Damien, Jan, and Chris first decided to form a company behind CouchDB.

This year we are hosting CouchCamp which will be the biggest gathering of CouchDB supporters in the history of all time. Of all time. There will be scrumptious organic food (and maybe Damien will even make a McDonald’s run) and quality beer. All this and a cabin bed at Walker Creek Ranch in Marin County. Slots are filling up fast, and there are only a limited number of beds. Anyone who registers after we are out of beds will be welcome, but you’ll end up camping (for real) in the great outdoors. Registration for this two and a half day event is currently only $500, in celebration of CouchDB 1.0.

CouchRest

In other news, the CouchRest Ruby library for CouchDB has a new set of active maintainers. For help with patching CouchRest, talk to Marcos Tapajós, Sam Lown, and Will Leinweber. Thanks Rubyists!

GeoCouch

GeoCouch is taking the world by storm. There was a meetup in Augsburg. PDXAPI continues to kick ass, and has begun to spawn related applications like Food Cart Pages and mobile clients. Thanks Volker Mische, Max Ogden, Don Park and others.

This just in: PDXAPI has won an award (and $1,000) from the Mayor of Portland!

Faster JavaScript Views

A few months ago, the Riak team (great hackers) released some code to make communication between Erlang and JavaScript way more efficient. After that Paul Davis integrated it with CouchDB, as a proof of concept. Since then, he’s started to refactor it for easier builds. We are looking forward to integrating this with CouchDB trunk in a near-future release.

Mr. Rogers on Sesame Street

Just kidding about the PBS reference (as a kid I was always fascinated by cameos). But Cloudant’s Mike, Alan, and Dave stopped by the Couchio offices and we talked about collaboration and strategy, and how to help get the CouchDB story to the millions of developers who haven’t heard a word of it yet.

One thing that came out of the meeting, we plan to host some CouchDB trainings. If you want one in your town, email hello@couch.io and we’ll get it together.

Monthly webinars

O’Reilly has been sponsoring monthly Webcasts in association with the CouchDB book. Here’s the list so far.

CouchApp Evently Guided Hack. Learn how to make CouchApps the JChris way.

Intro to Apache CouchDB

What’s new in CouchDB 1.0. A round up of the features and improvements that CouchDB’s seen in the last few months.

Flexible Scaling with CouchDB Replication (video not yet available.)

Next month (on the 25th) we’ll be doing one on how to make crash-only applications using the _changes feed. Signup link to follow.

New Committers

The Apache CouchDB project is very lucky to have two new committers joining us: Filipe Manana and Robert Newson. Filipe has been hard at work on updating the replicator to be more robust and performant. Robert is the force behind CouchDB-Lucene.

Categories: Companies, NoSQL