Damien Katz and Jan Lehnardt are talking about CouchDB. My students
have mentioned it several times and we've had brief discussions about
it, but I've never spent much time on it. This seemed like my
chance. CouchDB's goal is a simple, non-relational database.
Damien started the CouchDB project after working for a number of
years on the Lotus Notes project. He loved the document model of the
data store (as did a lot of other people). He wanted an open source
version of that model and CouchDB was born.
In real life, most data is document-centric, not relational. A
business card has all of its data on it. The downside is that if your
job title changes, a self-contained document model doesn't update
the cards that carry it (the title isn't in a separate table). On the
other hand, more and more documents are starting to contain
references to other data (URLs), which makes up for this in some
cases.
CouchDB documents are in a JSON format. If you're not familiar with
JSON, it's a lightweight text format for structured data, similar in
purpose to XML but without the angle brackets. It's easier for people
to read and write. It's not a substitute for XML, but it's great when
you just need simple structured data. JSON is widely supported.
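As a rough illustration of what a self-contained document looks like,
here is a minimal sketch that encodes a business card as JSON using
Python's standard json module. The field names and values are made up
for the example; they are not taken from the talk or from CouchDB.

```python
import json

# Hypothetical business-card document; every field name here is
# illustrative only.
card = {
    "name": "Jane Example",
    "title": "Engineer",
    "company": "Example Corp",
    "phones": ["555-0100", "555-0101"],
    # A reference to other data, expressed as a URL.
    "homepage": "http://example.com/jane",
}

# Serialize the document to JSON text, then parse it back.
encoded = json.dumps(card, indent=2, sort_keys=True)
decoded = json.loads(encoded)

print(encoded)
```

The round trip returns the same structure, which is the point: the
whole card travels as one document.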
CouchDB uses an HTTP API. This allows CouchDB to make use of existing
caches, load balancers, and analyzers. You can use curl to
drive CouchDB from the command line or HTTP libraries for various
languages to use it.
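To make the HTTP idea concrete, here is a small sketch using Python's
standard urllib.request. It only builds the requests and does not
send them. The server address (localhost on port 5984), the database
name "cards", and the document ID "jane" are assumptions for the
example; the helper function names are hypothetical.

```python
import json
import urllib.request

# Assumption: a CouchDB server running locally on port 5984.
BASE_URL = "http://localhost:5984"


def build_put_document(db, doc_id, doc):
    """Build (but do not send) a PUT request that stores a document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{db}/{doc_id}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_get_document(db, doc_id):
    """Build (but do not send) a GET request that fetches a document."""
    return urllib.request.Request(f"{BASE_URL}/{db}/{doc_id}", method="GET")


put_req = build_put_document("cards", "jane", {"name": "Jane Example"})
get_req = build_get_document("cards", "jane")

# Against a running server, urllib.request.urlopen(put_req) would
# send the request.
print(put_req.get_method(), put_req.full_url)
```

Because these are ordinary HTTP requests to ordinary URLs, anything
that sits between client and server (a cache, a load balancer) can
handle them without knowing about CouchDB.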
CouchDB views allow you to filter, collate, and aggregate data.
Views are powered by Map/Reduce.
The map stage processes key/value pairs to produce intermediate
values, and the reduce stage then combines the intermediate values
for a particular key. Map/Reduce is inherently parallelizable, making
it useful on clusters of machines.
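The two stages can be sketched in a few lines of plain Python. This
is a toy illustration of the Map/Reduce idea only, not CouchDB's view
API (CouchDB's own views are typically written as JavaScript
functions). The function names and the sample documents are invented
for the example, which counts how many documents carry each tag.

```python
from collections import defaultdict


def map_doc(doc):
    """Map stage: emit one (key, value) pair per tag in a document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_values(key, values):
    """Reduce stage: combine all intermediate values for one key."""
    return sum(values)


def run_view(docs):
    """Run map over every document, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {
        key: reduce_values(key, values)
        for key, values in sorted(grouped.items())
    }


# Hypothetical documents; the third has no tags and emits nothing.
docs = [
    {"_id": "a", "tags": ["etech", "databases"]},
    {"_id": "b", "tags": ["databases"]},
    {"_id": "c"},
]

counts = run_view(docs)
print(counts)
```

Each call to map_doc looks at one document in isolation, which is
why the map stage can be spread across cores or machines.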
CouchDB is designed to be easily replicated and supports
synchronizing machines.
Disks are getting cheaper and machines are being built with more and
more cores. That makes a model like the one CouchDB uses very
appealing.
CouchDB is written in Erlang and provides a non-locking, MVCC-based,
ACID-compliant data store.
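For readers new to MVCC (multi-version concurrency control), the
general idea can be sketched with a toy in-memory store. This is an
assumption-laden illustration, not CouchDB's implementation: the
ToyStore class, its integer revision numbers, and its methods are all
hypothetical. Writers never overwrite a document in place; they
append a new version, and a write based on a stale revision is
rejected instead of waiting on a lock.

```python
class ToyStore:
    """Toy versioned store: keeps every version, never overwrites."""

    def __init__(self):
        self._versions = {}  # doc_id -> list of document versions

    def get(self, doc_id):
        """Return (revision, latest document), or (0, None) if absent."""
        versions = self._versions.get(doc_id, [])
        if not versions:
            return 0, None
        return len(versions), versions[-1]

    def put(self, doc_id, doc, expected_rev):
        """Append a new version if the caller saw the latest revision."""
        versions = self._versions.setdefault(doc_id, [])
        if expected_rev != len(versions):
            raise ValueError("conflict: document changed since it was read")
        versions.append(dict(doc))
        return len(versions)


store = ToyStore()
rev1 = store.put("jane", {"title": "Engineer"}, expected_rev=0)

# A reader picks up version 1 and keeps it, untouched by later writes.
snapshot_rev, snapshot = store.get("jane")

rev2 = store.put("jane", {"title": "Manager"}, expected_rev=rev1)

# A second writer still holding revision 1 is rejected, not blocked.
try:
    store.put("jane", {"title": "Director"}, expected_rev=rev1)
    conflict = False
except ValueError:
    conflict = True

print(snapshot, store.get("jane"), conflict)
```

No reader ever waits for a writer here, because the version a reader
holds is never modified.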
There are some bonus features: Lucene is integrated for full-text
search, and CouchDB also provides JSON searching using JSearch, a
wrapper on Lucene for JSON structures.
CouchDB has been accepted for incubation as an Apache project and
uses the Apache license.
Tags:
etech
etech08
databases