March 6, 2008

Phil Windley
pjw
Phil Windley's Technometria
» CouchDB from 10,000 Feet

Jan Lehnardt and Damien Katz

Damien Katz and Jan Lehnardt are talking about CouchDB. My students have mentioned it several times and we've had brief discussions about it, but I've never spent much time on it. This seemed like my chance. CouchDB's goal is a simple, non-relational database.

Damien started the CouchDB project after working for a number of years on the Lotus Notes project. He loved the document model of the data store (as did a lot of other people). He wanted an open source version of that model and CouchDB was born.

In real life, most data is document centric--not relational. A business card has all the data on it. The downside is that if your job title changes, a self-contained document model doesn't update that (it's not a separate table). On the other hand, more and more documents are starting to contain references to other data (URLs), which makes up for this in some cases.

CouchDB documents are in a JSON format. If you're not familiar with JSON, it's an XML-like format for storing data, but without the angle brackets. It's easier for people to read and write. It's not a substitute for XML, but it's great when you just need simple structured data. JSON is widely supported.

CouchDB uses an HTTP API. This allows CouchDB to make use of existing caches, load balancers, and analyzers. You can use curl to drive CouchDB from the command line or HTTP libraries for various languages to use it.
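
As a rough illustration of what driving CouchDB over HTTP can look like, here is a minimal Python sketch that builds (but does not send) the request for storing a JSON document. The host and port, the database name "contacts", and the document itself are assumptions made up for the example; they weren't part of the talk.

```python
import json
import urllib.request

# Assumption: a CouchDB server running locally on port 5984.
COUCH_URL = "http://localhost:5984"


def build_put_document(db, doc_id, doc):
    """Build (but do not send) the HTTP request that stores a JSON document."""
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url=f"{COUCH_URL}/{db}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical database and document, echoing the business-card example.
request = build_put_document("contacts", "jane-doe", {"name": "Jane Doe", "title": "Engineer"})
print(request.get_method(), request.full_url)

# To actually send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Because it's plain HTTP, the same request could just as easily come from curl or from any other language's HTTP library.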

CouchDB views allow you to filter, collate, and aggregate data. Views are powered by Map/Reduce. The map stage processes key/value pairs to produce intermediate values, and the reduce stage then combines the intermediate values for a particular key. Map/Reduce is inherently parallelizable, making it useful on clusters of machines.
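
To make the map and reduce stages concrete, here is a small Python sketch of the idea. It is not CouchDB's view engine (CouchDB's own view functions are written in JavaScript); the documents and the function names `map_doc`, `reduce_values`, and `run_view` are made up for illustration.

```python
from collections import defaultdict

# Hypothetical documents; in CouchDB these would be JSON documents in a database.
docs = [
    {"_id": "1", "tags": ["etech", "databases"]},
    {"_id": "2", "tags": ["etech", "politics"]},
    {"_id": "3", "tags": ["databases"]},
]


def map_doc(doc):
    """Map stage: emit one (key, value) pair per tag in the document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_values(key, values):
    """Reduce stage: combine all intermediate values for one key."""
    return sum(values)


def run_view(documents):
    """Group the mapped pairs by key, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(key, values) for key, values in sorted(grouped.items())}


tag_counts = run_view(docs)
print(tag_counts)
```

Each call to `map_doc` looks at a single document and each call to `reduce_values` looks at a single key, which is why the work can be spread across cores or machines.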

CouchDB is designed to be easily replicated and supports synchronizing machines.

Disks are getting cheaper and machines are being built with more and more cores. That makes a model like the one CouchDB uses very appealing. CouchDB is written in Erlang and provides a non-locking, MVCC-based, ACID-compliant data store.

There are some bonus features: Lucene is integrated for fulltext search, and CouchDB also provides JSON searching using JSearch, a wrapper on Lucene for JSON structures.

CouchDB has been accepted for incubation as an Apache project and uses the Apache license.

Tags: etech etech08 databases

» Larry Lessig on Changing Congress

Larry Lessig on Changing Congress

Lessig's keynotes are hard to blog, but the message isn't. Lessig's basic message is that government makes poor policy--even when the choice ought to be easy. The problem isn't overt bribery. In fact, we may have the best situation we've ever had in that sense. But even good people are affected by indirect dependence on money. Money in politics causes problems in three ways:

  1. Divert access - congressmen pay attention to donors over others.
  2. Change reasoning -
  3. Sets up a perverse incentive where regulation creates money-raising opportunities

This has created a fundamental loss of confidence where people believe there's corruption even when there's not.

The error, the wrong decisions, are the direct result of the improper dependence of politics on raising money.

There are numerous proposals on what to do to lessen the dependence on money.

Congress is an incumbency machine. The whole setup is designed to make sure the congressman gets reelected. Earmarks are a perfect example. This gives an extraordinary advantage to the incumbent. Congressmen abuse earmarks for the purpose of increasing their personal wealth.

The insiders are the enemy. The outsiders are the only ones who will change this. Technology is not a Utopian solution, but it's the most powerful tool we have to change the system.

Lessig is launching a project, in the spirit of Creative Commons, called Change Congress that would allow candidates to commit to three things (below) and if they did, they'd get a badge for their Web site showing their commitment.

  1. Stop taking PAC money
  2. Support banning earmarks
  3. Support public financing of elections

We can get candidates to consider this by running against them to increase the cost. We can also ask candidates to support it. Delegates are in a powerful position to influence candidates.

Tags: etech etech08 politics

March 5, 2008

» Kicking Ass

Kathy Sierra talks about kicking ass

Kathy Sierra takes the stage again at ETech to talk about kicking ass. She says that people aren't passionate about things they suck at. Finding passion is a way to kick ass.

She talks about neurogenesis, the idea that the brain can change positively. It's more plastic than we ever thought. She recommends an article by Jonah Lehrer in Seed Magazine on the work of Professor Elizabeth Gould. Stimulating environments matter--cages (or cubes) aren't stimulating environments.

The common thread of people who perform at a world class level is that they focus, concentrate, and practice. They put in the time. Putting in the effort is a key factor in more than 90% of the cases. So much for the slacker attitude, huh?

Do experts actually know more? Kathy shows two diagrams of chess positions and asks which is easier to recall. The question is what chess masters know that everyone else doesn't. Chess masters recall "real" boards much better than beginners do, but for "nonsense" boards, masters have no advantage. With real expertise, it's not what you know--it's what you do.

Kathy gives some hints on how to kick ass.

  1. Exploit your telepathy superpowers--use mirror neurons to exercise your brain without doing the thing itself. Primates respond to things other primates do by simply watching them. Mirror neurons allow us to run simulations of another person's brain. Video and pictures are better than text. Simulation resolution depends on you--you have to have expertise in the thing you're watching. Don't watch people who suck.
  2. Reduce interference--stop the mental chatter that's always going on when you try to do something. Tell the dumber part of your brain to shut up.
  3. Manage your fight or flight response--Kathy recommends the Stress Eraser.
  4. Get to know your brain--Your brain is constantly telling you whether the things your mind wants to learn are important. Read A Mind of its Own: How Your Brain Distorts and Deceives and Understanding how good people turn evil (here's a Wired article if you'd like the digest). Exercise (physical, not mental) is important.

Two big problems are motivation and practice. Get unplugged. We're addicted to intermittent variable reward. It's what makes slot machines addictive. It's what makes checking your email addictive. We can't lose the ability for intense concentration. You have to put in the hours.

Tags: etech etech08

» John McCarthy on the Elephant Programming Language

John McCarthy

He wasn't on the program, but this morning's keynote was given by Professor John McCarthy--the inventor of LISP and coiner of the term "artificial intelligence."

This morning, he's talking about Elephant 2000, a programming language designed for writing programs that interact with people. One of the things he points out that I find interesting is the idea that the compiler should generate required data structures without the user having to specify them. I'm not sure how that works from his explanation, but I'm certain that if we want languages that admit more parallelism, this is a feature that would help.

He describes the idea of ascribing belief to a thermostat. He uses a simple system like a thermostat not because it's necessary to understand the operation of thermostats, but rather because it's useful for understanding the nature of belief.

Elephant programs have input and output specifications since their goal is human interaction. They also need accomplishment specifications.

He gives some examples of Elephant programs. This one, for example, is a program saying that if the flight isn't full, then make a commitment to the passenger and communicate that act.

if !full(flt) then accept.request(make commitment(admit(psgr,flt)))
  answer.query exists commitment admit(psgr, flt).

There are more examples online. Obviously, the notation itself is clean. The question is whether you can do something with the semantics associated with the notation. Unfortunately, Professor McCarthy ran out of time and we didn't really get to the punch line, I think.

Tags: cs330 lisp etech etech08 programming+languages

» DIY Drones: Building Cheap UAVs

Chris Anderson

One of the reasons I love ETech is talks like this one from Chris Anderson (of Wired) on building homebrew drones, or unmanned aerial vehicles (UAVs). He has a Web site that shows how to build the various kinds of drones he talked about. He's used Lego Mindstorms, cell phones, and microcontrollers on planes. The results are pretty astounding.


Jordi with the blimp

He wanted something you could do indoors, and hit on the idea of using blimps--which are inherently autonomous since they float. The blimp uses ultrasonic sensors to maintain altitude. When it's powered up, it determines its altitude and then holds that. He has infrared beacons that serve as waypoints. The blimps don't have an absolute frame of reference, but know where they are in the room. Chris describes it as having "sub-Roomba-level intelligence."
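
The "measure altitude at power-up, then hold it" behavior can be sketched in a few lines of Python. The talk didn't say what control law the blimp uses, so the proportional controller here is my assumption, and `read_altitude_cm` is a hypothetical stand-in for the ultrasonic sensor.

```python
def make_altitude_hold(read_altitude_cm, gain=0.05):
    """Capture the altitude at power-up and return a function that holds it.

    read_altitude_cm is a hypothetical callable wrapping the ultrasonic sensor.
    """
    target = read_altitude_cm()  # altitude measured when the blimp powers up

    def thrust_command():
        error = target - read_altitude_cm()
        # Proportional control (an assumption), clamped to the range [-1.0, 1.0].
        return max(-1.0, min(1.0, gain * error))

    return thrust_command


# Simulated sensor readings: power-up at 150 cm, then level, drifting low, drifting high.
readings = iter([150.0, 150.0, 140.0, 190.0])
hold = make_altitude_hold(lambda: next(readings))
commands = [hold() for _ in range(3)]
print(commands)
```

A positive command means "thrust up" and a negative one "thrust down"; a real board would translate that into motor speeds.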

Chris has a "blimp board" that he is using for the next generation blimp with motor controllers, compass, ultrasonics, and even a way to know room temperature (which makes a huge difference with a blimp).

Tags: etech etech08 robotics

March 4, 2008

» Amazon's SimpleDB

Jay Ridgeway from Nextumi

This afternoon, I was torn between the session on botnets and one on Amazon's SimpleDB by Mike Culver and Jay Ridgeway. I chose the latter.

The goal is a durable, flexible datastore at a cheap price: $0.14 per machine hour, $0.10/GB into the cloud and $0.18/GB out.

The API call list is short. Domains are used to partition data. It helps to think of them as tables. To add something to a domain you use this syntax:

PUT (item, 123), (description, Sweater), (color, Red), (color, Blue)

The first name-value tuple is the name of the row and needs to be unique. The remaining tuples are attributes, and names can be repeated to represent an attribute with multiple values. There are no datatypes. Everything is a string.

A query looks like:

Domain = MyStore
['description' = 'Sweater']

Note that this isn't SQL. :-)
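
To get a feel for the data model, here is a tiny in-memory Python stand-in for a domain. It is not the AWS API; the `Domain` class and its `put` and `query` methods are hypothetical names that just mirror the examples above: items named by a unique string, attributes that can repeat, and everything stored as a string.

```python
from collections import defaultdict


class Domain:
    """In-memory stand-in for a SimpleDB domain (not the real AWS API)."""

    def __init__(self):
        # item name -> attribute name -> set of string values
        self.items = defaultdict(lambda: defaultdict(set))

    def put(self, item_name, *pairs):
        """Add (name, value) pairs; repeating a name adds another value."""
        for name, value in pairs:
            self.items[item_name][name].add(str(value))  # everything is a string

    def query(self, name, value):
        """Return the names of items with an attribute `name` equal to `value`."""
        return sorted(
            item
            for item, attrs in self.items.items()
            if str(value) in attrs.get(name, set())
        )


my_store = Domain()
my_store.put("123", ("description", "Sweater"), ("color", "Red"), ("color", "Blue"))
my_store.put("456", ("description", "Scarf"), ("color", "Red"))

print(my_store.query("description", "Sweater"))
print(my_store.query("color", "Red"))
```

Item 123 ends up with two values for color, which is the kind of thing you'd need a second table for in a relational store.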

There's a JavaScript application called SimpleDB Scratchpad that can be used to play with SimpleDB. All you need is your AWS key.

Jay Ridgeway from Nextumi took the mic to talk about their experience using SimpleDB to implement ShareThis. They've made heavy use of SimpleDB. He concluded with the following list of downsides and upsides. On the downside:

  • Limited features
  • minimal toolset and documentation
  • no experience in house
  • high switching cost

On the upside:

  • zero software cost
  • minimal staff costs
  • low barrier to development
  • responsive and reliable
  • simple, pragmatic solution for a complex problem.

Nextumi does maintain a copy of the raw data in case Amazon ceased to exist for some reason, but using it would obviously require some redesign of their site. I wonder if anyone has created the SimpleDB API on top of BerkeleyDB or MySQL? That would be handy.

SimpleDB doesn't handle binary data well. The best thing is to put binary data in S3 and put a reference to it in SimpleDB.

Tags: etech etech08 aws amazon databases

» Sectored Wi-Fi Architecture

Xirrus Wi-Fi array controller

O'Reilly is using one of these Xirrus Wi-Fi arrays and so far, I've got to say I'm impressed. The bandwidth has been great with none of the traditional conference wi-fi problems we all have learned to live with. The picture is of the operational array on the light truss in front of the stage. Looks much cooler in real life since all the lights are blinking! According to the Web site, the XS16, which is what we've got here, can deliver up to 864Mbps of bandwidth. Very cool.

Tags: etech etech08 wireless

» Your Carbon Footprint

Saul Griffith

This morning's opening keynote at ETech was by Saul Griffith, who ran down the steps he used to calculate his own carbon footprint and then what he had to do to put himself on a "carbon diet." It's not pretty. Doing the calculation is relatively straightforward in terms of the math, but gathering the data isn't easy. I'm hoping that we can get his slides when we put the audio up on IT Conversations because there's some great data there.

Speaking of IT Conversations, a recent IEEE show has a section on home co-generation. You can buy a furnace for your home right now that generates electricity to create the heat. You get power and heat from the same plant, making it much more efficient than buying power separately. You're still burning a hydrocarbon, but you're essentially getting the electricity for (close to) free. Retrofitting an existing home isn't a problem.

On a similar topic, today I put up the latest Technometria show on green computing. The guest is Jeremy Faludi, an expert in green computing. We talk about the carbon footprint of various parts of the computing industry and also mention where computers can help by reducing carbon use.

Tags: itconversations etech etech08 energy environment

March 3, 2008

» Marc Hedlund: Debugging Hacks, What They Never Taught You About Solving Hard Bugs

Marc Hedlund talks about debugging

There's no doubt that debugging is a critical skill for anyone who codes. Marc Hedlund is talking about how to tackle the really difficult ones. I enjoyed Marc's tutorial from last year, and picked this one on that basis.

Most bugs aren't hard. 95% of the time, you can find a fix easily and move on. Marc's tutorial is about what to do when the simple methods don't work anymore. He gives an example of a login that would fail once every 10,000 times or so. Turns out the problem was a filter that would throw out URLs with swear words in them. Finding bugs like that can be hard.

Marc recommends Why Programs Fail: A Guide to Systematic Debugging. This is a great guide to systematic debugging. Some people are great debuggers. Others can use help.

He uses this example: Segmentation fault using libtidy (symptoms, diagnosis, and bush medicine cure). Here's what he did right:

  • Eliminated possible causes and narrowed in
  • Wrote a test case that exercised the bug and discovered Rails was a factor
  • Used source code and a debugger to gather data.
  • Noticed a coincidence
  • Reproduced failure in his test case.

Here are some common mistakes:

"That doesn't look right, but it's probably fine." If you think there's a bug, there's a bug. Pay attention to small hints. If you can't find anything, file a bug report.

"It seems to have gone away." If you didn't fix the bug, it's still there. If you don't understand what the problem is, it will bite you later.

"I bet I know what this is." Wait to form theories until you have data. Let the data lead you. He quotes Sherlock Holmes: "It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts."

"That's impossible." Impossible conditions are often the source of bugs. Set up logging, exceptions, and assertions. Make sure you get the report. Make sure you see failure when it occurs. Ignoring the obvious is a good tool. When your Web site produces an exception, send it to the whole engineering team.
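
Here's a minimal Python sketch of what "logging, exceptions, and assertions" around an "impossible" condition might look like. The function `apply_discount` and the logger name "orders" are hypothetical; the point is only that the impossible case gets logged with context and raised rather than silently ignored.

```python
import logging

logger = logging.getLogger("orders")


def apply_discount(price, percent):
    """Return the discounted price, reporting "impossible" inputs loudly."""
    if not 0 <= percent <= 100:
        # Log with context, then raise, so the failure is seen when it occurs.
        logger.error("impossible discount: price=%r percent=%r", price, percent)
        raise ValueError(f"discount percent out of range: {percent}")
    result = price * (100 - percent) / 100
    assert result <= price, "a discount should never raise the price"
    return result


logging.basicConfig(level=logging.INFO)
print(apply_discount(80, 25))
try:
    apply_discount(80, 140)
except ValueError as exc:
    print("caught:", exc)
```

In a Web application, the same logger could be wired to a handler that mails exceptions to the engineering team.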

"Beats me...probably a race condition." Not all hard problems are race conditions. Usually this means "I don't know." This is forming a theory without data.

"I'm just going to fix this other bug quickly." Don't make any changes until you understand the bug. File and log any bugs you find along the way--but don't fix them. You end up suppressing the first error and missing it.

"That's probably the [server/client] code." Don't guess. Prove it. Be humble--don't assume you're better. If you keep getting wrong reports, proof will help.

"I think I found a bug in the OS." In all likelihood, the problem won't be in the libraries or in the operating system. That can happen, but you'd better have pretty good evidence.

"That's not usually the problem." Beware of representativeness errors. Sometimes 40-year-olds have heart attacks. If the data leads that way, then follow it.

"Oh, I saw this just last week." This is known as an availability error. Third in a week could be an epidemic--or not.

"This guy's too smart to make that mistake." Beware of sympathy errors. Even engineers put CDs in upside down. Check the data no matter the source. The opposite is also true: assuming someone's stupid.

"I found a bug and fixed it--done." Finding a bug is different than finding the bug.

"I haven't made any progress--it's still broken." Think of the bug report as a collection of information. Adding data, eliminating theories, and recording changes leads to understanding. Clearing bugs is the end goal, but progress can be represented by other things.

"I've got to get this out now--no time for..." Rushed fixes tend to introduce more bugs. Stick to a good process even if the situation is urgent. Break down suppression and closure.

Here's Marc's general approach to fixing bugs.

  • Revert any changes you made looking for a quick fix - Bring the system to its initial state. People usually try something quick. Getting back to the original condition as quickly as possible is important.
  • Collect data from each of the components involved - Maintain a page with the most concise problem descriptions. State everything you know for a fact. List the questions for which you need answers. Don't delete data; instead move it to a "probably unrelated" section.
  • Reproduce the bug and automate it - You must have access to the reporter's environment. Use virtualization and the browser version archives where needed.
  • Simplify the bug conditions as much as possible - Can you reproduce the bug in other circumstances? Can you remove a condition and still see it? Are there any contradictions in the conditions? "We only see this on OSX with IE." Can you separate the problem? Could it be an error in the data?
  • Look for connections and coincidences in the data - Build a set of "that looks weird" observations. Describe all the actors and their roles. Parallel timelines can help. Look at data from client and server viewpoints.
  • Brainstorm theories and test them - State each theory separately. Does the theory cover all of the data in the report? Does it explain why the conditions are necessary? Does it cover all the related reports?
  • When you find a fix, verify it against the report - Go back and re-read the whole bug report. Run all of your reproduction test cases.
  • Check that you haven't created new bugs - Very common for one fix to create new bugs. Automated test suites help enormously at this point. If X was failing under condition Y but not Z and it now passes under Y, does it still pass under Z? Often the answer is "no."
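
The last two steps (re-run every reproduction case, then make sure condition Z still passes once Y is fixed) are easy to automate. This is only a sketch: `slugify` is a hypothetical function under repair, and the cases are invented to show the shape of such a check.

```python
def slugify(title, keep_case=False):
    """Hypothetical function under repair; imagine the bug appeared only with keep_case=True."""
    words = title.split()
    if not keep_case:
        words = [word.lower() for word in words]
    return "-".join(words)


# Each reproduction case records the condition it ran under and the expected result.
# "Y" is the condition that used to fail; "Z" always passed and must keep passing.
CASES = [
    ("Y: keep_case=True", dict(title="Hard  Bugs", keep_case=True), "Hard-Bugs"),
    ("Z: keep_case=False", dict(title="Hard  Bugs", keep_case=False), "hard-bugs"),
]


def run_cases(cases):
    """Return the labels of every case whose result differs from what is expected."""
    return [label for label, kwargs, expected in cases if slugify(**kwargs) != expected]


failures = run_cases(CASES)
print("failures:", failures)
```

Keeping the Z cases in the same table as the Y cases means a fix that breaks Z shows up in the same run that confirms Y.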

These steps almost always work. You might have to go through them several times. You might need several people to make it work. You might decide it's too costly. Even so, if you go all the way through this process, you will get a fix.

I missed 45 minutes after the break because of a conference call I had to join. So, there's a gap here in what Marc said and what I heard.

The best predictor of new bugs is change rate. Code that is changing a lot will have a lot of bugs. Direct QA efforts by counting changes per file. Spend time testing the stuff that changed.
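
Counting changes per file is simple to script. As a sketch, and assuming your history lives in git (Marc didn't name a version control system), the output of `git log --name-only --pretty=format:` is just a list of touched paths, which Python can tally. The file names in the sample are made up.

```python
from collections import Counter


def changes_per_file(log_text):
    """Count how often each path appears in `git log --name-only --pretty=format:` output."""
    paths = (line.strip() for line in log_text.splitlines())
    return Counter(path for path in paths if path)


# Sample output (hypothetical file names); commits are separated by blank lines.
sample_log = """
app/login.py
app/filters.py

app/login.py

app/login.py
docs/README
"""

for path, count in changes_per_file(sample_log).most_common():
    print(f"{count:3d}  {path}")
```

The files at the top of that list are where the testing time should go first.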

The best estimator of code quality is the rate you find bugs. When the find rate goes down, you're ready to ship. You should ignore every other QA measure.

You can do four things with each bug:

  1. Fix it
  2. Suppress it
  3. Record it and wait for more info
  4. Ignore it

You probably can't always afford (1). Of the rest, (3) is the best option.

There's a culture surrounding bugs. Don't scold people for bugs. Everyone creates bugs. If bugs cause punishment, reports will be killed and there will be severe tension with QA. If there's a chronic problem with bugs from one person, deal with it in person.

Reopen rates measure how development deals with bugs. Lots of reopens is a red flag for process--especially within one release. Reopens indicate that bugs are being hidden rather than closed.
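
A reopen rate is easy to compute from a bug tracker export. The record layout below (`times_closed`, `times_reopened`) is a hypothetical format I made up for the sketch; a real tracker would need its own field mapping.

```python
def reopen_rate(bugs):
    """Fraction of closed bugs that were reopened at least once."""
    closed = [bug for bug in bugs if bug["times_closed"] > 0]
    if not closed:
        return 0.0
    reopened = [bug for bug in closed if bug["times_reopened"] > 0]
    return len(reopened) / len(closed)


# Hypothetical tracker export for one release.
release_bugs = [
    {"id": 101, "times_closed": 1, "times_reopened": 0},
    {"id": 102, "times_closed": 2, "times_reopened": 1},
    {"id": 103, "times_closed": 0, "times_reopened": 0},
    {"id": 104, "times_closed": 3, "times_reopened": 2},
]

print(f"reopen rate: {reopen_rate(release_bugs):.0%}")
```

Tracking this number per release makes it easier to spot the within-one-release spikes Marc warns about.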

Marc has some book recommendations for people who want to understand debugging better:

Tags: etech08 debugging programming

» Kathy Sierra: Storyboarding for Non-Fiction

Kathy Sierra talks about storyboarding

How do you create riveting technical presentations and user manuals? Tell a story. Kathy Sierra is teaching the tutorial and using her own experience creating the "Head First" books on Java and Design Patterns as examples.

Define your "post-click" behavior. After someone has gotten your message, what should happen in the reader? Does your message change the reader's behavior? Do you know how you want it to change them? You can't create the right material without understanding what you want to achieve. In the case of

What creates a page turner? Suspense, for one. The feeling that you can't wait to see what comes next. Or even just the joy of understanding something complex.

What makes you stop turning the page? The audience responds with several examples including jargon, exclusion, competition with funner activities, and so on. Kathy talks about how jargon is an example of something that's good for people to know, but someone has to provide the bridge and too many books suffer from only being understandable after you know the area.

There's a hierarchy of requirements for a "bestseller."

  1. The right topics
  2. In the right order
  3. Clear and accurate
  4. Interesting
  5. Enchanting

The competitive advantage is in the final two. Good writers don't usually have a huge challenge with the first three. But you have to nail those first.

One of the biggest problems that technical writers have is contentitis: the need to cover everything. Most books would do better with much less material in the same time. This is related to the happy user peak that Kathy has talked about related to product features.

Good books are brain friendly and seductive. When you can properly introduce people to topics, they get a richer experience. It's important to get people past the suck threshold and above the passion threshold as quickly as possible. That means you need to get information to them quickly. And that means that creating a page-turner is vital.

People don't want to be experts at a tool. They want to get something done, accomplish their goals. People do become passionate about tools, but largely because they allow them to get things done. Kathy uses the Nikon Learning Center as a positive example of focusing on the pictures people want to take and then walking through the ways to use the camera to get those results. Contrast that with a user manual, which is a dry exposition of features. She takes it one step further and contrasts sales literature with user manuals.

The brain has spam filters. No matter how hard you might want to get some information in your mind, the brain has natural processes that filter out some information as "not important." Authors have to fight to get past the spam filters. Legacy brains have to be fooled into thinking that the topic we want to learn is something they care about.

Brains care about chemistry (i.e. emotion). Brains pay attention to things that have an associated emotional cue. Brains like novelty, weird, different. You should be afraid every moment that you're losing your reader. Brains pay attention to things that are scary. Brains care about changes in light and shadow. Brains pay attention to faces. Brains like joy. Brains like young cute things.

Brains don't care about cliche. Brains don't like things that aren't resolved. The brain gets pulled in trying to figure things out. Curiosity is irresistible.

Bottom line: emotions tell the brain "this matters." So talk to the brain, not the mind. In other words, trick the brain into thinking polymorphism is as important as a tiger.

Formality is a problem. People's ability to use and apply new information is positively affected by using a conversational tone. She cites research by Moreno and Mayer (here's a good summary) that "research seems to show a self referential effect where information is retained, memorized better when it is given a personal reference."

Biggest lesson: if you're in the business of communicating things, you're in the emotion-delivery business.

Kathy Sierra talks about the Hero's Journey

Reader as hero. This doesn't mean that you write your book as a fictional story. But you want the reader to identify with the journey and the endpoint goal. The experience should be a hero's journey: life is normal, something happens to change that, (add helpful sidekicks and a mentor), things really suck, hero overcomes bad things, and then, finally, the hero returns to normal. The hero is changed after overcoming bad things. As an author, you should figure out what that change is. This is the outcome that you want for the reader.

The reader as hero can't be supported by "dumbing down" the material. The reader won't feel heroic if the material is too easy. Don't shy away from challenging. But use brain friendliness to create the emotional experience that gets them through the challenge.

Here's the overall process we're going to consider.

  1. Log line
  2. High level 3-acts
  3. Create 'story' template
  4. Create topic list/cards
  5. Rearrange in template
  6. Fill in holes
  7. Do detailed storyboards
  8. Miracle occurs

The log line answers three questions:

  1. Who is this about?
  2. What is he up against?
  3. What is at stake?

Act one is the call to action. Act one typically ends with the hero's refusal of the call and ultimate acquiescence. Act two is the challenge and ends with figuring out how to meet the challenge. Act three is the road home.

Story templates

Here's the story template:

  1. Set tone
  2. Question posed
  3. What's at stake
  4. Catalyst or motivation
  5. Skepticism or debate
  6. Cross threshold (to Act 2)
  7. back story and tools
  8. Fun and games using tools
  9. Stakes raised
  10. Not out of woods
  11. All is lost
  12. Answer found! I rule!
  13. Lessons learned

Keep in mind that learning experiences are fundamentally different from reference experiences. You can't create a great learning tool and make it a great reference at the same time.

Steps 1-5 in the preceding template are Act 1 and should be about 20-40 pages in a technical book. Steps 6-12 are Act 2 and are usually around 300 pages. Step 13 is the final Act and is again around 20-40 pages.

For everything in your book, from Acts to chapters to parts of chapters, use the spiral user experience model.

Motivational milestones

Ask "why?", "who cares?", and "so what?" about every topic. Make sure this is front loaded so that the brain knows what it should pay attention to. Make sure that when you say "This matters because..." that what follows is emotionally engaging. Show, don't tell. This might mean pictures but can also be an example. "Imagine you want to do..." is a way to set up a scenario.

Smoking out the topics with the "who cares?" question can ignite your natural passion about why it matters and affect writing in a positive way. Make this discussion real. Better to do it with a helpful critic, I think.

The other part of the spiral that readers care about is the payoff. Once users understand concept A, that leads right into the motivation for concept B: now that you understand this, you're ready to approach something even better. Game designers are good at this.

A technique for getting emotional benefit is "just in time" vs. "just in case." Just-in-time learning is highly motivated. Just-in-case is what books and lectures are all about. Setting up scenarios with "imagine you want to ..." is one way to make just-in-case feel like just-in-time. You'll end up with the topics in the right order.

The representation of getting to the next level. What are the rewards someone gets for completing an activity or learning something new? What are the new "superpowers" that they get? What can they do now that they couldn't do before?

We need to get readers in the flow state--that state where they can't stop reading. To get there, knowledge and challenge have to be in balance.

What engages the brain?

These things turn the brain on deeply: discovery, challenge, narrative, self-expression, social framework, cognitive arousal, thrill, sensation, triumph, flow, accomplishment, fantasy, and growth. Complete the description: "Learning experience as..." Don't take people outside the experience with extraneous material and narrative. Don't make users think about the wrong things.

Variety is important. We get tired if we hop on one leg over and over. Make sure that you're exercising different parts of the brain. Insert cooler stuff with dry stuff. Pace the topics.

We ended with an exercise of writing out some storyboard ideas for something we care about. I did it for the first product Kynetx is building and it was fun and helpful.

Tags: etech etech08 product+management storytelling