June 24, 2008

Phil Windley
pjw
Phil Windley's Technometria
» Velocity 08: High Performance AJAX Applications

Julien Lecomte from Yahoo! is speaking about creating performant AJAX applications. The most important point: plan for performance from day 1. Interestingly, many of his initial points are about telling the developer to work with the product manager and not just say "no."

Julien references Web Site Optimization: 13 Simple Steps by Stoyan Stefanov. Here are some tips:

Less is more. Don't do unnecessary things.

Break rules. Make compromises and break best practices when needed. For example, you might decide to forgo CSS, especially CSS expressions.

Work on improving perceived performance. Cheat by making users think things are done before they are.

Measure performance. Test using a setup similar to the user's configuration. Profile your code during development. Automate profiling and performance testing. Keep historical records of how features perform.

Minify CSS and JavaScript files. Use something like the YUI Compressor. Stay away from compression schemes that require run-time decompression work in the browser. You can also combine the CSS and JavaScript files. Optimize images.

Loading and parsing HTML, CSS, and JavaScript code is costly. Be concise and write less code. Make good use of libraries. Splitting JS libraries into bundles for specific uses might save time.

Close HTML tags. Unclosed tags take longer to parse. Load assets (even images) on demand.

Most DOM operations can be accomplished before the onload event has fired. You can also load the scripts after the page has fully loaded.

In JavaScript a scope-chain lookup is done each time a variable is accessed. Declaring variables with the var keyword and making them local helps. Avoid global variables at all costs. Avoid the with statement. You can use a local variable to "cache" the value of a variable outside the current scope when it's going to be accessed repeatedly.
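
Here's a minimal sketch of the local "cache" idea. The `settings` object and both function names are made up for illustration; they aren't from the talk. Modern engines optimize a lot of this away, so measure before relying on it.

```javascript
// Hypothetical configuration object living outside the functions' scope.
var settings = { tax: { rate: 0.07 } };

// Uncached: every iteration resolves `settings` through the scope chain
// and then performs two property lookups.
function totalUncached(prices) {
  var total = 0;
  for (var i = 0; i < prices.length; i++) {
    total += prices[i] * (1 + settings.tax.rate);
  }
  return total;
}

// Cached: resolve the out-of-scope value once, then use locals in the loop.
function totalCached(prices) {
  var rate = settings.tax.rate; // local "cache" of the outer value
  var count = prices.length;    // avoid re-reading length on each pass
  var total = 0;
  for (var i = 0; i < count; i++) {
    total += prices[i] * (1 + rate);
  }
  return total;
}
```

Both functions return the same result; only the number of lookups inside the loop differs.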

Limit the number of event handlers. Attaching an event handler to hundreds of elements is very costly and can be a source of memory leaks.
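
One common way to limit handlers is event delegation: attach a single listener to a container and work out which child was hit. The sketch below is my own illustration, not code from the talk; `delegate` is a hypothetical helper, and it assumes a browser that supports `Element.closest`.

```javascript
// Attach ONE listener to a container instead of one per child element.
// `container` is a DOM element in practice; `selector` is a CSS selector
// identifying the children we care about.
function delegate(container, eventType, selector, handler) {
  function listener(event) {
    // Walk up from the event target to the nearest element matching selector.
    var match = event.target.closest(selector);
    if (match && container.contains(match)) {
      handler(match, event);
    }
  }
  container.addEventListener(eventType, listener);
  // Return a function that detaches the listener again, which helps
  // avoid leaking the handler when the container is thrown away.
  return function remove() {
    container.removeEventListener(eventType, listener);
  };
}
```

In a page you might call `delegate(listElement, "click", "li", onItemClick)`, where `listElement` and `onItemClick` are whatever your application defines.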

Reflows happen when the DOM tree is manipulated. You can minimize reflows by taking advantage of browser built-in optimizations. For example, modifying an invisible element doesn't trigger reflow.

Use onmousedown instead of onclick to take advantage of the time between the start of the button press and the release.

Avoid using JavaScript for layout. Use CSS where possible.

Never resort to a synchronous XHR. Asynchronous programming is more complicated but it's worth it. Deal with network timeouts programmatically.
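
As a rough sketch of an asynchronous request that handles timeouts in code: `getWithTimeout` and its parameters are hypothetical names of mine, and the sketch assumes a browser whose XMLHttpRequest supports the `timeout` property and `ontimeout` callback. The optional last argument exists only so a stand-in constructor can be passed in for testing.

```javascript
// Issue an asynchronous GET and report success, HTTP failure, network
// failure, or timeout through callbacks.
function getWithTimeout(url, timeoutMs, onSuccess, onFailure, XhrCtor) {
  var Ctor = XhrCtor || XMLHttpRequest; // default to the browser's XHR
  var xhr = new Ctor();
  xhr.open("GET", url, true); // third argument true = asynchronous
  xhr.timeout = timeoutMs;    // milliseconds before ontimeout fires
  xhr.onload = function () {
    if (xhr.status >= 200 && xhr.status < 300) {
      onSuccess(xhr.responseText);
    } else {
      onFailure(new Error("HTTP " + xhr.status));
    }
  };
  xhr.ontimeout = function () {
    onFailure(new Error("timed out after " + timeoutMs + " ms"));
  };
  xhr.onerror = function () {
    onFailure(new Error("network error"));
  };
  xhr.send();
  return xhr;
}
```

The failure callback is where you would retry, back off, or tell the user what happened.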

If you validate user data on the browser, 99.9% of the time, the request will succeed, so lock the affected elements, let the user know something's happening, and process the request while the user continues to use the application.

Use JSON rather than XML. Consider local storage and just process diffs. Multiplex AJAX requests where you can.
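
To make "just process diffs" concrete, here's a small sketch that sends only the fields that changed, serialized as JSON. The `diffState` helper and the sample data are hypothetical, and it assumes flat, JSON-serializable objects.

```javascript
// Return only the top-level keys whose values changed between two snapshots.
// Values are compared by their JSON text, which is simple but sensitive to
// key order inside nested objects.
function diffState(previous, current) {
  var changed = {};
  var removed = [];
  Object.keys(current).forEach(function (key) {
    if (JSON.stringify(previous[key]) !== JSON.stringify(current[key])) {
      changed[key] = current[key];
    }
  });
  Object.keys(previous).forEach(function (key) {
    if (!(key in current)) removed.push(key);
  });
  return { changed: changed, removed: removed };
}

// Illustrative data: only `city` and `visits` differ, so only they are sent.
var before = { name: "Ada", city: "Provo", visits: 3 };
var after = { name: "Ada", city: "Lehi", visits: 4 };
var payload = JSON.stringify(diffState(before, after));
console.log(payload);
```

The payload stays small no matter how large the unchanged part of the state is.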

Tags: velocity08 ajax performance browsers

June 23, 2008

Phil Windley
pjw
Phil Windley's Technometria
» Velocity 08: Jiffy: Instrumenting and Measuring Web Performance

Scott Ruthfield from WhitePages.com is announcing a new open-source project called Jiffy, a tool for measuring the end-to-end performance of Web sites (PDF slides). Jiffy provides real data about performance that is more complete and more fine grained than what you might get from Keynote or Gomez. Jiffy has four goals:

  • Real data at scale - track 100% of page views
  • Measure anything - pre-load data access, each ad, brand, when the form is ready, and so on
  • Real-time reporting
  • No impact on page performance

Jiffy comprises a JavaScript library that instruments the pages, an Apache proxy, a tool for putting log data into a database (Oracle for now), and reporting roll-up code and UI.

The basic idea is "mark and measure." You can set a mark and then make any number of measurements of how much time has elapsed from the mark. You can do immediate or batch submits depending on the requirements of your site and how much bandwidth you want to consume.
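
To illustrate the mark-and-measure idea, here's a sketch of my own. This is not Jiffy's actual API; `createTimer` and its arguments are invented for the example. The clock and the submit function are passed in, and a batch size of 1 gives you immediate submits.

```javascript
// Hypothetical mark-and-measure timer (NOT the real Jiffy library).
// now:       function returning the current time in milliseconds
// submit:    function receiving an array of measurements to report
// batchSize: how many measurements to collect before submitting
function createTimer(now, submit, batchSize) {
  var marks = {};
  var pending = [];

  function flush() {
    if (pending.length > 0) {
      submit(pending);
      pending = []; // start a fresh batch; the submitted array is untouched
    }
  }

  return {
    // Record a named starting point.
    mark: function (name) {
      marks[name] = now();
    },
    // Record the time elapsed since a mark; any number per mark is fine.
    measure: function (markName, label) {
      if (!(markName in marks)) throw new Error("unknown mark: " + markName);
      pending.push({
        mark: markName,
        label: label,
        elapsedMs: now() - marks[markName]
      });
      if (pending.length >= batchSize) flush();
    },
    // Send whatever is waiting, for example when the page unloads.
    flush: flush
  };
}
```

In a page you might create it with `createTimer(Date.now, sendToServer, 10)`, where `sendToServer` is whatever reporting function your site provides.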

Bill Scott of Netflix has written an extension to Firebug for Jiffy.

Tags: performance web velocity08

April 11, 2008

Dennis Muhlestein
nonic
All My Brain
» Using YUI components in a templated environment

If you develop sites anything like I do, you'll end up setting up a site wide layout and theme before you start coding any individual pages. I like to add YUI components where they are useful, but I've come across a couple little quirks that were annoying me. Here are my observations and [...]

April 2, 2008

Dennis Muhlestein
nonic
All My Brain
» Serving JavaScript Fast

I found this excellent writeup on serving JavaScript files posted on Digg.com. I think I'll convert some of those ideas to Python but I thought it worth posting here in the meantime with the link to the story. The next generation of web apps make heavy use of JavaScript and CSS. We’ll show you [...]

February 21, 2008

Pat Eyler
pate
On Ruby
» JRuby 1.1 RC2 Real World Performance

With the recent release of JRuby 1.1 RC2 I’ve rerun my ‘Real World Performance’ benchmarks. In addition to my normal run I’ve added a second longer run, and the results surprised me. Let me tell you what I did, and then we can take a look at the results. In both cases I use an application that builds a catalog of good and bad patterns. Then it reads in syslog output and compares the log

January 11, 2008

Pat Eyler
pate
On Ruby
» Real World Rubinius Performance

Well, I have good and bad news from the Rubinius front. This morning, I built the latest Rubinius from the git repository, and gave LogWatchR another try … and it worked! This is a huge step forward from my perspective, since I’ve had all kinds of weird failures in the past. The bad news (well, bad is an overstatement, let’s say ‘not so good news’) is that the performance is pretty bad at this

January 8, 2008

Pat Eyler
pate
On Ruby
» JRuby 1.1RC1 Real World Performance Update

With the release of JRuby 1.1RC1 I’ve run a new set of LogWatchR performance tests. This time, I’ve only run a set of 1.0 and 1.1 versions of JRuby. If you’re really interested in seeing more about 1.8 and 1.9 performance, you can always go look at my older post on the topic. As a bit of clarification up front, this test measures the execution time of a simple minded log analysis tool I wrote/

December 26, 2007

Pat Eyler
pate
On Ruby
» Real World Performance Profiling

It looks like my post on Ruby 1.9.0 performance is drawing some criticism over on reddit. I already updated the original post to deal with a comment by ‘gravity’. In a later comment, ‘Ganryu’ wrote “I don’t get it… isn’t this mainly IO dependent?” Since he doesn’t have access to the code for LogWatchR he can’t do the profiling to find this out, but it’s the same kind of assumption that a

» Real World Performance On Boxing Day

Well, Ruby 1.9.0 landed yesterday, as expected. I’d be remiss if I didn’t start out by thanking matz, ko1, and all the other hackers involved in getting this milestone release out the door. It’s a great step for Ruby, and one that we’ve been waiting a long time for. The bad news is that 1.9.0 is just a development release branch leading up to 2.0, and it doesn’t yet run Rails or Mongrel.

December 4, 2007

Dennis Muhlestein
nonic
All My Brain
» Making WP-Super-Cache gzip compression work

I was pretty excited to see an update to WP-Cache. The first thing I noticed is that when I enabled the new super cache compression option, I started getting a file save as dialog instead of my pages. As of the current version of WP-Super-Cache, the readme.txt file states that if you get [...]

» WP Super Cache - The Ultimate WordPress Caching Plugin

I’ve upgraded my old WP-Cache plugin to this one that I found on Digg.com today. From the Digg.com Post: Tired of clicking a link off the Digg front page only to find a crashed or mortally lagged site on the other side? Finally, Donncha (one of the main WordPress developers) has solved the problem once and [...]

November 5, 2007

Hans Fugal
no nic
The Fugue :
» X-Sendfile

I'm writing a little photo gallery of my own, because everything out there stinks. But sending big image files in Rails (using send_file and send_data) is slow, mostly because you tie up a whole rails process just feeding data to the web. Web servers like Apache, Lighttpd, and Mongrel are good at serving static files, so let them do it.

That's the idea behind X-Sendfile. If you send an X-Sendfile header with the path of the file you want to send, then a supporting webserver will do the dirty work and do it fast, and you can get on with serving other requests.
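
The application side of this is tiny. Here's a rough sketch in JavaScript rather than Rails, just to show the shape: `sendViaXSendfile` is a hypothetical helper, `res` is assumed to look like Node's `http.ServerResponse` (`setHeader` and `end`), and a front-end web server with X-Sendfile support is assumed to sit in front of the application.

```javascript
// Hypothetical handler helper: instead of streaming the file from the
// application, name the file in a header and finish with an empty body.
// The web server in front is assumed to replace the body with the file.
function sendViaXSendfile(res, absolutePath, contentType) {
  res.setHeader("Content-Type", contentType);
  res.setHeader("X-Sendfile", absolutePath);
  res.end(); // no body: the web server is expected to supply it
}
```

The application process is free again as soon as `end` returns, which is the whole point.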

That's the theory anyway, but there are some bumps in the road. First, AFAICT mongrel doesn't support X-Sendfile. This is fine when mongrel is running behind an Apache proxy which does, but kind of throws a wet blanket on development and apachephobes like myself. Ok, apachephobe might be a bit strong, but I don't want to set that monster up on my laptop just for some rails development. So mongrel's out. Correct me if I'm wrong.

Lighttpd supposedly invented X-Sendfile, but 1.4.x and earlier don't seem to support it. Instead, you have to use the header X-LIGHTTPD-send-file. Also, it doesn't work unless Content-Length is properly set (or perhaps if it's absent). This is bad news for rails users, since a bug in rails causes the Content-Length header to be set to the content, which is not the file. If you do render :nothing => true, then the content is one space character, and the Content-Length is 1, and Lighttpd defiantly refuses to fix it. So you either have to work around the rails bug, or upgrade to lighttpd version 1.5.x (now in release candidate) which supposedly works (I haven't tested it—I can't get it to compile on Leopard). I say bug in rails, but frankly I'm more inclined to consider this bad behavior on the part of lighttpd. In that vein, here is a patch for lighttpd version 1.4.18 that will enable both X-LIGHTTPD-send-file and X-Sendfile headers with rails 1.2.3 which has the Content-Length resetting behavior. It makes lighttpd set the Content-Length on its own. Thanks to stbuehler for the patch.

--- src/mod_fastcgi.c.orig      2007-11-05 13:52:47.000000000 -0700
+++ src/mod_fastcgi.c   2007-11-05 13:55:17.000000000 -0700
@@ -2530,22 +2530,28 @@
                }

                if (host->allow_xsendfile &&
-                                   NULL != (ds = (data_string *) array_get_element(con->response.headers, "X-LIGHTTPD-send-file"))) {
+                                   ((NULL != (ds = (data_string *) array_get_element(con->response.headers, "X-LIGHTTPD-send-file")))
+                                     || (NULL != (ds = (data_string *) array_get_element(con->response.headers, "X-Sendfile"))))) {
                    stat_cache_entry *sce;

                                         if (HANDLER_ERROR != stat_cache_get_entry(srv, con, ds->value, &sce)) {
-                                               /* found */
-                                                con->parsed_response &= ~HTTP_CONTENT_LENGTH;
-
+                                               data_string *dcls = data_string_init();
+                                                /* found */
                        http_chunk_append_file(srv, con, ds->value, 0, sce->st.st_size);
                        hctx->send_content_body = 0; /* ignore the content */
                        joblist_append(srv, con);
-                                       }
-                                        else
-                                        {
-                                               log_error_write(srv, __FILE__, __LINE__, "sb",
-                                                       "send-file error: couldn't get stat_cache entry for:",
-                                                       ds->value);
+
+                                               buffer_copy_string_len(dcls->key, "Content-Length", sizeof("Content-Length")-1);
+                                               buffer_copy_long(dcls->value, sce->st.st_size);
+                                               dcls = (data_string*) array_replace(con->response.headers, (data_unset *)dcls);
+                                               if (dcls) dcls->free((data_unset*)dcls);
+
+                                               con->parsed_response |= HTTP_CONTENT_LENGTH;
+                                               con->response.content_length = sce->st.st_size;
+                                       } else {
+                                               log_error_write(srv, __FILE__, __LINE__, "sb",
+                                                       "send-file error: couldn't get stat_cache entry for:",
+                                                       ds->value);
                                         }
                }
--- src/response.c.orig 2007-11-05 14:08:26.000000000 -0700
+++ src/response.c      2007-11-05 14:04:49.000000000 -0700
@@ -59,7 +59,8 @@
    ds = (data_string *)con->response.headers->data[i];

    if (ds->value->used && ds->key->used &&
-                   0 != strncmp(ds->key->ptr, "X-LIGHTTPD-", sizeof("X-LIGHTTPD-") - 1)) {
+                   0 != strncmp(ds->key->ptr, "X-LIGHTTPD-", sizeof("X-LIGHTTPD-") - 1) &&
+                   0 != strncmp(ds->key->ptr, "X-Sendfile", sizeof("X-Sendfile") - 1)) {
            if (buffer_is_equal_string(ds->key, CONST_STR_LEN("Date"))) have_date = 1;
            if (buffer_is_equal_string(ds->key, CONST_STR_LEN("Server"))) have_server = 1;

Then, you need to configure your lighttpd server. Run script/server lighttpd once to generate config/lighttpd.conf, and add this bit to the fastcgi.server section:

    "allow-x-send-file" => "enable"

Finally, use it—either by setting the X-Sendfile header manually or by using the rails x_send_file plugin (I recommend the latter).

November 2, 2007

Dennis Muhlestein
nonic
All My Brain
» Fixing Slow Resizing of Windows with Compiz and Emerald

One of the 1st things I noticed after upgrading to AIGLX with the new ATI drivers was that window resizing was incredibly slow. A quick search on Google yielded a LOT of results for the same problem. The first thing I noticed however, was that they were OLD forum threads. They did [...]

October 14, 2007

Dennis Muhlestein
nonic
All My Brain
» Customize your laptop speed for temperature and performance

A while ago, I found a great article on Slashdot that shows how Windows XP manages variable speed CPUs. Well, at least it applies to Intel Speedstep technology. If you have an Intel processor (like the Core 2 Duo T7200 in my laptop), you can take full advantage of the different CPU frequency [...]

December 3, 2007

Pat Eyler
pate
On Ruby
» JRuby 1.0.1 - Real World Performance

The other day, Thomas Enebo and the JRuby gang cut a 1.0.1 release of JRuby and I finally got around to benchmarking it against my LogWatchR app. I used the same data set and Ruby versions as previously (I need to upgrade my 1.9.0 install to see how that work's been going, but it won't happen today). (You can see the previous version of this benchmark here.) This time around, JRuby showed

April 26, 2007

Pat Eyler
pate
On Ruby
» Some Real World Performance notes on Ruby 1.8 and 1.9

One of the knocks against Antonio's Ruby Performance Shootout is that it uses a synthetic set of tests (the ones Ko1 wrote to exercise the parts of YARV he'd already done some optimization on). In an effort to get a 'more real'

April 25, 2007

Pat Eyler
pate
On Ruby
» More Real World Performance Data

With the new release of JRuby, I decided to rerun my LogWatchR benchmark for ruby 1.8.5p12, ruby 1.9.0, and JRuby 0.9.9. Neither XRuby nor rubinius can run the YAML library from the Standard Library yet, so neither of them will be included in these benchmarks yet. In running this set of tests, I ran into some anomalies with last week’s data. I’ve figured it out now, and need to make a