echo "hey, it works" > /dev/null

just enough to be dangerous

It's back to the drawing board for ACCC - theage.com.au


[The Australian Competition and Consumer Commission] claimed Google did not clearly distinguish between regular - "organic" - search results and ads at the top of the results page, which Google calls "Sponsored Links".

Granted, I may not be your average user, but I think this is just wrong-headed on the ACCC's part. First, I think it's obvious they're ads. Second, there are plenty of other dastardly things going on that the ACCC should be focussing on. Google may have a lot to answer for, but I don't think this is one of those things.

Habari love


I hung out in #habari yesterday, and I have to say it was a lot of fun. Everyone was extremely welcoming. I wasn't even allowed to lurk, as freakerz said 'hi' to the stranger, which was great. I'm not sure I would have contributed to the discussion otherwise. Habari will be a better piece of software because of this attitude.

Lessons of stupidity #1


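If you're going to be using Apache rewrites in .htaccess files, don't set AllowOverride All in <Directory /> and then set AllowOverride None in <Directory /usr/local/www/>. The more specific section wins, so every .htaccess file under /usr/local/www/ is silently ignored.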
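
A minimal sketch of that situation, assuming an otherwise stock httpd.conf. The paths are the ones from the lesson; choosing FileInfo as the narrowest override that still lets mod_rewrite directives through is an assumption about what the site needs, not the only possible fix.

```apache
# Broad default: .htaccess files are honoured everywhere...
<Directory />
    AllowOverride All
</Directory>

# ...except that a more specific section overrides it. With "None" here,
# .htaccess files under /usr/local/www/ are never read, so rewrites in
# them do nothing. FileInfo is enough for the mod_rewrite directives.
<Directory /usr/local/www/>
    AllowOverride FileInfo
</Directory>
```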

WordPress 2.3, the Connections theme, and tags


I've just upgraded to WordPress 2.3 and, while the process was pretty painless, for some reason tags weren't working, even after I added tag support to the theme using the_tags() in The Loop. The problem turned out to be that The Loop in the Connections theme uses deprecated function calls. To enable tagging in the Connections theme, edit index.php and find the following code.

<?php if ($posts) : foreach ($posts as $post) : start_wp(); ?> 
    <div class="post">
        <?php require('post.php'); ?>
        <?php comments_template(); // Get wp-comments.php template ?>
    </div>
<?php endforeach; else: ?>
    <p><?php _e('Sorry, no posts matched your criteria.'); ?></p>
<?php endif; ?>

This is The Loop. You need to replace it with this code.

<?php if (have_posts()) : ?>
  <?php while (have_posts()) : the_post(); ?>
    <div class="post">
      <?php require('post.php'); ?>
      <?php comments_template(); // Get wp-comments.php template ?>
    </div>
  <?php endwhile; ?>
<?php else: ?>
  <p><?php _e('Sorry, no posts matched your criteria.'); ?></p>
<?php endif; ?>

You need to make a similar change to single.php, category.php, date.php and search.php.

Finally, you need to edit post.php and add a call to the_tags() somewhere. You can add it anywhere, depending on how you want the post laid out. I added the_tags(', tagged ', ', ', ''); after the call to the_category().

How should AtomPub servers handle bad requests?


freakerz asked an interesting question on the Habari issue tracker.

How should we handle this kind of behavior? If the client does not use the APP properly, should we prevent the loss of information by catching/handling bad requests?

Of course, this is the time to mention Postel's law, "be conservative in what you do, be liberal in what you accept from others." Again, of course, this solves nothing, because there is an enormous amount of argument about what that means for implementers. There's already been a lot of discussion of this in relation to how aggregators should handle broken feeds.

The AtomPub spec does have something to say about how consumers should behave when they receive non-conforming content.

The Atom Protocol imposes few restrictions on the actions of servers. Unless a constraint is specified here, servers can be expected to vary in behavior, in particular around the manipulation of Atom Entries sent by clients. [...] Servers can choose to accept, reject, delay, moderate, censor, reformat, translate, relocate or re-categorize the content submitted to them. [...] The same series of requests to two different publishing sites can result in a different series of HTTP responses, different resulting feeds or different entry contents.

Aside: The anchor to this section is named "lark's vomit." Those wacky spec authors, I did but chuckle.

This doesn't answer the question at all for server implementers. It simply says you can do whatever you like when you receive stuff. The end user doesn't care about any of this; they just don't want to have their content rejected. This is a competitive business, and you will lose clients if a competitor accepts something that you don't. People will switch.

Well-formedness is a minimum, so processing should stop for anything that's not well-formed, but in a nice way that upsets the user as little as possible. So, we're really talking about invalid entries. Should there be specific code for producers that are known to create borked entries? More usefully, like testing for browser behaviour rather than a specific user agent, consumers should do their best to catch classes of errors and deal with them appropriately.

The server should accept all content and fail gracefully when it can't be consumed properly. When broken content is received, server implementers have a responsibility to do everything they possibly can to work with whoever built the producer to fix the problem, but in the meantime they should try to handle the broken content.
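
The split between entries that aren't well-formed and entries that are merely invalid could be sketched roughly like this. It isn't anything from Habari or WordPress: triage_atom_entry() and its 'reject' / 'repair' / 'accept' labels are made up for illustration, and the list of required elements (id, title, updated) comes from the Atom format spec (RFC 4287). It assumes PHP 7 or later with the DOM extension.

```php
<?php
// Hypothetical helper: sorts an incoming Atom entry into reject / repair / accept.
const ATOM_NS = 'http://www.w3.org/2005/Atom';

function triage_atom_entry(string $xml): array
{
    // Not well-formed: stop processing, but say why.
    if (trim($xml) === '') {
        return ['action' => 'reject', 'problems' => ['empty body']];
    }
    $previous = libxml_use_internal_errors(true); // collect parse errors quietly
    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$ok) {
        return ['action' => 'reject', 'problems' => ['not well-formed XML']];
    }

    // Well-formed but invalid: record each class of problem instead of rejecting.
    $problems = [];
    foreach (['id', 'title', 'updated'] as $name) {
        if ($doc->getElementsByTagNameNS(ATOM_NS, $name)->length === 0) {
            $problems[] = "missing atom:$name";
        }
    }
    return ['action' => $problems ? 'repair' : 'accept', 'problems' => $problems];
}
```

A server might answer 'reject' with a 400 and a readable explanation, and for 'repair' fill in what it can (a generated id, the current time for updated) before storing the entry.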

dive into mark » There are no exceptions to Postel’s Law


Another suggestion was that we do away with the Atom autodiscovery <link> element and just use an HTTP header, because parsing HTML is perceived as being hard and parsing HTTP headers is perceived as being simple. This does not work for Bob either, because he has no way to set arbitrary HTTP headers. It also ignores the fact that the HTML specification explicitly states that all HTTP headers can be replicated at the document level with the <meta http-equiv="..."> element. So instead of requiring clients to parse HTML, we should just require them to parse HTTP headers… and HTML.

Implementing WSSE authentication in WordPress


I was excited to learn recently that the Nokia N73 can speak AtomPub, and that a friend of mine owns one. I thought I'd try to make it talk to the new AtomPub implementation in WordPress, but reading through the N73 documentation I found that it only supports WSSE authentication, and WordPress only speaks HTTP Basic Authentication. I'd never heard of WSSE, but Mark Pilgrim has a good write-up on XML.com, and the Ape has the ability to speak WSSE, so I thought I'd implement it in WordPress. Bear in mind that I'm not writing this from a security point of view; I'm just looking at authentication as a necessary evil to get cool AtomPub things working. And there's a spoiler: it can't be done :)

A WSSE client will send an Authorization header which, as we know, will get dropped if Apache is passing the request off to a CGI, and an X-WSSE header that looks like this:

X-WSSE: UsernameToken Username="USERNAME", PasswordDigest="PASSWORDDIGEST", Nonce="NONCE", Created="2007-09-08T05:52:36Z"

PasswordDigest is a base64 encoded SHA1 digest of the concatenation of the nonce, the timestamp and the password. The nonce is of course some random string.
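
As a rough sketch of the client side, this is how such a header could be put together in PHP 7 or later. The function names wsse_digest() and wsse_header() are made up for illustration, as is the 'secret' password; only the digest recipe itself comes from the description above.

```php
<?php
// Digest as described above: base64(sha1(nonce . created . password)).
function wsse_digest(string $nonce, string $created, string $password): string
{
    // sha1(..., true) returns the raw 20-byte digest rather than 40 hex characters.
    return base64_encode(sha1($nonce . $created . $password, true));
}

// Hypothetical client-side helper that assembles the X-WSSE header value.
function wsse_header(string $user, string $password, string $nonce, string $created): string
{
    return sprintf(
        'UsernameToken Username="%s", PasswordDigest="%s", Nonce="%s", Created="%s"',
        $user,
        wsse_digest($nonce, $created, $password),
        $nonce,
        $created
    );
}

$nonce = bin2hex(random_bytes(16));       // some random string
$created = gmdate('Y-m-d\TH:i:s\Z');      // UTC timestamp in the format shown above
echo 'X-WSSE: ' . wsse_header('USERNAME', 'secret', $nonce, $created), "\n";
```

Note that the client needs the plain-text password to do this, and so does anything that wants to check the result.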

So, to add WSSE into WordPress AtomPub, we can add some code to the authentication function in wp-app.php.

First, we check if the client is trying to authenticate using WSSE by looking for an X-WSSE header.

if (isset($_SERVER['HTTP_X_WSSE'])) {
    $username_token = $_SERVER['HTTP_X_WSSE'];

We then take the Username Token contained therein and split out the user, digest, nonce, created information sent by the client. There are probably nicer ways to do this.

    // Start with empty fields; 'password' is filled in on the server later.
    $wsse = array('user' => "", 'digest' => "", 'nonce' => "",
                  'created' => "", 'password' => "");
    // Drop the leading "UsernameToken" and split the rest on ", ".
    $tokens = explode(", ", trim(strstr(stripslashes($username_token), " ")));
    foreach ($tokens as $token) {
        // Split on the first '=' only, since base64 values can end in '='.
        $pivot = strpos($token, '=');
        $key = substr($token, 0, $pivot);
        $value = trim(substr($token, $pivot + 1), '"');
        switch ($key) {
        case "Username":
            $wsse['user'] = $value;
            break;
        case "PasswordDigest":
            $wsse['digest'] = $value;
            break;
        case "Nonce":
            $wsse['nonce'] = $value;
            break;
        case "Created":
            $wsse['created'] = $value;
            break;
        }
    }
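
One possibly nicer way to do the splitting, sketched as a standalone function for PHP 7 or later: a single regular expression that picks out every key="value" pair, so the code no longer depends on the exact ", " separator. parse_wsse_header() is a made-up name, and it keeps the header's own field names as array keys rather than mapping them to 'user', 'digest' and so on.

```php
<?php
// Hypothetical alternative parser: pull every key="value" pair out of the header.
function parse_wsse_header(string $header): array
{
    $fields = [];
    // (\w+) captures the key, ([^"]*) the quoted value; '=' inside a value is fine.
    if (preg_match_all('/(\w+)="([^"]*)"/', $header, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $fields[$m[1]] = $m[2];
        }
    }
    return $fields;
}

$fields = parse_wsse_header(
    'UsernameToken Username="bob", PasswordDigest="qZk+NkcGgWq6PiVxeFDCbJzQ2J0=", '
    . 'Nonce="abc123", Created="2007-09-08T05:52:36Z"'
);
print_r($fields);
```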

Finally, we recreate the digest on the server, and compare it to what was sent, and close the if.

    $wsse['password'] = get_password_by_login($wsse['user']);
    $server_digest = base64_encode(pack("H*", sha1($wsse['nonce'] . $wsse['created'] . $wsse['password'])));
    if ($server_digest == $wsse['digest']) {
        $login_data = array('login' => $wsse['user'], 'password' => $wsse['password']);
    }
}

If you're familiar with WordPress's code, you might be saying something like, "WTF is this get_password_by_login() function call? I've never seen such a thing!" Good question. And the dirty little secret is that no such function exists. A weakness of the WSSE authentication scheme appears to be that, to recalculate the digest, the password needs to be stored in plain text on the server. This is probably at least as bad as sending the password in plain text over the wire, the thing that WSSE is trying to avoid. WordPress, sensibly, does not store passwords in plain text, but computes an MD5 hash of them and stores that.

So, as far as I can tell, there is no way to implement WSSE in WordPress in any sensible way.

One little word on security. If we could implement WSSE, the code should keep track of nonces and make sure they aren't repeated, and should reject UsernameTokens created more than a couple of minutes ago (leaving aside any discussion of synchronisation of your client's clock with my server).
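
Those two checks might look something like the sketch below, again for PHP 7 or later. Both function names are invented, the 120-second window is just one reading of "a couple of minutes", and the $seen array stands in for whatever persistent storage a real server would need to remember nonces between requests.

```php
<?php
// Hypothetical replay guard, part 1: is the Created timestamp recent enough?
function wsse_is_fresh(string $created, int $now, int $max_age = 120): bool
{
    $ts = strtotime($created);   // parses timestamps such as 2007-09-08T05:52:36Z
    if ($ts === false) {
        return false;            // unparseable timestamp: reject
    }
    // Allow the same window in either direction to tolerate modest clock skew.
    return abs($now - $ts) <= $max_age;
}

// Part 2: has this nonce been used before? $seen is an in-memory stand-in
// for persistent storage.
function wsse_nonce_is_new(array &$seen, string $nonce): bool
{
    if (isset($seen[$nonce])) {
        return false;            // replayed token
    }
    $seen[$nonce] = true;
    return true;
}
```

A token would only be accepted if both functions return true, and old nonces could be dropped from storage once they fall outside the freshness window.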

P.S. I hadn't read Joe Cheng's comment or Joseph Scott's reply in the comments of the post I linked to above before I started off on this wild goose chase.

Image search


[Update: A friend from Microsoft says, "The UI switches to a more basic scheme if you're using a browser that's not recognised, for example the Firefox 3 and IE8 betas. What are you using - can this be the explanation?" Ah, indeed it is, no need to panic.]

[Update: Live seems to have turned off all the cool features and is back to boring image search. What happened?]

Of the three main search engines (Google, Yahoo! and Live) Live search is streets ahead when it comes to image search. They also appear to be the only ones who are doing much innovation in this space.

So, why do I say such a thing? I'm not really talking about the quality of their results, because they're all about on a par. Live has a much better user experience. The first thing you notice is that they have done away with the concept of paged results. No more clicking 'Next', you just keep on scrolling down and more images appear. This makes a lot of sense for images for three reasons. First, image search isn't great, so users are much more likely to want to see a lot more results than a normal search for a web page. Lots of clicky clicky next just gets tedious.

Second, a thumbnail is a much better summary of the actual resource than the summaries currently generated by search engines. This means that users are effectively looking at the answer as they scroll through the results. They don't have to click on the result link and go off to evaluate a document and then come back, they can evaluate the image in situ, and therefore evaluate a much larger number of resources.

Third, it's more likely that users are not looking for an individual image, but for different images of the same thing. When they find an image they like, it's quite possible that isn't the end of their search. This brings me to another plus of Live: the scratchpad. When you find an image you like, Live allows you to put that image in a scratchpad by clicking on an image and selecting the appropriate link. Once the scratchpad is open, you can drag and drop to it as well. You can then save the images as collections.

Other interesting things to note include the resizable thumbnail slider and the image size filter (including wallpaper-sized images). Last, try filter:face and filter:portrait.