echo "hey, it works" > /dev/null

just enough to be dangerous

Is Linking to Yourself the Future of the Web? - O'Reilly Radar


At the time, I noted the way that more and more information that was once delivered by independent web sites was now being delivered directly by search engines, and that rather than linking out to others, there were strong signs of a trend towards keeping the link flow to themselves.

Is Linking to Yourself the Future of the Web? - O'Reilly Radar

I may exist in a bubble [1], but it seems to me that an opposing tension is emerging. Twitter has been used [2] by many people as a search and question-answering service, as has FriendFeed more recently. Apparently people find things like Digg useful too [3]. By their nature, the organic links in these services are mostly external, though of course both services are somewhat limited by the number of followers someone has.

  1. Yes, I exist in a bubble.
  2. Yes, when it worked, fine.
  3. Though for the life of me, I cannot work out why.

What the hell is that song?


Prediction of the day. Within 5 years--let's say by Christmas 2012--it will be possible to search for a song by playing or singing a bit of the song. People will be able to sing into their computer and be taken to search results where they can play and purchase the song. Or they'll be able to hold their phone up to the radio, and get a text message back with information on the song and where to buy it.

Maybe 5 years is even a bit on the long side.

Search APIs


There's quite a difference between the terms of use of Google's and Yahoo!'s search APIs. While both say you're not allowed to do anything illegal or mission critical (I especially like Yahoo! saying you can't rely on the API if you're operating nuclear facilities), they differ in terms of what you're allowed to do with the results.

Google spends a lot of time saying what you're not allowed to do once you get results: you can only retrieve a small number of results (8 at the moment), you can't use them as the main content on your site, you can't modify them, and you can't use a robot or spider to retrieve them. Possibly fair and reasonable, but for my current (non-commercial, research) project 8 results won't do, and I want to break all those rules.

Yahoo!, on the other hand, says absolutely nothing about what you can't do with the results. All they say is don't whack the server too much.

Image search


[Update: A friend from Microsoft says, "The UI switches to a more basic scheme if you're using a browser that's not recognised, for example the Firefox 3 and IE8 betas. What are you using - can this be the explanation?" Ah, indeed it is, no need to panic.]

[Update: Live seems to have turned off all the cool features and gone back to boring image search. What happened?]

Of the three main search engines (Google, Yahoo! and Live), Live search is streets ahead when it comes to image search. They also appear to be the only ones who are doing much innovation in this space.

So, why do I say such a thing? I'm not really talking about the quality of their results, because they're all about on a par. Live has a much better user experience. The first thing you notice is that they have done away with the concept of paged results. No more clicking 'Next', you just keep on scrolling down and more images appear. This makes a lot of sense for images for three reasons. First, image search isn't great, so users are much more likely to want to see a lot more results than they would for a normal web page search. Lots of clicky clicky next just gets tedious.

Second, a thumbnail is a much better summary of the actual resource than the summaries currently generated by search engines. This means that users are effectively looking at the answer as they scroll through the results. They don't have to click on the result link and go off to evaluate a document and then come back, they can evaluate the image in situ, and therefore evaluate a much larger number of resources.

Third, it's more likely that users are not looking for an individual image, but for different images of the same thing. When they find an image they like, it's quite possible that isn't the end of their search. This brings me to another plus of Live: the scratchpad. When you find an image you like, Live allows you to put that image in a scratchpad by clicking on the image and selecting the appropriate link. Once the scratchpad is open, you can drag and drop to it as well. You can then save the images as collections.

Other interesting things to note include the resizable thumbnail slider and the image size filter (including wallpaper-sized images). Last, try filter:face and filter:portrait.

The system of use is the repository


In the beginning, Web search was based on aspects of the resources being indexed, like term frequency and inverse document frequency. That's of course still important but things really took off when search engines started to treat the Web as a connected graph, placing as much importance on the interconnectedness of the resources as what was in the resources themselves. Google made a huge jump forward in this area with its PageRank algorithm.
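To make the contrast concrete, here is a rough sketch of the two kinds of signal: a score computed from the contents of a document (term frequency times inverse document frequency) and a score computed from the link graph (a simplified PageRank by power iteration). The function names, the toy documents, the toy link graph and the damping value of 0.85 are all illustrative assumptions; this is not how any real search engine is implemented.

```python
import math
from collections import Counter


def tf_idf(term, doc, docs):
    """Score a term using only what is inside the documents themselves."""
    tf = Counter(doc)[term] / len(doc)           # how often the term occurs in this doc
    df = sum(1 for d in docs if term in d)       # how many docs contain the term
    return tf * math.log(len(docs) / df) if df else 0.0


def pagerank(links, damping=0.85, iterations=50):
    """Score pages using only the links between them.

    `links` maps each page to the list of pages it links to. Every link
    target is assumed to also appear as a key of `links`.
    """
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            targets = outlinks or pages          # a page with no outlinks spreads its rank evenly
            share = damping * rank[page] / len(targets)
            for target in targets:
                new[target] += share
        rank = new
    return rank


# Toy data, made up for illustration.
docs = [
    ["ferret", "search", "library"],
    ["ferret", "pet", "care"],
    ["web", "search", "engine"],
]
links = {"a": ["c"], "b": ["c"], "c": ["a"]}

pet_score = tf_idf("pet", docs[1], docs)
ferret_score = tf_idf("ferret", docs[1], docs)
ranks = pagerank(links)
```

In the toy data, "pet" outscores "ferret" for the second document because it is rarer across the collection, while page "c" outranks page "b" purely because other pages link to it, whatever its contents.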

As things stand, search for digital material for teaching and learning (or learning objects or open educational resources) focuses on the resources themselves (or metadata about them), just as early search engines did. These resources are usually stored separately from the systems in which they are used, in disconnected repositories.

Erik Duval and his team are doing some interesting work in this area with what they call attention metadata. It's a good start, but until we do away with repository silos and start using the system of use as the repository, this area of search will move slowly. If this is done, there will be all sorts of other information to use in heuristics to optimise results.

Feeds in search results


I'm finding more and more feeds are being returned from Google. This seems like stupid behaviour to me. You use a search engine to meet an information need. If I want to know something about Ferret, I'm likely to use a search engine to find it. Do I want to subscribe to a feed? No, that's way too much commitment. If I find what I'm after and the site looks interesting I'll poke around a bit and then I might decide it's interesting enough to subscribe to.

The correct behaviour would be for Google to return a link to the site from which the feed originates.
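Feeds generally say which site they belong to, so in principle the mapping is cheap. As a minimal sketch using only the Python standard library, and assuming a well-formed RSS 2.0 or Atom feed, something like the following pulls out the originating site. The function name site_link and the sample feeds are made up for illustration; I have no idea what Google actually does internally.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def site_link(feed_xml):
    """Return the URL of the site a feed belongs to, or None if it declares none."""
    root = ET.fromstring(feed_xml)
    # RSS 2.0 puts the site URL in <rss><channel><link>.
    link = root.findtext("channel/link")
    if link:
        return link.strip()
    # Atom uses <link rel="alternate" href="..."/>; a missing rel means "alternate".
    for element in root.findall(ATOM + "link"):
        if element.get("rel", "alternate") == "alternate":
            return element.get("href")
    return None


# Hypothetical sample feeds.
rss = (
    '<rss version="2.0"><channel><title>Example</title>'
    "<link>http://example.com/</link></channel></rss>"
)
atom = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<link rel="self" href="http://example.org/feed.atom"/>'
    '<link href="http://example.org/"/></feed>'
)

rss_site = site_link(rss)
atom_site = site_link(atom)
```

A search engine could then show the site URL in place of the feed URL whenever a feed matches the query.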