When you use a website's Search function, you expect it to show you everything that matches the Search terms and settings you give it. It will of course return items that aren't really relevant, but you expect it not to miss anything that is relevant.
Search facilities that miss relevant items are bad for two reasons. First, they mislead people into believing that everything of interest has been found when it hasn't. Second, they erode users' confidence: Web users who see evidence that a website's Search facility misses relevant items lose confidence in it -- and perhaps in the site or company as a whole -- and therefore use it less.
Until recently, Yahoo Weather had this blooper. Searching for "new york city" using its Location Search box produced a list of cities that did not include New York City (see below). New York City couldn't be found by that name.
Fortunately, that flaw has now been corrected.
Newspaper websites seem especially prone to this blooper. One such site is that of the Christian Science Monitor. The print version of the newspaper had a story about vacationing in areas hit by the recent Asian tsunami. Looking for the article on their website on the day the article was printed (Jan. 12), I typed its exact title into the Search box. It converted the title to Search-gibberish code and searched, but found nothing (see below).
However, the article was there (see below). I just had to find it by browsing the current issue. This means that it was not indexed -- for search purposes -- by its own title. Blooper!
The Monitor is only one of many newspaper and magazine sites where the Search function sometimes misses articles when given their exact titles or authors.
Search functions that miss relevant items can also be found at e-commerce websites. One example is GoodGuys.com, an electronics store. They sell cellular phones and phone service, but searching their site for "cell phones" finds nothing (see below).
By browsing the site, one can eventually find that they do offer cell phones (see below).
The solution to this blooper focuses on the back-end -- the servers that store and retrieve data for a website -- rather than on the front-end -- the design and organization of the site's pages.
Overlooked data usually results from one of two problems in a website's database:
One obvious solution to such problems is to index the content more carefully, thoroughly, and consistently. When multiple people add content, they are more likely to index it inconsistently or insufficiently, which diminishes search accuracy. Indexing content carefully, thoroughly, and consistently means adding content -- along with the data required to index it -- to the back-end system in a controlled way.
For example, when adding content-items to the database, designers or content editors should try to anticipate and include, as keywords attached to the items, all the terms people might use to search for them. A site keyword lexicon listing allowable keywords and their meanings -- and indicating which ones are synonyms -- can help content-editors choose consistent, predictable keywords for new content (Rosenfeld and Morville, Information Architecture for the World Wide Web, 2002).
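A keyword lexicon of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon entries, the function names (build_synonym_map, canonical_keyword, tag_item), and the idea of rejecting unknown terms at entry time are all hypothetical, not taken from any particular site or content system.

```python
# Hypothetical lexicon: each allowable keyword maps to the synonyms
# that editors or searchers might use instead of it.
LEXICON = {
    "cell phone": {"cell phones", "cellular phone", "mobile phone", "cellphone"},
    "television": {"tv", "televisions"},
}


def build_synonym_map(lexicon):
    """Map every allowable keyword and every synonym to its canonical keyword."""
    mapping = {}
    for canonical, synonyms in lexicon.items():
        mapping[canonical.lower()] = canonical
        for synonym in synonyms:
            mapping[synonym.lower()] = canonical
    return mapping


SYNONYM_TO_CANONICAL = build_synonym_map(LEXICON)


def canonical_keyword(term):
    """Return the canonical keyword for a term, or None if it is not in the lexicon."""
    cleaned = " ".join(term.lower().split())  # ignore case and extra spaces
    return SYNONYM_TO_CANONICAL.get(cleaned)


def tag_item(proposed_terms):
    """Split an editor's proposed terms into accepted keywords and rejected terms."""
    accepted, rejected = [], []
    for term in proposed_terms:
        keyword = canonical_keyword(term)
        if keyword is None:
            rejected.append(term)  # not in the lexicon: needs review
        elif keyword not in accepted:
            accepted.append(keyword)  # store each canonical keyword once
    return accepted, rejected
```

The same canonical_keyword function could be applied to users' search terms, so that a search for "cell phones" and an item tagged "cellular phone" meet at the same keyword.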
During site design and development, conduct tests to see how typical visitors to the site will look for items. Such testing need not be expensive: early testing can be done very cheaply, without a computer, using questions on paper, such as:
Suppose you wanted to find an article about XYZ. What search-terms would you use to find it?
After a site is in use, it is of course not feasible to pre-test keywords whenever new content is added. However, it is feasible and advisable to evaluate how easily site-users find what they are looking for and what sorts of search terms they use. This can be done either by conducting periodic user-tests, or by having the site monitor usage of its Search function.
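Having the site monitor its own Search function can be as simple as counting searches and noting which ones return nothing. The sketch below assumes a hypothetical SearchLog class that the site's search code calls after every query; the class, its method names, and the sample queries are illustrative only.

```python
from collections import Counter


class SearchLog:
    """Counts searches and tracks which queries returned no results."""

    def __init__(self):
        self.searches = Counter()  # every query, by normalized text
        self.misses = Counter()    # only queries that found nothing

    def record(self, query, result_count):
        """Record one search and how many items it returned."""
        key = " ".join(query.lower().split())  # ignore case and extra spaces
        self.searches[key] += 1
        if result_count == 0:
            self.misses[key] += 1

    def top_misses(self, n=10):
        """Return the n most frequent zero-result queries with their counts."""
        return self.misses.most_common(n)


log = SearchLog()
log.record("cell phones", 0)
log.record("Cell  Phones", 0)
log.record("tv", 12)
print(log.top_misses(5))
```

Frequent zero-result queries are good candidates for new keywords or synonyms, since they show the terms people actually use.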
Another solution is to use stronger search methods. Some methods rely completely on keywords. If keyword-only search misses too much of your site's data, don't rely solely on it. Search the actual text of the content, or at least the title and abstract or lead paragraph. And if a user types partial words as search-terms, find everything that matches them, within reason. The goal is to maximize the Search function's ability to find relevant items without significantly increasing its tendency to return irrelevant ones.
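As a rough illustration of these ideas, the following Python sketch searches each item's title, lead paragraph, and keywords, and accepts partial words as prefixes. Everything in it is an assumption made for the example: the Item structure, the function names, the sample content, and the three-letter minimum (MIN_PREFIX_LEN) used as one possible reading of "within reason". A production site would more likely rely on a full-text search engine.

```python
import re
from dataclasses import dataclass, field

MIN_PREFIX_LEN = 3  # assumption: shorter partial words would match too much


@dataclass
class Item:
    title: str
    lead: str  # abstract or lead paragraph
    keywords: list = field(default_factory=list)


def words(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(item, query):
    """True if every search term matches a word in the title, lead, or keywords."""
    haystack = set(words(item.title) + words(item.lead))
    for keyword in item.keywords:
        haystack.update(words(keyword))
    terms = words(query)
    if not terms:
        return False
    for term in terms:
        if term in haystack:
            continue  # whole-word match
        if len(term) >= MIN_PREFIX_LEN and any(w.startswith(term) for w in haystack):
            continue  # partial-word (prefix) match
        return False
    return True


def search(items, query):
    """Return the items that match all terms in the query."""
    return [item for item in items if matches(item, query)]
```

With this approach, an item titled "Cellular phones and service plans" would be found by a search for "cell phones" even if no editor had attached that exact keyword to it.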