| Being Specific
Perhaps you've already noticed that our query subject bird* appears in more than 1 million documents (in AltaVista alone). It would be more than a little difficult to review all of those documents in one sitting. THE MOST CRITICAL PROBLEM IN ALL QUERIES IS FINDING THE RIGHT LEVEL OF SPECIFICITY FOR THE SUBJECT QUERY TERM(S). Too broad a keyword specification, and too many results are returned; too narrow a specification, and too few are returned.
All information is classifiable and amenable to structure. We are all familiar with dictionaries, which classify words alphabetically. However, an alphabetical structure is not of much use for query formulation. But there are many other classification schemes used for information which CAN help find the right level, or specificity, for your keywords. A few examples appropriate to our mystery bird search are presented in this topic. Our first example classification presents the structure of the animal kingdom [1]:
As we will see, our initial keyword term of bird* is at least three levels away from where it should be. Using bird* as is would produce massive result sets from the search engines, with virtually no likelihood of finding the information we're looking for.
Another way to classify information is shown by the encyclopedia (the above example is from Microsoft's Encarta 96 [2]; the actual encyclopedia doesn't matter, as we're only illustrating a point). As a very different example, the chart below shows how the word "fast" is placed within the structure of a thesaurus [3]:
As noted, search 'directories' also apply a classification structure to how they organize and present Web sites. The structure of the largest and best known of these directories, Yahoo!, with some 2,000-odd individual categories, is shown on the next page. Like the first animal phylum example above, bird* in the Yahoo! example is about three or four levels away from where our subject keyword should be.
Finding the right level may involve your personal knowledge and experience, doing a preliminary search, or consulting other references. In the case of Jan and the mystery bird, looking through the pictures in a bird book was sufficient to identify the bird seen as a peregrine falcon. The time spent working out how to characterize your subject at the proper level is well spent, as these document counts from AltaVista illustrate:
bird* 1,834,510
By identifying our mystery bird as a peregrine falcon, we've narrowed the search by 99%! Remember, at 30 seconds to 2.5 minutes per document reviewed, the effort spent in zeroing in on the bird of interest has saved us tremendous overall search time.
The critical point about finding the right "level" in your keywords is that words at levels higher than where you should be return far too many results; those at levels lower than where you should be return too few results or none. This happens because "things" at lower levels tend to "roll up" and sum into "things" at higher levels.
Philosophers, epistemologists, taxonomists, linguists and others can argue for centuries about "proper" ways to classify information. That is not our concern. Rather, the point is that keyword objects can be placed into a structure at various levels. Keeping in mind whether your query subject sits at the right level in those structures can bring big benefits in faster, and more accurate, searches.
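To put the time savings in rough numbers, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the bird* count and the review pace of 30 seconds to 2.5 minutes per document come from the text above, while the size of the narrowed result set is an assumption that reads "narrowed by 99%" as "about 1% of the documents remain". The function names are made up for this sketch.

```python
# Figures quoted in the text: AltaVista count for bird* and review pace.
BIRD_DOCS = 1_834_510
MIN_SECONDS_PER_DOC = 30
MAX_SECONDS_PER_DOC = 150  # 2.5 minutes

# Assumption: "narrowed by 99%" means roughly 1% of the documents remain.
REDUCTION = 0.99


def review_hours(doc_count, seconds_per_doc):
    """Hours needed to review doc_count documents at a fixed pace."""
    return doc_count * seconds_per_doc / 3600


def remaining_docs(doc_count, reduction):
    """Documents left after cutting the result set by the given fraction."""
    return round(doc_count * (1 - reduction))


narrowed = remaining_docs(BIRD_DOCS, REDUCTION)
for label, count in (("bird*", BIRD_DOCS), ("narrowed query", narrowed)):
    low = review_hours(count, MIN_SECONDS_PER_DOC)
    high = review_hours(count, MAX_SECONDS_PER_DOC)
    print(f"{label}: {count:,} documents, {low:,.0f} to {high:,.0f} hours")
```

Even at the faster pace, reviewing every bird* document would take thousands of hours, while the narrowed set under this assumption drops to hundreds of hours. In practice you would still read only the top-ranked results, but the ratio shows why a few minutes with a bird book pays off.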
Footnotes:
|