CHAPTER IV

SEARCH FAILURES IN ONLINE CATALOGS:

A CONCEPTUAL MODEL

4.0 Introduction

The methods by which search failures are studied in document retrieval systems were discussed in Chapter III, along with a critical review of the literature. In this chapter we present a model which enables us to explicate all types of search failures occurring in online catalogs. The concept of "search failure" will be used in its broadest possible sense in this presentation in order to cover the widest possible range of failures.

4.1 Searching and Retrieval Process

In Chapter II, the major components of an online document retrieval system were given as a store of document representations, a population of users, a retrieval rule, and a user interface. The roles of indexing, query formulation, the user interface, and retrieval rules were also explained there in more detail. Moreover, it was pointed out that the search and retrieval process takes place by matching the query term(s) entered by users against the document representations on the basis of the retrieval rule.

Searching for and retrieval of information is inherently a complex process. Borgman (1986, p.388) summarizes this process as follows:

It involves the articulation of an information need, often ambiguous, into precise words and relationships that match the structure of the system (either manual or automated) being searched. In an automated environment, the user must apply two types of knowledge: knowledge of the mechanical aspects of searching (syntax and semantics of entering search terms, structuring a search, and negotiating through the system) and knowledge of the conceptual aspects (the `how' and `why' of searching --when to use which access point, ways to narrow and broaden search results, alternative search paths, distinguishing between no matches due to a search error and no matches because the item is not in the database, and so on).

Users have to make decisions and relevance judgments during the search and retrieval process. Although it is the users who must initiate actions in most cases, they may not have total control over, nor understanding of, all the steps that have to take place in this process. For instance, users are confined to the capabilities of the search and retrieval subsystem. Furthermore, the assumptions the users make and the background knowledge they possess may not always help them in their search endeavor.

In the following sections we first present a model to describe search failures in online catalogs. We then discuss in detail the types of search failures that occur in online catalogs along with the description of what causes them and why they occur.

4.2 Search Failures in Online Catalogs: A Conceptual Model

It was pointed out in Chapter III that a considerable percentage of online catalog searches fail to satisfy users' information needs. In order to perform successful online catalog searches, users have to engage in an intellectual undertaking. They have to overcome numerous hurdles before they retrieve relevant records. Some hurdles are easier to conquer than others; some may be invisible to experienced users, while users may have no control over others. Users who manage to overcome all the hurdles are rewarded with successful search results.

A successful online catalog search process can be likened to climbing a ladder with uneven steps, each step representing a likely place where search failures may occur. To put it differently, each step can be a stumbling block for unprepared users. Figure 4.1 depicts such a ladder with four uneven steps, where the size of each step is arbitrary. The degree of difficulty encountered at each step may change from search to search and from one user to another.

Figure 4.1 Categorization of Search Failures in Online Catalogs

Failures Caused by Faulty Query Formulation
- ill-articulated queries
- scope (broad vs. specific queries)
- incomplete query formulation
- insufficient help
- query language (natural vs. Boolean)

Failures Caused by User Interfaces and Mechanical Failures
- failures caused by character-based, menu-driven, touch-screen, fill-in-the-blank, and graphical user interfaces
- parsing failures (natural language vs. Boolean queries)
- lack of on-screen help
- cluttered screen layout
- unclear and context insensitive error messages
- mechanical mistakes
- misspelling and mistyping errors
- logon failures

Failures Caused by Retrieval Rules, Stemming and Clustering Algorithms
- failures caused by use of extensive lists of stop words
- stemming algorithm failures
- clustering failures
- failures caused by retrieval rules (e.g., Boolean vs. probabilistic)
- ranking failures
- precision failures
- recall failures
- fallout failures
- failures due to relevance feedback methods

Failures Caused by Ineffective Retrieval Results
- zero retrievals
- too much information (e.g., information overload)
- too little information
- too much information of the wrong kind
- collection failures
- failures due to out-of-domain search queries
- vocabulary mismatches
- indexing failures (e.g., specificity or exhaustivity of indexing)
- false drops

As Fig. 4.1 suggests, search failures in online catalogs can be categorized under four broad groups:

1) faulty query formulation;

2) inadequate user interfaces and mechanical failures;

3) retrieval rules; and

4) ineffective retrieval results.

Each category of search failures is discussed below. The discussion follows the logical progression of a hypothetical search query and points out the possible places where events leading to search failures may occur. First, however, we define the major types of search failures.

4.3 Failures Caused by Faulty Query Formulation

The first step in the ladder is the formulation of a formal search query. Several factors play significant roles in formulating successful search statements: the user's background knowledge on the topic for which more information is sought, the document database in use, the query languages available to interrogate the database, and the retrieval rules.

A search query may fail to retrieve records if the search statement contains errors or if it does not describe the user's information need adequately. Typographical errors in search statements or vague, incomplete, too specific or too broad search queries are examples.

Online catalog studies carried out in the past have shown that users experience a wide range of difficulties in formulating their search queries. They have to articulate their search statements and come up with well-thought-out plans so that successful searches can take place. This is not the case for the majority of users, however. Most users formulate their search queries "on-the-fly" and type in whatever pops into their minds (Markey, 1984, p.70). In some cases they tend to enter incomplete search queries or queries which may not necessarily reflect their real information needs.

The scope of a search query does not always indicate whether the user is interested in a broad or a specific search on a given topic. Users often issue broad queries and then indicate that they were looking for more specific sources. As they do not know much about collection characteristics, it is understandable to a certain extent that users issue broad queries, since they are initially concerned with retrieving "something." Yet very few attempt to revise their original queries.

"Scope failures" also occur when users approach the online catalog unaware of its capabilities. For instance, if the system offers no subject or call number searching and the user attempts to perform this type of searching, the search query will fail. Similarly, if the database contains only monographs and the user expects to retrieve periodical articles, a scope failure will occur.

The availability of printed manuals or on-screen instructions on how to formulate a successful search query tends to improve things very little. This is not surprising in that very few users are aware of, or regularly use, such tools. It is open to conjecture whether the poor design of help screens or manual instructions has something to do with this low level of use.

4.4 Failures Caused by User Interfaces and Mechanical Failures

Failures occurring in the course of communicating with the system through user interfaces (e.g., entering and modifying search queries, displaying results) constitute the second category of search failures in online catalogs. Failures due to the user interface of the online catalog stem from the nature of the interface; the nature of the dialog; and the availability of on-screen, context sensitive assistance through the user interface. In other words, interface failures occur when the interface gets in the user's way and prevents the user from finding what he or she is looking for in the online catalog.

4.4.1 Failures Caused by Menu-Driven and Touch-Screen User Interfaces

When interfaces involving touch screens are used, the dialog is extremely rigid because there is no flexibility for the user to enter anything but what is on the screen. Searching becomes tedious as the database size grows, and users cannot issue search queries involving more than one concept at a time using touch-sensitive screens. Menu-driven interfaces are the preferred query entry mode of novice users who are either unable or unwilling to invest time to learn the command language of the online catalog. Yet menu systems offer fewer capabilities than command languages, and users may not be able to enter complex search queries involving the use of more than one index (e.g., a combined author and title search).

4.4.2 Failures Caused by Command Language Interfaces

Interfaces based on command languages, where users have to formulate their search queries by complying with the strict syntactic rules of the command language, are probably the most common method of interrogating online catalogs. Mastering the use of any character-based command language requires some understanding of the various commands along with their capabilities. The functionality of the online catalog can only be tapped when the user knows which command to use and how to use it.

4.4.2.1 Failures Caused by the Parsing Process

The search query entered by the user is "parsed" by the system in order to identify the components of the search statement (e.g., command, index type, search terms, Boolean operators), which enables the system to `understand' what the user is trying to do and thus to take the needed actions. The parsing process in character-based interfaces causes several search failures, however. Unless the search terms are entered in observance of the rigid syntax rules, the parser cannot identify the components of the search query accurately (e.g., command, index to be searched, and search terms). A considerable percentage of search queries submitted to online catalogs through such interfaces fails because users are often unaware that the components of a search query must be entered in an exact predetermined order. Otherwise, the interface produces an error message, which may not necessarily be intelligible to the novice user. When this happens, users simply re-enter the same search query without thinking about why the search failed. In fact, such search failures occur frequently in online catalogs with command languages. For instance, almost 10% of all commands that users submitted to a large multi-campus online catalog contained mechanical errors. Of these, almost 50% can be considered parser failures in which the system did not recognize the search statements because the components of the search query were not entered in the predetermined order (i.e., the first word of the command (e.g., FIND) was invalid in 32% of the cases, and an invalid index name was used in 15.3% of the cases).
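
To illustrate how such a rigid parser can trip on word order, the sketch below (a hypothetical FIND-style syntax, not any particular catalog's parser) accepts a query only when the command, the index name, and the search terms appear in that exact order:

```python
# Hypothetical sketch of a rigid command-language parser: the command word,
# then the index name, then the search terms must appear in this exact order.
VALID_COMMANDS = {"FIND", "DISPLAY"}             # assumed command set
VALID_INDEXES = {"AUTHOR", "TITLE", "SUBJECT"}   # assumed index names

def parse(query: str):
    tokens = query.strip().split()
    if not tokens or tokens[0].upper() not in VALID_COMMANDS:
        raise ValueError("INVALID COMMAND")      # e.g., query starts with the index name
    if len(tokens) < 2 or tokens[1].upper() not in VALID_INDEXES:
        raise ValueError("INVALID INDEX NAME")   # e.g., index name given after the terms
    return {"command": tokens[0].upper(), "index": tokens[1].upper(), "terms": tokens[2:]}

for q in ("FIND SUBJECT dogs", "SUBJECT FIND dogs"):
    try:
        print(parse(q))
    except ValueError as err:
        print(q, "->", err)   # the second query fails only because of word order
```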

4.4.2.1.1 Boolean Searching

When a user must combine search terms to form a query, many errors can be introduced. They include errors related to a complex syntax, errors related to the meaning of the Boolean operators (AND, OR, and NOT), and errors stemming from ambiguous parsing of Boolean search requests. More often than not, users do not know how to use Boolean operators correctly, nor how Boolean operators may actually affect their search outcome. Most users are unaware of the "implied" Boolean AND operator that is applied when the search query contains more than one search term (or concept).

Users can be shielded, to a certain extent, from the difficulties with regard to using Boolean operators as part of the query language syntax. Some systems allow users to enter their search queries by filling in the blanks. For example, if a user wishes to find all the titles written by a given author on a given subject, the author name can be entered in the author `field' and the subject in the subject `field.' Thus, the system combines these two pieces of information and performs a Boolean search. Note, however, that search failures may still occur in systems with the fill-in-the-blank-type user interfaces due to inherent problems with Boolean logic (see section 4.6).
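
As a rough illustration of what such a fill-in-the-blank search amounts to internally (the field names and postings below are hypothetical), the author and subject entries are resolved to record sets and intersected, i.e., an implied Boolean AND:

```python
# Minimal sketch: a fill-in-the-blank search is an implied Boolean AND,
# i.e., an intersection of the record sets matching each filled-in field.
author_index = {"lakoff": {1, 7}, "borgman": {3}}                 # hypothetical postings
subject_index = {"categorization": {1, 9}, "online catalogs": {3, 4}}

def fill_in_the_blank_search(author=None, subject=None):
    sets = []
    if author:
        sets.append(author_index.get(author.lower(), set()))
    if subject:
        sets.append(subject_index.get(subject.lower(), set()))
    return set.intersection(*sets) if sets else set()

print(fill_in_the_blank_search(author="lakoff", subject="categorization"))  # {1}
```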

Fill-in-the-blank-type search query entry mode is more commonly used in CD-ROM databases. Most online catalogs have yet to allow their users to manipulate the full screen, rather than a single line, to enter their search queries.

4.4.3 Failures Caused by Natural Language Query Interfaces

More recently, some third generation online catalogs with more advanced retrieval techniques began to allow users to enter their search queries in natural language (Doszkocs, 1983; Mitev et al., 1985; Larson, 1989, 1991a). That is to say, the user is not bound by the syntax of a command language or the use of menus or Boolean operators: he or she simply describes the search query using the preferred terms. The lack of a formal query language syntax, it is believed, makes it easier for the user to express the search query effectively. It was pointed out earlier that users unaware of the existence of the command language may keep re-entering the same query despite the error message. Such users may benefit from natural language user interfaces, where simply typing in the search statement will suffice to retrieve some records from the database in most cases. Moreover, users who tend to type in whatever pops into their minds may also benefit from entering their search queries in natural language form.

Search failures still occur in full-text systems and online catalogs with natural language user interfaces. The main cause of search failures in natural language user interfaces is their lack of natural language understanding capabilities. Retrieval-worthy search terms are often ignored because the parser cannot distinguish them properly. For instance, a search query such as "I'm interested in books on user interfaces but not graphical user interfaces" might retrieve books on user interfaces as well as graphical user interfaces, for the parser may not have any understanding of what "not" means in natural language (Krovetz & Croft, 1992, p.128).
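
As a crude illustration (not a description of any particular catalog's parser), the sketch below extracts content words from the example query above and discards "not" along with other function words, so the user's exclusion never reaches the retrieval rule:

```python
# Naive keyword extraction: "not" is discarded along with other function
# words, so the exclusion in the query is silently lost.
NOISE = {"i'm", "i", "am", "interested", "in", "books", "on", "but", "not"}  # illustrative list

def extract_terms(query: str):
    return [w for w in query.lower().split() if w not in NOISE]

q = "I'm interested in books on user interfaces but not graphical user interfaces"
print(extract_terms(q))
# ['user', 'interfaces', 'graphical', 'user', 'interfaces'] -- the query now
# matches graphical user interfaces just as well as the books the user wanted.
```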

Third generation online catalogs with natural language interfaces do not, unlike second generation online catalogs, provide a wide variety of search options such as author/title searching or Boolean searching, although they handle topical search queries fairly effectively.

From the users' vantage point, there appear to be some differences in formulating and submitting search queries to online catalogs through different types of interfaces. As Buchanan (1992) pointed out, it is not "self-evident that users will submit exactly the same set of keywords to an interface that invites `natural language' input as to an interface that requires Boolean set construction." As we shall see later (section 4.6), retrieval and display rules may also have some impact on the choice of the user interface.

At present, users are constrained by a single type of user interface on a given online catalog. Even though more than one type of user interface may exist for the same online catalog (e.g., menu-driven vs. command language), they do not necessarily co-exist on the same system for the user to select from. The availability of several different user interfaces on different systems makes the user's task more difficult, since users may not be familiar with all types of user interfaces. An experienced user of a command language with Boolean capabilities may have problems entering his or her query in an online catalog which accepts queries in natural language form. In other words, searching skills acquired using one type of interface may not necessarily be transferable to other types, which is likely to cause an increase in the number of search failures. It is expected that future online catalogs will be equipped with "mapping" facilities to convert a query from one user interface mode to another.

4.4.4 Failures Caused by Mechanical Errors

It was pointed out earlier that commands that users enter tend to contain mechanical errors. In fact, a majority of such commands, unless intercepted by the user interface, retrieve nothing. Search failures due to mechanical errors are very common and not limited to the ones committed during the query entry process. Such errors may occur anywhere during the retrieval process: from logon procedures to entering commands, from displaying search results to interpreting system prompts.

In the simplest case, users may not know how to perform a search or how to proceed. Clearly, they may be aware of the fact that "one has to tell the system something in order to get anything out" (Bates, 1989b, p.405). Yet, what it is they should do may not be clear. If this is the case, users tend to make a lot of mechanical errors. Unless the user interface provides some clues as to what the next step is, they may feel helpless. In such instances, needless to say, search queries often fail to retrieve any records from the database.

Despite the availability of on-screen help, users may have difficulty figuring out what advice they are given and what they are supposed to do. To put it differently, help screens provided by user interfaces are often not "context sensitive." They offer "boiler plate" explanations and tend to read like an "essay." More often than not such explanations fill the full screen. They are usually ignored by the users because of the poor display layout.

It goes without saying that clear and understandable explanations offered by the user interface system as to the use of each command enable users to improve their expertise in using more advanced features of the online catalogs. In general, very few commands are used regularly by the users, which means that the full functionality of a given online catalog remains unexplored for most users. If the user interface provides help with the more advanced features available in the system, search success will also increase.

Poorly-worded error messages may also discourage some users because few users would be pleased to be reminded that they made an error. Judgmental error messages that offer no further help are especially damaging in that errors, especially mechanical ones, tend to beget new errors, sometimes compelling users to abandon their searches (Penniman, 1975a, 1975b; Penniman & Dominic, 1980; Cooper, M., 1991).

Misspellings and typographical errors account for a considerable percentage of search failures occurring in online catalogs. Such search failures can be avoided provided that simple spell-checking programs are available in the user interface.
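
A minimal sketch of what such a spell-checking aid might look like, assuming the catalog can compare a failed query term against its index vocabulary (the vocabulary and the similarity threshold here are illustrative only):

```python
# Minimal sketch: suggest index terms that closely resemble a misspelled query term.
import difflib

INDEX_VOCABULARY = ["catalog", "cataloging", "librarianship", "classification"]  # illustrative

def suggest(term: str, vocabulary=INDEX_VOCABULARY, n=3):
    # difflib ranks vocabulary entries by string similarity to the query term
    return difflib.get_close_matches(term.lower(), vocabulary, n=n, cutoff=0.8)

print(suggest("catalgo"))   # ['catalog'] -- a likely transposition error
print(suggest("xyz"))       # []          -- nothing close enough; report zero matches as usual
```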

Several types of search failures occur due to awkward user interfaces. Users may not necessarily know how to formulate search queries or how the retrieval rules work. They may not necessarily be aware of the database characteristics, system limitations, and many other things. Yet they may still be able to perform successful searches provided they get help from the user interface.

Online catalogs are used by people with a wide range of skills. Some know nothing about their search topics while others know nothing about the online catalog in use, or vice versa. Some are first-time users while others are experienced in online searching. A well-designed user interface is one which accommodates the needs of all types of users and guides them step by step if necessary. It reveals the structure of the system as users gain experience so that they can understand what the outcomes of their actions will be.

Some argue that if the user interface is intuitive and "user-friendly," an inexperienced user should be able to figure out how to use the system and get results in about 10 minutes. There is no doubt that searching online catalogs for information is a complicated task for some users. A context-sensitive user interface takes the level of user expertise (both system and subject expertise) into account when dispensing information or offering help. Online catalog user interfaces claiming to support the needs of all types of users generally offer inferior service to inexperienced users. For instance, novice users who choose the menu-based version of a given interface usually cannot use the more advanced features that are available to experienced users. In such cases, an "easy-to-learn" user interface with context-sensitive help features becomes an extremely useful asset in an online catalog.

To sum up, then, a well-designed user interface is one of the most notable components of an online catalog. It is no exaggeration to suggest that the quality of the user interface often determines the success or failure of searches users perform in online catalogs.

4.5 Retrieval Rules

Failures caused by retrieval rules constitute the third category of search failures. This type of failure occurs when the user is unfamiliar with the search and retrieval logic of the system or when the search statement entered by the user gets misinterpreted by the system. Stemming algorithm failures, failures caused by Boolean and probabilistic retrieval rules, clustering and ranking failures, precision, recall and fallout failures can be grouped under this category.

What takes place in the course of retrieving information from online catalogs is often unknown to the user. In the eyes of most users, an online catalog is often seen as a "black box." They may have an understanding of the main function of the online catalog in terms of matching their search queries with document representations in the database. Yet they may not know what types of activities take place and how retrieval rules are applied.

Various retrieval rules used in document retrieval systems were briefly explained in Chapter II. Search failures caused by constructing Boolean search queries were addressed in section 4.4. The difficulties that the use of Boolean logic imposes on the effective use of document retrieval systems are discussed in the literature (see, for instance, Cooper, W. (1988) and Blair & Maron (1985)). In view of the inherent difficulties in its use, Boolean logic has been referred to as "the Curse of Boole" (Bing, 1987). In a recent survey conducted at Indiana State University Libraries, 40% of the respondents did not answer one of the questions on Boolean searching. Their comments indicated that "they did not know what Boolean operators are, and it is likely that some of the respondents who did answer the question did not know much about Boolean operators" (Ensor, 1992, p.215). Such figures seem to indicate that Boolean logic as a retrieval rule is responsible, to a certain extent, for search failures occurring in online catalogs.

As discussed in Chapter III, precision, recall and fallout failures occur in document retrieval systems. To reiterate, precision failures occur when the system fails to retrieve only relevant documents, whereas recall failures occur when the system fails to retrieve all relevant documents. Fallout failures occur when the system retrieves a large number of nonrelevant documents (i.e., false drops).
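
Stated in set terms, the three measures are straightforward; the sketch below computes them for a single hypothetical search outcome:

```python
# Minimal sketch of the three measures for one search (hypothetical figures).
def precision(retrieved, relevant):
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

def recall(retrieved, relevant):
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0

def fallout(retrieved, relevant, collection):
    nonrelevant = collection - relevant
    return len(retrieved & nonrelevant) / len(nonrelevant) if nonrelevant else 0.0

collection = set(range(1, 101))          # 100 records in the database
relevant = {1, 2, 3, 4, 5}               # 5 of them are relevant to the query
retrieved = {1, 2, 6, 7}                 # the search retrieved 4 records

print(precision(retrieved, relevant))               # 0.5   -- half of what was retrieved is relevant
print(recall(retrieved, relevant))                  # 0.4   -- 2 of the 5 relevant records were found
print(fallout(retrieved, relevant, collection))     # ~0.02 -- 2 of the 95 nonrelevant records retrieved
```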

Search failures also occur in document retrieval systems where probabilistic retrieval rules are applied. In fact, one of the major objectives of the present study is to document search failures that occur in a probabilistic document retrieval system. The results of our analysis are presented in detail in the following chapters. Suffice to say here that all types of search failures mentioned in this chapter may also be encountered in probabilistic document retrieval systems. In addition, we mention three types of search failures, caused by clustering, ranking, and relevance feedback techniques, that occur primarily in probabilistic online catalogs.

Document clustering techniques were briefly discussed in Chapter II (section 2.7.1). Some probabilistic online catalogs preprocess search queries by clustering similar records together and presenting contextual information to the user before they actually retrieve individual bibliographic records. Contextual information in this case can be the subject headings and classification numbers attached to the documents. Thus, the user is able to identify the clusters that match his or her query best, thereby eliminating the ones that might otherwise retrieve useless bibliographic records.
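
A rough sketch of this preprocessing step, assuming each candidate record carries a subject heading and a classification number (the grouping key and the sample records are illustrative only):

```python
# Minimal sketch: group candidate records by subject heading and show the
# clusters (contextual information) before displaying individual records.
from collections import defaultdict

candidates = [  # hypothetical records matching the query terms
    {"id": 1, "subject": "Sepulchral monuments", "class_no": "NB1800"},
    {"id": 2, "subject": "Sepulchral monuments", "class_no": "NB1810"},
    {"id": 3, "subject": "Mummies",              "class_no": "DT62"},
]

def cluster_by_subject(records):
    clusters = defaultdict(list)
    for rec in records:
        clusters[rec["subject"]].append(rec["id"])
    return clusters

for heading, ids in cluster_by_subject(candidates).items():
    print(f"{heading}: {len(ids)} record(s)")   # the user picks a cluster, not a record
```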

The success of the clustering process depends on how well the search statement describes the user's information needs and how well the clustering technique works. If the search statement fails to describe correctly what the user is looking for, the retrieved clusters may not be very relevant. On the other hand, if, despite the correct query formulation, the system retrieves broad, unpromising, ambiguous cluster records, this may further confuse the user as to the function of the clustering and the overall retrieval process.

Cluster failures, then, occur when the user selects none of the retrieved cluster records as relevant, since the query expansion is based on the subject headings and classification numbers extracted from the selected records.

Ranking failures occur, on the other hand, when less promising records are presented at the top of the list due to imprecise query description or term weighting formulae used in probabilistic online catalogs. Users quickly reach their "futility points" and give up displaying records once they encounter nonrelevant records in the retrieved list. Failures occurring during relevance feedback searches are somewhat similar to ranking failures in that retrieved records that are based on user feedback may not necessarily be what the user wants. In fact, research that was carried out on a probabilistic online catalog with relevance feedback mechanism showed that there was a high proportion of false drops among the records retrieved during relevance feedback searches. "The reason appeared to be connected with the fact that too many irrelevant terms were being used in the feedback" (Walker, S. & Hancock-Beaulieu, 1991, p.62).
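
The sketch below outlines, with hypothetical records and a deliberately simple expansion rule, how such a feedback search can drift: the query is expanded with terms drawn from the records the user marked relevant, and incidental terms in those records are carried along:

```python
# Minimal sketch: expand the query with terms taken from records judged relevant.
# Incidental terms in those records ("congresses", "periodicals") get added too,
# which is one way feedback searches pick up false drops.
def expand_query(query_terms, relevant_records, max_new_terms=5):
    candidate_terms = []
    for rec in relevant_records:
        candidate_terms.extend(t for t in rec["terms"] if t not in query_terms)
    return list(query_terms) + candidate_terms[:max_new_terms]

query = ["information", "retrieval"]
marked_relevant = [
    {"id": 12, "terms": ["information", "retrieval", "congresses"]},
    {"id": 31, "terms": ["information", "retrieval", "periodicals"]},
]
print(expand_query(query, marked_relevant))
# ['information', 'retrieval', 'congresses', 'periodicals']
```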

The question to ask at this point is how much impact the users' system knowledge (of mechanical aspects as well as the retrieval rules) has on search failures. It is generally believed that users' search success in online catalogs depends very much on their previous experience. They feel in control when they know what they are doing, even though they may still not know much about how it is that the system retrieves what it actually retrieves.

Experienced users tend to question the search results more often. They modify their search queries depending on the search outcome and adapt to the system easily (since most, if not all, current online catalogs cannot adapt, or be adapted, according to the needs of different types of users). This is certainly an important factor in reducing search failures.

Conversely, users seeing the online catalog as a "black box" tend not to question the search results. They trust the system and readily accept the results. For instance, if they do not succeed in retrieving anything on their first try for various reasons (e.g., zero retrieval, misspelling), they conclude very quickly that the items they seek are not listed in the catalog. Such confidence in the online catalog may sometimes work against their finding the desired records in the database.

However, that does not necessarily mean that all search queries issued by inexperienced users should be seen as potential failures. For instance, an inexperienced user looking for books on domestic animals may be tempted to enter a query such as "FIND SUBJECT DOGS CATS PARROTS" if he or she is unaware of how the (implied) Boolean AND may affect the search outcome. If all the user wants is a book that talks about all three animals listed in the query, there is nothing wrong with it. Yet, given current indexing practices, the probability of a book being assigned all three index terms is slim, and the query may well retrieve nothing.

It is also likely that the user may not know the difference between the Boolean AND and OR operators. This user may be surprised, then, to learn that there are no books on any of those animals. Similarly, the use of "and" in everyday language is quite different from its use as part of the Boolean retrieval rule, which confuses many users (e.g., "welfare AND housing of primates in zoos").

Several activities, which are transparent to users, take place between the submission of the search request and retrieval of the results. Stop lists, stemming algorithms, clustering and ranking techniques that are in use in online catalogs constitute the basic components of the retrieval rule. Each component may affect the search outcome either directly or indirectly.

The search terms entered by the users are subject to preprocessing in most, if not all, online catalogs prior to the application of the retrieval rule. Excluding "stop" words from the search queries through stop lists is the most commonly applied preprocessing activity in online catalogs. Function words such as "the," "of" and "to" are eliminated from queries because such words are not retrieval-worthy. Most quantifiers such as "all," "few," "little" and "much" are eliminated because they "are not helpful as indicators of word relation." Similarly, general words such as "above," "again," "always," and "already" are also excluded from the search queries as they "have no technical meaning within the subject domain" (Vickery & Vickery, 1992, pp.261-262).

The use of stop lists usually does not impair search results. Yet there may be situations in which users would have liked to be aware of the existence of such a stop list. For instance, a naïve-looking search query consisting of the title words "to be or not to be" retrieved almost 30,000 records, all sharing the word "be" in their titles, in a large online catalog. On the other hand, the same query used to produce zero results in the early CD-ROM version of the New Oxford English Dictionary, before the problem was fixed, because the text retrieval engine, which used stemming, treated all the words in this query, including "be," as stop words.
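
The contrast between the two outcomes above comes down to which words the stop list absorbs. A toy sketch, with purely illustrative stop lists rather than those of the systems mentioned:

```python
# Minimal sketch: the same title query after two different stop lists.
MODERATE_STOP_LIST = {"to", "or", "not"}            # leaves "be" behind
AGGRESSIVE_STOP_LIST = {"to", "be", "or", "not"}    # removes every word in the query

def remove_stop_words(query, stop_list):
    return [w for w in query.lower().split() if w not in stop_list]

query = "to be or not to be"
print(remove_stop_words(query, MODERATE_STOP_LIST))    # ['be', 'be'] -> thousands of title matches
print(remove_stop_words(query, AGGRESSIVE_STOP_LIST))  # []           -> zero retrievals
```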

Similarly, the stemming algorithms or automatic truncation used in online catalogs may cause search failures. This type of search failure occurs when the stemming algorithm fails to recognize some or all of the search terms correctly. Search terms submitted to the system are stemmed to their root form using weak or strong stemming algorithms to improve the recall rate. However, in some cases increasing recall through stemming may retrieve useless records, thereby cluttering the relevant records with nonrelevant ones. Furthermore, "[t]he stemming rules need to include look up of a table of exceptions, listing words that should not be stemmed, for example analysis, gas, chaos, axes, matrices, mechanics, porous, quantify, rabies" (Vickery & Vickery, 1992, p.262). For example, single character search terms or little-known abbreviations are often ignored.
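
A minimal sketch of weak, suffix-stripping stemming with the kind of exception look-up described above; the suffix rules and the exception list are only illustrative fragments:

```python
# Minimal sketch: strip a few common suffixes, but consult an exception
# table first so that words like "gas" or "analysis" are left untouched.
EXCEPTIONS = {"analysis", "gas", "chaos", "axes", "matrices", "mechanics"}
SUFFIXES = ("ies", "es", "s", "ing", "ed")   # illustrative, not a full rule set

def stem(word: str) -> str:
    w = word.lower()
    if w in EXCEPTIONS or len(w) <= 3:       # too short or explicitly protected
        return w
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w

for w in ("catalogs", "indexing", "gas", "libraries"):
    print(w, "->", stem(w))
# catalogs -> catalog, indexing -> index, gas -> gas, libraries -> librar
```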

Stop lists are used in both second- and third generation online catalogs to eliminate words that are not retrieval-worthy. After excluding stop words, catalogs with Boolean search capabilities accept the search query terms as entered and treat each term as equally retrieval-worthy. Third generation online catalogs accepting queries in natural language form, on the other hand, treat the search terms rather differently. In addition to stemming words to their root form, they index the search query terms in order to identify retrieval-worthy terms. Each retrieval-worthy term is weighted on the basis of search query and document database characteristics, and the retrieval results are ranked. But because most, if not all, third generation online catalogs lack natural language understanding capabilities, they sometimes attribute undue weight to an otherwise useless term in a natural language query, thereby causing search failures. For instance, a user issuing a search query such as "I'd like to see books about . . ." is unlikely to be interested in "books" per se. Yet this term (books) will be regarded as highly retrieval-worthy in a database concentrating on library and information studies.
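
As a rough illustration of the weighting-and-ranking step (the inverse-document-frequency formula and the toy records below are illustrative assumptions, not the formula of any particular catalog), note that without language understanding a query word such as "books" is weighted and matched like any other term:

```python
# Minimal sketch: weight query terms by inverse document frequency and rank
# documents by the sum of the weights of the query terms they contain.
import math

documents = {                                   # hypothetical titles, already tokenized
    1: {"books", "about", "user", "interfaces"},
    2: {"graphical", "user", "interfaces"},
    3: {"books", "about", "cataloging"},
}

def idf(term, docs):
    df = sum(1 for terms in docs.values() if term in terms)
    return math.log((len(docs) + 1) / (df + 1))   # smoothed inverse document frequency

def rank(query_terms, docs):
    scores = {doc_id: sum(idf(t, docs) for t in query_terms if t in terms)
              for doc_id, terms in docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

print(rank({"books", "user", "interfaces"}, documents))  # document 1 ranks first
```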

Most catalogs accept acronyms as regular search terms rather than attempting to replace them with their spelled-out forms. A few catalogs can automatically replace a search query entered as, say, "ALA" with "American Library Association," which would improve search results to a certain degree if the expansion is correctly inferred. However, it may not always be desirable to supply spelled-out forms automatically, especially in large collections covering several disciplines. For instance, "ALA" may also stand for, say, "American Lutherans Association." It is best to consult the user beforehand to prevent such search failures before they occur.
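
A minimal sketch of the consult-the-user approach, assuming a hypothetical table of acronym expansions maintained by the catalog:

```python
# Minimal sketch: expand an acronym only when it is unambiguous; otherwise
# ask the user which expansion is intended before searching.
ACRONYMS = {   # hypothetical expansion table
    "ALA": ["American Library Association", "American Lutherans Association"],
    "LCSH": ["Library of Congress Subject Headings"],
}

def expand_acronym(term: str):
    expansions = ACRONYMS.get(term.upper(), [])
    if len(expansions) == 1:
        return expansions[0]                      # safe to substitute automatically
    if len(expansions) > 1:
        print(f"'{term}' is ambiguous; please choose one of: {expansions}")
        return None                               # defer to the user
    return term                                   # unknown acronym: search as entered

print(expand_acronym("LCSH"))
print(expand_acronym("ALA"))
```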

Synonyms exhibit the same problems as acronyms. Again, the lack of natural language understanding capabilities is the main reason behind search failures caused by synonymity. For instance, a user interested in "The Netherlands" is most likely to be interested in "Holland," too.

So far our discussion of retrieval rules has concentrated on preprocessing activities that may take place before the retrieval of records, such as the use of stop lists, stemming algorithms and clustering techniques. It should be stressed, however, that users play a most significant role in the evaluation of retrieval results. To put it differently, no matter how advanced and sophisticated they may be, retrieval rules in and of themselves cannot affirm the relevance of retrieved records. As Van Rijsbergen (1981, p.40) put it forthrightly, ". . . a retrieval system cannot be all things to all men." We now turn our attention to retrieval results and discuss the types of search failures that take place after the system retrieves some records in response to the user's query.

4.6 Ineffective Retrieval Results

Failures caused by ineffective retrieval results, which make up the last category of search failures in online catalogs (Fig. 4.1), are the most important type for the user.

Peters (1991) defines them as follows:

Failed outcomes can include no information, too little information, too much information, and too much information of the wrong kind (too much noise or too many false hits) (p.90).

4.6.1 Zero Retrievals

Zero retrievals ("no information") occur when the system retrieves nothing in response to the user's query. This may happen due to a variety of reasons: collection failures, mismatch between the user's vocabulary and that of the system, misspellings or typographical errors, to name but a few. Searches that fail to retrieve any information are relatively easy to identify through transaction monitoring.

Zero retrievals due to collection failures occur when the requested item(s) is not owned by the library and thus not listed in the database. Collection failures may occur regardless of the search rule or retrieval mechanism used. Such failures can be minimized through collection development efforts only.

Zero retrievals due to vocabulary mismatch occur when the user-entered search terms fail to match the authority records or controlled vocabulary of the system. There are several ways to prevent such failures. The use of cross-references (see and see also references in LCSH) in controlled vocabularies or authority records, and the creation of a "Superthesaurus" (Bates, 1989b) can be given as examples. Authority control of personal names is relatively straightforward compared to the vocabulary control of subject headings.

Zero retrievals due to misspelled or mistyped query terms occur because such terms do not match the system vocabulary. To minimize these failures, some online catalogs implement semi-automatic spell-checkers to scan the query for mistakes. Yet this is the exception rather than the rule. Most online catalogs accept the user-entered query terms without checking for mistakes or typographical errors.

4.6.2 Collection Failures

Collection failures occur when the database contains no bibliographic records on a given topic, so that a search query retrieves nothing (zero retrieval) or retrieves no relevant records. "Out-of-domain" search queries are not considered collection failures. For instance, a search query on "classification of materials on gay and lesbian studies" can be categorized as a collection failure if it retrieves no relevant records in a library and information studies database, whereas "blood transfusion" would be considered an out-of-domain search query.

4.6.3 Information Overload

Information overload, or too much information, can also cause search failures. In most cases, users are not interested in retrieving all the relevant sources (e.g., high recall) on a certain topic but, presumably, only the good ones (Wilson, 1983). Yet online catalogs cannot distinguish the good ones from not-so-good ones; they retrieve records that fit the user's query description using the retrieval rule provided. They do not contain evaluative information about the items they list, either. Thus each user has to judge by himself or herself whether the sources retrieved are good or not. As the database size grows, retrieved sets can get very large, thereby causing the user either to scan several records or to abandon his or her search.

Various types of search failures may occur due to information overload. In some cases, the user may simply stop after displaying a given number of records without seeing all the records in the retrieved set. This in itself cannot be seen as a search failure: if the user identifies some acceptable records among those displayed, then the search, as far as the user is concerned, is a success. The records displayed, however, may not necessarily be the best ones among those retrieved.

Search failure in this case may occur when the user fails to identify at least some good records among those displayed. It could be that the search query entered is too broad, so that the retrieved records are too general. This happens frequently in online catalogs. If this is the case, the search system cannot be seen as the cause of the failure; rather, the query may have been formulated poorly, or the query term truncated too generously.

As the database size grows, search failures caused by information overload become inevitable because of the skewed distribution of index terms over documents: a relatively small number of broad index terms are assigned to a large number of documents in the database, whereas the majority of index terms are assigned to only a handful of documents. It is not unusual to retrieve thousands of records in large online catalogs for broad subject searches such as "history," "education," or "medicine." Although it may seem unreasonable, such broad queries may describe exactly what a few users want. Yet it is scarcely conceivable that the overwhelming majority of users issuing such broad queries would be interested in seeing, let alone evaluating, all the retrieved records. In fact, some online catalogs have begun to restrict the use of such broad terms in search queries as they are costly and slow down the system for everyone.

Information overload can cause search failures in known-item searches as well as subject searches. There may be too many postings attached to some titles and author names (including corporate authors) (e.g., "Bible" or "Shakespeare"). Obviously, it is easier to deal with information overload occurring in known-item searches than in subject searches.

Zero retrievals and "information overload" are seen as the most compelling types of search failures in online catalogs. Larson (1991c) compares these two types of search failures to the twin monsters Scylla and Charybdis and emphasizes that users must navigate skillfully in online catalogs to avoid "smashing onto Scylla's rock (search failure) and being pulled into Charybdis' whirlpool (information overload)" (p.182). As discussed in Chapter III, existing online catalogs cannot deal successfully with either Scylla or Charybdis. Take, for instance, the information overload problem. In order to reduce information overload in Boolean systems, the search query is usually made more specific by adding terms conjunctively, because doing so reduces the number of retrieved records to a manageable size. But this strategy also excludes many potentially relevant documents from the retrieved set (Blair & Maron, 1985, p.297). In fact, adding search terms liberally (and thus using Boolean AND) may quickly deteriorate the search outcome to the point where the query retrieves nothing (Scylla). Attempting to solve the information overload problem with the Boolean operator AND has been likened to "entering into some sort of Boolean lottery, where it is quite incidental whether he [the user] actually wins a relevant document as a prize" (Bing, 1987, p.200).

Third generation online catalogs with ranked retrievals provide a more sophisticated solution to the information overload problem. As the retrieved sources are ranked according to the degree of match between the search query terms and the titles and subject headings of the documents in the collection, users, it is assumed, can notice the difference between top-ranked and lower-ranked retrievals in terms of their relevance. That is to say, they can discontinue searching once the retrieved records become less and less relevant, safely assuming that records farther down the list contain no more promising ones. This is not the case in Boolean searching, where records at the top and the bottom of the list have an equal chance of being relevant: users cannot assume, as they can in probabilistic systems, that the rest of the retrieved records contain no relevant ones.

Note, however, that information overload caused by broad single-term search queries may not necessarily be handled more successfully by probabilistic online catalogs, either. In other words, a broad, single-term search query such as "history" or "education" would match several records equally well, and the weakly ordered ranked retrievals may not forcefully separate more relevant records from less relevant ones.

4.6.4 Retrieving Too Little Information

Search failures caused by retrieving "too little information" differ slightly from zero retrievals or information overload in that it is difficult for users to determine whether they have retrieved too little information. From the user's vantage point it can be more detrimental in some cases to retrieve "too little information": users may not know that they have missed some relevant documents. Yet the consequences of retrieving too little information as a type of search failure cannot be overlooked, especially in medicine, legal research and patent searching.

Although it takes longer for users to scan retrieved records, "too much information" (high recall) does not hurt as much as "too little information" (low recall) does. Users can always stop scanning records once they find some that are relevant to their needs. Furthermore, online catalogs that rank retrieved records on the basis of their similarity to the search query facilitate the scanning process. It was pointed out in Chapter III that most users do not consider recall failures as search failures because they cannot tell from the retrieved records that they are missing some other relevant ones. In most cases, unretrieved but relevant records would not hurt them if they are satisfied with what they have already seen.

4.6.5 False Drops

False drops occur when the system cannot distinguish the subject domain in which a given search term is being used and thus retrieves all the records indexed under this term. Incorrect term relationships may also cause false drops. Such failures mainly stem from the lack of natural language understanding capabilities in present document retrieval systems.

False drops may clutter the retrieved set of records and cause search failures, especially in broad or vague search queries. The main cause of false drops in online catalogs is that the system has no way of distinguishing records with the same titles or subject headings unless further information is provided by the user. As the retrieval rule is based on simple keyword matching, the system cannot differentiate terms that are semantically unrelated. A user interested in natural disasters like "fires" would be hard-pressed to make the connection when presented with Lakoff's monograph titled Women, Fire, and Dangerous Things (unless the user is either knowledgeable about categorization and psycholinguistics, or speaks Dyirbal, an Aboriginal language of Australia). Similarly, a simple search on the subject "rubbish" may retrieve several unrelated records ranging from rubbish theory to intellectual rubbish to rubbish filtering.

4.6.6 Failures Caused by Indexing Practices and Vocabulary Mismatch

Indexing failures occur when documents are assigned incorrect index terms. They also occur when indexers fail to assign any index terms at all. Indexing failures may also explain some of the search failures attributed to ineffective retrieval results (zero retrievals, too much information, too little information, false drops). Furthermore, precision, recall, and fallout failures are largely caused by indexing practices. For instance, precision failures occur in part when documents are assigned broad index terms, while recall failures occur when relevant documents are not assigned the appropriate index terms.

Indexing practices thus play a significant role in search success. Assigning index terms exhaustively to reduce recall failures, for instance, may cause other types of search failures such as information overload. As discussed earlier, second generation online catalogs with Boolean searching capabilities cannot deal with information overload in large retrieved sets successfully.

Vocabulary failures occur when the user-entered search terms fail to match the system's vocabulary (i.e., the titles and subject headings of the bibliographic records in the database). In other words, the presence or absence of certain index terms and title words may determine whether the user will succeed in his or her search endeavor.

At this point, it is essential to indicate the role of clustering techniques in decreasing not only false drops but also vocabulary mismatches. False drops occur when the system cannot determine the context in which the search terms are used and therefore retrieves nonrelevant records. To prevent false drops, some third generation online catalogs display the context in which the search terms are used before retrieving the bibliographic records. Vocabulary mismatches, on the other hand, occur when the user-selected terms match neither the controlled vocabulary of the system nor the titles of documents in the database. Clustering techniques help users match their terminology with that of the system by checking the occurrence of search terms in both the title and subject indexes. This is advantageous compared to Boolean searching in that the user is automatically given an extra chance to match his or her terms rather than facing zero retrievals due to vocabulary mismatch. For instance, a user performing a subject search under "sarcophagi" in a second generation online catalog will not be well served because the overwhelming majority of such items were cataloged under "sepulchral monuments." In third generation online catalogs, however, this search will retrieve records that have "sarcophagi" in either their titles or subject headings (or both). In other words, the user will be automatically directed to titles which were not cataloged under "sarcophagi" but are nevertheless likely to be relevant (e.g., those cataloged under "sepulchral monuments").
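
The following sketch illustrates the idea in miniature, with hypothetical records: a query term that matches only title words is used to collect the subject headings attached to those titles, leading the user from "sarcophagi" to "sepulchral monuments":

```python
# Minimal sketch: when a query term matches titles but not subject headings,
# gather the headings of the matching titles and search under those as well.
records = [  # hypothetical bibliographic records
    {"id": 1, "title": "roman sarcophagi", "subjects": ["Sepulchral monuments"]},
    {"id": 2, "title": "etruscan tomb sculpture", "subjects": ["Sepulchral monuments"]},
    {"id": 3, "title": "manual of cataloging", "subjects": ["Cataloging"]},
]

def related_subject_headings(term):
    term = term.lower()
    return {s for r in records if term in r["title"].split() for s in r["subjects"]}

def expanded_subject_search(term):
    headings = related_subject_headings(term)          # e.g., {'Sepulchral monuments'}
    return [r["id"] for r in records
            if any(s in headings for s in r["subjects"])]

print(related_subject_headings("sarcophagi"))
print(expanded_subject_search("sarcophagi"))            # [1, 2] -- record 2 found despite no title match
```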

That clustering techniques help reduce search failures due to vocabulary mismatches should be seen as a considerable achievement, for controlled vocabularies such as LCSH have long been criticized for being, among other things, outdated, obscure, biased, and scarcely applied (Chan, 1986a, 1986b). Users experience the most persistent problems with subject searching because of the constraints of controlled vocabularies.

4.7 Summary

Categories of search failures are analyzed by means of a four-step ladder model in this chapter. Each step represents a category of search failures. It was suggested that in order to perform successful searches users have to climb all four steps. The roles of the query formulation process, user interfaces, retrieval rules, and ineffective retrieval results in search failures are thus addressed using the ladder model. The types of search failures discussed under each category are as follows:

1) Failures caused by faulty query formulation: specific or broad search queries; vague search statements; scope failures; lack of on-screen help in the query formulation process.

2) Failures caused by user interfaces and mechanical failures: failures caused by menu-driven, touch-screen, command language, and natural language user interfaces; parsing failures; users' familiarity (or lack thereof) with Boolean searching; poor screen layout or error messages.

3) Failures caused by retrieval rules: Boolean searching failures; precision, recall, and fallout failures; failures caused by clustering, ranking and relevance feedback techniques; failures caused by stemming algorithms; failures caused by the use of acronyms or synonyms.

4) Failures caused by ineffective retrieval results: collection failures; information overload failures; zero retrievals; retrieving too little information; false drops; failures caused by indexing and vocabulary mismatch.
