Hacettepe University Department of Library Science
DOK 324 Principles of Information Retrieval (Spring 2001)
Yaşar Tonta
HOMEWORK: WEB SEARCH ENGINES
(Due Date: 25 May 2001 09:15)
Web search engines (Yahoo!, AltaVista, Excite, Lycos, HotBot, etc.) may permit
several different kinds of searches, from a general search for documents with
words in a given list, to searches using a Boolean expression, to searches
constrained within some hierarchy of documents. For each of the following queries, investigate the response of
six different Web search engines: AltaVista,
Excite, HotBot, Infoseek, Northern Light, and Google. For each query
performed on each search engine, you are to calculate the precision ratio (for
example, if you obtained 100 documents from AltaVista for the first query,
reviewed the first 10 of them and found 5 of them relevant, the precision ratio for
that search on AltaVista will be 50%) and show your relevance judgments on the
printout by marking each document as relevant or not relevant. Then, note
all the unique relevant documents retrieved for each search query by all search
engines and count them. Based on all
the relevant documents retrieved by all search engines, calculate the recall
ratio for each query performed on each search engine. Note the duplicates and broken links. You are to repeat this for the same query on the other five search
engines, too. Also note the following
information:
· name of the search engine,
· the type of search done (simple, advanced, Boolean),
· any special features used (e.g., truncation),
· the number of documents or document surrogates (i.e., abstracts or summaries) identified,
· the number of documents or document surrogates you examined (you should examine at least 10 surrogates for each search),
· the number of relevant documents you found within the first 10 documents (and thus precision),
· the number of relevant documents that each search engine found among all relevant documents retrieved by all six search engines,
· your performance evaluation of each search engine along with your impression of the search engine (user satisfaction).
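As a rough illustration of the precision and recall calculations described above, the following Python sketch may help. The engine names come from the assignment; the function names, the document identifiers, and the relevance judgments are hypothetical. The sketch assumes that recall is measured against the pool of unique relevant documents found by all engines within the first k results, and that each result list contains no duplicates. Duplicates and broken links are not handled here and should be noted separately.

```python
from typing import Dict, List, Set, Tuple


def precision_at_k(results: List[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the first k examined results that were judged relevant."""
    examined = results[:k]
    if not examined:
        return 0.0
    hits = sum(1 for doc in examined if doc in relevant)
    return hits / len(examined)


def pooled_relevant(runs: Dict[str, List[str]], relevant: Set[str], k: int = 10) -> Set[str]:
    """Unique relevant documents found by any engine within its first k results."""
    pool: Set[str] = set()
    for results in runs.values():
        pool |= set(results[:k]) & relevant
    return pool


def relative_recall(results: List[str], pool: Set[str], k: int = 10) -> float:
    """Share of the pooled relevant documents that one engine retrieved."""
    if not pool:
        return 0.0
    return len(set(results[:k]) & pool) / len(pool)


def evaluate(runs: Dict[str, List[str]], relevant: Set[str], k: int = 10) -> Dict[str, Tuple[float, float]]:
    """Return (precision, relative recall) per engine for a single query."""
    pool = pooled_relevant(runs, relevant, k)
    return {
        engine: (precision_at_k(results, relevant, k), relative_recall(results, pool, k))
        for engine, results in runs.items()
    }


# Hypothetical data for one query: document identifiers stand in for URLs,
# and only the first 4 results of two engines are examined to keep it short.
example_runs = {
    "AltaVista": ["a", "b", "c", "d"],
    "Google": ["b", "e", "f", "a"],
}
example_relevant = {"a", "b", "e"}  # documents you judged relevant

for engine, (p, r) in evaluate(example_runs, example_relevant, k=4).items():
    print(f"{engine}: precision={p:.0%}, recall={r:.0%}")
```

For the assignment itself, each engine's list would hold the first 10 results you examined for a query, and the calculation would be repeated for every query.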
Do you think that there are other (better) documents that were not found? Should the search have been done without
using the Web, and why? Attach the
printed copies of the searches that you performed along with the relevance
judgments and precision/recall ratios for each query and search engine.
Here is a step-by-step explanation of what you are required to do:
Queries
1. I am looking for information on the British musical band called "Divine Comedy". I am not interested in the famous book with the same title.
2. My professor asked me to find sites and documents on the Internet that have information on performance evaluation of search engines. Can you help?
3. I am writing a paper on the "Internet and ethics". Relevant papers, sites, documents, etc. are most welcome.
Happy hunting!