- Hacettepe University Department of
Information Management
- PBBY 220 Principles of
Information Retrieval (Spring 2006) Yaşar
Time and place: Wednesdays 8:30-11:20 (Amfi 1)
Baeza-Yates, R. and Ribeiro-Neto, B. Modern
Information Retrieval. Addison Wesley, 1999.
(See also http://www.sims.berkeley.edu/~hearst/irbook.)
Brown, J.S. & Duguid, P. The Social Life of
Information. Boston, MA: Harvard Business School Press, 2000.
Chu, Heting. Information Representation and Retrieval in
the Digital Age. Medford, NJ: ASIST, 2003.
Buckland, M.K. Information and Information
Systems. New York: Praeger, 1991.
- Korfhage, Robert R. Information Storage and
Retrieval. New York: Wiley, 1997.
- information about the
- glossary of IR
- Korfhage's course syllabus
on "Information Storage and Retrieval"
Lancaster, F.W. Information Retrieval Systems.
(2d ed.). Wiley, 1979.
Lesk, M. Digital Libraries. San Francisco:
Morgan Kaufmann, 2001.
Rowley, J. Bilginin Düzenlenmesi: Bilgi
Erişime Giriş. (Çev. S. Karakaş ve başk.) Ankara: TKD Ankara Şubesi,
Salton, G. Automatic text
processing: the transformation, analysis, and retrieval of information by
computer. Reading, Mass. : Addison-Wesley, 1988.
Salton, G. and McGill, M.J. Introduction to Modern
Information Retrieval. New York: McGraw-Hill, 1983.
Sparck Jones, K. and Willett, P.
Readings in Information Retrieval. San Francisco : Morgan Kaufmann,
Svenonius, E. The Intellectual Foundation of Information
Organization. Cambridge, MA: MIT Press, 2000.
Taylor, A.G. The Organization of Information.
Englewood, CO: Libraries Unlimited, 1999.
Tonta, Y., Bitirim, Y. ve Sever, H. Türkçe Arama
Motorlarında Performans Değerlendirme. (Performance Evaluation of Turkish
Search Engines). Ankara: Total Bilişim Ltd. Şti., 2002. xvi, 152 pp. (ISBN
975-92923-0-0). (Özet) (Summary) (Full text, PDF)
Van Rijsbergen, C.J. Information Retrieval.
(2d ed.) London: Butterworths, 1979. (The full text of the book is available
in both HTML and PDF formats.)
Alkan, N. “Bilgi Taramalarının Nitelik Açısından
Değerlendirilmesinde ‘Kesin İsabet’ (Kİ-Precision) ve ‘Erişim İsabeti’
(Eİ-Recall) Oranları”, Türk Kütüphaneciliği 8(4): 254-265,
Alkan, N. “Bilgi Taramalarında Temel Başarısızlık
Nedenleri”, Türk Kütüphaneciliği 9(2): 91-102, 1995.
Bitirim, Y., Tonta, Y. ve Sever, H.
“Information Retrieval Effectiveness of Turkish Search Engines,” In Tatyana
M. Yakhno (Ed.): Advances in Information Systems: Second International
Conference, ADVIS 2002, İzmir, Turkey, October 23-25, 2002, Proceedings.
(pp. 93-103). Berlin: Springer-Verlag, 2002. (PDF)
Blair, D.C. & Maron, M.E. "An Evaluation of
Retrieval Effectiveness for a Full-Text Document Retrieval System,"
Communications of the ACM 28(3): 289-299, 1985.
Borges, J.L. "The Library of Babel", Labyrinths: Selected Stories &
Other Writings.
Bush, V. "As We May Think" (http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm)
(Originally appeared in Atlantic Monthly, 1945.)
Croft, B. "What Do People Want From Information Retrieval?" D-Lib Magazine (November 1995).
Cooper, W.S. "Getting Beyond Boole." Information
Processing & Management 24(3): 243-248, 1988.
Davis, C. "Beyond Boole: The Next Logical Step." ASIS
Bulletin, June 1995.
Gudivada, V.N. et al. "Information
Retrieval on the World Wide Web," IEEE Internet Computing 1(5):
58-68, 1997.
Soydal, İrem, Umut Al ve Umut Sezen,
Tabanlı Görüntü Erişim Sistemleri" Bilgi Dünyası, 6(2):
155-170, Ekim 2005.
Swanson, D.R. (1988). Historical note: information retrieval and the future
of an illusion. Journal of the American Society for Information Science,
39 (2), 92-98.
Tonta, Y. “Bilgi Erişim Sistemleri”, Türk
Kütüphaneciliği 9(3): 302-314, Eylül 1995.
Tonta, Y. "Bilgi erişim sorunu" Ed. Tülay Fenerci ve Oya Gürdal, 21. Yüzyıla
Girerken Enformasyon Olgusu: Ulusal Sempozyum Bildirileri, 19-20 Nisan 2001,
Hatay içinde (s. 198-206). Ankara: Türk Kütüphaneciler Derneği,
Tonta, Y. "Bilgi Erişim Sorunları ve Internet". 37. Kütüphane Haftası Bildirileri, 26 Mart
- 1 Nisan 2001, Milli Kütüphane, Ankara. Ankara: TKD, 2002.
Tonta, Y. Bilgi yönetiminin kavramsal tanımı
ve uygulama alanları. (bildiri) Kütüphaneciliğin Destanı Uluslararası
Sempozyumu, 21-24 Ekim 2004, Ankara (Bildiriler) içinde (55-68). Ankara: AÜ
DTCF Bilgi ve Belge Yönetimi Bölümü, 2004. (PDF copy) (Presentation slides)
Tonta, Y. Bibliyografik Kayıtlar için İşlevsel
Gerekler (FRBR) Kavramsal Modeli. Prof. Dr. Nilüfer Tuncer'e
Armağan içinde (ss. 278-290).
(Haz. M. Emin Küçük)
Ankara: TKD, 2005. (PDF)
Wells, H.G. "World Brain: The Idea of a Permanent
World Encyclopedia," (http://art-bin.com/art/obrain.html).
(Originally appeared in Encyclopédie Française, August
Additional Readings
These are readings for background and discussion. Some of these will be
assigned for class discussion, others are for those who wish to dig further into
particular subjects. The list is based on the textbook of readings on IR by
Peter Willett and Karen Sparck Jones, with some additional items.
* HISTORICAL: These items cover some early ideas and implementations
that provide some of the foundations of information retrieval theory and
Luhn, H.P. (1957). A Statistical Approach to Mechanized Encoding and
Searching of Literary Information. IBM Journal of Research and
Development, 1, 309-317.
Fairthorne, R.A. (1958). Automatic Retrieval of Recorded Information.
Computer Journal, 1, 36-41. (Also in Fairthorne, R.A. (1961).
Towards information retrieval. London: Butterworths).
Joyce, T. and Needham, R.M. (1958). The thesaurus approach to information
retrieval. American Documentation, 9 (3), 192-197.
Luhn, H.P. (1961). The automatic derivation of information retrieval
encodements from machine-readable texts. Information retrieval and machine
translation (Ed A. Kent), Vol 3, Pt 2, 1021-1028; reprinted in C.K.
Schultz, Ed, H.P. Luhn: Pioneer of information science, New York:
Spartan Books, 1968,
Maron, M.E. and Kuhns, J.L. (1960). On relevance, probabilistic indexing
and information retrieval. Journal of the Association for Computing
Machinery, 7, 216-244.
Maron, M.E. (1961). Automatic indexing: an experimental inquiry.
Journal of the Association for Computing Machinery, 8, 404-417.
Doyle, L.B. (1962). Indexing and abstracting by association. Part 1.
SP-718/001/00, System Development Corporation, Santa Monica CA.
Maron, M.E. (1965). Mechanised documentation: the logic behind a
probabilistic interpretation. Statistical methods for mechanised
documentation (Ed M.E. Stevens, V.E. Giuliano and L.B. Heilprin), National
Bureau of Standards Miscellaneous Publication 269, Washington DC: US
Government Printing Office, 9-13.
Cleverdon, C.W. (1967). The Cranfield tests on index language devices.
Aslib Proceedings, 19, 173-192.
Salton, G, and Lesk, M.E. (1968). Computer evaluation of indexing and text
processing. Journal of the ACM, 15 (1), 8-36; reprinted in G. Salton,
Ed, The SMART retrieval system, Englewood Cliffs NJ: Prentice-Hall,
1971, 143-180.
* KEY CONCEPTS: These papers examine the nature of documents,
aboutness, indexing and index languages, requests, relevance, users and
searching. Note this section deals with these topics primarily in an
analytical and descriptive style, rather than by wholesale modelling of the
retrieval process, covered in a later section.
Hutchins, W.J. (1978). The concept of `aboutness' in subject indexing.
Aslib Proceedings, 30, 172-181.
Cleverdon, C.W. and Mills, J. (1963). The testing of index language
devices. Aslib Proceedings, 15 (4), 106-130; reprinted in L.M. Chan,
P.A. Richmond and E. Svenonius, Eds, Theory of Subject Analysis,
Littleton CO: Libraries Unlimited, 1986, 223-246.
Foskett, D.J. (1980). Thesaurus. In A. Kent, H. Lancour and J.E. Daily,
Eds, Encyclopedia of Library and Information Science, Vol 30, New York:
Marcel Dekker, 416-462; reprinted in E.D. Dym, Ed, Subject and information
analysis, New York: Marcel Dekker, 1985, 270-316.
Daniels, P.J., Brooks, H.M. and Belkin, N.J. (1985). Using problem
structures for driving human-computer dialogues. RIAO-85, Actes: Recherche
d'Informations Assistee par Ordinateur, Grenoble: IMAG, 645-660.
Saracevic, T. (1975). Relevance: a review of and a framework for the
thinking on the notion in information science. Journal of the American
Society for Information Science, 26 (6), 321-343.
* EVALUATION: These papers cover the notions of performance issues,
criteria for performance evaluation, test design and methodology, with
examples illustrating the methods.
Saracevic, T. et al (1988). A study of information seeking and retrieving,
Parts 1,2,3. Journal of the American Society for Information Science,
39 (3), 161-216. Pt 1 only
Cooper, W.S. (1973). On selecting a measure of retrieval effectiveness. Pt
1. Journal of the American Society for Information Science, 24 (?2),
Tague-Sutcliffe, J. (1992). The pragmatics of information retrieval
experimentation, revisited. Information Processing and Management, 28
(4), 467-490.
Keen, E.M. (1992). Presenting results of experimental retrieval
comparisons. Information Processing and Management, 28 (4), 491-502.
Lancaster, F.W. (1969). MEDLARS: Report on the evaluation of its operating
efficiency. American Documentation, 20 (2), 119-142; reprinted in T.
Saracevic, Ed, Introduction to Information Science, New York: Bowker,
1970, 640-664.
Blair, D.C. and Maron, M.E. (1985). An evaluation of retrieval
effectiveness for a full-text document retrieval system. Communications of
the ACM, 28 (3), 289-299.
Salton, G. (1986). Another look at text-retrieval systems.
Communications of the ACM, 29(7), 648-656.
Blair, D.C. and Maron, M.E. (1990). Full text information retrieval:
further analysis and clarification. Information Processing and
Management, 26, 437-447.
Blair, D.C. (1996). STAIRS redux: thoughts on the STAIRS evaluations, ten
years after. Journal of the American Society for Information Science,
47, 4-22.
Harman, D. (1995). The TREC Conferences. Hypertext - information
retrieval - multimedia: synergieeffekte elektronischer informationssysteme,
HIM '95, Proceedings (Ed R. Kuhlen and M. Rittberger), Konstanz:
Universitaetsverlag Konstanz, 9-28.
* BASIC IR MODELS: These papers cover models of IR, both qualitative
and quantitative (eg cognitive, statistical), concentrating on the general
notions of the main IR models. Implementation issues are described later in
Robertson, S.E. (1977). Theories and models in information retrieval.
Journal of Documentation, 33, 126-148.
Belkin, N.J., Oddy, R.N. and Brooks, H.M. (1982). ASK for information
retrieval: part 1. Background and theory. Journal of Documentation, 38,
Cooper, W.S. (1988). Getting beyond Boole. Information Processing and
Management, 24, 243-248.
Robertson, S.E. (1977). The probability ranking principle in IR.
Journal of Documentation, 33, 294-304.
Salton, G. Wong, A. and Yang, C.S. (1975). A vector space model for
automatic indexing. Communications of the ACM, 18 (11), 613-620.
Robertson, S.E., Maron, M.E. and Cooper, W.S. (1982). Probability of
Relevance: A Unification of Two Competing Models for Document Retrieval.
Information Technology: Research and Development, 1, 1-21.
Turtle, H.R. and Croft, W.B. (1990). Inference networks for document
retrieval. Proceedings of the 13th International Conference on Research and
Development in Information Retrieval, 1-24.
van Rijsbergen, C.J. (1986). A non-classical logic for information
retrieval. Computer Journal, 29, 481-485.
* IR TECHNIQUES: These papers examine the details of various models and
other specific techniques and technologies, including reports of testing.
Belkin, N.J. and Croft, W.B. (1987). Retrieval Techniques. Annual
Review of Information Science and Technology, 22, 109-145.
Robertson, S.E. and Sparck Jones, K. (1976). Relevance Weighting of Search
Terms. Journal of the American Society for Information Science, 27(3),
Croft, W.B. and Harper, D.J. (1979). Using probabilistic models of
document retrieval without relevance information. Journal of
Documentation, 35, 285-295.
Porter, M.F. (1980). An algorithm for suffix stripping. Program,
14, 130-137.
Robertson, S.E. and Walker, S. (1994). Some simple effective
approximations to the 2-Poisson model for probabilistic weighted retrieval.
SIGIR 94 - Proceedings of the Seventeenth Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval, 232-241.
Salton, G. and Buckley, C. (1988). Term weighting approaches in automatic
text retrieval. Information Processing and Management, 24, 513-523.
Salton, G. and Buckley, C. (1990). Improving retrieval performance by
relevance feedback. Journal of the American Society for Information
Science, 41, 288-297.
Sparck Jones, K. (1979). Search term relevance weighting given little
relevance information. Journal of Documentation, 35 (1), 30-48.
Strzalkowski, T. (1994) Robust text processing in automated information
retrieval. Proceedings of the 4th Conference on Applied Natural Language
Processing (Stuttgart), Association for Computational Linguistics, 168-173.
Griffiths, A., Luckhurst, H.C. and Willett, P. (1986). Using interdocument
similarity information in document retrieval systems. Journal of the
American Society for Information Science, 37 (1), 3-11.
Belkin, N.J. and Croft, W.B. (1992). Information filtering and information
retrieval: two sides of the same coin? Communications of the ACM,
35(12), 29-38.
* SYSTEMS: This section includes papers describing complete IR systems,
focussing on those embodying modern views of what such systems should be like,
but also illustrating the status of more `conventional' systems.
Salton, G. and McGill, M.J. (1983). The SMART and SIRE experimental
retrieval systems. In Introduction to Modern Information Retrieval, New York:
McGraw-Hill, pp 118-156.
Harman, D. (1992). User-friendly systems instead of user-friendly
front-ends. Journal of the American Society for Information Science, 43
(?), 164-174.
Walker, S. (1989). The Okapi online catalogue research projects. In The
online catalogue: developments and directions (Ed C. Hildreth), London:
The Library Association, 84-106.
Callan, J., Croft, W.B. and Broglio, J. (1995). TREC and TIPSTER
experiments with INQUERY. Information Processing and Management, 31
Fox, E.A. and France, R.K. (1987). Architecture of an expert system for
composite document analysis, representation and retrieval. Journal of
Approximate Reasoning, 1, 151-175.
Fox, E.A. and Koll, M.B. (1988). Practical enhanced Boolean retrieval:
experiences with the SMART and SIRE systems. Information Processing and
Management, 24, 257-267.
McCune, B.P., Tong, R. and Dean, J. (1985). RUBRIC, a system for
rule-based information retrieval. IEEE Transactions on Software
Engineering, SE-11 (9), 939-944.
Jacobs, P.S. and Rau, L.F. (1990). SCISOR: extracting information from
on-line news. Communications of the ACM, 33(11), 88-97.
Larson, R.R., McDonough, J., Kuntz, L., O'Leary, P. and Moon, R.
"Cheshire II: Designing a Next-Generation Online Catalog." Journal of the
American Society for Information Science, 47(7) (July 1996), p. 555-567.
Tenopir, C. and Cahn, P. (1994). TARGET and FREESTYLE: DIALOG and Mead
join the relevance ranks. Online, 18 (3), 31-47. (shorter after ads
* EXTENSIONS: These papers move outwards from the classical text
document/single query situation to consider other types of `document' and
other versions and aspects of the information access task. The object is to
illustrate the scope of information retrieval viewed more broadly, and to draw
attention to the links between retrieval and other information processing
activities. At the same time, since some of the ideas and work covered here
also reflect new challenges and possibilities stemming from recent technology
developments, this section has papers to be taken as initial leads into the
future, rather than as authoritative guides to the established wisdom.
"Hypertext and Information Retrieval: Towards the Next Generation of
Information Systems". In: Borgman, C. L. and Pai, E. Y. H. (Eds.)
Information and Technology: Proceedings of the 51st ASIS Annual
Meeting, Medford, NJ: Learned Information, Inc., 1988.
Agosti, M. Gradenigo, G. and Marchetti, P.G. (1992). A hypertext
environment for interacting with large databases. Information Processing and
Management, 28 (3), 371-387.
Salton, G., Allan, J., Buckley, C. and Singhal, A. (1994). Automatic
analysis, theme generation, and summarisation of machine-readable texts.
Science, 264, 3 June, 1421-1426.
Hull, D.A. and Grefenstette, G. (1996). Experiments in multilingual
retrieval. Proceedings of the 19th Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval.
Rose, R.C. (1991). Techniques for information retrieval from speech
messages. Lincoln Laboratory Journal, 4 (1), 45-59.
Zhang, H.J., Low, C.Y., Smoliar, S.W. and Wu, J.H. (1995). Video parsing,
retrieval and browsing: an integrated and content-based solution.
Proceedings of ACM Multimedia '95, 15-24; reprinted in Intelligent
multimedia information retrieval (Ed M. Maybury).
Biebricher, B. et al (1988). The automatic indexing system AIR/PHYS - from
research to application. Eleventh International Conference on Research and
Development in Information Retrieval, 333-342.
Hayes, P.J., Knecht, L. and Cellio, M. (1988). A news story categorisation
system. Proceedings of the Second Conference on Applied Natural Language
Processing, Association for Computational Linguistics, 9-17.
Rau, L.F. (1988). Conceptual information extraction and retrieval from
natural language input. RIAO 88, 424-437.
Marsh, E., Hamburger, H. and Grishman, R. (1984). A production rule system
for message summarisation. AAAI-84, Proceedings, American Association
for Artificial Intelligence, 243-246.
Johnson, F.C., Paice, C.D., Black, W.J. and Neal, A.P. (1993). The
application of linguistic processing to automatic abstract generation.
Journal of Document and Text Management, 1 (3), 215-241.
resources on information retrieval (web pages, articles,
textbooks, etc.) (prepared by the IR Research Group of UIUC). (The
original page is accessible at http://leep.lis.uiuc.edu/spring98/lis329/links.html)
- Yaşar Tonta
- tonta@hacettepe.edu.tr
- Last updated: Feb 19, 2006