- Hacettepe University Department of Information Management
- PBBY 220 Principles of Information Retrieval (Spring 2006) Yaşar Tonta
Time and place: Wednesdays 8:30-11:20 (Amfi 1)
Course Objectives
Course Schedule
SUGGESTED READINGS
Textbooks
Baeza-Yates, R. and Ribeiro-Neto, B. Modern
Information Retrieval. Addison Wesley, 1999.
(See also http://www.sims.berkeley.edu/~hearst/irbook.)
Brown, J.S. & Duguid, P. The Social Life of
Information. Boston, MA: Harvard Business School Press, 2000.
Chu, Heting. Information Representation and Retrieval in
the Digital Age. Medford, NJ: ASIST, 2003.
Buckland, M.K. Information and Information
Systems. New York: Praeger, 1991.
- Korfhage, Robert R. Information Storage and
Retrieval. New York: Wiley, 1997.
- information about the book
- glossary of IR terms
- Korfhage's course syllabus on "Information Storage and Retrieval"
Lancaster, F.W. Information Retrieval Systems.
(2d ed.). Wiley, 1979.
Lesk, M. Digital Libraries. San Francisco:
Morgan Kaufmann, 2001.
Rowley, J. Bilginin Düzenlenmesi: Bilgi
Erişime Giriş. (Trans. S. Karakaş et al.) Ankara: TKD Ankara Şubesi,
1996.
Salton, G. Automatic text
processing: the transformation, analysis, and retrieval of information by
computer. Reading, Mass. : Addison-Wesley, 1988.
Salton, G. and McGill, M.J. Introduction to Modern
Information Retrieval. New York: McGraw-Hill, 1983.
Sparck Jones, K. and Willett, P.
Readings in Information Retrieval. San Francisco : Morgan Kaufmann,
1997.
Svenonius, E. The Intellectual Foundation of Information
Organization. Cambridge, MA: MIT Press, 2000.
Taylor, A.G. The Organization of Information.
Englewood, CO: Libraries Unlimited, 1999.
Tonta, Y., Bitirim, Y. and Sever, H. Türkçe Arama
Motorlarında Performans Değerlendirme. (Performance Evaluation of Turkish
Search Engines). Ankara: Total Bilişim Ltd. Şti., 2002. xvi, 152 pp. (ISBN 975
92923-0-0). (Abstract in Turkish) (Summary) (Full text (PDF))
Van Rijsbergen, C.J. Information Retrieval.
(2d ed.) London: Butterworths, 1979. (The full text of the book is available
in both HTML and PDF formats via the link above.)
Articles
Alkan, N. “Bilgi Taramalarının Nitelik Açısından
Değerlendirilmesinde ‘Kesin İsabet’ (Kİ-Precision) ve ‘Erişim İsabeti’
(Eİ-Recall) Oranları”, Türk Kütüphaneciliği 8(4): 254-265,
1994.
Alkan, N. “Bilgi Taramalarında Temel Başarısızlık
Nedenleri”, Türk Kütüphaneciliği 9(2): 91-102, 1995.
Bitirim, Y., Tonta, Y. and Sever, H.
“Information Retrieval Effectiveness of Turkish Search Engines,” In Tatyana
M. Yakhno (Ed.): Advances in Information Systems: Second International
Conference, ADVIS 2002, İzmir, Turkey, October 23-25, 2002, Proceedings.
(pp. 93-103). Berlin: Springer-Verlag, 2002. (PDF copy)
Blair, D.C. & Maron, M.E. "An Evaluation of
Retrieval Effectiveness for a Full-Text Document Retrieval System,"
Communications of the ACM 28(3): 289-299, 1985.
Borges, J.L. "The Library of Babel", Labyrinths: Selected Stories &
Other Writings.
Bush, V. "As We May Think" (http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm)
(Originally appeared in Atlantic Monthly, 1945.) See also:
[Reader]
Croft, B. "What Do People Want From Information Retrieval?" D-Lib Magazine (November 1995).
Cooper, W.S. "Getting Beyond Boole." Information
Processing & Management 24(3): 243-248, 1988.
Davis, C. "Beyond Boole: The Next Logical Step." ASIS
Bulletin, June 1995.
Gudivada, V.N. et al. "Information
Retrieval on the World Wide Web," IEEE Internet Computing 1(5):
58-68, 1997.
Soydal, İrem, Umut Al and Umut Sezen,
"İçerik Tabanlı Görüntü Erişim Sistemleri" Bilgi Dünyası, 6(2):
155-170, October 2005.
Swanson, D.R. (1988). Historical note: information retrieval and the future
of an illusion. Journal of the American Society for Information Science,
39 (2), 92-98.
Tonta, Y. “Bilgi Erişim Sistemleri”, Türk
Kütüphaneciliği 9(3): 302-314, September 1995.
Tonta, Y. "Bilgi erişim sorunu". In Tülay Fenerci and Oya Gürdal (Eds.),
21. Yüzyıla Girerken Enformasyon Olgusu: Ulusal Sempozyum Bildirileri,
19-20 April 2001, Hatay (pp. 198-206). Ankara: Türk Kütüphaneciler Derneği,
2001.
Tonta, Y. "Bilgi Erişim Sorunları ve Internet".
37. Kütüphane Haftası Bildirileri, 26 March
- 1 April 2001, Milli Kütüphane, Ankara. Ankara: TKD, 2002.
Tonta, Y. Bilgi yönetiminin kavramsal tanımı
ve uygulama alanları. (Paper.) In Kütüphaneciliğin Destanı Uluslararası
Sempozyumu, 21-24 October 2004, Ankara (Proceedings) (pp. 55-68). Ankara: AÜ
DTCF Bilgi ve Belge Yönetimi Bölümü, 2004. (PDF copy) (Presentation slides)
Tonta, Y. Bibliyografik Kayıtlar için İşlevsel
Gerekler (FRBR) Kavramsal Modeli. In Prof. Dr. Nilüfer Tuncer'e
Armağan (pp. 278-290).
(Ed. M. Emin Küçük)
Ankara: TKD, 2005. (PDF)
Wells, H.G. "World Brain: The Idea of a Permanent
World Encyclopedia," (http://art-bin.com/art/obrain.html).
(Originally appeared in Encyclopédie Française, August
1937.)
Additional Readings
These are readings for background and discussion. Some of these will be
assigned for class discussion, others are for those who wish to dig further into
particular subjects. The list is based on the textbook of readings on IR by
Peter Willett and Karen Sparck Jones, with some additional items.
* HISTORICAL: These items cover some early ideas and implementations
that provide some of the foundations of information retrieval theory and
practice.
LUHN57
Luhn, H.P. (1957). A Statistical Approach to Mechanized Encoding and
Searching of Literary Information. IBM Journal of Research and
Development, 1, 309-317.
FAIR58
Fairthorne, R.A. (1958). Automatic Retrieval of Recorded Information.
Computer Journal, 1, 36-41. (Also in Fairthorne, R.A. (1961).
Towards information retrieval. London: Butterworths).
JOYC58
Joyce, T. and Needham, R.M. (1958). The thesaurus approach to information
retrieval. American Documentation, 9 (3), 192-197.
LUHN61
Luhn, H.P. (1961). The automatic derivation of information retrieval
encodements from machine-readable texts. Information retrieval and machine
translation (Ed A. Kent), Vol 3, Pt 2, 1021-1028; reprinted in C.K.
Schultz, Ed, H.P. Luhn: Pioneer of information science, New York:
Spartan Books, 1968,
MARO60
Maron, M.E. and Kuhns, J.L. (1960). On relevance, probabilistic indexing
and information retrieval. Journal of the Association for Computing
Machinery, 7, 216-244.
MARO61
Maron, M.E. (1961). Automatic indexing: an experimental inquiry.
Journal of the Association for Computing Machinery, 8, 404-417.
DOYL62
Doyle, L.B. (1962). Indexing and abstracting by association. Part 1.
SP-718/001/00, System Development Corporation, Santa Monica CA.
MARO65
Maron, M.E. (1965). Mechanised documentation: the logic behind a
probabilistic interpretation. Statistical methods for mechanised
documentation (Ed M.E. Stevens, V.E. Giuliano and L.B. Heilprin), National
Bureau of Standards Miscellaneous Publication 269, Washington DC: US
Government Printing Office, 9-13.
CLEV67
Cleverdon, C.W. (1967). The Cranfield tests on index language devices.
Aslib Proceedings, 19, 1967, 173-192.
SALT68
Salton, G. and Lesk, M.E. (1968). Computer evaluation of indexing and text
processing. Journal of the ACM, 15 (1), 8-36; reprinted in G. Salton,
Ed, The SMART retrieval system, Englewood Cliffs NJ: Prentice-Hall,
1971, 143-180.
* KEY CONCEPTS: These papers examine the nature of documents,
aboutness, indexing and index languages, requests, relevance, users and
searching. Note this section deals with these topics primarily in an
analytical and descriptive style, rather than by wholesale modelling of the
retrieval process, covered in a later section.
HUTC78
Hutchins, W.J. (1978). The concept of `aboutness' in subject indexing.
Aslib Proceedings, 30, 172-181.
CLEV63
Cleverdon, C.W. and Mills, J. (1963). The testing of index language
devices. Aslib Proceedings, 15 (4), 106-130; reprinted in L.M. Chan,
P.A. Richmond and E. Svenonius, Eds, Theory of Subject Analysis,
Littleton CO: Libraries Unlimited, 1986, 223-246.
FOSK80
Foskett, D.J. (1980). Thesaurus. In A. Kent, H. Lancour and J.E. Daily,
Eds, Encyclopedia of Library and Information Science, Vol 30, New York:
Marcel Dekker, 416-462; reprinted in E.D. Dym, Ed, Subject and information
analysis, New York: Marcel Dekker, 1985, 270-316.
DANI85
Daniels, P.J., Brooks, H.M. and Belkin, N.J. (1985). Using problem
structures for driving human-computer dialogues. RIAO-85, Actes: Recherche
d'Informations Assistee par Ordinateur, Grenoble: IMAG, 645-660.
SARA75
Saracevic, T. (1975). Relevance: a review of and a framework for the
thinking on the notion in information science. Journal of the American
Society for Information Science, 26 (6), 321-343.
* EVALUATION: These papers cover the notions of performance issues,
criteria for performance evaluation, test design and methodology, with
examples illustrating the methods.
SARA88
Saracevic, T. et al (1988). A study of information seeking and retrieving,
Parts 1,2,3. Journal of the American Society for Information Science,
39 (3), 161-216. Pt 1 only
COOP73
Cooper, W.S. (1973). On selecting a measure of retrieval effectiveness. Pt
1. Journal of the American Society for Information Science, 24 (?2),
87-100.
TAGU92
Tague-Sutcliffe, J. (1992). The pragmatics of information retrieval
experimentation, revisited. Information Processing and Management, 28
(4), 467-490.
KEEN92
Keen, E.M. (1992). Presenting results of experimental retrieval
comparisons. Information Processing and Management, 28 (4), 491-502.
LANC69
Lancaster, F.W. (1969). MEDLARS: Report on the evaluation of its operating
efficiency. American Documentation, 20 (2), 119-142; reprinted in T.
Saracevic, Ed, Introduction to Information Science, New York: Bowker,
1970, 640-664.
BLAI85
Blair, D.C. and Maron, M.E. (1985). An evaluation of retrieval
effectiveness for a full-text document retrieval system. Communications of
the ACM, 28 (3), 289-299.
SALT86
Salton, G. (1986). Another look at text-retrieval systems.
Communications of the ACM, 29(7), 648-656.
BLAI90
Blair, D.C. and Maron, M.E. (1990). Full text information retrieval:
further analysis and clarification. Information Processing and
Management, 26, 437-447.
BLAI96
Blair, D.C. (1996). STAIRS redux: thoughts on the STAIRS evaluations, ten
years after. Journal of the American Society for Information Science,
47, 4-22.
HARM95
Harman, D. (1995). The TREC Conferences. Hypertext - information
retrieval - multimedia: synergieeffekte elektronischer informationssysteme,
HIM '95, Proceedings (Ed R. Kuhlen and M. Rittberger), Konstanz:
Universitaetsverlag Konstanz, 9-28.
* BASIC IR MODELS: These papers cover models of IR, both qualitative
and quantitative (eg cognitive, statistical), concentrating on the general
notions of the main IR models. Implementation issues are described later in
Techniques.
ROBE77a
Robertson, S.E. (1977). Theories and models in information retrieval.
Journal of Documentation, 33, 126-148.
BELK82
Belkin, N.J., Oddy, R.N. and Brooks, H.M. (1982). ASK for information
retrieval: part 1. Background and theory. Journal of Documentation, 38,
61-71.
COOP88
Cooper, W.S. (1988). Getting beyond Boole. Information Processing and
Management, 24, 243-248.
ROBE77b
Robertson, S.E. (1977). The probability ranking principle in IR.
Journal of Documentation, 33, 294-304.
SALT75
Salton, G., Wong, A. and Yang, C.S. (1975). A vector space model for
automatic indexing. Communications of the ACM, 18 (11), 613-620.
ROBE82
Robertson, S.E., Maron, M.E. and Cooper, W.S. (1982). Probability of
Relevance: A Unification of Two Competing Models for Document Retrieval.
Information Technology: Research and Development, 1, 1-21.
TURT90
Turtle, H.R. and Croft, W.B. (1990). Inference networks for document
retrieval. Proceedings of the 13th International Conference on Research and
Development in Information Retrieval, 1-24, 1990.
VANR86
van Rijsbergen, C.J. (1986). A non-classical logic for information
retrieval. Computer Journal, 29, 481-485, 1986.
* IR TECHNIQUES: These papers examine the details of various models and
other specific techniques and technologies, including reports of testing.
BELK87
Belkin, N.J. and Croft, W.B. (1987). Retrieval Techniques. Annual
Review of Information Science and Technology, 22, 109-145.
ROBE76
Robertson, S.E. and Sparck Jones, K. (1976). Relevance Weighting of Search
Terms. Journal of the American Society for Information Science, 27(3),
129-146.
CROF79
Croft, W.B. and Harper, D.J. (1979). Using probabilistic models of
document retrieval without relevance information. Journal of
Documentation, 35, 285-295.
PORT80
Porter, M.F. (1980). An algorithm for suffix stripping. Program,
14, 130-137.
ROBE94
Robertson, S.E. and Walker, S. (1994). Some simple effective
approximations to the 2-Poisson model for probabilistic weighted retrieval.
SIGIR 94 - Proceedings of the Seventeenth Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval, 232-241.
SALT88
Salton, G. and Buckley, C. (1988). Term weighting approaches in automatic
text retrieval. Information Processing and Management, 24, 513-523.
SALT90
Salton, G. and Buckley, C. (1990). Improving retrieval performance by
relevance feedback. Journal of the American Society for Information
Science, 41, 288-297.
SPAR79
Sparck Jones, K. (1979). Search term relevance weighting given little
relevance information. Journal of Documentation, 35 (1), 30-48.
STRZ94
Strzalkowski, T. (1994). Robust text processing in automated information
retrieval. Proceedings of the 4th Conference on Applied Natural Language
Processing (Stuttgart), Association for Computational Linguistics, 168-173.
GRIF86
Griffiths, A., Luckhurst, H.C. and Willett, P. (1986). Using interdocument
similarity information in document retrieval systems. Journal of the
American Society for Information Science, 37 (1), 3-11.
BELK92
Belkin, N.J. and Croft, W.B. (1992). Information filtering and information
retrieval: two sides of the same coin? Communications of the ACM,
35(12), 29-38.
* SYSTEMS: This section includes papers describing complete IR systems,
focussing on those embodying modern views of what such systems should be like,
but also illustrating the status of more `conventional' systems.
SALT83
Salton, G. and McGill, M.J. (1983). The SMART and SIRE experimental
retrieval systems. In Introduction to Modern Information Retrieval, New York:
McGraw-Hill, pp 118-156.
HARM92
Harman, D. (1992). User-friendly systems instead of user-friendly
front-ends. Journal of the American Society for Information Science, 43
(?), 164-174.
WALK89
Walker, S. (1989). The Okapi online catalogue research projects. In The
online catalogue: developments and directions (Ed C. Hildreth), London:
The Library Association, 84-106.
CALL95
Callan, J., Croft, W.B. and Broglio, J. (1995). TREC and TIPSTER
experiments with INQUERY. Information Processing and Management, 31
(3).
FOX87
Fox, E.A. and France, R.K. (1987). Architecture of an expert system for
composite document analysis, representation and retrieval. Journal of
Approximate Reasoning, 1, 151-175.
FOX88
Fox, E.A. and Koll, M.B. (1988). Practical enhanced Boolean retrieval:
experiences with the SMART and SIRE systems. Information Processing and
Management, 24, 257-267.
MCCU85
McCune, B.P., Tong, R. and Dean, J. (1985). RUBRIC, a system for
rule-based information retrieval. IEEE Transactions on Software
Engineering, SE-11 (9), 939-944.
JACO90
Jacobs, P.S. and Rau, L.F. (1990). SCISOR: extracting information from
on-line news. Communications of the ACM, 33(11), 88-97.
LARS96
Larson, R.R., McDonough, J., Kuntz, L., O'Leary, P. and Moon, R. (1996).
Cheshire II: Designing a Next-Generation Online Catalog. Journal of the
American Society for Information Science, 47 (7), 555-567.
TENO94
Tenopir, C. and Cahn, P. (1994). TARGET and FREESTYLE: DIALOG and Mead
join the relevance ranks. Online, 18 (3), 31-47. (shorter after ads
deleted)
* EXTENSIONS: These papers move outwards from the classical text
document/single query situation to consider other types of `document' and
other versions and aspects of the information access task. The object is to
illustrate the scope of information retrieval viewed more broadly, and to draw
attention to the links between retrieval and other information processing
activities. At the same time, since some of the ideas and work covered here
also reflect new challenges and possibilities stemming from recent technology
developments, this section has papers to be taken as initial leads into the
future, rather than as authoritative guides to the established wisdom.
LARS88
Larson, R.R. (1988). Hypertext and Information Retrieval: Towards the Next
Generation of Information Systems. In: Borgman, C. L. and Pai, E. Y. H. (Eds.)
Information and Technology: Proceedings of the 51st ASIS Annual
Meeting, Medford, NJ: Learned Information, Inc., 1988.
AGOS92
Agosti, M., Gradenigo, G. and Marchetti, P.G. (1992). A hypertext
environment for interacting with large databases. Information Processing and
Management, 28 (3), 371-387.
SALT94
Salton, G., Allan, J., Buckley, C. and Singhal, A. (1994). Automatic
analysis, theme generation, and summarisation of machine-readable texts.
Science, 264, 3 June, 1421-1426.
HULL96
Hull, D.A. and Grefenstette, G. (1996). Experiments in multilingual
retrieval. Proceedings of the 19th Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval.
ROSE91
Rose, R.C. (1991). Techniques for information retrieval from speech
messages. Lincoln Laboratory Journal, 4 (1), 45-59.
ZHAN95
Zhang, H.J., Low, C.Y., Smoliar, S.W. and Wu, J.H. (1995). Video parsing,
retrieval and browsing: an integrated and content-based solution.
Proceedings of ACM Multimedia '95, 15-24; reprinted in Intelligent
multimedia information retrieval (Ed M. Maybury).
BIEB88
Biebricher, B. et al (1988). The automatic indexing system AIR/PHYS - from
research to application. Eleventh International Conference on Research and
Development in Information Retrieval, 333-342.
HAYE88
Hayes, P.J., Knecht, L. and Cellio, M. (1988). A news story categorisation
system. Proceedings of the Second Conference on Applied Natural Language
Processing, Association for Computational Linguistics, 9-17.
RAU88
Rau, L.F. (1988). Conceptual information extraction and retrieval from
natural language input. RIAO 88, 424-437.
MARS84
Marsh, E., Hamburger, H. and Grishman, R. (1984). A production rule system
for message summarisation. AAAI-84, Proceedings, American Association
for Artificial Intelligence, 243-246.
JOHN93
Johnson, F.C., Paice, C.D., Black, W.J. and Neal, A.P. (1993). The
application of linguistic processing to automatic abstract generation.
Journal of Document and Text Management, 1 (3), 215-241.
More resources on information retrieval (web pages, articles,
textbooks, etc.) (prepared by the IR Research Group of UIUC). (The
original page is accessible at http://leep.lis.uiuc.edu/spring98/lis329/links.html)
- Yaşar Tonta
- tonta@hacettepe.edu.tr
- Last updated: Feb 19, 2006