Hacettepe University Department of Information Management

BBY 220 Principles of Information Retrieval (Spring 2007) Yaşar Tonta


Time and place: Tuesdays 9:30-12:20 (Amfi 1)

Instructor: Yaşar Tonta (e-mail: tonta@hacettepe.edu.tr; tel: 297 82 04) 

Course web site: http://yunus.hacettepe.edu.tr/~tonta/courses/spring2007/bby220/rdnglist.htm



SUGGESTED READINGS

Textbooks

Baeza-Yates, R. and Ribeiro-Neto, B. Modern Information Retrieval. Addison Wesley, 1999. (See also http://www.sims.berkeley.edu/~hearst/irbook.)

Brown, J.S. & Duguid, P. The Social Life of Information. Boston, MA: Harvard Business School Press, 2000.

Chu, Heting. Information Representation and Retrieval in the Digital Age. Medford, NJ: ASIST, 2003.

Buckland, M.K. Information and Information Systems. New York: Praeger, 1991.

Korfhage, Robert R. Information Storage and Retrieval. New York: Wiley, 1997. (See also: information about the book; glossary of IR terms; Korfhage's course syllabus on "Information Storage and Retrieval".)

Lancaster, F.W. Information Retrieval Systems. (2d ed.). Wiley, 1979.

Lesk, M. Digital Libraries. San Francisco: Morgan Kaufmann, 2001.

Rowley, J. Bilginin Düzenlenmesi: Bilgi Erişime Giriş. (Trans. S. Karakaş et al.) Ankara: TKD Ankara Şubesi, 1996.

Salton, G. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer. Reading, Mass.: Addison-Wesley, 1988.

Salton, G. and McGill, M.J. Introduction to Modern Information Retrieval. New York: McGraw-Hill, 1983.

Sparck Jones, K. and Willett, P. Readings in Information Retrieval. San Francisco: Morgan Kaufmann, 1997.

Svenonius, E. The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press, 2000.

Taylor, A.G. The Organization of Information. Englewood, CO: Libraries Unlimited, 1999.

Tonta, Y., Bitirim, Y. and Sever, H. Türkçe Arama Motorlarında Performans Değerlendirme (Performance Evaluation of Turkish Search Engines). Ankara: Total Bilişim Ltd. Şti., 2002. xvi, 152 pp. (ISBN 975 92923-0-0). (Abstract, in Turkish) (Summary) (Full text, PDF)

Van Rijsbergen, C.J. Information Retrieval. (2d ed.) London: Butterworths, 1979. (The full text of the book is available in both HTML and PDF formats; follow the link above.)

Articles

Alkan, N. “Bilgi Taramalarının Nitelik Açısından Değerlendirilmesinde ‘Kesin İsabet’ (Kİ-Precision) ve ‘Erişim İsabeti’ (Eİ-Recall) Oranları”, Türk Kütüphaneciliği 8(4): 254-265, 1994.

Alkan, N. “Bilgi Taramalarında Temel Başarısızlık Nedenleri”, Türk Kütüphaneciliği 9(2): 91-102, 1995.

Bitirim, Y., Tonta, Y. and Sever, H. “Information Retrieval Effectiveness of Turkish Search Engines,” In Tatyana M. Yakhno (Ed.): Advances in Information Systems: Second International Conference, ADVIS 2002, İzmir, Turkey, October 23-25, 2002, Proceedings. (pp. 93-103). Berlin: Springer-Verlag, 2002. (PDF copy)

Blair, D.C. & Maron, M.E. "An Evaluation of Retrieval Effectiveness for a Full-Text Document Retrieval System," Communications of the ACM 28(3): 289-299, 1985.

Borges, J.L. "The Library of Babel", Labyrinths: Selected Stories & Other Writings.

Bush, V. "As We May Think" (http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm) (Originally appeared in Atlantic Monthly, 1945.)  See also:  [Reader]

Croft, B. "What Do People Want From Information Retrieval?" D-Lib Magazine (November 1995).

Cooper, W.S. "Getting Beyond Boole." Information Processing & Management 24(3): 243-248, 1988.

Davis, C. "Beyond Boole: The Next Logical Step." ASIS Bulletin, June 1995.

Gudivada, V.N. et al. "Information Retrieval on the World Wide Web," IEEE Internet Computing 1(5): 58-68, 1997.

Soydal, İrem, Umut Al and Umut Sezen, "İçerik Tabanlı Görüntü Erişim Sistemleri", Bilgi Dünyası, 6(2): 155-170, October 2005.

Swanson, D.R. (1988). Historical note: information retrieval and the future of an illusion. Journal of the American Society for Information Science, 39 (2), 92-98.

Tonta, Y. “Bilgi Erişim Sistemleri”, Türk Kütüphaneciliği 9(3): 302-314, September 1995.

Tonta, Y. "Bilgi erişim sorunu." In Tülay Fenerci and Oya Gürdal (Eds.), 21. Yüzyıla Girerken Enformasyon Olgusu: Ulusal Sempozyum Bildirileri, 19-20 April 2001, Hatay (pp. 198-206). Ankara: Türk Kütüphaneciler Derneği, 2001.

Tonta, Y. "Bilgi Erişim Sorunları ve Internet." 37. Kütüphane Haftası Bildirileri, 26 March - 1 April 2001, Milli Kütüphane, Ankara. Ankara: TKD, 2002.

Tonta, Y. Bilgi yönetiminin kavramsal tanımı ve uygulama alanları. (Paper.) In Kütüphaneciliğin Destanı Uluslararası Sempozyumu, 21-24 October 2004, Ankara (Proceedings) (pp. 55-68). Ankara: AÜ DTCF Bilgi ve Belge Yönetimi Bölümü, 2004. (PDF copy) (Presentation slides)

Tonta, Y. Bibliyografik Kayıtlar için İşlevsel Gerekler (FRBR) Kavramsal Modeli. In Prof. Dr. Nilüfer Tuncer'e Armağan (pp. 278-290). (Ed. M. Emin Küçük) Ankara: TKD, 2005. (PDF)

Wells, H.G. "World Brain: The Idea of a Permanent World Encyclopedia," (http://art-bin.com/art/obrain.html). (Originally appeared in Encyclopédie Française, August 1937.)

Additional Readings

These are readings for background and discussion. Some will be assigned for class discussion; others are for those who wish to dig further into particular subjects. The list is based on the textbook of readings on IR by Peter Willett and Karen Sparck Jones, with some additional items.

* HISTORICAL: These items cover some early ideas and implementations that provide some of the foundations of information retrieval theory and practice.

 

LUHN57

Luhn, H.P. (1957). A Statistical Approach to Mechanized Encoding and Searching of Literary Information. IBM Journal of Research and Development, 1, 309-317.

FAIR58

Fairthorne, R.A. (1958). Automatic Retrieval of Recorded Information. Computer Journal, 1, 36-41. (Also in Fairthorne, R.A. (1961). Towards information retrieval. London: Butterworths).

JOYC58

Joyce, T. and Needham, R.M. (1958). The thesaurus approach to information retrieval. American Documentation, 9 (3), 192-197.

LUHN61

Luhn, H.P. (1961). The automatic derivation of information retrieval encodements from machine-readable texts. Information retrieval and machine translation (Ed A. Kent), Vol 3, Pt 2, 1021-1028; reprinted in C.K. Schultz, Ed, H.P. Luhn: Pioneer of information science, New York: Spartan Books, 1968,

MARO60

Maron, M.E. and Kuhns, J.L. (1960). On relevance, probabilistic indexing and information retrieval. Journal of the Association for Computing Machinery, 7, 216-244.

MARO61

Maron, M.E. (1961). Automatic indexing: an experimental inquiry. Journal of the Association for Computing Machinery, 8, 404-417.

DOYL62

Doyle, L.B. (1962). Indexing and abstracting by association. Part 1. SP-718/001/00, System Development Corporation, Santa Monica CA.

MARO65

Maron, M.E. (1965). Mechanised documentation: the logic behind a probabilistic interpretation. Statistical methods for mechanised documentation (Ed M.E. Stevens, V.E. Giuliano and L.B. Heilprin), National Bureau of Standards Miscellaneous Publication 269, Washington DC: US Government Printing Office, 9-13.

CLEV67

Cleverdon, C.W. (1967). The Cranfield tests on index language devices. Aslib Proceedings, 19, 173-192.

SALT68

Salton, G. and Lesk, M.E. (1968). Computer evaluation of indexing and text processing. Journal of the ACM, 15 (1), 8-36; reprinted in G. Salton, Ed, The SMART retrieval system, Englewood Cliffs NJ: Prentice-Hall, 1971, 143-180.

 

* KEY CONCEPTS: These papers examine the nature of documents, aboutness, indexing and index languages, requests, relevance, users and searching. Note this section deals with these topics primarily in an analytical and descriptive style, rather than by wholesale modelling of the retrieval process, covered in a later section.

 

HUTC78

Hutchins, W.J. (1978). The concept of `aboutness' in subject indexing. Aslib Proceedings, 30, 172-181.

CLEV63

Cleverdon, C.W. and Mills, J. (1963). The testing of index language devices. Aslib Proceedings, 15 (4), 106-130; reprinted in L.M. Chan, P.A. Richmond and E. Svenonius, Eds, Theory of Subject Analysis, Littleton CO: Libraries Unlimited, 1986, 223-246.

FOSK80

Foskett, D.J. (1980). Thesaurus. In A. Kent, H. Lancour and J.E. Daily, Eds, Encyclopedia of Library and Information Science, Vol 30, New York: Marcel Dekker, 416-462; reprinted in E.D. Dym, Ed, Subject and information analysis, New York: Marcel Dekker, 1985, 270-316.

DANI85

Daniels, P.J., Brooks, H.M. and Belkin, N.J. (1985). Using problem structures for driving human-computer dialogues. RIAO-85, Actes: Recherche d'Informations Assistee par Ordinateur, Grenoble: IMAG, 645-660.

SARA75

Saracevic, T. (1975). Relevance: a review of and a framework for the thinking on the notion in information science. Journal of the American Society for Information Science, 26 (6), 321-343.

 

* EVALUATION: These papers cover the notions of performance issues, criteria for performance evaluation, test design and methodology, with examples illustrating the methods.

 

SARA88

Saracevic, T. et al (1988). A study of information seeking and retrieving, Parts 1, 2, 3. Journal of the American Society for Information Science, 39 (3), 161-216. (Pt 1 only.)

COOP73

Cooper, W.S. (1973). On selecting a measure of retrieval effectiveness. Pt 1. Journal of the American Society for Information Science, 24 (2), 87-100.

TAGU92

Tague-Sutcliffe, J. (1992). The pragmatics of information retrieval experimentation, revisited. Information Processing and Management, 28 (4), 467-490.

KEEN92

Keen, E.M. (1992). Presenting results of experimental retrieval comparisons. Information Processing and Management, 28 (4), 491-502.

LANC69

Lancaster, F.W. (1969). MEDLARS: Report on the evaluation of its operating efficiency. American Documentation, 20 (2), 119-142; reprinted in T. Saracevic, Ed, Introduction to Information Science, New York: Bowker, 1970, 640-664.

BLAI85

Blair, D.C. and Maron, M.E. (1985). An evaluation of retrieval effectiveness for a full-text document retrieval system. Communications of the ACM, 28 (3), 289-299.

SALT86

Salton, G. (1986). Another look at text-retrieval systems. Communications of the ACM, 29(7), 648-656.

BLAI90

Blair, D.C. and Maron, M.E. (1990). Full text information retrieval: further analysis and clarification. Information Processing and Management, 26, 437-447.

BLAI96

Blair, D.C. (1996). STAIRS redux: thoughts on the STAIRS evaluations, ten years after. Journal of the American Society for Information Science, 47, 4-22.

HARM95

Harman, D. (1995). The TREC Conferences. Hypertext - Information Retrieval - Multimedia: Synergieeffekte elektronischer Informationssysteme, HIM '95, Proceedings (Ed R. Kuhlen and M. Rittberger), Konstanz: Universitätsverlag Konstanz, 9-28.

 

* BASIC IR MODELS: These papers cover models of IR, both qualitative and quantitative (e.g. cognitive, statistical), concentrating on the general notions of the main IR models. Implementation issues are described later in IR Techniques.

 

ROBE77a

Robertson, S.E. (1977). Theories and models in information retrieval. Journal of Documentation, 33, 126-148.

BELK82

Belkin, N.J., Oddy, R.N. and Brooks, H.M. (1982). ASK for information retrieval: part 1. Background and theory. Journal of Documentation, 38, 61-71.

COOP88

Cooper, W.S. (1988). Getting beyond Boole. Information Processing and Management, 24, 243-248.

ROBE77b

Robertson, S.E. (1977). The probability ranking principle in IR. Journal of Documentation, 33, 294-304.

SALT75

Salton, G., Wong, A. and Yang, C.S. (1975). A vector space model for automatic indexing. Communications of the ACM, 18 (11), 613-620.

ROBE82

Robertson, S.E., Maron, M.E. and Cooper, W.S. (1982). Probability of Relevance: A Unification of Two Competing Models for Document Retrieval. Information Technology: Research and Development, 1, 1-21.

TURT90

Turtle, H.R. and Croft, W.B. (1990). Inference networks for document retrieval. Proceedings of the 13th International Conference on Research and Development in Information Retrieval, 1-24.

VANR86

van Rijsbergen, C.J. (1986). A non-classical logic for information retrieval. Computer Journal, 29, 481-485.

 

* IR TECHNIQUES: These papers examine the details of various models and other specific techniques and technologies, including reports of testing.

 

BELK87

Belkin, N.J. and Croft, W.B. (1987). Retrieval Techniques. Annual Review of Information Science and Technology, 22, 109-145.

ROBE76

Robertson, S.E. and Sparck Jones, K. (1976). Relevance Weighting of Search Terms. Journal of the American Society for Information Science, 27(3), 129-146.

CROF79

Croft, W.B. and Harper, D.J. (1979). Using probabilistic models of document retrieval without relevance information. Journal of Documentation, 35, 285-295.

PORT80

Porter, M.F. (1980). An algorithm for suffix stripping. Program, 14, 130-137.

ROBE94

Robertson, S.E. and Walker, S. (1994). Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. SIGIR 94 - Proceedings of the Seventeenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 232-241.

SALT88

Salton, G. and Buckley, C. (1988). Term weighting approaches in automatic text retrieval. Information Processing and Management, 24, 513-523.

SALT90

Salton, G. and Buckley, C. (1990). Improving retrieval performance by relevance feedback. Journal of the American Society for Information Science, 41, 288-297.

SPAR79

Sparck Jones, K. (1979). Search term relevance weighting given little relevance information. Journal of Documentation, 35 (1), 30-48.

STRZ94

Strzalkowski, T. (1994). Robust text processing in automated information retrieval. Proceedings of the 4th Conference on Applied Natural Language Processing (Stuttgart), Association for Computational Linguistics, 168-173.

GRIF86

Griffiths, A., Luckhurst, H.C. and Willett, P. (1986). Using interdocument similarity information in document retrieval systems. Journal of the American Society for Information Science, 37 (1), 3-11.

BELK92

Belkin, N.J. and Croft, W.B. (1992). Information filtering and information retrieval: two sides of the same coin? Communications of the ACM, 35(12), 29-38.

 

* SYSTEMS: This section includes papers describing complete IR systems, focussing on those embodying modern views of what such systems should be like, but also illustrating the status of more `conventional' systems.

 

SALT83

Salton, G. and McGill, M.J. (1983). The SMART and SIRE experimental retrieval systems. In Introduction to Modern Information Retrieval, New York: McGraw-Hill, pp 118-156.

HARM92

Harman, D. (1992). User-friendly systems instead of user-friendly front-ends. Journal of the American Society for Information Science, 43 (?), 164-174.

WALK89

Walker, S. (1989). The Okapi online catalogue research projects. In The online catalogue: developments and directions (Ed C. Hildreth), London: The Library Association, 84-106.

CALL95

Callan, J., Croft, W.B. and Broglio, J. (1995). TREC and TIPSTER experiments with INQUERY. Information Processing and Management, 31 (3).

FOX87

Fox, E.A. and France, R.K. (1987). Architecture of an expert system for composite document analysis, representation and retrieval. Journal of Approximate Reasoning, 1, 151-175.

FOX88

Fox, E.A. and Koll, M.B. (1988). Practical enhanced Boolean retrieval: experiences with the SMART and SIRE systems. Information Processing and Management, 24, 257-267.

MCCU85

McCune, B.P., Tong, R. and Dean, J. (1985). RUBRIC, a system for rule-based information retrieval. IEEE Transactions on Software Engineering, SE-11 (9), 939-944.

JACO90

Jacobs, P.S. and Rau, L.F. (1990). SCISOR: extracting information from on-line news. Communications of the ACM, 33(11), 88-97.

LARS96

Larson, R.R., McDonough, J., Kuntz, L., O'Leary, P. and Moon, R. (1996). Cheshire II: Designing a Next-Generation Online Catalog. Journal of the American Society for Information Science, 47 (7), 555-567.

TENO94

Tenopir, C. and Cahn, P. (1994). TARGET and FREESTYLE: DIALOG and Mead join the relevance ranks. Online, 18 (3), 31-47. (shorter after ads deleted)

 

* EXTENSIONS: These papers move outwards from the classical text document/single query situation to consider other types of `document' and other versions and aspects of the information access task. The object is to illustrate the scope of information retrieval viewed more broadly, and to draw attention to the links between retrieval and other information processing activities. At the same time, since some of the ideas and work covered here also reflect new challenges and possibilities stemming from recent technology developments, this section has papers to be taken as initial leads into the future, rather than as authoritative guides to the established wisdom.

 

LARS88

Larson, R.R. (1988). Hypertext and Information Retrieval: Towards the Next Generation of Information Systems. In: Borgman, C. L. and Pai, E. Y. H. (Eds.) Information and Technology: Proceedings of the 51st ASIS Annual Meeting, Medford, NJ: Learned Information, Inc., 1988.

AGOS92

Agosti, M., Gradenigo, G. and Marchetti, P.G. (1992). A hypertext environment for interacting with large databases. Information Processing and Management, 28 (3), 371-387.

SALT94

Salton, G., Allan, J., Buckley, C. and Singhal, A. (1994). Automatic analysis, theme generation, and summarisation of machine-readable texts. Science, 264, 3 June, 1421-1426.

HULL96

Hull, D.A. and Grefenstette, G. (1996). Experiments in multilingual retrieval. Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval.

ROSE91

Rose, R.C. (1991). Techniques for information retrieval from speech messages. Lincoln Laboratory Journal, 4 (1), 45-59.

ZHAN95

Zhang, H.J., Low, C.Y., Smoliar, S.W. and Wu, J.H. (1995). Video parsing, retrieval and browsing: an integrated and content-based solution. Proceedings of ACM Multimedia '95, 15-24; reprinted in Intelligent multimedia information retrieval (Ed M. Maybury).

BIEB88

Biebricher, B. et al (1988). The automatic indexing system AIR/PHYS - from research to application. Eleventh International Conference on Research and Development in Information Retrieval, 333-342.

HAYE88

Hayes, P.J., Knecht, L. and Cellio, M. (1988). A news story categorisation system. Proceedings of the Second Conference on Applied Natural Language Processing, Association for Computational Linguistics, 9-17.

RAU88

Rau, L.F. (1988). Conceptual information extraction and retrieval from natural language input. RIAO 88, 424-437.

MARS84

Marsh, E., Hamburger, H. and Grishman, R. (1984). A production rule system for message summarisation. AAAI-84, Proceedings, American Association for Artificial Intelligence, 243-246.

JOHN93

Johnson, F.C., Paice, C.D., Black, W.J. and Neal, A.P. (1993). The application of linguistic processing to automatic abstract generation. Journal of Document and Text Management, 1 (3), 215-241.

More resources on information retrieval  (web pages, articles, textbooks, etc.) (prepared by the IR Research Group of UIUC).  (The original page is accessible at http://leep.lis.uiuc.edu/spring98/lis329/links.html)

Yaşar Tonta

tonta@hacettepe.edu.tr

Last updated: Feb 12, 2007