Hacettepe University Department of Information Management
BBY 220 Principles of Information Retrieval (Spring 2007)
Yaşar Tonta
Time and place: Tuesdays 9:30-12:20 (Amfi 1)
Instructor: Yaşar Tonta (e-mail: tonta@hacettepe.edu.tr; tel: 297 82 04)
Course web site: http://yunus.hacettepe.edu.tr/~tonta/courses/spring2007/bby220/rdnglist.htm
SUGGESTED READINGS
Textbooks
Baeza-Yates, R. and Ribeiro-Neto, B. Modern Information Retrieval. Addison
Wesley, 1999. (See also http://www.sims.berkeley.edu/~hearst/irbook.)
Brown, J.S. & Duguid, P. The Social Life of Information. Boston, MA: Harvard Business School Press, 2000.
Chu, Heting. Information Representation and Retrieval in the Digital Age. Medford, NJ: ASIST, 2003.
Buckland, M.K. Information and Information Systems. New York: Praeger, 1991.
Korfhage, Robert R. Information Storage and Retrieval. New York: Wiley, 1997.
(See also: information about the book; glossary of IR terms; Korfhage's
course syllabus on "Information Storage and Retrieval".)
Lancaster, F.W. Information Retrieval Systems. (2d ed.). Wiley, 1979.
Lesk, M. Digital Libraries. San Francisco: Morgan Kaufmann, 2001.
Rowley, J. Bilginin Düzenlenmesi: Bilgi Erişime Giriş.
(Trans. S. Karakaş et al.) Ankara: TKD Ankara Şubesi, 1996.
Salton, G. Automatic text processing: the transformation,
analysis, and retrieval of information by computer. Reading, Mass.: Addison-Wesley, 1988.
Salton, G. and McGill, M.J. Introduction to Modern Information Retrieval. New York: McGraw-Hill, 1983.
Sparck Jones, K. and Willett, P. Readings in Information Retrieval. San Francisco: Morgan Kaufmann, 1997.
Svenonius, E. The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press, 2000.
Taylor, A.G. The Organization of Information. Englewood, CO: Libraries Unlimited, 1999.
Tonta, Y., Bitirim, Y. and Sever, H. Türkçe Arama Motorlarında Performans Değerlendirme.
(Performance Evaluation of Turkish Search Engines). Ankara: Total Bilişim Ltd. Şti.,
2002. xvi, 152 pp. (ISBN 975 92923-0-0). (Abstract in Turkish) (Summary) (Full text (PDF))
Van Rijsbergen, C.J. Information Retrieval. (2d ed.) London: Butterworths,
1979. (The full text of the book is available in both HTML and PDF formats
via the link above.)
Articles
Alkan, N. “Bilgi Taramalarının Nitelik Açısından Değerlendirilmesinde
‘Kesin İsabet’ (Kİ-Precision) ve ‘Erişim İsabeti’ (Eİ-Recall) Oranları”,
Türk Kütüphaneciliği 8(4): 254-265, 1994.
Alkan, N. “Bilgi Taramalarında Temel Başarısızlık Nedenleri”,
Türk Kütüphaneciliği 9(2): 91-102, 1995.
Bitirim, Y., Tonta, Y. and Sever, H. “Information Retrieval Effectiveness of Turkish Search
Engines,” In Tatyana M. Yakhno (Ed.): Advances
in Information Systems: Second International Conference, ADVIS 2002, İzmir, Turkey, October
23-25, 2002, Proceedings (pp. 93-103). Berlin: Springer-Verlag, 2002. (PDF copy)
Blair, D.C. & Maron, M.E. "An Evaluation of Retrieval
Effectiveness for a Full-Text Document Retrieval System," Communications
of the ACM 28(3): 289-299, 1985.
Borges, J.L. "The Library of Babel",
Labyrinths: Selected Stories & Other Writings.
Bush, V. "As We May Think" (http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm)
(Originally appeared in Atlantic Monthly, 1945.) See also: [Reader]
Croft, B. "What Do People Want From Information Retrieval?" D-Lib Magazine (November 1995).
Cooper, W.S. "Getting Beyond Boole." Information Processing &
Management 24(3): 243-248, 1988.
Davis, C. "Beyond Boole: The Next Logical Step." ASIS Bulletin, June 1995.
Gudivada, V.N. et al. "Information Retrieval on the World Wide Web,"
IEEE Internet Computing 1(5): 58-68, 1997.
Soydal, İrem, Umut Al and Umut Sezen, "İçerik Tabanlı Görüntü Erişim Sistemleri" Bilgi
Dünyası, 6(2): 155-170, October 2005.
Swanson, D.R. (1988). Historical note: information retrieval and the future
of an illusion. Journal of the American Society for Information Science,
39 (2), 92-98.
Tonta, Y. “Bilgi Erişim Sistemleri”,
Türk Kütüphaneciliği 9(3): 302-314, September 1995.
Tonta, Y. "Bilgi erişim sorunu". In Tülay Fenerci and Oya Gürdal (Eds.),
21. Yüzyıla Girerken Enformasyon Olgusu: Ulusal Sempozyum Bildirileri, 19-20
April 2001, Hatay (pp. 198-206). Ankara: Türk Kütüphaneciler Derneği, 2001.
Tonta, Y. "Bilgi Erişim Sorunları ve Internet".
37. Kütüphane Haftası Bildirileri, 26 March - 1 April 2001, Milli Kütüphane, Ankara. Ankara:
TKD, 2002.
Tonta, Y. Bilgi yönetiminin kavramsal tanımı ve uygulama alanları. (Paper.) In Kütüphaneciliğin
Destanı Uluslararası Sempozyumu, 21-24 October 2004, Ankara (Proceedings)
(pp. 55-68). Ankara: AÜ DTCF Bilgi ve Belge
Yönetimi Bölümü, 2004. (PDF copy) (Presentation slides)
Tonta, Y. Bibliyografik Kayıtlar için İşlevsel Gerekler
(FRBR) Kavramsal Modeli. In Prof. Dr. Nilüfer Tuncer'e Armağan (pp. 278-290). (Prepared by M. Emin Küçük) Ankara: TKD, 2005.
(PDF)
Wells, H.G. "World Brain: The Idea of a Permanent World
Encyclopedia," (http://art-bin.com/art/obrain.html). (Originally appeared in Encyclopédie
Française, August 1937.)
These are readings for
background and discussion. Some of these will be assigned for class discussion, others are for those who wish to dig further
into particular subjects. The list is based on the textbook of readings on IR
by Peter Willett and Karen Sparck Jones, with some
additional items.
* HISTORICAL: These items cover some
early ideas and implementations that provide some of the foundations of
information retrieval theory and practice.
LUHN57
Luhn, H.P. (1957). A Statistical Approach to
Mechanized Encoding and Searching of Literary Information. IBM
Journal of Research and Development, 1, 309-317.
FAIR58
Fairthorne, R.A. (1958). Automatic Retrieval
of Recorded Information. Computer Journal, 1, 36-41. (Also in Fairthorne, R.A. (1961). Towards information retrieval. London: Butterworths).
JOYC58
Joyce, T. and Needham, R.M. (1958). The thesaurus approach to
information retrieval. American Documentation, 9 (3), 192-197.
LUHN61
Luhn, H.P. (1961). The automatic derivation
of information retrieval encodements from
machine-readable texts. Information retrieval and machine translation
(Ed A. Kent), Vol 3, Pt 2, 1021-1028; reprinted in
C.K. Schultz, Ed, H.P. Luhn: Pioneer of
information science, New York: Spartan Books, 1968.
MARO60
Maron, M.E. and Kuhns, J.L. (1960). On relevance,
probabilistic indexing and information retrieval. Journal
of the Association for Computing Machinery, 7, 216-244.
MARO61
Maron, M.E. (1961).
Automatic indexing: an experimental inquiry. Journal of
the Association for Computing Machinery, 8, 404-417.
DOYL62
Doyle, L.B.
(1962). Indexing and abstracting by association. Part 1.
SP-718/001/00, System
Development Corporation, Santa Monica CA.
MARO65
Maron, M.E. (1965). Mechanised documentation: the logic behind a probabilistic
interpretation. Statistical methods for mechanised
documentation (Ed M.E. Stevens, V.E. Giuliano and
L.B. Heilprin), National Bureau of Standards
Miscellaneous Publication 269, Washington DC: US
Government Printing Office, 9-13.
CLEV67
Cleverdon, C.W. (1967). The Cranfield
tests on index language devices. Aslib
Proceedings, 19, 1967, 173-192.
SALT68
Salton, G. and Lesk, M.E. (1968). Computer evaluation of indexing and
text processing. Journal of the ACM, 15 (1), 8-36; reprinted in
G. Salton, Ed, The SMART retrieval system,
Englewood Cliffs NJ: Prentice-Hall, 1971, 143-180.
* KEY CONCEPTS: These papers examine
the nature of documents, aboutness, indexing and
index languages, requests, relevance, users and searching. Note this section
deals with these topics primarily in an analytical and descriptive style,
rather than by wholesale modelling of the retrieval
process, covered in a later section.
HUTC78
Hutchins, W.J. (1978). The concept of `aboutness'
in subject indexing. Aslib Proceedings, 30, 172-181.
CLEV63
Cleverdon, C.W. and
Mills, J. (1963). The testing of index language
devices. Aslib Proceedings, 15
(4), 106-130; reprinted in L.M. Chan, P.A. Richmond and E. Svenonius,
Eds, Theory of Subject Analysis, Littleton CO:
Libraries Unlimited, 1986, 223-246.
FOSK80
Foskett, D.J. (1980). Thesaurus. In A. Kent, H. Lancour and J.E.
Daily, Eds, Encyclopedia of Library and
Information Science, Vol 30, New York: Marcel Dekker, 416-462; reprinted in E.D. Dym,
Ed, Subject and information analysis, New York: Marcel Dekker, 1985, 270-316.
DANI85
Daniels, P.J., Brooks, H.M. and Belkin, N.J. (1985). Using problem structures for driving
human-computer dialogues. RIAO-85, Actes: Recherche d'Informations Assistee par Ordinateur, Grenoble: IMAG, 645-660.
SARA75
Saracevic, T. (1975). Relevance: a
review of and a framework for the thinking on the
notion in information science. Journal of the American Society for
Information Science, 26 (6), 321-343.
* EVALUATION: These papers cover the
notions of performance issues, criteria for performance evaluation, test design
and methodology, with examples illustrating the methods.
SARA88
Saracevic, T. et al. (1988). A study of information seeking and retrieving,
Parts 1, 2, 3. Journal of the American Society for Information
Science, 39 (3), 161-216. (Pt 1 only.)
COOP73
Cooper, W.S.
(1973). On selecting a measure of retrieval effectiveness.
Pt 1. Journal of the American Society for
Information Science, 24 (?2), 87-100.
TAGU92
Tague-Sutcliffe, J. (1992). The pragmatics of information
retrieval experimentation, revisited. Information Processing and Management,
28 (4), 467-490.
KEEN92
Keen, E.M. (1992). Presenting results of experimental retrieval
comparisons. Information Processing and Management, 28 (4), 491-502.
LANC69
Lancaster, F.W. (1969). MEDLARS: Report on the evaluation of its operating efficiency. American
Documentation, 20 (2), 119-142; reprinted in T. Saracevic,
Ed, Introduction to Information Science, New York: Bowker,
1970, 640-664.
BLAI85
Blair, D.C. and Maron, M.E. (1985). An evaluation of retrieval
effectiveness for a full-text document retrieval system. Communications
of the ACM, 28 (??), 289-299.
SALT86
Salton, G. (1986). Another look at
text-retrieval systems. Communications of the ACM, 29(7), 648-656.
BLAI90
Blair, D.C. and Maron, M.E. (1990). Full text
information retrieval: further analysis and clarification. Information
Processing and Management, 26, 437-447.
BLAI96
Blair, D.C.
(1996). STAIRS redux: thoughts on the STAIRS
evaluations, ten years after. Journal of the American Society for
Information Science, 47, 4-22.
HARM95
Harman, D.
(1995). The TREC Conferences. Hypertext -
Information Retrieval - Multimedia: Synergieeffekte elektronischer Informationssysteme,
HIM '95, Proceedings (Ed R. Kuhlen and M. Rittberger), Konstanz: Universitaetsverlag Konstanz,
9-28.
* BASIC IR MODELS: These papers cover
models of IR, both qualitative and quantitative (e.g.
cognitive, statistical), concentrating on the general notions of the main IR
models. Implementation issues are described later in Techniques.
ROBE77a
Robertson,
S.E. (1977). Theories and models in information retrieval.
Journal of Documentation, 33, 126-148.
BELK82
Belkin, N.J., Oddy, R.N. and Brooks, H.M. (1982). ASK for
information retrieval: part 1. Background and theory. Journal
of Documentation, 38, 61-71.
COOP88
Cooper, W.S.
(1988). Getting beyond Boole.
Information Processing and Management, 24, 243-248.
ROBE77b
Robertson,
S.E. (1977). The probability ranking principle in IR. Journal
of Documentation, 33, 294-304.
SALT75
Salton, G., Wong, A.
and Yang, C.S. (1975). A vector space model for automatic
indexing. Communications of the ACM, 18 (11), 613-620.
ROBE82
Robertson, S.E., Maron, M.E. and Cooper, W.S.
(1982). Probability of Relevance: A Unification of Two Competing
Models for Document Retrieval. Information Technology: Research and
Development, 1, 1-21.
TURT90
Turtle, H.R. and Croft, W.B. (1990). Inference networks for
document retrieval. Proceedings of the 13th International
Conference on Research and Development in Information Retrieval, 1-24,
1990.
VANR86
van Rijsbergen, C.J. (1986). A non-classical logic for information retrieval. Computer Journal, 29, 481-485, 1986.
* IR TECHNIQUES: These papers
examine the details of various models and other specific techniques and
technologies, including reports of testing.
BELK87
Belkin, N.J. and
Croft, W.B. (1987). Retrieval Techniques. Annual
Review of Information Science and Technology, 22, 109-145.
ROBE76
Robertson, S.E. and Sparck Jones, K. (1976). Relevance Weighting of Search Terms. Journal of the
American Society for Information Science, 27(3), 129-146.
CROF79
Croft, W.B. and Harper, D.J. (1979). Using
probabilistic models of document retrieval without relevance information.
Journal of Documentation, 35, 285-295.
PORT80
Porter, M.F.
(1980). An algorithm for suffix stripping. Program,
14, 130-137.
ROBE94
Robertson,
S.E. and Walker, S. (1994). Some simple effective
approximations to the 2-Poisson model for probabilistic weighted retrieval.
SIGIR 94 - Proceedings of the Seventeenth Annual International ACM SIGIR
Conference on Research and Development in Information Retrieval, 232-241.
SALT88
Salton, G. and
Buckley, C. (1988). Term weighting approaches in automatic text
retrieval. Information Processing and Management, 24, 513-523.
SALT90
Salton, G. and
Buckley, C. (1990). Improving retrieval performance by
relevance feedback. Journal of the American Society for Information Science,
41, 288-297, 1990.
SPAR79
Sparck Jones, K. (1979). Search term relevance weighting given
little relevance information. Journal of Documentation, 35 (1), 30-48.
STRZ94
Strzalkowski, T. (1994) Robust text
processing in automated information retrieval. Proceedings of the 4th
Conference on Applied Natural Language Processing (Stuttgart), Association for Computational Linguistics, 168-173.
GRIF86
Griffiths, A., Luckhurst, H.C. and Willett, P.
(1986). Using interdocument
similarity information in document retrieval systems. Journal of the
American Society for Information Science, 37 (1), 3-11.
BELK92
Belkin, N.J. and Croft,
W.B. (1992). Information filtering and information retrieval: two
sides of the same coin? Communications of the ACM, 35(12), 29-38.
* SYSTEMS: This section includes
papers describing complete IR systems, focussing on
those embodying modern views of what such systems should be like, but also
illustrating the status of more `conventional' systems.
SALT83
Salton, G. and
McGill, M.J. (1983). The SMART and SIRE experimental
retrieval systems. In Introduction to Modern
Information Retrieval, New York:
McGraw-Hill, pp 118-156.
HARM92
Harman, D.
(1992). User-friendly systems instead of user-friendly
front-ends. Journal of the American Society for Information Science,
43 (?), 164-174.
WALK89
Walker, S. (1989).
The Okapi online catalogue research projects. In
The online catalogue: developments and directions (Ed C. Hildreth),
London: The Library
Association, 84-106.
CALL95
Callan, J., Croft,
W.B. and Broglio, J. (1995). TREC and TIPSTER experiments with INQUERY. Information Processing and Management, 31 (3).
FOX87
Fox, E.A. and France, R.K.
(1987). Architecture of an expert system for
composite document analysis, representation and retrieval. Journal of
Approximate Reasoning, 1, 151-175.
FOX88
Fox, E.A. and Koll, M.B. (1988). Practical
enhanced Boolean retrieval: experiences with the SMART and SIRE systems. Information
Processing and Management, 24, 257-267.
MCCU85
McCune, B.P., Tong, R. and Dean, J. (1985). RUBRIC, a system for rule-based information retrieval. IEEE Transactions on Software Engineering, SE-11 (9),
939-944.
JACO90
Jacobs, P.S. and Rau, L.F. (1990). SCISOR: extracting information from
on-line news. Communications of the ACM, 33(11), 88-97.
LARS96
Larson, R.R.,
McDonough, J., Kuntz, L., O'Leary, P. and Moon, R. (1996). Cheshire II:
designing a next-generation online catalog. Journal of the American
Society for Information Science, 47 (7), 555-567.
TENO94
Tenopir, C. and Cahn, P. (1994). TARGET and FREESTYLE: DIALOG and
Mead join the relevance ranks. Online, 18 (3), 31-47. (shorter after ads deleted)
* EXTENSIONS: These papers move
outwards from the classical text document/single query situation to consider
other types of `document' and other versions and aspects of the information
access task. The object is to illustrate the scope of information retrieval
viewed more broadly, and to draw attention to the links between retrieval and
other information processing activities. At the same time, since some of the
ideas and work covered here also reflect new challenges and possibilities
stemming from recent technology developments, this section has papers to be
taken as initial leads into the future, rather than as authoritative guides to
the established wisdom.
LARS88
Larson, R.R. (1988). Hypertext
and information retrieval: towards the next generation of information
systems. In: Borgman, C. L. and Pai,
E. Y. H. (Eds.) Information and Technology: Proceedings of the 51st ASIS
Annual Meeting, Medford, NJ: Learned
Information, Inc.
AGOS92
Agosti, M., Gradenigo, G. and Marchetti, P.G.
(1992). A hypertext environment for interacting
with large databases. Information Processing and Management, 28 (3),
371-387.
SALT94
Salton, G., Allan,
J., Buckley, C. and Singhal, A. (1994). Automatic analysis, theme generation, and summarisation
of machine-readable texts. Science, 264, 3 June, 1421-1426.
HULL96
Hull, D.A. and Grefenstette, G. (1996). Experiments in multilingual retrieval. Proceedings
of the 19th Annual International ACM SIGIR Conference on Research and
Development in Information Retrieval.
ROSE91
Rose, R.C. (1991). Techniques for
information retrieval from speech messages. Lincoln Laboratory
Journal, 4 (1), 45-59.
ZHAN95
Zhang, H.J., Low, C.Y., Smoliar, S.W. and Wu,
J.H. (1995). Video parsing, retrieval and browsing: an integrated and
content-based solution. Proceedings of ACM Multimedia '95, 15-24;
reprinted in Intelligent multimedia
information retrieval (Ed M. Maybury).
BIEB88
Biebricher, B. et al (1988). The automatic
indexing system AIR/PHYS - from research to application. Eleventh
International Conference on Research and Development in Information Retrieval,
333-342.
HAYE88
Hayes, P.J., Knecht, L. and Cellio,
M. (1988). A news story categorisation
system. Proceedings of the Second Conference on Applied Natural
Language Processing, Association for Computational Linguistics, 9-17.
RAU88
Rau, L.F.
(1988). Conceptual information extraction and retrieval from natural language
input. RIAO 88, 424-437.
MARS84
Marsh, E., Hamburger, H. and Grishman, R.
(1984). A production rule system for message summarisation. AAAI-84, Proceedings, American
Association for Artificial Intelligence, 243-246.
JOHN93
Johnson, F.C., Paice, C.D., Black, W.J. and
Neal, A.P. (1993). The application of linguistic
processing to automatic abstract generation. Journal of Document and
Text Management, 1 (3), 215-241.
More resources on
information retrieval
(web pages, articles, textbooks, etc.) (prepared
by the IR Research Group of UIUC). (The original page is accessible at http://leep.lis.uiuc.edu/spring98/lis329/links.html)
Yaşar Tonta
tonta@hacettepe.edu.tr
Last updated: Feb 12, 2007