Information Retrieval
Spring 2015
9:00 AM ~ 12:00 PM, Tuesdays
Prof. Berlin Chen (陳柏琳)

Tentative List of Topics:


Course Overview & Introduction

Book Chapter: Modern Information Retrieval, Ch. 1
Paper: The History of Information Retrieval Research
03/10   Classical Models cf. Modern Information Retrieval, Ch.3
03/17   Evaluation Metrics cf. Modern Information Retrieval, Ch.4
03/24   Benchmark Collections cf. Modern Information Retrieval, Ch.4
Homework #1: Evaluation Metrics for IR
Homework #2: Retrieval Models
03/31   Extensions of Classical (Set, Algebra & Probabilistic) Models  
04/07   Relevance Feedback and Query Expansion
Homework #3: Relevance Feedback and Query Expansion
04/14   Relevance Feedback and Query Expansion  
04/28   Latent Semantic Analysis  
05/05   Language Modeling for Information Retrieval  
05/12   Clustering: Metrics and Techniques  
05/19   Clustering: Metrics and Techniques  
05/26   Indexing And Searching
Homework #4: Latent Semantic Analysis (Document Embedding) for IR
06/02   Paper Presentation
簡偉智 ACL 2014: Two-Stage Hashing for Fast Document Retrieval
許曜麒 SIGIR 2014: A Collective Topic Model for Milestone Paper Discovery
陳佳均 ICALP 2013: Information Retrieval Model Combining Sentence Level Retrieval
廖柏翔 ICTIR 2013: A Standard Document Score for Information Retrieval
陳之中 SIGIR 2014: Hashtag Recommendation for Hyperlinked Tweets
江浩群 ICSE 2012: Where Should the Bugs Be Fixed? More Accurate Information Retrieval-Based Bug Localization Based on Bug Reports
06/09   Paper Presentation
陳盈君 CIKM 2014: Spatial Verification for Scalable Mobile Image Retrieval
王涵 SIGIR 2014: On Measuring Social Friend Interest Similarities in Recommender System
蔡謹安 KDD 2014: Grouping Students in Educational Settings
林祺傑 SIGIR 2014: Mobile Query Reformulations
楊明翰 SIGIR 2014: The Role of Network Distance in LinkedIn People Search
張書堯 ICMC SMC 2014: Query-by-Multiple-Examples: Content-Based Search in Computer-Assisted Sound-Based Musical Composition
王鼎中 SIGIR 2014: A Simple Term Frequency Transformation Model for Effective Pseudo Relevance Feedback
06/16   Consultations on Homework Assignments (Room 203 of the CSIE Department)  
    User Interface for Search  
    Web Search Basics  
    Brief Overview of Automatic Summarization  


R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search (2nd Edition), ACM Press, 2011

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008
W. Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, Addison Wesley, 2009


C. C. Aggarwal, C. X. Zhai (eds.), Mining Text Data, Springer, 2012.
W. B. Frakes and R. Baeza-Yates, Information Retrieval: Data Structures & Algorithms, Prentice-Hall, 1992.
C. X. Zhai, Statistical Language Models for Information Retrieval (Synthesis Lectures Series on Human Language Technologies), Morgan & Claypool Publishers, 2008.

T. K. Landauer, D. S. McNamara, S. Dennis, W. Kintsch (eds.), Handbook of Latent Semantic Analysis, Lawrence Erlbaum, 2007.
D. A. Grossman, O. Frieder, Information Retrieval: Algorithms and Heuristics, Springer, 2004.
I. H. Witten, A. Moffat, and T. C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann Publishing, 1999.
C. Manning and H. Schütze, Foundations of Statistical Natural Language Processing, MIT Press, 1999.
D. Jurafsky and J. H. Martin, Speech and Language Processing, Prentice-Hall, 2000.
W.B. Croft and J. Lafferty (eds.), Language Models for Information Retrieval, Kluwer International Series on Information Retrieval, Volume 13, Kluwer Academic Publishers, 2002.
Stephen Robertson and Hugo Zaragoza, The Probabilistic Relevance Framework: BM25 and Beyond. Foundations and Trends in Information Retrieval 3 no. 4, 333-389 (2009).
D. Carmel and E. Yom-Tov, "Estimating the Query Difficulty for Information Retrieval," Synthesis Lectures on Information Concepts, Retrieval, and Services, Morgan & Claypool Publishers, 2010.
Juan-Manuel Torres-Moreno, "Automatic Text Summarization," Wiley-ISTE, 2014.


M. Sanderson and W. B. Croft, "The history of information retrieval research," Proceedings of the IEEE, Vol. 100, pp. 1444 - 1451, May 2012.
O. Kolomiyets, M.-F. Moens, "A survey on question answering technology from an information retrieval perspective," Information Sciences 181 (2011) 5412–5434
Johan Schalkwyk et al., "Google Search by Voice: A case study," 2010.
D. Blei, A. Ng, and M. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, 3:993-1022, January 2003.
V. Lavrenko and W. B. Croft, "Relevance-Based Language Models," ACM SIGIR 2001.
C. H. Papadimitriou, P. Raghavan, H. Tamaki, S. Vempala, "Latent semantic indexing: A probabilistic analysis." Analyzes an information retrieval technique related to principal component analysis.
X. Liu and W. B. Croft, "Statistical Language Modeling for Information Retrieval," Annual Review of Information Science and Technology, vol. 39, 2005.
Lan Huang. A Survey On Web Information Retrieval Technologies. 2000.
Karen Spärck Jones, "Some Points in a Time," Computational Linguistics, Vol. 31, No. 1, 2005.
D. Hiemstra, "Information Retrieval Model," In: A. Goker, J. Davies, and M. Graham (eds.), Information Retrieval: Searching in the 21st Century, Wiley, 2009
M. Steyvers, T. Griffiths,  "Probabilistic Topic Models," In T. K. Landauer, D. S. McNamara, S. Dennis, W. Kintsch (eds.). Handbook of Latent Semantic Analysis, Mahwah NJ: Lawrence Erlbaum, 2007.
X. Yi, J. Allan,  "A Comparative Study of Utilizing Topic Models for Information Retrieval," in the Proceedings of ECIR'09.
Nallapati, Discriminative Models for Information Retrieval, in the Proceedings of SIGIR 2004
T. Joachims and F. Radlinski, Search Engines that Learn from Implicit Feedback, IEEE Trans. on Computer 40(8), pp. 34-40, 2007
B. Chen, H.M. Wang, L.S. Lee, “A discriminative HMM/N-gram-based retrieval approach for Mandarin spoken documents,” ACM Transactions on Asian Language Information Processing, Vol. 3, No. 2, pp. 128-145, June 2004.


Information Retrieval Resources

            SIGIR-Information Retrieval Resources