Information Retrieval (3170718) is taught in the 7th semester of the Computer department. Its syllabus, with weightage, is as follows.

Sr. No. Content
1 Introduction to Information Retrieval: The nature of unstructured and semi-structured
text. Inverted index and Boolean queries.
2 Text Indexing, Storage and Compression: Text encoding: tokenization, stemming, stop
words, phrases, index optimization. Index compression: lexicon compression and postings
lists compression. Gap encoding, gamma codes, Zipf’s Law. Index construction. Postings
size estimation, merge sort, dynamic indexing, positional indexes, n-gram indexes, real-world issues.
3 Retrieval Models: Boolean, vector space, TF-IDF, Okapi, probabilistic, language modeling,
latent semantic indexing. Vector space scoring. The cosine measure. Efficiency
considerations. Document length normalization. Relevance feedback and query expansion.
4 Performance Evaluation: Evaluating search engines. User happiness, precision, recall,
F-measure. Creating test collections: kappa measure, inter-judge agreement. 6
5 Text Categorization and Filtering: Introduction to text classification. Naive Bayes models.
Spam filtering. Vector space classification using hyperplanes; centroids; k-nearest
neighbors. Support vector machine classifiers. Kernel functions. Boosting.
6 Text Clustering: Clustering versus classification. Partitioning methods. k-means clustering.
Mixture of Gaussians model. Hierarchical agglomerative clustering. Clustering terms using
7 Advanced Topics: Summarization, topic detection and tracking, personalization, question
answering, cross-language information retrieval.
8 Web Information Retrieval: Hypertext, web crawling, search engines, ranking, link
analysis, PageRank, HITS.
9 Retrieving Structured Documents: XML retrieval, semantic web 5

