Coursera course Text Retrieval and Search Engines: Week 3 Overview
Week 3 Overview
Week 3
On this page:
- Instructional Activities
- Time
- Goals and Objectives
- Key Phrases/Concepts
- Guiding Questions
- Readings and Resources
- Video Lectures
- Tips for Success
- Getting and Giving Help
Instructional Activities
Below is a list of the activities and assignments available to you this week. See the How to Pass the Class page to know which assignments pertain to the badge or badges you are pursuing. Click on the name of each activity for more detailed instructions.
| Relevant Badges | Activity | Due Date* | Estimated Time Required |
|---|---|---|---|
| | Week 3 Video Lectures | Sunday, April 12 (suggested) | 3 hours |
| | Week 3 Quiz | Sunday, April 19 | ~0.5 hours |
* All deadlines are at 11:55 PM Central Time (time zone conversion) unless otherwise noted.
Time
This module will last 7 days, and it should take approximately 6 hours of dedicated time to complete its readings and assignments.
Goals and Objectives
After you actively engage in the learning experiences in this module, you should be able to:
- Explain how to interpret p(R=1|q,d), and estimate it based on a large set of collected relevance judgments (or clickthrough information) about query q and document d.
- Explain how to interpret the conditional probability p(q|d) used for scoring documents in the query likelihood retrieval function.
- Explain what a Statistical Language Model is and what a Unigram Language Model is.
- Explain how to compute the maximum likelihood estimate of a Unigram Language Model.
- Explain how to use Unigram Language Models to discover semantically related words.
- Compute p(q|d) based on a given document language model p(w|d).
- Explain smoothing.
- Show that the query likelihood retrieval function implements TF-IDF weighting if we smooth the document language model p(w|d) using the collection language model p(w|C) as a reference language model.
- Compute the estimate of p(w|d) using Jelinek-Mercer (JM) smoothing and Dirichlet Prior smoothing, respectively.
- Explain the similarities and differences among the three kinds of feedback: relevance feedback, pseudo-relevance feedback, and implicit feedback.
- Explain how the Rocchio feedback algorithm works.
- Explain how the Kullback-Leibler (KL) divergence retrieval function generalizes the query likelihood retrieval function.
- Explain the basic idea of using a mixture model for feedback.
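Several of the objectives above involve concrete computations. The following is a minimal Python sketch of the maximum likelihood estimate, Jelinek-Mercer smoothing, Dirichlet Prior smoothing, and the log of p(q|d). It is not taken from the course materials: the function names, the toy documents, and the default values of `lam` and `mu` are all illustrative assumptions.

```python
import math
from collections import Counter

def mle_unigram(tokens):
    """Maximum likelihood estimate: p(w) = c(w) / total number of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def p_jm(word, doc_counts, doc_len, p_coll, lam=0.1):
    """Jelinek-Mercer: (1 - lam) * c(w,d)/|d| + lam * p(w|C). lam is an assumed value."""
    return (1 - lam) * doc_counts.get(word, 0) / doc_len + lam * p_coll.get(word, 0.0)

def p_dirichlet(word, doc_counts, doc_len, p_coll, mu=2000.0):
    """Dirichlet Prior: (c(w,d) + mu * p(w|C)) / (|d| + mu). mu is an assumed value."""
    return (doc_counts.get(word, 0) + mu * p_coll.get(word, 0.0)) / (doc_len + mu)

def log_query_likelihood(query, doc, p_coll, smooth):
    """log p(q|d) = sum of log p(w|d) over query words, using a smoothed p(w|d)."""
    doc_counts = Counter(doc)
    return sum(math.log(smooth(w, doc_counts, len(doc), p_coll)) for w in query)

# Toy collection (hypothetical data, for illustration only).
docs = [
    "text retrieval and search engines".split(),
    "statistical language model for retrieval".split(),
]
# Collection language model p(w|C), estimated by MLE over all documents.
p_coll = mle_unigram([w for d in docs for w in d])

query = "language retrieval".split()
scores = [log_query_likelihood(query, d, p_coll, p_jm) for d in docs]
```

With this toy data the second document, which contains both query words, receives the higher score. The first document still gets a finite score because smoothing gives the unseen word "language" a nonzero probability proportional to p(w|C); without smoothing, its log-likelihood would be undefined.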
Key Phrases/Concepts
Keep your eyes open for the following key terms or phrases as you complete the readings and interact with the lectures. These topics will help you better understand the content in this module.
- p(R=1|q,d) ; query likelihood, p(q|d)
- Statistical Language Model; Unigram Language Model
- Maximum likelihood estimate
- Background language model, collection language model, document language model
- Smoothing of Unigram Language Models
- Relation between query likelihood and TF-IDF weighting
- Linear interpolation (i.e., Jelinek-Mercer) smoothing
- Dirichlet Prior smoothing
- Relevance feedback, pseudo-relevance feedback, implicit feedback
- Rocchio
- Kullback-Leibler divergence (KL-divergence) retrieval function
- Mixture language model
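As a concrete handle on the Rocchio term listed above, here is a minimal sketch of the standard Rocchio update in the vector space model: the new query is the original query scaled by alpha, plus beta times the centroid of the relevant documents, minus gamma times the centroid of the non-relevant documents. The function name, the dict-based vector representation, and the default coefficient values are assumptions chosen for illustration, not values prescribed by the course.

```python
def rocchio(query_vec, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector.

    Vectors are dicts mapping term -> weight. The coefficients are
    illustrative defaults, not values taken from the lectures.
    """
    # Start from the original query, scaled by alpha.
    new_q = {t: alpha * w for t, w in query_vec.items()}
    # Add the relevant centroid (weight beta), subtract the non-relevant one (weight gamma).
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue  # an empty set contributes nothing
        for d in docs:
            for t, w in d.items():
                new_q[t] = new_q.get(t, 0.0) + coeff * w / len(docs)
    return new_q

# Hypothetical toy vectors.
q = {"search": 1.0}
rel = [{"search": 1.0, "engine": 1.0}]
non_rel = [{"cat": 1.0}]
updated = rocchio(q, rel, non_rel)
```

Keeping alpha comparatively large is one way to make sure the original query terms retain sufficiently large weights after feedback.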
Guiding Questions
Develop your answers to the following guiding questions while completing the readings and working on assignments throughout the week.
- Given a table of relevance judgments in the form of three columns (query, document, and binary relevance judgments), how can we estimate p(R=1|q,d)?
- How should we interpret the query likelihood conditional probability p(q|d)?
- What is a Statistical Language Model? What is a Unigram Language Model? How many parameters are there in a unigram language model?
- How do we compute the maximum likelihood estimate of the Unigram Language Model (based on a text sample)?
- What is a background language model? What is a collection language model? What is a document language model?
- Why do we need to smooth a document language model in the query likelihood retrieval model? What would happen if we don’t do smoothing?
- When we smooth a document language model using a collection language model as a reference language model, what is the probability assigned to an unseen word in a document?
- How can we prove that the query likelihood retrieval function implements TF-IDF weighting if we smooth with a collection language model?
- How does linear interpolation (Jelinek-Mercer) smoothing work? What is the formula?
- How does Dirichlet Prior smoothing work? What is the formula?
- What are the similarities and differences between Jelinek-Mercer smoothing and Dirichlet Prior smoothing?
- What is relevance feedback? What is pseudo-relevance feedback? What is implicit feedback?
- How does Rocchio work? Why do we need to ensure that the original query terms have sufficiently large weights in feedback?
- What is the KL-divergence retrieval function? How is it related to the query likelihood retrieval function?
- What is the basic idea of the two-component mixture model for feedback?
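For the first guiding question, one straightforward reading is a count-based estimate: for each (query, document) pair, divide the number of judgments with R=1 by the total number of judgments for that pair. The sketch below assumes the table is given as (query, document, relevance) tuples; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict

def estimate_relevance(judgments):
    """Estimate p(R=1|q,d) as count(q, d, R=1) / count(q, d).

    judgments: iterable of (query, doc, r) tuples with r in {0, 1}.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for q, d, r in judgments:
        totals[(q, d)] += 1
        positives[(q, d)] += r
    return {pair: positives[pair] / totals[pair] for pair in totals}

# Hypothetical judgment table.
judgments = [
    ("q1", "d1", 1), ("q1", "d1", 1), ("q1", "d1", 0),
    ("q1", "d2", 0), ("q1", "d2", 1),
]
p_rel = estimate_relevance(judgments)
```

This estimate only covers pairs that appear in the table; it says nothing about unseen queries or documents, which is one motivation for approximating relevance with the query likelihood p(q|d) instead.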
Readings and Resources
Read ONLY Chapter 3 and part of Chapter 5 (pages 55–63)
- Zhai, ChengXiang. Statistical Language Models for Information Retrieval. Synthesis Lectures Series on Human Language Technologies. Morgan & Claypool Publishers, 2008.
Video Lectures
Video Lecture | Lecture Notes | Transcript | Video Download | SRT Caption File | Forum |
---|---|---|---|---|---|
3.1 Probabilistic Retrieval Model: Basic Idea (00:12:44) | | | (17.1 MB) | | |
3.2 Probabilistic Retrieval Model: Statistical Language Model (00:17:53) | | | (24.3 MB) | | |
3.3 Probabilistic Retrieval Model: Query Likelihood (00:12:07) | | | (16.2 MB) | | |
3.4 Probabilistic Retrieval Model: Statistical Language Model – Part 1 (00:12:15) | | | (16.5 MB) | | |
3.4 Probabilistic Retrieval Model: Statistical Language Model – Part 2 (00:09:36) | | | (13.5 MB) | | |
3.5 Probabilistic Retrieval Model: Smoothing Methods – Part 1 (00:09:54) | | | (14.5 MB) | | |
3.5 Probabilistic Retrieval Model: Smoothing Methods – Part 2 (00:13:17) | | | (18.4 MB) | | |
3.6 Retrieval Methods: Feedback in Text Retrieval (00:06:49) | | | (9.6 MB) | | |
3.7 Feedback in Text Retrieval: Feedback in VSM (00:12:05) | | | (16.7 MB) | | |
3.8 Feedback in Text Retrieval: Feedback in LM (00:19:11) | | | (26.4 MB) | | |
Tips for Success
To do well this week, I recommend that you do the following:
- Review the video lectures a number of times to gain a solid understanding of the key questions and concepts introduced this week.
- When possible, provide tips and suggestions to your peers in this class. As a learning community, we can help each other learn and grow. One way of doing this is by helping to address the questions that your peers pose. By engaging with each other, we’ll all learn better.
- It’s always a good idea to refer to this week’s video lectures and chapter readings and to reference them in your responses. When appropriate, critique the information presented.
- Take notes while you read the materials and watch the lectures for this week. By taking notes, you are interacting with the material and will find that it is easier to remember and to understand. With your notes, you’ll also find that it’s easier to complete your assignments. So, go ahead, do yourself a favor; take some notes!
Getting and Giving Help
You can get/give help via the following means:
- Use the Learner Help Center to find information regarding specific technical problems. For example, technical problems would include error messages, difficulty submitting assignments, or problems with video playback. You can access the Help Center by clicking on the Help Center link at the top right of any course page. If you cannot find an answer in the documentation, you can also report your problem to the Coursera staff by clicking on the Contact Us! link available on each topic's page within the Learner Help Center.
- Use the Content Issues forum to report errors in lecture video content, assignment questions and answers, assignment grading, text and links on course pages, or the content of other course materials. University of Illinois staff and Community TAs will monitor this forum and respond to issues.
As a reminder, the instructor is not able to answer emails sent directly to his account. Rather, all questions should be reported as described above.
from: https://class.coursera.org/textretrieval-001/wiki/Week3Overview