Week 1 Overview

Week 1

Instructional Activities

Below is a list of the activities and assignments available to you this week. Click on the name of each activity for more detailed instructions.

| Relevant Badges | Activity | Due Date* | Estimated Time Required |
|---|---|---|---|
| | Week 1 Video Lectures | Sunday, March 29 (Suggested) | 3 hours |
| | Programming Assignments Overview | Sunday, March 29 (Suggested) | ~1 hour |
| | Week 1 Quiz | Sunday, April 19 | ~0.5 hour |

* All deadlines are at 11:55 PM Central Time (time zone conversion) unless otherwise noted.

Time

This module lasts 7 days and should take approximately 5 hours of dedicated time to complete, including its readings and assignments.

Goals and Objectives

After you actively engage in the learning experiences in this module, you should be able to:

  • Explain some basic concepts in natural language processing and text information access.
  • Explain why text retrieval is often defined as a ranking problem.
  • Explain how the vector space retrieval model works.
  • Explain what TF-IDF weighting is and why TF transformation and document length normalization are necessary for the design of an effective ranking function.
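
As a rough illustration of the last two objectives, the sketch below ranks documents with a basic vector space model: each document and the query is a bag-of-words vector, each term is weighted by TF-IDF, and the score is the dot product. This is a minimal sketch, not the course's reference implementation. The function names, the sample documents, and the IDF variant log((M + 1) / df) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Bag-of-words view: lowercase and split on whitespace, ignoring word order."""
    return text.lower().split()


def idf(term, tokenized_docs):
    """Inverse document frequency, assuming the variant log((M + 1) / df)."""
    m = len(tokenized_docs)
    df = sum(1 for doc in tokenized_docs if term in doc)
    return math.log((m + 1) / df) if df else 0.0


def score(query, doc_tokens, tokenized_docs):
    """Dot product of the query vector and the TF-IDF weighted document vector."""
    q_counts = Counter(tokenize(query))
    d_counts = Counter(doc_tokens)
    return sum(q_counts[w] * d_counts[w] * idf(w, tokenized_docs) for w in q_counts)


def rank(query, docs):
    """Return document indices sorted from highest to lowest score."""
    tokenized = [tokenize(d) for d in docs]
    scores = [score(query, t, tokenized) for t in tokenized]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Hypothetical toy collection, used only for this example.
DOCS = [
    "news about presidential campaign",
    "news about organic food campaign",
    "news of presidential campaign presidential candidate",
]

if __name__ == "__main__":
    print(rank("presidential campaign", DOCS))
```

With this toy collection the third document ranks first because it repeats "presidential", a term that is rarer across the collection than "campaign". Raw term counts can reward repetition too strongly, which is the motivation for TF transformation and length normalization.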

Key Phrases/Concepts

Keep your eyes open for the following key terms or phrases as you complete the readings and interact with the lectures. These topics will help you better understand the content in this module.

  • Part-of-speech tagging; syntactic analysis; semantic analysis; ambiguity
  • “Bag of words” representation
  • Push, pull, querying, browsing
  • Probability Ranking Principle
  • Relevance
  • Vector Space Model
  • Term Frequency (TF)
  • Document Frequency (DF); Inverse Document Frequency (IDF)
  • TF Transformation
  • Pivoted length normalization
  • Dot product
  • BM25
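
For orientation, the last few key phrases come together in two ranking functions. The formulas below are one commonly presented form, and the notation is an assumption that may differ from the lecture slides: c(w,q) and c(w,d) are the counts of word w in the query and the document, |d| is the document length, avdl is the average document length, M is the number of documents in the collection, df(w) is the document frequency of w, b in [0,1] controls length normalization, and k >= 0 controls TF saturation.

```latex
% Pivoted length normalization (sublinear TF, pivoted normalizer, IDF)
f(q,d) = \sum_{w \in q \cap d} c(w,q)\,
         \frac{\ln\bigl(1 + \ln(1 + c(w,d))\bigr)}{1 - b + b\,\frac{|d|}{avdl}}\,
         \log\frac{M+1}{df(w)}

% BM25 (bounded TF transformation with length normalization, IDF)
f(q,d) = \sum_{w \in q \cap d} c(w,q)\,
         \frac{(k+1)\,c(w,d)}{c(w,d) + k\left(1 - b + b\,\frac{|d|}{avdl}\right)}\,
         \log\frac{M+1}{df(w)}
```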

Guiding Questions

Develop your answers to the following guiding questions while watching the video lectures throughout the week.

  • What does a computer have to do in order to understand a natural language sentence?
  • What is ambiguity?
  • Why is natural language processing (NLP) difficult for computers?
  • What is bag-of-words representation? Why do modern search engines use this simple representation of text?
  • What are the two modes of text information access? Which mode does a Web search engine such as Google support?
  • When is browsing more useful than querying to help a user find relevant information?
  • Why is a text retrieval task defined as a ranking task?
  • What is a retrieval model?
  • What are the two assumptions made by the Probability Ranking Principle?
  • What is the Vector Space Retrieval Model? How does it work?
  • How do we define the dimensions of the Vector Space Model?
  • What are some different ways to place a document as a vector in the vector space?
  • What is Term Frequency (TF)?
  • What is TF Transformation?
  • What is Document Frequency (DF)?
  • What is Inverse Document Frequency (IDF)?
  • What is TF-IDF Weighting?
  • Why do we need to penalize long documents in text retrieval?
  • What is pivoted document length normalization?
  • What are the main ideas behind the retrieval function BM25?
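
While working through the last few questions, it can help to experiment with the term-frequency component of BM25 on its own. The sketch below is a hedged illustration rather than a definitive implementation. The function names and the default values k = 1.2 and b = 0.75 are assumptions (they are commonly used settings, not values taken from this page).

```python
def length_normalizer(doc_len, avg_len, b=0.75):
    """Pivoted normalizer: 1 at average length, above 1 for longer documents."""
    return 1 - b + b * doc_len / avg_len


def bm25_tf(count, doc_len, avg_len, k=1.2, b=0.75):
    """BM25 TF transformation: grows with count but never reaches k + 1."""
    norm = length_normalizer(doc_len, avg_len, b)
    return (k + 1) * count / (count + k * norm)


if __name__ == "__main__":
    # Repeating a term gives diminishing returns (average-length document).
    for c in (1, 2, 5, 50):
        print(c, round(bm25_tf(c, doc_len=100, avg_len=100), 3))
    # The same count is worth less in a document twice the average length.
    print(round(bm25_tf(3, doc_len=200, avg_len=100), 3))
```

The two behaviors to look for are saturation (the transformed value is bounded above by k + 1 no matter how often a term repeats) and the penalty applied to documents longer than average.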

Readings and Resources

The following readings are optional:

  • N. J. Belkin and W. B. Croft. "Information filtering and information retrieval: Two sides of the same coin?" Commun. ACM 35, 12 (Dec. 1992): 29-38.
  • A. Singhal, C. Buckley, and M. Mitra. "Pivoted document length normalization." In Proceedings of ACM SIGIR 1996.

Video Lectures

| Video Lecture | Duration | Video Download Size |
|---|---|---|
| 1.1 Natural Language Processing | 00:21:05 | 35.5 MB |
| 1.2 Text Access | 00:09:24 | 12.8 MB |
| 1.3 Text Retrieval Problem | 00:26:18 | 36.7 MB |
| 1.4 Overview of Text Retrieval Methods | 00:10:10 | 13.7 MB |
| 1.5 Vector Space Model: Basic Idea | 00:09:44 | 13.0 MB |
| 1.6 Vector Space Model: Instantiation | 00:17:30 | 23.1 MB |
| 1.7 Vector Space Model: Improved Instantiation | 00:16:52 | 22.1 MB |
| 1.8 TF Transformation | 00:18:56 | 12.7 MB |
| 1.9 Doc Length Normalization | 00:18:56 | 25.6 MB |

Tips for Success

To do well this week, I recommend that you do the following:

  • Review the video lectures a number of times to gain a solid understanding of the key questions and concepts introduced this week.
  • When possible, provide tips and suggestions to your peers in this class. As a learning community, we can help each other learn and grow. One way of doing this is by helping to address the questions that your peers pose. By engaging with each other, we’ll all learn better.
  • It’s always a good idea to refer to the video lectures and reference them in your responses. When appropriate, critique the information presented.
  • Take notes while you watch the lectures for this week. By taking notes, you are interacting with the material and will find that it is easier to remember and to understand. With your notes, you’ll also find that it’s easier to complete your assignments. So, go ahead, do yourself a favor; take some notes!

Getting and Giving Help

You can get/give help via the following means:

  • Use the Learner Help Center to find information regarding specific technical problems, such as error messages, difficulty submitting assignments, or problems with video playback. You can access the Help Center by clicking on the Help Center link at the top right of any course page. If you cannot find an answer in the documentation, you can also report your problem to the Coursera staff by clicking on the Contact Us! link available on each topic's page within the Learner Help Center.
  • Use the Content Issues forum to report errors in lecture video content, assignment questions and answers, assignment grading, text and links on course pages, or the content of other course materials. University of Illinois staff and Community TAs will monitor this forum and respond to issues.

As a reminder, the instructor is not able to answer emails sent directly to his account. Rather, all questions should be reported as described above.

from: https://class.coursera.org/textretrieval-001/wiki/Week1Overview
