[IR] Evaluation

Evaluation of unranked retrieval results:

Precision P = tp / (tp + fp)
Recall R = tp / (tp + fn)

Accuracy = (tp + tn) / (tp + fp + fn + tn)
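
As a minimal sketch, the three formulas above can be computed directly from the four contingency counts. The function name and the counts in the usage lines are hypothetical, chosen only for illustration; tp/fp/fn/tn are read in the usual way (relevant retrieved, nonrelevant retrieved, relevant missed, nonrelevant not retrieved).

```python
def unranked_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from contingency counts."""
    # Guard against empty denominators (nothing retrieved / nothing relevant).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


# Hypothetical counts for one query over a 100-document collection.
p, r, a = unranked_metrics(tp=8, fp=10, fn=12, tn=70)
print(f"P={p:.3f} R={r:.3f} Acc={a:.3f}")
```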

Evaluation of ranked retrieval results:

A precision-recall curve

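Tuning a search engine this way only reflects its performance on a single query.

You need to average performance over a whole bunch of queries.

In effect, the trend is plotted under the principle that precision can only fall (or stay flat) as recall rises. This is interpolation (Interpolated Precision): the interpolated precision at recall level r is the highest precision observed at any recall level r' >= r.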

What is the interpolated precision of the system at 25% recall?

1.0, 0.67, 0.5, 0.4, 0.36, 0.36, 0.36
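
A small sketch of interpolated precision, assuming the standard definition (the highest precision at any recall level at or above the requested one). The ranking below is a hypothetical 10-document result list with 4 relevant documents in total, not the ranking behind the numbers above.

```python
def pr_points(relevance, total_relevant):
    """Return (recall, precision) after each rank of a ranked list.

    relevance is a list of 0/1 flags in rank order.
    """
    points = []
    hits = 0
    for k, rel in enumerate(relevance, start=1):
        hits += rel
        points.append((hits / total_relevant, hits / k))
    return points


def interpolated_precision(points, recall_level):
    """Highest precision at any recall >= recall_level (0.0 if unreachable)."""
    candidates = [p for r, p in points if r >= recall_level]
    return max(candidates) if candidates else 0.0


# Hypothetical ranking: 1 = relevant, 0 = nonrelevant.
ranking = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]
points = pr_points(ranking, total_relevant=4)
for level in (0.25, 0.5, 0.75, 1.0):
    print(level, round(interpolated_precision(points, level), 3))
```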

Mean average precision (MAP)

System: D1, D2, D4, D3

k = 1, R, 1/1

k = 2, NR, n/a

k = 3, NR, n/a

k = 4, R, 2/4

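MAP = (1/1 + 2/4)/2 = 3/4 (for a single query this is its average precision; MAP is the mean of this value over all queries)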
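
The calculation above can be sketched as follows. The function names are hypothetical; the relevance flags encode the ranking D1, D2, D4, D3 as R, NR, NR, R. By default the sum is divided by the number of relevant documents found in the list, which matches the worked example.

```python
def average_precision(relevance, total_relevant=None):
    """Average of precision@k taken at each rank k that holds a relevant doc.

    relevance: list of 0/1 flags in rank order.
    total_relevant: number of relevant docs for the query; defaults to the
    number found in the list.
    """
    hits = 0
    precisions = []
    for k, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    denom = hits if total_relevant is None else total_relevant
    return sum(precisions) / denom if denom else 0.0


def mean_average_precision(rankings):
    """Mean of average precision over several queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


print(average_precision([1, 0, 0, 1]))  # (1/1 + 2/4) / 2
```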

What is the largest possible mean average precision that this system could have?

This happens when the last two relevant documents appear as early as possible, at ranks 21 and 22.

MAP = (1.0+1.0+0.33+0.36+0.33+0.3+0.33+0.36)/8 = 0.503

What is the smallest possible mean average precision that this system could have?

This happens when the last two relevant documents appear as late as possible, at ranks 9999 and 10000.

MAP = (1.0+1.0+0.33+0.36+0.33+0.3+0.0007+0.0008)/8 = 0.416

How large is the error if the MAP computed so far is used to estimate the eventual MAP?

MAP = (1.0 + 1.0 + 0.33 + 0.36 + 0.33 + 0.3)/6 = 0.555

The error could be 0.555 - (0.503 + 0.416)/2 = 0.095
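
A short sketch to check the arithmetic above. It assumes the six relevant documents seen so far sit at ranks 1, 2, 9, 11, 15 and 20; those ranks are inferred from the precision values 1/1, 2/2, 3/9, 4/11, 5/15 and 6/20 and are not stated explicitly in these notes. Both bounds divide by all 8 relevant documents, while the observed value divides by the 6 found so far.

```python
def ap_from_ranks(relevant_ranks, total_relevant):
    """Average precision given the ranks at which relevant docs appear."""
    ranks = sorted(relevant_ranks)
    # The i-th relevant document at rank r contributes precision i / r.
    return sum(i / r for i, r in enumerate(ranks, start=1)) / total_relevant


seen = [1, 2, 9, 11, 15, 20]  # assumed ranks, inferred from the precisions

best = ap_from_ranks(seen + [21, 22], total_relevant=8)
worst = ap_from_ranks(seen + [9999, 10000], total_relevant=8)
observed = ap_from_ranks(seen, total_relevant=6)
error = observed - (best + worst) / 2

print(f"best={best:.3f} worst={worst:.3f} "
      f"observed={observed:.3f} error={error:.3f}")
```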

Kappa Measure

P(A) = observed agreement, i.e. the proportion of items on which the two judges agree (computed like accuracy)

P(E) = [ (person1-yes + person2-yes)/(total*2) ]^2 + [ (person1-no + person2-no)/(total*2) ]^2

Kappa = [ P(A) - P(E) ] / [ 1 - P(E) ]

Kappa > 0.8 // good agreement

0.67 < Kappa < 0.8 // "tentative conclusions" (Carletta '96)
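
A minimal sketch of the kappa computation for two judges making yes/no relevance judgments, using pooled marginals for the chance agreement P(E) as in the formula above. The function name and the counts in the usage line are illustrative assumptions, not data from these notes.

```python
def kappa(yes_yes, yes_no, no_yes, no_no):
    """Kappa for two judges; yes_no means judge 1 said yes, judge 2 said no."""
    total = yes_yes + yes_no + no_yes + no_no
    # P(A): proportion of items on which the judges agree.
    p_agree = (yes_yes + no_no) / total
    # Pooled marginals: both judges' yes (or no) votes over 2 * total.
    p_yes = ((yes_yes + yes_no) + (yes_yes + no_yes)) / (2 * total)
    p_no = ((no_yes + no_no) + (yes_no + no_no)) / (2 * total)
    # P(E): agreement expected by chance.
    p_chance = p_yes ** 2 + p_no ** 2
    return (p_agree - p_chance) / (1 - p_chance)


# Illustrative counts: 400 judged documents.
print(round(kappa(yes_yes=300, yes_no=20, no_yes=10, no_no=70), 3))
```

With these counts the result falls between 0.67 and 0.8, so it would only support tentative conclusions under the thresholds above.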

Relevance feedback: somewhat reminiscent of reinforcement learning.
