https://datascience.stackexchange.com/questions/15989/micro-average-vs-macro-average-performance-in-a-multiclass-classification-settin/16001

Micro- and macro-averages (for whatever metric) will compute slightly different things, and thus their interpretation differs.

A macro-average will compute the metric independently for each class and then take the average (hence treating all classes equally), whereas a micro-average will aggregate the contributions of all classes to compute the average metric. In a multi-class classification setup, micro-average is preferable if you suspect there might be class imbalance (i.e you may have many more examples of one class than of other classes).

To illustrate why, take for example precision Pr=TP(TP+FP)Pr=TP(TP+FP). Let's imagine you have a One-vs-All(there is only one correct class output per example) multi-class classification system with four classes and the following numbers when tested:

  • Class A: 1 TP and 1 FP
  • Class B: 10 TP and 90 FP
  • Class C: 1 TP and 1 FP
  • Class D: 1 TP and 1 FP

You can see easily that PrA=PrC=PrD=0.5PrA=PrC=PrD=0.5, whereas PrB=0.1PrB=0.1.

  • A macro-average will then compute: Pr=0.5+0.1+0.5+0.54=0.4Pr=0.5+0.1+0.5+0.54=0.4
  • A micro-average will compute: Pr=1+10+1+12+100+2+2=0.123Pr=1+10+1+12+100+2+2=0.123

These are quite different values for precision. Intuitively, in the macro-average the "good" precision (0.5) of classes A, C and D is contributing to maintain a "decent" overall precision (0.4). While this is technically true (across classes, the average precision is 0.4), it is a bit misleading, since a large number of examples are not properly classified. These examples predominantly correspond to class B, so they only contribute 1/4 towards the average in spite of constituting 94.3% of your test data. The micro-average will adequately capture this class imbalance, and bring the overall precision average down to 0.123 (more in line with the precision of the dominating class B (0.1)).

Micro- and macro-averages的更多相关文章

  1. F1 score,micro F1score,macro F1score 的定义

    F1 score,micro F1score,macro F1score 的定义 2018年09月28日 19:30:08 wanglei_1996 阅读数 976   本篇博客可能会继续更新 最近在 ...

  2. 机器学习--Micro Average,Macro Average, Weighted Average

    根据前面几篇文章我们可以知道,当我们为模型泛化性能选择评估指标时,要根据问题本身以及数据集等因素来做选择.本篇博客主要是解释Micro Average,Macro Average,Weighted A ...

  3. Micro和Macro性能学习【转载】

    转自:https://datascience.stackexchange.com/questions/15989/micro-average-vs-macro-average-performance- ...

  4. 多分类评测标准(micro 和 macro)

  5. (转)Illustrated: Efficient Neural Architecture Search ---Guide on macro and micro search strategies in ENAS

    Illustrated: Efficient Neural Architecture Search --- Guide on macro and micro search strategies in  ...

  6. Java资源大全中文版(Awesome最新版)

    Awesome系列的Java资源整理.awesome-java 就是akullpp发起维护的Java资源列表,内容包括:构建工具.数据库.框架.模板.安全.代码分析.日志.第三方库.书籍.Java 站 ...

  7. Java Bloom filter几种实现比较

    英文原始出处: Bloom filter for Scala, the fastest for JVM 本文介绍的是用Scala实现的Bloom filter. 源代码在github上.依照性能测试结 ...

  8. 【Code Tools】Java微基准测试工具JMH之入门篇

    一.JMH是什么 JMH是一个Java工具,用于构建.运行和分析用Java和其他语言编写的以JVM为目标的 nano/micro/milli/macro 基准测试. 二.基本注意事项 1)运行JMH基 ...

  9. [转]awsome-java

    原文链接 Awesome Java A curated list of awesome Java frameworks, libraries and software. Contents Projec ...

  10. 多分类评价指标python代码

    from sklearn.metrics import precision_score,recall_score print (precision_score(y_true, y_scores,ave ...

随机推荐

  1. Python爬虫原理

    前言 简单来说互联网是由一个个站点和网络设备组成的大网,我们通过浏览器访问站点,站点把HTML.JS.CSS代码返回给浏览器,这些代码经过浏览器解析.渲染,将丰富多彩的网页呈现我们眼前: 一.爬虫是什 ...

  2. mfscli的使用方法(解决mfscgi响应慢的问题)

    在moosefs中,mfscgi是一个python写的server程序,其中的数据是调用同样的python工具mfscli实现的. 每当用浏览器打开mfscgi的时候,它要把所有的表数据请求一遍,非常 ...

  3. linux常用文本编缉命令(strings/sed/awk/cut)

    一.strings strings--读出文件中的所有字符串 二.sed--文本编缉 类型 命令 命令说明 字符串替换 sed -i 's/str_reg/str_rep/' filename 将文件 ...

  4. Java 9中新的货币API

    译文出处: Java译站   原文出处:Michael Scharhag JSR 354定义了一套新的Java货币API,计划会在Java 9中正式引入.本文中我们将来看一下它的参考实现:JavaMo ...

  5. 转 举例说明使用MATLAB Coder从MATLAB生成C/C++代码步骤

    MATLAB Coder可以从MATLAB代码生成独立的.可读性强.可移植的C/C++代码. http://www.mathworks.cn/products/matlab-coder/ 使用MATL ...

  6. day06_python_1124

    01 昨日内容回顾 字典: 增: setdefault() 有责不变,无责添加 dic['key'] = vaulue 删: pop 按照key pop('key') pop('key',None) ...

  7. 解决了好几个软件的构建问题,在解决部署问题,bluemix部署

    https://www.puteulanus.com/archives/838#comment-961新版 Bluemix 一键搭建 SS 脚本 https://blog.feixueacg.com/ ...

  8. IEnumerable<T> list注意事项

    方法返回的时候 要设置用list会比较稳妥. 遇到的问题: private IDbConnection GetConnection(){var dataSettingsManager = new Da ...

  9. 7.6 C++基本序列式容器效率比较

    参考:http://www.weixueyuan.net/view/6403.html 总结: 对于vector而言,它只是一个可以伸缩长度的数组 对于deque而言,它是一个可以操作头部和尾部的并且 ...

  10. 使用MyEclipse开发Java EE应用:用XDoclet创建EJB 2 Session Bean项目(二)

    [MyEclipse最新版下载] 二.创建一个Session EJB – Part 1 MyEclipse中的EJB 2.x开发使用了EJB向导和集成XDoclet支持的组合. 每个EJB由三个基本部 ...