How TermFinder calculates P-values

Readme: MGI GO Term Finder

The GoTermFinder attempts to determine whether an observed level of annotation for a group of genes is significant within the context of annotation for all genes within the genome.  Suppose that we have a total population of N genes, in which M have a particular annotation.   If we observe x genes with that annotation, in a sample of n genes, then we can calculate the probability of that observation, using the hypergeometric distribution (eg, see http://mathworld.wolfram.com/HypergeometricDistribution.html ) as:

P-value is the probability or chance of seeing at least x number of genes out of the total n genes in the list annotated to a particular GO term, given the proportion of genes in the whole genome that are annotated to that GO Term. That is, the GO terms shared by the genes in the user's list are compared to the background distribution of annotation. The closer the p-value is to zero, the more significant the particular GO term associated with the group of genes is (i.e. the less likely the observed annotation of the particular GO term to a group of genes occurs by chance).

In other words, when searching the process ontology, if all of the genes in a group were associated with "DNA repair", this term would be significant. However, since all genes in the genome (with GO annotations) are indirectly associated with the top level term "biological_process", this would not be significant if all the genes in a group were associated with this very high level term.

Multiple comparisons problem

Gene Ontology analysis in multiple gene clusters under multiple hypothesis testing frameworks.

Hypergeometric distribution的更多相关文章

  1. Study notes for Discrete Probability Distribution

    The Basics of Probability Probability measures the amount of uncertainty of an event: a fact whose o ...

  2. 常见的概率分布类型(二)(Probability Distribution II)

    以下是几种常见的离散型概率分布和连续型概率分布类型: 伯努利分布(Bernoulli Distribution):常称为0-1分布,即它的随机变量只取值0或者1. 伯努利试验是单次随机试验,只有&qu ...

  3. NLP&数据挖掘基础知识

    Basis(基础): SSE(Sum of Squared Error, 平方误差和) SAE(Sum of Absolute Error, 绝对误差和) SRE(Sum of Relative Er ...

  4. 《量化投资:以MATLAB为工具》连载(2)基础篇-N分钟学会MATLAB(中)

    http://www.matlabsky.com/thread-43937-1-1.html   <量化投资:以MATLAB为工具>连载(3)基础篇-N分钟学会MATLAB(下)     ...

  5. 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Final

    Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...

  6. 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Midterm

    Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...

  7. 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Section 2 Random sampling with and without replacement

    Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...

  8. 加州大学伯克利分校Stat2.3x Inference 统计推断学习笔记: Section 2 Testing Statistical Hypotheses

    Stat2.3x Inference(统计推断)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...

  9. R代码展示各种统计学分布 | 生物信息学举例

    二项分布 | Binomial distribution 泊松分布 | Poisson Distribution 正态分布 | Normal Distribution | Gaussian distr ...

随机推荐

  1. 图片上传插件:webuploader

    官网链接:https://github.com/fex-team/webuploader

  2. volatile的陷阱

         对于volatile关键字,大部分C语言的教程都是一笔带过,并没有做太深入的分析,所以这里简单的整理了一些 关于volatile的使用注意事项.实际上从语法上来看volatile和const ...

  3. 【python40--类和对象:一些相关的BIF】

    0.如何判断一个类是否为另外一个类的子类 --使用issubclass(class,classinfo)函数,如果第一个函数(class)是第二个参数(classinfo)的一个子类,则返回Ture, ...

  4. python --- 11 第一类对象 函数名 闭包 迭代器

    一 .函数名的运用    ①函数名是⼀个变量, 但它是⼀个特殊的变量, 与括号配合可以执⾏函数的变量 ②函数名是一个内存地址    ③ 函数名可以赋值给其他变量         ④函数名可以当做容器类 ...

  5. 内置函数之sorted,filter,map

    # 4,用map来处理字符串列表,把列表中所有人都变成sb,比方alex_sb # name=['oldboy','alex','wusir'] # print(list(map(lambda i:i ...

  6. mysql的数据类型- 特别是表示日期/时间的数据类型: 参考: http://www.cnblogs.com/bukudekong/archive/2011/06/27/2091590.html

    通常认为: 日期 就是 年-月-日: 时间就是: 小时:分钟:秒 要严格区分"日期"和 "时间"的 说法. 日期就是日期, 时间就是时间, 两者是不同的!! 日 ...

  7. 马虎的算式|2013年蓝桥杯B组题解析第二题-fishers

    小明是个急性子,上小学的时候经常把老师写在黑板上的题目抄错了. 有一次,老师出的题目是:36 x 495 = ? 他却给抄成了:396 x 45 = ? 但结果却很戏剧性,他的答案竟然是对的!! 假设 ...

  8. 【做题】agc002D - Stamp Rally——整体二分的技巧

    题意:给出一个无向连通图,有\(n\)个顶点,\(m\)条边.有\(q\)次询问,每次给出\(x,y,z\),最小化从\(x\)和\(y\)开始,总计访问\(z\)个顶点(一个顶点只计算一次),经过的 ...

  9. 安装ubuntu的坑&RHEL7配置

    1.需要其他设置->分区,分区需要有/根目录分区和swap空间,后者文件系统类型选择swap,其他都是ext4 2.普通配置电脑,安装16.04.5 LTS,不要安装最新的,安装重启后卡在那里, ...

  10. Python之Requests的安装与基本使用

    # 安装 使用 pip 安装Requests非常简单 pip install requests 或者使用 easy_install 安装 easy_install requests # 获得源码 Re ...