Hypergeometric distribution
How TermFinder calculates P-values
The GoTermFinder attempts to determine whether an observed level of annotation for a group of genes is significant within the context of annotation for all genes within the genome. Suppose that we have a total population of N genes, in which M have a particular annotation. If we observe x genes with that annotation, in a sample of n genes, then we can calculate the probability of that observation, using the hypergeometric distribution (eg, see http://mathworld.wolfram.com/HypergeometricDistribution.html ) as:
P-value is the probability or chance of seeing at least x number of genes out of the total n genes in the list annotated to a particular GO term, given the proportion of genes in the whole genome that are annotated to that GO Term. That is, the GO terms shared by the genes in the user's list are compared to the background distribution of annotation. The closer the p-value is to zero, the more significant the particular GO term associated with the group of genes is (i.e. the less likely the observed annotation of the particular GO term to a group of genes occurs by chance).
In other words, when searching the process ontology, if all of the genes in a group were associated with "DNA repair", this term would be significant. However, since all genes in the genome (with GO annotations) are indirectly associated with the top level term "biological_process", this would not be significant if all the genes in a group were associated with this very high level term.
Gene Ontology analysis in multiple gene clusters under multiple hypothesis testing frameworks.
Hypergeometric distribution的更多相关文章
- Study notes for Discrete Probability Distribution
The Basics of Probability Probability measures the amount of uncertainty of an event: a fact whose o ...
- 常见的概率分布类型(二)(Probability Distribution II)
以下是几种常见的离散型概率分布和连续型概率分布类型: 伯努利分布(Bernoulli Distribution):常称为0-1分布,即它的随机变量只取值0或者1. 伯努利试验是单次随机试验,只有&qu ...
- NLP&数据挖掘基础知识
Basis(基础): SSE(Sum of Squared Error, 平方误差和) SAE(Sum of Absolute Error, 绝对误差和) SRE(Sum of Relative Er ...
- 《量化投资:以MATLAB为工具》连载(2)基础篇-N分钟学会MATLAB(中)
http://www.matlabsky.com/thread-43937-1-1.html <量化投资:以MATLAB为工具>连载(3)基础篇-N分钟学会MATLAB(下) ...
- 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Final
Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...
- 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Midterm
Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...
- 加州大学伯克利分校Stat2.2x Probability 概率初步学习笔记: Section 2 Random sampling with and without replacement
Stat2.2x Probability(概率)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...
- 加州大学伯克利分校Stat2.3x Inference 统计推断学习笔记: Section 2 Testing Statistical Hypotheses
Stat2.3x Inference(统计推断)课程由加州大学伯克利分校(University of California, Berkeley)于2014年在edX平台讲授. PDF笔记下载(Acad ...
- R代码展示各种统计学分布 | 生物信息学举例
二项分布 | Binomial distribution 泊松分布 | Poisson Distribution 正态分布 | Normal Distribution | Gaussian distr ...
随机推荐
- 同时import两个版本的QtQuick【1、2】,默认使用
在同一个qml文件中,如果同时import了Qtquick1和2,那么谁在后面,谁起作用
- linux如何使make输出makefile中所有的规则和变量
答: make -p (会执行makefile,加入-q可以阻止makefile的执行)
- tp框架中的一些疑点知识-1
tp默认的编码是utf-8 Runtime中的Cache和Logs都是分模块的,因为在应用app下可以有多个模块,但是 公共模块和Runtime模块只有一个, 所以, Runtime要包含各个模块的内 ...
- 转载:Systemd 命令
目录 一.由来 二.Systemd 概述 三.系统管理 3.1 systemctl 3.2 systemd-analyze 3.3 hostnamectl 3.4 localectl 3.5 time ...
- PKM(个人知识管理)类软件收集(偶尔更新列表)
evernote(印象笔记) Wiz 有道云 麦库 leanote GoogleKeep OneNote SimpleNote(wp家的,免费) pocket(稍后读的软件,同类的还有Instapap ...
- angular-cli 正确安装步骤
npm install -g node-gyp npm install --global windows-build-tools npm install -g angular-cli
- 4、Python文件对象及os、os.path和pickle模块(0530)
文件系统和文件 1.文件系统是OS用于明确磁盘或分区上的文件的方法和数据结构---即在磁盘上组织文件的方法: 文件系统模块:os 2.计算机文件(称文件.电脑档案.档案),是存储在某种长期储存设备或临 ...
- HDU 1403 Longest Common Substring(最长公共子串)
http://acm.hdu.edu.cn/showproblem.php?pid=1403 题意:给出两个字符串,求最长公共子串的长度. 思路: 刚开始学后缀数组,确实感觉很难,但是这东西很强大,所 ...
- [从零开始搭网站五]http网站Tomcat配置web.xml和server.xml
点击下面连接查看从零开始搭网站全系列 从零开始搭网站 上一章我们在CentOS下搭建了Tomcat,但是还是没有跑起来...那么这一章就把最后的配置给大家放上去. 有两种方式:一种是用 rm -f 给 ...
- _pvp_killed_loot
该表控制玩家被击杀时掉落物品,包括角色身上装备,背包物品,银行物品 comment 备注 entry 掉落的物品ID lootCount 掉落的物品数量 chance 掉落的几率,例如50,则50%几 ...