统计学_Wilcoxon signed-rank test(python脚本)
python机器学习-乳腺癌细胞挖掘(博主亲自录制视频)
https://study.163.com/course/introduction.htm?courseId=1005269003&utm_campaign=commission&utm_source=cp-400000000398149&utm_medium=share
机器学习,统计项目联系QQ:231469242
两个配对样本,均匀分布,非正太分布
Wilcoxon signed-rank test
曼-惠特尼U检验Mann–Whitney Test
两个独立样本,均匀分布,非正太分布
https://en.wikipedia.org/wiki/Frank_Wilcoxon
ranking method
这是a=0.05对应得样本数
critical value 对应样本数,20样本对应的关键值为52
positive sum:
差异为正值的排名和
negative sum:
差异为负值的排名和
如果negative sum小于52,表明两组数有显著差异,推翻原假设
http://blog.sina.com.cn/s/blog_4bcf5ebd0101422n.html
SPSS学习笔记之——两配对样本的非参数检验(Wilcoxon符号秩检验)
一、概述
非参数检验对于总体分布没有要求,因而使用范围更广泛。对于两配对样本的非参数检验,首选Wilcoxon符号秩检验。它与配对样本t检验相对应。
二、问题
为了研究某放松方法(如听音乐)对于入睡时间的影响,选择了10名志愿者,分别记录未进行放松时的入睡时间及放松后的入睡时间(单位为分钟),数据如下笔。请问该放松方法对入睡时间有无影响。
本例可以采用配对样本t检验,但由于样本量少,数据可能不符合正太分布,所以考虑用非参数检验。
三、统计操作
数据视图
菜单选择
打开如下的对话框
该对话框有三个选项卡,第一个选项卡会根据第三个选项卡的设置自动设置,故一般不用手动设定。点击进入“字段”选项卡。将“放松前”、“放松后”均选入右边“检验字段”框中。
点击进入“设置”对话框,选择检验方法,切换为“自定义检验”,选择“Wilcoxon匹配样本对符号秩(二样本)”复选框。“检验选项”可以设定显著性水平。
点击“运行”按钮,输出结果
四、结果解读
这就是输出结果。原假设示放松前好放松后差值的中位数等于0,P=0.015<0.05,拒绝原假设,认为放松前后有统计学差异。
双击该表格,会弹出如下的“模型浏览器”窗口,可以看到更详细的信息。如下图。
# -*- coding: utf-8 -
import scipy.stats as stats
data1=[21,12,12,23,19,13,20,17,14,19]
data2=[12,11,8,9,10,15,16,17,10,16]
stats.wilcoxon(data1,data2)
'''
Out[2]: WilcoxonResult(statistic=2.0, pvalue=0.01471359242280415)
p值小于0.05,两组数据有显著差异
'''
https://study.com/academy/lesson/non-parametric-inferential-statistics-definition-examples.html
Normal
What is normal? At least in the world of statistics, this has nothing to do with how someone dresses, acts, or what their beliefs are. Normal data comes from a population with a normal distribution. A normal distribution is a distribution that has a symmetrical bell-shaped curve to it, which you're probably well aware of.
Keep this concept in mind as we go over the major differences between parametric and non-parametric statistics.
Parametric Methods
First, let's define our terms really simply. When we talk about parameters in statistics, what are we actually hinting at? Parameters are descriptive measures of population data, such as the population mean or population standard deviation.
When the variable we are considering is approximately (or completely) normally distributed, or the sample size is large, we can use two inferential methods that are concerned with parameters - appropriately called parametric methods - when performing a hypothesis test for a population mean. For instance, if we find that the distribution of the average salary of a sample looks like a bell curve, then parametric methods may be used.
These two methods are probably ones you've heard of before. They are the z-test, which we'd use when the population standard deviation is known to us; or the t-test, which we'd use when the population standard deviation is not known to us.
Non-Parametric Methods
Inferential methods that are not concerned with parameters are known, easily enough, as non-parametric methods. However, this term is also more broadly used to refer to many methods that are applied without assuming normality. So, for instance, if we find that the distribution of the average salary of a sample looks like the histogram you see on the screen now [see video], which is nothing close to that of a bell curve, then we will have to turn to non-parametric methods.
Such non-parametric methods have their pros and cons. On the pro side, these methods are usually simpler to compute and are more resistant to extreme values when compared to parametric methods. On the con side, if the requirements for the use of a parametric method are actually met, non-parametric methods do not have as much power as the z-test or t-test.
By power, I simply mean the probability of avoiding a type II error, which is an error where we fail to reject the false null hypothesis.
Example of a Non-Parametric Method
One example of a non-parametric method is the Wilcoxon signed-rank test. This is a test that assumes the variable under consideration does not need a specific shape and doesn't have to be normally distributed, but is symmetric in its distribution nonetheless. This means that it can be sliced in half to produce two mirror images.
So, for example, a right-skewed or left-skewed distribution would not be appropriate for this test since it's not symmetric. But a normal, symmetric bimodal, triangular, or uniform distribution would be a fit for this test since any one of those can be sliced in half to produce two mirror images of one another.
Other non-parametric tests include the likes of:
- The Kruskal-Wallis test
- The Mann-Whitney U test
- The sign test
Lesson Summary
Normal data comes from a population with a normal distribution. A normal distribution is a distribution that has a symmetrical bell-shaped curve to it, which you're probably well aware of.
Inferential methods that are concerned with parameters are appropriately called parametric methods, and include the z-test and t-test. Parameters are descriptive measures of population data.
Inferential methods that are not concerned with parameters are known as non-parametric methods. This term is also more broadly used to refer to many methods that are applied without assuming normality.
While non-parametric methods are easier to compute than parametric ones, they do not have as much power as parametric methods if the requirements for the use of a parametric method are met. By power, I simply mean the probability of avoiding a type II error, which is an error where we fail to reject the false null hypothesis.
An example of a non-parametric method is the Wilcoxon signed-rank test. This is a test that assumes the variable under consideration does not need a specific shape and doesn't have to be normally distributed, but is symmetric in its distribution nonetheless.
https://study.163.com/provider/400000000398149/index.htm?share=2&shareId=400000000398149( 欢迎关注博主主页,学习python视频资源,还有大量免费python经典文章)
统计学_Wilcoxon signed-rank test(python脚本)的更多相关文章
- hive使用python脚本导致java.io.IOException: Broken pipe异常退出
反垃圾rd那边有一个hql,在执行过程中出现错误退出,报java.io.IOException: Broken pipe异常,hql中使用到了python脚本,hql和python脚本最近没有人改过, ...
- 如何手动写一个Python脚本自动爬取Bilibili小视频
如何手动写一个Python脚本自动爬取Bilibili小视频 国庆结束之余,某个不务正业的码农不好好干活,在B站瞎逛着,毕竟国庆嘛,还让不让人休息了诶-- 我身边的很多小伙伴们在朋友圈里面晒着出去游玩 ...
- Wilcoxon Signed Rank Test
1.Wilcoxon Signed Rank Test Wilcoxon有符号秩检验(也称为Wilcoxon有符号秩和检验)是一种非参数检验.当统计数据中使用“非参数”一词时,并不意味着您对总体一无所 ...
- 2.3 Hive的数据类型讲解及实际项目中如何使用python脚本对数据进行ETL
一.hive Data Types https://cwiki. apache. org/confluence/display/HiveLanguageManual+Types Numeric Typ ...
- freeswitch嵌入python脚本
操作系统:debian8.5_x64 freeswitch 版本 : 1.6.8 python版本:2.7.9 开启python模块 安装python lib库 apt-get install pyt ...
- python脚本后台运行
问题描述: 环境: CentOS6.4 一个用python写的监控脚本test1.py,用while True方式一直运行,在ssh远程(使用putty终端)时通过以下命令启动脚本: python t ...
- 某互联网后台自动化组合测试框架RF+Sikuli+Python脚本
某互联网后台自动化组合测试框架RF+Sikuli+Python脚本 http://www.jianshu.com/p/b3e204c8651a 字数949 阅读323 评论1 喜欢0 一.**Robo ...
- 动态执行python脚本
前言 存在许多独立的python脚本,这些脚本可能会增加,也可能会减少,现在需要按照某种顺序调度这些程序.在python的standard library中,有一个模块imp可以实现动态的调用ptho ...
- 一个获取指定目录下一定格式的文件名称和文件修改时间并保存为文件的python脚本
摘自:http://blog.csdn.net/forandever/article/details/5711319 一个获取指定目录下一定格式的文件名称和文件修改时间并保存为文件的python脚本 ...
- SecureCRT中python脚本编写
SecureCRT中python脚本编写学习指南 SecureCRT python 引言 在测试网络设备中,通常使用脚本对设备端进行配置和测试以及维护:对于PE设备的测试维护人员来说使用较多是Secu ...
随机推荐
- TensorFlow入门——MNIST深入
#load MNIST data import tensorflow.examples.tutorials.mnist.input_data as input_data mnist = input_d ...
- 多线程编程-- part 9 信号量:Semaphore
Semaphore简介 Semaphore是一个计数信号量,它的本质是一个"共享锁". 信号量维护了一个信号量许可集.线程可以通过调用acquire()来获取信号量的许可:当信号量 ...
- 在iPhone开发中实现解压缩gzip
在iPhone开发中实现解压缩gzip是本文要介绍的内容,最近做的一个东西中,需要从网络获取xml文件,但是该文件用了gzip压缩的.搜索一 下有人说gzip压缩的用urlrequest可以自己解压, ...
- scrapy-redis 实现分布式爬虫
分布式爬虫 一 介绍 原来scrapy的Scheduler维护的是本机的任务队列(存放Request对象及其回调函数等信息)+本机的去重队列(存放访问过的url地址) 所以实现分布式爬取的关键就是,找 ...
- Hbase1.4.9的安装
HBase介绍 HBase – Hadoop Database,是一个高可靠性.高性能.面向列.可伸缩的分布式存储系统,利用HBase技术可在廉价PC Server上搭建起大规模结构化存储集群. HB ...
- fiddler获取响应时间以及服务器IP
抓包工具fiddler实现http协议请求应答抓包.在接口测试.性能测试.安全测试等软件测试活动过程中,可能会遇到需要获取接口响应时间.接口服务器IP这样的情况.默认情况下fiddler不支持接口响应 ...
- java8学习之Collector复合与注意事项
接着上一次[http://www.cnblogs.com/webor2006/p/8318066.html]继续对Collector进行javadoc详读,上一次读到了这: 接下来一条条来过目一下: ...
- list实现栈以及队列操作
1.堆栈stack操作:尾进 尾出 或者叫先进后出 //1借助LinkedList 类中的方法实现栈 public class MyStack { private LinkedList<Obje ...
- HashMap,LinkedHashMap,TreeMap的有序性
HashMap 实际上是一个链表的数组.HashMap 的一个功能缺点是它的无序性,被存入到 HashMap 中的元素,在遍历 HashMap 时,其输出是无序的.如果希望元素保持输入的顺序,可以使用 ...
- 多对多第三张表的创建方式 和 forms组件的使用
目录 一.多对多第三张表的创建 1. 全自动方式 (1)实现代码 (2)优点和不足 2. 纯手撸方式(了解) (1)实现代码 (2)优点和不足 3. 半自动方式(推荐使用) (1)实现代码 (2)优点 ...