Warning message: In ks.test(x, y) : p-value will be approximate in the presence of ties   The warning messages are due to the implementation of the KS test in R, which expects a continuous distribution and thus there should not be any identical value…
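Finishing Week 3 of R Programming. This week's assignment is a bit convoluted: it mostly uses a case study of caching a matrix inverse to demonstrate the power of lexical scoping. But the functions given in the assignment are rather hard to follow and took us quite some effort. Lexical scoping: the values of free variables are searched for in the environment in which the function was defined. Therefore make.power<-function(n){ pow<…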
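A minimal sketch of lexical scoping with a closure. The names `make_adder`, `add2`, and `add10` are hypothetical and chosen for illustration; this is not the truncated `make.power` code from the excerpt.

```r
# make_adder() returns a function that remembers the n it was created with.
make_adder <- function(n) {
  # n is a free variable inside the inner function; R looks it up in the
  # environment where the inner function was defined (this call's frame).
  function(x) x + n
}

add2  <- make_adder(2)
add10 <- make_adder(10)

add2(5)                       # 7
add10(5)                      # 15

# The defining environment of the closure is where n lives.
ls(environment(add2))         # "n"
get("n", environment(add2))   # 2
```

Each call to `make_adder` creates a fresh environment, which is why `add2` and `add10` do not interfere with each other.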
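Blog index, recording everything about learning R and data analysis: http://www.cnblogs.com/weibaar/p/4507801.html  --- It has been a long time since my last post, so allow me to shout: I have finally finished this week's R Programming assignment! I had signed up for Coursera courses before but never stuck with them. This time I received their promotional email saying that the data science series had launched a Chinese version of the R course, with a Chinese-language forum and Chinese subtitles. With that much good faith on offer, not signing up would really be falling behind. To make myself stick with it, I also paid for the signature track, so when this Friday…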
This is the original post: http://www.reddit.com/r/programming/comments/358tnp/five_programming_problems_every_software_engineer/ Later, people discovered that the author's own solution was wrong, and some joked that he would not pass his own interview. http://www.reddit.com/r/programming/comments/35cr6n/real_programmers_can_do_these_problems_easil…
library(datasets) head(airquality) #group by month s <- split(airquality, airquality$Month) str(s) summary(s) lapply(s,function(x) colMeans(x[,c("Ozone","Solar.R","Wind")],na.rm = TRUE)) sapply(s,function(x) colMeans(x[,c("Ozone"…
#Generating normal distribution (Pseudo) random number x<-rnorm(10) x x2<-rnorm(10,2,1) x2 set.seed(1) #set.seed() requires a seed; 1 is arbitrary #Generating Poisson data rpois(10,1) rpois(10,2) rpois(10,20) ppois(2,2) #Cumulative distribution ##Pr(x <= 2) with a mean rate of 2 ppois(4,2) #Cumulative d…
Introduction For this first programming assignment you will write three functions that are meant to interact with the dataset that accompanies this assignment. The dataset is contained in a zip file specdata.zip that you can download from the Coursera we…
Something’s Wrong! Indications that something’s not right: message: A generic notification/diagnostic message produced by the message function; execution of the function continues. warning: An indication that something is wrong but not necessarily fatal…
Looping on the Command Line Writing for, while loops is useful when programming but not particularly easy when working interactively on the command line. There are some functions which implement looping to make life easier lapply: Loop over a list an…
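A short sketch of the first of these loop functions. The list `x` is made-up illustration data; `sapply` is included only to show the simplified form of the same result.

```r
# A small hypothetical list to loop over.
x <- list(a = 1:5, b = c(2, 4, 6))

# lapply always returns a list, one element per input element.
res_list <- lapply(x, mean)
str(res_list)

# sapply runs the same loop but simplifies the result where it can,
# here to a named numeric vector.
res_vec <- sapply(x, mean)
res_vec
```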
A Diversion on Binding Values to Symbols When R tries to bind a value to a symbol, it searches through a series of environments to find the appropriate value. When you are working on the command line and need to retrieve the value of an R object, the ord…
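A minimal sketch for inspecting that series of environments in a running session. The exact contents of the search list depend on which packages are attached, so only the two ends are relied on here.

```r
# The search list: the global environment comes first, base comes last.
path <- search()
path[1]               # ".GlobalEnv"
path[length(path)]    # "package:base"

# A symbol that is not defined in the global environment is found further
# down the list; mean() resolves to the function that lives in base.
environmentName(environment(mean))   # "base"
```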
Control Structures Control structures in R allow you to control the flow of execution of the program, depending on runtime conditions. Common structures are: if, else: testing a condition for: execute a loop a fixed number of times while: execute a l…
Subsetting There are a number of operators that can be used to extract subsets of R objects. [ always returns an object of the same class as the original; can be used to select more than one element (there is one exception) [[ is used to extract elem…
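A small sketch of the difference between `[` and `[[` on a list. The list `x` and its element names are hypothetical illustration data.

```r
x <- list(foo = 1:4, bar = 0.6)

# Single bracket: the result has the same class as the original (a list).
class(x["foo"])       # "list"

# Double bracket: extracts the element itself.
class(x[["foo"]])     # "integer"

# $ extracts by name, like [[ with a literal name.
x$bar                 # 0.6

# Single bracket can select more than one element at once.
names(x[c("foo", "bar")])
```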
Reading Data There are a few principal functions for reading data into R. read.table, read.csv, for reading tabular data readLines, for reading lines of a text file source, for reading in R code files (inverse of dump) dget, for reading in R code files (…
Objects R has five basic or “atomic” classes of objects: character, numeric (real numbers), integer, complex, and logical (TRUE/FALSE). The most basic object is a vector. A vector can only contain objects of the same class. BUT: the one exception is a list, whi…
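The code is wrapped in the function PlotKS_N. Pred_Var is the prediction, given either as a score or as a probability; labels_Var is the good/bad label, taking the value 1 or 0, where 1 marks a bad customer and 0 a good customer; descending controls sorting the data in descending order of default probability: if Pred_Var is a score, set descending=0, and if Pred_Var is a probability, set descending=1; N means that, after the data are sorted in descending order of risk, they are split into N equal parts before the KS value is computed. PlotKS_N returns a list,…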
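Normality Tests in R 1. Kolmogorov–Smirnov test In statistics, the Kolmogorov–Smirnov test (also called the K–S test) is a nonparametric test of whether data follow a given distribution. It compares an empirical (cumulative) frequency distribution f(x) with a theoretical distribution g(x), or the distributions of two observed samples, to decide whether the test hypothesis holds. The null hypothesis H0 is that the two samples share the same distribution, or that the data follow the theoretical distribution. The test statistic is D = max|f(x) - g(x)|; H0 is rejected when the observed D > D(n,α) and is not rejected otherwise. Because the KS test does not require knowing how the data are distributed, in small-sample statistical…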
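A minimal sketch of the statistic D for the one-sample case, computed by hand and compared with `ks.test`. The sample is simulated, and the seed and sample size are arbitrary assumptions; continuous simulated data are used so that no ties occur.

```r
set.seed(1)                 # arbitrary seed, for reproducibility only
x <- rnorm(30)              # hypothetical sample to test against N(0, 1)
n <- length(x)

# Theoretical CDF evaluated at the ordered observations.
Fx <- pnorm(sort(x))

# The empirical CDF jumps at each observation, so the largest gap is
# checked just after (i/n) and just before ((i-1)/n) each jump.
i <- seq_len(n)
D_manual <- max(pmax(i / n - Fx, Fx - (i - 1) / n))

# ks.test reports the same statistic together with a p-value.
res <- ks.test(x, "pnorm")
all.equal(unname(res$statistic), D_manual)
```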
# Chinese translations for R package # Copyright (C) 2005 The R Foundation # This file is distributed under the same license as the PACKAGE package. # 陈斐 <feic@normipaiva.com>, 2006. # 邓小冬 DENG Xiaodong <xd_deng@hotmail.com>, 2015. # msgid &qu…
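//Accepted 14796 KB 453 ms //partition tree //typed the number of queries m as n and lost a whole evening over it!!! //binary-search the k-th largest number in the interval l--r and compare it with h #include <cstdio> #include <cstring> #include <iostream> #include <queue> #include <cmath> #include <algorithm> using namespace std; /** * Thi…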
INTRODUCTION GPUs (Graphics Processing Units) have become much more popular in recent years for computationally intensive calculations.  Despite these gains, the use of this hardware has been very limited in the R programming language.  Although possi…
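Blog index, recording everything about learning R and data analysis: http://www.cnblogs.com/weibaar/p/4507801.html  ------- After half a day of effort over the weekend, I finally got Assignment 3 done, then finished Quiz 4 and successfully wrapped up the R Programming course. My overall gripe about the course: Professor Roger's GitHub avatar is handsome and he looks even better on video; the video content is reasonably thorough but nowhere near enough to handle the assignments. Each assignment is more cleverly designed and harder than the last, and without enough patience and a certain foundation…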
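There are many distributions in statistics, and nearly all of them are available in R. Given our limited ability, we pick a few common and important ones and briefly introduce each distribution's definition, formula, and presentation in R. Statistical distributions: each distribution has four functions: d for density (the density function), p for the distribution function, q for the quantile function, and r for random number generation. For example, the four functions for the normal distribution are dnorm, pnorm, qnorm, and rnorm. Below we list the suffix for each distribution; prefixing it with d, p, q, or r gives the function name: norm: normal, t: t distribution, f: F distribution, chisq: chi-squared (including noncentral) unif: uniform, exp: exponential, wei…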
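A short sketch of the four prefixes applied to the normal distribution. The evaluation points and the seed are arbitrary choices for illustration.

```r
# d: density of the standard normal at 0, which equals 1 / sqrt(2 * pi).
d0 <- dnorm(0)

# p: cumulative probability P(X <= 1.96).
p <- pnorm(1.96)

# q: quantile function, the inverse of p.
q <- qnorm(p)

# r: random draws; the seed only makes the draws repeatable.
set.seed(1)
draws <- rnorm(5, mean = 2, sd = 1)

c(density = d0, prob = p, quantile = q)
```

The same pattern carries over to the other suffixes, for example `dunif`, `punif`, `qunif`, and `runif` for the uniform distribution.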
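I. Parameter tests for the normal distribution Example 1. The lifetime X (in hours) of a certain component follows a normal distribution N(μ, σ²), where μ and σ² are both unknown. The measured lifetimes of 16 components are: 159 280 101 212 224 379 179 264                  222 362 168 250 149 260 485 170 Is there reason to believe that the mean component lifetime exceeds 225 hours? Solution: per the problem, we test H0: μ ≤ 225     H1: μ >  225 This is a one-sided test and can be done with t.test in R t.test(x,y=N…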
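A sketch of one way to run this one-sided test, using the 16 lifetimes listed above and the threshold of 225 hours from the hypotheses. It is not necessarily the exact call that the truncated excerpt goes on to show.

```r
# The 16 measured lifetimes (hours) from the example.
x <- c(159, 280, 101, 212, 224, 379, 179, 264,
       222, 362, 168, 250, 149, 260, 485, 170)

# One-sample, one-sided t-test of H0: mu <= 225 against H1: mu > 225.
res <- t.test(x, mu = 225, alternative = "greater")

res$statistic   # t statistic, about 0.67
res$parameter   # 15 degrees of freedom
res$p.value     # well above 0.05
```

With a p-value this large, H0 is not rejected at the 5% level, so the data give no grounds to conclude that the mean lifetime exceeds 225 hours.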
Introduction Deep learning is a recent trend in machine learning that models highly non-linear representations of data. In recent years, deep learning has gained tremendous momentum and prevalence for a variety of applications (Wikipedia 2016a).…
This is the third post about LifeCycle Grids. You can find the first post, on the idea behind LifeCycle Grids and an A-Z process for creating and visualizing them with the R programming language, here. Here is the second post about adding monetary metrics…
I want to share a very powerful approach for customer segmentation in this post. It is based on the customer’s lifecycle, specifically on frequency and recency of purchases. The idea of using these metrics comes from the RFM analysis. Recency and frequen…
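R Basics: The Art of R Programming 1. seq generates arithmetic sequences: seq(from,to,by) seq(from,to,length) With for(i in 1:length(x)), when x is NULL, i takes the values 1 and 0 in turn; for(i in seq(x)) avoids the error that arises when x is NULL. seq(x) produces the vector 1:length(x) (unless x has length one; seq_along(x) is safe for any x) 2. rep rep(x,n) repeats x as a whole n times rep(x,each=m) repeats each element of x m times in turn rep(x,y) repeats each element of x according to the corresponding…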
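A minimal sketch of the empty-vector pitfall and of the two basic forms of `rep`. The counters `n_bad` and `n_ok` are hypothetical names used only to count loop iterations; `seq_along` is shown as the variant that is safe for any input.

```r
x <- NULL

# 1:length(x) counts down from 1 to 0, so the loop body runs twice.
n_bad <- 0
for (i in 1:length(x)) n_bad <- n_bad + 1

# seq_along(x) is empty for an empty x, so the loop body never runs.
n_ok <- 0
for (i in seq_along(x)) n_ok <- n_ok + 1

c(n_bad = n_bad, n_ok = n_ok)

# rep: repeat the whole vector versus repeat each element.
whole <- rep(c(1, 2), 2)          # 1 2 1 2
each  <- rep(c(1, 2), each = 2)   # 1 1 2 2
```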
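Features of the R language: R is case sensitive. Normally digits, letters, . and _ are all allowed in names (in some locales accented letters too). However, a name must start with . or a letter, and if it starts with . the second character must not be a digit. Elementary commands are either expressions or assignments. Commands are separated by a semicolon (;) or by a newline. Elementary commands can be grouped with braces ({ and }) into a compound expression. On a line, everything from a hash mark (#) to the end of the line is a comment. R is a dynamically typed, strongly typed…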
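Reposted from: https://wenku.baidu.com/view/ccfa573a3968011ca30091d6.html https://www.cnblogs.com/arkenstone/p/5496761.html 1. Definition The Kolmogorov-Smirnov test compares an empirical (cumulative) frequency distribution f(x) with a theoretical distribution g(x), or the distributions of two observed samples. The null hypothesis H0 is that the two samples share the same distribution, or that the data follow the theoretical distribution. D = max|f(x) - g(x)|; H0 is rejected when the observed D > D(n,α) and is not rejected otherwise. The KS test…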
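R in a Nutshell Preface Examples (the nutshell package) The examples in this book are included in the nutshell R package; to use the data, load the nutshell package install.packages("nutshell") Part I: Basics Chapter 1 Batch Mode R provides a way to run a large set of commands in sequence and save the results to a file. One way to run R in batch mode is to use the system…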
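R and Data Mining: Formulas; Data; Methods Features of the R language: R is case sensitive. Normally digits, letters, . and _ are all allowed in names (in some locales accented letters too). However, a name must start with . or a letter, and if it starts with . the second character must not be a digit. Elementary commands are either expressions or assignments. Commands are separated by a semicolon (;) or by a newline. Elementary commands can be grouped with braces ({ and }) into a compound expression. On a line, everything from a hash mark (#) to the end of the line…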