Global Statistics:

Common seen methods as such

1. Mean

2. Median

3. Standard deviation:  the larger the number means it various a lot.

4. Sum.

Rolling Statistics:

It use a time window, moving forward each day to calculate the mean value of those window periods.

To find which day is good to buy which day is good for sell, we can use Bollinger bands.

Bollinger bands:

import os
import pandas as pd
import matplotlib.pyplot as plt def test_run():
start_date='2017-01-01'
end_data='2017-12-15'
dates=pd.date_range(start_date, end_data) # Create an empty data frame
df=pd.DataFrame(index=dates) symbols=['SPY', 'AAPL', 'IBM', 'GOOG', 'GLD']
for symbol in symbols:
temp=getAdjCloseForSymbol(symbol)
df=df.join(temp, how='inner') return df if __name__ == '__main__':
df=test_run()
# data=data.ix['2017-12-01':'2017-12-15', ['IBM', 'GOOG']]
# df=normalize_data(df)
ax = df['SPY'].plot(title="SPY rolling mean", label='SPY')
rm = df['SPY'].rolling(20).mean()
rm.plot(label='Rolling mean', ax=ax)
ax.set_xlabel('Date')
ax.set_ylabel('Price')
ax.legend(loc="upper left")
plt.show()

Now we can calculate Bollinger bands, it is 2 times std value.

"""Bollinger Bands."""

import os
import pandas as pd
import matplotlib.pyplot as plt def symbol_to_path(symbol, base_dir="data"):
"""Return CSV file path given ticker symbol."""
return os.path.join(base_dir, "{}.csv".format(str(symbol))) def get_data(symbols, dates):
"""Read stock data (adjusted close) for given symbols from CSV files."""
df = pd.DataFrame(index=dates)
if 'SPY' not in symbols: # add SPY for reference, if absent
symbols.insert(0, 'SPY') for symbol in symbols:
df_temp = pd.read_csv(symbol_to_path(symbol), index_col='Date',
parse_dates=True, usecols=['Date', 'Adj Close'], na_values=['nan'])
df_temp = df_temp.rename(columns={'Adj Close': symbol})
df = df.join(df_temp)
if symbol == 'SPY': # drop dates SPY did not trade
df = df.dropna(subset=["SPY"]) return df def plot_data(df, title="Stock prices"):
"""Plot stock prices with a custom title and meaningful axis labels."""
ax = df.plot(title=title, fontsize=12)
ax.set_xlabel("Date")
ax.set_ylabel("Price")
plt.show() def get_rolling_mean(values, window):
"""Return rolling mean of given values, using specified window size."""
return values.rolling(window=window).mean() def get_rolling_std(values, window):
"""Return rolling standard deviation of given values, using specified window size."""
# TODO: Compute and return rolling standard deviation
return values.rolling(window=window).std() def get_bollinger_bands(rm, rstd):
"""Return upper and lower Bollinger Bands."""
# TODO: Compute upper_band and lower_band
upper_band = rstd * 2 + rm
lower_band = rm - rstd * 2
return upper_band, lower_band def test_run():
# Read data
dates = pd.date_range('2012-01-01', '2012-12-31')
symbols = ['SPY']
df = get_data(symbols, dates) # Compute Bollinger Bands
# 1. Compute rolling mean
rm_SPY = get_rolling_mean(df['SPY'], window=20) # 2. Compute rolling standard deviation
rstd_SPY = get_rolling_std(df['SPY'], window=20) # 3. Compute upper and lower bands
upper_band, lower_band = get_bollinger_bands(rm_SPY, rstd_SPY) # Plot raw SPY values, rolling mean and Bollinger Bands
ax = df['SPY'].plot(title="Bollinger Bands", label='SPY')
rm_SPY.plot(label='Rolling mean', ax=ax)
upper_band.plot(label='upper band', ax=ax)
lower_band.plot(label='lower band', ax=ax) # Add axis labels and legend
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.legend(loc='upper left')
plt.show() if __name__ == "__main__":
test_run()

Daily return:

Subtract the previous day's closing price from the most recent day's closing price. In this example, subtract $35.50 from $36.75 to get $1.25. Divide your Step 4 result by the previous day's closing price to calculate the daily return. Multiply this result by 100 to convert it to a percentage.

"""Compute daily returns."""

import os
import pandas as pd
import matplotlib.pyplot as plt def symbol_to_path(symbol, base_dir="data"):
"""Return CSV file path given ticker symbol."""
return os.path.join(base_dir, "{}.csv".format(str(symbol))) def get_data(symbols, dates):
"""Read stock data (adjusted close) for given symbols from CSV files."""
df = pd.DataFrame(index=dates)
if 'SPY' not in symbols: # add SPY for reference, if absent
symbols.insert(0, 'SPY') for symbol in symbols:
df_temp = pd.read_csv(symbol_to_path(symbol), index_col='Date',
parse_dates=True, usecols=['Date', 'Adj Close'], na_values=['nan'])
df_temp = df_temp.rename(columns={'Adj Close': symbol})
df = df.join(df_temp)
if symbol == 'SPY': # drop dates SPY did not trade
df = df.dropna(subset=["SPY"]) return df def plot_data(df, title="Stock prices", xlabel="Date", ylabel="Price"):
"""Plot stock prices with a custom title and meaningful axis labels."""
ax = df.plot(title=title, fontsize=12)
ax.set_xlabel(xlabel)
ax.set_ylabel(ylabel)
plt.show() def compute_daily_returns(df):
"""Compute and return the daily return values."""
# TODO: Your code here
# Note: Returned DataFrame must have the same number of rows
return df / df.shift(-1) -1 def test_run():
# Read data
dates = pd.date_range('2012-07-01', '2012-07-31') # one month only
symbols = ['SPY','XOM']
df = get_data(symbols, dates)
plot_data(df) # Compute daily returns
daily_returns = compute_daily_returns(df)
plot_data(daily_returns, title="Daily returns", ylabel="Daily returns") if __name__ == "__main__":
test_run()

Cumulative return:

an investment relative to the principal amount invested over a specified amount of time. ... To calculate cumulative return, subtract the original price of the investment from the current price and divide that difference by the original price.

[Python] Statistical analysis of time series的更多相关文章

  1. How-to: Do Statistical Analysis with Impala and R

    sklearn实战-乳腺癌细胞数据挖掘(博客主亲自录制视频教程) https://study.163.com/course/introduction.htm?courseId=1005269003&a ...

  2. python data analysis | python数据预处理(基于scikit-learn模块)

    原文:http://www.jianshu.com/p/94516a58314d Dataset transformations| 数据转换 Combining estimators|组合学习器 Fe ...

  3. python学习笔记—DataFrame和Series的排序

    更多大数据分析.建模等内容请关注公众号<bigdatamodeling> ################################### 排序 ################## ...

  4. Should You Build Your Own Backtester?

    By Michael Halls-Moore on August 2nd, 2016 This post relates to a talk I gave in April at QuantCon 2 ...

  5. Python数据分析工具:Pandas之Series

    Python数据分析工具:Pandas之Series Pandas概述Pandas是Python的一个数据分析包,该工具为解决数据分析任务而创建.Pandas纳入大量库和标准数据模型,提供高效的操作数 ...

  6. 用 Python 通过马尔可夫随机场(MRF)与 Ising Model 进行二值图降噪

    前言 这个降噪的模型来自 Christopher M. Bishop 的 Pattern Recognition And Machine Learning (就是神书 PRML……),问题是如何对一个 ...

  7. 大数据分析与机器学习领域Python兵器谱

    http://www.thebigdata.cn/JieJueFangAn/13317.html 曾经因为NLTK的缘故开始学习Python,之后渐渐成为我工作中的第一辅助脚本语言,虽然开发语言是C/ ...

  8. Machine and Deep Learning with Python

    Machine and Deep Learning with Python Education Tutorials and courses Supervised learning superstiti ...

  9. Python 网页爬虫 & 文本处理 & 科学计算 & 机器学习 & 数据挖掘兵器谱(转)

    原文:http://www.52nlp.cn/python-网页爬虫-文本处理-科学计算-机器学习-数据挖掘 曾经因为NLTK的缘故开始学习Python,之后渐渐成为我工作中的第一辅助脚本语言,虽然开 ...

随机推荐

  1. win10关闭更新

    计算机--管理: 找到windows update 服务关闭:

  2. CDR X6低价还能持续多久?官方回应18年元旦过后要涨价

    目前,CDR X6特价活动,从双十二的到18年的元旦,火热程度一直屡刷新高,究其原因,其实不是大家不需要,只是这个平面设计软件价格实在太高(CDR X8/8200:CDR 2017/9500一套),尤 ...

  3. 解决vuex刷新页面数据丢失

    1.前言 vue构建的项目中,vuex的状态存储是响应式的,当vue组件从store中读取状态的时候,若store中的状态发生变化,那么相应的组件也会得到高效刷新,问题来了,vuex存储的数据只是在页 ...

  4. matplotlib bar函数重新封装

    参考: https://blog.csdn.net/jenyzhang/article/details/52047557 https://blog.csdn.net/liangzuojiayi/art ...

  5. shell学习日志

    0.shell的变量同环境变量不同,存在用户环境区. 变量赋值的方式是: variable_name = variable_value a= "hello" $a对a进行取值 关于 ...

  6. HISTFILESIZE与HISTSIZE的区别

    在linux系统中,history命令可以输出历史命令,历史命令默认保存在文件~/.bash_history中. HISTFILESIZE 与 HISTSIZE都是history命令需要用到的两个sh ...

  7. Linux学习总结(12)——Linux必须学会的60个命令

    Linux系统信息存放在文件里,文件与普通的公务文件类似.每个文件都有自己的名字.内容.存放地址及其它一些管理信息,如文件的用户.文件的大小等. 文件可以是一封信.一个通讯录,或者是程序的源语句.程序 ...

  8. Docker_入门?只要这篇就够了!(纯干货适合0基础小白)

    与sgy一起开启你的Docker之路 关键词: Docker; mac; Docker中使用gdb无法进入断点,无法调试; 更新1: 看起来之前那一版博文中参考资料部分引用的外站链接太多,被系统自动屏 ...

  9. POJ——T 3041 Asteroids

    http://poj.org/problem?id=3041 Time Limit: 1000MS   Memory Limit: 65536K Total Submissions: 23565   ...

  10. java.sql.SQLException: The server time zone value 'Öйú±ê׼ʱ¼ä' is unrecognized

    Exception in thread "main" java.sql.SQLException: The server time zone value 'Öйú±ê׼ʱ¼ä ...