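The logic of neural networks should be familiar by now, so here I want to explain cross-validation.

Cross-validation method:

The figure gives the general idea: split the data set into K parts (folds). In each of K rounds, hold out a different fold as the test set and train on the remaining K-1 folds. This yields K errors, and their average is the final error.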
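
The procedure described above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the code used later in this post: the helper names `k_fold_indices`, `cross_validate`, `fit_mean` and `mse` are made up for the example, and the toy "model" is just the mean of the training labels.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle range(n_samples) and deal it into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(cases, labels, k, fit, error):
    """Average the test error over k train/test splits.

    fit(train_cases, train_labels) -> model
    error(model, test_cases, test_labels) -> float
    """
    errors = []
    for test_idx in k_fold_indices(len(cases), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(cases)) if i not in held_out]
        model = fit([cases[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        errors.append(error(model,
                            [cases[i] for i in test_idx],
                            [labels[i] for i in test_idx]))
    return sum(errors) / k


# Toy usage (hypothetical data): the "model" is the mean training label.
def fit_mean(train_cases, train_labels):
    return sum(train_labels) / len(train_labels)


def mse(model, test_cases, test_labels):
    return sum((y - model) ** 2 for y in test_labels) / len(test_labels)


cases = list(range(10))
labels = [2.0] * 10
cv_error = cross_validate(cases, labels, 5, fit_mean, mse)
print(cv_error)
```

Note that shuffling is an assumption that suits independent samples. For time series such as daily stock data, a shuffled split lets the model train on days that come after the test days, so a split that respects time order is usually the safer choice.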

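The first part builds the BP neural network. The second script below imports it as `BPnet`, so it should be saved as BPnet.py.

Parameter choices follow the paper: 基于数据挖掘技术的股价指数分析与预测研究 (胡林林)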

    import math
    import random

    import pandas as pd
    import tushare as ts

    random.seed(0)

    FEATURES = ['rate1', 'rate2', 'rate3', 'pos1', 'pos2', 'pos3',
                'amt1', 'amt2', 'amt3', 'MA20', 'MA5']


    def _build_samples(code, start, end):
        """Fetch daily bars and build the 11 input features plus the label 'r'."""
        df = ts.get_hist_data(code, start, end)
        # shift(n) below must mean "n trading days earlier", so sort dates ascending.
        df = df.sort_index()
        vol = df['volume']
        data = pd.DataFrame(index=df.index)
        for n in (1, 2, 3):
            op, hi = df['open'].shift(n), df['high'].shift(n)
            lo, cl = df['low'].shift(n), df['close'].shift(n)
            # intraday return n days ago
            data['rate%d' % n] = (cl - op) / op
            # position of the close inside that day's high-low range
            data['pos%d' % n] = (cl - lo) / (hi - lo)
            # volume n days ago relative to the 3-day average ending that day
            data['amt%d' % n] = vol.shift(n) / (
                (vol.shift(n) + vol.shift(n + 1) + vol.shift(n + 2)) / 3)
        # Both moving averages are lagged one day so they cannot see the close
        # being predicted (the original lagged them only in getDataR).
        data['MA20'] = df['ma20'].shift(1)
        data['MA5'] = df['ma5'].shift(1)
        # Label: daily return squashed into (0, 1) by a sigmoid with gain 100.
        ret = (df['close'] - df['close'].shift(1)) / df['close'].shift(1)
        data['r'] = [1 / (1 + math.exp(-x * 100)) for x in ret]
        # Rows with missing lags are dropped.
        return data.dropna(axis=0)


    def getData(code, start, end):
        """Return the feature rows as a list of 11-element lists."""
        return _build_samples(code, start, end)[FEATURES].values.tolist()


    def getDataR(code, start, end):
        """Return the labels as a list of 1-element lists, aligned with getData."""
        return _build_samples(code, start, end)[['r']].values.tolist()


    def rand(a, b):
        return (b - a) * random.random() + a


    def make_matrix(m, n, fill=0.0):
        return [[fill] * n for _ in range(m)]


    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))


    def sigmoid_derivative(y):
        # Derivative of the sigmoid expressed through its output y = sigmoid(x).
        return y * (1 - y)


    class BPNeuralNetwork:
        def __init__(self):
            self.input_n = 0
            self.hidden_n = 0
            self.output_n = 0
            self.input_cells = []
            self.hidden_cells = []
            self.output_cells = []
            self.input_weights = []
            self.output_weights = []
            self.input_correction = []
            self.output_correction = []

        def setup(self, ni, nh, no):
            self.input_n = ni + 1  # one extra input cell, fixed at 1.0, acts as the bias
            self.hidden_n = nh
            self.output_n = no
            # init cells
            self.input_cells = [1.0] * self.input_n
            self.hidden_cells = [1.0] * self.hidden_n
            self.output_cells = [1.0] * self.output_n
            # init weights with small random values
            self.input_weights = make_matrix(self.input_n, self.hidden_n)
            self.output_weights = make_matrix(self.hidden_n, self.output_n)
            for i in range(self.input_n):
                for h in range(self.hidden_n):
                    self.input_weights[i][h] = rand(-0.2, 0.2)
            for h in range(self.hidden_n):
                for o in range(self.output_n):
                    self.output_weights[h][o] = rand(-2.0, 2.0)
            # init correction (momentum) matrices
            self.input_correction = make_matrix(self.input_n, self.hidden_n)
            self.output_correction = make_matrix(self.hidden_n, self.output_n)

        def predict(self, inputs):
            # activate input layer (the last cell stays 1.0 as the bias)
            for i in range(self.input_n - 1):
                self.input_cells[i] = inputs[i]
            # activate hidden layer
            for j in range(self.hidden_n):
                total = 0.0
                for i in range(self.input_n):
                    total += self.input_cells[i] * self.input_weights[i][j]
                self.hidden_cells[j] = sigmoid(total)
            # activate output layer
            for k in range(self.output_n):
                total = 0.0
                for j in range(self.hidden_n):
                    total += self.hidden_cells[j] * self.output_weights[j][k]
                self.output_cells[k] = sigmoid(total)
            return self.output_cells[:]

        def back_propagate(self, case, label, learn, correct):
            # feed forward
            self.predict(case)
            # get output layer error
            output_deltas = [0.0] * self.output_n
            for o in range(self.output_n):
                error = label[o] - self.output_cells[o]
                output_deltas[o] = sigmoid_derivative(self.output_cells[o]) * error
            # get hidden layer error
            hidden_deltas = [0.0] * self.hidden_n
            for h in range(self.hidden_n):
                error = 0.0
                for o in range(self.output_n):
                    error += output_deltas[o] * self.output_weights[h][o]
                hidden_deltas[h] = sigmoid_derivative(self.hidden_cells[h]) * error
            # update output weights (learning step plus momentum term)
            for h in range(self.hidden_n):
                for o in range(self.output_n):
                    change = output_deltas[o] * self.hidden_cells[h]
                    self.output_weights[h][o] += learn * change + correct * self.output_correction[h][o]
                    self.output_correction[h][o] = change
            # update input weights
            for i in range(self.input_n):
                for h in range(self.hidden_n):
                    change = hidden_deltas[h] * self.input_cells[i]
                    self.input_weights[i][h] += learn * change + correct * self.input_correction[i][h]
                    self.input_correction[i][h] = change
            # get global error
            error = 0.0
            for o in range(len(label)):
                error += 0.5 * (label[o] - self.output_cells[o]) ** 2
            return error

        def train(self, cases, labels, limit=10000, learn=0.05, correct=0.1):
            error = 0.0
            for _ in range(limit):
                # the original reused `i` for both loops and discarded the error
                error = 0.0
                for j in range(len(cases)):
                    error += self.back_propagate(cases[j], labels[j], learn, correct)
            return error  # summed error of the last epoch

        def test(self, code, start="2015-01-05", end="2015-01-09"):
            # The default window is the one hard-coded in the original; it holds
            # too few rows to survive the 5-day lags, so pass a longer range.
            cases = getData(code, start, end)
            labels = getDataR(code, start, end)
            self.setup(11, 5, 1)
            self.train(cases, labels, 10000, 0.05, 0.1)
            # The original looped over an undefined `resulttest`; as a smoke test
            # this prints the predictions for the training cases instead.
            for case in cases:
                print(self.predict(case))
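
One detail of the listing above that is easy to miss: the training label is not the raw daily return but the return multiplied by 100 and passed through a sigmoid, so it always lies in (0, 1) like the network output. The sketch below shows that mapping and its inverse. The names `return_to_label` and `label_to_return` are made up for this illustration and are not part of the original code.

```python
import math

SCALE = 100  # same factor as in getData / getDataR


def return_to_label(ret):
    """Squash a daily return into (0, 1), as done for the 'r' column."""
    return 1.0 / (1.0 + math.exp(-ret * SCALE))


def label_to_return(label):
    """Invert the squashing: recover the return from a value in (0, 1)."""
    return math.log(label / (1.0 - label)) / SCALE


flat = return_to_label(0.0)     # a flat day maps to exactly 0.5
up = return_to_label(0.01)      # +1% maps above 0.5
down = return_to_label(-0.01)   # -1% maps below 0.5
recovered = label_to_return(up)
print(flat, up, down, recovered)
```

Read this way, a network output above 0.5 is a predicted rise and an output below 0.5 a predicted fall, and ranking stocks by the output is the same as ranking them by predicted return.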

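Below, data from 2014-2015 is used for training and data from 2016 as the test set. The rebalancing period is 20 trading days, roughly one month. The script predicts every stock in the SSE 50 (上证50), buys the 10 stocks with the highest predicted gain, and gives each stock the same amount of capital. A first run showed no problems, but it is very slow, so I will run it in full some day when I have time.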

    import datetime as dt

    import pandas as pd
    import tushare as ts

    import BPnet

    # nn = BPnet.BPNeuralNetwork()
    # nn.test('000001')

    holdList = pd.DataFrame(columns=['time', 'id', 'value'])
    share = ts.get_sz50s()['code']
    # The code argument is blank in the original listing; fill in the code
    # whose trading calendar should drive the backtest.
    time2 = ts.get_k_data('')['date']
    newtime = time2[400:640]
    newcount = 0
    for itime in newtime:
        print(itime)
        if newcount % 20 == 0:  # rebalance every 20 trading days
            rows = []
            for ishare in share:
                # 20 calendar days before itime (432000 * 4 seconds in the original)
                backwardtime = (dt.datetime.strptime(itime, '%Y-%m-%d')
                                - dt.timedelta(days=20)).strftime('%Y-%m-%d')
                trainData = BPnet.getData(ishare, '2014-05-22', itime)
                trainDataR = BPnet.getDataR(ishare, '2014-05-22', itime)
                testData = BPnet.getData(ishare, backwardtime, itime)
                try:
                    testData = testData[-1]  # last row returned by getData
                    print(testData)
                    nn = BPnet.BPNeuralNetwork()
                    nn.setup(11, 5, 1)
                    nn.train(trainData, trainDataR, 10000, 0.05, 0.1)
                    value = nn.predict(testData)[0]  # single output cell
                    rows.append({'time': itime, 'id': ishare, 'value': value})
                except Exception:
                    # e.g. no usable rows for this stock; skip it
                    continue
            # DataFrame.append and DataFrame.sort are gone from current pandas,
            # so build the frame from rows and use sort_values / concat.
            sharelist = pd.DataFrame(rows, columns=['time', 'id', 'value'])
            sharelist = sharelist.sort_values(by='value', ascending=False)[:10]
            holdList = pd.concat([holdList, sharelist], ignore_index=True)
        newcount += 1
    print(holdList)
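
The selection step in the script above (rank by predicted value, keep the top 10, give each pick the same capital, repeat every 20 trading days) can be isolated from the data download and the training. The sketch below does only that. The function names `pick_top` and `rebalance_days` and the stock codes 'A', 'B', 'C' are hypothetical and serve only to show the logic.

```python
def pick_top(predictions, n=10, capital=1.0):
    """Map each of the n best-ranked codes to an equal slice of capital.

    predictions: dict of stock code -> predicted value (higher is better).
    """
    ranked = sorted(predictions.items(), key=lambda kv: kv[1], reverse=True)
    chosen = ranked[:n]
    if not chosen:
        return {}
    slice_size = capital / len(chosen)
    return {code: slice_size for code, _ in chosen}


def rebalance_days(dates, period=20):
    """Every period-th trading day, starting with the first one."""
    return dates[::period]


# Hypothetical predictions for three stocks, keeping the best two.
allocation = pick_top({'A': 0.7, 'B': 0.4, 'C': 0.6}, n=2, capital=100.0)
print(allocation)
print(rebalance_days(list(range(45)), 20))
```

Keeping the ranking separate from the training loop also makes it easy to check the portfolio logic on made-up numbers before paying for the slow part, which is training one network per stock on every rebalancing day.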
