Deep Reinforcement Learning Based Trading Application at JP Morgan Chase

https://medium.com/@ranko.mosic/reinforcement-learning-based-trading-application-at-jp-morgan-chase-f829b8ec54f2

FT released a story today about the new application that will optimize JP Morgan Chase trade execution ( Business Insider article on the same topic for readers that do not have FT subscription ). The intent is to reduce market impact and provide best trade execution results for large orders.

It is a complex application with many moving parts:

 

Its core is an RL algorithm that learns to perform the best action ( choose optimal price, duration and order size ) based on market conditions. It is not clear if it is Sarsa ( On-Policy TD Control) or Q-learning (Off-Policy Temporal Difference Control Algorithm ) as both algorithms are present in JP Morgan slides:

 

Sarsa

 

Q-learning

State consists of price series, expected spread cost, fill probability, size placed, as well as elapsed time, %progress, etc. Rewards are immediate rewards ( price spread ) and terminal ( end of episode ) rewards like completion, order duration and market penalties ( obviously those are negative rewards that punish the agent along these dimensions ).

 

Actions are memorized as weights of a Deep Neural Network — function approximation via NN is used since state, action space is too big to be handled in tabular form. We assume stochastic gradient descent is used for both feed forward and backprop operation operation ( hence Deep designation ):

 

JP Morgan is convinced this is the very first real time trading AI/ML application on Wall Street. We are assuming this is not true i.e. there are surely other players operating in this space as RL implementation to order execution is known for quite a while now ( Kearns and Nevmyvaka 2006 ).

The latest LOXM developmentswill be presented at QuantMinds Conference in Lisbon (May of 2018).

Instinet is also using Q-learning, probably for the same purpose ( market impact reduction ).

[转]Deep Reinforcement Learning Based Trading Application at JP Morgan Chase的更多相关文章

  1. 【资料总结】| Deep Reinforcement Learning 深度强化学习

    在机器学习中,我们经常会分类为有监督学习和无监督学习,但是尝尝会忽略一个重要的分支,强化学习.有监督学习和无监督学习非常好去区分,学习的目标,有无标签等都是区分标准.如果说监督学习的目标是预测,那么强 ...

  2. (转) Deep Reinforcement Learning: Playing a Racing Game

    Byte Tank Posts Archive Deep Reinforcement Learning: Playing a Racing Game OCT 6TH, 2016 Agent playi ...

  3. (转) Deep Reinforcement Learning: Pong from Pixels

    Andrej Karpathy blog About Hacker's guide to Neural Networks Deep Reinforcement Learning: Pong from ...

  4. (转) Playing FPS games with deep reinforcement learning

    Playing FPS games with deep reinforcement learning 博文转自:https://blog.acolyer.org/2016/11/23/playing- ...

  5. (zhuan) Deep Reinforcement Learning Papers

    Deep Reinforcement Learning Papers A list of recent papers regarding deep reinforcement learning. Th ...

  6. 论文笔记之:Asynchronous Methods for Deep Reinforcement Learning

    Asynchronous Methods for Deep Reinforcement Learning ICML 2016 深度强化学习最近被人发现貌似不太稳定,有人提出很多改善的方法,这些方法有很 ...

  7. [DQN] What is Deep Reinforcement Learning

    已经成为DL中专门的一派,高大上的样子 Intro: MIT 6.S191 Lecture 6: Deep Reinforcement Learning Course: CS 294: Deep Re ...

  8. 论文笔记:Learning how to Active Learn: A Deep Reinforcement Learning Approach

    Learning how to Active Learn: A Deep Reinforcement Learning Approach 2018-03-11 12:56:04 1. Introduc ...

  9. 18 Issues in Current Deep Reinforcement Learning from ZhiHu

    深度强化学习的18个关键问题 from: https://zhuanlan.zhihu.com/p/32153603 85 人赞了该文章 深度强化学习的问题在哪里?未来怎么走?哪些方面可以突破? 这两 ...

随机推荐

  1. 解决gitHub下载速度慢的问题

    转载:http://blog.csdn.net/x_studying/article/details/72588324 github被某个CDN被伟大的墙屏蔽所致. 解决方法: 1.访问http:// ...

  2. Analytic Functions in Oracle

    Contents Overview and IntroductionHow Analytic Functions WorkThe SyntaxExamplesCalculate a running T ...

  3. 运行B/s项目时,出现尝试访问类型与数组不兼容元素问题?

    1.问题描述 运行B/s项目时,浏览器出现应用程序中服务器错误(尝试访问类型与数组不兼容的元素) 2.问题原因 本人是项目引用的dll版本不一致问题,引用的System.Web.Mvc版本是4.0.0 ...

  4. 【Loadrunner_WebService接口】对项目中的GetProduct接口生成性能脚本

    一.环境 https://xxx.xxx.svc?wsdl 用户名:username 密码:password 对其中的GetProduct接口进行测试 备注:GetProducts.xml文件内容和S ...

  5. learning ddr mode register MR0

  6. bzoj1045

    题解: 随便推一下公式 然后发现是中位数 代码: #include<bits/stdc++.h> using namespace std; ],n; long long sum; int ...

  7. JDBC连接数据库:单线程、多线程、批处理插入数据的对比

    一.单线程(单条循环)插入50000条记录: 每执行一次就要访问一次数据库 import java.sql.Connection; import java.sql.DriverManager; imp ...

  8. jdk8-全新时间和日期api

    1.jdk8日期和时间api是线程安全的 1.java.time  处理日期时间 2.java.time.temporal: 时间校正器.获取每个月第一天,周几等等 3.java.time.forma ...

  9. java构造函数使用方法总结 (继承与构造函数)

    使用构造器时需要记住: 1.构造器必须与类同名(如果一个源文件中有多个类,那么构造器必须与公共类同名) 2.每个类可以有一个以上的构造器 3.构造器可以有0个.1个或1个以上的参数 4.构造器没有返回 ...

  10. DevExpress Windows 10 UWP Controls新版亮点

    行业领先的.NET界面控件2018年第二次重大更新——DevExpress v18.2日前正式发布,本站将以连载的形式为大家介绍新版本新功能.本文将介绍了DevExpress Windows 10 U ...