一直想着抓取股票的变化,偶然的机会在看股票数据抓取的博客看到了kaggle,然后看了看里面的题,感觉挺新颖的,就试了试. 题目如图:给了一个train.csv,现在预测test.csv里面的Passager是否幸存.train.csv里面包含的乘客信息有 PassagerId 乘客id Survived 乘客是否幸存 Pclass 仓位 Name 乘客姓名 Sex 乘客性别 Age 乘客年龄 SibSp 船上是否有兄弟姐妹 Parch 穿上是否有父母子女 Ticket 船票信息 Fare 票价…
 下面一文章就总结几点关键: 1.要学会观察,尤其是输入数据的特征提取时,看各输入数据和输出的关系,用绘图看! 2.训练后,看测试数据和训练数据误差,确定是否过拟合还是欠拟合: 3.欠拟合的话,说明模型不准确或者特征提取不够,对于特征提取不够问题,可以根据模型的反馈来看其和数据的相关性,如果相关系数是0,则放弃特征,如果过低,说明特征需要再次提炼! 4.用集成学习,bagging等通常可以获得更高的准确度! 5.缺失数据可以使用决策树回归进行预测! 转自:http://blog.csdn.net…
项目地址 https://www.kaggle.com/c/titanic 项目介绍: 除了乘客的编号以外,还包括下表中10个字段,构成了数据的所有特征 Variable Definition Key survival 是否存活 0 = No, 1 = Yes pclass 票的等级 1 = 1st, 2 = 2nd, 3 = 3rd sex 性别   Age 年龄   sibsp 同乘配偶或兄弟姐妹   parch 同乘孩子或父母   ticket 票号   fare 乘客票价   cabin…
泰坦尼克号幸存预测是本小白接触的第一个Kaggle入门比赛,主要参考了以下两篇教程: https://www.cnblogs.com/star-zhao/p/9801196.html https://zhuanlan.zhihu.com/p/30538352 本模型在Leaderboard上的最高得分为0.79904,排名前13%. 由于这个比赛做得比较早了,当时很多分析的细节都忘了,而且由于是第一次做,整体还是非常简陋的.今天心血来潮,就当做个简单的记录(流水账). 导入相关包: import…
A Data Science Framework: To Achieve 99% Accuracy https://www.kaggle.com/ldfreeman3/a-data-science-framework-to-achieve-99-accuracy/notebook 额,总共花了2天时间才把上面这个优秀回答运行完,前面还算看得懂,如何清理数据,和画图看联系 但是后面的数据处理,使用各种模型,不知道原理是什么,后面还得花点时间补一下,现在这里记录一下 疑问汇总: 第一问,第21行,左…
Method Feature(s) Sample(s) Result Value/Feature Permutation Importance 1 all validation samples Single Scale Partial Dependence Plots 1~2 all validation samples Vector(reasults vs feature) SHAP Values N individual sample 每个feature对当前结果的贡献(相对于baselin…
https://www.quora.com/How-do-I-learn-machine-learning-1?redirected_qid=6578644   How Can I Learn X? Learning Machine Learning Learning About Computer Science Educational Resources Advice Artificial Intelligence How-to Question Learning New Things Lea…
昨天总结了深度学习的资料,今天把机器学习的资料也总结一下(友情提示:有些网站需要"科学上网"^_^) 推荐几本好书: 1.Pattern Recognition and Machine Learning (by Hastie, Tibshirani, and Friedman's ) 2.Elements of Statistical Learning(by Bishop's) 这两本是英文的,但是非常全,第一本需要有一定的数学基础,第可以先看第二本.如果看英文觉得吃力,推荐看一下下面…
Step 1: Basic Python Skills install Anacondaincluding numpy, scikit-learn, and matplotlib Step 2: Foundational Machine Learning Skills Unofficial Andrew Ng course notes Tom Mitchell Machine Learning Lectures Step 3: Scientific Python Packages Overvie…
本文汇编了一些机器学习领域的框架.库以及软件(按编程语言排序). 1. C++ 1.1 计算机视觉 CCV —基于C语言/提供缓存/核心的机器视觉库,新颖的机器视觉库 OpenCV—它提供C++, C, Python, Java 以及 MATLAB接口,并支持Windows, Linux, Android and Mac OS操作系统. 1.2 机器学习 MLPack DLib ecogg shark 2. Closure Closure Toolbox—Clojure语言库与工具的分类目录 3…
<Brief History of Machine Learning> 介绍:这是一篇介绍机器学习历史的文章,介绍很全面,从感知机.神经网络.决策树.SVM.Adaboost到随机森林.Deep Learning. <Deep Learning in Neural Networks: An Overview> 介绍:这是瑞士人工智能实验室Jurgen Schmidhuber写的最新版本<神经网络与深度学习综述>本综述的特点是以时间排序,从1940年开始讲起,到60-80…
from:http://analyticsbot.ml/2016/10/machine-learning-pre-processing-features/ Machine Learning : Pre-processing features October 21, 2016 I am participating in this Kaggle competition. It is a prediction problem contest. The problem statement is: How…
What: 就是将统计学算法作为理论,计算机作为工具,解决问题.statistic Algorithm. How: 如何成为菜鸟一枚? http://www.quora.com/How-can-a-beginner-train-for-machine-learning-contests 链接内容总结: "学习任何一门学科,framework是必不可少的东西.没有framework的东西,那是研究." -- Jason Hawk One thing is for sure; you ca…
Common Pitfalls In Machine Learning Projects In a recent presentation, Ben Hamner described the common pitfalls in machine learning projects he and his colleagues have observed during competitions on Kaggle. The talk was titled "Machine Learning Grem…
<Brief History of Machine Learning> 介绍:这是一篇介绍机器学习历史的文章,介绍很全面,从感知机.神经网络.决策树.SVM.Adaboost 到随机森林.Deep Learning. <Deep Learning in Neural Networks: An Overview> 介绍:这是瑞士人工智能实验室 Jurgen Schmidhuber 写的最新版本<神经网络与深度学习综述>本综述的特点是以时间排序,从 1940 年开始讲起,到…
转自:机器学习(Machine Learning)&深度学习(Deep Learning)资料 <Brief History of Machine Learning> 介绍:这是一篇介绍机器学习历史的文章,介绍很全面,从感知机.神经网络.决策树.SVM.Adaboost到随机森林.Deep Learning. <Deep Learning in Neural Networks: An Overview> 介绍:这是瑞士人工智能实验室Jurgen Schmidhuber写的最…
##机器学习(Machine Learning)&深度学习(Deep Learning)资料(Chapter 2)---#####注:机器学习资料[篇目一](https://github.com/ty4z2008/Qix/blob/master/dl.md)共500条,[篇目二](https://github.com/ty4z2008/Qix/blob/master/dl2.md)开始更新------#####希望转载的朋友**一定要保留原文链接**,因为这个项目还在继续也在不定期更新.希望看到…
机器学习系统设计(Building Machine Learning Systems with Python)- Willi Richert Luis Pedro Coelho 总述 本书是 2014 的,看完以后才发现有第二版的更新,2016.建议阅读最新版,有能力的建议阅读英文版,中文翻译有些地方比较别扭(但英文版的书确实是有些贵). 我读书的目的:泛读主要是想窥视他人思考的方式. 作者写书的目标:面向初学者,但有时间看看也不错.作者说"我希望它能激发你的好奇心,并足以让你保持渴望,不断探索…
本章通过一个例子,介绍机器学习的整个流程. 2.1 使用真实数据集练手(Working with Real Data) 国外一些获取数据的网站: Popular open data repositories: UC Irvine Machine Learning Repository Kaggle datasets Amazon's AWS datasets Meta portals (they list open data repositories): http://dataportals.o…
Complete Small Focused Projects and Demonstrate Your Skills (完成小型针对性机器学习项目,证明你的能力) A portfolio is typically used by designers and artists to show examples of prior work to prospective clients and employers. Design, art and photography are examples wh…
Python Tools for Machine Learning Python is one of the best programming languages out there, with an extensive coverage in scientific computing: computer vision, artificial intelligence, mathematics, astronomy to name a few. Unsurprisingly, this hold…
Brief History of Machine Learning My subjective ML timeline Since the initial standpoint of science, technology and AI, scientists following Blaise Pascal and Von Leibniz ponder about a machine that is intellectually capable as much as humans. Famous…
https://www.quora.com/How-do-I-learn-mathematics-for-machine-learning   How do I learn mathematics for machine learning? Promoted by Time Doctor Software for productivity tracking. Time tracking and productivity improvement software with screenshots…
BRIEF HISTORY OF MACHINE LEARNING My subjective ML timeline (click for larger) Since the initial standpoint of science, technology and AI, scientists following Blaise Pascal and Von Leibniz ponder about a machine that is intellectually capable as muc…
Hi, Long time no see. Briefly, I plan to step into this new area, data analysis. In the past few years, I have tried Linux programming, device driver development, android application development and RF SOC development. Thus, "data analysis become my…
本博客汇总了个人在学习过程中所看过的一些论文.代码.资料以及常用的资源与网站,为了便于记录自身的学习过程,将其整理于博客之中. Machine Learning (1) Machine Learning Video Library - Caltech说明:罗列了机器学习的常用算法以及机器学习图谱 (2) Deep Learning - Bengio 说明:Deep Learning三大牛之一Bengio写的一本书 (3) Understanding LSTM Networks 说明:非常棒的LS…
转载:http://dataunion.org/8463.html?utm_source=tuicool&utm_medium=referral <Brief History of Machine Learning> 介绍:这是一篇介绍机器学习历史的文章,介绍很全面,从感知机.神经网络.决策树.SVM.Adaboost到随机森林.Deep Learning. <Deep Learning in Neural Networks: An Overview> 介绍:这是瑞士人工智…
How to handle Imbalanced Classification Problems in machine learning? from:https://www.analyticsvidhya.com/blog/2017/03/imbalanced-classification-problem/ Introduction If you have spent some time in machine learning and data science, you would have d…
机器学习(Machine Learning)&深度学习(Deep Learning)资料 機器學習.深度學習方面不錯的資料,轉載. 原作:https://github.com/ty4z2008/Qix/blob/master/dl.md 原作作者會不斷更新.本文更新至2014-12-21 <Brief History of Machine Learning> 介绍:这是一篇介绍机器学习历史的文章,介绍非常全面.从感知机.神经网络.决策树.SVM.Adaboost到随机森林.Deep L…
<Brief History of Machine Learning> 介绍:这是一篇介绍机器学习历史的文章,介绍很全面,从感知机.神经网络.决策树.SVM.Adaboost到随机森林.Deep Learning. <Deep Learning in Neural Networks: An Overview> 介绍:这是瑞士人工智能实验室Jurgen Schmidhuber写的最新版本<神经网络与深度学习综述>本综述的特点是以时间排序,从1940年开始讲起,到60-80…