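Suppose we have a piece of text: "I have a cat, his name is Huzihu. Huzihu is really cute and friendly. We are good friends." How can we extract features from this text? A simple approach is the bag-of-words model: choose a set of words from the text to put in the bag, count how many times each word in the bag occurs in the text (ignoring grammar and word order), and represent the counts as a vector. Word counts can be computed with scikit-learn's CountVectori…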
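The counting step described in the excerpt above can be sketched in plain Python. This is a minimal illustration using only the standard library, not the scikit-learn class the excerpt goes on to mention. The tokenisation rule (lowercase, letters only) and the names `tokenize`, `vocabulary` and `vector` are assumptions made for this example. Unlike this sketch, scikit-learn's default tokenizer also drops single-character tokens such as "I" and "a".

```python
import re
from collections import Counter

text = ("I have a cat, his name is Huzihu. Huzihu is really cute and "
        "friendly. We are good friends.")


def tokenize(document):
    """Lowercase the text and split it into alphabetic tokens (assumed rule)."""
    return re.findall(r"[a-z]+", document.lower())


# Count every token, ignoring grammar and word order.
counts = Counter(tokenize(text))

# Fix an ordering for the bag so the counts can be written as a vector.
vocabulary = sorted(counts)
vector = [counts[word] for word in vocabulary]

for word, count in zip(vocabulary, vector):
    print(f"{word}: {count}")
```

Each position of `vector` corresponds to one word in `vocabulary`, so two texts encoded against the same vocabulary can be compared entry by entry.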
https://www.quora.com/How-do-I-learn-mathematics-for-machine-learning How do I learn mathematics for machine learning?
@(131 - Machine Learning) 1 Feature Scaling: feature scaling transforms each feature to the range [0, 1] according to the formula $x' = \frac{x - x_{min}}{x_{max} - x_{min}}$. 1.1 Sklearn - MinMaxScaler: from sklearn.preprocessing import MinMaxScaler import numpy weigh…
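The formula in the excerpt above can be checked by hand with a small standard-library sketch. This is not the MinMaxScaler code from the truncated post. The function name `min_max_scale`, the sample values and the choice to map a constant feature to 0.0 are all assumptions made for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1] using x' = (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Assumed convention: a constant feature carries no spread, so map it to 0.0.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


# Hypothetical feature values, chosen only to illustrate the formula.
sample = [115.0, 140.0, 175.0]
scaled = min_max_scale(sample)
print(scaled)
```

The smallest value maps to 0, the largest to 1, and every other value lands in between in proportion to its distance from the minimum.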
In recent years, kernel methods have received major attention, particularly due to the increased popularity of Support Vector Machines. Kernel functions can be used in many applications as they provide a simple bridge from linearity to non-linear…
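One way to see the bridge the excerpt above describes is to compare a kernel value with an inner product in an explicit feature space. The sketch below uses the degree-2 homogeneous polynomial kernel on 2-D inputs as an assumed example. The excerpt does not name a specific kernel, and `poly2_kernel`, `phi` and the sample points are hypothetical names and values chosen for illustration.

```python
import math


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """Degree-2 polynomial kernel k(x, y) = (x . y)^2, computed in input space."""
    return dot(x, y) ** 2


def phi(x):
    """Explicit feature map for 2-D input whose inner product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


# Hypothetical sample points.
x = (1.0, 2.0)
y = (3.0, -1.0)

k_implicit = poly2_kernel(x, y)    # never leaves the 2-D input space
k_explicit = dot(phi(x), phi(y))   # same value via the 3-D feature space
print(k_implicit, k_explicit)
```

A linear method that only needs inner products can therefore work in the quadratic feature space by calling the kernel, without ever building `phi(x)`.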
Machine Learning Methods: Decision trees and forests This post contains our crib notes on the basics of decision trees and forests. We first discuss the construction of individual trees, and then introduce random and boosted forests. We also discuss…
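Machine Learning – Coursera. Octave for Microsoft Windows; GNU Octave official site; GNU Octave documentation (a 900-page PDF version is available); Installing Octave 4.0.0 on Win7 (Wenku); Octave study notes (Wenku); Getting started with Octave (Wenku); Installing the JDK and configuring environment variables on 64-bit Win7 (when it keeps reporting that Java is not installed); MathWorks. This week we're covering linear regression with mul…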
In machine learning, is more data always better than better algorithms? No. There are times when more data helps, there are times when it doesn't. Probably one of the most famous quotes defending the power of data is that of Google's Research Directo…