Before you can plot anything, you need to specify which backend Matplotlib should use. The simplest option is to use Jupyter’s magic command %matplotlib inline. This tells Jupyter to set up Matplotlib so it uses Jupyter’s own backend.

Scatter Plot

housing.plot(kind="scatter", x="longitude", y="latitude")

You can set the parameter alpha to study the density of points:

housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.1)

The plot can convey more information by setting different colors, sizes, shapes, etc. Here we will use a predefined color map (option cmap) called jet. As an example, we plot the house prices in different locations and let the radius of each circle represents the district’s population (option s), and the color represents the price (option c).

 %matplotlib inline
import matplotlib.pyplot as plt
housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.4,
s=housing["population"]/100, label="population", figsize=(10,7),
c="median_house_value", cmap=plt.get_cmap("jet"), colorbar=True,
sharex=False)
plt.legend()
save_fig("housing_prices_scatterplot")

Note that the argument sharex=False fixes a display bug (the x-axis values and legend were not displayed). This is a temporary fix (see: https://github.com/pandas-dev/pandas/issues/10611).

Scatter Matrix

 from pandas.plotting import scatter_matrix

 attributes = ["median_house_value", "median_income", "total_rooms",
"housing_median_age"]
scatter_matrix(housing[attributes], figsize=(12, 8))
save_fig("scatter_matrix_plot")

Histogram

Histogram is a useful method to study the distribution of numeric attributes.

 %matplotlib inline
import matplotlib.pyplot as plt
housing.hist(bins=50, figsize=(20,15))
save_fig("attribute_histogram_plots")
plt.show()

For single attribute, you can use the following statement:

housing["median_income"].hist()

Correlation Plot

We can calculate the correlation coefficients between each pair of attributes using corr() method and look at the value by sort_values():

corr_matrix = housing.corr()
corr_matrix["median_house_value"].sort_values(ascending=False)

Also, we can use scatter_matrix function, which plots every numerical attribute against every other numerical attribute. The diagonal displays the histogram of each attribute.

from pandas.tools.plotting import scatter_matrix
attributes = ["median_house_value", "median_income", "total_rooms", "housing_median_age"]
scatter_matrix(housing[attributes], figsize=(12, 8))

[Machine Learning with Python] Data Visualization by Matplotlib Library的更多相关文章

  1. [Machine Learning with Python] Data Preparation through Transformation Pipeline

    In the former article "Data Preparation by Pandas and Scikit-Learn", we discussed about a ...

  2. [Machine Learning with Python] Data Preparation by Pandas and Scikit-Learn

    In this article, we dicuss some main steps in data preparation. Drop Labels Firstly, we drop labels ...

  3. Getting started with machine learning in Python

    Getting started with machine learning in Python Machine learning is a field that uses algorithms to ...

  4. Python (1) - 7 Steps to Mastering Machine Learning With Python

    Step 1: Basic Python Skills install Anacondaincluding numpy, scikit-learn, and matplotlib Step 2: Fo ...

  5. 《Learning scikit-learn Machine Learning in Python》chapter1

    前言 由于实验原因,准备入坑 python 机器学习,而 python 机器学习常用的包就是 scikit-learn ,准备先了解一下这个工具.在这里搜了有 scikit-learn 关键字的书,找 ...

  6. Coursera, Big Data 4, Machine Learning With Big Data (week 1/2)

    Week 1 Machine Learning with Big Data KNime - GUI based Spark MLlib - inside Spark CRISP-DM Week 2, ...

  7. 【Machine Learning】Python开发工具:Anaconda+Sublime

    Python开发工具:Anaconda+Sublime 作者:白宁超 2016年12月23日21:24:51 摘要:随着机器学习和深度学习的热潮,各种图书层出不穷.然而多数是基础理论知识介绍,缺乏实现 ...

  8. In machine learning, is more data always better than better algorithms?

    In machine learning, is more data always better than better algorithms? No. There are times when mor ...

  9. Machine Learning的Python环境设置

    Machine Learning目前经常使用的语言有Python.R和MATLAB.如果采用Python,需要安装大量的数学相关和Machine Learning的包.一般安装Anaconda,可以把 ...

随机推荐

  1. vue之列表循环

    文档:https://cn.vuejs.org/v2/guide/list.html 当 Vue.js 用 v-for 正在更新已渲染过的元素列表时,它默认用“就地复用”策略.如果数据项的顺序被改变, ...

  2. Python中的端口协议之基于UDP协议的通信传输

    UDP协议: 1.python中基于udp协议的客户端与服务端通信简单过程实现 2.udp协议的一些特点(与tcp协议的比较)        3.利用socketserver模块实现udp传输协议的并 ...

  3. pyecharts用法,本人亲测,陆续更新

    主题 除了默认的白色底色和dark之外,还支持安装扩展包 pip install echarts-themes-pypkg echarts-themes-pypkg 提供了 vintage, maca ...

  4. #pragma与_Pragma(转载)

    C90为预处理指令家族带来一位新成员:#pragma.一般情况下,大家很少见到它.        #pragma的作用是为特定的编译器提供特定的编译指示,这些指示是具体针对某一种(或某一些)编译器的, ...

  5. 通过IAR工程文件查看对应IAR版本号

    IAR使用技巧——如何使用合适的版本打开IAR工程 2014年07月05日 23:49:08 xukai871105 阅读数:12895 标签: IAR 更多 个人分类: 嵌入式ARM   0.前言 ...

  6. Win7里IIS7部署WebService

    最近忙于一个Web的Bug修正,是先人写的一个东东,架构很简单,一个前端的项目,一个WebService的项目,以及后台的一些dll.之前一直很排斥这个产品,因为它没法启动,印象中没有跑得起来过的时候 ...

  7. LA 5010 Go Deeper 2-SAT 二分

    题意: 有\(n\)个布尔变量\(x_i\),有一个递归函数.如果满足条件\(x[a[dep]] + x[b[dep]] \neq c[dep]\),那么就再往深递归一层. 问最多能递归多少层. 分析 ...

  8. js---JSONP原理及使用

    极简解释: 利用<script>标签没有跨域限制的“漏洞”(历史遗迹啊)来达到与第三方通讯的目的.当需要通讯时,本站脚本创建一个<script>元素,地址指向第三方的API网址 ...

  9. 爬虫Scrapy框架-Crawlspider链接提取器与规则解析器

    Crawlspider 一:Crawlspider简介 CrawlSpider其实是Spider的一个子类,除了继承到Spider的特性和功能外,还派生除了其自己独有的更加强大的特性和功能.其中最显著 ...

  10. sqlserver修改一个列

    --修改一个列alter table UserInfo alter Column [Address] nvarchar(64) null