Introduction to Data Visualization – Theory, R & ggplot2

Data visualization is a very popular topic in the data science community. According to Mordor Intelligence, the market for visualization products is valued at $4 billion and is projected to reach $7 billion by the end of 2022. While we have seen amazing advances in the technology used to display information, our understanding of how, why, and when to use visualization techniques has not kept up. Unfortunately, people are often taught how to make a chart before they even think about whether it’s appropriate.

In short, are you adding value to your work, or are you simply adding a chart to make it seem less boring? Let’s take a look at some examples before going through the Stoltzmaniac Data Visualization Philosophy.


I have to give credit to Junk Charts – it inspired a lot of this post.

One author at Vox wanted to show the causes of death in all of Shakespeare.

Is this not insane!?!?!

Using a legend instead of data callouts is the only thing that could have made this worse. The author could easily have used a number of other tools to get the point across. While wordles are not ideal for any work requiring exact proportions, a wordle does make for a great visual in this case (see the Junk Charts article).

To be clear, I’m not close to being perfect when it comes to visualizations in my blog. The sizes, shapes, font colors, etc. tend to get out of control and I don’t take the time in R to tinker with all of the details. However, when it comes to displaying things professionally, it has to be spot on! So, I’ll walk through my theory and not worry too much about aesthetics (save that for a time when you’re getting paid).


The Good, The Bad, The Ugly

“The Good” visualizations:

  • Clearly illustrate a point
  • Are tailored to the appropriate audience
    • Analysts may want detail
    • Executives may want a high-level view
  • Are tailored to the presentation medium
    • A piece in an academic journal can be analyzed slowly and carefully
    • A slide in front of 5,000 people at a conference will be glanced at quickly
  • Are memorable to those who care about the material
  • Make an impact which increases the understanding of the subject matter

“The Bad” visualizations:

  • Are difficult to interpret
  • Are unintentionally misleading
  • Contain redundant and boring information

“The Ugly” visualizations:

  • Are almost impossible to interpret
  • Are filled with completely worthless information
  • Are intentionally created to mislead the audience
  • Are inaccurate

Coming soon:

  • Introduction to ggplot2 in R and how it works
  • Determining whether or not you need a visualization
  • Choosing the type of plot to use depending on the use case
  • Visualization beyond the standard charts and graphs

As always, the code used in this post is on my GitHub.

Reposted from: https://www.stoltzmaniac.com/data-visualization-part-1/
