
Data Science, Machine Learning, Big Data Analytics, Cognitive Computing... well, all of us have been avalanched with articles, skills-demand infographics and points of view on these topics (yawn!). One thing is for sure: you cannot become a data scientist overnight. It's a journey, and certainly a challenging one. But how do you go about becoming one? Where do you start? When do you start seeing light at the end of the tunnel? What is the learning roadmap? What tools and techniques do you need to know? How will you know when you have achieved your goal?

Given how critical visualization is for data science, it is ironic that I was not able to find (with a few exceptions) a pragmatic yet visual representation of what it takes to become a data scientist. So here is my modest attempt at creating a curriculum, a learning plan that one can use on the journey to becoming a data scientist. I took inspiration from metro maps and used one to depict the learning path. I organized the overall plan progressively into the following areas / domains:

  1. Fundamentals
  2. Statistics
  3. Programming
  4. Machine Learning
  5. Text Mining / Natural Language Processing
  6. Data Visualization
  7. Big Data
  8. Data Ingestion
  9. Data Munging
  10. Toolbox

Each area / domain is represented as a “metro line”, with the stations depicting the topics you must learn / master / understand in a progressive fashion. The idea is that you pick a line, catch a train and go through all the stations (topics) until you reach the final destination or switch to the next line. I have marked each line 1 through 10 to indicate the order in which you travel. You can use this as an individual learning plan to identify the areas you most want to develop and then acquire the skills. By no means is this the end, but it is a solid start. Feel free to leave your comments and constructive feedback.
