Supervised Learning

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

Example 1:

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.

We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.

Example 2:

(a) Regression - Given a picture of a person, we have to predict their age on the basis of the given picture

(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.

Unsupervised Learning

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results.

Example:

Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

Non-clustering: The "Cocktail Party Algorithm", allows you to find structure in a chaotic environment. (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).

Cost Function

This function is otherwise called the "Squared error function", or "Mean squared error".

ML_note1的更多相关文章

随机推荐

  1. Zookeeper的安装和配置

    1.ZooKeeper 1.1 zk可以用来保证数据在zk集群之间的数据的事务性一致.2.如何搭建ZooKeeper服务器集群 2.1 zk服务器集群规模不小于3个节点,要求各服务器之间系统时间要保持 ...

  2. Centos修改默认网卡名

    安装系统后默认的网卡名称为 enpXX ,修改为熟悉的eth0 1 vi /etc/default/grub GRUB_TIMEOUT=5GRUB_DEFAULT=savedGRUB_DISABLE_ ...

  3. Hibernate中,将session绑定到线程时,在保存和查询数据的代码里,要正确的关闭session

    比如有个保存的方法 // 保存 public void save(){ Transaction t = XXX Session s = getSession.beginTransaction(); X ...

  4. 几种在shell命令行中过滤adb logcat输出的方法

    我们在Android开发中总能看到程序的log日志内容充满了屏幕,而真正对开发者有意义的信息被淹没在洪流之中,让开发者无所适从,严重影响开发效率.本文就具体介绍几种在shell命令行中过滤adblog ...

  5. SQL Server2008数据库中删除用户,提示数据库主体在该数据库中拥有 架构,无法删除

    一个数据库,运行在SQL Server 2008下,数据库用户无法删除,在删除时提示“数据库主体在该数据库中拥有架构,无法删除”.原因很简单,就是由于此用户在数据库中拥有某些架构的所有权,将相关架构的 ...

  6. 安装tomcat过程中出现问题小结

    报错信息如下:Neither the JAVA_HOME nor the JRE_HOME environment variable is defined At least one of these ...

  7. 【WiFi密码破解详细图文教程】ZOL仅此一份 详细介绍从CDlinux U盘启动到设置扫描破解-破解软件论坛-ZOL中关村在线

    body { font-family: Microsoft YaHei UI,"Microsoft YaHei", Georgia,Helvetica,Arial,sans-ser ...

  8. PAT (Advanced Level) 1016. Phone Bills (25)

    简单模拟题. #include<iostream> #include<cstring> #include<cmath> #include<algorithm& ...

  9. 【BZOj 3670】【UOJ #5】【NOI 2014】动物园

    http://www.lydsy.com/JudgeOnline/problem.php?id=3670 http://uoj.ac/problem/5 可以建出"KMP自动机"然 ...

  10. Lua学习系列(五)

    calling C functions from Lua 5.2 这篇文章也不错: http://blog.csdn.net/x356982611/article/details/26688287 h ...