Supervised Learning

In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.

Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.

Example 1:

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.

We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.

Example 2:

(a) Regression - Given a picture of a person, we have to predict their age on the basis of the given picture

(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.

Unsupervised Learning

Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.

We can derive this structure by clustering the data based on relationships among the variables in the data.

With unsupervised learning there is no feedback based on the prediction results.

Example:

Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles, and so on.

Non-clustering: The "Cocktail Party Algorithm", allows you to find structure in a chaotic environment. (i.e. identifying individual voices and music from a mesh of sounds at a cocktail party).

Cost Function

This function is otherwise called the "Squared error function", or "Mean squared error".

ML_note1的更多相关文章

随机推荐

  1. git bash退回上一个文件夹

    cd ..\ a@w3311 MINGW32 /f/Projects/crm (master) $ cd..\ > bash: cd..: command not found a@w3311 M ...

  2. Attrib命令,可以让文件夹彻底的隐藏起来

    Attrib命令,可以让文件夹彻底的隐藏起来,就算是在文件夹选项中设置了显示隐藏文件夹,也无法显示出来的.只能通过路径访问的方式打开文件夹.如上图,就是attrib命令的隐藏文件夹和显示文件夹的两条命 ...

  3. hide the navigationBar and tabBar

    [self.navigationController setNavigationBarHidden:YES animated:NO]; hidesBottomBarWhenPushed

  4. jstring, String, char* 变换函数

    #include <malloc.h> #include <string.h> #include <stdlib.h> #include <vcclr.h&g ...

  5. DWR整合之Servlet

    DWR 与 Servlet 有 2 个 Java 类你一般需要用在 DWR 中,是 webContext 和 WebContextFactory 在 DWR 1.x 它们在 uk.ltd.getahe ...

  6. Python 函数简介之一

    面向对象--->对象------>Class 面向过程--->过程------>def 函数式编程--->函数---->def 函数的介绍: def fun1(): ...

  7. .NET程序默认启动线程数

    问:一个.NET程序在运行时到底启动了多少个线程? 答:至少3个. 启动CLR并运行Main方法的主线程 调试器帮助线程 Finalizer线程 class Program { static void ...

  8. HDU 1317(Floyd判断连通性+spfa判断正环)

    XYZZY Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Others)Total Submi ...

  9. Super Jumping! Jumping! Jumping! 基础DP

    Super Jumping! Jumping! Jumping! Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 ...

  10. 【poj解题】1308

    判断一个图是否是一个树,树满足一下2个条件即可:1. 边树比node数少12. 所有node的入度非0即1 节点数是0的时候,空树,合法树~ 代码如下 #include <stdio.h> ...