Multiclass logistic regression

Multiclass classification (one-vs-rest)

#DATASET: https://archive.ics.uci.edu/ml/datasets/Glass+Identification
import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

import sklearn

import sklearn.preprocessing as pre

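df=pd.read_csv(r'data\glassi\glass.data') #raw string keeps the backslashes literal; pass header=None if your copy of the file has no header row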

X,y=df.iloc[:,1:-1],df.iloc[:,-1]

X,y=np.array(X),np.array(y)

for idx,class_name in enumerate(sorted(list(set(y)))):

    y[y==class_name]=idx
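
The loop above relabels the classes as consecutive indices starting at 0. As a minimal sketch of the same idea, `np.unique` with `return_inverse=True` does the mapping in one call. The array `raw_labels` is a made-up example for illustration, not the glass data.

```python
import numpy as np

# Hypothetical raw class labels with a gap (no class 4 appears here).
raw_labels = np.array([1, 2, 2, 5, 7, 1, 6, 3])

# classes: sorted unique labels; encoded: position of each label within classes.
classes, encoded = np.unique(raw_labels, return_inverse=True)

print(classes)   # sorted unique original labels
print(encoded)   # labels remapped to 0..len(classes)-1
```

Indexing `classes[encoded]` recovers the original labels, which is handy when reporting predictions in the dataset's own class numbering.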

from sklearn.model_selection import train_test_split

X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.15,random_state=66)

f_mean, f_std = np.mean(X_train, axis=0), np.std(X_train, axis=0)

X_train = (X_train - f_mean) / f_std

X_test = (X_test - f_mean) / f_std

#prepend a column of ones so that theta[0] acts as the bias (intercept) term

X_train = np.concatenate((np.ones((X_train.shape[0], 1)), X_train), axis=1)

X_test = np.concatenate((np.ones((X_test.shape[0], 1)), X_test), axis=1)

#batch gradient descent for a single binary logistic-regression classifier

def get_classifier(X_train,y_train,num_epoch=10000,alpha=0.01):

    theta=np.zeros(X_train.shape[1])

    for epoch in range(num_epoch):

        logist=np.dot(X_train,theta)

        h=1/(1+np.exp(-logist)) #hypothesis function

        cross_entropy_loss=(-y_train*np.log(h)-(1-y_train)*np.log(1-h)).mean() #kept for monitoring only; the update below does not use it

        gradient=np.dot((h-y_train),X_train)/y_train.size

        theta-=alpha*gradient #update

    return theta
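
The update in `get_classifier` uses `(h - y) @ X / n` as the gradient of the mean cross-entropy loss. One way to convince yourself of that formula is a finite-difference check. The sketch below is an illustration only: the helper names (`sigmoid`, `loss`, `analytic_grad`, `numeric_grad`) and the random demo data are assumptions made for this example and are not part of the original script.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(theta, X, y):
    # Mean binary cross-entropy, same expression as in get_classifier.
    h = sigmoid(X @ theta)
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()

def analytic_grad(theta, X, y):
    # Gradient used in the update step: (h - y) @ X / n.
    h = sigmoid(X @ theta)
    return (h - y) @ X / y.size

def numeric_grad(theta, X, y, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (loss(theta + step, X, y) - loss(theta - step, X, y)) / (2 * eps)
    return grad

# Small random problem (hypothetical data): bias column plus 3 features.
rng = np.random.default_rng(0)
X_demo = np.hstack([np.ones((20, 1)), rng.normal(size=(20, 3))])
y_demo = (rng.random(20) > 0.5).astype(float)
theta_demo = rng.normal(scale=0.1, size=4)

max_diff = np.max(np.abs(analytic_grad(theta_demo, X_demo, y_demo)
                         - numeric_grad(theta_demo, X_demo, y_demo)))
print(max_diff)  # should be tiny if the analytic gradient is right
```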

def multi_classifier(X_train,y_train):

    classes=np.unique(y_train) #sorted unique class labels

    parameter=np.zeros((len(classes),X_train.shape[1])) #one row of parameters per class

    for i,cls in enumerate(classes):

        label_t=np.zeros_like(y_train) #one-vs-rest target: 1 for the current class, 0 for all others

        label_t[y_train==cls]=1

        parameter[i,:]=get_classifier(X_train,label_t) #row i holds the parameters of the classifier for class cls

    return parameter

params = multi_classifier(X_train, y_train)

def pred(parameter,X_test,y_test):

    f_size=X_test.shape

    l_size=y_test.shape

    assert (f_size[0]==l_size[0])

    logist=np.dot(X_test,np.transpose(parameter)).squeeze()

    prob=1/(1+np.exp(-logist))

    pred=np.argmax(prob,axis=1) #pick the class whose one-vs-rest classifier gives the highest probability

    accuracy = np.sum(pred == y_test) / l_size[0] * 100

    return prob, pred, accuracy
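
The expression `1/(1+np.exp(-logist))` can trigger overflow warnings in `np.exp` when a logit is a large negative number. With standardized features this may never happen in practice, but a numerically safer variant is easy to write. The function below, `stable_sigmoid`, is a hypothetical helper sketched for illustration and is not part of the original script; it expects an array with at least one dimension.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that never exponentiates a large positive number."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so this form cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, rewrite as exp(z) / (1 + exp(z)); exp(z) < 1 here.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

demo = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(demo)
```

Because the sigmoid is monotonic, swapping it in would not change the `argmax` in `pred`; it only affects the reported probabilities and avoids the warning.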

_, preds, accu = pred(params, X_test, y_test)

print("Prediction: {}\n".format(preds))

print("Accuracy: {:.3f}%".format(accu))

Prediction: [0 1 0 4 1 5 1 0 0 1 0 1 0 0 5 1 1 1 1 0 5 4 0 1 5 0 0 1 1 0 3 1 0]

Accuracy: 66.667%
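
A single accuracy figure does not show which classes are being confused with each other. A confusion matrix gives that breakdown. The sketch below uses a hypothetical helper, `confusion_matrix`, and small made-up label arrays; it is not the output of the model above.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    # np.add.at accumulates correctly even when index pairs repeat.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

# Made-up labels for illustration only.
y_true_demo = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred_demo = np.array([0, 1, 1, 1, 2, 0, 2])

cm = confusion_matrix(y_true_demo, y_pred_demo, num_classes=3)
per_class_recall = cm.diagonal() / cm.sum(axis=1)

print(cm)
print(per_class_recall)
```

With the variables from the script, a call such as `confusion_matrix(y_test.astype(int), preds, len(np.unique(y_train)))` should work, assuming the labels are the integer indices 0..K-1 produced by the relabeling step.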
