kaggle之人脸特征识别

facial-keypoints-detection, 这是一个人脸识别任务，任务是识别人脸图片中的眼睛、鼻子、嘴的位置。训练集包含以下15个位置的坐标，行末是图片的像素值，共96*96个像素值。测试集只包含图片的像素值。

left_eye_center, right_eye_center, left_eye_inner_corner, left_eye_outer_corner, right_eye_inner_corner, right_eye_outer_corner, left_eyebrow_inner_end, left_eyebrow_outer_end, right_eyebrow_inner_end, right_eyebrow_outer_end, nose_tip, mouth_left_corner, mouth_right_corner, mouth_center_top_lip, mouth_center_bottom_lip

import cPickle as pickle

from datetime import datetime

import os

import sys

import numpy as np

import pandas as pd

from lasagne import layers

from nolearn.lasagne import BatchIterator

from nolearn.lasagne import NeuralNet

from pandas.io.parsers import read_csv

from sklearn.utils import shuffle

import theano

/Library/Python/2.7/site-packages/theano/tensor/signal/downsample.py:6: UserWarning: downsample module has been moved to the theano.tensor.signal.pool module.

  "downsample module has been moved to the theano.tensor.signal.pool module.")

数据载入与预览

train_file = 'training.csv'

test_file = 'test.csv'

def load(test=False, cols=None):

    """

    载入数据，通过参数控制载入训练集还是测试集，并筛选特征列

    """

    fname = test_file if test else train_file

    df = pd.read_csv(os.path.expanduser(fname))

    # 将图像数据转换为数组

    df['Image'] = df['Image'].apply(lambda x: np.fromstring(x, sep=' '))

    # 筛选指定的数据列

    if cols:

        df = df[list(cols) + ['Image']]

    print(df.count())  # 每列的简单统计

    df = df.dropna()  # 删除空数据

    # 归一化到0到1

    X = np.vstack(df['Image'].values) / 255.

    X = X.astype(np.float32)

    # 针对训练集目标标签进行归一化

    if not test:

        y = df[df.columns[:-1]].values

        y = (y - 48) / 48

        X, y = shuffle(X, y, random_state=42)

        y = y.astype(np.float32)

    else:

        y = None

    return X, y

# 将单行像素数据转换为三维矩阵

def load2d(test=False, cols=None):

    X, y = load(test=test, cols=cols)

    X = X.reshape(-1, 1, 96, 96)

    return X, y

数据处理

一种方式是我们训练一个分类器，用来分类所有的目标特征。另一种是针对眼镜、鼻子、嘴分别设置不同的分类器，每个分类器只预测单个目标。通过观察数据我们发现，训练集中有许多缺失数据，如果训练一个分类器，删掉缺失数据会让我们的样本集变小，不能很好地利用起数据，因此，我们选择第二种方式，每个目标训练一个分类器，这样更好的利用样本数据。

from collections import OrderedDict

from sklearn.base import clone

SPECIALIST_SETTINGS = [

    dict(

        columns=(

            'left_eye_center_x', 'left_eye_center_y',

            'right_eye_center_x', 'right_eye_center_y',

            ),

        flip_indices=((0, 2), (1, 3)),

        ),

    dict(

        columns=(

            'nose_tip_x', 'nose_tip_y',

            ),

        flip_indices=(),

        ),

    dict(

        columns=(

            'mouth_left_corner_x', 'mouth_left_corner_y',

            'mouth_right_corner_x', 'mouth_right_corner_y',

            'mouth_center_top_lip_x', 'mouth_center_top_lip_y',

            ),

        flip_indices=((0, 2), (1, 3)),

        ),

    dict(

        columns=(

            'mouth_center_bottom_lip_x',

            'mouth_center_bottom_lip_y',

            ),

        flip_indices=(),

        ),

    dict(

        columns=(

            'left_eye_inner_corner_x', 'left_eye_inner_corner_y',

            'right_eye_inner_corner_x', 'right_eye_inner_corner_y',

            'left_eye_outer_corner_x', 'left_eye_outer_corner_y',

            'right_eye_outer_corner_x', 'right_eye_outer_corner_y',

            ),

        flip_indices=((0, 2), (1, 3), (4, 6), (5, 7)),

        ),

    dict(

        columns=(

            'left_eyebrow_inner_end_x', 'left_eyebrow_inner_end_y',

            'right_eyebrow_inner_end_x', 'right_eyebrow_inner_end_y',

            'left_eyebrow_outer_end_x', 'left_eyebrow_outer_end_y',

            'right_eyebrow_outer_end_x', 'right_eyebrow_outer_end_y',

            ),

        flip_indices=((0, 2), (1, 3), (4, 6), (5, 7)),

        ),

    ]

class FlipBatchIterator(BatchIterator):

    flip_indices = [

        (0, 2), (1, 3),

        (4, 8), (5, 9), (6, 10), (7, 11),

        (12, 16), (13, 17), (14, 18), (15, 19),

        (22, 24), (23, 25),

        ]

    def transform(self, Xb, yb):

        Xb, yb = super(FlipBatchIterator, self).transform(Xb, yb)

        # Flip half of the images in this batch at random:

        bs = Xb.shape[0]

        indices = np.random.choice(bs, bs / 2, replace=False)

        Xb[indices] = Xb[indices, :, :, ::-1]

        if yb is not None:

            # Horizontal flip of all x coordinates:

            yb[indices, ::2] = yb[indices, ::2] * -1

            # Swap places, e.g. left_eye_center_x -> right_eye_center_x

            for a, b in self.flip_indices:

                yb[indices, a], yb[indices, b] = (

                    yb[indices, b], yb[indices, a])

        return Xb, yb

class EarlyStopping(object):

    def __init__(self, patience=100):

        self.patience = patience

        self.best_valid = np.inf

        self.best_valid_epoch = 0

        self.best_weights = None

    def __call__(self, nn, train_history):

        current_valid = train_history[-1]['valid_loss']

        current_epoch = train_history[-1]['epoch']

        if current_valid < self.best_valid:

            self.best_valid = current_valid

            self.best_valid_epoch = current_epoch

            self.best_weights = nn.get_all_params_values()

        elif self.best_valid_epoch + self.patience < current_epoch:

            print("Early stopping.")

            print("Best valid loss was {:.6f} at epoch {}.".format(

                self.best_valid, self.best_valid_epoch))

            nn.load_params_from(self.best_weights)

            raise StopIteration()

class AdjustVariable(object):

    def __init__(self, name, start=0.03, stop=0.001):

        self.name = name

        self.start, self.stop = start, stop

        self.ls = None

    def __call__(self, nn, train_history):

        if self.ls is None:

            self.ls = np.linspace(self.start, self.stop, nn.max_epochs)

        epoch = train_history[-1]['epoch']

        new_value = np.cast['float32'](self.ls[epoch - 1])

        getattr(nn, self.name).set_value(new_value)

def float32(k):

    return np.cast['float32'](k)

net = NeuralNet(

    layers=[

        ('input', layers.InputLayer),

        ('conv1', layers.Conv2DLayer),

        ('pool1', layers.MaxPool2DLayer),

        ('dropout1', layers.DropoutLayer),

        ('conv2', layers.Conv2DLayer),

        ('pool2', layers.MaxPool2DLayer),

        ('dropout2', layers.DropoutLayer),

        ('conv3', layers.Conv2DLayer),

        ('pool3', layers.MaxPool2DLayer),

        ('dropout3', layers.DropoutLayer),

        ('hidden4', layers.DenseLayer),

        ('dropout4', layers.DropoutLayer),

        ('hidden5', layers.DenseLayer),

        ('output', layers.DenseLayer),

        ],

    input_shape=(None, 1, 96, 96),

    conv1_num_filters=32, conv1_filter_size=(3, 3), pool1_pool_size=(2, 2),

    dropout1_p=0.1,

    conv2_num_filters=64, conv2_filter_size=(2, 2), pool2_pool_size=(2, 2),

    dropout2_p=0.2,

    conv3_num_filters=128, conv3_filter_size=(2, 2), pool3_pool_size=(2, 2),

    dropout3_p=0.3,

    hidden4_num_units=300,

    dropout4_p=0.5,

    hidden5_num_units=300,

    output_num_units=30, output_nonlinearity=None,

    update_learning_rate=theano.shared(float32(0.03)),

    update_momentum=theano.shared(float32(0.9)),

    regression=True,

    batch_iterator_train = BatchIterator(batch_size = 100),

    batch_iterator_test = BatchIterator(batch_size = 100),

#     batch_iterator_train=FlipBatchIterator(batch_size=128),

#     on_epoch_finished=[

#         AdjustVariable('update_learning_rate', start=0.03, stop=0.0001),

#         AdjustVariable('update_momentum', start=0.9, stop=0.999),

#         EarlyStopping(patience=200),

#         ],

    max_epochs=10,

    verbose=1,

)

def fit_specialists(fname_pretrain=None):

    if fname_pretrain:

        with open(fname_pretrain, 'rb') as f:

            net_pretrain = pickle.load(f)

    else:

        net_pretrain = None

    specialists = OrderedDict()

    for setting in SPECIALIST_SETTINGS:

        cols = setting['columns']

        X, y = load2d(cols=cols)

        model = clone(net)

        model.output_num_units = y.shape[1]

        model.batch_iterator_train.flip_indices = setting['flip_indices']

        model.max_epochs = int(4e6 / y.shape[0])

        if 'kwargs' in setting:

            # an option 'kwargs' in the settings list may be used to

            # set any other parameter of the net:

            vars(model).update(setting['kwargs'])

        if net_pretrain is not None:

            # if a pretrain model was given, use it to initialize the

            # weights of our new specialist model:

            model.load_params_from(net_pretrain)

        print("Training model for columns {} for {} epochs".format(

            cols, model.max_epochs))

        model.fit(X, y)

        specialists[cols] = model

    with open('net-specialists.pickle', 'wb') as f:

        # this time we're persisting a dictionary with all models:

        pickle.dump(specialists, f, -1)

def predict(fname_specialists='net-specialists.pickle'):

    with open(fname_specialists, 'rb') as f:

        specialists = pickle.load(f)

    X = load2d(test=True)[0]

    y_pred = np.empty((X.shape[0], 0))

    for model in specialists.values():

        y_pred1 = model.predict(X)

        y_pred = np.hstack([y_pred, y_pred1])

    columns = ()

    for cols in specialists.keys():

        columns += cols

    y_pred2 = y_pred * 48 + 48

    y_pred2 = y_pred2.clip(0, 96)

    df = DataFrame(y_pred2, columns=columns)

    lookup_table = read_csv(os.path.expanduser(FLOOKUP))

    values = []

    for index, row in lookup_table.iterrows():

        values.append((

            row['RowId'],

            df.ix[row.ImageId - 1][row.FeatureName],

            ))

    now_str = datetime.now().isoformat().replace(':', '-')

    submission = DataFrame(values, columns=('RowId', 'Location'))

    filename = 'submission-{}.csv'.format(now_str)

    submission.to_csv(filename, index=False)

    print("Wrote {}".format(filename))

if __name__ == '__main__':

    fit_specialists()

    predict()

left_eye_center_x     7039

left_eye_center_y     7039

right_eye_center_x    7036

right_eye_center_y    7036

Image                 7049

dtype: int64

Training model for columns ('left_eye_center_x', 'left_eye_center_y', 'right_eye_center_x', 'right_eye_center_y') for 568 epochs

/Library/Python/2.7/site-packages/lasagne/layers/conv.py:489: UserWarning: The `image_shape` keyword argument to `tensor.nnet.conv2d` is deprecated, it has been renamed to `input_shape`.

  border_mode=border_mode)

# Neural Network with 4779676 learnable parameters

## Layer information

  #  name      size

---  --------  ---------

  0  input     1x96x96

  1  conv1     32x94x94

  2  pool1     32x47x47

  3  dropout1  32x47x47

  4  conv2     64x46x46

  5  pool2     64x23x23

  6  dropout2  64x23x23

  7  conv3     128x22x22

  8  pool3     128x11x11

  9  dropout3  128x11x11

 10  hidden4   300

 11  dropout4  300

 12  hidden5   300

 13  output    4

# Neural Network with 4779676 learnable parameters

## Layer information

  #  name      size

---  --------  ---------

  0  input     1x96x96

  1  conv1     32x94x94

  2  pool1     32x47x47

  3  dropout1  32x47x47

  4  conv2     64x46x46

  5  pool2     64x23x23

  6  dropout2  64x23x23

  7  conv3     128x22x22

  8  pool3     128x11x11

  9  dropout3  128x11x11

 10  hidden4   300

 11  dropout4  300

 12  hidden5   300

 13  output    4

  epoch    trn loss    val loss    trn/val  dur

-------  ----------  ----------  ---------  -------

      1     [36m0.01113[0m     [32m0.00475[0m    2.34387  181.86s

  epoch    trn loss    val loss    trn/val  dur

-------  ----------  ----------  ---------  -------

      1     [36m0.01113[0m     [32m0.00475[0m    2.34387  181.86s

Warning

单机执行实在是太慢了，这里可以使用Amazon AWS的GPU实例来运行程序，创建过程如下参见：deep-learning-tutorial

在运行实例之后还有几点要做：

安装python pip > sudo apt-get install python-pip python-dev build-essential
创建Kaggle cookies文件，为了下载训练和测试数据，我们需要将本地浏览器中的cookies导出，通过chrome 插件：https://chrome.google.com/webstore/detail/cookietxt-export/lopabhfecdfhgogdbojmaicoicjekelh/related
把Github中的https://github.com/wendykan/AWSGPU_DeepLearning clone到你的AWS实例中，进行一些机器学习的初始化工作。

一些参考：

http://ramhiser.com/2016/01/05/installing-tensorflow-on-an-aws-ec2-instance-with-gpu-support/

http://danielnouri.org/notes/2014/12/17/using-convolutional-neural-nets-to-detect-facial-keypoints-tutorial/

kaggle之人脸特征识别的更多相关文章

java 开发 face++ 人脸特征识别系统
首先要在 face++ 注册一个账号,并且创建一个应用,拿到 api key 和 api secret: 下载 java 接入工具,一个 jar 包:https://github.com/FacePl ...
iOS 11之Vision人脸检测
代码地址如下:http://www.demodashi.com/demo/11783.html 大道如青天,我独不得出前言在上一篇iOS Core ML与Vision初识中,初步了解到了visio ...
java基于OpenCV的人脸识别
基于Java简单的人脸和人眼识别程序使用这个程序之前必须先安装配置OpenCV详细教程见:https://www.cnblogs.com/prodigal-son/p/12768948.html 注 ...
python实现人脸关键部位检测（附源码）
人脸特征提取本文主要使用dlib库中的人脸特征识别功能. dlib库使用68个特征点标注出人脸特征,通过对应序列的特征点,获得对应的脸部特征.下图展示了68个特征点.比如我们要提取眼睛特征,获取3 ...
一年过去了，25万月薪的AI工程师还存在吗？
导读:2017 年的时候,AI 前线进行了一场有关人工智能领域薪资差异的专题策划,这篇名为<25 万年薪的你与 25 万月薪的他,猎头来谈你们之间的差别>的文章引起了读者们的热烈讨论.一年 ...
TensorFlow练习24: GANs-生成对抗网络 (生成明星脸)
http://blog.topspeedsnail.com/archives/10977 从2D图片生成3D模型(3D-GAN) https://blog.csdn.net/u014365862/ar ...
[Bayes] *Bayesian Deep Learning for Transparency Improvement
为何有必要进修统计机器学习? 因为你没有那么多的数据因为未知的东西最终还是需理论所解释基于规则?基于概率? ---- 图灵奖得主.贝叶斯之父 Judea Pearl 谈深度学习局限,想造自由意志机 ...
写给程序员的机器学习入门 (十) - 对象识别 Faster-RCNN - 识别人脸位置与是否戴口罩
每次看到大数据人脸识别抓逃犯的新闻我都会感叹技术发展的太快了,国家治安水平也越来越好了
[转]40多个关于人脸检测/识别的API、库和软件
[转]40多个关于人脸检测/识别的API.库和软件 http://news.cnblogs.com/n/185616/ 英文原文:List of 40+ Face Detection / Recogn ...

随机推荐

Mysql 细节记忆
DELIMITER $$ 和 DELIMITER ; DROP PROCEDURE IF EXISTS `pro_follow_getBookBeforeExpired`$$ DECLARE p_Se ...
asp.net 实现 tts
之前用WinForm实现tts已经成功,就调用了下系统的类库.但我把相同的代码搬到asp.net上时却碰到了许多问题,查了好多网站.试过了很多方法,到现在算是做出了一部分吧. 之前调用微软的TTS是用 ...
.NET技术
1.在C#中,string str = null 与 string str = “” 请尽量使用文字或图象说明其中的区别.回答要点:说明详细的空间分配.(10分) 解:string str=null ...
IOS 创建App的最佳捷径
简网App工场 ----------------创建App的最佳捷径
学习unity的第一个小游戏(Roll the ball)的笔记
1.摄像机的跟随运动,逻辑就是保持摄像机跟主角的距离不变(Undate()函数). offset=trandform.position-player.position. Undate() { tran ...
PDO LIMIT bug
PDO存在一个LIMIT BUG(mysql) 需要指定数据类型,而且limit后面跟的2个参数必须是数值类型,不然的话获取不到数据例1: $dsn = "mysql:host=127.0 ...
我的django之旅（三）数据库和模型
我的django之旅(三)模型和数据库标签(空格分隔):模型数据库 ORM 1.django ORM django内置了一套完整的解决方案,其中就包括他自己的ORM.可惜没有使用SQLAlchem ...
uva1368 DNA Consensus String
<tex2html_verbatim_mark> Figure 1. DNA (Deoxyribonucleic Acid) is the molecule which contains ...
jdk1.7 JDBC连接SQL Server2008
路由器网:http://www.ming4.com/news/2355.html Jackie的博客:http://blog.163.com/jackie_howe/blog/static/19949 ...
100个iOS开发/设计程序员面试题汇总，你将如何作答？
100个iOS开发/设计程序员面试题汇总,你将如何作答? 雪姬 2015-01-25 19:10:49 工作职场评论(0) 无论是对于公司还是开发者或设计师个人而言,面试都是一项耗时耗钱的项目, ...