Project Summary 2: Face Recognition Project (Face Recognition for the Happy House)
I. Face verification vs. face recognition
1. Face verification: input = 1 image; database = 1 image (a 1:1 comparison).
2. Face recognition: input = 1 image; database = 100 images with 100 IDs (a 1:K comparison).
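Suppose the database holds 100 images. Compute the difference function d between the input image and every image in the database: if d > threshold τ, the two images do not show the same person; if d < threshold τ, they do. This is a 1:K problem, and it requires solving the one-shot learning problem: in most face recognition applications you must recognize a person from a single image, that is, from a single example of that person's face. One example is the door access system for Baidu employees that Andrew Ng demonstrated.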
By using a 128-neuron fully connected layer as its last layer, the model ensures that the output is an encoding vector of size 128. By computing the distance between two encodings and comparing it with a threshold (0.7 here), you can determine whether the two pictures show the same person.
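The thresholding rule above can be sketched in a few lines of NumPy. This is only an illustration, not the project's actual code: the function names (`verify`, `recognize`), the dictionary-based database, and the 3-dimensional toy encodings (standing in for the 128-dimensional ones) are all assumptions made for this example.

```python
import numpy as np

def encoding_distance(a, b):
    """Euclidean (L2) distance between two encoding vectors."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def verify(encoding, claimed_id, database, threshold=0.7):
    """1:1 verification: compare against the stored encoding of one claimed identity."""
    dist = encoding_distance(encoding, database[claimed_id])
    return dist < threshold, dist

def recognize(encoding, database, threshold=0.7):
    """1:K recognition: return the closest identity, or None if nobody is close enough."""
    best_id, best_dist = None, float("inf")
    for name, stored in database.items():
        dist = encoding_distance(encoding, stored)
        if dist < best_dist:
            best_id, best_dist = name, dist
    if best_dist >= threshold:
        return None, best_dist
    return best_id, best_dist

# Hypothetical database: one stored encoding per person (one-shot setting).
database = {
    "alice": np.array([1.0, 0.0, 0.0]),
    "bob": np.array([0.0, 1.0, 0.0]),
}
probe = np.array([0.9, 0.1, 0.0])

print(verify(probe, "alice", database))
print(recognize(probe, database))
```

Verification looks up a single stored encoding, while recognition scans all K entries and still applies the threshold, so an unknown face is rejected instead of being matched to its nearest neighbor.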
1. Siamese network
2. Inception model
(1) Inception module
(2) Code implementing the Inception network
import tensorflow as tf
import numpy as np
import os
from numpy import genfromtxt
from keras import backend as K
# All layers are imported from keras.layers; the older submodule paths
# (keras.layers.normalization, keras.layers.pooling, keras.layers.core)
# are not available in newer Keras releases.
from keras.layers import Conv2D, ZeroPadding2D, Activation, Input, concatenate
from keras.layers import BatchNormalization, MaxPooling2D, AveragePooling2D
from keras.layers import Lambda, Flatten, Dense
from keras.models import Model
import fr_utils


def inception_block_1a(X):
    """
    Implementation of an inception block
    """
    # 3x3 branch: 1x1 reduction, then a padded 3x3 convolution
    X_3x3 = Conv2D(96, (1, 1), data_format='channels_first', name='inception_3a_3x3_conv1')(X)
    X_3x3 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3a_3x3_bn1')(X_3x3)
    X_3x3 = Activation('relu')(X_3x3)
    X_3x3 = ZeroPadding2D(padding=(1, 1), data_format='channels_first')(X_3x3)
    X_3x3 = Conv2D(128, (3, 3), data_format='channels_first', name='inception_3a_3x3_conv2')(X_3x3)
    X_3x3 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3a_3x3_bn2')(X_3x3)
    X_3x3 = Activation('relu')(X_3x3)

    # 5x5 branch: 1x1 reduction, then a padded 5x5 convolution
    X_5x5 = Conv2D(16, (1, 1), data_format='channels_first', name='inception_3a_5x5_conv1')(X)
    X_5x5 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3a_5x5_bn1')(X_5x5)
    X_5x5 = Activation('relu')(X_5x5)
    X_5x5 = ZeroPadding2D(padding=(2, 2), data_format='channels_first')(X_5x5)
    X_5x5 = Conv2D(32, (5, 5), data_format='channels_first', name='inception_3a_5x5_conv2')(X_5x5)
    X_5x5 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3a_5x5_bn2')(X_5x5)
    X_5x5 = Activation('relu')(X_5x5)

    # Pooling branch: max-pool, 1x1 convolution, then pad back to the input size
    X_pool = MaxPooling2D(pool_size=3, strides=2, data_format='channels_first')(X)
    X_pool = Conv2D(32, (1, 1), data_format='channels_first', name='inception_3a_pool_conv')(X_pool)
    X_pool = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3a_pool_bn')(X_pool)
    X_pool = Activation('relu')(X_pool)
    X_pool = ZeroPadding2D(padding=((3, 4), (3, 4)), data_format='channels_first')(X_pool)

    # 1x1 branch
    X_1x1 = Conv2D(64, (1, 1), data_format='channels_first', name='inception_3a_1x1_conv')(X)
    X_1x1 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3a_1x1_bn')(X_1x1)
    X_1x1 = Activation('relu')(X_1x1)

    # CONCAT along the channel axis
    inception = concatenate([X_3x3, X_5x5, X_pool, X_1x1], axis=1)

    return inception


def inception_block_1b(X):
    X_3x3 = Conv2D(96, (1, 1), data_format='channels_first', name='inception_3b_3x3_conv1')(X)
    X_3x3 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3b_3x3_bn1')(X_3x3)
    X_3x3 = Activation('relu')(X_3x3)
    X_3x3 = ZeroPadding2D(padding=(1, 1), data_format='channels_first')(X_3x3)
    X_3x3 = Conv2D(128, (3, 3), data_format='channels_first', name='inception_3b_3x3_conv2')(X_3x3)
    X_3x3 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3b_3x3_bn2')(X_3x3)
    X_3x3 = Activation('relu')(X_3x3)

    X_5x5 = Conv2D(32, (1, 1), data_format='channels_first', name='inception_3b_5x5_conv1')(X)
    X_5x5 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3b_5x5_bn1')(X_5x5)
    X_5x5 = Activation('relu')(X_5x5)
    X_5x5 = ZeroPadding2D(padding=(2, 2), data_format='channels_first')(X_5x5)
    X_5x5 = Conv2D(64, (5, 5), data_format='channels_first', name='inception_3b_5x5_conv2')(X_5x5)
    X_5x5 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3b_5x5_bn2')(X_5x5)
    X_5x5 = Activation('relu')(X_5x5)

    X_pool = AveragePooling2D(pool_size=(3, 3), strides=(3, 3), data_format='channels_first')(X)
    X_pool = Conv2D(64, (1, 1), data_format='channels_first', name='inception_3b_pool_conv')(X_pool)
    X_pool = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3b_pool_bn')(X_pool)
    X_pool = Activation('relu')(X_pool)
    X_pool = ZeroPadding2D(padding=(4, 4), data_format='channels_first')(X_pool)

    X_1x1 = Conv2D(64, (1, 1), data_format='channels_first', name='inception_3b_1x1_conv')(X)
    X_1x1 = BatchNormalization(axis=1, epsilon=0.00001, name='inception_3b_1x1_bn')(X_1x1)
    X_1x1 = Activation('relu')(X_1x1)

    inception = concatenate([X_3x3, X_5x5, X_pool, X_1x1], axis=1)

    return inception


def inception_block_1c(X):
    X_3x3 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1),
                               cv2_filter=(3, 3),
                               cv2_strides=(2, 2),
                               padding=(1, 1))

    X_5x5 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1),
                               cv2_filter=(5, 5),
                               cv2_strides=(2, 2),
                               padding=(2, 2))

    X_pool = MaxPooling2D(pool_size=3, strides=2, data_format='channels_first')(X)
    X_pool = ZeroPadding2D(padding=((0, 1), (0, 1)), data_format='channels_first')(X_pool)

    inception = concatenate([X_3x3, X_5x5, X_pool], axis=1)

    return inception


def inception_block_2a(X):
    X_3x3 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1),
                               cv2_filter=(3, 3),
                               cv2_strides=(1, 1),
                               padding=(1, 1))

    X_5x5 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1),
                               cv2_filter=(5, 5),
                               cv2_strides=(1, 1),
                               padding=(2, 2))

    X_pool = AveragePooling2D(pool_size=(3, 3), strides=(3, 3), data_format='channels_first')(X)
    X_pool = fr_utils.conv2d_bn(X_pool,
                                cv1_filter=(1, 1),
                                padding=(2, 2))

    X_1x1 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1))

    inception = concatenate([X_3x3, X_5x5, X_pool, X_1x1], axis=1)

    return inception


def inception_block_2b(X):
    X_3x3 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1),
                               cv2_filter=(3, 3),
                               cv2_strides=(2, 2),
                               padding=(1, 1))

    X_5x5 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1),
                               cv2_filter=(5, 5),
                               cv2_strides=(2, 2),
                               padding=(2, 2))

    X_pool = MaxPooling2D(pool_size=3, strides=2, data_format='channels_first')(X)
    X_pool = ZeroPadding2D(padding=((0, 1), (0, 1)), data_format='channels_first')(X_pool)

    inception = concatenate([X_3x3, X_5x5, X_pool], axis=1)

    return inception


def inception_block_3a(X):
    X_3x3 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1),
                               cv2_filter=(3, 3),
                               cv2_strides=(1, 1),
                               padding=(1, 1))

    X_pool = AveragePooling2D(pool_size=(3, 3), strides=(3, 3), data_format='channels_first')(X)
    X_pool = fr_utils.conv2d_bn(X_pool,
                                cv1_filter=(1, 1),
                                padding=(1, 1))

    X_1x1 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1))

    inception = concatenate([X_3x3, X_pool, X_1x1], axis=1)

    return inception


def inception_block_3b(X):
    X_3x3 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1),
                               cv2_filter=(3, 3),
                               cv2_strides=(1, 1),
                               padding=(1, 1))

    X_pool = MaxPooling2D(pool_size=3, strides=2, data_format='channels_first')(X)
    X_pool = fr_utils.conv2d_bn(X_pool,
                                cv1_filter=(1, 1))
    X_pool = ZeroPadding2D(padding=(1, 1), data_format='channels_first')(X_pool)

    X_1x1 = fr_utils.conv2d_bn(X,
                               cv1_filter=(1, 1))

    inception = concatenate([X_3x3, X_pool, X_1x1], axis=1)

    return inception


def faceRecoModel(input_shape):
    """
    Implementation of the Inception model used for FaceNet

    Arguments:
    input_shape -- shape of the images of the dataset

    Returns:
    model -- a Model() instance in Keras
    """

    # Define the input as a tensor with shape input_shape
    X_input = Input(input_shape)

    # Zero-Padding
    X = ZeroPadding2D((3, 3))(X_input)

    # First Block
    X = Conv2D(64, (7, 7), strides=(2, 2), name='conv1')(X)
    X = BatchNormalization(axis=1, name='bn1')(X)
    X = Activation('relu')(X)

    # Zero-Padding + MAXPOOL
    X = ZeroPadding2D((1, 1))(X)
    X = MaxPooling2D((3, 3), strides=2)(X)

    # Second Block
    X = Conv2D(64, (1, 1), strides=(1, 1), name='conv2')(X)
    X = BatchNormalization(axis=1, epsilon=0.00001, name='bn2')(X)
    X = Activation('relu')(X)

    # Zero-Padding
    X = ZeroPadding2D((1, 1))(X)

    # Third Block
    X = Conv2D(192, (3, 3), strides=(1, 1), name='conv3')(X)
    X = BatchNormalization(axis=1, epsilon=0.00001, name='bn3')(X)
    X = Activation('relu')(X)

    # Zero-Padding + MAXPOOL
    X = ZeroPadding2D((1, 1))(X)
    X = MaxPooling2D(pool_size=3, strides=2)(X)

    # Inception 1: a/b/c
    X = inception_block_1a(X)
    X = inception_block_1b(X)
    X = inception_block_1c(X)

    # Inception 2: a/b
    X = inception_block_2a(X)
    X = inception_block_2b(X)

    # Inception 3: a/b
    X = inception_block_3a(X)
    X = inception_block_3b(X)

    # Top layer
    X = AveragePooling2D(pool_size=(3, 3), strides=(1, 1), data_format='channels_first')(X)
    X = Flatten()(X)
    X = Dense(128, name='dense_layer')(X)

    # L2 normalization
    X = Lambda(lambda x: K.l2_normalize(x, axis=1))(X)

    # Create model instance
    model = Model(inputs=X_input, outputs=X, name='FaceRecoModel')

    return model
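To see why the branches of an Inception block can be concatenated along the channel axis, it helps to trace the spatial sizes by hand. The sketch below does that arithmetic in plain Python for the stem of `faceRecoModel` and for `inception_block_1a`, assuming a 96x96 input and Keras's default 'valid' convolutions and poolings. The helper names (`conv_out`, `stem_output`, `block_1a_shapes`) are made up for this example and are not part of the project code.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial size after symmetric zero-padding plus a 'valid' conv or pooling."""
    return (size + 2 * pad - kernel) // stride + 1

def stem_output(size=96):
    """Spatial size reaching the first Inception block."""
    size = conv_out(size, 7, stride=2, pad=3)  # ZeroPadding(3) + conv1 7x7, stride 2
    size = conv_out(size, 3, stride=2, pad=1)  # ZeroPadding(1) + max-pool 3x3, stride 2
    size = conv_out(size, 1)                   # conv2 1x1
    size = conv_out(size, 3, pad=1)            # ZeroPadding(1) + conv3 3x3
    size = conv_out(size, 3, stride=2, pad=1)  # ZeroPadding(1) + max-pool 3x3, stride 2
    return size

def block_1a_shapes(size):
    """(channels, spatial size) of each branch in inception_block_1a."""
    branch_3x3 = conv_out(conv_out(size, 1), 3, pad=1)
    branch_5x5 = conv_out(conv_out(size, 1), 5, pad=2)
    pooled = conv_out(size, 3, stride=2)       # max-pool 3x3, stride 2
    branch_pool = conv_out(pooled, 1) + 3 + 4  # asymmetric padding (3, 4)
    branch_1x1 = conv_out(size, 1)
    return {
        "3x3": (128, branch_3x3),
        "5x5": (32, branch_5x5),
        "pool": (32, branch_pool),
        "1x1": (64, branch_1x1),
    }

size = stem_output(96)
shapes = block_1a_shapes(size)
total_channels = sum(channels for channels, _ in shapes.values())
print(size, shapes, total_channels)
```

Every branch ends at the same spatial size, which is what `concatenate(..., axis=1)` requires; only the channel counts add up. The asymmetric `((3, 4), (3, 4))` padding on the pooling branch is there to restore the size lost by the stride-2 pooling.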
3. Loss function: the triplet loss
Training will use triplets of images (A, P, N):
- A is an "Anchor" image--a picture of a person.
- P is a "Positive" image--a picture of the same person as the Anchor image.
- N is a "Negative" image--a picture of a different person than the Anchor image.
These triplets are picked from our training dataset. We will write $(A^{(i)}, P^{(i)}, N^{(i)})$ to denote the i-th training example.
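You'd like to make sure that an image $A^{(i)}$ of an individual is closer to the Positive $P^{(i)}$ than to the Negative image $N^{(i)}$ by at least a margin $\alpha$:

$$\lVert f(A^{(i)}) - f(P^{(i)}) \rVert_2^2 + \alpha < \lVert f(A^{(i)}) - f(N^{(i)}) \rVert_2^2$$

You would thus like to minimize the following "triplet cost":

$$\mathcal{J} = \sum_{i=1}^{m} \Big[ \underbrace{\lVert f(A^{(i)}) - f(P^{(i)}) \rVert_2^2}_{(1)} - \underbrace{\lVert f(A^{(i)}) - f(N^{(i)}) \rVert_2^2}_{(2)} + \alpha \Big]_+$$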
Here, we are using the notation "[z]+" to denote max(z,0).
- The term (1) is the squared distance between the anchor "A" and the positive "P" for a given triplet; you want this to be small.
- The term (2) is the squared distance between the anchor "A" and the negative "N" for a given triplet; you want this to be relatively large, which is why it is preceded by a minus sign.
- α is called the margin. It is a hyperparameter that you should pick manually. We will use α=0.2.
Most implementations also normalize the encoding vectors to have norm equal to one (i.e., $\lVert f(img) \rVert_2 = 1$); you won't have to worry about that here.
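The triplet cost described above can be written out directly. The following is a minimal NumPy sketch for a batch of encodings, not the `triplet_loss` function that is passed to Keras in this project; the function names and the 2-dimensional toy encodings are assumptions made for the example.

```python
import numpy as np

def l2_normalize(x, axis=1, eps=1e-12):
    """Scale each row to unit L2 norm (eps guards against division by zero)."""
    squared = np.sum(np.square(x), axis=axis, keepdims=True)
    return x / np.sqrt(np.maximum(squared, eps))

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """Sum over the batch of [ ||A-P||^2 - ||A-N||^2 + alpha ]_+ ."""
    pos_dist = np.sum(np.square(anchor - positive), axis=1)  # term (1)
    neg_dist = np.sum(np.square(anchor - negative), axis=1)  # term (2)
    return float(np.sum(np.maximum(pos_dist - neg_dist + alpha, 0.0)))

# Two toy triplets with unit-norm 2-D encodings.
anchor = np.array([[1.0, 0.0], [0.0, 1.0]])
positive = np.array([[1.0, 0.0], [0.0, 1.0]])
negative = np.array([[0.0, 1.0], [0.0, 1.0]])

print(triplet_loss(anchor, positive, negative))
```

In the first triplet the negative is already farther away than the margin requires, so it contributes nothing. In the second, the negative coincides with the anchor, so that triplet contributes exactly the margin α. Only triplets that violate the margin contribute to the cost.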
4. Compiling the model
FRmodel.compile(optimizer = 'adam', loss = triplet_loss, metrics = ['accuracy'])
The pretrained model we use is inspired by Victor Sy Wang's implementation and was loaded using his code: https://github.com/iwantooxxoox/Keras-OpenFace.
(1) Input: This network uses 96x96 RGB images as its input. Specifically, it takes a face image (or a batch of m face images) as a tensor of shape $(m, n_C, n_H, n_W) = (m, 3, 96, 96)$.
(2) Output: It outputs a matrix of shape $(m, 128)$ that encodes each input face image into a 128-dimensional vector.
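As an illustration of the channels-first input layout, the sketch below turns one image stored as (height, width, channels) into a batch of shape (1, 3, 96, 96). The helper name `to_model_input` is hypothetical, and scaling pixel values to [0, 1] by dividing by 255 is an assumption for this example; the preprocessing expected by the pretrained weights may differ.

```python
import numpy as np

def to_model_input(image_hwc):
    """Convert one 96x96 RGB image (H, W, C), values 0..255, to shape (1, C, H, W)."""
    image = np.asarray(image_hwc, dtype=np.float32)
    if image.shape != (96, 96, 3):
        raise ValueError(f"expected shape (96, 96, 3), got {image.shape}")
    chw = np.transpose(image, (2, 0, 1)) / 255.0  # channels first, scaled (assumed)
    return chw[np.newaxis, ...]                   # add the batch dimension m = 1

# A random array stands in for a real face crop.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(96, 96, 3))
batch = to_model_input(fake_image)
print(batch.shape)
```

Feeding such a batch to the model would yield a (1, 128) array of encodings, one row per image.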
- Face verification solves an easier 1:1 matching problem; face recognition addresses a harder 1:K matching problem.
- The triplet loss is an effective loss function for training a neural network to learn an encoding of a face image.
- The same encoding can be used for verification and recognition. Measuring distances between two images' encodings allows you to determine whether they are pictures of the same person.