Yongchao Xu [2018] TextField: Learning A Deep Direction Field for Irregular Scene Text Detection

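Paper

Authors

Highlights

The proposed TextField representation is very novel: it separates different text instances using, for each pixel, the vector between that pixel and its nearest boundary point.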

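Method overview

For curved text detection, the paper takes an instance-segmentation approach and proposes TextField, a new per-pixel representation for the segmentation map, aimed at the problem of adjacent text instances sticking together.

TextField is a two-dimensional vector v assigned to every pixel of the segmentation score map. For each text pixel, it is the unit vector that points from the nearest pixel outside the text instance (the nearest boundary point) to that text pixel. Its properties are:

Non-text pixels = (0, 0); text pixels $\ne$ (0, 0)
The magnitude of the vector distinguishes text pixels from non-text pixels
The direction of the vector is used in post-processing to help group pixels into text instances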
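The two properties above can be sketched in a few lines of NumPy. This is only an illustration: the function names, the `(2, H, W)` layout with channel 0 as the row component and channel 1 as the column component, and the threshold value `0.5` are assumptions made here, not details taken from the paper.

```python
import numpy as np

def magnitude_and_angle(field):
    """Split a (2, H, W) direction field into magnitude and angle maps.

    Assumed layout: field[0] is the row (y) component, field[1] the column (x) component.
    """
    magnitude = np.hypot(field[0], field[1])
    angle = np.arctan2(field[0], field[1])  # radians in [-pi, pi]
    return magnitude, angle

def text_mask(field, mag_thresh=0.5):
    """Text pixels are those whose vector is far enough from (0, 0).

    mag_thresh is a hypothetical value chosen for this sketch.
    """
    magnitude, _ = magnitude_and_angle(field)
    return magnitude > mag_thresh

# Toy field: a single text pixel with a unit vector, everything else (0, 0).
field = np.zeros((2, 3, 3))
field[:, 1, 1] = (0.6, 0.8)
mask = text_mask(field)
```

On a predicted field the magnitude of text pixels will only be close to 1, which is why a threshold is used rather than an exact comparison with (0, 0).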

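The detection pipeline is as follows: a VGG+FPN network learns the two-channel TextField score map, and post-processing on these two channels (superpixels, merging, morphological operations) then produces the text instances.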

Fig. 3: Pipeline of the proposed method. Given an image, the network learns a novel direction field in terms of a two-channel map, which can be regarded as an image of two-dimensional vectors. To better show the predicted direction field, we calculate and visualize its magnitude and direction information. Text instances are then obtained based on these information via the proposed post-processing using some morphological tools.

Method details

Direction field illustration

Fig. 1: Different text representations. Classical relatively simple text representations in (a-c) fail to accurately delimit irregular texts. The text instances in (e) stick together using binary text mask representation in (d), requiring heavy postprocessing to extract text instances. The proposed direction field in (f) is able to precisely describe irregular text instances.

Network architecture

VGG16+FPN

Fig. 5: Network architecture. We adopt the pre-trained VGG16 [52] as the backbone network and multi-level feature fusion to capture multi-scale text instances. The network is trained to predict dense per-pixel direction field.

Definition of the TextField vector

For each pixel p inside a text instance T , let Np be the nearest pixel to p lying outside the text instance T , we then define a two-dimensional unit vector Vgt(p) that points away from Np to the underlying text pixel p. This unit vector Vgt(p) directly encodes approximately relative location of p inside T and highlights the boundary between adjacent text instances.
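The equation belonging to this definition did not survive in these notes. Reconstructed from the surrounding text (so the exact notation here is an assumption), it would read:

$$
V_{gt}(p) =
\begin{cases}
\overrightarrow{N_p p} \,/\, \left|\overrightarrow{N_p p}\right|, & p \in \mathbb{T} \\
(0, 0), & p \notin \mathbb{T}
\end{cases}
$$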

where |NpP| denotes length of the vector starting from pixel Np to p, and T stands for all the text instances in an image. In practice, for each text pixel p, it is simple to compute its nearest pixel Np outside the text instance containing p by distance transform algorithm.

Fig. 4: Illustration of the proposed direction field. Given an image and its text annotation, a binary text mask can be easily generated. For each text pixel p, we find its nearest non-text pixel Np. Then, a two-dimensional unit vector that points away from Np to p is defined as the direction field on p. For non-text pixels, the direction field is set to (0, 0). On the right, we visualize the direction information of the text direction field.
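A minimal sketch of how such a ground-truth field could be generated with a distance transform, assuming the annotation is available as an integer label map (0 for background, k > 0 for the k-th text instance). The function name `direction_field` and the `(row, column)` channel order are choices made for this sketch; `scipy.ndimage.distance_transform_edt` with `return_indices=True` is what supplies the nearest outside pixel Np.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def direction_field(instance_labels):
    """Return a (2, H, W) field of unit vectors pointing from Np to p.

    Channel 0 holds the row component, channel 1 the column component.
    Non-text pixels keep the value (0, 0).
    """
    h, w = instance_labels.shape
    field = np.zeros((2, h, w), dtype=np.float64)
    rows, cols = np.mgrid[0:h, 0:w]
    for k in np.unique(instance_labels):
        if k == 0:
            continue
        mask = instance_labels == k
        # For each pixel of instance k: distance to, and coordinates of, the
        # nearest pixel outside instance k (background or another instance).
        # Assumes instance k does not cover the whole image.
        dist, (near_r, near_c) = distance_transform_edt(mask, return_indices=True)
        field[0][mask] = (rows - near_r)[mask] / dist[mask]
        field[1][mask] = (cols - near_c)[mask] / dist[mask]
    return field

# Two touching instances: vectors on either side of the shared border
# point in opposite directions, which is what keeps the instances apart.
labels = np.zeros((9, 10), dtype=int)
labels[2:7, 2:5] = 1
labels[2:7, 5:8] = 2
field = direction_field(labels)
```

Running the transform once per instance, rather than once on the merged binary mask, is what makes the border between two adjacent instances visible in the field.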

Loss function

Euclidean distance loss with per-pixel weights (based on the area of the text instance)
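A rough NumPy sketch of an area-weighted Euclidean loss of this kind. The notes only say that the weights depend on instance area, so the concrete choice below (inverse square root of the instance area for text pixels, inverse square root of the total text area for background pixels) is an assumption, as is the function name `textfield_loss`.

```python
import numpy as np

def textfield_loss(pred, target, instance_labels):
    """Weighted squared Euclidean distance between two (2, H, W) fields."""
    text = instance_labels > 0
    weights = np.ones(instance_labels.shape, dtype=np.float64)
    total_text_area = int(text.sum())
    if total_text_area > 0:
        # Assumed background weight: 1 / sqrt(total text area).
        weights[~text] = 1.0 / np.sqrt(total_text_area)
    for k in np.unique(instance_labels[text]):
        inst = instance_labels == k
        # Assumed text weight: 1 / sqrt(area of the instance), so that
        # small instances are not drowned out by large ones.
        weights[inst] = 1.0 / np.sqrt(inst.sum())
    sq_dist = ((pred - target) ** 2).sum(axis=0)  # per-pixel squared distance
    return float((weights * sq_dist).sum())

# One 2x2 instance; a unit error on a single text pixel costs 1 / sqrt(4).
labels = np.zeros((4, 4), dtype=int)
labels[1:3, 1:3] = 1
target = np.zeros((2, 4, 4))
pred = target.copy()
pred[0, 1, 1] = 1.0
loss = textfield_loss(pred, target, labels)
```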

Post-processing pipeline

Fig. 6: Illustration of the proposed post-processing. (a): Directions on candidate text pixels; (b): Text superpixels (in different color) and their representatives (in white); (c): Dilated and grouped representatives of text superpixels; (d): Labels of filtered representatives; (e): Candidate text instances; (f) Final segmented text instances.
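A simplified sketch of steps (a) to (c) and (e) of this caption, written under several assumptions and not as the authors' implementation. Each candidate text pixel takes as parent the 8-neighbour its vector points to; a pixel is treated as a representative when that neighbour is non-text or has a direction that does not agree with its own (dot product <= 0); representatives are dilated and grouped by connected components; every pixel then inherits the group of the representative it leads to. The names `group_text_pixels` and `mag_thresh`, the representative rule, and the 3x3 dilation are all choices made for illustration, and the filtering in steps (d) and (f) is omitted.

```python
import numpy as np
from scipy.ndimage import binary_dilation, label

# 8-neighbour offsets (d_row, d_col), indexed by the angle quantised to 45 degrees,
# where the angle is atan2(row component, column component).
OFFSETS = np.array([(0, 1), (1, 1), (1, 0), (1, -1),
                    (0, -1), (-1, -1), (-1, 0), (-1, 1)])

def group_text_pixels(field, mag_thresh=0.5):
    """Label text pixels of a (2, H, W) field; 0 means non-text or unassigned."""
    v_r, v_c = field
    h, w = v_r.shape
    text = np.hypot(v_r, v_c) > mag_thresh

    # Parent of each pixel: the 8-neighbour its vector points to.
    k = np.round(np.arctan2(v_r, v_c) / (np.pi / 4)).astype(int) % 8
    rows, cols = np.mgrid[0:h, 0:w]
    p_r = rows + OFFSETS[k, 0]
    p_c = cols + OFFSETS[k, 1]
    inside = (p_r >= 0) & (p_r < h) & (p_c >= 0) & (p_c < w)
    p_r = np.clip(p_r, 0, h - 1)
    p_c = np.clip(p_c, 0, w - 1)

    # Representatives: the parent is off-image, non-text, or disagrees in direction.
    dot = v_r * v_r[p_r, p_c] + v_c * v_c[p_r, p_c]
    is_root = text & (~inside | ~text[p_r, p_c] | (dot <= 0))

    # Follow parent pointers to the representative by pointer doubling.
    follow = (text & ~is_root).ravel()
    parent = np.where(follow, (p_r * w + p_c).ravel(), np.arange(h * w))
    for _ in range(int(np.ceil(np.log2(h * w))) + 1):
        parent = parent[parent]

    # Dilate and group representatives, then propagate the group labels.
    eight = np.ones((3, 3), dtype=bool)
    groups, _ = label(binary_dilation(is_root, structure=eight), structure=eight)
    root_groups = np.where(is_root, groups, 0).ravel()
    # Pixels caught in a cycle never reach a representative and stay 0.
    return np.where(text.ravel(), root_groups[parent], 0).reshape(h, w)

# Toy field: two touching horizontal bars, each pointing towards its own centre line.
field = np.zeros((2, 14, 12))
field[0, 1:4, 1:11] = 1.0    # upper half of bar 1 points down
field[0, 4:7, 1:11] = -1.0   # lower half of bar 1 points up
field[0, 7:10, 1:11] = 1.0   # upper half of bar 2 points down
field[0, 10:13, 1:11] = -1.0 # lower half of bar 2 points up
instances = group_text_pixels(field)
```

In the toy field the two bars share a border, yet the opposite directions on either side of it send their pixels to different representatives, so they come out as two separate labels.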

Experimental results

SCUT-CTW1500

Total-Text

ICDAR2015

MSRA-TD500

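Takeaways and open questions

Points that remain unclear: how the distance to the nearest boundary point is computed, and the many post-processing steps, which are hard to explain clearly.
The method is very novel, but the post-processing is too complex and by itself accounts for about 1/4 of the running time. The vector representation is also not very intuitive, so this is not a particularly general method.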
