Tesseract & TesseractOCRiOS

Installing tesseract was covered in the previous post.

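1. After installation, only the English language pack is present by default. Download the Simplified Chinese pack (chi_sim) from GitHub: https://github.com/tesseract-ocr/tessdata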

Then put it in the tessdata directory, /usr/local/share/tessdata
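
As a rough sketch of this step, the shell function below copies a downloaded language file into a tessdata directory. The function name `install_traineddata` and the download location in the example are assumptions for illustration; the default target directory is the one given above and may differ depending on how tesseract was installed.

```shell
# Hypothetical helper: copy a .traineddata file into a tessdata directory.
# Usage: install_traineddata <file.traineddata> [tessdata_dir]
install_traineddata() {
    src="$1"
    dest="${2:-/usr/local/share/tessdata}"   # default taken from the text above

    # Refuse to continue if the language file is missing.
    if [ ! -f "$src" ]; then
        echo "language file not found: $src" >&2
        return 1
    fi

    # Create the directory if needed, then copy the file into it.
    mkdir -p "$dest" && cp "$src" "$dest/"
}

# Example (assumes the file was downloaded to ~/Downloads):
#   install_traineddata ~/Downloads/chi_sim.traineddata
#   tesseract --list-langs    # chi_sim should now appear in the list
```

Writing to /usr/local/share may require elevated permissions on some systems.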

2. Text recognition now works.

From the directory that contains the image, run:

tesseract .jpg output_333 -l chi_sim

This generates an output_333.txt file in that directory.
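
The command above can be wrapped in a small function that checks its inputs first and then prints the recognized text. This is only a sketch: the name `ocr_chinese` is hypothetical, and it assumes the chi_sim pack is already installed.

```shell
# Hypothetical wrapper around the tesseract CLI.
# Usage: ocr_chinese <image> <output_base>
ocr_chinese() {
    image="$1"
    out_base="$2"

    # Fail early if the image does not exist.
    if [ ! -f "$image" ]; then
        echo "image not found: $image" >&2
        return 1
    fi

    # Fail early if the tesseract binary is not available.
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract is not installed or not on PATH" >&2
        return 1
    fi

    # tesseract appends .txt to the output base name.
    tesseract "$image" "$out_base" -l chi_sim && cat "$out_base.txt"
}
```

Several languages can be combined with a plus sign, for example `-l eng+chi_sim`.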

TesseractOCRiOS

Add TesseractOCRiOS to the project with CocoaPods:

platform :ios, '8.0'

target "TesseractDemo" do
  pod 'TesseractOCRiOS', '~> 4.0.0'
end

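Set Enable Bitcode to NO.

Rename each .m file that uses the library to .mm.

Import the header: #import <TesseractOCR/TesseractOCR.h>

Create a tessdata folder in the project and put the language packs in it.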

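- (void)tesseractRecogniceWithImage:(UIImage *)image compleate:(void(^)(NSString *text))compleate {
    // Load both the English and the Simplified Chinese language packs
    G8Tesseract *tesseract = [[G8Tesseract alloc] initWithLanguage:@"eng+chi_sim"];

    // Engine mode
    tesseract.engineMode = G8OCREngineModeTesseractOnly;

    // Time limit for recognition, in seconds (no value is given here)
    tesseract.maximumRecognitionTime = ;

    tesseract.pageSegmentationMode = G8PageSegmentationModeAuto;

    // Black-and-white conversion: recommended for English text or digits,
    // not recommended for Chinese characters
    //tesseract.image = [image g8_blackAndWhite];
    tesseract.image = image;

    [tesseract recognize];

    // Calling a nil block crashes, so check it first
    if (compleate) {
        compleate(tesseract.recognizedText);
    }
}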

One problem so far: when the chi_sim language pack downloaded from the link above is placed in the project, this error is reported:

actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line

The likely cause is a mismatch between the language pack version and the tesseract version.

The Chinese language pack at this link works:

https://github.com/tesseract-ocr/tessdata/blob/bf82613055ebc6e63d9e3b438a5c234bfd638c93/chi_sim.traineddata
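
The link above points to a GitHub page, not to the file itself. A minimal sketch for fetching that exact version from the command line is shown below. It assumes GitHub serves the file under the same path with `/raw/` in place of `/blob/`. The helper name `blob_to_raw` and the `tessdata/` destination are hypothetical.

```shell
# Hypothetical helper: turn a GitHub "blob" page URL into a "raw" download URL.
blob_to_raw() {
    # Replace the first /blob/ path segment with /raw/.
    printf '%s\n' "$1" | sed 's#/blob/#/raw/#'
}

# Example (needs network access; -L makes curl follow redirects):
#   url=$(blob_to_raw "https://github.com/tesseract-ocr/tessdata/blob/bf82613055ebc6e63d9e3b438a5c234bfd638c93/chi_sim.traineddata")
#   curl -L -o tessdata/chi_sim.traineddata "$url"
```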
