Python code for converting Chinese to pinyin (supports full pinyin and first-letter abbreviations)

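The code in this post is an upgraded version of https://github.com/cleverdeng/pinyin.py. Compared with the original, it makes the following changes:

1. A new parameter firstcode: if True, only the first pinyin letter of each Chinese character is returned; if False, the full pinyin is returned.

2. Fix: English letters are output unchanged.

3. Fix: an empty-string separator still produces correct output.

4. Upgrade: the path of the dictionary file can be specified.

The code is simple: it loads a dictionary (a mapping from characters to their romanized pinyin) and then replaces the Chinese characters in the text with their pinyin, one character at a time.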
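The lookup idea can be sketched on its own, before the full listing. The snippet below is only an illustration: `SAMPLE_TABLE`, `char_to_pinyin` and `text_to_pinyin` are hypothetical names, and the two table entries are made up to resemble what the code expects (an upper-case hex code point as the key, and one or more readings ending in a tone digit as the value). The real dictionary file may differ.

```python
# Hypothetical in-memory table standing in for the dictionary file.
# Keys: upper-case hex code points. Values: readings with a trailing tone digit.
SAMPLE_TABLE = {
    "4E2D": "ZHONG1 ZHONG4",  # 中
    "6587": "WEN2",           # 文
}


def char_to_pinyin(char, table, firstcode=False):
    """Return the pinyin of one character, or the character itself if unmapped."""
    key = "%X" % ord(char)
    readings = table.get(key, "").split()
    # Take the first reading, drop the tone digit, lower-case the rest.
    pinyin = readings[0][:-1].lower() if readings else ""
    if not pinyin:
        pinyin = char
    return pinyin[0] if firstcode else pinyin


def text_to_pinyin(text, table, split="", firstcode=False):
    """Convert every character and join the pieces with the separator."""
    return split.join(char_to_pinyin(c, table, firstcode) for c in text)


print(text_to_pinyin("Py中文", SAMPLE_TABLE, split=" "))      # P y zhong wen
print(text_to_pinyin("Py中文", SAMPLE_TABLE, firstcode=True))  # Pyzw
```

Characters that are not in the table, such as the Latin letters here, fall through unchanged, which is the behaviour described in item 2 above.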

Python

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
Original code: https://github.com/cleverdeng/pinyin.py

Added features:
1. New parameter firstcode: if True, only the first pinyin letter of each
   Chinese character is returned; if False, the full pinyin is returned.
2. Fix: English letters are output unchanged.
3. Fix: an empty-string separator still produces correct output.
4. Upgrade: the path of the dictionary file can be specified.
"""
from __future__ import print_function

__version__ = '0.9'
__all__ = ["PinYin"]

import io
import os.path


class PinYin(object):

    def __init__(self):
        self.word_dict = {}

    def load_word(self, dict_file):
        self.dict_file = dict_file
        if not os.path.exists(self.dict_file):
            raise IOError("NotFoundFile")
        # The dictionary is read as UTF-8 text.
        with io.open(self.dict_file, encoding="utf-8") as f_obj:
            for f_line in f_obj:
                # Each line: hex code point, whitespace, then the readings.
                line = f_line.split(None, 1)
                if len(line) == 2:
                    self.word_dict[line[0]] = line[1]

    def hanzi2pinyin(self, string="", firstcode=False):
        result = []
        if isinstance(string, bytes):
            string = string.decode("utf-8")
        for char in string:
            key = '%X' % ord(char)
            # Characters missing from the dictionary (e.g. English letters)
            # have no readings and are output unchanged.
            readings = self.word_dict.get(key, "").split()
            # Use the first reading, drop its trailing tone digit, lower-case it.
            outpinyin = readings[0][:-1].lower() if readings else ""
            if not outpinyin:
                outpinyin = char
            if firstcode:
                result.append(outpinyin[0])
            else:
                result.append(outpinyin)
        return result

    def hanzi2pinyin_split(self, string="", split="", firstcode=False):
        """Extract the pinyin of Chinese text.

        @param string: the Chinese text to convert
        @param split: the separator
        @param firstcode: full pinyin or first letters? True extracts the
            first letters; the default, False, extracts the full pinyin
        """
        result = self.hanzi2pinyin(string=string, firstcode=firstcode)
        return split.join(result)


if __name__ == "__main__":
    test = PinYin()
    test.load_word('word.data')
    string = "Java程序性能优化-让你的Java程序更快更稳定"
    print("in: %s" % string)
    print("out: %s" % str(test.hanzi2pinyin(string=string)))
    print("out: %s" % test.hanzi2pinyin_split(string=string, split="", firstcode=True))
    print("out: %s" % test.hanzi2pinyin_split(string=string, split="", firstcode=False))

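Output of the code in the example's main function

How to use the code:

If you need a different kind of extraction, you can modify the code to implement it.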
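As one example of such a modification, the sketch below joins the pinyin with a separator while keeping runs of non-Chinese characters (such as "Java") together instead of splitting them letter by letter. It is not part of the original code: `join_keep_words`, `convert` and `DEMO` are hypothetical names, and `convert` is assumed to return the pinyin of a character, or None when the character has no entry.

```python
def join_keep_words(text, convert, split=" "):
    """Join pinyin with `split`, keeping runs of unmapped characters intact."""
    tokens = []
    run = ""  # current run of characters that have no pinyin
    for char in text:
        pinyin = convert(char)
        if pinyin is None:
            run += char
            continue
        if run:
            tokens.append(run)
            run = ""
        tokens.append(pinyin)
    if run:
        tokens.append(run)
    return split.join(tokens)


# Hypothetical two-entry mapping, used only to exercise the function.
DEMO = {"程": "cheng", "序": "xu"}

print(join_keep_words("Java程序", DEMO.get))  # Java cheng xu
```

In the class above, the same effect could be reached by having the conversion loop report whether a character was found in the dictionary, rather than returning one list item per character.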
