python机器学习-乳腺癌细胞挖掘(博主亲自录制视频)https://study.163.com/course/introduction.htm?courseId=1005269003&utm_campaign=commission&utm_source=cp-400000000398149&utm_medium=share

机器学习,统计项目合作QQ:231469242

Wordnet with NLTK

英语的同义词和反义词函数
# -*- coding: utf-8 -*-
"""
Spyder Editor 英语的同义词和反义词函数
""" import nltk
from nltk.corpus import wordnet
syns=wordnet.synsets('program')
'''
syns
Out[11]:
[Synset('plan.n.01'),
Synset('program.n.02'),
Synset('broadcast.n.02'),
Synset('platform.n.02'),
Synset('program.n.05'),
Synset('course_of_study.n.01'),
Synset('program.n.07'),
Synset('program.n.08'),
Synset('program.v.01'),
Synset('program.v.02')] ''' print(syns[0].name()) '''
plan.n.01
''' #just the word只显示文字,lemma要点
print(syns[0].lemmas()[0].name())
'''
plan
'''
#单词句子使用
print(syns[0].examples())
'''
['they drew up a six-step plan', 'they discussed plans for a new bond issue']
''' '''
synonyms=[]
antonyms=[] list_good=wordnet.synsets("good")
for syn in list_good:
for l in syn.lemmas():
#print('l.name()',l.name())
synonyms.append(l.name())
if l.antonyms():
antonyms.append(l.antonyms()[0].name()) print(set(synonyms))
print(set(antonyms))
''' word="good"
#返回一个单词的同义词和反义词列表
def Word_synonyms_and_antonyms(word):
synonyms=[]
antonyms=[]
list_good=wordnet.synsets(word)
for syn in list_good:
for l in syn.lemmas():
#print('l.name()',l.name())
synonyms.append(l.name())
if l.antonyms():
antonyms.append(l.antonyms()[0].name())
return (set(synonyms),set(antonyms)) #返回一个单词的同义词列表
def Word_synonyms(word):
list_synonyms_and_antonyms=Word_synonyms_and_antonyms(word)
return list_synonyms_and_antonyms[0] #返回一个单词的反义词列表
def Word_antonyms(word):
list_synonyms_and_antonyms=Word_synonyms_and_antonyms(word)
return list_synonyms_and_antonyms[1] '''
Word_synonyms("evil")
Out[43]:
{'evil',
'evilness',
'immorality',
'iniquity',
'malefic',
'malevolent',
'malign',
'vicious',
'wickedness'} Word_antonyms('evil')
Out[44]: {'good', 'goodness'}
'''

wordNet是一个英语词汇数据库,普林斯顿大学创建,是nltk语料库的一部分

WordNet is a lexical database for the English language, which was created by Princeton, and is part of the NLTK corpus.

You can use WordNet alongside the NLTK module to find the meanings
of words, synonyms同义词, antonyms反义词, and more. Let's cover some examples.

First, you're going to need to import wordnet:

from nltk.corpus import wordnet

Then, we're going to use the term "program" to find synsets 同义词集合like so:

syns = wordnet.synsets("program")

An example of a synset:

print(syns[0].name())

plan.n.01

Just the word: 只显示单词

print(syns[0].lemmas()[0].name())

plan

Definition of that first synset:

print(syns[0].definition())

a series of steps to be carried out or goals to be accomplished

Examples of the word in use:

print(syns[0].examples())

['they drew up a six-step plan', 'they discussed plans for a new bond issue']

Next, how might we discern synonyms and antonyms to a word? The lemmas will be synonyms, and then you can use .antonyms to find the antonyms to the lemmas. As such, we can populate some lists like:

synonyms = []
antonyms = [] for syn in wordnet.synsets("good"):
for l in syn.lemmas():
synonyms.append(l.name())
if l.antonyms():
antonyms.append(l.antonyms()[0].name()) print(set(synonyms))
print(set(antonyms))
{'beneficial', 'just', 'upright', 'thoroughly', 'in_force', 'well', 'skilful', 'skillful', 'sound', 'unspoiled', 'expert', 'proficient', 'in_effect', 'honorable', 'adept', 'secure', 'commodity', 'estimable', 'soundly', 'right', 'respectable', 'good', 'serious', 'ripe', 'salutary', 'dear', 'practiced', 'goodness', 'safe', 'effective', 'unspoilt', 'dependable', 'undecomposed', 'honest', 'full', 'near', 'trade_good'} {'evil', 'evilness', 'bad', 'badness', 'ill'}

As you can see, we got many more synonyms than antonyms, since we just looked up the antonym for the first lemma, but you could easily balance this buy also doing the exact same process for the term "bad."

比较单词近似度

Next, we can also easily use WordNet to compare the similarity of two words and their tenses, by incorporating the Wu and Palmer method for semantic related-ness.

Let's compare the noun of "ship" and "boat:"

w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('boat.n.01')
print(w1.wup_similarity(w2))

0.9090909090909091

w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('car.n.01')
print(w1.wup_similarity(w2))

0.6956521739130435

w1 = wordnet.synset('ship.n.01')
w2 = wordnet.synset('cat.n.01')
print(w1.wup_similarity(w2))

0.38095238095238093

Next, we're going to pick things up a bit and begin to cover the topic of Text Classification.

自然语言22_Wordnet with NLTK的更多相关文章

  1. 转 --自然语言工具包(NLTK)小结

    原作者:http://www.cnblogs.com/I-Tegulia/category/706685.html 1.自然语言工具包(NLTK) NLTK 创建于2001 年,最初是宾州大学计算机与 ...

  2. 自然语言17_Chinking with NLTK

    https://www.pythonprogramming.net/chinking-nltk-tutorial/?completed=/chunking-nltk-tutorial/ 代码 # -* ...

  3. 自然语言16_Chunking with NLTK

    Chunking with NLTK 对chunk分类数据结构可以图形化输出,用于分析英语句子主干结构 # -*- coding: utf-8 -*-"""Created ...

  4. Python自然语言处理工具NLTK的安装FAQ

    1 下载Python 首先去python的主页下载一个python版本http://www.python.org/,一路next下去,安装完毕即可 2 下载nltk包 下载地址:http://www. ...

  5. Python自然语言工具包(NLTK)入门

    在本期文章中,小生向您介绍了自然语言工具包(Natural Language Toolkit),它是一个将学术语言技术应用于文本数据集的 Python 库.称为“文本处理”的程序设计是其基本功能:更深 ...

  6. Python NLTK 自然语言处理入门与例程(转)

    转 https://blog.csdn.net/hzp666/article/details/79373720     Python NLTK 自然语言处理入门与例程 在这篇文章中,我们将基于 Pyt ...

  7. NLTK在自然语言处理

    nltk-data.zip 本文主要是总结最近学习的论文.书籍相关知识,主要是Natural Language Pracessing(自然语言处理,简称NLP)和Python挖掘维基百科Infobox ...

  8. Python自然语言处理工具小结

    Python自然语言处理工具小结 作者:白宁超 2016年11月21日21:45:26 目录 [Python NLP]干货!详述Python NLTK下如何使用stanford NLP工具包(1) [ ...

  9. 自然语言处理(NLP)入门学习资源清单

    Melanie Tosik目前就职于旅游搜索公司WayBlazer,她的工作内容是通过自然语言请求来生产个性化旅游推荐路线.回顾她的学习历程,她为期望入门自然语言处理的初学者列出了一份学习资源清单. ...

随机推荐

  1. iOS -- 隐藏返回按钮

    // 隐藏返回按钮 [self.navigationItem setHidesBackButton:YES];

  2. MyBatis学习--mybatis开发dao的方法

    简介 使用Mybatis开发Dao,通常有两个方法,即原始Dao开发方法和Mapper接口开发方法. 主要概念介绍: MyBatis中进行Dao开发时候有几个重要的类,它们是SqlSessionFac ...

  3. Oracle之物化视图

    来源于:http://www.cnblogs.com/Ronger/archive/2012/03/28/2420962.html 近期根据项目业务需要对oracle的物化视图有所接触,在网上搜寻关于 ...

  4. T3 任职定级面试准备

    山东大学计算机专业本科毕业,工作8年,以前在华为工作,来YY正好1年. 个人心态开放积极,对未知事物好奇心很强,前沿科学.古老宗教皆有涉猎.英语口语能力较强,能和老外流程的交流.技术涉猎广泛,喜好研究 ...

  5. FastDFS搭建及java整合代码【转】

    FastDFS软件介绍 1.什么是FastDFS FastDFS是用C语言编写的一款开源的分布式文件系统.FastDFS为互联网量身定制,充分考虑了冗余备份.负载均衡.线性扩容等机制,并注重高可用.高 ...

  6. html页面中meta的作用

    meta是用来在HTML文档中模拟HTTP协议的响应头报文.meta 标签用于网页的<head>与</head>中,meta 标签的用处很多.meta 的属性有两种:name和 ...

  7. this action could not be completed.try again登陆appstore错误提示

    今天升级10.11后登陆appstore的时候发现报错了: this action could not be completed.try again 解决办法,终端敲入: sudo mkdir -p ...

  8. ftp,http,https有啥区别?

    [ftp与http的区别?] HTTP(Hyper Text Transmission Protocol)是超文本传输协议,FTP(FileTransferProtocol)是文件传输协议! HTTP ...

  9. js 对象 copy 对象

    function clone(myObj) { if (typeof (myObj) != 'object') return myObj; if (myObj == null) return myOb ...

  10. Hadoop 学习笔记3 Develping MapReduce

    小笔记: Mavon是一种项目管理工具,通过xml配置来设置项目信息. Mavon POM(project of model). Steps: 1. set up and configure the ...