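Calling Google Translate from Python amounts to simulating what a person does by hand: paste the source text (English) into the left-hand text box of Google Translate, set the language pair to English to Simplified Chinese, click Translate, then copy the result from the right-hand text box and save it. Compared with the article 《用Python实现调用Google翻译》 ("Calling Google Translate with Python"), the improvement here is in extracting the translated result: a single regular expression picks the translated text out of the returned page. The code below is also complete.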
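To see the extraction step in isolation, here is a minimal sketch that applies the same pattern used in the listing below to a made-up page fragment. The `sample_page` string and the `extract_translation` helper are hypothetical and written for Python 3; the page actually returned by Google Translate may look different.

```python
import re

# Hypothetical fragment of a returned page (an assumption, not real output).
sample_page = "var a=1;TRANSLATED_TEXT='你好';INPUT_TOOL_PATH='//www.google.com';"

# (?<=TRANSLATED_TEXT=) is a lookbehind: match only right after that marker.
# .*?; is non-greedy: stop at the first semicolon instead of the last one.
pattern = re.compile(r"(?<=TRANSLATED_TEXT=).*?;")


def extract_translation(page):
    """Return the text after TRANSLATED_TEXT=, without the ';' and the quotes."""
    match = pattern.search(page)
    if match is None:
        return None  # marker not present in the page
    return match.group(0).rstrip(";").strip("'")


print(extract_translation(sample_page))
```

Because the match stops at the first semicolon, a translation that itself contains `;` would be cut short; that limitation comes from the pattern as written.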
- # -*- coding: utf-8 -*-
- #Python -V: Python 2.6.6
- __author__ = "Yinlong Zhao (zhaoyl[at]sjtu[dot]edu[dot]cn)"
- __date__ = "$Date: 2013/04/21 $"
- import re
- import urllib,urllib2
- #urllib:
- #urllib2: The urllib2 module defines functions and classes which help in opening
- #URLs (mostly HTTP) in a complex world — basic and digest authentication,
- #redirections, cookies and more.
- def translate(text):
-     '''Simulate a browser: send data to the Google Translate home page, then scrape the translated result.'''
-     #text: the English sentence to translate
-     text_1=text
-     #'langpair':'en'|'zh-CN' means from English to Simplified Chinese
-     values={'hl':'zh-CN','ie':'UTF-8','text':text_1,'langpair':"'en'|'zh-CN'"}
-     url=''
-     data = urllib.urlencode(values)
-     req = urllib2.Request(url,data)
-     #Pretend to be a browser
-     browser='Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)'
-     req.add_header('User-Agent',browser)
-     #Send the request to Google Translate
-     response = urllib2.urlopen(req)
-     #Read the returned page
-     html=response.read()
-     #Filter the translated text out of the returned page
-     #using a regular expression.
-     #The translated text is whatever follows 'TRANSLATED_TEXT='
-     #.*? non-greedy or minimal fashion
-     #(?<=...)Matches if the current position in the string is preceded
-     #by a match for ... that ends at the current position
-     p=re.compile(r"(?<=TRANSLATED_TEXT=).*?;")
-     m=p.search(html)
-     text_2=m.group(0).strip(';')
-     return text_2
- if __name__ == "__main__":
-     #text_1: the source text
-     #text_1=open('c:\\text.txt','r').read()
-     text_1='Hello, my name is Derek. Nice to meet you! '
-     print('The input text: %s' % text_1)
-     text_2=translate(text_1).strip("'")
-     print('The output text: %s' % text_2)
-     #Save the result
-     filename='c:\\Translation.txt'
-     fp=open(filename,'w')
-     fp.write(text_2)
-     fp.close()
-     report='Master, I have done the work and saved the translation at '+filename+'.'
-     print('Report: %s' % report)
- >>>
- The input text: Hello, my name is Derek. Nice to meet you!
- The output text: 你好,我的名字是德里克。很高兴见到你!
- Report: Master, I have done the work and saved the translation at c:\Translation.txt.
- >>>
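The listing above targets Python 2.6, where `urllib` and `urllib2` are separate modules. As a rough sketch of how the request-building step might look in Python 3, the block below uses `urllib.parse` and `urllib.request` instead. The `build_request` helper and `PLACEHOLDER_URL` are hypothetical: the listing leaves the URL blank, so a stand-in address is used and no request is actually sent.

```python
import urllib.parse
import urllib.request

# Stand-in endpoint (assumption); the original listing does not give the URL.
PLACEHOLDER_URL = "http://example.invalid/translate"

BROWSER = ('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; '
           '.NET CLR 2.0.50727)')


def build_request(text, url=PLACEHOLDER_URL):
    """Build (but do not send) the POST request with the same form fields."""
    values = {'hl': 'zh-CN', 'ie': 'UTF-8', 'text': text,
              'langpair': "'en'|'zh-CN'"}
    # In Python 3 the POST body must be bytes, so the encoded form is
    # explicitly converted with .encode().
    data = urllib.parse.urlencode(values).encode('utf-8')
    req = urllib.request.Request(url, data)
    # Same browser-like User-Agent header as in the listing.
    req.add_header('User-Agent', BROWSER)
    return req


req = build_request('Hello, my name is Derek.')
print(req.get_method())  # a Request that carries data is sent as POST
print(req.data)
```

Sending the request would then be a call to `urllib.request.urlopen(req)`, and the bytes it returns would need decoding before the regular expression is applied.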
