[python]《Python编程快速上手:让繁琐工作自动化》学习笔记4

1. 处理Excel 电子表格笔记（第12章)(代码下载)

本文主要介绍openpyxl 的2.5.12版处理excel电子表格，原书是2.1.4 版，OpenPyXL 团队会经常发布新版本。不过不用担心，新版本应该在相当长的时间内向后兼容。如果你有新版本，想看看它提供了什么新功能，可以查看OpenPyXL 的完整文档：
http://openpyxl.readthedocs.org/。

1.1 读取Excel文档
常用函数如下：

函数(2.5.12)	函数(2.1.4)	用途	备注
openpyxl.load_workbook(‘example.xlsx’)	openpyxl.load_workbook(‘example.xlsx’)	打开excel文档
wb.sheetnames	wb.get_sheet_names()	取得工作簿中所有表名的列表
sheet = wb[‘Sheet’]	sheet = wb.get_sheet_by_name(‘Sheet’)	获得工作簿表格Sheet
sheet.title	sheet.title	获得sheet的表格名
anotherSheet = wb.active	anotherSheet = wb.get_active_sheet()	获得当前活动表	活动表就是工作簿在Excel 中打开时出现的工作表
sheet[‘A1’].value	sheet[‘A1’].value	获得A1框格的值	可以通过=赋值改变A1框格的值
sheet.cell(row=1, column=2)	sheet.cell(row=1, column=2)	获得B2框格的值	另一种赋值方法，row表示行号，row代表用数字标识的列号
sheet.max_row/sheet.max_column	sheet.get_highest_row()/sheet.get_highest_column()	获得表格最大行数/列数	返回整数
from openpyxl.utils import get_column_letter, column_index_from_string	from openpyxl.cell import get_column_letter, column_index_from_string	调用数字字母转换函数
get_column_letter(2)	get_column_letter(2)	将数字2转为字母B
column_index_from_string(‘AA’)	column_index_from_string(‘AA’)	将字母AA转为数字27
tuple(sheet[‘A1’:‘C3’])	tuple(sheet[‘A1’:‘C3’])	获得A1到C3处的单元格

1.2 写入Excel文档
常用函数如下：

函数(2.5.12)	函数(2.1.4)	用途	备注
wb = openpyxl.Workbook()	wb = openpyxl.Workbook()	创建新的工作表对象
wb.create_sheet(index=2, title=‘Middle Sheet’))	wb.create_sheet(index=2, title=‘Middle Sheet’))	创建序号为index+1的新工作表	可以缺省输入参数
wb.remove(wb[‘Sheet1’])	wb.remove_sheet(wb.get_sheet_by_name(‘Sheet’))	删除工作表Sheet1
wb.save(‘example_copy.xlsx’)	wb.save(‘example_copy.xlsx’)	保存xlsx文件

1.3 表格设置
常用函数如下：

函数(2.5.12)	函数(2.1.4)	用途	备注
from openpyxl.styles import Font	from openpyxl.styles import Font, Style	导入字体设置模块
italic24Font = Font(name=‘Times New Roman’,size=24, italic=True,bold=True) sheet[‘A1’].font = italic24Font	italic24Font = Font(name=‘Times New Roman’,size=24, italic=True,bold=True) styleObj = Style(font=italic24Font) sheet[‘A1’].style=styleObj	设置A1单元格字体，斜体加粗，字号24
sheet[‘A3’] = ‘=SUM(A1:A2)’	sheet[‘A3’] = ‘=SUM(A1:A2)’	设置公式	类似excel公式处理
sheet.row_dimensions[1].height = 70	sheet.row_dimensions[1].height = 70	设置第1行高度
sheet.column_dimensions[‘B’].width = 20	sheet.column_dimensions[‘B’].width = 20	设置第B列宽度
sheet.merge_cells(‘A1:D3’)	sheet.merge_cells(‘A1:D3’)	将’A1:D3’12 个单元格合并为一个单元格
sheet.unmerge_cells(‘C5:D5’)	sheet.unmerge_cells(‘C5:D5’)	解除单元格合并

2. 项目练习

2.1 从电子表格中读取数据

2010 年美国人口普查数据censuspopdata.xlsx，可以从https://download.csdn.net/download/luohenyj/11266976 下载。程序主要功能：

读取电子表格数据
填充数据结构
将结果写入文件
代码如下：

import openpyxl

import pprint

print('Opening workbook...')

# 打开excel文档

wb = openpyxl.load_workbook('censuspopdata.xlsx')

# 打开 工作表

#sheet = wb.get_sheet_by_name('Population by Census Tract')

sheet = wb['Population by Census Tract']

countyData = {}

print('Reading rows...')

# 按行读取数据

for row in range(2, sheet.max_row + 1):

    state = sheet['B' + str(row)].value

    county = sheet['C' + str(row)].value

    pop = sheet['D' + str(row)].value

    # Make sure the key for this state exists.

    countyData.setdefault(state, {})

    # country没有时设置的默认值

    countyData[state].setdefault(county, {'tracts': 0, 'pop': 0})

    # Each row represents one census tract, so increment by one.

    # 计算每个县中普查区的数

    countyData[state][county]['tracts'] += 1

    # Increase the county pop by the pop in this census tract.

    # 统计总人口

    countyData[state][county]['pop'] += int(pop)

# Open a new text file and write the contents of countyData to it.

print('Writing results...')

resultFile = open('census2010.py', 'w')

# 保存数据

resultFile.write('allData = ' + pprint.pformat(countyData))

resultFile.close()

print('Done.')

'''

import os

# 导入py数据文件

import census2010

# 打印某一项数据

census2010.allData['AK']['Anchorage']

anchoragePop = census2010.allData['AK']['Anchorage']['pop']

print('The 2010 population of Anchorage was ' + str(anchoragePop))

'''

2.2 更新一个电子表格

从produceSales.xlsx找到特定类型的产品，并更新它们的价格。produceSales.xlsx可以从https://download.csdn.net/download/luohenyj/11266976 下载

import openpyxl

# 打开文档

wb = openpyxl.load_workbook('produceSales.xlsx')

#sheet = wb.get_sheet_by_name('Sheet')

sheet = wb['Sheet']

# 要改变的价格

PRICE_UPDATES = {'Garlic': 3.07,

                 'Celery': 1.19,

                 'Lemon': 1.27}

# skip the first row 跳过首行

for rowNum in range(2, sheet.max_row):

    # 获得该行第一列的值

    produceName = sheet.cell(row=rowNum, column=1).value

    # 判断键值

    if produceName in PRICE_UPDATES:

        # 更新值

        sheet.cell(row=rowNum, column=2).value = PRICE_UPDATES[produceName]

wb.save('updatedProduceSales.xlsx')

2.3 乘法表

从命令行接受数字N，在一个Excel 电子表格中创建一个N×N 的乘法表。



import openpyxl

from openpyxl.utils import get_column_letter, column_index_from_string

# 设置字体风格

from openpyxl.styles import Font

wb = openpyxl.Workbook()

sheet = wb['Sheet']

print("Please input N:")

N = int(input())

A = get_column_letter(1)

# 设置字体加粗

fontObj = Font(bold=True)

# 设置行

for row in range(2, N+2):

    sheet['A'+str(row)] = row-1

    sheet['A'+str(row)].font = fontObj

# 设置列

for col in range(2, N+2):

    colName = get_column_letter(col)

    sheet[colName+str(1)] = col-1

    sheet[colName+str(1)].font = fontObj

for row in range(2, N+2):

    for col in range(2, N+2):

        colName = get_column_letter(col)

        rowName = row

        sheet[colName+str(row)] = sheet[colName+str(1)

                                        ].value*sheet['A'+str(row)].value

wb.save('multiplicationTable.xlsx')

wb.close()

Please input N:

9

2.4 空行插入程序

命令行参数输入N和M。从第N行开始，在电子表格中插入M 个空行

import openpyxl

xlsxPath='multiplicationTable.xlsx'

# 打开文件

wb = openpyxl.load_workbook(xlsxPath)

sheet = wb['Sheet']

print("空行起始行号：")

rowBlank=int(input())

print("空行数：")

rowBlanInsert=int(input())

# 新的文件

wb_save = openpyxl.Workbook()

sheet_save = wb_save['Sheet']

# 保存空行的数据

for row in range(1,sheet.max_row+1):

    for col in range(1,sheet.max_column+1):

        # 保存前rowBlank的值

        if row<rowBlank:

            sheet_save.cell(row,col).value=sheet.cell(row,col).value

        else:

            sheet_save.cell(row+rowBlanInsert,col).value=sheet.cell(row,col).value

wb_save.save('blankRowInserter.xlsx')

空行起始行号：

2

空行数：

3

2.5 电子表格单元格翻转程序

编写一个程序，翻转电子表格中行和列的单元格

import openpyxl

xlsxPath = 'blankRowInserter.xlsx'

# 打开文件

wb = openpyxl.load_workbook(xlsxPath)

sheet = wb['Sheet']

# 新的文件

wb_save = openpyxl.Workbook()

sheet_save = wb_save['Sheet']

# 翻转数据

for row in range(1, sheet.max_row+1):

    for col in range(1, sheet.max_column+1):

            sheet_save.cell(col, row).value = sheet.cell(row, col).value

wb_save.save('transSheet.xlsx')

2.6 文本文件到电子表格

读入几个文本文件的内容（可以自己创造这些文本文件），并将这些内容插入一个电子表格，每行写入一行文本。第一个文本文件中的行将写入
列A 中的单元格，第二个文本文件中的行将写入列B 中的单元格，以此类推。

import openpyxl

import os

wb = openpyxl.Workbook()

sheet = wb['Sheet']

searchPath = './my_test'

# 返回指定目录下所有的文件名和文件夹名列表

fileNames = os.listdir(searchPath)

txtNames = []

for fileName in fileNames:

    # 如果结尾是txt

    if fileName.endswith('.txt'):

        # 保存到列表

        txtNames.append(fileName)

# 遍历文件

for i in range(len(txtNames)):

    txtPath = os.path.join(searchPath, txtNames[i])

    file = open(txtPath)

    fileContents = file.readlines()

    for j in range(len(fileContents)):

        sheet.cell(i+1, j+1).value = fileContents[j]

wb.save('txtToXlsx.xlsx')

2.7 电子表格到文本文件

将列A 中的单元格写入一个文本文件，将列B 中的单元格写入另一个文本文件，以此类推。

# -*- coding: utf-8 -*-

import openpyxl

import os

wb = openpyxl.load_workbook('txtToXlsx.xlsx')

sheet = wb.active

for rows in range(sheet.max_row):

    txtFile = open('%s.txt' % (rows), 'w')

    # 遍历行

    for cols in sheet[rows+1]:

        txtFile.write(str(cols.value))

    txtFile.close()