Overlooked tricks for Python data structures

I. A string, tuple, sequence, or any other iterable with N elements can be unpacked into N separate "variables"

  1. Unpacking a string

 # a string

 In [10]: s="abdefg"

 In [11]: o, p, q, x, y, z = s
 In [12]: q

 Out[12]: 'd'

 In [13]: x

 Out[13]: 'e'

 In [14]: z

 Out[14]: 'g'

  2. Unpacking lists, tuples, sets, and other iterables

 In [15]: list_l = [1,2,3,4]

 In [16]: tuple_l = (5,6,7,8)

 In [17]: a,b,c,d = list_l

 In [18]: b

 Out[18]: 2

 In [19]: c

 Out[19]: 3

 In [20]: d

 Out[20]: 4

 In [21]: m,n,o,p = tuple_l

 In [22]: o

 Out[22]: 7

 In [23]: p

 Out[23]: 8

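 In [1]: d={'k1':'v1','k2':'v2'}

 In [2]: d

 Out[2]: {'k1': 'v1', 'k2': 'v2'}

 In [3]: key1,key2 = d.keys()  # dicts keep insertion order since Python 3.7; older versions may return the keys in any order

 In [4]: key1

 Out[4]: 'k1'

 In [5]: key2

 Out[5]: 'k2'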

  A combined example:

 In [11]: data = [1,2,[3,4,6],(7,8,9),{'k1':'v1','k2':2}]

 In [12]: a,b,li,tup,dic = data

 In [13]: a

 Out[13]: 1

 In [14]: b

 Out[14]: 2

 In [15]: tup

 Out[15]: (7, 8, 9)

 In [16]: dic

 Out[16]: {'k1': 'v1', 'k2': 2}

 In [17]: li

 Out[17]: [3, 4, 6]

  Question 1: what if you want to discard some of the unpacked values? Assign them to a throwaway variable name such as "_":

 In [18]: l=['money','money','no use','nouse','money']

 In [19]: a,b,_,_,e=l

 In [20]: print(a,b)

 money money

 In [21]: print(a,b,e)

 money money money

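  Question 2: what if the iterable is long and has more elements than you have variables? This is where the star expression ("*name") comes in:

In [23]: l=range(1,100)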

In [24]: len(l)

Out[24]: 99

In [25]: first,*p,last = l # p is a list [2, ..., 98]

In [26]: print (first,len(p),last)

1 97 99

In [1]: l=range(1,100)
In [2]: *q,m,n = l
In [3]: print (len(q),m,n) # q is a list [1, ..., 97]
97 98 99
In [4]: x,y,*z=l
In [5]: print(x,y,len(z))
1 2 97

Likewise, "*_" can be used as a throwaway name for values you want to discard:
In [11]: l=range(1,100)
In [12]: first,*_,end = l
In [13]: print(first,end)
1 99
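Star unpacking also helps when looping over records of varying length. A minimal sketch; the `records` data and the `summarize` helper are made up for illustration and are not from the original post:

```python
# Hypothetical records: a tag followed by a variable number of values.
records = [
    ('foo', 1, 2),
    ('bar', 'hello'),
    ('foo', 3, 4),
]

def summarize(records):
    """Return (tag, values) pairs, collecting everything after the tag."""
    result = []
    for tag, *args in records:   # args is always a list, possibly empty
        result.append((tag, args))
    return result

print(summarize(records))
# [('foo', [1, 2]), ('bar', ['hello']), ('foo', [3, 4])]
```

Note that the starred name always receives a list, even when nothing is left over.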

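II. Common uses of the collections module from the Python standard library

  1. The deque class. Advantage: you can append and pop at both ends, and it does so more efficiently than a list (O(1) at either end, whereas list.insert(0, x) and list.pop(0) are O(n))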

 from collections import deque

 # create a deque with a maximum length of 5

 # the default is maxlen=None, i.e. unbounded

 dq = deque(maxlen=5)

 print(dq)

 # out:deque([], maxlen=5)

 # the append and appendleft methods

 dq.append(1)

 print(dq)

 # out:deque([1], maxlen=5)

 dq.appendleft(2)

 print(dq)

 # out:deque([2, 1], maxlen=5)

 # extend with a sequence; a list or a tuple both work

 dq.extend([5,6])

 print(dq)

 # out:deque([2, 1, 5, 6], maxlen=5)

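 dq.extendleft([7,8,9])  # items are pushed on the left one at a time, so they end up reversed; maxlen=5 drops 6 and then 5 from the right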

 print(dq)

 # out:deque([9, 8, 7, 2, 1], maxlen=5)

 # pop one element from the right end: 1

 dq.pop()

 print(dq)

 # out:deque([9, 8, 7, 2], maxlen=5)

 # pop one element from the left end: 9

 dq.popleft()

 print(dq)

 # out:deque([8, 7, 2], maxlen=5)

 # make a (shallow) copy; the two deques are independent

 dq_copy = dq.copy()

 print(dq,dq_copy)

 # out:deque([8, 7, 2], maxlen=5) deque([8, 7, 2], maxlen=5)

 # dq.index(x) returns the index of x in dq

 print(dq.index(7))

 # out:1

 # dq.count(x) returns the number of occurrences of x in dq

 print(dq.count(7))

 # out:1

 # return the capacity

 print(dq.maxlen)

 # out:5

 # dq.insert(index,x) inserts x at position index

 dq.insert(0,110)

 print(dq)

 # out:deque([110, 8, 7, 2], maxlen=5)

 # dq.remove(x) removes the first occurrence of x from dq; it raises ValueError if x is absent

 try:

     dq.remove(99)

 except ValueError as e:

     print(e)

 # out:deque.remove(x): x not in deque

 # dq.reverse() reverses the deque in place

 dq.reverse()

 print(dq)

 # out:deque([2, 7, 8, 110], maxlen=5)

 for i in dq:

     print(i)

 '''

 2

 7

 8

 110

 '''

 # remove all elements

 dq.clear()

 print(dq)

 # out:deque([], maxlen=5)

 #^_^ over
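A bounded deque is a convenient way to keep only the most recent items of a stream. A minimal sketch, assuming a hypothetical helper named `tail` (not part of the original post):

```python
from collections import deque

def tail(iterable, n=3):
    """Return the last n items of iterable as a list."""
    # Once a bounded deque is full, each append silently drops the
    # oldest item, so only the most recent n items survive.
    return list(deque(iterable, maxlen=n))

print(tail(range(10)))     # [7, 8, 9]
print(tail('ab', n=5))     # ['a', 'b']
```

The whole iterable is consumed, but memory use stays bounded by n.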

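  2. The defaultdict class, which defines a dictionary object. Advantage: it makes a dictionary that maps one key to multiple values very easy to build

    For example: d={'a':[1,2,3,4],'b':[5,7]}, s={'a':{2,3,4},'b':{4,5}}

    Instantiation: d=defaultdict(list) or d=defaultdict(set), choosing list or set as needed; the result looks like the above. (Use set to deduplicate the values, list to keep their order.)

 '''

 A big advantage of defaultdict is that a missing key is initialized with a default value the first time it is accessed; compare d1['a'] and d2['a'] below

 '''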

 In [5]: d1=dict()

 In [6]: from collections import defaultdict

 In [7]: d2=defaultdict(list)

 In [8]: print(d1['a'])

 ---------------------------------------------------------------------------

 KeyError                                  Traceback (most recent call last)

 <ipython-input-8-1877cceef97a> in <module>()

 ----> 1 print(d1['a'])

 KeyError: 'a'

 In [9]: print(d2['a'])

 []

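  Here is a good use of defaultdict that I ran into:

 from collections import defaultdict

 '''I won't paste the full code; a list stands in for what I needed at the time:

 for each element of the list below, count how many times the second field after split() is 'y' (i.e. string.split()[1]=='y'), keyed by the first field string.split()[0].
 '''

 l=['a y','a n','a n','a n','a y','a y','a y','a y','b n','b n','b y','b y','b n','b y']

 d=defaultdict(list)

 for i in l:

     if i.split()[1]=='y':

         d[i.split()[0]].append(1) # no need to care what i.split()[0] is here; with a plain dict() the key would have to be created first, e.g. d=dict();d['a']=[];

 for k,v in d.items():

     print('{}:{} times'.format(k,len(v)))
--------------------------------------------
a:5 times    # only here do we find out which keys exist and how many of each
b:3 times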
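When only the count matters, a defaultdict(int) avoids building throwaway lists. A minimal sketch with made-up sample data (`rows` and `counts` are illustrative names, not from the original code):

```python
from collections import defaultdict

# Made-up sample data in the same "name flag" shape as above.
rows = ['a y', 'a n', 'b y', 'a y', 'b n']

counts = defaultdict(int)        # a missing key starts at int() == 0
for row in rows:
    name, flag = row.split()
    if flag == 'y':
        counts[name] += 1

print(dict(counts))   # {'a': 2, 'b': 1}
```

The factory passed to defaultdict is called with no arguments on each missing key, which is why int, list and set all work.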

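  3. The OrderedDict class. As the name suggests, it is about order: it keeps the key-value pairs stored in the dictionary in the order they were inserted (with large amounts of data, consider the performance cost)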

 In [10]: from collections import OrderedDict

 In [11]: d=OrderedDict()

 In [12]: d['a']=1

 In [13]: d['b']=2

 In [14]: d['c']=3

 In [15]: d['d']=4

 In [16]: for k in d:

    ....:     print(k,d[k])

    ....:

 ------------------------------------

 a 1

 b 2

 c 3

 d 4
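Since Python 3.7 a plain dict also preserves insertion order, so OrderedDict is mainly useful for its extra order-aware operations. A short sketch of those, using made-up keys:

```python
from collections import OrderedDict

od = OrderedDict(a=1, b=2, c=3)

od.move_to_end('a')               # move an existing key to the end
print(list(od))                   # ['b', 'c', 'a']

first = od.popitem(last=False)    # pop from the front (FIFO order)
print(first)                      # ('b', 2)

# Equality between two OrderedDicts is order-sensitive; plain dicts ignore order.
print(OrderedDict(x=1, y=2) == OrderedDict(y=2, x=1))   # False
print(dict(x=1, y=2) == dict(y=2, x=1))                 # True
```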

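  4. The Counter class. It quickly and easily returns the most frequent elements of a sequence together with their counts.

 >>> import random

 >>> from collections import Counter

 >>> l = []

 >>> for i in range(1,20):

 ...     l.append(chr(random.randint(97,100)))  # we don't know exactly which characters were generated, only that they fall in [a-d]

 ...

 >>> list_count = Counter(l)    # create a Counter instance

 >>> top_three = list_count.most_common(3)  # take the three most common elements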

 >>> print (top_three)

 [('a', 8), ('d', 4), ('b', 4)]

  Suppose we know that some character (say 'a') appears many times and want the exact count:

>>> list_count['a']

8

>>> list_count['b']

4

>>> list_count['d']

4

>>> list_count['c']

3

>>> print(list_count['e'])

0

>>> print(l)

['a', 'd', 'a', 'd', 'a', 'b', 'a', 'a', 'd', 'b', 'd', 'a', 'a', 'a', 'c', 'c', 'c', 'b', 'b']

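  The operations look a lot like a dict's. Indeed, here is the definition of the Counter class, along with its other uses:

class Counter(dict):  # indeed, it inherits from dict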

    '''Dict subclass for counting hashable items.  Sometimes called a bag

    or multiset.  Elements are stored as dictionary keys and their counts

    are stored as dictionary values.

    >>> c = Counter('abcdeabcdabcaba')  # count elements from a string

    >>> c.most_common(3)                # three most common elements

    [('a', 5), ('b', 4), ('c', 3)]

    >>> sorted(c)                       # list all unique elements

    ['a', 'b', 'c', 'd', 'e']

    >>> ''.join(sorted(c.elements()))   # list elements with repetitions

    'aaaaabbbbcccdde'

    >>> sum(c.values())                 # total of all counts

    15

    >>> c['a']                          # count of letter 'a'

    5

    >>> for elem in 'shazam':           # update counts from an iterable

    ...     c[elem] += 1                # by adding 1 to each element's count

    >>> c['a']                          # now there are seven 'a'

    7

    >>> del c['b']                      # remove all 'b'

    >>> c['b']                          # now there are zero 'b'

    0

    >>> d = Counter('simsalabim')       # make another counter

    >>> c.update(d)                     # add in the second counter

    >>> c['a']                          # now there are nine 'a'

    9

    >>> c.clear()                       # empty the counter

    >>> c

    Counter()

    Note:  If a count is set to zero or reduced to zero, it will remain

    in the counter until the entry is deleted or the counter is cleared:

    >>> c = Counter('aaabbc')

    >>> c['b'] -= 2                     # reduce the count of 'b' by two

    >>> c.most_common()                 # 'b' is still in, but its count is zero

    [('a', 3), ('c', 1), ('b', 0)]

    '''
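Beyond the docstring above, Counter objects also support arithmetic and set-like operators. A brief sketch with made-up strings:

```python
from collections import Counter

c1 = Counter('aabbbc')   # a:2, b:3, c:1
c2 = Counter('abd')      # a:1, b:1, d:1

total = c1 + c2          # add counts
diff = c1 - c2           # subtract counts; zero and negative results are dropped
common = c1 & c2         # minimum of each count
union = c1 | c2          # maximum of each count

print(total['b'], diff['b'], common['b'], union['b'])   # 4 2 1 3
```

Unlike the in-place `c['b'] -= 2` shown in the docstring, the binary operators never leave zero or negative counts behind.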

III. Sorting a list of dicts by a common key (sorted), and finding the maximum (max) and minimum (min)

  Method 1: use operator.itemgetter

from operator import itemgetter

l = [
    {'name': 'stock1','num': '','price': 13.7},
    {'name': 'stock2','num': '','price': 15.7},
    {'name': 'stock4','num': '','price': 16.7},
    {'name': 'stock3','num': '','price': 10.7}
]

print(l)

sort_by_num = sorted(l,key=itemgetter('num'))

sort_by_price = sorted(l,key=itemgetter('price'))

print(sort_by_num)    # every 'num' here is '', so the stable sort keeps the original order

print(sort_by_price)

---------------------------

[{'name': 'stock1', 'num': '', 'price': 13.7}, {'name': 'stock2', 'num': '', 'price': 15.7}, {'name': 'stock4', 'num': '', 'price': 16.7}, {'name': 'stock3', 'num': '', 'price': 10.7}]

[{'name': 'stock1', 'num': '', 'price': 13.7}, {'name': 'stock2', 'num': '', 'price': 15.7}, {'name': 'stock4', 'num': '', 'price': 16.7}, {'name': 'stock3', 'num': '', 'price': 10.7}]

[{'name': 'stock3', 'num': '', 'price': 10.7}, {'name': 'stock1', 'num': '', 'price': 13.7}, {'name': 'stock2', 'num': '', 'price': 15.7}, {'name': 'stock4', 'num': '', 'price': 16.7}]

---------------------------------------------------

from operator import itemgetter
l = [
    {'name': 'stock1','num': '000001','price': 13.7},
    {'name': 'stock2','num': '000002','price': 15.7},
    {'name': 'stock4','num': '000003','price': 16.7},
    {'name': 'stock3','num': '000003','price': 10.7}
]
print(l)
sort_by_num_price = sorted(l,key=itemgetter('num','price'))
print(sort_by_num_price)

---------------------------
[{'name': 'stock1', 'num': '000001', 'price': 13.7}, {'name': 'stock2', 'num': '000002', 'price': 15.7}, {'name': 'stock4', 'num': '000003', 'price': 16.7}, {'name': 'stock3', 'num': '000003', 'price': 10.7}]
[{'name': 'stock1', 'num': '000001', 'price': 13.7}, {'name': 'stock2', 'num': '000002', 'price': 15.7}, {'name': 'stock3', 'num': '000003', 'price': 10.7}, {'name': 'stock4', 'num': '000003', 'price': 16.7}]

  Method 2: the only difference is how the function passed to key is implemented. Both lambda and itemgetter simply define a func

  func = lambda x:x['num']  func(r)==>return r['num']

  func = itemgetter('num')  func(r)==>return r['num']

# using an anonymous function (lambda)

l = [
    {'name': 'stock1','num': '','price': 13.7},
    {'name': 'stock2','num': '','price': 15.7},
    {'name': 'stock4','num': '','price': 16.7},
    {'name': 'stock3','num': '','price': 10.7}
]

print(l)

sort_by_num = sorted(l,key=lambda x:x['num'])

sort_by_price = sorted(l,key=lambda x:x['price'])

print(sort_by_num)    # every 'num' here is '', so the stable sort keeps the original order

print(sort_by_price)

-------------------------

[{'name': 'stock1', 'num': '', 'price': 13.7}, {'name': 'stock2', 'num': '', 'price': 15.7}, {'name': 'stock4', 'num': '', 'price': 16.7}, {'name': 'stock3', 'num': '', 'price': 10.7}]

[{'name': 'stock1', 'num': '', 'price': 13.7}, {'name': 'stock2', 'num': '', 'price': 15.7}, {'name': 'stock4', 'num': '', 'price': 16.7}, {'name': 'stock3', 'num': '', 'price': 10.7}]

[{'name': 'stock3', 'num': '', 'price': 10.7}, {'name': 'stock1', 'num': '', 'price': 13.7}, {'name': 'stock2', 'num': '', 'price': 15.7}, {'name': 'stock4', 'num': '', 'price': 16.7}]

--------------------------------------------------- divider --
l = [
    {'name': 'stock1','num': '000001','price': 13.7},
    {'name': 'stock2','num': '000002','price': 15.7},
    {'name': 'stock4','num': '000003','price': 16.7},
    {'name': 'stock3','num': '000003','price': 10.7}
]
print(l)
sort_by_num_price = sorted(l,key=lambda x:(x['num'],x['price']))
print(sort_by_num_price)

---------------------------
[{'name': 'stock1', 'num': '000001', 'price': 13.7}, {'name': 'stock2', 'num': '000002', 'price': 15.7}, {'name': 'stock4', 'num': '000003', 'price': 16.7}, {'name': 'stock3', 'num': '000003', 'price': 10.7}]
[{'name': 'stock1', 'num': '000001', 'price': 13.7}, {'name': 'stock2', 'num': '000002', 'price': 15.7}, {'name': 'stock3', 'num': '000003', 'price': 10.7}, {'name': 'stock4', 'num': '000003', 'price': 16.7}]
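To make the comparison between the two methods concrete, here is a small sketch on made-up rows (the `rows` data is illustrative only). It checks that both key styles give the same result, and shows a case where a lambda is needed because the key is computed:

```python
from operator import itemgetter

# Made-up rows for illustration.
rows = [
    {'num': '000003', 'price': 10.7},
    {'num': '000001', 'price': 13.7},
    {'num': '000003', 'price': 16.7},
]

by_getter = sorted(rows, key=itemgetter('num', 'price'))
by_lambda = sorted(rows, key=lambda r: (r['num'], r['price']))
print(by_getter == by_lambda)   # True

# A computed key needs a lambda (or def): ascending num, descending price.
mixed = sorted(rows, key=lambda r: (r['num'], -r['price']))
print([(r['num'], r['price']) for r in mixed])
# [('000001', 13.7), ('000003', 16.7), ('000003', 10.7)]
```

itemgetter with several keys returns a tuple, which is why it matches the tuple-returning lambda.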

Maximum and minimum with max and min

from operator import itemgetter

l = [
    {'name': 'stock1','num': '','price': 13.7},
    {'name': 'stock2','num': '','price': 15.7},
    {'name': 'stock4','num': '','price': 16.7},
    {'name': 'stock3','num': '','price': 10.7}
]

max_item = max(l,key=itemgetter('price'))   # don't name the result max or min: that would shadow the built-ins

min_item = min(l,key=itemgetter('price'))

print(max_item)

print(min_item)

-----------------------------

{'name': 'stock4', 'num': '', 'price': 16.7}

{'name': 'stock3', 'num': '', 'price': 10.7}
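One detail worth knowing: max and min raise ValueError on an empty iterable unless a default is given. A minimal sketch on made-up rows:

```python
from operator import itemgetter

# Made-up rows for illustration.
rows = [
    {'name': 'x', 'price': 13.7},
    {'name': 'y', 'price': 10.7},
    {'name': 'z', 'price': 16.7},
]

cheapest = min(rows, key=itemgetter('price'))
priciest = max(rows, key=itemgetter('price'))

# With default=..., an empty iterable returns the default instead of raising.
nothing = max([], key=itemgetter('price'), default=None)

print(cheapest['name'], priciest['name'], nothing)   # y z None
```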

IV. Higher-order functions: map/filter/sorted

class map(object)
| map(func, *iterables) --> map object
| Make an iterator that computes the function using arguments from
| each of the iterables. Stops when the shortest iterable is exhausted.

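Briefly: map takes a function func and one or more iterables. func is a function you write for your needs, and the iterables are the objects you want to process. The return value is an iterator. I won't go into iterators here; for now it is enough to remember that an iterator can be turned into a list or a set with list(iterator) or set(iterator).

Here is a classic example: square every element of the list a = [1,2,3,4,5,67] to produce a new list b: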

>>> a = [1,2,3,4,5,67]

>>> def func(x):

...  return x*x

...

>>> b=list(map(func,a))

>>> b

[1, 4, 9, 16, 25, 4489]

>>>

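In this example you could get the same result b without knowing map, but you would need at least a for loop and a few more lines of code. A few extra lines are no big deal; the point is that map makes the code a little more elegant.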
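For comparison, here is a sketch of map next to the equivalent list comprehension, plus the multi-iterable form mentioned in the help text. The lists `a` and `b` are made up:

```python
a = [1, 2, 3]
b = [10, 20, 30, 40]

squares_map = list(map(lambda x: x * x, a))
squares_comp = [x * x for x in a]        # equivalent list comprehension

# With several iterables, func receives one item from each;
# map stops at the shortest one, so 40 is never used.
sums = list(map(lambda x, y: x + y, a, b))

print(squares_map, squares_comp, sums)
# [1, 4, 9] [1, 4, 9] [11, 22, 33]
```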

class filter(object)
| filter(function or None, iterable) --> filter object
| Return an iterator yielding those items of iterable for which function(item)
| is true. If function is None, return the items that are true.

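Like map, filter takes a function and an iterable. It passes each element of iterable to func and yields the element if the result is true. The return value is a lazy iterator, so elements are only tested as the iterator is consumed.

Example:

>>> a=[4,5,6,7,8,9,0]

>>> def dayu_5(x):   # "dayu" is pinyin for "greater than"

...  return x>5

...

>>> list(filter(dayu_5,a))

[6, 7, 8, 9]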
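Two related forms, sketched on the same kind of data: passing None as the function keeps only truthy items (as the help text says), and a list comprehension with a condition does the same job as filter:

```python
a = [4, 5, 6, 7, 8, 9, 0]

truthy = list(filter(None, a))     # drops falsy items such as 0
big = [x for x in a if x > 5]      # comprehension equivalent of filter

print(truthy)   # [4, 5, 6, 7, 8, 9]
print(big)      # [6, 7, 8, 9]
```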

sorted(iterable, key=None, reverse=False)
Return a new list containing all items from the iterable in ascending order.

It returns a new list containing all items of the iterable, sorted in ascending order.

Example:

>>> l=[2,34,5,61,2,4,-6,-89]

>>> sorted(l)

[-89, -6, 2, 2, 4, 5, 34, 61]

>>> sorted(l,key=lambda x:abs(x))

[2, 2, 4, 5, -6, 34, 61, -89]

>>> sorted(l,key=lambda x:abs(x),reverse=True)

[-89, 61, 34, -6, 5, 4, 2, 2]

# The three examples use sorted's default ordering, a custom key, and reverse, respectively.

key is simply a function; see the next example:
>>> def ff(x):
...  return abs(x)
>>> sorted(l,key=ff,reverse=False)
[2, 2, 4, 5, -6, 34, 61, -89]
As you can see, the result is the same.
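Two properties of sorted are easy to miss: it is stable (items with equal keys keep their original relative order), and it leaves the input untouched, unlike list.sort(). A small sketch with made-up words:

```python
words = ['pear', 'fig', 'apple', 'kiwi', 'date']

by_len = sorted(words, key=len)
print(by_len)   # ['fig', 'pear', 'kiwi', 'date', 'apple']
print(words)    # ['pear', 'fig', 'apple', 'kiwi', 'date'] (unchanged)

# list.sort() sorts in place and returns None.
# reverse=True keeps equal-key items in their original order too.
words.sort(key=len, reverse=True)
print(words)    # ['apple', 'pear', 'kiwi', 'date', 'fig']
```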

V. The heapq module

nlargest()和nsmallest()

Help on function nlargest in module heapq:
nlargest(n, iterable, key=None)
Find the n largest elements in a dataset.
Equivalent to: sorted(iterable, key=key, reverse=True)[:n]

The help text shows that heapq.nlargest() takes the n largest values from iterable, compared according to key; it is equivalent to the expression sorted(iterable, key=key, reverse=True)[:n]

 >>> import heapq

 >>> help(heapq.nlargest)   # prints the help text shown above

 >>> l = [12,34,22,45,678,89,100,232,23]

 >>> heapq.nlargest(5,l)

 [678, 232, 100, 89, 45]

 >>> l2=[(10,2),(9,5),(8,10),(7,6),(6,1)]

 >>> heapq.nlargest(3,l2,key=lambda x:x[1])

 [(8, 10), (7, 6), (9, 5)]

 >>>

 >>> sorted(l,reverse=True)[:5]

 [678, 232, 100, 89, 45]

 >>>

Help on function nsmallest in module heapq:
nsmallest(n, iterable, key=None)
Find the n smallest elements in a dataset.
Equivalent to: sorted(iterable, key=key)[:n]

The help text shows that nsmallest() is the exact opposite of nlargest()

 >>> heapq.nsmallest(5,l)

 [12, 22, 23, 34, 45]

 >>> sorted(l)[:5]

 [12, 22, 23, 34, 45]

 >>> heapq.nsmallest(3,l2,key=lambda x:x[1])

 [(6, 1), (10, 2), (9, 5)]

 >>>
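nlargest and nsmallest are built on the heap functions in the same module. A minimal sketch of using those directly, on the same numbers as above:

```python
import heapq

data = [12, 34, 22, 45, 678, 89, 100, 232, 23]

# The documented equivalence with sorted holds:
same = heapq.nlargest(3, data) == sorted(data, reverse=True)[:3]
print(same)   # True

heap = list(data)        # copy, so data itself is left untouched
heapq.heapify(heap)      # rearrange into a min-heap in place
smallest_three = [heapq.heappop(heap) for _ in range(3)]
print(smallest_three)    # [12, 22, 23]
```

heappop always removes the smallest remaining item, so repeated pops come out in ascending order.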

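VI. Removing duplicates from a sequence while preserving order

We usually use set to deduplicate a sequence, but the original order is often lost, as in the example below:

>>> import random

>>> l=[]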

>>> for i in range(10):

...  l.append(random.randint(1,4))

...

>>> list(set(l))

[1, 2, 3, 4]

>>>
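The section title promises a version that keeps the original order. A minimal sketch of one common approach follows; the `dedupe` helper is a hypothetical name and not the author's code, and it assumes the items are hashable:

```python
def dedupe(items):
    """Yield items in first-seen order, skipping duplicates (hashable items only)."""
    seen = set()
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item

l = [3, 1, 3, 2, 1, 4, 2]
print(list(dedupe(l)))          # [3, 1, 2, 4]

# Alternative that relies on dicts keeping insertion order (Python 3.7+):
print(list(dict.fromkeys(l)))   # [3, 1, 2, 4]
```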

To be continued...
