http://blog.csdn.net/pipisorry/article/details/51822775

NumPy functions and methods for sorting, searching, and counting (reorganized).

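import numpy

list1 = [[1, 3, 2], [3, 5, 4]]   # example values (assumed); each row starts unsorted
array = numpy.array(list1)
array.sort()   # sorts in place along the last axis
print(array)

[[1 2 3]
 [3 4 5]]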
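The ndarray sort() method sorts in place and modifies the original array. This differs from Python's built-in sorted() function and from the numpy.sort function, and its parameters differ as well.

The sort() method returns None, so syntax such as array.sort()[:5] is not possible: it amounts to slicing None.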
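The difference between the in-place method and the copying function can be sketched as follows. The array values are made up for illustration.

```python
import numpy as np

arr = np.array([3, 1, 2])      # made-up values for illustration

sorted_copy = np.sort(arr)     # returns a new sorted array; arr is untouched here
result = arr.sort()            # sorts arr in place and returns None

print(sorted_copy)             # [1 2 3]
print(result)                  # None
print(arr[:2])                 # slice the array after sorting, not arr.sort()[:2]
```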

Sorting the rows of a matrix by the values in its first column

mat1=mat1[mat1[:,0].argsort()]
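A small sketch of the row reordering above, with made-up matrix values: argsort on the first column gives the row order, and indexing with that order moves whole rows together.

```python
import numpy as np

mat1 = np.array([[3, 9, 0],
                 [1, 7, 5],
                 [2, 8, 4]])     # made-up values for illustration

order = mat1[:, 0].argsort()     # row order that sorts the first column
mat1 = mat1[order]               # reorder whole rows; each row stays intact

print(order)                     # [1 2 0]
print(mat1)
```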

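Sorting with the numpy.sort function

np.sort() returns a new array and leaves the original unchanged (similar to Python's built-in sorted() function, although numpy has no function named sorted and the parameters differ).

For both the sort() method and np.sort(), the axis parameter defaults to -1, meaning the sort runs along the last axis of the array. For np.sort(), axis can also be set to None, in which case the array is flattened first and a new sorted 1-D array is returned.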
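The effect of the axis argument, including axis=None, can be seen on a small made-up array:

```python
import numpy as np

a = np.array([[3, 1, 2],
              [0, 5, 4]])           # made-up values for illustration

by_row = np.sort(a)                 # axis=-1 (default): sort within each row
by_col = np.sort(a, axis=0)         # sort within each column
flat = np.sort(a, axis=None)        # flatten first, then sort

print(by_row)                       # [[1 2 3] [0 4 5]]
print(by_col)                       # [[0 1 2] [3 5 4]]
print(flat)                         # [0 1 2 3 4 5]
```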

>>> np.sort(a)  # sort the values within each row
array([[1, 3, 6, 7, 9],
       [1, 2, 3, 5, 8],
       [0, 4, 8, 9, 9],
       [0, 1, 5, 7, 9]])
>>> np.sort(a, axis=0)  # sort the values within each column
array([[5, 1, 1, 4, 0],
       [7, 1, 3, 6, 0],
       [9, 5, 9, 7, 2],
       [9, 8, 9, 8, 3]])

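Ascending sort:

list1 = [[1, 3, 2], [3, 5, 4]]   # example values (assumed)
array = numpy.array(list1)
array = numpy.sort(array, axis=1)   # ascending sort along axis 1
#array = numpy.sort(array, axis=0)   # along axis 0
print(array)
[[1 2 3]
 [3 4 5]]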

Descending sort:

#array = -numpy.sort(-array, axis=1)   # descending: negate, sort ascending, negate back
[[3 2 1]
 [5 4 3]]

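Sorting with the numpy.argsort function

Usage of the argsort function (numpy-ref-1.8.1, p. 1240)

argsort() returns the indices that would sort the array; its axis parameter defaults to -1.

argsort(a, axis=-1, kind='quicksort', order=None)
Returns the indices that would sort an array.

The indices returned by argsort order the array values from smallest to largest.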

Examples

--------
One-dimensional array:
>>> x = np.array([3, 1, 2])
>>> np.argsort(x)
array([1, 2, 0])

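Two-dimensional array:
>>> x = np.array([[0, 3], [2, 2]])
>>> x
array([[0, 3],
       [2, 2]])
>>> np.argsort(x, axis=0)  # sort along axis 0 (down each column)
array([[0, 1],
       [1, 0]])
>>> np.argsort(x, axis=1)  # sort along axis 1 (across each row)
array([[0, 1],
       [0, 1]])

>>> x = np.array([3, 1, 2])
>>> np.argsort(x)  # ascending order
array([1, 2, 0])

>>> np.argsort(-x)  # descending order
array([0, 2, 1])

Note: you can also sort in ascending order and reverse the result afterwards, e.g. np.argsort(index[c])[:-MAX_K:-1]. Because the stop index -MAX_K is excluded, this slice yields the indices of the MAX_K - 1 largest values, largest first.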
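As a sketch of the top-k idiom in the note, the snippet below compares three slicing variants on made-up scores. The names scores and k are hypothetical; the last variant shows that the [:-k:-1] form returns one index fewer than k.

```python
import numpy as np

scores = np.array([10, 50, 30, 20, 40])     # made-up values, no ties
k = 3

top_k = np.argsort(scores)[::-1][:k]        # reverse the ascending order, keep k
top_k_neg = np.argsort(-scores)[:k]         # sort the negated values instead
one_short = np.argsort(scores)[:-k:-1]      # stop index -k is excluded: k - 1 items

print(top_k)                                # [1 4 2]
print(top_k_neg)                            # [1 4 2]
print(one_short)                            # [1 4]
```

With tied values the first two variants may order the tied indices differently, so they are only guaranteed to agree when all values are distinct.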

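Another way to get descending order (reversing like this does not carry over directly to multidimensional arrays):
>>> a
array([1, 2, 3])
>>> a[::-1]
array([3, 2, 1])

>>> x[np.argsort(x)]  # the sorted array, obtained by indexing with the sort indices
array([1, 2, 3])
>>> x[np.argsort(-x)]  # this direct indexing does not work for 2-D arrays!
array([3, 2, 1])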

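Descending sort of a multidimensional array

list1 = [[1, 3, 2], [3, 1, 4]]   # example values (assumed)
a = numpy.array(list1)
a = numpy.array([a[line_id, i] for line_id, i in enumerate(numpy.argsort(-a, axis=1))])
print(a)

[[3 2 1]
 [4 3 1]]

list1 = [[1, 3, 2], [3, 1, 4]]   # example values (assumed)
a = numpy.array(list1)
sindx = numpy.argsort(-a, axis=1)
indx = list(numpy.meshgrid(*[numpy.arange(x) for x in a.shape], sparse=True,
                   indexing='ij'))
indx[1] = sindx
a = a[tuple(indx)]   # index with a tuple of index arrays
print(a)

[[3 2 1]
 [4 3 1]]

list1 = [[1, 3, 2], [3, 1, 4]]   # example values (assumed)
a = numpy.array(list1)
a = -numpy.sort(-a, axis=1)
print(a)

[[3 2 1]
 [4 3 1]]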
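Another option, not used in the original post, is np.take_along_axis (available in NumPy 1.15 and later), which applies per-row argsort indices without building index grids by hand. A minimal sketch with made-up values:

```python
import numpy as np

a = np.array([[1, 3, 2],
              [3, 1, 4]])                      # made-up values for illustration

order = np.argsort(-a, axis=1)                 # per-row indices, largest value first
desc = np.take_along_axis(a, order, axis=1)    # gather each row in that order

print(desc)                                    # [[3 2 1] [4 3 1]]
```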

[Usage of the argsort function in numpy]

皮皮blog

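Searching

After locating certain values in a NumPy array, you usually want to do something with them, such as assigning or replacing.

For example, to replace every element equal to 0 with 1: a[a == 0] = 1

More involved filtering, such as limiting values, can be done with np.minimum(arr, 255) (cap values at 255) or result = np.clip(arr, 0, 255) (restrict values to the range 0 to 255).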
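The three operations mentioned above can be tried on small made-up arrays:

```python
import numpy as np

a = np.array([0, 3, 0, 7])          # made-up values for illustration
a[a == 0] = 1                       # boolean mask selects the zeros, then assigns 1

arr = np.array([-20, 100, 300])     # made-up values for illustration
capped = np.minimum(arr, 255)       # upper bound only
clipped = np.clip(arr, 0, 255)      # lower and upper bound

print(a)                            # [1 3 1 7]
print(capped)                       # [-20 100 255]
print(clipped)                      # [  0 100 255]
```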

`argmax`(a[, axis, out])	Returns the indices of the maximum values along an axis.
`nanargmax`(a[, axis])	Return the indices of the maximum values in the specified axis ignoring NaNs.
`argmin`(a[, axis, out])	Returns the indices of the minimum values along an axis.
`nanargmin`(a[, axis])	Return the indices of the minimum values in the specified axis ignoring NaNs.
`argwhere`(a)	Find the indices of array elements that are non-zero, grouped by element.
`nonzero`(a)	Return the indices of the elements that are non-zero.
`flatnonzero`(a)	Return indices that are non-zero in the flattened version of a.
`where`(condition, [x, y])	Return elements, either from x or y, depending on condition.
`searchsorted`(a, v[, side, sorter])	Find indices where elements should be inserted to maintain order.
`extract`(condition, arr)	Return the elements of an array that satisfy some condition.

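Extrema

min() and max() compute the minimum and maximum of an array, and ptp() computes the difference between the maximum and the minimum.

They all accept the axis and out parameters.

argmax() and argmin() return the indices of the maximum and minimum. If axis is not given, the index refers to the flattened array.

>>> np.argmax(a)  # index of the maximum of a; with several equal maxima, the first one is returned
2
>>> a.ravel()[2]  # the element at index 2 of the flattened array
9
unravel_index() converts a flat index into an index into the multidimensional array. Its first argument is the flat index and its second argument is the shape of the array.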
>>> idx = np.unravel_index(2, a.shape)
>>> idx
(0, 2)
>>> a[idx]
9

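With the axis parameter, the index of the maximum is computed along the given axis.
For example, the result below says that in array a the maximum of row 0 is at index 2 and the maximum of row 1 is at index 3:
>>> idx = np.argmax(a, axis=1)
>>> idx
array([2, 3, 0, 0])
Use idx to pick out the maximum of each row:
>>> a[np.arange(a.shape[0]), idx]
array([9, 8, 9, 9])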
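Since the array a used in the session above is not shown, here is a self-contained sketch of the same steps on a made-up 2x3 array:

```python
import numpy as np

a = np.array([[1, 9, 4],
              [7, 2, 8]])                       # made-up values for illustration

flat_idx = np.argmax(a)                         # index into the flattened array
idx = np.unravel_index(flat_idx, a.shape)       # convert to (row, column)

row_idx = np.argmax(a, axis=1)                  # column of the maximum in each row
row_max = a[np.arange(a.shape[0]), row_idx]     # pick one element per row

print(flat_idx, idx)
print(row_idx, row_max)                         # [1 2] [9 8]
```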

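nonzero(a)

Returns the indices of the non-zero elements.

In effect this locates where a != 0 holds: a != 0 gives a boolean mask, while nonzero(a) gives the index arrays of the mask's True entries.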
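The relation between nonzero(a) and the mask a != 0 can be checked on a made-up array: both select the same elements, one through index arrays and the other through a boolean mask.

```python
import numpy as np

a = np.array([[0, 2],
              [3, 0]])              # made-up values for illustration

rows, cols = np.nonzero(a)          # one index array per dimension
mask = a != 0                       # boolean array with the same shape as a

print(rows, cols)                   # [0 1] [1 0]
print(a[rows, cols])                # [2 3]
print(a[mask])                      # [2 3]
```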

Finding elements with where

Finding the position of an element

Given a NumPy array, array, and a value, item, to search for:

itemindex = numpy.where(array==item)

For a 2-D array, the result is a tuple with first all the row indices, then all the column indices.

Finding only the first occurrence in a 1-D array

array.tolist().index(1)

itemindex = np.argwhere(array==item)[0]; array[tuple(itemindex)]

index = numpy.nonzero(first_array == item)[0][0]

Note: np.argwhere(a) is the same as np.transpose(np.nonzero(a)). The output of argwhere is not suitable for indexing arrays. For this purpose use where(a) instead.

[Is there a Numpy function to return the first index of something in an array?]
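A minimal sketch of finding the first occurrence, with made-up data. The fallback value -1 for "not found" is an arbitrary choice for this sketch, not something NumPy provides.

```python
import numpy as np

array = np.array([4, 7, 1, 7])              # made-up values for illustration
item = 7

matches = np.nonzero(array == item)[0]      # all positions where array == item
first = int(matches[0]) if matches.size else -1   # -1 is an assumed "not found" marker

grid = np.array([[4, 7],
                 [7, 1]])                   # made-up 2-D example
rows, cols = np.where(grid == item)         # row indices first, then column indices
first_2d = (int(rows[0]), int(cols[0]))     # first match in row-major order

print(matches, first)                       # [1 3] 1
print(first_2d)                             # (0, 1)
```

Indexing matches[0] directly, as in the one-liners above, raises IndexError when the value is absent, which is why the size check is included here.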

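Piecewise functions

{Like x = y if condition else z in Python, or condition ? a : b in C: test the condition and take a if it holds, otherwise b.}

The where function

where(condition, [x, y])

Example 1: compute the difference between two matrices and sum the squared residuals over the non-zero entries of the data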

import numpy as np

def f_norm_1(data, estimate):
    # Plain Python loops: sum squared residuals over the non-zero entries of data.
    residual = 0
    for row_index in range(data.shape[0]):
        for column_index in range(data.shape[1]):
            if data[row_index][column_index] != 0:
                residual += (data[row_index][column_index] - estimate[row_index][column_index]) ** 2
    return residual

def f_norm_2(data, estimate):
    # Vectorized: use 0 wherever data == 0, then sum all entries.
    return np.sum(np.where(data != 0, (data - estimate) ** 2, 0))

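Because the sparsity of the matrix has to be taken into account, the built-in norm cannot be used. Function 1 is written in plain Python and is not complicated; for a 10*10 matrix, 200 evaluations take 0.15 s. Function 2 uses where and sum, both of which are optimized for vectorized computation. It is not only more concise but also takes just 0.03 s, about five times faster. Beyond that, comparisons of NumPy with MATLAB have reportedly found NumPy slightly faster, which is an encouraging result.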
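To check that the loop and the vectorized version compute the same quantity, a sketch along these lines can be used. The function names, the random test data, and the seed are assumptions made for this example; timings will vary by machine and are not reproduced here.

```python
import numpy as np

def masked_sq_error_loop(data, estimate):
    # Hypothetical name: explicit loops over the non-zero entries of data.
    total = 0.0
    for i in range(data.shape[0]):
        for j in range(data.shape[1]):
            if data[i, j] != 0:
                total += (data[i, j] - estimate[i, j]) ** 2
    return total

def masked_sq_error_vec(data, estimate):
    # Hypothetical name: same quantity with np.where and np.sum.
    return np.sum(np.where(data != 0, (data - estimate) ** 2, 0))

rng = np.random.default_rng(0)                             # assumed seed
data = rng.integers(0, 3, size=(10, 10)).astype(float)     # zeros mark missing entries
estimate = rng.random((10, 10))

loop_value = masked_sq_error_loop(data, estimate)
vec_value = masked_sq_error_vec(data, estimate)
print(np.isclose(loop_value, vec_value))
```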

Example 2:

>>> x=np.arange(10)
>>> np.where(x<5,9-x,x)
array([9, 8, 7, 6, 5, 5, 6, 7, 8, 9])
This builds the array 0 to 9 and then derives a second array from it: where x<5 the value becomes 9-x, otherwise it stays x.

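The select function

out = select(condlist, choicelist, default=0)
Here condlist is a list of N boolean arrays and choicelist is a list of N arrays of candidate values; all arrays have length M. A list element that is a single value instead of an array behaves like an array of length M filled with that value. For each index i from 0 to M-1, select finds the smallest j for which condlist[j][i] is True and sets out[i] = choicelist[j][i], where out is the array returned by select(). When the last element of condlist is True, the last array in choicelist supplies the values wherever none of the earlier conditions hold. Alternatively, the default parameter gives the candidate values to use when no condition holds.

>>> np.select([x<2,x>6,True],[7-x,x,2*x])
array([ 7, 6, 4, 6, 8, 10, 12, 7, 8, 9])
Where x satisfies the first condition the result is 7-x, where it satisfies the second the result is x, and where it satisfies neither the result is 2*x.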
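The "first matching condition wins" rule and the default parameter can be illustrated with a short sketch. The overlapping conditions in the second call are made up to show that a later condition is ignored wherever an earlier one already matched.

```python
import numpy as np

x = np.arange(10)

# Catch-all True as the last condition, as in the session above.
out_true = np.select([x < 2, x > 6, True], [7 - x, x, 2 * x])

# Overlapping conditions: x < 4 is listed first, so x < 2 never gets a turn.
y = np.arange(6)
out_overlap = np.select([y < 4, y < 2], [10, 20], default=-1)

print(out_true)         # [ 7  6  4  6  8 10 12  7  8  9]
print(out_overlap)      # [10 10 10 10 -1 -1]
```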

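piecewise()

piecewise(x, condlist, funclist)

The previous two functions are fairly memory-hungry, which is why piecewise() exists: it only evaluates where the condition holds. All arguments of where() and select() must be fully computed before the call, so in the triangle-wave example below NumPy computes these four arrays: x>=c, x<c0, x/c0*hc, (c-x)/(c-c0)*hc. Further arrays holding intermediate results are created along the way, so for a large input array x there is a lot of memory allocation and deallocation. piecewise() is designed specifically for evaluating piecewise functions and avoids this.

The parameter x is an array holding the values of the independent variable. condlist is a list of M boolean arrays, each with the same length as x. funclist is a list of M or M+1 functions whose inputs and outputs are arrays; each computes one piece of the piecewise function. An entry that is a number instead of a function behaves like a function returning that number. Each function corresponds to the boolean array at the same position in condlist. If funclist has length M+1, the last function applies where all conditions are False.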

np.piecewise(x, [x < 0, x >= 0], [-1, 1])

>>> x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> np.piecewise(x, [x<2,x>6], [lambda x:7-x,lambda x:x,lambda x:2*x])
array([ 7,  6,  4,  6,  8, 10, 12,  7,  8,  9])

Note: when an entry of funclist is an expression of x rather than a constant, it must be wrapped in a lambda. Passing the plain expression 7-x raises an error such as ValueError: NumPy boolean array indexing assignment cannot assign 10 input values to the 2 output values where the mask is true.
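The following sketch shows why callables are needed: piecewise passes each function only the elements of x selected by its condition. The helper double and the list seen are hypothetical names used here to record what the function receives.

```python
import numpy as np

x = np.arange(10)
seen = []                                   # records the inputs passed to double()

def double(values):
    # Hypothetical helper: called only with the elements where no condition held.
    seen.append(values.copy())
    return 2 * values

y = np.piecewise(x, [x < 2, x > 6], [lambda v: 7 - v, lambda v: v, double])

print(y)            # [ 7  6  4  6  8 10 12  7  8  9]
print(seen[0])      # [2 3 4 5 6]
```

A precomputed array such as 7 - x has the full length of x, so it cannot be assigned to the shorter masked slice, which is what the ValueError above reports.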

[numpy.piecewise]

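Example

Describe a triangle wave with a piecewise function. Within each period of length 1 the wave rises linearly from 0 to hc on [0, c0), falls linearly back to 0 on [c0, c), and stays at 0 on [c, 1).

def triangle_wave(x, c, c0, hc):
    x = x - x.astype(int)  # the period is 1, so only the fractional part of x is used
    return np.where(x>=c, 0, np.where(x<c0, x/c0*hc, (c-x)/(c-c0)*hc))

Because the waveform has three pieces, two nested where() calls are needed. All arithmetic and looping happen at the C level, so this is more efficient than frompyfunc().
As the number of pieces grows, more levels of nested where() are required, which makes the code harder to write and read. select() solves this problem.

def triangle_wave2(x, c, c0, hc):
    x = x - x.astype(int)
    return np.select([x>=c, x<c0, True], [0, x/c0*hc, (c-x)/(c-c0)*hc])

default can also be used: return np.select([x>=c, x<c0], [0, x/c0*hc], default=(c-x)/(c-c0)*hc)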

Computing the triangle wave with piecewise()

def triangle_wave3(x, c, c0, hc):
    x = x - x.astype(int)
    return np.piecewise(x,
        [x>=c, x<c0],
        [0, # x>=c
         lambda x: x/c0*hc, # x<c0
         lambda x: (c-x)/(c-c0)*hc]) # else

The advantage of piecewise() is that it only computes the values that are needed. In the example above, the expressions "x/c0*hc" and "(c-x)/(c-c0)*hc" are evaluated only for the elements of x that satisfy the corresponding condition.

Usage

x = np.linspace(0, 2, 1000)
y4= triangle_wave3(x,0.6, 0.4, 1.0)
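As a sanity check, the three formulations can be run side by side and compared. This is a self-contained sketch: the function names wave_where, wave_select, and wave_piecewise and the helper frac are hypothetical, and int is used for the cast because the np.int alias is not available in recent NumPy releases.

```python
import numpy as np

def frac(x):
    # Fractional part of non-negative x; the wave has period 1.
    return x - x.astype(int)

def wave_where(x, c, c0, hc):
    x = frac(x)
    return np.where(x >= c, 0, np.where(x < c0, x / c0 * hc, (c - x) / (c - c0) * hc))

def wave_select(x, c, c0, hc):
    x = frac(x)
    return np.select([x >= c, x < c0, True], [0, x / c0 * hc, (c - x) / (c - c0) * hc])

def wave_piecewise(x, c, c0, hc):
    x = frac(x)
    return np.piecewise(x, [x >= c, x < c0],
                        [0,                                    # x >= c
                         lambda v: v / c0 * hc,                # x < c0
                         lambda v: (c - v) / (c - c0) * hc])   # otherwise

x = np.linspace(0, 2, 1000)
y_where = wave_where(x, 0.6, 0.4, 1.0)
y_select = wave_select(x, 0.6, 0.4, 1.0)
y_piecewise = wave_piecewise(x, 0.6, 0.4, 1.0)

print(np.allclose(y_where, y_select), np.allclose(y_where, y_piecewise))
```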


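Counting

count_nonzero(a) Counts the number of non-zero values in the array a.

It counts the non-zero elements of a NumPy array.

Counting the 1s in a 0-1 array

There are two ways to count how many 1s a 0-1 array contains:

np.count_nonzero(fs_predict_array)
fs_predict_array.sum()

count_nonzero is the faster of the two, roughly 1.6 times as fast.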
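A small sketch of the two counting approaches on a made-up 0-1 array. The speed ratio is not measured here; it depends on array size and NumPy version.

```python
import numpy as np

fs_predict_array = np.array([0, 1, 1, 0, 1])          # made-up 0-1 predictions

n_nonzero = np.count_nonzero(fs_predict_array)        # counts entries that are not 0
n_sum = fs_predict_array.sum()                        # equals the count only for 0-1 data
n_equal = np.count_nonzero(fs_predict_array == 1)     # counts entries equal to 1

print(n_nonzero, n_sum, n_equal)                      # 3 3 3
```

If the array can hold values other than 0 and 1, sum() no longer gives a count, while the comparison form still counts the entries equal to 1.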

Counting how often each element occurs in a multidimensional array

Use the top-level pandas function pd.value_counts, which works on any array or sequence:
>>> pd.value_counts(obj.values, sort=False)
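Without pandas, the same kind of tally can be obtained with np.unique and its return_counts option. This is an alternative to the pandas call above, not what the original uses; the array is made up.

```python
import numpy as np

a = np.array([[1, 2, 2],
              [3, 1, 2]])                              # made-up values for illustration

# np.unique flattens the input and returns the sorted distinct values.
values, counts = np.unique(a, return_counts=True)
freq = dict(zip(values.tolist(), counts.tolist()))     # value -> number of occurrences

print(values, counts)                                  # [1 2 3] [2 3 1]
print(freq)                                            # {1: 2, 2: 3, 3: 1}
```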

[pandas notes: advanced pandas features]

from: http://blog.csdn.net/pipisorry/article/details/51822775

ref: [Sorting, searching, and counting]

NumPy tutorial: sorting, searching, and counting