Python Programming

Published: 2024-03-04

Updated: 2024-07-17

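NumPy

NumPy (Numerical Python) is a core Python library for scientific computing. It provides a powerful multidimensional array object, the ndarray, together with a wide range of functions that operate on such arrays. NumPy underpins many other scientific computing libraries, such as SciPy, Pandas, and Matplotlib.

By convention, numpy is imported under the alias np: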

python

import numpy as np

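This post only records some of the package's methods that I use often in practice.

Array creation

Creating random arrays: np.random.rand()

np.random.rand(d0: int, d1: int, d2: int, ...)

This is a very commonly used way to create a random array: it returns an array of shape $(d_0, d_1, d_2, \dots)$ whose entries are drawn uniformly from $[0, 1)$.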

python

data = np.random.rand(2, 4, 3)
print(data)
'''
Output:
[[[0.12344769 0.62765118 0.01542036]
  [0.64811143 0.34687265 0.97168938]
  [0.5908757  0.57925054 0.1721152 ]
  [0.89463742 0.74941611 0.66578358]]

 [[0.73119619 0.67648234 0.6159516 ]
  [0.71659169 0.73652607 0.5434614 ]
  [0.64428965 0.46184996 0.33439822]
  [0.01748738 0.23332551 0.57695121]]]
'''

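Integer random arrays can be created as well, with

np.random.randint(low: int, high: int, size: Tuple)

It creates an integer array of shape size whose entries are drawn from the half-open interval $[low, high)$; that is, low is included and high is excluded.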

python

int_data = np.random.randint(low=0, high=10, size=(2, 3))
print(int_data)
'''
Output:
[[4 7 4]
 [1 1 9]]
'''
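The bounds of np.random.randint() are easy to get wrong, so here is a minimal sketch that checks them empirically. The sample count of 10,000 is an arbitrary choice made for this illustration.

```python
import numpy as np

# Draw many samples so that every allowed value is very likely to appear.
samples = np.random.randint(low=0, high=3, size=10_000)

print(samples.min())  # never below low
print(samples.max())  # never reaches high, so at most high - 1
```

To include high itself, pass high + 1 as the upper bound.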

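Creating all-zero/all-one arrays: np.zeros()/np.ones()

np.zeros(shape: Tuple, dtype)
np.ones(shape: Tuple, dtype)

Create an all-zero/all-one array with the given shape and data type.

np.zeros_like(a: np.ndarray, dtype)
np.ones_like(a: np.ndarray, dtype)

Create an all-zero/all-one array with the same shape as a; the data type also follows a unless dtype is given.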

python

data = np.zeros((2, 4))
print(data)
data = np.ones((2, 4))
print(data)
data_zero = np.zeros_like(data)
print(data_zero)
'''
Output:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]]
'''
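As a small sketch of how the _like variants pick up their shape and data type, the example below uses a made-up int32 array named template; the name and the values are only illustrative.

```python
import numpy as np

# A hypothetical template array with an explicit integer dtype.
template = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

z = np.zeros_like(template)                   # shape and dtype copied from template
o = np.ones_like(template, dtype=np.float64)  # shape copied, dtype overridden

print(z.shape, z.dtype)  # (2, 3) int32
print(o.shape, o.dtype)  # (2, 3) float64
```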

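Creating evenly spaced arrays: np.arange()

np.arange(start: int | float, stop: int | float, step: int | float)

Creates an array whose first element is start and whose subsequent elements are taken in increments of step, stopping before stop (stop itself is excluded). The arguments are usually integers; floats are accepted too, but with a non-integer step the number of elements can be affected by rounding, so np.linspace() is generally the safer choice in that case.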
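A short sketch of np.arange() may help here; the particular start, stop, and step values are arbitrary choices for illustration.

```python
import numpy as np

a = np.arange(2, 10, 3)        # start=2, step=3, stop=10 is excluded
b = np.arange(5)               # a single argument is treated as stop
c = np.arange(0.0, 1.0, 0.25)  # float arguments are accepted as well

print(a.tolist())  # [2, 5, 8]
print(b.tolist())  # [0, 1, 2, 3, 4]
print(c.tolist())  # [0.0, 0.25, 0.5, 0.75]
```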

Creating arrays with a given number of elements: np.linspace()

np.linspace(start: float, stop: float, num: int, endpoint: bool)

Creates an array of num floats whose first element is start; endpoint (True by default) indicates whether stop is included as the last element.

If endpoint is True, num floats are taken on $[start, stop]$ with step $\displaystyle \frac{stop - start}{num-1}$;
if endpoint is False, num floats are taken on $[start, stop)$ with step $\displaystyle \frac{stop - start}{num}$.

python

>>> np.linspace(2.0, 3.0, num=5)
array([2.  , 2.25, 2.5 , 2.75, 3.  ])
>>> np.linspace(2.0, 3.0, num=5, endpoint=False)
array([2. ,  2.2,  2.4,  2.6,  2.8])
>>> np.linspace(2.0, 3.0, num=5, retstep=True)
(array([2.  ,  2.25,  2.5 ,  2.75,  3.  ]), 0.25)
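The two step formulas can be checked with retstep=True, which makes np.linspace() return the step alongside the array. This is only a quick sanity check using the same endpoints as the session above.

```python
import numpy as np

_, step_closed = np.linspace(2.0, 3.0, num=5, retstep=True)
_, step_open = np.linspace(2.0, 3.0, num=5, endpoint=False, retstep=True)

print(step_closed)  # (3 - 2) / (5 - 1) = 0.25
print(step_open)    # (3 - 2) / 5 = 0.2
```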

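Creating an identity matrix (array): np.eye()

np.eye(N: int, M: int, k: int)

Creates a matrix of shape (N, M), where M defaults to N. When k is omitted, the ones lie on the main diagonal; otherwise the diagonal of ones is shifted by k, upward for positive k and downward for negative k.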

python

a = np.eye(3)
print(a)
b = np.eye(3, 5)
print(b)
c = np.eye(3, 5, k=2)
print(c)
'''
Output:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]]
[[0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]]
'''
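The offset k can also be negative, which the example above does not cover. A minimal sketch:

```python
import numpy as np

# k=-1 places the ones on the first diagonal below the main one.
lower = np.eye(3, k=-1)

print(lower.tolist())
# [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```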

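Array conversion

That is, converting other data types into ndarray, NumPy's built-in array type.

Converting a list to an array: np.array()

np.array(obj: Sequence)

A very commonly used method that converts obj, most often a List, into an array.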
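A minimal sketch of the conversion, using a made-up nested list. The optional dtype argument shown here is part of np.array() but is not discussed above.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]  # illustrative data

arr = np.array(nested)                      # dtype is inferred from the values
arr_f = np.array(nested, dtype=np.float32)  # dtype given explicitly

print(arr.shape)    # (2, 3)
print(arr_f.dtype)  # float32
```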

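Reading a file into an array: np.loadtxt()

np.loadtxt(fname, dtype, delimiter)

Reads the contents of a text file into an array; the data type and the delimiter used in the file can be specified.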
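The sketch below shows a typical call. To keep it self-contained, an in-memory io.StringIO object stands in for a real file (np.loadtxt() accepts file-like objects as well as paths); the comma-separated content is invented for the example.

```python
import io

import numpy as np

# Hypothetical CSV content, held in memory instead of on disk.
fake_file = io.StringIO("1,2,3\n4,5,6\n")

table = np.loadtxt(fake_file, dtype=float, delimiter=",")

print(table.shape)  # (2, 3)
```

With a real file, the first argument would simply be its path.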

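Array concatenation

The most general concatenation method: np.concatenate()

np.concatenate(arrays: List[np.ndarray], axis: int)

Concatenates the arrays in arrays along the axis given by axis and returns a new array. All arrays must have the same shape except along that axis.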

python

array_list = []
for i in range(5):
    array_list.append(np.random.randint(0, 10, size=(2, 1, 3)))  # each element is a 3-D array of shape (2, 1, 3)
# array_list is a list of 5 arrays of shape (2, 1, 3)
data = np.concatenate(array_list, axis=1)
print(data.shape)
'''
Output:
(2, 5, 3)
'''

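Horizontal and vertical stacking: np.hstack()/np.vstack()

np.hstack(tup: List) is equivalent to np.concatenate(arrays, axis=1) (for 1-D arrays, it concatenates along axis=0)
np.vstack(tup: List) is equivalent to np.concatenate(arrays, axis=0) (1-D arrays of shape (N,) are first reshaped to (1, N))

Both methods are special cases of array concatenation: np.hstack() stacks arrays horizontally, i.e. along the second axis, while np.vstack() stacks them vertically, i.e. along the first axis.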

python

array_list = []
for i in range(5):
    array_list.append(np.random.randint(0, 10, size=(2, 1, 3)))  # each element is a 3-D array of shape (2, 1, 3)
# array_list is a list of 5 arrays of shape (2, 1, 3)
data_h = np.hstack(array_list)
data_v = np.vstack(array_list)
print(data_h.shape)
print(data_v.shape)
'''
Output:
(2, 5, 3)
(10, 1, 3)
'''
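One-dimensional inputs behave a little differently, since they have no second axis. A short sketch with two made-up vectors:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

h = np.hstack([x, y])  # 1-D inputs are joined end to end
v = np.vstack([x, y])  # each 1-D input becomes one row

print(h.shape)  # (6,)
print(v.shape)  # (2, 3)
```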

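Reshaping arrays

The most basic reshaping method: reshape()

np.reshape(a: np.ndarray, shape: int | Tuple[int])
np.ndarray.reshape(shape: Tuple[int]) or np.ndarray.reshape(d_0, d_1, d_2, ...)

This method is simple and practical: it reshapes an array to the given shape $(d_0, d_1, d_2, \dots)$, provided the total number of elements stays the same.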

python

B, N, T, D = 2, 5, 10, 8
a = np.random.rand(B*N, T, D)
a_new = np.reshape(a, (B, N, T, D))
print(a_new.shape)
b_new = a.reshape(B, N, T, D)
print(b_new.shape)
'''
Output:
(2, 5, 10, 8)
(2, 5, 10, 8)
'''
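reshape() also accepts -1 for at most one axis, in which case NumPy infers that axis from the total number of elements. A small sketch reusing the same B, N, T, D values:

```python
import numpy as np

B, N, T, D = 2, 5, 10, 8
a = np.random.rand(B * N, T, D)

merged = a.reshape(B * N, -1)   # the last two axes are merged: T * D = 80
split = a.reshape(B, -1, T, D)  # the inferred axis has length N

print(merged.shape)  # (10, 80)
print(split.shape)   # (2, 5, 10, 8)
```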

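Flattening: np.ndarray.flatten()

np.ndarray.flatten(order)

This method flattens a (multidimensional) array into a one-dimensional array and returns a copy. The order parameter specifies the order in which the elements are read; the default, 'C', is row-major order.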

python

B, N, T, D = 2, 5, 10, 8
a = np.random.rand(B*N, T, D)
b = a.flatten()
print(b.shape)
'''
Output:
(800,)
'''
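The effect of order is easier to see on a small array with known values. The sketch below compares the default row-major order with column-major order ('F') and also shows that the result is a copy.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

row_major = a.flatten()           # order='C': read row by row
col_major = a.flatten(order='F')  # order='F': read column by column

print(row_major.tolist())  # [1, 2, 3, 4, 5, 6]
print(col_major.tolist())  # [1, 4, 2, 5, 3, 6]

# flatten() returns a copy, so modifying it leaves a untouched.
row_major[0] = 100
print(a[0, 0])  # 1
```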

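Removing dimensions: squeeze()

np.ndarray.squeeze(axis: int)
np.squeeze(a: np.ndarray, axis: int)

This method removes the specified axis, but note that only an axis of length 1 can be removed. If axis is omitted, all axes of length 1 are removed.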

python

a = np.random.rand(2, 1, 3)
b = a.squeeze(1)  # remove the second axis; this works only because its length is 1
print(b.shape)
c = np.squeeze(a, axis=1)
print(c.shape)
'''
Output:
(2, 3)
(2, 3)
'''
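Two further cases are worth a quick sketch: calling squeeze() without axis, and asking it to remove an axis whose length is not 1. The array shape is chosen arbitrarily, and the flag name squeeze_failed is only for this illustration.

```python
import numpy as np

a = np.zeros((1, 2, 1, 3))

all_removed = a.squeeze()        # drops every axis of length 1
one_removed = a.squeeze(axis=0)  # drops only the first axis

print(all_removed.shape)  # (2, 3)
print(one_removed.shape)  # (2, 1, 3)

# Axis 1 has length 2, so it cannot be squeezed.
try:
    a.squeeze(axis=1)
    squeeze_failed = False
except ValueError as err:
    squeeze_failed = True
    print("cannot squeeze:", err)
```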

Adding dimensions: np.expand_dims()

np.expand_dims(a: np.ndarray, axis: int)

Inserts a new axis of length 1 at the position given by axis.

python

a = np.random.rand(2, 3)
b = np.expand_dims(a, axis=0)
c = np.expand_dims(a, axis=1)
d = np.expand_dims(a, axis=2)
print(b.shape, c.shape, d.shape)
'''
Output:
(1, 2, 3) (2, 1, 3) (2, 3, 1)
'''

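Repeating arrays

Repeating within the same dimensions: repeat()

Some tasks require repeating certain elements of an array to obtain a new array.

np.repeat(a: np.ndarray, repeats: int | List, axis: int)
np.ndarray.repeat(repeats: int | List, axis: int)

This method repeats each element of the array along the specified axis. The repeats parameter can be a single integer or a list of integers giving a separate count for each element along that axis; see the example.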

python

a = np.random.randint(0, 10, size=(2, 3))
print(a, a.shape)
b = np.repeat(a, repeats=2, axis=0)
print(b, b.shape)
c = a.repeat(2, axis=1)
print(c, c.shape)
d = a.repeat([1, 2], axis=0)
print(d, d.shape)
'''
Output:
[[2 5 2]
 [5 7 5]] (2, 3)
[[2 5 2]
 [2 5 2]
 [5 7 5]
 [5 7 5]] (4, 3)
[[2 2 5 5 2 2]
 [5 5 7 7 5 5]] (2, 6)
[[2 5 2]
 [5 7 5]
 [5 7 5]] (3, 3)
'''

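With dimension expansion: tile()

np.tile(A: np.ndarray, reps: int | Sequence[int])

This method resembles np.repeat() above, with two main differences. First, it arranges the repeated elements differently: the array is repeated as a whole block rather than element by element. Second, it supports dimension expansion: A.ndim and len(reps) need not be equal, and the new array has max(A.ndim, len(reps)) dimensions. Specifically, suppose the array A has shape $(d_1, d_2,\dots,d_n)$ and reps is $(r_1,r_2,\dots,r_m)$; then

if $n = m$, i.e. A.ndim == len(reps), no dimension expansion is needed and the new array has shape $(d_1\times r_1, d_2\times r_2, \dots,d_n\times r_n)$;
if $n < m$, i.e. A.ndim < len(reps), A is first promoted (by prepending axes of length 1) to shape $(c_1,c_2,\dots,c_{m-n},d_1,d_2,\dots,d_n)$, where $c_1=c_2=\dots=c_{m-n}=1$;
if $n>m$, i.e. A.ndim > len(reps), reps is first extended (by prepending ones) to $(c_1,c_2,\dots,c_{n-m},r_1,r_2,\dots,r_m)$, where $c_1=c_2=\dots=c_{n-m}=1$.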

python

a = np.random.randint(0, 10, size=(2, 3))
print(a, a.shape) # a.ndim == 2
b = np.tile(a, reps=(2, 3)) # a.ndim == len(reps)
print(b, b.shape)
c = np.tile(a, reps=(2, 3, 2)) # a.ndim < len(reps), a is promoted to shape (1, 2, 3)
print(c, c.shape)
d = np.tile(a, reps=2) # a.ndim > len(reps), reps is 2 but is treated as (1, 2)
print(d, d.shape)
'''
Output:
[[0 2 3]
 [8 2 0]] (2, 3)
[[0 2 3 0 2 3 0 2 3]
 [8 2 0 8 2 0 8 2 0]
 [0 2 3 0 2 3 0 2 3]
 [8 2 0 8 2 0 8 2 0]] (4, 9)
[[[0 2 3 0 2 3]
  [8 2 0 8 2 0]
  [0 2 3 0 2 3]
  [8 2 0 8 2 0]
  [0 2 3 0 2 3]
  [8 2 0 8 2 0]]

 [[0 2 3 0 2 3]
  [8 2 0 8 2 0]
  [0 2 3 0 2 3]
  [8 2 0 8 2 0]
  [0 2 3 0 2 3]
  [8 2 0 8 2 0]]] (2, 6, 6)
[[0 2 3 0 2 3]
 [8 2 0 8 2 0]] (2, 6)
'''
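The difference in how np.repeat() and np.tile() arrange their output is clearest on a one-dimensional array. A minimal side-by-side sketch:

```python
import numpy as np

a = np.array([1, 2, 3])

repeated = np.repeat(a, 2)  # each element is repeated in place
tiled = np.tile(a, 2)       # the whole array is repeated as a block
grid = np.tile(a, (2, 2))   # a is promoted to shape (1, 3) first

print(repeated.tolist())  # [1, 1, 2, 2, 3, 3]
print(tiled.tolist())     # [1, 2, 3, 1, 2, 3]
print(grid.shape)         # (2, 6)
```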

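Sampling

Sometimes elements of an array need to be sampled at random, and numpy comes in handy here.

Random sampling: np.random.choice()

np.random.choice(a: int | 1-D array, size: int | Tuple[int], replace: boolean=True)

This method draws a random sample of size elements from the sequence a (if a is an integer, from np.arange(a)); replace specifies whether sampling is done with replacement:

(default) if replace is True, sampling is done with replacement, so the same element may be drawn more than once;
if replace is False, sampling is done without replacement, so no element is drawn more than once.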

python

data = [3, 5, 1, 9, 4, -1, 8]
choice_data = np.random.choice(data, size=5, replace=False)
print(choice_data, type(choice_data))
'''
Output:
[5 1 4 9 8] <class 'numpy.ndarray'>
'''
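The sketch below covers the cases the example above leaves out: sampling with replacement, passing an integer as a, and requesting more samples than there are elements without replacement. The flag name oversample_failed is only for this illustration.

```python
import numpy as np

data = [3, 5, 1, 9, 4, -1, 8]

# With replacement, size may exceed len(data).
with_repl = np.random.choice(data, size=10, replace=True)

# An integer a means "sample from np.arange(a)", which is handy for indices.
idx = np.random.choice(5, size=3, replace=False)

# Without replacement, asking for more than len(data) samples is an error.
try:
    np.random.choice(data, size=10, replace=False)
    oversample_failed = False
except ValueError:
    oversample_failed = True

print(with_repl.shape, idx, oversample_failed)
```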

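Random permutation: np.random.permutation()

np.random.permutation(x: int | array)

It returns a randomly shuffled copy of the array x. If x is an integer, it shuffles the one-dimensional array $[0,1,\dots,x-1]$; if x is a multidimensional array, only the first axis is shuffled.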

python

out_of_order_data = np.random.permutation(10)
print(out_of_order_data)
print('----------------------')
data = np.random.randint(0, 10, size=(5, 3))
print(data, '(before shuffle)')
out_of_order_data = np.random.permutation(data)
print(out_of_order_data, 'after shuffle')
'''
[0 5 1 9 7 3 4 6 2 8]
----------------------
[[4 5 4]
 [0 8 6]
 [6 2 7]
 [2 9 6]
 [2 9 9]] (before shuffle)
[[0 8 6]
 [2 9 6]
 [2 9 9]
 [4 5 4]
 [6 2 7]] after shuffle
'''
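A quick sketch, on a made-up 5×3 array, confirming two properties: the input array is left unchanged, and each row stays intact while only the row order changes.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)
shuffled = np.random.permutation(data)

# The original array is not modified.
print(np.array_equal(data, np.arange(15).reshape(5, 3)))  # True

# Sorting the shuffled rows recovers the original rows.
print(sorted(shuffled.tolist()) == data.tolist())  # True
```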

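Norms

np.linalg.norm()

numpy.linalg is NumPy's linear algebra module; its norm function is the usual way to compute norms, including the 2-norm, the infinity norm, and so on.

np.linalg.norm(x: ArrayLike, ord: int | float | str)

The ord parameter selects which norm to compute (for example 1, 2, np.inf, or 'fro'). Unless ord is omitted or the axis parameter is given, x must be a one- or two-dimensional array.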
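A minimal sketch of a few common calls, using small made-up inputs whose norms are easy to verify by hand. The axis parameter used in the last call is part of np.linalg.norm() but does not appear in the signature above.

```python
import numpy as np

v = np.array([3.0, -4.0])

print(np.linalg.norm(v))              # 2-norm (the default): 5.0
print(np.linalg.norm(v, ord=1))       # 1-norm: 7.0
print(np.linalg.norm(v, ord=np.inf))  # infinity norm: 4.0

m = np.array([[1.0, 2.0], [3.0, 4.0]])

fro = np.linalg.norm(m)                # Frobenius norm for a 2-D input: sqrt(30)
row_norms = np.linalg.norm(m, axis=1)  # 2-norm of each row: sqrt(5) and 5
```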

鹿卿

https://luqingbys.github.io/posts/63aa.html

Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit 鹿卿 as the source when reposting!
