Reshaping operations

Suppose we have the following tensor:

import torch

t = torch.tensor([
    [1,1,1,1],
    [2,2,2,2],
    [3,3,3,3]
], dtype=torch.float32)

We have two ways to get the shape:

> t.size()
torch.Size([3, 4])

> t.shape
torch.Size([3, 4])

The rank of a tensor is equal to the length of the tensor's shape.

> len(t.shape)
2

We can also deduce the number of elements contained within the tensor.

> torch.tensor(t.shape).prod()
tensor(12)

In PyTorch, there is a dedicated function for this:

> t.numel()
12

Reshaping a tensor in PyTorch

> t.reshape([2,6])
tensor([[1., 1., 1., 1., 2., 2.],
        [2., 2., 3., 3., 3., 3.]])

> t.reshape([3,4])
tensor([[1., 1., 1., 1.],
        [2., 2., 2., 2.],
        [3., 3., 3., 3.]])

> t.reshape([4,3])
tensor([[1., 1., 1.],
        [1., 2., 2.],
        [2., 2., 3.],
        [3., 3., 3.]])

> t.reshape(6,2)
tensor([[1., 1.],
        [1., 1.],
        [2., 2.],
        [2., 2.],
        [3., 3.],
        [3., 3.]])

> t.reshape(12,1)
tensor([[1.],
        [1.],
        [1.],
        [1.],
        [2.],
        [2.],
        [2.],
        [2.],
        [3.],
        [3.],
        [3.],
        [3.]])

In this example, we increase the rank to 3:

> t.reshape(2,2,3)
tensor(
[
    [
        [1., 1., 1.],
        [1., 2., 2.]
    ],
    [
        [2., 2., 3.],
        [3., 3., 3.]
    ]
])

Note: PyTorch has another function, view(), that does the same thing as reshape(). The difference is that view() requires the new shape to be compatible with how the tensor is laid out in memory and never copies data, while reshape() returns a copy when a view is not possible.
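The difference shows up with tensors that are not contiguous in memory, such as a transpose; a quick sketch reusing the t defined above:

> t.t().reshape(12)  # works: reshape() falls back to copying
tensor([1., 2., 3., 1., 2., 3., 1., 2., 3., 1., 2., 3.])

> t.t().view(12)     # raises a RuntimeError: view() never copies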

Changing shape by squeezing and unsqueezing

These functions allow us to expand or shrink the rank (number of dimensions) of our tensor.

> print(t.reshape([1,12]))
> print(t.reshape([1,12]).shape)
tensor([[1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.]])
torch.Size([1, 12])

> print(t.reshape([1,12]).squeeze())
> print(t.reshape([1,12]).squeeze().shape)
tensor([1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.])
torch.Size([12])

> print(t.reshape([1,12]).squeeze().unsqueeze(dim=0))
> print(t.reshape([1,12]).squeeze().unsqueeze(dim=0).shape)
tensor([[1., 1., 1., 1., 2., 2., 2., 2., 3., 3., 3., 3.]])
torch.Size([1, 12])
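Note that squeeze() only removes axes with a length of one; all other axes are left untouched. A quick check on a fresh tensor:

> torch.ones(2, 1, 3).squeeze().shape
torch.Size([2, 3])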

Let’s look at a common use case for squeezing a tensor by building a flatten function.

Flatten a tensor

Flattening a tensor means removing all of the dimensions except one.

A flatten operation reshapes the tensor to have a single axis whose length is equal to the number of elements contained in the tensor. This is the same thing as a 1d array of elements.

def flatten(t):
    t = t.reshape(1, -1)
    t = t.squeeze()
    return t
> t = torch.ones(4, 3)
> t
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])

> flatten(t)
tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

We'll see that flatten operations are required when passing an output tensor from a convolutional layer to a linear layer.

In these examples, we have flattened the entire tensor; however, it is possible to flatten only specific parts of a tensor. For example, suppose we have a tensor of shape [2,1,28,28] for a CNN. This means that we have a batch of 2 grayscale images with height and width dimensions of 28 x 28, respectively.

Here, we can flatten each of the two images to get the shape [2,1,784]. We could also squeeze off the channel axis to get the shape [2,784].
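Both of these can be expressed with flatten()'s start_dim parameter, which we cover in detail below; a quick sketch (the all-ones tensor is just a stand-in for real image data):

> t = torch.ones(2, 1, 28, 28)

> t.flatten(start_dim=2).shape
torch.Size([2, 1, 784])

> t.flatten(start_dim=1).shape
torch.Size([2, 784])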

Concatenating tensors

We combine tensors using the cat() function.

> t1 = torch.tensor([
    [1,2],
    [3,4]
])
> t2 = torch.tensor([
    [5,6],
    [7,8]
])

We can combine t1 and t2 row-wise (axis-0) in the following way:

> torch.cat((t1, t2), dim=0)
tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])

We can combine them column-wise (axis-1) like this:

> torch.cat((t1, t2), dim=1)
tensor([[1, 2, 5, 6],
        [3, 4, 7, 8]])
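Notice that cat() extends an existing axis rather than adding a new one, so the rank stays the same while the shape grows along the chosen axis:

> torch.cat((t1, t2), dim=0).shape
torch.Size([4, 2])

> torch.cat((t1, t2), dim=1).shape
torch.Size([2, 4])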

Flatten operation for a batch of image inputs to a CNN

Flattening specific axes of a tensor

We know that the tensor inputs to a convolutional neural network typically have 4 axes, one for batch size, one for color channels, and one each for height and width.

(Batch Size, Channels, Height, Width)

To start, suppose we have the following three tensors.

t1 = torch.tensor([
    [1,1,1,1],
    [1,1,1,1],
    [1,1,1,1],
    [1,1,1,1]
])

t2 = torch.tensor([
    [2,2,2,2],
    [2,2,2,2],
    [2,2,2,2],
    [2,2,2,2]
])

t3 = torch.tensor([
    [3,3,3,3],
    [3,3,3,3],
    [3,3,3,3],
    [3,3,3,3]
])

Remember, batches are represented using a single tensor, so we’ll need to combine these three tensors into a single larger tensor that has three axes instead of two.

> t = torch.stack((t1, t2, t3))
> t.shape
torch.Size([3, 4, 4])

Here, we used the stack() function to concatenate our sequence of three tensors along a new axis.
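Stacking is equivalent to unsqueezing each tensor along a new first axis and then concatenating along that axis:

> torch.cat(
      (t1.unsqueeze(0), t2.unsqueeze(0), t3.unsqueeze(0)),
      dim=0
  ).shape
torch.Size([3, 4, 4])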

> t
tensor([[[1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]],

        [[2, 2, 2, 2],
         [2, 2, 2, 2],
         [2, 2, 2, 2],
         [2, 2, 2, 2]],

        [[3, 3, 3, 3],
         [3, 3, 3, 3],
         [3, 3, 3, 3],
         [3, 3, 3, 3]]])

All we need to do now to get this tensor into a form that a CNN expects is add an axis for the color channels. We basically have an implicit single color channel for each of these image tensors, so in practice, these would be grayscale images.

> t = t.reshape(3,1,4,4)
> t
tensor(
[
    [
        [
            [1, 1, 1, 1],
            [1, 1, 1, 1],
            [1, 1, 1, 1],
            [1, 1, 1, 1]
        ]
    ],
    [
        [
            [2, 2, 2, 2],
            [2, 2, 2, 2],
            [2, 2, 2, 2],
            [2, 2, 2, 2]
        ]
    ],
    [
        [
            [3, 3, 3, 3],
            [3, 3, 3, 3],
            [3, 3, 3, 3],
            [3, 3, 3, 3]
        ]
    ]
])
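An equivalent way to add the length-one channel axis is unsqueeze(), which inserts a new axis at the given position:

> torch.stack((t1, t2, t3)).unsqueeze(dim=1).shape
torch.Size([3, 1, 4, 4])
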
Flattening the tensor batch

Here are some alternative implementations of the flatten() function.

> t.reshape(1,-1)[0]
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

> t.reshape(-1)
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

> t.view(t.numel())
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

> t.flatten()
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2,
        2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3])

This flattened batch won’t work well inside our CNN because we need individual predictions for each image within our batch tensor, and now we have a flattened mess.

The solution here is to flatten each image while still maintaining the batch axis. This means we want to flatten only part of the tensor: the color channel axis together with the height and width axes.

> t.flatten(start_dim=1).shape
torch.Size([3, 16])

> t.flatten(start_dim=1)
tensor(
[
[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3]
]
)
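The start_dim parameter tells flatten() which axis to start flattening at; that axis and everything after it are collapsed into one. The same result can be produced with a reshape that keeps the batch axis and lets PyTorch infer the rest:

> t.reshape(t.shape[0], -1).shape
torch.Size([3, 16])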
