PyTorch Tensor Basics¶
This section covers:
* Converting NumPy arrays to PyTorch tensors
* Creating tensors from scratch
Perform standard imports¶
import torch
import numpy as np
Confirm you're using PyTorch version 1.1.0+
torch.__version__
Converting NumPy arrays to PyTorch tensors¶
A torch.Tensor is a multi-dimensional matrix containing elements of a single data type.
Calculations between tensors can only happen if the tensors share the same dtype.
In some cases tensors are used as a replacement for NumPy arrays in order to harness the power of GPUs (more on this later).
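As a quick illustration (not part of the original notebook), one way to satisfy the shared-dtype rule is to cast one tensor explicitly before combining them:
# Hedged sketch: cast to a common dtype before element-wise arithmetic
a = torch.tensor([1, 2, 3], dtype=torch.int64)
b = torch.tensor([1.5, 2.5, 3.5], dtype=torch.float32)
print(a.type(torch.float32) + b)    # tensor([2.5000, 4.5000, 6.5000])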
arr = np.array([1,2,3,4,5])
print(arr)
print(arr.dtype)
print(type(arr))
x = torch.from_numpy(arr)
# Equivalent to x = torch.as_tensor(arr)
print(x)
# Print the type of data held by the tensor
print(x.dtype)
# Print the tensor object type
print(type(x))
print(x.type()) # this is more specific!
arr2 = np.arange(0.,12.).reshape(4,3)
print(arr2)
x2 = torch.from_numpy(arr2)
print(x2)
print(x2.type())
Here torch.DoubleTensor refers to 64-bit floating point data.
Tensor Datatypes¶
TYPE | NAME | EQUIVALENT | TENSOR TYPE |
---|---|---|---|
32-bit integer (signed) | torch.int32 | torch.int | torch.IntTensor |
64-bit integer (signed) | torch.int64 | torch.long | torch.LongTensor |
16-bit integer (signed) | torch.int16 | torch.short | torch.ShortTensor |
32-bit floating point | torch.float32 | torch.float | torch.FloatTensor |
64-bit floating point | torch.float64 | torch.double | torch.DoubleTensor |
16-bit floating point | torch.float16 | torch.half | torch.HalfTensor |
8-bit integer (signed) | torch.int8 | | torch.CharTensor |
8-bit integer (unsigned) | torch.uint8 | | torch.ByteTensor |
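As an illustrative check (not one of the original cells), the dtype and the corresponding tensor class line up as the table suggests:
t = torch.tensor([1, 2, 3], dtype=torch.int16)
print(t.dtype)     # torch.int16
print(t.type())    # torch.ShortTensor
f = torch.tensor([1., 2., 3.], dtype=torch.float64)
print(f.type())    # torch.DoubleTensor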
Copying vs. sharing¶
torch.from_numpy()
torch.as_tensor()
torch.tensor()
There are a number of different functions available for creating tensors. When using torch.from_numpy() and torch.as_tensor(), the PyTorch tensor and the source NumPy array share the same memory. This means that changes to one affect the other. However, the torch.tensor() function always makes a copy.
# Using torch.from_numpy(), shares same memory
arr = np.arange(0,5)
t = torch.from_numpy(arr)
print(t)
arr[2]=77
print(t)
# Using torch.tensor(), makes a copy
arr = np.arange(0,5)
t = torch.tensor(arr)
print(t)
arr[2]=77
print(t)
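torch.as_tensor() behaves like torch.from_numpy() in this situation; here is a short sketch (not one of the original cells) confirming that it also shares memory with the source array:
# Using torch.as_tensor(), also shares memory with the NumPy array
arr = np.arange(0, 5)
t = torch.as_tensor(arr)
print(t)
arr[2] = 77
print(t)    # the change to arr shows up in t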
Class constructors¶
torch.Tensor()
torch.FloatTensor()
torch.LongTensor(), etc.
There's a subtle difference between using the factory function torch.tensor(data) and the class constructor torch.Tensor(data).
The factory function determines the dtype from the incoming data, or from a passed-in dtype argument.
The class constructor torch.Tensor() is simply an alias for torch.FloatTensor(). Consider the following:
data = np.array([1,2,3])
a = torch.Tensor(data)  # Equivalent to a = torch.FloatTensor(data)
print(a, a.type())
b = torch.tensor(data)
print(b, b.type())
c = torch.tensor(data, dtype=torch.long)
print(c, c.type())
Creating tensors from scratch¶
Uninitialized tensors with .empty()¶
torch.empty() returns an uninitialized tensor. Essentially a block of memory is allocated according to the size of the tensor, and any values already sitting in the block are returned. This is similar to the behavior of numpy.empty().
x = torch.empty(4, 3)
print(x)
Initialized tensors with .zeros() and .ones()¶
torch.zeros(size)
torch.ones(size)
It's a good idea to pass in the intended dtype.
x = torch.zeros(4, 3, dtype=torch.int64)
print(x)
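torch.ones() works the same way; a brief sketch with an explicit dtype:
x = torch.ones(4, 3, dtype=torch.float64)
print(x)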
Tensors from ranges¶
torch.arange(start,end,step)
torch.linspace(start,end,steps)
Note that with .arange(), end is exclusive and step is the spacing between values, while with .linspace(), end is inclusive and steps is the number of points to generate.
x = torch.arange(0,18,2).reshape(3,3)
print(x)
x = torch.linspace(0,18,12).reshape(3,4)
print(x)
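To see the exclusive/inclusive difference directly, here is a small side-by-side comparison (not one of the original cells):
print(torch.arange(0, 5))         # tensor([0, 1, 2, 3, 4])          -- 5 is excluded
print(torch.linspace(0, 5, 6))    # tensor([0., 1., 2., 3., 4., 5.]) -- 5 is included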
Tensors from data¶
torch.tensor() will choose the dtype based on incoming data:
x = torch.tensor([1, 2, 3, 4])
print(x)
print(x.dtype)
print(x.type())
# Converting type
x = x.type(torch.int16)
print(x.dtype)
print(x.type())
Alternatively you can set the type according to the tensor constructor you use. For a list of tensor types visit https://pytorch.org/docs/stable/tensors.html
x = torch.FloatTensor([5,6,7])
print(x)
print(x.dtype)
print(x.type())
You can also pass the dtype in as an argument. For a list of dtypes visit https://pytorch.org/docs/stable/tensor_attributes.html#torch.torch.dtype
x = torch.tensor([8,9,-3], dtype=torch.int)
print(x)
print(x.dtype)
print(x.type())
Changing the dtype of existing tensors¶
Don't be tempted to use x = torch.tensor(x, dtype=...) on an existing tensor, as it will raise a warning recommending .clone().detach() instead of torch.tensor() for copying an existing tensor.
Instead, use the tensor .type() method.
print('Old:', x.type())
x = x.type(torch.int64)
print('New:', x.type())
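As an aside not covered in the original cells, the .to() method also accepts a dtype and is another common way to cast an existing tensor:
x = x.to(torch.int16)
print(x.type())    # torch.ShortTensor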
Random number tensors¶
torch.rand(size) returns random samples from a uniform distribution over [0, 1)
torch.randn(size) returns samples from the standard normal distribution (mean 0, standard deviation 1)
Unlike rand, which is uniform, values closer to zero are more likely to appear.
torch.randint(low,high,size) returns random integers from low (inclusive) to high (exclusive)
x = torch.rand(4, 3)
print(x)
x = torch.randn(4, 3)
print(x)
x = torch.randint(0, 5, (4, 3))
print(x)
Random number tensors that follow the input size¶
torch.rand_like(input)
torch.randn_like(input)
torch.randint_like(input,low,high)
These return random number tensors with the same size as input.
x = torch.zeros(2,5)
print(x)
x2 = torch.randn_like(x)
print(x2)
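The other two *_like functions follow the same pattern; a quick sketch (not one of the original cells):
xr = torch.rand_like(x)              # uniform samples in [0, 1), same shape as x
print(xr)
xi = torch.randint_like(x, 0, 10)    # integer values in [0, 10); the dtype follows x (float32 here)
print(xi)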
The same syntax can be used with
torch.zeros_like(input)
torch.ones_like(input)
x3 = torch.ones_like(x2)
print(x3)
Setting the random seed¶
torch.manual_seed(int) is used to obtain reproducible results
torch.manual_seed(42)
x = torch.rand(2, 3)
print(x)
torch.manual_seed(42)
x = torch.rand(2, 3)
print(x)
Tensor attributes¶
Besides dtype, we can look at other tensor attributes like shape, device, and layout.
x.shape
x.size() # equivalent to x.shape
x.device
PyTorch supports the use of multiple devices, harnessing the power of one or more GPUs in addition to the CPU. We won't explore that here, but you should know that operations between tensors can only happen when the tensors live on the same device.
x.layout
PyTorch has a class to hold the memory layout option. The default setting of strided will suit our purposes throughout the course.
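Putting these attributes together, a short illustrative sketch:
x = torch.rand(2, 3)
print(x.shape)     # torch.Size([2, 3])
print(x.size())    # torch.Size([2, 3])
print(x.device)    # cpu (the default device)
print(x.layout)    # torch.strided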