Commit 4a8d46a6 authored by Erik Senn's avatar Erik Senn

Replace 2_optional_tensor_intro.ipynb

parent c8f9dc43
%% Cell type:markdown id: tags:

# Setup and data

GPU required? No

%% Cell type:code id: tags:

``` python
# Imports (note that you also need imports from the .py function files)
import numpy as np
import torch  # PyTorch / ML tool
```

%% Cell type:markdown id: tags:

# Tensors*
Tensors are essentially n-dimensional arrays, similar to numpy ndarrays, with additional functionality that makes them very useful for machine-learning tasks.

*Note: Some tasks here can also be done using standard np.arrays.*

**Below, look at some features of tensors**:
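%% Cell type:markdown id: tags:

As a quick warm-up before those features (a minimal sketch; the values are purely illustrative), tensors are created and manipulated much like numpy arrays:

%% Cell type:code id: tags:

``` python
import torch

# Create a tensor directly from a nested Python list
t = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Familiar array-like attributes, analogous to numpy ndarrays
print(t.shape)  # torch.Size([2, 3])
print(t.dtype)  # torch.float32

# Elementwise operations work as with numpy ndarrays
print(t * 2)
```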

%% Cell type:markdown id: tags:

## GPU Support

Tensor **computations can be conducted on the GPU** (as well as on the CPU).
For this, we need to move the data to the computing device.

*Note*: The GPU is only used when one is available; otherwise the code below falls back to the CPU.

%% Cell type:code id: tags:

``` python
# Identify available devices: take GPU if available, else CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

# Define a tensor and move it to the computing device
tensor = torch.tensor([1, 2, 3]).to(device)  # GPU support
tensor = tensor + tensor
print(tensor)
```

%% Output

    cpu
    tensor([2, 4, 6])
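%% Cell type:markdown id: tags:

Alternatively (a small sketch of the same idea), a tensor can be allocated on the target device directly via the `device` argument, which avoids the extra copy that `.to(device)` performs:

%% Cell type:code id: tags:

``` python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Allocate the tensor directly on the computing device
tensor = torch.tensor([1, 2, 3], device=device)
print(tensor.device)  # cuda:0 if a GPU is available, else cpu
```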

%% Cell type:markdown id: tags:

## Gradient computation

Tensors support **automatic differentiation** via the autograd system.
This makes it possible to **compute gradients for any computational graph**.
Gradients are required for gradient-based learning of optimal parameters of a model (training process).

Explanation of the code example and result below:
- Goal: Compute the gradient $\frac{dc}{da}$ (the change in $c$ when $a$ changes).
- We define a computational graph from $a$ to $b$ to $c$ (*forward pass*).
- The gradient computation from $c$ back to $a$ uses the chain rule (the leading factor $\frac{dc}{dc} = 1$ is the seed of the backward pass):
  - $\frac{dc}{da} = \frac{dc}{dc} \cdot \frac{dc}{db} \cdot \frac{db}{da} = 1 \cdot 5 \cdot 2a = 1 \cdot 5 \cdot 3 = 15$

*Note*: When training a neural net, the backpropagation algorithm computes the gradients as above and then updates the trainable parameters using an optimizer such as stochastic gradient descent.

%% Cell type:code id: tags:

``` python
a = torch.tensor(1.5, requires_grad=True)
b = a**2
c = 5 * b
c.backward()
print(a.grad)
```

%% Output

    tensor(15.)
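%% Cell type:markdown id: tags:

To illustrate the note above (a minimal sketch with an arbitrarily chosen learning rate, not the full training machinery), the computed gradient can be used for a single manual gradient-descent update of $a$:

%% Cell type:code id: tags:

``` python
import torch

a = torch.tensor(1.5, requires_grad=True)
c = 5 * a**2
c.backward()            # a.grad is now dc/da = 10 * a = 15

lr = 0.1                # learning rate (illustrative choice)
with torch.no_grad():   # parameter updates must not be tracked by autograd
    a -= lr * a.grad    # gradient-descent step: a = 1.5 - 0.1 * 15 = 0.0
a.grad.zero_()          # reset the gradient before the next backward pass
print(a)
```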

%% Cell type:markdown id: tags:

When `requires_grad=False`, the backward pass does not work, since no computational graph is recorded for the tensor.

%% Cell type:code id: tags:

``` python
try:
    a = torch.tensor(1.5, requires_grad=False)
    b = a**2
    c = 5 * b
    c.backward()
    print(a.grad)
except Exception as e:
    print("Error: ", e)
```

%% Output

    Error:  element 0 of tensors does not require grad and does not have a grad_fn

%% Cell type:markdown id: tags:

## Conversion to numpy

Tensors and np.ndarrays **can be converted** into each other (the gradient information is not carried over).

%% Cell type:code id: tags:

``` python
torch_tensor = torch.tensor([1.0, 2.0, 3.0], requires_grad=False)
numpy_array = torch_tensor.numpy()
print(numpy_array)
torch_tensor = torch.from_numpy(numpy_array)
print(torch_tensor)
```
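%% Cell type:markdown id: tags:

One caveat (a short sketch of the behavior, not shown in the cell above): a tensor that tracks gradients cannot be converted to numpy directly; it must first be detached from the computational graph:

%% Cell type:code id: tags:

``` python
import torch

t = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
try:
    t.numpy()  # fails: the tensor still tracks gradients
except RuntimeError as e:
    print("Error: ", e)

# Detach first to get a gradient-free view, then convert
arr = t.detach().numpy()
print(arr)
```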