Commit 4a8d46a6 authored by Erik Senn's avatar Erik Senn

Replace 2_optional_tensor_intro.ipynb

parent c8f9dc43
%% Cell type:markdown id: tags:

# Setup and data

GPU required? No

%% Cell type:code id: tags:

``` python
# Imports (note that you also need imports from the .py function files)
import numpy as np
import torch  # PyTorch / ML tool
```

%% Cell type:markdown id: tags:

# Tensors*
Tensors are essentially n-dimensional arrays, similar to numpy ndarrays, with additional functionality that makes them very useful for machine-learning tasks.

*Note: Some tasks here can also be done using standard np.arrays.*

**Below, look at some features of tensors**:
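%% Cell type:markdown id: tags:

As a quick warm-up before those features (a minimal sketch; the values are purely illustrative), tensors are created and manipulated much like numpy arrays:

%% Cell type:code id: tags:

``` python
import torch

# Create a tensor directly from a nested Python list
t = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Familiar array-like attributes, analogous to numpy ndarrays
print(t.shape)  # torch.Size([2, 3])
print(t.dtype)  # torch.float32

# Elementwise operations work as with numpy ndarrays
print(t * 2)
```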

%% Cell type:markdown id: tags:

## GPU Support

Tensor **computations can be conducted on the GPU** (as well as on the CPU).
For this, we need to move the data to the computing device.

*Note*: The GPU is only used when one is available; otherwise the code below falls back to the CPU.

%% Cell type:code id: tags:

``` python
# Identify available devices: take GPU if available, else CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

# Define a tensor and move it to the computing device
tensor = torch.tensor([1, 2, 3]).to(device)  # GPU support
tensor = tensor + tensor
print(tensor)
```

%% Output

    cpu
    tensor([2, 4, 6])
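%% Cell type:markdown id: tags:

Alternatively (a small sketch of the same idea), a tensor can be allocated on the target device directly via the `device` argument, which avoids the extra copy that `.to(device)` performs:

%% Cell type:code id: tags:

``` python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Allocate the tensor directly on the computing device
tensor = torch.tensor([1, 2, 3], device=device)
print(tensor.device)  # cuda:0 if a GPU is available, else cpu
```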

%% Cell type:markdown id: tags:

## Gradient computation

Tensors support **automatic differentiation** via the autograd system.
This makes it possible to **compute gradients for any computational graph**.
Gradients are required for gradient-based learning of optimal parameters of a model (training process).

Explanation of the code example and result below:
- Goal: Compute the gradient $\frac{dc}{da}$ (the change in $c$ when $a$ changes).
- We define a computational graph from $a$ to $b$ to $c$ (*forward pass*).
- The gradient computation from $c$ back to $a$ uses the chain rule (the leading factor $\frac{dc}{dc} = 1$ is the seed of the backward pass):
  - $\frac{dc}{da} = \frac{dc}{dc} \cdot \frac{dc}{db} \cdot \frac{db}{da} = 1 \cdot 5 \cdot 2a = 1 \cdot 5 \cdot 3 = 15$

*Note*: When training a neural net, the backpropagation algorithm computes the gradients as above and then updates the trainable parameters using an optimizer such as stochastic gradient descent.

%% Cell type:code id: tags:

``` python
a = torch.tensor(1.5, requires_grad=True)
b = a**2
c = 5 * b
c.backward()
print(a.grad)
```

%% Output

    tensor(15.)
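%% Cell type:markdown id: tags:

To illustrate the note above (a minimal sketch with an arbitrarily chosen learning rate, not the full training machinery), the computed gradient can be used for a single manual gradient-descent update of $a$:

%% Cell type:code id: tags:

``` python
import torch

a = torch.tensor(1.5, requires_grad=True)
c = 5 * a**2
c.backward()            # a.grad is now dc/da = 10 * a = 15

lr = 0.1                # learning rate (illustrative choice)
with torch.no_grad():   # parameter updates must not be tracked by autograd
    a -= lr * a.grad    # gradient-descent step: a = 1.5 - 0.1 * 15 = 0.0
a.grad.zero_()          # reset the gradient before the next backward pass
print(a)
```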

%% Cell type:markdown id: tags:

When `requires_grad=False`, the backward pass does not work, since no computational graph is recorded for the tensor.

%% Cell type:code id: tags:

``` python
try:
    a = torch.tensor(1.5, requires_grad=False)
    b = a**2
    c = 5 * b
    c.backward()
    print(a.grad)
except Exception as e:
    print("Error: ", e)
```

%% Output

    Error:  element 0 of tensors does not require grad and does not have a grad_fn

%% Cell type:markdown id: tags:

## Conversion to numpy

Tensors and np.ndarrays **can be converted** into each other (the gradient information is not carried over).

%% Cell type:code id: tags:

``` python
torch_tensor = torch.tensor([1.0, 2.0, 3.0], requires_grad=False)
numpy_array = torch_tensor.numpy()
print(numpy_array)
torch_tensor = torch.from_numpy(numpy_array)
print(torch_tensor)
```
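%% Cell type:markdown id: tags:

One caveat (a short sketch of the behavior, not shown in the cell above): a tensor that tracks gradients cannot be converted to numpy directly; it must first be detached from the computational graph:

%% Cell type:code id: tags:

``` python
import torch

t = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
try:
    t.numpy()  # fails: the tensor still tracks gradients
except RuntimeError as e:
    print("Error: ", e)

# Detach first to get a gradient-free view, then convert
arr = t.detach().numpy()
print(arr)
```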