## Tensor Libraries

Data in deep neural networks is typically represented as multi-dimensional arrays, also called *tensors*. Tensors are nowadays the basic data structures running under the hood of most machine learning libraries and are used to learn different data representations through various operations and layers. The most popular library is probably TensorFlow by Google. However, thanks to its ease of use, the Keras library (which is built on top of TensorFlow) has recently even overtaken vanilla TensorFlow in popularity. PyTorch has also gained significant traction over the last years.

## Tensors in R

Tensors can be seen as generalizations of vectors and matrices to an arbitrary number of dimensions.

In R, the analogous objects are `vector` and `matrix` for 1- and 2-dimensional tensors, and `array` for higher dimensions. Training deep neural networks means that we need to find data representations, from the input layer through the (hidden) layers to the output, which fit into *tensors*.

Tensors can be described through the following properties:

- **Rank**: The number of axes or dimensions, e.g. a matrix has rank 2 (2 dimensions), a 3D tensor has rank 3.
- **Shape**: The dimension size along each axis.
- **Data type**: The type of the tensor, which typically falls into the `numeric` category (`integer`, `double`).

See also Deep Learning with R, page 29.

## Exercise: Tensor Attributes

Let’s examine the attributes of our `x_train` data from the MNIST dataset.

```
library(keras)
mnist <- dataset_mnist()
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y
```

Given our loaded MNIST dataset, we can get the *shape* using `dim()`.

The number of axes (*rank*) is simply the length of the *shape* vector.

The *type* of the tensor can be retrieved through `typeof()`.
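All three attributes can be inspected together. The sketch below uses a stand-in array with the same layout as the MNIST training images, so it runs without downloading the dataset:

```r
# Stand-in for x_train: 60000 grayscale images of 28 x 28 pixels
x_train <- array(0L, dim = c(60000, 28, 28))

dim(x_train)          # shape: 60000 28 28
length(dim(x_train))  # rank: 3
typeof(x_train)       # type: "integer"
```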

Example based on Deep Learning with R, page 31

## Practical Examples

From Deep Learning with R, page 32:

For practical applications the data will almost always fall into one of the following categories:

- **Vector data**: 2D tensors of shape `(samples, features)`.
- **Time series data or sequence data**: 3D tensors of shape `(samples, timesteps, features)`.
- **Images**: 4D tensors of shape `(samples, height, width, channels)` or `(samples, channels, height, width)`.
- **Video**: 5D tensors of shape `(samples, frames, height, width, channels)` or `(samples, frames, channels, height, width)`.

How could each of the following real-world examples be encoded as data-tensors?

## Tensor Operations

A deep neural network consists of a set of layers which can be stacked on top of each other using the pipe operator `%>%`. Thanks to this intuitive interface we do not have to think about how exactly the layers are wired together.

We use the function `layer_dense`, define the number of hidden `units` and the popular `"relu"` `activation` function, which stands for *rectified linear unit*:

`layer_dense(units = 512, activation = "relu")`

Activation functions can be used through `layer_activation()`, or via the `activation` argument supported by all forward layers. See also the RStudio keras documentation for other available activation functions and further details.
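As a sketch (assuming keras and a TensorFlow backend are installed; the layer sizes are our own choice), the following two model definitions are equivalent ways to attach the activation:

```r
library(keras)

# Activation via the argument of the dense layer
model_a <- keras_model_sequential() %>%
  layer_dense(units = 512, activation = "relu", input_shape = c(784))

# Activation as a separate layer
model_b <- keras_model_sequential() %>%
  layer_dense(units = 512, input_shape = c(784)) %>%
  layer_activation("relu")
```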

The actual operation has a 2D tensor as input and outputs another 2D tensor as follows:

\[relu( W \cdot input + b)\]

\(input \dots\) input tensor

\(W \dots\) weight tensor

\(b \dots\) bias tensor

We could also write the operations performed using the code below:

`output = relu(dot(W, input) + b)`
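In base R, the same computation can be sketched on small toy tensors (the sizes are chosen only for illustration; this is not the Keras implementation):

```r
relu <- function(x) pmax(x, 0)  # element-wise rectified linear unit

set.seed(1)
W     <- matrix(rnorm(4 * 3), nrow = 4, ncol = 3)  # weight tensor: 4 units, 3 input features
input <- matrix(rnorm(3 * 2), nrow = 3, ncol = 2)  # 2 samples, one per column
b     <- rnorm(4)                                  # bias tensor, one entry per unit

output <- relu(W %*% input + b)  # b is recycled across columns (one column per sample)
dim(output)  # 4 2
```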

## Element-wise Operations

Basic arithmetic (addition, subtraction, multiplication) and `relu` are *element-wise* operations. These operations are performed independently on each element of the tensor and can thus be easily parallelized on multiple CPUs/GPUs. The `relu` operation is defined as

\[relu(x) = max(x, 0)\]

See also Deep Learning with R, page 35.
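A minimal base-R illustration of element-wise semantics:

```r
x <- matrix(c(-2, 0.5, 3, -1), nrow = 2)

x + x        # element-wise addition
x * x        # element-wise multiplication
pmax(x, 0)   # relu: applied independently to every element
```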

## Different Dimensions

The R function `sweep()` allows us to calculate operations on tensors of different dimensionality. Try to guess the shape and result of the output `z` first before evaluating the result.
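One hypothetical example (the shapes here are our own choice, not necessarily those of the exercise): adding a length-3 vector to every row of a 4 x 3 matrix.

```r
x <- matrix(1, nrow = 4, ncol = 3)  # 2D tensor of shape (4, 3)
v <- c(10, 20, 30)                  # 1D tensor of shape (3)

# Sweep v over the second margin (columns), adding v[j] to column j
z <- sweep(x, 2, v, "+")
dim(z)  # 4 3
z[1, ]  # 11 21 31
```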

### Tensor Dot

The tensor `dot` operation is comparable to a matrix multiplication `%*%` but extends to higher dimensions, e.g.

`(a, b, c, d) . (d) -> (a, b, c)`

`(a, b, c, d) . (d, e) -> (a, b, c, e)`
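These contraction rules can be checked in R: the 2D case is plain matrix multiplication, while higher ranks can be sketched with `apply()` (a simple illustration, not an efficient implementation):

```r
# (2, 3) . (3, 4) -> (2, 4): ordinary matrix product
A <- matrix(1, nrow = 2, ncol = 3)
B <- matrix(1, nrow = 3, ncol = 4)
dim(A %*% B)  # 2 4

# (2, 3, 4) . (4) -> (2, 3): contract the last axis against a vector
x <- array(1, dim = c(2, 3, 4))
v <- rep(1, 4)
z <- apply(x, c(1, 2), function(s) sum(s * v))
dim(z)  # 2 3
```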

See also Deep Learning with R, page 36

## Tensor Reshaping

Another important tensor operation is *reshaping* using `array_reshape`, e.g. for data preprocessing:

```
library(keras)
mnist <- dataset_mnist()
x_train <- mnist$train$x
dim(x_train)
```

`## [1] 60000 28 28`

```
x_train <- array_reshape(x_train, c(60000, 28*28))
dim(x_train)
```

`## [1] 60000 784`

Given the following matrix:

```
x <- matrix(c(0, 1,
              2, 3,
              4, 5),
            nrow = 3, ncol = 2, byrow = TRUE)
x
```

let’s reshape x to have 6 rows and 1 column:

Or 2 rows and 3 columns:
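Assuming keras is loaded (it re-exports `array_reshape()` from reticulate), the two reshapes could look like this. Note that `array_reshape()` fills row-major (C-style), unlike assigning to `dim()`:

```r
library(keras)

x <- matrix(c(0, 1,
              2, 3,
              4, 5),
            nrow = 3, ncol = 2, byrow = TRUE)

array_reshape(x, c(6, 1))  # 6 rows, 1 column: 0, 1, 2, 3, 4, 5
array_reshape(x, c(2, 3))  # 2 rows, 3 columns
```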

We can also reshape a tensor by *transposition*, i.e. by exchanging the rows and columns of a matrix. The `t()` function can be used as

```
x <- matrix(0, nrow = 300, ncol = 20)
dim(x)
```

`## [1] 300 20`

```
x <- t(x)
dim(x)
```

`## [1] 20 300`

See also Deep Learning with R, page 38-39