
Understanding Tensors and Tensor Decompositions: Part 3

Published on:
April 10, 2023
Published by:
Professor Ishwar Sethi
This post was originally published by one of our partners on:
https://iksinc.tech/understanding-tensors-and-tensor-decompositions-part-3/

This is my third post on tensors and tensor decompositions. The first post on this topic primarily introduced tensors and some related terminology. The second post was meant to illustrate a particularly simple tensor decomposition method, called the CP decomposition. In this post, I will describe another tensor decomposition method, known as the Tucker decomposition. While the CP decomposition's chief merit is its simplicity, it is limited in its approximation capability and it requires the same number of components in each mode. The Tucker decomposition, on the other hand, is far more effective at approximation and allows a different number of components in each mode. Before going any further, let's look at factor matrices and the n-mode product of a tensor and a matrix.

Factor Matrices

Recall the CP decomposition of an order three tensor expressed as

X \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r,

where (∘) represents the outer product. We can also represent this decomposition by organizing the vectors a_r, b_r, c_r, r = 1, …, R, into three matrices, A, B, and C, as

A = [a_1 \; a_2 \; \cdots \; a_R], \quad B = [b_1 \; b_2 \; \cdots \; b_R], \quad \text{and} \quad C = [c_1 \; c_2 \; \cdots \; c_R].

The CP decomposition is then expressed as

X \approx [\Lambda; A, B, C],

where Λ is a super-diagonal tensor with all zero entries except the diagonal elements. The matrices A, B, and C are called the factor matrices.
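To make the sum-of-outer-products view concrete, here is a minimal sketch in base R that builds a rank-R tensor from factor matrices; the matrices A, B, and C below are random placeholders, not data from this post.

[code language="r"]
# A minimal sketch: build a rank-R tensor as the sum of outer products a_r o b_r o c_r.
# A, B, and C are random placeholder factor matrices (3 x R, 4 x R, and 5 x R).
R <- 2
A <- matrix(rnorm(3 * R), 3, R)
B <- matrix(rnorm(4 * R), 4, R)
C <- matrix(rnorm(5 * R), 5, R)
X <- array(0, dim = c(3, 4, 5))
for (r in 1:R) {
  # outer product of the r-th columns gives a 3x4x5 rank-one term
  X <- X + outer(outer(A[, r], B[, r]), C[, r])
}
dim(X)  # 3 4 5
[/code]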

Next, let's try to understand the n-mode product.

Multiplying a Tensor and a Matrix

How do you multiply a tensor and a matrix? The answer is via the n-mode product. The n-mode product of a tensor X of size I_1 × I_2 × ⋯ × I_N with a matrix U of size J × I_n is a tensor of size I_1 × ⋯ × I_{n-1} × J × I_{n+1} × ⋯ × I_N, and is denoted by X ×_n U. The product is calculated by multiplying each mode-n fibre of the tensor by the matrix U. Let's look at an example to better understand the n-mode product. Consider a 2x2x3 tensor whose frontal slices are:

Thus, the n-mode product of a tensor with a matrix yields a new tensor. You do not have to worry about calculating the n-mode product manually. The rTensor package in R provides a function, ttm (tensor times matrix), for this task. Similar functions are available in Tensorly and the Tensor Toolbox.
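For instance, here is a minimal sketch of calling ttm on the 2x2x3 tensor used in this post (assuming the rTensor package is installed); the 3x2 matrix U below is just a made-up example.

[code language="r"]
# Multiply the 2x2x3 tensor along mode 1 by a made-up 3x2 matrix U.
# Mode 1 changes from size 2 to size 3; the other modes are unchanged.
library(rTensor)
tnsr <- as.tensor(array(1:12, dim = c(2, 2, 3)))
U <- matrix(1:6, nrow = 3, ncol = 2)
Y <- ttm(tnsr, U, m = 1)   # mode-1 product: tensor times matrix
Y@modes                    # 3 2 3
[/code]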

Tucker Decomposition

In the Tucker decomposition, we express a tensor via a core tensor and n-mode products with factor matrices that transform the core tensor into the given tensor. Thus, given a tensor X ∈ R^(I×J×K), its Tucker decomposition is given as

X \approx G \times_1 A \times_2 B \times_3 C = \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R} g_{pqr} \, a_p \circ b_q \circ c_r,

where G is the core tensor of size P×Q×R, and the factor matrices A, B, and C are of size I×P, J×Q, and K×R, respectively. The columns of the factor matrices span the corresponding mode space. The factor matrices are usually orthogonal and represent the principal components in each mode. The elements of the core tensor represent the interactions between the different modes. The mode ranks P, Q, and R of the core tensor are generally chosen smaller than the mode ranks I, J, and K of the original tensor. Thus, the core tensor is often viewed as a compressed representation of the original tensor. The figure below shows a pictorial representation of the Tucker decomposition.

There are a few points worth mentioning here. First, the CP decomposition is a special case of the Tucker decomposition in which the core tensor is super-diagonal and P, Q, and R are all identical. Second, the Tucker decomposition is not unique: we can transform the core tensor by an n-mode product with any nonsingular matrix, say U, and still obtain exactly the same approximation of the original tensor, as long as we multiply the corresponding factor matrix by the inverse of U. Third, there are a few variations of the Tucker decomposition in which certain constraints are imposed on the factor matrices or the core tensor. One of these variations, known as the higher-order SVD (HOSVD), forms the basis for computing the Tucker decomposition.
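To see the non-uniqueness in action, here is a minimal sketch (assuming the rTensor package, and using the hosvd function discussed in the next section to obtain an exact Tucker decomposition): transform the core by an arbitrary nonsingular matrix U1 and compensate in the mode-1 factor matrix; the reconstruction does not change. The particular matrix U1 is just an illustrative choice.

[code language="r"]
library(rTensor)
tnsr <- as.tensor(array(1:12, dim = c(2, 2, 3)))
d  <- hosvd(tnsr)                     # exact Tucker decomposition: core d$Z, factors d$U
U1 <- matrix(c(2, 1, 0, 3), 2, 2)     # an arbitrary nonsingular 2x2 matrix
Znew <- ttm(d$Z, U1, m = 1)           # transform the core along mode 1
Anew <- d$U[[1]] %*% solve(U1)        # compensate with the inverse in the mode-1 factor
est  <- ttm(ttm(ttm(Znew, Anew, m = 1), d$U[[2]], m = 2), d$U[[3]], m = 3)
sqrt(sum((est@data - tnsr@data)^2))   # essentially zero: same reconstruction
[/code]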

Higher-Order SVD (HOSVD) and Computing Tucker Decomposition

The HOSVD of a tensor is an extension of the singular value decomposition (SVD) of matrices. In this case, the resulting core tensor and all factor matrices are orthogonal, ensuring that variations in each mode are captured independently of the other modes. The HOSVD of a tensor is computed efficiently by performing an SVD on each n-mode unfolded matrix of the given tensor and using the left singular matrices as the factor matrices of the HOSVD. To calculate the core tensor, n-mode products of the original tensor with the transposed factor matrices are performed.

Since a numerical example is always valuable for better understanding, I am going to show the steps involved in HOSVD using the 2x2x3 tensor that we used earlier for the n-mode product. The mode-1 unfolded matrix for this tensor was obtained earlier as

X_{(1)} = \begin{bmatrix} 1 & 3 & 5 & 7 & 9 & 11 \\ 2 & 4 & 6 & 8 & 10 & 12 \end{bmatrix}.

The HOSVD method says that we use this unfolded matrix in an SVD factorization and take the resulting left singular matrix as the factor matrix A in our tensor decomposition. I perform this step in R using the following code:

[code language=”r”]
> library(rTensor)
> tnsr <- as.tensor(array(1:12, dim = c(2, 2, 3)))
# Perform mode-1 unfolding
> xmat <- k_unfold(tnsr, 1)@data
# Compute the SVD of the unfolded matrix
> s <- svd(xmat)
# Obtain the left singular matrix as factor matrix A
> print(A <- s$u)
           [,1]       [,2]
[1,] -0.6632348 -0.7484114
[2,] -0.7484114  0.6632348
[/code]

The matrix at the end of the code block above is the factor matrix A in the HOSVD of the given tensor. In a similar manner, we can obtain unfolded matrices for mode-2 and mode-3 and do SVD to get B and C factor matrices. These are shown in the code block below:

[code language=”r”]
> xmat <- k_unfold(tnsr, 2)@data
> s <- svd(xmat)
> print(B <- s$u)
           [,1]       [,2]
[1,] -0.6157085 -0.7879740
[2,] -0.7879740  0.6157085
> xmat <- k_unfold(tnsr, 3)@data
> s <- svd(xmat)
> print(C <- s$u)
           [,1]       [,2]       [,3]
[1,] -0.2067359 -0.8891533  0.4082483
[2,] -0.5182887 -0.2543818 -0.8164966
[3,] -0.8298416  0.3803896  0.4082483
[/code]

Next, let's obtain the core tensor. We do this by performing mode-n products of the original tensor with the transposed factor matrices, as shown below.

[code language=”r”]
> g <- ttm(ttm(ttm(tnsr, t(A), m = 1), t(B), m = 2), t(C), m = 3)
> g
Numeric Tensor of 3 Modes
Modes: 2 2 3
Data:
[1] -25.436122 0.011316 0.005752 0.190090 0.002073 0.758965
[/code]

Just to check whether the hosvd function in rTensor, meant for computing the HOSVD, gives us the same result, I perform the decomposition of the same 2x2x3 tensor and display the factor matrices, returned as U matrices, below. The core tensor is returned as Z, and est represents the tensor reconstructed/estimated from the decomposed components. Just to show that HOSVD yields an orthogonal core tensor, I also show the unfolded mode-1 matrix, whose rows can be verified to be orthogonal. You can also compare the entries of the unfolded mode-1 matrix with the elements of the core tensor computed earlier via tensor-times-matrix multiplication.

[code language=”r”]
> hosvdD <- hosvd(tnsr, c(2, 2, 3))
  |=======================================================================| 100%
> str(hosvdD)
List of 4
$ Z :Formal class ‘Tensor’ [package “rTensor”] with 3 slots
.. ..@ num_modes: int 3
.. ..@ modes : int [1:3] 2 2 3
.. ..@ data : num [1:2, 1:2, 1:3] -25.43612 0.01132 0.00575 0.19009 0.00207 …
$ U :List of 3
..$ : num [1:2, 1:2] -0.663 -0.748 -0.748 0.663
..$ : num [1:2, 1:2] -0.616 -0.788 -0.788 0.616
..$ : num [1:3, 1:3] -0.207 -0.518 -0.83 -0.889 -0.254 …
$ est :Formal class ‘Tensor’ [package “rTensor”] with 3 slots
.. ..@ num_modes: int 3
.. ..@ modes : int [1:3] 2 2 3
.. ..@ data : num [1:2, 1:2, 1:3] 1 2 3 4 5 …
$ fnorm_resid: num 3.43e-14
> hosvdD$U[[1]]
[,1] [,2]
[1,] -0.6632348 -0.7484114
[2,] -0.7484114 0.6632348
> hosvdD$U[[2]]
[,1] [,2]
[1,] -0.6157085 -0.7879740
[2,] -0.7879740 0.6157085
> hosvdD$U[[3]]
[,1] [,2] [,3]
[1,] -0.2067359 -0.8891533 0.4082483
[2,] -0.5182887 -0.2543818 -0.8164966
[3,] -0.8298416 0.3803896 0.4082483
> hmat <- k_unfold(hosvdD$Z, 1)@data
> hmat
          [,1]     [,2]     [,3]   [,4]       [,5]       [,6]
[1,] -25.43612 0.005752 0.002073 1.5352 -1.776e-15 -3.331e-16
[2,]   0.01132 0.190090 0.758965 0.1858  9.714e-17  1.735e-16
[/code]

You will notice that while using the hosvd function above, I specified the mode ranks of the core tensor as 2, 2, and 3, identical to the mode ranks of the original tensor. When the mode ranks of the core tensor are smaller than those of the starting tensor, the resulting decomposition is known as the truncated HOSVD. It turns out that the truncated HOSVD makes an excellent starting solution for the Tucker decomposition of a tensor. This initial solution is iteratively refined through the alternating least squares (ALS) algorithm. This approach to the Tucker decomposition is known as Higher-Order Orthogonal Iteration (HOOI).
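As a quick illustration (assuming rTensor is loaded and tnsr is the 2x2x3 tensor from above), we can compare a truncated HOSVD with the HOOI-based tucker function at the same, arbitrarily chosen, mode ranks:

[code language="r"]
# Truncated HOSVD versus HOOI (the tucker function) at mode ranks (1, 1, 2).
th <- hosvd(tnsr, ranks = c(1, 1, 2))   # truncated HOSVD
td <- tucker(tnsr, ranks = c(1, 1, 2))  # HOOI: iterative refinement via alternating least squares
th$fnorm_resid                          # residual error of the truncated HOSVD
td$fnorm_resid                          # HOOI residual, typically no larger
[/code]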

Tucker Decomposition Example

Let's look at a simple example of applying the Tucker decomposition. For this, I will use a 256x256 RGB image, which is a three-mode tensor. Let's first decompose this tensor with a core tensor having mode ranks 128, 128, and 3, as shown below.

[code language=”r”]
> library(png)
> XS <- readPNG("mendrl.png")
> tuckerD <- tucker(as.tensor(XS), ranks = c(128, 128, 3))
  |=======================================================================| 100%
> str(tuckerD)
List of 7
$ Z :Formal class ‘Tensor’ [package “rTensor”] with 3 slots
.. ..@ num_modes: int 3
.. ..@ modes : int [1:3] 128 128 3
.. ..@ data : num [1:128, 1:128, 1:3] -223.82 -2.08 1.62 1.95 -1.23 …
$ U :List of 3
..$ : num [1:256, 1:128] -0.0559 -0.0542 -0.0543 -0.0523 -0.0531 …
..$ : num [1:256, 1:128] -0.0589 -0.0594 -0.0598 -0.0601 -0.0613 …
..$ : num [1:3, 1:3] -0.613 -0.585 -0.53 0.75 -0.221 …
$ conv : logi TRUE
$ est :Formal class ‘Tensor’ [package “rTensor”] with 3 slots
.. ..@ num_modes: int 3
.. ..@ modes : int [1:3] 256 256 3
.. ..@ data : num [1:256, 1:256, 1:3] 0.44 0.419 0.389 0.3 0.269 …
$ norm_percent: num 94.6
$ fnorm_resid : num 12.8
$ all_resids : num [1:6] 12.8 12.8 12.8 12.8 12.8 …
[/code]

The output, $est, represents the reconstructed/estimated tensor. The output quantity $fnorm_resid gives the error, measured by the Frobenius norm, between the original and the reconstructed tensor. The original tensor, the mandrill image, consists of 256x256x3 = 196,608 elements. The Tucker decomposition results in 128x128x3 = 49,152 elements in the core tensor and (256x128x2 + 3x3) = 65,545 elements in the factor matrices, for a total of 114,697 elements. This implies a compression of 41.66%, with a residual Frobenius norm of 12.8; the $norm_percent output shows that the reconstruction captures about 94.6% of the norm of the original tensor. Just to relate the compression to the approximation error, I reran the decomposition for two other sets of mode ranks, 64x64x3 and 32x32x3. The results are shown below with the original and the three reconstructed images, along with the % compression and error values. This should suggest to you that the Tucker decomposition is a pretty good tool for compressing multidimensional data.
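Here is a quick check of the storage arithmetic above in plain R:

[code language="r"]
# Elements stored by the original image tensor versus the rank-(128, 128, 3) Tucker model.
orig   <- 256 * 256 * 3                   # 196608 elements in the original tensor
core   <- 128 * 128 * 3                   # 49152 elements in the core tensor
facts  <- 256 * 128 + 256 * 128 + 3 * 3   # 65545 elements in the factor matrices
stored <- core + facts                    # 114697 elements in total
100 * (1 - stored / orig)                 # about 41.66% compression
[/code]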

We can also look at the above example from a PCA/SVD perspective. Let's view the mandrill image as a collection of three gray-scale images. Also recall that I mentioned earlier that the columns of the factor matrices span the corresponding mode space. Thus, we can extract a representation for each mode, called eigenmodes, similar to the eigenvector representation used with matrices. This can be done by performing tensor-times-matrix multiplications as shown below. (Think of forming eigenimages by multiplying the left and right singular matrices from the SVD of an image.)

[code language=”r”]
> xaa <- ttm(tuckerD$Z, tuckerD$U[[1]], m = 1)
> xaa1 <- ttm(xaa, tuckerD$U[[2]], m = 2)
> writePNG(xaa1@data[,,1], "mode1eig.png")
> writePNG(xaa1@data[,,2], "mode2eig.png")
> writePNG(xaa1@data[,,3], "mode3eig.png")
[/code]


The three eigenmode images generated by the above code are shown below in the second row. The first row shows the three input mode images (the RGB channels of the mandrill image). The mode-1 eigenimage is shown inverted because its elements are all negative. This viewpoint of the Tucker decomposition has been used to create TensorFaces, an extension of the eigenfaces concept.

Hopefully, my three-part series on tensors has been helpful. If you want to explore further, I suggest looking up an excellent overview paper on tensors and an application-oriented paper. Recently, tensor decomposition and its ability to compress representations have been exploited in convolutional deep learning networks, where a trained set of convolutional layers is treated as a tensor and decomposed to gain computational advantages. We should expect more work along these lines from different research groups.

Benson Muite has created a Git repository for the code used in this and the two previous blog posts on tensors. Please visit the repository at https://gitlab.com/bkmgit/understanding-tensors-and-tensor-decomposition/. Many thanks to him.

