Documentation
MatrixCalculus provides matrix calculus for everyone. It is an online tool that computes derivatives of vector and matrix expressions.
Valid input examples are:
- 0.5*x'*A*x
- A*exp(x)
- (y.*v)'*x
- a^b
- norm1(A*x-y)
- norm2(A*x-y)^2
- sum(log(exp(-y.*(X*w)) + vector(1)))
- tr(A*X'*B*X*C)
- log(det(inv(X)))
By default:
- a, b, ..., g are scalars,
- h, i, ..., z are vectors,
- A, B, ..., Z are matrices, and
- eye is the identity matrix.
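The first input example, 0.5*x'*A*x, can be checked numerically. The following is a minimal NumPy sketch, not code produced by the tool: it follows the default naming above (A is a matrix, x is a vector), compares the standard gradient 0.5*(A + A')*x against central finite differences, and uses a helper name (`central_difference`) and a dimension chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                # illustrative dimension (assumption)
A = rng.standard_normal((n, n))      # matrix, not necessarily symmetric
x = rng.standard_normal(n)           # vector


def f(x):
    # NumPy counterpart of the input expression 0.5*x'*A*x
    return 0.5 * x @ A @ x


def central_difference(f, x, eps=1e-6):
    # Hypothetical helper: approximates the gradient one coordinate at a time.
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return grad


# Standard gradient of a quadratic form with a general (non-symmetric) A
analytic = 0.5 * (A + A.T) @ x
numeric = central_difference(f, x)

print(np.max(np.abs(analytic - numeric)))  # size of the discrepancy
```

If A is symmetric, the gradient reduces to A*x.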
Symbols used in the output:
- \(\odot\) - element-wise multiply
- \(\oslash\) - element-wise divide
- \(\otimes\) - Kronecker product
- \(\mathbb{I}\) - identity matrix
- \(\mathbb{T}\) - matrix transpose tensor
- \(\mathrm{diag}(v)\) - diagonal matrix with vector v as its diagonal
- \(\mathrm{diag}(X)\) - diagonal vector of matrix X
- \(\mathrm{inv}\) - inverse matrix
- \(\mathrm{adj}\) - adjugate matrix
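For readers who want to evaluate a displayed result by hand, most of the output symbols have direct NumPy counterparts. The sketch below is one possible mapping with illustrative values; it is not taken from the tool itself, and the matrix transpose tensor \(\mathbb{T}\) is left out. The adjugate is computed through the identity adj(X) = det(X) * inv(X), which only applies when X is invertible.

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
X = np.array([[2.0, 1.0], [1.0, 3.0]])
Y = np.array([[0.0, 1.0], [1.0, 0.0]])

hadamard = u * v                  # element-wise multiply
quotient = u / v                  # element-wise divide
kron = np.kron(X, Y)              # Kronecker product, shape (4, 4)
identity = np.eye(2)              # identity matrix
diag_matrix = np.diag(v)          # diag(v): diagonal matrix built from a vector
diag_vector = np.diag(X)          # diag(X): diagonal of a matrix as a vector
inverse = np.linalg.inv(X)        # inv(X)

# adj(X) = det(X) * inv(X); this shortcut assumes X is invertible
adjugate = np.linalg.det(X) * inverse
```

Note that `np.diag` covers both uses of diag: it builds a matrix from a 1-D input and extracts the diagonal from a 2-D input.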
Valid input operators are:
- +, -, *, /, ^
- .*, ./, .^ - element-wise operations
- sin, cos, tan, arcsin, arccos, arctan, exp, log, tanh, abs, sign, relu - element-wise operations (not matrix exponentials, etc.!)
- sum - sum of all entries of a vector or matrix
- norm1 - 1-norm of a vector or element-wise 1-norm of a matrix
- norm2 - Euclidean norm of a vector or Frobenius norm of a matrix
- tr, det, logdet, inv
- vector, matrix
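The distinction between matrix operations and element-wise operations is easy to get wrong, so here is a small NumPy sketch of how some of the operators above might be written. It assumes that `vector(1)` denotes the all-ones vector; the data and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((5, 3))
w = rng.standard_normal(3)
y = np.array([1.0, -1.0, 1.0, 1.0, -1.0])

# sum(log(exp(-y.*(X*w)) + vector(1)))
# '*' is the matrix product (@ in NumPy), '.*' is element-wise (* in NumPy);
# vector(1) is assumed to be the all-ones vector.
loss = np.sum(np.log(np.exp(-y * (X @ w)) + np.ones(5)))

# norm1 and norm2 of a vector
r = X @ w - y
norm1 = np.sum(np.abs(r))               # norm1(X*w-y)
norm2_squared = np.linalg.norm(r) ** 2  # norm2(X*w-y)^2

# For a matrix: element-wise 1-norm and Frobenius norm
matrix_norm1 = np.sum(np.abs(X))
matrix_norm2 = np.linalg.norm(X)        # Frobenius norm is the default for 2-D input

# exp is element-wise, not the matrix exponential
S = np.array([[0.0, 1.0], [0.0, 0.0]])
elementwise_exp = np.exp(S)             # [[1, e], [1, 1]]
matrix_exp = np.eye(2) + S              # exact here because S @ S = 0
```

The last two lines show why the warning matters: for the same matrix S the element-wise exponential and the matrix exponential differ.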
Layout conventions:
There are different layout conventions (numerator layout, denominator layout, mixed layout). Numerator layout is just the transpose of the denominator layout and mixed layout is a mixture of both.
We use a mixed layout convention here. The derivative is arranged so that a linear approximation of the function is obtained by contracting the last axes of the derivative with the change in the variable. A few examples illustrate this:
- \(\frac{\partial}{\partial x} \left(v^\top x\right) = v\)
- \(\frac{\partial}{\partial x} \left(A x\right) = A\)
- \(\frac{\partial}{\partial X} \left(\mathrm{tr}(A X)\right) = A^\top\)
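The three examples above can be checked against the linear-approximation property. The sketch below is a NumPy illustration under assumed shapes (3-vectors and 3x3 matrices), not output of the tool. All three functions are linear, so the contraction of the derivative with a perturbation should reproduce the change in the function up to rounding.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
X = rng.standard_normal((3, 3))
x = rng.standard_normal(3)
v = rng.standard_normal(3)

dx = 1e-3 * rng.standard_normal(3)        # perturbation of the vector x
dX = 1e-3 * rng.standard_normal((3, 3))   # perturbation of the matrix X

# d/dx (v'x) = v: contract the single axis of v with dx
change1 = v @ (x + dx) - v @ x
approx1 = v @ dx

# d/dx (A x) = A: contract the last axis of A with dx
change2 = A @ (x + dx) - A @ x
approx2 = A @ dx

# d/dX tr(A X) = A': contract both axes of A' with dX
change3 = np.trace(A @ (X + dX)) - np.trace(A @ X)
approx3 = np.sum(A.T * dX)

print(np.allclose(change1, approx1, atol=1e-12),
      np.allclose(change2, approx2, atol=1e-12),
      np.allclose(change3, approx3, atol=1e-12))
```

In the third case, contracting with A instead of its transpose would not match the change in the trace for a general A, which is where the choice of layout becomes visible.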
Common error messages:
- Cannot display this 3rd/4th order tensor.
Only scalars, vectors, and matrices are displayed as output. If the derivative is a higher-order tensor, it is still computed, but it cannot be displayed in matrix notation. Higher-order tensors are sometimes represented using Kronecker products, but this representation can be ambiguous. The result is therefore displayed using Kronecker products only in unambiguous cases. The Python code still works on the true higher-order tensors.
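As an illustration of such a case, the derivative of A*x with respect to the matrix A is a 3rd order tensor. The NumPy sketch below builds it by hand under an assumed index order (output index first, then the two indices of A), which is not necessarily the order used by the tool. It also shows one Kronecker-product form, which matches only under a row-major flattening of A and so hints at why such representations can be ambiguous.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 2, 3                       # illustrative sizes (assumption)
A = rng.standard_normal((m, n))
x = rng.standard_normal(n)

# T[i, j, k] = d (A x)[i] / d A[j, k] = delta_ij * x[k]
T = np.einsum("ij,k->ijk", np.eye(m), x)   # shape (m, m, n)

# Linear approximation: contract the last two axes of T with a perturbation dA
dA = 1e-3 * rng.standard_normal((m, n))
approx_change = np.einsum("ijk,jk->i", T, dA)
actual_change = (A + dA) @ x - A @ x

# One Kronecker form of the same tensor, valid for row-major flattening of A
T_flat = T.reshape(m, m * n)
T_kron = np.kron(np.eye(m), x.reshape(1, n))

print(np.allclose(approx_change, actual_change, atol=1e-12))
print(np.allclose(T_flat, T_kron))
```

With a column-major flattening of A the factors of the Kronecker product would appear in a different arrangement, so the matrix form alone does not determine the tensor without stating the convention.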