Exported Terms

Nonnegative Matrix-Tensor Factorization

MatrixTensorFactor.nnmtf (Function)
nnmtf(Y::AbstractArray, R::Integer; kwargs...)

Performs a nonnegative matrix-tensor factorization of an order-N tensor Y with a given "rank" R.

For an order $N=3$ tensor, this factorizes $Y \approx A B$ where $\displaystyle Y[i,j,k] \approx \sum_{r=1}^R A[i,r]*B[r,j,k]$ and the factors $A, B \geq 0$ are nonnegative.

For higher orders, this becomes $\displaystyle Y[i_1,i_2,\dots,i_N] \approx \sum_{r=1}^R A[i_1,r]*B[r,i_2,\dots,i_N].$

Note that there may NOT be a unique optimal solution.

Arguments

  • Y::AbstractArray{T,N}: tensor to factorize
  • R::Integer: rank to factorize Y (size(A)[2] and size(B)[1])

Keywords

  • maxiter::Integer=100: maximum number of iterations
  • tol::Real=1e-3: desired tolerance for the convergence criterion
  • rescale_AB::Bool=true: scale B at each iteration so that the factors (horizontal slices) have similar 3-fibre sums.
  • rescale_Y::Bool=true: preprocesses the input Y to have normalized 3-fibre sums (on average), and rescales the final B so that Y=A*B.
  • normalize::Symbol=:fibres: part of B that should be normalized (must be in IMPLIMENTED_NORMALIZATIONS)
  • projection::Symbol=:nnscale: constraint to use and method for enforcing it (must be in IMPLIMENTED_PROJECTIONS)
  • criterion::Symbol=:ncone: how to determine if the algorithm has converged (must be in IMPLIMENTED_CRITERIA)
  • stepsize::Symbol=:lipshitz: stepsize rule used for the gradient descent step (must be in IMPLIMENTED_STEPSIZES)
  • momentum::Bool=false: use momentum updates
  • delta::Real=0.9999: safeguard for the maximum amount of momentum (see eq. (3.5) in Xu & Yin 2013)
  • R_max::Integer=size(Y)[1]: maximum rank to try if R is not given
  • projectionA::Symbol=projection: projection to use on factor A (must be in IMPLIMENTED_PROJECTIONS)
  • projectionB::Symbol=projection: projection to use on factor B (must be in IMPLIMENTED_PROJECTIONS)
  • A_init::AbstractMatrix=nothing: initial A for the iterative algorithm. Should be kept as nothing if R is not given.
  • B_init::AbstractArray=nothing: initial B for the iterative algorithm. Should be kept as nothing if R is not given.

Returns

  • A::Matrix{Float64}: the matrix A in the factorization Y ≈ A * B
  • B::Array{Float64, N}: the tensor B in the factorization Y ≈ A * B
  • rel_errors::Vector{Float64}: relative errors at each iteration
  • norm_grad::Vector{Float64}: norm of the full gradient at each iteration
  • dist_Ncone::Vector{Float64}: distance of the negative gradient from the normal cone at each iteration
  • If R was estimated, also returns the optimal R::Integer
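
A hedged usage sketch (random nonnegative data; not a doctest from the package):

using MatrixTensorFactor

Y = abs.(randn(10, 20, 30))   # nonnegative order-3 tensor
A, B, rel_errors, norm_grad, dist_Ncone = nnmtf(Y, 5; maxiter=200, tol=1e-4)
size(A), size(B)              # ((10, 5), (5, 20, 30))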

Implementation of block coordinate descent updates

We calculate the partial gradients and corresponding Lipschitz constants as follows:

\[\begin{align}
\boldsymbol{P}^{t}[q,r] &= \textstyle{\sum}_{jk} \boldsymbol{\mathscr{B}}^{t}[q,j,k] \boldsymbol{\mathscr{B}}^{t}[r,j,k]\\
\boldsymbol{Q}^{t}[i,r] &= \textstyle{\sum}_{jk} \boldsymbol{\mathscr{Y}}[i,j,k] \boldsymbol{\mathscr{B}}^{t}[r,j,k]\\
\nabla_{A} f(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}) &= \boldsymbol{A}^{t} \boldsymbol{P}^{t} - \boldsymbol{Q}^{t}\\
L_{A} &= \left\lVert \boldsymbol{P}^{t} \right\rVert_{2}.
\end{align}\]

Similarly for $\boldsymbol{\mathscr{B}}$:

\[\begin{align}
\boldsymbol{T}^{t+1} &= (\boldsymbol{A}^{t+\frac12})^\top \boldsymbol{A}^{t+\frac12}\\
\boldsymbol{\mathscr{U}}^{t+1} &= (\boldsymbol{A}^{t+\frac12})^\top \boldsymbol{\mathscr{Y}}\\
\nabla_{\boldsymbol{\mathscr{B}}} f(\boldsymbol{A}^{t+\frac12},\boldsymbol{\mathscr{B}}^{t}) &= \boldsymbol{T}^{t+1} \boldsymbol{\mathscr{B}}^{t} - \boldsymbol{\mathscr{U}}^{t+1}\\
L_{B} &= \left\lVert \boldsymbol{T}^{t+1} \right\rVert_{2}.
\end{align}\]
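
In code, one such update pass might look like the following sketch (the :lipshitz stepsize with the :nonnegative projection; not necessarily the package's exact internals), using slicewise_dot and the mode-1 product * documented under Other Functions. Here Y is I×J×K, A is I×R, and B is R×J×K:

using LinearAlgebra: opnorm

P = slicewise_dot(B, B)                    # P[q,r] = Σ_jk B[q,j,k] B[r,j,k]
Q = slicewise_dot(Y, B)                    # Q[i,r] = Σ_jk Y[i,j,k] B[r,j,k]
A = max.(A - (A * P - Q) / opnorm(P), 0)   # projected gradient step with stepsize 1/L_A
T = A' * A                                 # R×R Gram matrix of the updated A
U = A' * Y                                 # mode-1 product: an R×J×K tensor
B = max.(B - (T * B - U) / opnorm(T), 0)   # projected gradient step with stepsize 1/L_B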

To ensure the iterates stay "close" to normalized, we introduce a renormalization step after the projected gradient updates:

\[\begin{align}
\boldsymbol{C}[r,r] &= \frac{1}{J}\textstyle{\sum}_{jk} \boldsymbol{\mathscr{B}}^{t+\frac12}[r,j,k]\\
\boldsymbol{A}^{t+1} &= \boldsymbol{A}^{t+\frac12} \boldsymbol{C}\\
\boldsymbol{\mathscr{B}}^{t+1} &= \boldsymbol{C}^{-1}\boldsymbol{\mathscr{B}}^{t+\frac12}.
\end{align}\]
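
In code, the rescaling step might look like this sketch (broadcasting the diagonal of C instead of forming the matrix):

J = size(B, 2)
c = vec(sum(B, dims=(2, 3))) ./ J   # diagonal of C
A = A .* c'                         # A = A * C
B = B ./ c                          # B = C⁻¹ * B, broadcast over the first dimension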

We typically use the following convergence criterion:

\[d(-\nabla \ell(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}), N_{\mathcal{C}}(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}))^2\leq\delta^2 R(I+JK).\]
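
With the returned dist_Ncone values, this test can be checked directly; here tol plays the role of δ (distinct from the momentum keyword delta). A sketch:

I, J, K = size(Y)
converged = dist_Ncone[end]^2 <= tol^2 * R * (I + J*K)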

source

Implemented Factorization Options

MatrixTensorFactor.IMPLIMENTED_NORMALIZATIONS (Constant)
IMPLIMENTED_NORMALIZATIONS::Set{Symbol}
  • :fibre: set $\sum_{k=1}^K B[r,j,k] = 1$ for all $r, j$, or when projection==:nnscale, set $\sum_{j=1}^J\sum_{k=1}^K B[r,j,k] = J$ for all $r$
  • :slice: set $\sum_{j=1}^J\sum_{k=1}^K B[r,j,k] = 1$ for all $r$
  • :nothing: does not enforce any normalization of B
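
For example, with the :simplex projection and the default fibre normalization, every $(r, j)$ fibre of the returned B should sum to one (a sketch):

A, B = nnmtf(Y, 3; projection=:simplex)
all(s -> s ≈ 1, sum(B; dims=3))   # K-fibre sums of B
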
source
MatrixTensorFactor.IMPLIMENTED_PROJECTIONS (Constant)
IMPLIMENTED_PROJECTIONS::Set{Symbol}
  • :nnscale: two-stage block coordinate descent: 1) projected gradient descent onto the nonnegative orthant, 2) shift any weight from B to A according to the normalization. Equivalent to :nonnegative when normalization==:nothing.
  • :simplex: Euclidean projection onto the simplex blocks according to the normalization
  • :nonnegative: zero out negative entries
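
In code, the :nonnegative option is simply elementwise clipping (a sketch):

B .= max.(B, 0)   # zero out negative entries in place
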
source
MatrixTensorFactor.IMPLIMENTED_CRITERIA (Constant)
IMPLIMENTED_CRITERIA::Set{Symbol}
  • :ncone: vector-set distance between the negative gradient of the objective and the normal cone
  • :iterates: A,B before and after one iteration are close in L2 norm
  • :objective: objective is small
  • :relativeerror: relative error is small (when normalize=:nothing), or the mean relative error averaged over fibres or slices when the normalization is :fibres or :slices respectively.
source

Constants

Kernel Density Estimation

Constants

1D

MatrixTensorFactor.default_bandwidth (Function)
default_bandwidth(data; alpha=0.9, inner_percentile=100)

Copied from KernelDensity since this function is not exported there; having access to it lets the same bandwidth be used for different densities of the same measurements.
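
A usage sketch of the intended pattern, pairing it with KernelDensity.jl's kde:

using KernelDensity

data_a = randn(1000)
data_b = randn(1000) .+ 1
bw = default_bandwidth(data_a)      # one bandwidth for both samples
kde_a = kde(data_a; bandwidth=bw)
kde_b = kde(data_b; bandwidth=bw)   # densities are now directly comparable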

source
MatrixTensorFactor.make_densities (Function)
make_densities(s::Sink; kwargs...)
make_densities(s::Sink, domains::AbstractVector{<:AbstractVector}; kwargs...)

Estimates the densities for each measurement in a Sink.

When given domains (a list where each entry is the domain for a different measurement), the kernel is resampled on that domain.

Parameters

  • bandwidths::AbstractVector{<:Real}: list of bandwidths used for each measurement's density estimation
  • inner_percentile::Integer=100: value between 0 and 100 that filters out each measurement by using the inner percentile range. This can help remove outliers and focus in on where the bulk of the data is.

Returns

  • density_estimates::Vector{UnivariateKDE}
source
MatrixTensorFactor.standardize_KDEs (Function)
standardize_KDEs(KDEs::AbstractVector{UnivariateKDE}; n_samples=DEFAULT_N_SAMPLES)

Resample the densities so they all are sampled from the same domain.

source

Resample the densities within each sink/source so that like-measurements use the same scale.

source
MatrixTensorFactor.standardize_2d_KDEs (Function)
standardize_2d_KDEs(KDEs::AbstractVector{BivariateKDE}; n_samples=DEFAULT_N_SAMPLES)

Resample the densities so they all are sampled from the same x and y coordinates.

source

2D

MatrixTensorFactor.repeatcoord (Function)
repeatcoord(coordinates, values)

Repeats each coordinate the number of times given by the corresponding entry in values.

Both lists should be the same length.

Example

coordinates = [(0,0), (1,1), (1,2)]
values = [1, 3, 2]
repeatcoord(coordinates, values)

# output

[(0,0), (1,1), (1,1), (1,1), (1,2), (1,2)]

source
MatrixTensorFactor.kde2d (Function)
kde2d((xs, ys), values)

Performs a 2D KDE based on two lists of coordinates and the value at those coordinates.

Inputs

  • xs, ys::Vector{Real}: coordinates/locations of samples
  • values::Vector{Integer}: value of the sample

Returns

  • f::BivariateKDE: use f.x and f.y for the locations of the (re)sampled KDE, and f.density for the sampled values of the KDE
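
A hedged usage sketch (here the values act as counts at each location):

xs = [0.0, 1.0, 1.0]
ys = [0.0, 1.0, 2.0]
values = [1, 3, 2]            # count observed at each (x, y)
f = kde2d((xs, ys), values)
f.x, f.y                      # grid coordinates of the resampled KDE
f.density                     # matrix of density values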

source
MatrixTensorFactor.coordzip (Function)
coordzip(rcoords)

Zips the "x" and "y" values together into a list of x coords and y coords.

Example

coordzip([(0,0), (1,1), (1,1), (1,1), (1,2), (1,2)])

# output

[[0, 1, 1, 1, 1, 1], [0, 1, 1, 1, 2, 2]]

source

Approximations

MatrixTensorFactor.d_dx (Function)
d_dx(y::AbstractVector{<:Real})

Approximates the 1st derivative of a function using only given samples y of that function.

Assumes y came from f(x) where x was an evenly sampled, unit-interval grid. Note the approximation uses centered three-point finite differences for the next-to-end points, and forward/backward three-point differences for the beginning/end points respectively. The remaining interior points use five-point differences.

Will use the largest-order method possible by default (currently 5 points), but a specific-order method can be forced with the keyword order. See d2_dx2.
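
Since the three- and five-point stencils are exact for quadratics, sampling f(x) = x² on a unit grid should recover f′(x) = 2x up to floating point (a sketch):

y = Float64.((0:5) .^ 2)   # samples of x² at x = 0, 1, ..., 5
d_dx(y)                    # ≈ [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]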

source
MatrixTensorFactor.d2_dx2 (Function)
d2_dx2(y::AbstractVector{<:Real}; order::Integer=length(y))

Approximates the 2nd derivative of a function using only given samples y of that function.

Assumes y came from f(x) where x was an evenly sampled, unit-interval grid. Note the approximation uses centered three-point finite differences for the next-to-end points, and forward/backward three-point differences for the beginning/end points respectively. The remaining interior points use five-point differences.

Will use the largest-order method possible by default (currently 5 points), but a specific-order method can be forced with the keyword order. See d_dx.

source
MatrixTensorFactor.curvature (Function)
curvature(y::AbstractVector{<:Real})

Approximates the signed curvature of a function given evenly spaced samples.

Uses d_dx and d2_dx2 to approximate the first two derivatives.
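
This is presumably the standard plane-curve formula (an assumption; the docstring does not state it):

\[\kappa = \frac{y''}{\left(1+(y')^2\right)^{3/2}},\]

with both derivative approximations taken from the samples.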

source

Other Functions

Base.:* (Method)
Base.*(A::AbstractMatrix, B::AbstractArray)

Computes the AbstractArray C where $C_{i_1 i_2 \dots i_d} = \sum_{l=1}^L A_{i_1 l} * B_{l i_2 \dots i_d}$.

This is equivalent to the $1$-mode product $B \times_1 A$.

Generalizes @einsum C[i,j,k] := A[i,l] * B[l,j,k].
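
A quick numerical check of the definition (a sketch):

A = rand(2, 3)
B = rand(3, 4, 5)
C = A * B                                            # uses the method above
size(C)                                              # (2, 4, 5)
C[1, 2, 3] ≈ sum(A[1, l] * B[l, 2, 3] for l in 1:3)  # true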

source
MatrixTensorFactor.combined_norm (Function)
combined_norm(u, v, ...)

Compute the combined norm of the arguments as if all arguments were part of one large array.

This is equivalent to concatenating all the arguments into one array and taking its norm, but this implementation avoids creating an intermediate array.

u = [3 0]
v = [0 4 0]
combined_norm(u, v)

# output

5.0
source
MatrixTensorFactor.rel_error (Function)
rel_error(x, xhat)

Compute the relative error between x (true value) and xhat (its approximation).

The relative error is given by:

\[\frac{\lVert \hat{x} - x \rVert}{\lVert x \rVert}\]

See also mean_rel_error.
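
For instance (a sketch):

x = [3.0, 4.0]       # true value
xhat = [3.0, 3.0]    # approximation
rel_error(x, xhat)   # norm([0.0, -1.0]) / norm([3.0, 4.0]) = 0.2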

source
MatrixTensorFactor.mean_rel_error (Function)
mean_rel_error(X, Xhat; dims=(1,2))

Compute the mean relative error between the dims-order slices of X and Xhat.

The mean relative error is given by:

\[\frac{1}{N}\sum_{j=1}^N\frac{\lVert \hat{X}_j - X_j \rVert}{\lVert X_j \rVert}\]

See also rel_error.

source
MatrixTensorFactor.relative_error (Function)
relative_error(Yhat, Y; normalize=:nothing)

Wrapper to use the relative error calculation according to the normalization used.

  • normalize==:nothing: entry-wise L2 relative error between the two arrays
  • normalize==:fibres: average L2 relative error between all 3-fibres
  • normalize==:slices: average L2 relative error between all 1-mode slices

See also rel_error, mean_rel_error.

source
MatrixTensorFactor.slicewise_dot (Function)
slicewise_dot(A::AbstractArray, B::AbstractArray)

Contracts all but the first dimension of A and B by performing a dot product over each dim=1 slice.

Generalizes @einsum C[s,r] := A[s,j,k]*B[r,j,k] to arbitrary dimensions.
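
A small check of the contraction (a sketch):

A = rand(2, 3, 4)
B = rand(5, 3, 4)
C = slicewise_dot(A, B)                   # 2×5 matrix
C[1, 2] ≈ sum(A[1, :, :] .* B[2, :, :])   # dot product of dim-1 slices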

source

Index