Exported Terms
MatrixTensorFactor
— ModuleMatrix-Tensor Factorization
Nonnegative Matrix-Tensor Factorization
MatrixTensorFactor.nnmtf
— Functionnnmtf(Y::AbstractArray, R::Integer; kwargs...)
Non-negatively matrix-tensor factorizes an order N tensor Y with a given "rank" R.
For an order $N=3$ tensor, this factorizes $Y \approx A B$ where $\displaystyle Y[i,j,k] \approx \sum_{r=1}^R A[i,r]*B[r,j,k]$ and the factors $A, B \geq 0$ are nonnegative.
For higher orders, this becomes $\displaystyle Y[i1,i2,...,iN] \approx \sum_{r=1}^R A[i1,r]*B[r,i2,...,iN].$
Note there may NOT be a unique optimal solution
Arguments
Y::AbstractArray{T,N}
: tensor to factorizeR::Integer
: rank to factorize Y (size(A)[2] and size(B)[1])
Keywords
maxiter::Integer=100
: maxmimum number of iterationstol::Real=1e-3
: desiered tolerance for the convergence criterionrescale_AB::Bool=true
: scale B at each iteration so that the factors (horizontal slices) have similar 3-fiber sums.rescale_Y::Bool=true
: Preprocesses the inputY
to have normalized 3-fiber sums (on average), and rescales the finalB
soY=A*B
.normalize::Symbol=:fibres
: part of B that should be normalized (must be in IMPLIMENTED_NORMALIZATIONS)projection::Symbol=:nnscale
: constraint to use and method for enforcing it (must be in IMPLIMENTED_PROJECTIONS)criterion::Symbol=:ncone
: how to determine if the algorithm has converged (must be in IMPLIMENTED_CRITERIA)stepsize::Symbol=:lipshitz
: used for the gradient decent step (must be in IMPLIMENTED_STEPSIZES)momentum::Bool=false
: use momentum updatesdelta::Real=0.9999
: safeguard for maximum amount of momentum (see eq (3.5) Xu & Yin 2013)R_max::Integer=size(Y)[1]
: maximum rank to try if R is not givenprojectionA::Symbol=projection
: projection to use on factor A (must be in IMPLIMENTED_PROJECTIONS)projectionB::Symbol=projection
: projection to use on factor B (must be in IMPLIMENTED_PROJECTIONS)A_init::AbstractMatrix=nothing
: initial A for the iterative algorithm. Should be kept as nothing ifR
is not given.B_init::AbstractArray=nothing
: initial B for the iterative algorithm. Should be kept as nothing ifR
is not given.
Returns
A::Matrix{Float64}
: the matrix A in the factorization Y ≈ A * BB::Array{Float64, N}
: the tensor B in the factorization Y ≈ A * Brel_errors::Vector{Float64}
: relative errors at each iterationnorm_grad::Vector{Float64}
: norm of the full gradient at each iterationdist_Ncone::Vector{Float64}
: distance of the -gradient to the normal cone at each iteration- If R was estimated, also returns the optimal
R::Integer
Implimentation of block coordinate decent updates
We calculate the partial gradients and corresponding Lipshitz constants like so:
\[\begin{align} \boldsymbol{P}^{t}[q,r] &=\textstyle{\sum}_{jk} \boldsymbol{\mathscr{B}}^n[q,j,k] \boldsymbol{\mathscr{B}}^n[r,j,k]\\ \boldsymbol{Q}^{t}[i,r] &=\textstyle{\sum}_{jk}\boldsymbol{\mathscr{Y}}[i,j,k] \boldsymbol{\mathscr{B}}^n[r,j,k] \\ \nabla_{A} f(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}) &= \boldsymbol{A}^{t} \boldsymbol{P}^{t} - \boldsymbol{Q}^{t} \\ L_{A} &= \left\lVert \boldsymbol{P}^{t} \right\rVert_{2}. \end{align}\]
Similarly for $\boldsymbol{\mathscr{B}}$:
\[\begin{align} \boldsymbol{T}^{t+1}&=(\boldsymbol{A}^{t+\frac12})^\top \boldsymbol{A}^{t+\frac12}\\ \boldsymbol{\mathscr{U}}^{t+1}&=(\boldsymbol{A}^{t+\frac12})^\top \boldsymbol{\mathscr{Y}} \\ \nabla_\boldsymbol{\mathscr{B}} f(\boldsymbol{A}^{t+\frac12},\boldsymbol{\mathscr{B}}^{t}) &= \boldsymbol{T}^{t+1} \boldsymbol{\mathscr{B}}^{t} - \boldsymbol{\mathscr{U}}^{t+1} \\ L_B &= \left\lVert \boldsymbol{T}^{t+1} \right\rVert_{2}. \end{align}\]
To ensure the iterates stay "close" to normalized, we introduce a renormalization step after the projected gradient updates:
\[\begin{align} \boldsymbol{C} [r,r]&=\frac{1}{J}\textstyle{\sum}_{jk} \boldsymbol{\mathscr{B}}^{t+\frac12}[r,j,k]\\ \boldsymbol{A}^{t+1}&= \boldsymbol{A}^{t+\frac12} \boldsymbol{C}\\ \boldsymbol{\mathscr{B}}^{t+1}&= (\boldsymbol{C}^{t+1})^{-1}\boldsymbol{\mathscr{B}}^{t+\frac12}. \end{align}\]
We typicaly use the following convergence criterion:
\[d(-\nabla \ell(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}), N_{\mathcal{C}}(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}))^2\leq\delta^2 R(I+JK).\]
MatrixTensorFactor.nnmtf_proxgrad_online
— Functionnnmtf using Online proximal (projected) gradient decent alternating through blocks (BCD)
Updates Y each iteration with a new sample
Implimented Factorization Options
MatrixTensorFactor.IMPLIMENTED_OPTIONS
— ConstantLists all implimented options
MatrixTensorFactor.IMPLIMENTED_NORMALIZATIONS
— ConstantIMPLIMENTED_NORMALIZATIONS::Set{Symbol}
:fibre
: set $\sum_{k=1}^K B[r,j,k] = 1$ for all $r, j$, or whenprojection==:nnscale
, set $\sum_{j=1}^J\sum_{k=1}^K B[r,j,k] = J$ for all $r$:slice
: set $\sum_{j=1}^J\sum_{k=1}^K B[r,j,k] = 1$ for all $r$:nothing
: does not enforce any normalization ofB
MatrixTensorFactor.IMPLIMENTED_PROJECTIONS
— ConstantIMPLIMENTED_PROJECTIONS::Set{Symbol}
:nnscale
: Two stage block coordinate decent; 1) projected gradient decent onto nonnegative orthant, 2) shift any weight fromB
toA
according to normalization. Equivilent to :nonnegative whennormalization==:nothing
.:simplex
: Euclidian projection onto the simplex blocks accoring tonormalization
:nonnegative
: zero out negative entries
MatrixTensorFactor.IMPLIMENTED_CRITERIA
— ConstantIMPLIMENTED_CRITERIA::Set{Symbol}
:ncone
: vector-set distance between the -gradient of the objective and the normal cone:iterates
: A,B before and after one iteration are close in L2 norm:objective
: objective is small:relativeerror
: relative error is small (whennormalize=:nothing
) or mean relative error averaging fibres or slices when the normalization is:fibres
or:slices
respectfuly.
MatrixTensorFactor.IMPLIMENTED_STEPSIZES
— ConstantIMPLIMENTED_STEPSIZES::Set{Symbol}
:lipshitz
: gradient step 1/L for lipshitz constant L:spg
: spectral projected gradient stepsize
Constants
MatrixTensorFactor.MAX_STEP
— ConstantMAX_STEP = 1e10
Maximum step size allowed for spg stepsize method.
MatrixTensorFactor.MIN_STEP
— ConstantMIN_STEP = 1e-10
Minimum step size allowed for spg stepsize method.
Kernel Density Estimation
Constants
MatrixTensorFactor.DEFAULT_N_SAMPLES
— ConstantDEFAULT_N_SAMPLES = 64::Integer
Number of samples to use when standardizing a vector of density estimates.
MatrixTensorFactor.DEFAULT_ALPHA
— ConstantDEFAULT_ALPHA = 0.9::Real
Smoothing parameter for calculating a kernel's bandwidth.
1D
MatrixTensorFactor.default_bandwidth
— Functiondefault_bandwidth(data; alpha=0.9, inner_percentile=100)
Coppied from KernelDensity since this function is not exported. I want access to it so that the same bandwidth can be used for different densities for the same measurements.
MatrixTensorFactor.make_densities
— Functionmake_densities(s::Sink; kwargs...)
make_densities(s::Sink, domains::AbstractVector{<:AbstractVector}; kwargs...)
Estimates the densities for each measurement in a Sink.
When given domains, a list where each entry is a domain for a different measurement, resample the kernel on this domain.
Parameters
bandwidths::AbstractVector{<:Real}
: list of bandwidths used for each measurement's
density estimation
inner_percentile::Integer=100
: value between 0 and 100 that filters out each measurement
by using the inner percentile range. This can help remove outliers and focus in on where the bulk of the data is.
Returns
density_estimates::Vector{UnivariateKDE}
MatrixTensorFactor.make_densities2d
— Functionmakedensities2d(s::Sink; kwargs...) makedensities2d(s::Sink, domains::AbstractVector{<:AbstractVector}; kwargs...)
Similar to make_densities
but performs the KDE on 2 measurements jointly.
MatrixTensorFactor.standardize_KDEs
— Functionstandardize_KDEs(KDEs::AbstractVector{UnivariateKDE}; n_samples=DEFAULT_N_SAMPLES,)
Resample the densities so they all are sampled from the same domain.
Resample the densities within each sink/source so that like-measurements use the same scale.
MatrixTensorFactor.standardize_2d_KDEs
— Functionstandardize_2d_KDEs(KDEs::AbstractVector{BivariateKDE}; n_samples=DEFAULT_N_SAMPLES,)
Resample the densities so they all are sampled from the same x and y coordinates.
MatrixTensorFactor.filter_inner_percentile
— FunctionFilters elements so only the ones in the inner P percentile remain. See filter_2d_inner_percentile
.
MatrixTensorFactor.filter_2d_inner_percentile
— FunctionFilters 2d elements so only the ones in the inner P percentile remain. See filter_inner_percentile
.
2D
MatrixTensorFactor.repeatcoord
— Functionrepeatcoord(coordinates, values)
Repeates coordinates the number of times given by values.
Both lists should be the same length.
Example
coordinates = [(0,0), (1,1), (1,2)] values = [1, 3, 2] repeatcoord(coordinates, values)
[(0,0), (1,1), (1,1), (1,1), (1,2), (1,2)]
MatrixTensorFactor.kde2d
— Functionkde2d((xs, ys), values)
Performs a 2d KDE based on two lists of coordinates, and the value at those coordinates. Input ––-
xs, ys::Vector{Real}
: coordinates/locations of samplesvalues::Vector{Integer}
: value of the sample
Returns
f::BivariateKDE
use f.x, f.y for the location of the (re)sampled KDE,
and f.density for the sample values of the KDE
MatrixTensorFactor.coordzip
— Functioncoordzip(rcoords)
Zips the "x" and "y" values together into a list of x coords and y coords. Example –––- coordzip([(0,0), (1,1), (1,1), (1,1), (1,2), (1,2)])
[[0, 1, 1, 1, 1, 1], [0, 1, 1, 1, 2, 2]]
Approximations
MatrixTensorFactor.d_dx
— Functiond_dx(y::AbstractVector{<:Real})
Approximates the 1nd derivative of a function using only given samples y of that function.
Assumes y came from f(x) where x was an evenly sampled, unit intervel grid. Note the approximation uses centered three point finite differences for the next-to-end points, and foward/backward three point differences for the begining/end points respectively. The remaining interior points use five point differences.
Will use the largest order method possible by defult (currently 5 points), but can force a specific order method with the keyword order
. See d2_dx2
.
MatrixTensorFactor.d2_dx2
— Functiond2_dx2(y::AbstractVector{<:Real}; order::Integer=length(y))
Approximates the 2nd derivative of a function using only given samples y of that function.
Assumes y came from f(x) where x was an evenly sampled, unit intervel grid. Note the approximation uses centered three point finite differences for the next-to-end points, and foward/backward three point differences for the begining/end points respectively. The remaining interior points use five point differences.
Will use the largest order method possible by defult (currently 5 points), but can force a specific order method with the keyword order
. See d_dx
.
MatrixTensorFactor.curvature
— Functioncurvature(y::AbstractVector{<:Real})
Approximates the signed curvature of a function given evenly spaced samples.
Uses d_dx
and d2_dx2
to approximate the first two derivatives.
MatrixTensorFactor.standard_curvature
— Functionstandard_curvature(y::AbstractVector{<:Real})
Approximates the signed curvature of a function, scaled to the unit box $[0,1]^2$.
See curvature
.
Other Functions
Base.:*
— MethodBase.*(A::AbstractMatrix, B::AbstractArray)
Computes the AbstractArray C where $C_{i_1 i_2 \dots i_d} = \sum_{l=1}^L A_{i_1 l} * B_{l i_2 \dots i_d}$.
This is equivilent to the $1$-mode product $B \times_1 A$.
Generalizes @einsum C[i,j,k] := A[i,l] * B[l,j,k].
MatrixTensorFactor.combined_norm
— Functioncombined_norm(u, v, ...)
Compute the combined norm of the arguments as if all arguments were part of one large array.
This is equivilent to norm(cat(u, v, ...))
, but this implimentation avoids creating an intermediate array.
u = [3 0]
v = [0 4 0]
combined_norm(u, v)
# output
5.0
MatrixTensorFactor.dist_to_Ncone
— Functiondist_to_Ncone(grad_A, grad_B, A, B)
Calculate the distance of the -gradient to the normal cone of the positive orthant.
MatrixTensorFactor.rel_error
— Functionrel_error(x, xhat)
Compute the relative error between x (true value) and xhat (its approximation).
The relative error is given by:
\[\frac{\lVert \hat{x} - x \rVert}{\lVert x \rVert}\]
See also mean_rel_error
.
MatrixTensorFactor.mean_rel_error
— Functionmean_rel_error(X, Xhat; dims=(1,2))
Compute the mean relative error between the dims-order slices of X and Xhat.
The mean relative error is given by:
\[\frac{1}{N}\sum_{j=1}^N\frac{\lVert \hat{X}_j - X_j \rVert}{\lVert X_j \rVert}\]
See also rel_error
.
MatrixTensorFactor.relative_error
— Functionrelative_error(Yhat, Y; normalize=:nothing)
Wrapper to use the relative error calculation according to the normalization used.
normalize==:nothing
: entry-wise L2 relative error between the two arraysnormalize==:fibres
: average L2 relative error between all 3-fibresnormalize==:slices
: average L2 relative error between all 1-mode slices
See also rel_error
, mean_rel_error
.
MatrixTensorFactor.slicewise_dot
— Functionslicewise_dot(A::AbstractArray, B::AbstractArray)
Constracts all but the first dimentions of A and B by performing a dot product over each dim=1
slice.
Generalizes @einsum C[s,r] := A[s,j,k]*B[r,j,k]
to arbitrary dimentions.
Index
MatrixTensorFactor
MatrixTensorFactor.DEFAULT_ALPHA
MatrixTensorFactor.DEFAULT_N_SAMPLES
MatrixTensorFactor.IMPLIMENTED_CRITERIA
MatrixTensorFactor.IMPLIMENTED_NORMALIZATIONS
MatrixTensorFactor.IMPLIMENTED_OPTIONS
MatrixTensorFactor.IMPLIMENTED_PROJECTIONS
MatrixTensorFactor.IMPLIMENTED_STEPSIZES
MatrixTensorFactor.MAX_STEP
MatrixTensorFactor.MIN_STEP
Base.:*
MatrixTensorFactor.combined_norm
MatrixTensorFactor.coordzip
MatrixTensorFactor.curvature
MatrixTensorFactor.d2_dx2
MatrixTensorFactor.d_dx
MatrixTensorFactor.default_bandwidth
MatrixTensorFactor.dist_to_Ncone
MatrixTensorFactor.filter_2d_inner_percentile
MatrixTensorFactor.filter_inner_percentile
MatrixTensorFactor.kde2d
MatrixTensorFactor.make_densities
MatrixTensorFactor.make_densities2d
MatrixTensorFactor.mean_rel_error
MatrixTensorFactor.nnmtf
MatrixTensorFactor.nnmtf_proxgrad_online
MatrixTensorFactor.rel_error
MatrixTensorFactor.relative_error
MatrixTensorFactor.repeatcoord
MatrixTensorFactor.slicewise_dot
MatrixTensorFactor.standard_curvature
MatrixTensorFactor.standardize_2d_KDEs
MatrixTensorFactor.standardize_KDEs