Exported Terms
MatrixTensorFactor — Module
Matrix-Tensor Factorization
Nonnegative Matrix-Tensor Factorization
MatrixTensorFactor.nnmtf — Function
nnmtf(Y::AbstractArray, R::Integer; kwargs...)
Non-negatively matrix-tensor factorizes an order-N tensor Y with a given "rank" R.
For an order $N=3$ tensor, this factorizes $Y \approx A B$ where $\displaystyle Y[i,j,k] \approx \sum_{r=1}^R A[i,r]*B[r,j,k]$ and the factors $A, B \geq 0$ are nonnegative.
For higher orders, this becomes $\displaystyle Y[i_1,i_2,\dots,i_N] \approx \sum_{r=1}^R A[i_1,r]*B[r,i_2,\dots,i_N].$
Note that there may NOT be a unique optimal solution.
Arguments
- Y::AbstractArray{T,N}: tensor to factorize
- R::Integer: rank to factorize Y (size(A)[2] and size(B)[1])
Keywords
- maxiter::Integer=100: maximum number of iterations
- tol::Real=1e-3: desired tolerance for the convergence criterion
- rescale_AB::Bool=true: scale B at each iteration so that the factors (horizontal slices) have similar 3-fiber sums.
- rescale_Y::Bool=true: preprocesses the input Y to have normalized 3-fiber sums (on average), and rescales the final B so Y=A*B.
- normalize::Symbol=:fibres: part of B that should be normalized (must be in IMPLIMENTED_NORMALIZATIONS)
- projection::Symbol=:nnscale: constraint to use and method for enforcing it (must be in IMPLIMENTED_PROJECTIONS)
- criterion::Symbol=:ncone: how to determine if the algorithm has converged (must be in IMPLIMENTED_CRITERIA)
- stepsize::Symbol=:lipshitz: used for the gradient descent step (must be in IMPLIMENTED_STEPSIZES)
- momentum::Bool=false: use momentum updates
- delta::Real=0.9999: safeguard for the maximum amount of momentum (see eq. (3.5) in Xu & Yin 2013)
- R_max::Integer=size(Y)[1]: maximum rank to try if R is not given
- projectionA::Symbol=projection: projection to use on factor A (must be in IMPLIMENTED_PROJECTIONS)
- projectionB::Symbol=projection: projection to use on factor B (must be in IMPLIMENTED_PROJECTIONS)
- A_init::AbstractMatrix=nothing: initial A for the iterative algorithm. Should be kept as nothing if R is not given.
- B_init::AbstractArray=nothing: initial B for the iterative algorithm. Should be kept as nothing if R is not given.
Returns
- A::Matrix{Float64}: the matrix A in the factorization Y ≈ A * B
- B::Array{Float64, N}: the tensor B in the factorization Y ≈ A * B
- rel_errors::Vector{Float64}: relative errors at each iteration
- norm_grad::Vector{Float64}: norm of the full gradient at each iteration
- dist_Ncone::Vector{Float64}: distance of the -gradient to the normal cone at each iteration
- If R was estimated, also returns the optimal R::Integer
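A minimal usage sketch (the data here is hypothetical, and the outputs are assumed to come back as a tuple in the order listed above):

```julia
using MatrixTensorFactor

Y = abs.(randn(10, 20, 30))   # hypothetical nonnegative order-3 tensor

# Factorize with rank R = 4, using the keyword defaults described above
A, B, rel_errors, norm_grad, dist_Ncone = nnmtf(Y, 4; maxiter=200, tol=1e-3)

size(A)  # (10, 4)
size(B)  # (4, 20, 30)
```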
Implementation of block coordinate descent updates
We calculate the partial gradients and corresponding Lipschitz constants as follows:
\[\begin{align} \boldsymbol{P}^{t}[q,r] &=\textstyle{\sum}_{jk} \boldsymbol{\mathscr{B}}^{t}[q,j,k] \boldsymbol{\mathscr{B}}^{t}[r,j,k]\\ \boldsymbol{Q}^{t}[i,r] &=\textstyle{\sum}_{jk}\boldsymbol{\mathscr{Y}}[i,j,k] \boldsymbol{\mathscr{B}}^{t}[r,j,k] \\ \nabla_{A} f(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}) &= \boldsymbol{A}^{t} \boldsymbol{P}^{t} - \boldsymbol{Q}^{t} \\ L_{A} &= \left\lVert \boldsymbol{P}^{t} \right\rVert_{2}. \end{align}\]
Similarly for $\boldsymbol{\mathscr{B}}$:
\[\begin{align} \boldsymbol{T}^{t+1}&=(\boldsymbol{A}^{t+\frac12})^\top \boldsymbol{A}^{t+\frac12}\\ \boldsymbol{\mathscr{U}}^{t+1}&=(\boldsymbol{A}^{t+\frac12})^\top \boldsymbol{\mathscr{Y}} \\ \nabla_\boldsymbol{\mathscr{B}} f(\boldsymbol{A}^{t+\frac12},\boldsymbol{\mathscr{B}}^{t}) &= \boldsymbol{T}^{t+1} \boldsymbol{\mathscr{B}}^{t} - \boldsymbol{\mathscr{U}}^{t+1} \\ L_B &= \left\lVert \boldsymbol{T}^{t+1} \right\rVert_{2}. \end{align}\]
To ensure the iterates stay "close" to normalized, we introduce a renormalization step after the projected gradient updates:
\[\begin{align} \boldsymbol{C}[r,r]&=\frac{1}{J}\textstyle{\sum}_{jk} \boldsymbol{\mathscr{B}}^{t+\frac12}[r,j,k]\\ \boldsymbol{A}^{t+1}&= \boldsymbol{A}^{t+\frac12} \boldsymbol{C}\\ \boldsymbol{\mathscr{B}}^{t+1}&= \boldsymbol{C}^{-1}\boldsymbol{\mathscr{B}}^{t+\frac12}. \end{align}\]
We typically use the following convergence criterion:
\[d(-\nabla \ell(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}), N_{\mathcal{C}}(\boldsymbol{A}^{t},\boldsymbol{\mathscr{B}}^{t}))^2\leq\delta^2 R(I+JK).\]
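In code, this check might look roughly like the sketch below, where dist_Ncone is the history returned by nnmtf and δ is taken to be the tolerance tol (an assumption; see the source for the exact scaling used):

```julia
# Sketch of the stopping rule above (Y, R, tol, dist_Ncone assumed in scope)
I, J, K = size(Y)
converged = dist_Ncone[end]^2 <= tol^2 * R * (I + J*K)
```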
MatrixTensorFactor.nnmtf_proxgrad_online — Function
nnmtf using online proximal (projected) gradient descent, alternating through blocks (BCD).
Updates Y each iteration with a new sample.
Implemented Factorization Options
MatrixTensorFactor.IMPLIMENTED_OPTIONS — Constant
Lists all implemented options.
MatrixTensorFactor.IMPLIMENTED_NORMALIZATIONS — Constant
IMPLIMENTED_NORMALIZATIONS::Set{Symbol}
- :fibre: set $\sum_{k=1}^K B[r,j,k] = 1$ for all $r, j$, or when projection==:nnscale, set $\sum_{j=1}^J\sum_{k=1}^K B[r,j,k] = J$ for all $r$
- :slice: set $\sum_{j=1}^J\sum_{k=1}^K B[r,j,k] = 1$ for all $r$
- :nothing: does not enforce any normalization of B
MatrixTensorFactor.IMPLIMENTED_PROJECTIONS — Constant
IMPLIMENTED_PROJECTIONS::Set{Symbol}
- :nnscale: two-stage block coordinate descent: 1) projected gradient descent onto the nonnegative orthant, 2) shift any weight from B to A according to the normalization. Equivalent to :nonnegative when normalization==:nothing.
- :simplex: Euclidean projection onto the simplex blocks according to normalization
- :nonnegative: zero out negative entries
MatrixTensorFactor.IMPLIMENTED_CRITERIA — Constant
IMPLIMENTED_CRITERIA::Set{Symbol}
- :ncone: vector-set distance between the -gradient of the objective and the normal cone
- :iterates: A, B before and after one iteration are close in the L2 norm
- :objective: the objective is small
- :relativeerror: relative error is small (when normalize=:nothing), or the mean relative error averaged over fibres or slices when the normalization is :fibres or :slices respectively
MatrixTensorFactor.IMPLIMENTED_STEPSIZES — Constant
IMPLIMENTED_STEPSIZES::Set{Symbol}
- :lipshitz: gradient step 1/L for Lipschitz constant L
- :spg: spectral projected gradient stepsize
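These option sets can be inspected directly, and the symbols are passed to nnmtf as keywords. A brief sketch (the data Y and rank are hypothetical):

```julia
:simplex in MatrixTensorFactor.IMPLIMENTED_PROJECTIONS  # true

# Select non-default options when factorizing
A, B, rel_errors, norm_grad, dist_Ncone = nnmtf(Y, 3;
    normalize=:slice,
    projection=:simplex,
    criterion=:relativeerror,
)
```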
Constants
MatrixTensorFactor.MAX_STEP — Constant
MAX_STEP = 1e10
Maximum step size allowed for the spg stepsize method.
MatrixTensorFactor.MIN_STEP — Constant
MIN_STEP = 1e-10
Minimum step size allowed for the spg stepsize method.
Kernel Density Estimation
Constants
MatrixTensorFactor.DEFAULT_N_SAMPLES — Constant
DEFAULT_N_SAMPLES = 64::Integer
Number of samples to use when standardizing a vector of density estimates.
MatrixTensorFactor.DEFAULT_ALPHA — Constant
DEFAULT_ALPHA = 0.9::Real
Smoothing parameter for calculating a kernel's bandwidth.
1D
MatrixTensorFactor.default_bandwidth — Function
default_bandwidth(data; alpha=0.9, inner_percentile=100)
Copied from KernelDensity since this function is not exported there. It is reproduced here so that the same bandwidth can be used for different densities of the same measurements.
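For illustration, with hypothetical data, the returned bandwidth can then be shared across related density estimates:

```julia
data = randn(500)                       # hypothetical measurements
h = default_bandwidth(data; alpha=0.9)  # one bandwidth reused for all densities of this measurement
```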
MatrixTensorFactor.make_densities — Function
make_densities(s::Sink; kwargs...)
make_densities(s::Sink, domains::AbstractVector{<:AbstractVector}; kwargs...)
Estimates the densities for each measurement in a Sink.
When given domains (a list where each entry is the domain for a different measurement), the kernel is resampled on that domain.
Parameters
- bandwidths::AbstractVector{<:Real}: list of bandwidths used for each measurement's density estimation
- inner_percentile::Integer=100: value between 0 and 100 that filters each measurement by its inner percentile range. This can help remove outliers and focus on where the bulk of the data is.
Returns
density_estimates::Vector{UnivariateKDE}
MatrixTensorFactor.make_densities2d — Function
make_densities2d(s::Sink; kwargs...)
make_densities2d(s::Sink, domains::AbstractVector{<:AbstractVector}; kwargs...)
Similar to make_densities but performs the KDE on 2 measurements jointly.
MatrixTensorFactor.standardize_KDEs — Function
standardize_KDEs(KDEs::AbstractVector{UnivariateKDE}; n_samples=DEFAULT_N_SAMPLES)
Resample the densities so they are all sampled from the same domain.
Resample the densities within each sink/source so that like-measurements use the same scale.
MatrixTensorFactor.standardize_2d_KDEs — Function
standardize_2d_KDEs(KDEs::AbstractVector{BivariateKDE}; n_samples=DEFAULT_N_SAMPLES)
Resample the densities so they are all sampled from the same x and y coordinates.
MatrixTensorFactor.filter_inner_percentile — Function
Filters elements so that only the ones in the inner P percentile remain. See filter_2d_inner_percentile.
MatrixTensorFactor.filter_2d_inner_percentile — Function
Filters 2d elements so that only the ones in the inner P percentile remain. See filter_inner_percentile.
2D
MatrixTensorFactor.repeatcoord — Function
repeatcoord(coordinates, values)
Repeats each coordinate the number of times given by the corresponding value.
Both lists should be the same length.
Example
coordinates = [(0,0), (1,1), (1,2)]
values = [1, 3, 2]
repeatcoord(coordinates, values)
[(0,0), (1,1), (1,1), (1,1), (1,2), (1,2)]
MatrixTensorFactor.kde2d — Function
kde2d((xs, ys), values)
Performs a 2d KDE based on two lists of coordinates and the value at those coordinates.
Input
- xs, ys::Vector{Real}: coordinates/locations of samples
- values::Vector{Integer}: value of the sample
Returns
- f::BivariateKDE: use f.x, f.y for the location of the (re)sampled KDE, and f.density for the sample values of the KDE
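A short sketch with made-up coordinates and counts:

```julia
xs = [0.0, 1.0, 1.0]   # hypothetical sample locations
ys = [0.0, 1.0, 2.0]
values = [1, 3, 2]     # value/count at each location

f = kde2d((xs, ys), values)
f.x, f.y     # grid coordinates of the estimated density
f.density    # density values on that grid
```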
MatrixTensorFactor.coordzip — Function
coordzip(rcoords)
Zips the "x" and "y" values together into a list of x coords and y coords.
Example
coordzip([(0,0), (1,1), (1,1), (1,1), (1,2), (1,2)])
[[0, 1, 1, 1, 1, 1], [0, 1, 1, 1, 2, 2]]
Approximations
MatrixTensorFactor.d_dx — Function
d_dx(y::AbstractVector{<:Real})
Approximates the 1st derivative of a function using only the given samples y of that function.
Assumes y came from f(x) where x was an evenly sampled, unit-interval grid. Note the approximation uses centered three-point finite differences for the next-to-end points, and forward/backward three-point differences for the beginning/end points respectively. The remaining interior points use five-point differences.
Will use the largest-order method possible by default (currently 5 points), but a specific order can be forced with the keyword order. See d2_dx2.
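A small sketch, assuming samples taken on a unit-spaced grid as described:

```julia
y = [x^2 for x in 0:5]   # f(x) = x² sampled at x = 0, 1, …, 5
d_dx(y)                  # ≈ [0, 2, 4, 6, 8, 10], i.e. 2x at each sample
```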
MatrixTensorFactor.d2_dx2 — Function
d2_dx2(y::AbstractVector{<:Real}; order::Integer=length(y))
Approximates the 2nd derivative of a function using only the given samples y of that function.
Assumes y came from f(x) where x was an evenly sampled, unit-interval grid. Note the approximation uses centered three-point finite differences for the next-to-end points, and forward/backward three-point differences for the beginning/end points respectively. The remaining interior points use five-point differences.
Will use the largest-order method possible by default (currently 5 points), but a specific order can be forced with the keyword order. See d_dx.
MatrixTensorFactor.curvature — Function
curvature(y::AbstractVector{<:Real})
Approximates the signed curvature of a function given evenly spaced samples.
Uses d_dx and d2_dx2 to approximate the first two derivatives.
MatrixTensorFactor.standard_curvature — Function
standard_curvature(y::AbstractVector{<:Real})
Approximates the signed curvature of a function, scaled to the unit box $[0,1]^2$.
See curvature.
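A sketch on a parabola sampled on a unit grid (hypothetical data):

```julia
y = [x^2 / 2 for x in 0:10]   # f(x) = x²/2, so f'(x) = x and f''(x) = 1
curvature(y)                  # ≈ 1 / (1 + x²)^(3/2) at each sample
standard_curvature(y)         # same idea, after rescaling the samples to the unit box
```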
Other Functions
Base.:* — Method
Base.*(A::AbstractMatrix, B::AbstractArray)
Computes the AbstractArray C where $C_{i_1 i_2 \dots i_d} = \sum_{l=1}^L A_{i_1 l} * B_{l i_2 \dots i_d}$.
This is equivalent to the $1$-mode product $B \times_1 A$.
Generalizes @einsum C[i,j,k] := A[i,l] * B[l,j,k].
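For example, with hypothetical sizes:

```julia
A = rand(2, 3)     # matrix
B = rand(3, 4, 5)  # order-3 tensor
C = A * B          # C[i,j,k] = Σ_l A[i,l] * B[l,j,k]
size(C)            # (2, 4, 5)
```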
MatrixTensorFactor.combined_norm — Function
combined_norm(u, v, ...)
Compute the combined norm of the arguments as if all arguments were part of one large array.
This is equivalent to norm(cat(u, v, ...)), but this implementation avoids creating an intermediate array.
u = [3 0]
v = [0 4 0]
combined_norm(u, v)

# output

5.0

MatrixTensorFactor.dist_to_Ncone — Function
dist_to_Ncone(grad_A, grad_B, A, B)
Calculate the distance of the -gradient to the normal cone of the positive orthant.
MatrixTensorFactor.rel_error — Function
rel_error(x, xhat)
Compute the relative error between x (the true value) and xhat (its approximation).
The relative error is given by:
\[\frac{\lVert \hat{x} - x \rVert}{\lVert x \rVert}\]
See also mean_rel_error.
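A quick numerical example:

```julia
x    = [1.0, 2.0, 2.0]   # true value, with norm 3
xhat = [1.0, 2.0, 3.0]   # approximation
rel_error(x, xhat)       # norm(xhat - x) / norm(x) = 1/3 ≈ 0.333
```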
MatrixTensorFactor.mean_rel_error — Function
mean_rel_error(X, Xhat; dims=(1,2))
Compute the mean relative error between the dims-order slices of X and Xhat.
The mean relative error is given by:
\[\frac{1}{N}\sum_{j=1}^N\frac{\lVert \hat{X}_j - X_j \rVert}{\lVert X_j \rVert}\]
See also rel_error.
MatrixTensorFactor.relative_error — Function
relative_error(Yhat, Y; normalize=:nothing)
Wrapper to use the relative error calculation according to the normalization used.
- normalize==:nothing: entry-wise L2 relative error between the two arrays
- normalize==:fibres: average L2 relative error between all 3-fibres
- normalize==:slices: average L2 relative error between all 1-mode slices
See also rel_error, mean_rel_error.
MatrixTensorFactor.slicewise_dot — Function
slicewise_dot(A::AbstractArray, B::AbstractArray)
Contracts all but the first dimensions of A and B by performing a dot product over each dim=1 slice.
Generalizes @einsum C[s,r] := A[s,j,k]*B[r,j,k] to arbitrary dimensions.
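For example, with hypothetical sizes:

```julia
A = rand(2, 3, 4)
B = rand(5, 3, 4)
C = slicewise_dot(A, B)   # C[s,r] = Σ_{j,k} A[s,j,k] * B[r,j,k]
size(C)                   # (2, 5)
```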
Index
- MatrixTensorFactor
- MatrixTensorFactor.DEFAULT_ALPHA
- MatrixTensorFactor.DEFAULT_N_SAMPLES
- MatrixTensorFactor.IMPLIMENTED_CRITERIA
- MatrixTensorFactor.IMPLIMENTED_NORMALIZATIONS
- MatrixTensorFactor.IMPLIMENTED_OPTIONS
- MatrixTensorFactor.IMPLIMENTED_PROJECTIONS
- MatrixTensorFactor.IMPLIMENTED_STEPSIZES
- MatrixTensorFactor.MAX_STEP
- MatrixTensorFactor.MIN_STEP
- Base.:*
- MatrixTensorFactor.combined_norm
- MatrixTensorFactor.coordzip
- MatrixTensorFactor.curvature
- MatrixTensorFactor.d2_dx2
- MatrixTensorFactor.d_dx
- MatrixTensorFactor.default_bandwidth
- MatrixTensorFactor.dist_to_Ncone
- MatrixTensorFactor.filter_2d_inner_percentile
- MatrixTensorFactor.filter_inner_percentile
- MatrixTensorFactor.kde2d
- MatrixTensorFactor.make_densities
- MatrixTensorFactor.make_densities2d
- MatrixTensorFactor.mean_rel_error
- MatrixTensorFactor.nnmtf
- MatrixTensorFactor.nnmtf_proxgrad_online
- MatrixTensorFactor.rel_error
- MatrixTensorFactor.relative_error
- MatrixTensorFactor.repeatcoord
- MatrixTensorFactor.slicewise_dot
- MatrixTensorFactor.standard_curvature
- MatrixTensorFactor.standardize_2d_KDEs
- MatrixTensorFactor.standardize_KDEs