Constrained Factorization
Example
A number of constraints are ready to use out of the box. Say you would like to perform a rank 5 CP decomposition of a 3rd order tensor such that every column of every factor is constrained to the simplex. This can be accomplished with the following code.
X, stats, kwargs = factorize(Y; model=CPDecomposition, rank=5, constraints=simplex_cols!)
Only one constraint is given, but the CP decomposition of Y has 3 factors, so this single constraint is assumed to apply to every factor.
Say instead you only want the 1st factor to have columns constrained to the simplex, and the 3rd factor to be nonnegative. You can use the following code.
X, stats, kwargs = factorize(Y; model=CPDecomposition, rank=5,
constraints=[simplex_cols!, noconstraint, nonnegative!])
Ready-to-use Constraints
Entrywise
BlockTensorFactorization.Core.Entrywise — Type
Entrywise <: AbstractConstraint
Entrywise(apply::Function, check::Function)
Entry-wise constraint. Both apply and check are performed entry-wise on an array.
The constraints nonnegative! and binary! ensure each entry in a factor is nonnegative or binary (either 0 or 1), respectively.
BlockTensorFactorization.Core.nonnegative! — Function
Entrywise(ReLU, isnonnegative)
BlockTensorFactorization.Core.binary! — Function
Entrywise(binaryproject, x -> x in (0, 1))
If you want every entry to be in the closed interval [a, b], you can use IntervalConstraint(a, b).
BlockTensorFactorization.Core.IntervalConstraint — Function
Entrywise(x -> clamp(x, a, b), x -> a ≤ x ≤ b)
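For example, the interval constraint can be passed to factorize like any other constraint. The following is a minimal sketch, assuming the exported names from the examples above; Y is a placeholder tensor.
using BlockTensorFactorization
Y = rand(10, 10, 10)                       # placeholder 3rd order tensor
unit_interval! = IntervalConstraint(0, 1)  # clamp every entry to [0, 1]
X, stats, kwargs = factorize(Y; model=CPDecomposition, rank=5, constraints=unit_interval!)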
Normalizations
BlockTensorFactorization.Core.ProjectedNormalization — Type
ProjectedNormalization(projection, norm; whats_normalized=identity)
Main constructor for the constraint where norm of whats_normalized equals scale.
Scale can be a single Real, or an AbstractArray{<:Real}, but should be the same size as the output of whats_normalized.
These ensure a factor is normalized according to the L1, L2, and L infinity norms. This is accomplished through a Euclidean projection onto the corresponding unit norm ball.
BlockTensorFactorization.Core.l1normalize! — Function
l1normalize! <: ProjectedNormalization
Alias for ProjectedNormalization(l1norm, l1project!; whats_normalized=identityslice).
BlockTensorFactorization.Core.l2normalize! — Function
l2normalize! <: ProjectedNormalization
Alias for ProjectedNormalization(l2norm, l2project!; whats_normalized=identityslice).
BlockTensorFactorization.Core.linftynormalize! — Function
linftynormalize! <: ProjectedNormalization
Alias for ProjectedNormalization(linftynorm, linftyproject!; whats_normalized=identityslice).
Each comes in versions that constrain each row, column, order-1 slice, and order-(1, 2) slice to the associated unit norm ball.
l1normalize_rows!
l1normalize_cols!
l1normalize_1slices!
l1normalize_12slices!
l2normalize_rows!
l2normalize_cols!
l2normalize_1slices!
l2normalize_12slices!
linftynormalize_rows!
linftynormalize_cols!
linftynormalize_1slices!
linftynormalize_12slices!
Projection onto the unit norm ball is not unique at the origin, so do not expect consistent behaviour with an all-zeros input.
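As a usage sketch (the same factorize pattern as the simplex example above; the specific choice of l2normalize_cols! is only illustrative):
# constrain every column of every factor to unit L2 norm
X, stats, kwargs = factorize(Y; model=CPDecomposition, rank=5, constraints=l2normalize_cols!)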
Scaling Constraints
BlockTensorFactorization.Core.ScaledNormalization — Type
ScaledNormalization(norm; whats_normalized=identity, scale=1)
Main constructor for the constraint where the norm of whats_normalized equals scale.
scale can be a single Real or an AbstractArray{<:Real}, but should be broadcast-able with the output of whats_normalized. Lastly, scale can be a Function which acts on an AbstractArray{<:Real} and returns something that is broadcast-able with the output of whats_normalized.
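For instance, a custom scaled normalization where each column's L1 norm equals the number of rows of the factor might look like this (a sketch only; this particular scale function is illustrative and not a built-in):
colsum_to_nrows! = ScaledNormalization(l1norm; whats_normalized=eachcol, scale=A -> size(A, 1))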
Each previously listed normalization has an associated scaled normalization. These ensure the relevant subarrays are normalized, but rather than enforcing this with a Euclidean projection, they simply divide each subarray by its norm. This is equivalent to the Euclidean projection onto the L2 norm ball, but is a different operation for the L1 and L infinity balls. This offers the advantage that other factors can be "rescaled" to compensate for this division, which is not normally possible with the projections onto the L1 and L infinity balls.
By default, when these constraints are applied, they will "rescale" the other factors to minimize the change in the product of all the factors.
Example
Say we are performing CPDecomposition on a matrix Y. This is equivalent to factorizing Y = A * B'. If we would like all columns of B (rows of B transpose) to be on the L1 ball, rather than projecting each column, we can instead divide each column of B by its L1 norm, and multiply the associated column of A by the same amount. This has the advantage of enforcing our constraint without affecting the product A * B', whereas a projection could change this product.
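This rescaling trick can be checked directly in plain Julia, independent of the package (a standalone sketch):
using LinearAlgebra
A, B = rand(4, 2), rand(5, 2)
c = [norm(col, 1) for col in eachcol(B)]  # L1 norm of each column of B
B_scaled = B ./ c'                        # each column of B_scaled now has unit L1 norm
A_scaled = A .* c'                        # compensate by scaling the matching columns of A
A_scaled * B_scaled' ≈ A * B'             # true: the product is unchanged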
l1scale!
l1scale_rows!
l1scale_cols!
l1scale_1slices!
l1scale_12slices!
l2scale!
l2scale_rows!
l2scale_cols!
l2scale_1slices!
l2scale_12slices!
linftyscale!
linftyscale_rows!
linftyscale_cols!
linftyscale_1slices!
linftyscale_12slices!
There is also a set of constraints that ensure the order-(1,2) slices are scaled on average. This makes it possible to preserve a Tucker1 product when you would like each order-(1,2) slice normalized.
l1scale_average12slices!
l2scale_average12slices!
linftyscale_average12slices!
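As a sketch of how this might be used (the Tucker1 model name here is an assumption, inferred from the mention of a Tucker1 product above; the constraint choice is illustrative):
X, stats, kwargs = factorize(Y; model=Tucker1, rank=2, constraints=l2scale_average12slices!)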
Simplex Constraint
Similar to the L1 normalization constraints, these constraints ensure the relevant subarrays are on the L1 ball, but they also ensure all entries are nonnegative. This is enforced with a single Euclidean projection onto the relevant simplex.
simplex!
simplex_cols!
simplex_1slices!
simplex_12slices!
List of built-in ProjectedNormalizations and ScaledNormalizations
BlockTensorFactorization.Core.BUILT_IN_CONSTRAINTS — Constant
List of symbols of the built-in constraint functions.
l1normalize!
l1normalize_rows!
l1normalize_cols!
l1normalize_1slices!
l1normalize_12slices!
l2normalize!
l2normalize_rows!
l2normalize_cols!
l2normalize_1slices!
l2normalize_12slices!
linftynormalize!
linftynormalize_rows!
linftynormalize_cols!
linftynormalize_1slices!
linftynormalize_12slices!
simplex!
simplex_rows!
simplex_cols!
simplex_1slices!
simplex_12slices!
l1scale!
l1scale_rows!
l1scale_cols!
l1scale_1slices!
l1scale_12slices!
l1scale_average12slices!
l2scale!
l2scale_rows!
l2scale_cols!
l2scale_1slices!
l2scale_12slices!
l2scale_average12slices!
linftyscale!
linftyscale_rows!
linftyscale_cols!
linftyscale_1slices!
linftyscale_12slices!
linftyscale_average12slices!
Linear Constraints
BlockTensorFactorization.Core.LinearConstraint — Type
LinearConstraint(A::T, B::AbstractArray) where {T <: Union{Function, AbstractArray}}
The constraint AX = B for a linear operator A and array B.
When A is a matrix, this projects onto the subspace of solutions, with the update given by
X .-= A' * ( (A*A') \ (A*X .- B) ).
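The projection formula can be verified in plain Julia (a standalone sketch, not the package's implementation):
using LinearAlgebra
A, B = rand(2, 5), rand(2, 3)
X = rand(5, 3)
X .-= A' * ((A * A') \ (A * X .- B))  # project X onto the affine set {X : A*X == B}
A * X ≈ B                             # true, up to floating point error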
Advanced Constraints
Ending with a different set of constraints
It is possible to apply a different set of constraints at the end of the algorithm than what is enforced during the iterations. For example, you can perform nonnegative Tucker factorization of a tensor Y, but apply a simplex constraint on the core at the very end.
X, stats, kwargs = factorize(Y; model=Tucker, rank=2,
constraints=nonnegative!,
final_constraints=[simplex!, noconstraint, noconstraint, noconstraint])
In cases where constraints affect other factors (e.g. l2scale!), you may want to perform a final pass of the constraints to ensure each factor is scaled correctly, without rescaling/affecting other factors.
X, stats, kwargs = factorize(Y; model=CPDecomposition, rank=3,
constraints=l2scale_cols!,
constrain_output=true)
Composing Constraints
AbstractConstraint types can be composed with ∘ (type \circ and press tab), creating a ComposedConstraint.
BlockTensorFactorization.Core.ComposedConstraint — Type
ComposedConstraint{T<:AbstractConstraint, U<:AbstractConstraint}
outer_constraint ∘ inner_constraint
Composing any two AbstractConstraints with ∘ will return this type.
Applies the inner constraint first, then the outer constraint. Checking a ComposedConstraint will check both constraints are satisfied.
All a ComposedConstraint does is apply the two constraints in series; it does not do anything intelligent like finding the intersection of the constraints. For this reason, the following three constraints are all different.
l1normalize! ∘ nonnegative!
nonnegative! ∘ l1normalize!
simplex!
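A composed constraint can then be passed to factorize just like any single constraint (a sketch, reusing the call pattern from the earlier examples):
nonneg_then_l1! = l1normalize! ∘ nonnegative!  # apply nonnegative! first, then l1normalize!
X, stats, kwargs = factorize(Y; model=CPDecomposition, rank=5, constraints=nonneg_then_l1!)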
Custom Constraints
You can define your own ProjectedNormalization, ScaledNormalization, or Entrywise constraint using the following constructors.
ScaledNormalization(norm; whats_normalized=identityslice, scale=1)
ProjectedNormalization(norm, projection; whats_normalized=identityslice)
Entrywise(apply, check)
In fact, this is how the ready-to-use constraints are made. Here are some examples.
l2scale_1slices! = ScaledNormalization(l2norm; whats_normalized=(x -> eachslice(x; dims=1)))
l1normalize_rows! = ProjectedNormalization(l1norm, l1project!; whats_normalized=eachrow)
nonnegative! = Entrywise(ReLU, isnonnegative)
IntervalConstraint(a, b) = Entrywise(x -> clamp(x, a, b), x -> a <= x <= b)
You can also make a custom constraint with GenericConstraint.
BlockTensorFactorization.Core.GenericConstraint — Type
GenericConstraint <: AbstractConstraint
General constraint. Simply applies the function apply and checks it was successful with check.
Calling a GenericConstraint on an AbstractArray will use the function in the field apply. Use check(C::GenericConstraint, A) to use the function in the field check.
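A minimal sketch of a custom GenericConstraint, assuming a positional (apply, check) constructor analogous to Entrywise (the constraint shown is purely illustrative):
# hypothetical constraint: every column of a factor must be sorted in nondecreasing order
sortedcols! = GenericConstraint(A -> foreach(sort!, eachcol(A)), A -> all(issorted, eachcol(A)))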
Manual Constraint Updates
You can manually define the ConstraintUpdate that gets applied as part of the block decomposition method. These will be automatically inserted into the order of updates by smart_interlace!, immediately following the last update of the matching block.
As an example, if we are performing CP decomposition on an order 3 tensor, the unconstrained block optimization would look something like this.
BlockUpdate(
    GradientDescent(1, gradient, step),
    GradientDescent(2, gradient, step),
    GradientDescent(3, gradient, step),
)
Here the 1, 2, and 3 denote which factor gets updated. If we want to apply a simplex constraint to the 2nd factor, and a nonnegativity constraint to the 3rd, we can do the following.
X, stats, kwargs = factorize(Y; model=CPDecomposition, rank=5, constraints=[ConstraintUpdate(3, nonnegative!), Projection(2, simplex!)])
This would result in the following block update.
BlockUpdate(
    GradientDescent(1, gradient, step),
    GradientDescent(2, gradient, step),
    Projection(2, simplex!),
    GradientDescent(3, gradient, step),
    Projection(3, nonnegative!),
)
Note that the order of [ConstraintUpdate(3, nonnegative!), Projection(2, simplex!)] does not matter. Also, the (abstract) constructor ConstraintUpdate will reduce to a concrete type when possible.
BlockTensorFactorization.Core.ConstraintUpdate — Type
ConstraintUpdate(n, constraint)
Converts an AbstractConstraint to a ConstraintUpdate on the factor n.
When using ScaledNormalizations, you may want to manually define what gets rescaled using the downstream ConstraintUpdate concrete type: Rescale.
BlockTensorFactorization.Core.Rescale — Type
Rescale{T<:Union{Nothing,Missing,Function}} <: ConstraintUpdate
Rescale(n, scale::ScaledNormalization, whats_rescaled::T)
Applies the scaled normalization scale to factor n, and tries to multiply the scaling of factor n to other factors.
If whats_rescaled=nothing, then it will not rescale any other factor.
If whats_rescaled=missing, then it will try to evenly distribute the weight to all other factors using the (N-1)th root of each weight, where N is the number of factors. If the weights are not broadcastable (e.g. you want to scale each row but each factor has a different number of rows), it will use the geometric mean of the weights as the single weight to distribute evenly among the other factors.
If typeof(whats_rescaled) <: Function, it will broadcast the weight to the output of calling this function on the entire decomposition. For example, whats_rescaled = x -> eachcol(factor(x, 2)) will rescale each column of the second factor of the decomposition.
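Putting this together, a Rescale can be passed in the constraints list just like any other ConstraintUpdate (a sketch based on the earlier factorize examples; the particular factors chosen are only illustrative):
# scale each column of factor 1, pushing the removed weight onto the columns of factor 2
X, stats, kwargs = factorize(Y; model=CPDecomposition, rank=5,
    constraints=[Rescale(1, l2scale_cols!, x -> eachcol(factor(x, 2)))])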