ILNumerics Ultimate VS

MachineLearning.em Method

Expectation maximization algorithm.

[ILNumerics Machine Learning Toolbox]

Namespace:  ILNumerics.Toolboxes
Assembly:  ILNumerics.Toolboxes.MachineLearning (in ILNumerics.Toolboxes.MachineLearning.dll) Version: 5.5.0.0 (5.5.7503.3146)
Syntax

public static RetArray<double> em(
	InArray<double> Samples,
	int k,
	OutArray<double> Sigma = null,
	EMInitializationMethod method = EMInitializationMethod.KMeans_random,
	InArray<double> UserCenters = null,
	int maxiterexit = 10000,
	double centerconverg_exit = 0.001
)

Parameters

Samples
Type: ILNumerics.InArray<Double>
Input data, one data point per column.
k
Type: System.Int32
Number of clusters.
Sigma (Optional)
Type: ILNumerics.OutArray<Double>
[Output] Covariance estimates for all clusters, size d x d x k, where d = Samples.D[0] is the dimensionality of the data.
method (Optional)
Type: ILNumerics.Toolboxes.EMInitializationMethod
[Optional] Method used for initializing the cluster centers, default: EMInitializationMethod.KMeans_random.
UserCenters (Optional)
Type: ILNumerics.InArray<Double>
[Optional] Initial cluster centers, size Samples.D[0] x k. Used only with the 'user' initialization method; ignored for all other methods.
maxiterexit (Optional)
Type: System.Int32
[Optional] Stop after this number of iterations if convergence has not been reached, default: 10000.
centerconverg_exit (Optional)
Type: System.Double
[Optional] Stop iterating once norm(L) falls below this value, where L is the change in the cluster centers between two consecutive iterations, default: 0.001.

Return Value

Type: RetArray<Double>
Estimated centers for all clusters, size Samples.D[0] x k.
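
A call might look like the following. This is a minimal sketch that follows the signature above and is not taken from the ILNumerics documentation. The helper functions randn, horzcat and zeros are assumed to be available from ILNumerics.ILMath, and the synthetic data, the class name EmExample and the variable names are made up for illustration.

```csharp
using System;
using ILNumerics;
using ILNumerics.Toolboxes;
using static ILNumerics.ILMath;   // assumed source of randn, horzcat, zeros

class EmExample {
    static void Main() {
        // Two synthetic 2-D clusters with 100 points each, one data point per column.
        Array<double> a = randn(2, 100);
        Array<double> b = randn(2, 100) + 5;
        Array<double> samples = horzcat<double>(a, b);   // size 2 x 200

        int k = 2;   // number of clusters, chosen by the caller

        // Receives the covariance estimates, size d x d x k (here 2 x 2 x 2).
        Array<double> sigma = zeros<double>(2, 2, k);

        // All remaining parameters keep their defaults.
        Array<double> centers = MachineLearning.em(samples, k, sigma);

        Console.WriteLine(centers);   // estimated centers, size 2 x 2
        Console.WriteLine(sigma);     // estimated covariances
    }
}
```

With the default initialization method the cluster centers are initialized randomly, so the order of the returned centers may differ between runs.
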
Remarks

The EM algorithm assumes that the data samples are drawn from k multivariate normal distributions. It estimates the parameters 'center' and 'sigma' (covariance) of each distribution. The position and 'shape' of each distribution are chosen so that the likelihood of generating the given sample points is maximized.

The parameter k must be supplied by the user. It reflects a priori knowledge of the number of distributions (clusters) in the data.

The algorithm stops as soon as one of the following exit criteria is met:

  • norm(L) < 'centerconverg_exit', where L is the difference between the centers from the previous step and the centers computed in the current step;
  • the number of iteration steps exceeds the limit of 'maxiterexit' iterations.
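
The two exit criteria can be sketched in plain C#, independent of ILNumerics. This is an illustration only and not the toolbox's implementation: the Frobenius norm is assumed for norm(L), the iteration limit is assumed to mean "at most maxiterexit iterations", and the names ExitCriteriaSketch, Run, CenterShift and the step callback are hypothetical.

```csharp
using System;

static class ExitCriteriaSketch {
    // Frobenius norm of (current - previous). The choice of norm is an assumption.
    static double CenterShift(double[,] previous, double[,] current) {
        double sum = 0.0;
        for (int i = 0; i < current.GetLength(0); i++) {
            for (int j = 0; j < current.GetLength(1); j++) {
                double diff = current[i, j] - previous[i, j];
                sum += diff * diff;
            }
        }
        return Math.Sqrt(sum);
    }

    // Applies `step` (a stand-in for one EM update of the centers) until the
    // centers move less than centerconverg_exit or maxiterexit is reached.
    public static (double[,] Centers, int Iterations) Run(
        double[,] initial, Func<double[,], double[,]> step,
        int maxiterexit = 10000, double centerconverg_exit = 0.001)
    {
        double[,] centers = initial;
        int iterations = 0;
        while (iterations < maxiterexit) {
            double[,] updated = step(centers);
            iterations++;
            double shift = CenterShift(centers, updated);
            centers = updated;
            if (shift < centerconverg_exit) break;   // convergence criterion
        }
        return (centers, iterations);
    }

    static void Main() {
        // Toy update that halves a single 1-D center on every step.
        var result = Run(new double[,] { { 1.0 } },
                         c => new double[,] { { c[0, 0] / 2 } });
        Console.WriteLine(result.Iterations);   // prints 10
    }
}
```

In the toy run the center moves by 2^-n in step n, so the convergence criterion fires in step 10, the first step in which the shift drops below 0.001.
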

