Clustering Operation API

class jange.ops.cluster.ClusterOperation(model: sklearn.base.ClusterMixin, name: str = 'cluster')

Operation for clustering. This class uses scikit-learn clustering models.

Models under sklearn.cluster can be used as the underlying model to perform clustering.

Parameters:
  • model (sklearn.base.ClusterMixin) – clustering model to wrap; see this module’s SUPPORTED_CLASSES attribute for the supported models
  • name (str) – name of this operation, default cluster
Variables:
  • model (sklearn.base.ClusterMixin) – underlying clustering model
  • name (str) – name of this operation
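The model argument is typed as sklearn.base.ClusterMixin, the scikit-learn mixin that supplies fit_predict. The sketch below uses scikit-learn only (it does not call jange) to show what such a model does on its own. The toy data and the KMeans settings are assumptions chosen for illustration, and how ClusterOperation invokes the model internally is not shown here.

```python
import numpy as np
from sklearn.base import ClusterMixin
from sklearn.cluster import KMeans

# Toy feature vectors (illustrative only): two well-separated groups.
features = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])

# Any estimator that mixes in ClusterMixin exposes fit_predict().
model = KMeans(n_clusters=2, n_init=10, random_state=0)
is_cluster_model = isinstance(model, ClusterMixin)

# fit_predict() fits the model and returns one integer label per row.
labels = model.fit_predict(features)
```

Rows that fall in the same group receive the same label; the label values themselves are arbitrary integers.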

Example

>>> import sklearn.cluster
>>> from jange.ops.cluster import ClusterOperation
>>> ds = DataStream(...)
>>> ds.apply(ClusterOperation(model=sklearn.cluster.KMeans(n_clusters=3)))

jange.ops.cluster.kmeans(n_clusters: int, name: str = 'kmeans', **kwargs) → jange.ops.cluster.ClusterOperation

Returns a ClusterOperation that uses the k-means algorithm.

Parameters:
  • n_clusters (int) – number of clusters to create
  • name (str) – name of this operation, default kmeans
  • kwargs – keyword arguments to pass to sklearn.cluster.KMeans class
Returns:

Operation with KMeans algorithm

Return type:

ClusterOperation
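The kwargs parameter is described as being passed to sklearn.cluster.KMeans. A minimal sketch of that forwarding pattern, using scikit-learn only, might look like the following. The helper build_kmeans_model is hypothetical and is not part of jange; it only illustrates how extra keyword arguments would reach the KMeans constructor.

```python
from sklearn.cluster import KMeans


def build_kmeans_model(n_clusters: int, **kwargs) -> KMeans:
    """Hypothetical helper: forward extra keyword arguments to KMeans."""
    return KMeans(n_clusters=n_clusters, **kwargs)


# max_iter and random_state are ordinary KMeans constructor parameters.
model = build_kmeans_model(3, max_iter=50, random_state=0)

# get_params() reports the configuration the estimator was built with.
params = model.get_params()
```

Assuming jange forwards kwargs in the same way, a call such as kmeans(n_clusters=3, max_iter=50) would configure the underlying model with those settings.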

Example

>>> from jange.ops.cluster import kmeans
>>> op = kmeans(n_clusters=10)

jange.ops.cluster.minibatch_kmeans(n_clusters: int, name: str = 'minibatch_kmeans', **kwargs) → jange.ops.cluster.ClusterOperation

Returns a ClusterOperation that uses the mini-batch k-means algorithm.

Parameters:
  • n_clusters (int) – number of clusters to create
  • name (str) – name of this operation, default minibatch_kmeans
  • kwargs – keyword arguments to pass to sklearn.cluster.MiniBatchKMeans class
Returns:

Operation with MiniBatchKMeans algorithm

Return type:

ClusterOperation
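Mini-batch k-means updates the cluster centers from small random batches of the data rather than from the full dataset on every iteration. The sketch below uses scikit-learn's MiniBatchKMeans directly (not jange) to show batch_size, one of the constructor parameters that could be supplied through kwargs. The synthetic data and parameter values are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Synthetic data (illustrative only): two tight groups of 50 points each.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.1, size=(50, 2)),
])

# batch_size sets how many samples are used for each center update.
model = MiniBatchKMeans(n_clusters=2, batch_size=20, n_init=3, random_state=0)
labels = model.fit_predict(features)

# One center per cluster, with the same dimensionality as the input rows.
centers = model.cluster_centers_
```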

Example

>>> from jange.ops.cluster import minibatch_kmeans
>>> op = minibatch_kmeans(n_clusters=10)