circleclust.circleclust.CircleClust

class circleclust.circleclust.CircleClust(data=None, period=None, window=None, max_screen_divisor=32, max_screen_iter=2, train_frac=0.7, random_seed=0, verbose=False, debug=False)[source]

Bases: object

Circular clustering with automatic detection of centroids as distribution peaks.

This class clusters circular data (e.g., times of day, hues) by:

  • Binning data on a circle with periodic boundaries.

  • Applying a fixed Hann smoothing window of 9 bins with circular padding.

  • Screening bin counts via k = 1..max_screen_divisor, where the number of bins is B = 9*k and the window width (in radians on the unit circle) is 2π/k.

  • Selecting the binning that minimizes test RMS between histogram and smoothed envelope, averaged over max_screen_iter random splits.

  • Detecting circular local maxima on the smoothed histogram as cluster centers.
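The binning, smoothing, and peak-detection steps listed above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper names (`hann9`, `circular_histogram`, `smooth_circular`, `circular_maxima`) are hypothetical, the symmetric Hann convention (zero endpoints) is an assumption, and the train/test screening step is omitted, so the bin count is fixed by hand at B = 9*4.

```python
import math

def hann9():
    # Symmetric 9-tap Hann window normalized to sum to 1 (assumed convention).
    w = [0.5 - 0.5 * math.cos(2 * math.pi * i / 8) for i in range(9)]
    total = sum(w)
    return [v / total for v in w]

def circular_histogram(values, period, n_bins):
    # Wrap each value into [0, period) and count it in its bin.
    counts = [0] * n_bins
    for v in values:
        idx = int((v % period) / period * n_bins) % n_bins
        counts[idx] += 1
    return counts

def smooth_circular(counts, kernel):
    # Convolve with wrap-around indexing, equivalent to circular padding.
    n, half = len(counts), len(kernel) // 2
    return [
        sum(kernel[j] * counts[(i + j - half) % n] for j in range(len(kernel)))
        for i in range(n)
    ]

def circular_maxima(y):
    # A bin is a peak if it rises above its left neighbor and is not
    # below its right neighbor; neighbors wrap around the circle.
    n = len(y)
    return [i for i in range(n) if y[i] > y[i - 1] and y[i] >= y[(i + 1) % n]]

# Hours of day: one cluster straddling midnight, one around noon.
values = [23.5, 23.75, 0.0, 0.25, 0.5] * 20 + [11.5, 12.0, 12.5] * 20
counts = circular_histogram(values, period=24, n_bins=9 * 4)
smoothed = smooth_circular(counts, hann9())
peaks = circular_maxima(smoothed)
print(peaks)
```

Because indexing wraps around, the cluster that straddles midnight is detected as a single peak rather than being split across the two ends of the histogram.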

Parameters:
  • data (Iterable[float] | None) – Optional data to fit immediately upon construction.

  • period (float, default 2π) – Period of the input values; inputs are wrapped into [0, period). It is important to provide the valid data range, for example:

      - 1 if your data period is 1 (e.g. color hues in HLS or HSV space)
      - 24 if your data is measured in hours (e.g. times of day)
      - 360 if your data is measured in degrees
      - 2π if your data is measured in radians (e.g. angles)

  • window (float | None) – Manual override for the smoothing window width (in radians on the unit circle). If provided, screening is skipped and the nearest k ≈ 2π/window is used.

  • max_screen_divisor (int, default 32) – Maximum divisor k tried during screening; the number of bins is B = 9*k.

  • max_screen_iter (int, default 2) – Number of screening repetitions with different train/test splits; RMS is averaged across repetitions before selecting k.

  • train_frac (float, default 0.7) – Fraction of samples for training during screening.

  • random_seed (int, default 0) – Base random seed for reproducible splits.

  • verbose (bool, default False) – Enable informational prints via verbose_print.

  • debug (bool)
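The relation between the divisor k, the bin count B = 9*k, and the window width 2π/k can be made concrete with a short sketch. The helpers `screening_grid` and `divisor_for_window` are hypothetical names, not part of the CircleClust API, and the rounding and lower clamp applied to a manual window are assumptions about what "nearest k" means.

```python
import math

HANN_BINS = 9  # fixed smoothing window length stated in the class description

def screening_grid(max_screen_divisor=32):
    # One candidate per divisor k: (k, number of bins, window width in radians).
    return [
        (k, HANN_BINS * k, 2 * math.pi / k)
        for k in range(1, max_screen_divisor + 1)
    ]

def divisor_for_window(window):
    # Assumed mapping from a manual window width to the nearest divisor,
    # clamped so that at least one divisor is always returned.
    return max(1, round(2 * math.pi / window))

grid = screening_grid()
print(grid[0], grid[-1])
print(divisor_for_window(math.pi / 2))
```

With the default max_screen_divisor of 32, the finest candidate uses 288 bins, and a manual window of π/2 radians corresponds to k = 4, that is 36 bins.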

__init__(data=None, period=None, window=None, max_screen_divisor=32, max_screen_iter=2, train_frac=0.7, random_seed=0, verbose=False, debug=False)[source]

Methods

__init__([data, period, window, ...])

fit(data[, period, verbose])

Fits the model by selecting smoothing binning and finding centers.

get_centroids([as_list])

Returns detected centroids.

get_clusters([as_list])

Returns detected clusters.

get_peaks([as_list])

Returns detected peaks.

predict(x[, width_scale])

Predicts cluster labels by nearest circular distance to centers.

set_window(window)

Sets manually specified smoothing window.

show_centroids([output])

Shows centroids in the histogram.

show_clusters([output])

Shows clusters in the histogram.

show_peaks([color, title, output])

Shows peaks in the histogram.

set_window(window)[source]

Sets manually specified smoothing window.

Parameters:

window (float) – Smoothing window width (in radians on unit circle).

fit(data, period=None, verbose=None)[source]

Fits the model by selecting smoothing binning and finding centers.

Parameters:
  • data (Iterable[float]) – Input data to fit.

  • period (float, default 2π) – Period of the input values; inputs are wrapped into [0, period). It is important to provide the valid data range, for example:

      - 1 if your data period is 1 (e.g. color hues in HLS or HSV space)
      - 24 if your data is measured in hours (e.g. times of day)
      - 360 if your data is measured in degrees
      - 2π if your data is measured in radians (e.g. angles)

  • verbose (bool | None, default None) – Whether to print verbose output. If None, uses self.verbose.

Returns:

self – The fitted instance with detected peak_idx, peak_std, centroid and centroid_std arrays.

Return type:

CircleClust
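A possible end-to-end usage sketch, based only on the signatures documented on this page. The synthetic data generator `make_hours` is a hypothetical helper, and the import is guarded so the snippet still runs when the package is not installed; the actual output depends on the library and is not shown.

```python
import random

def make_hours(n_per_peak=200, seed=0):
    # Synthetic times of day (hours) around 08:00 and 23:30, wrapped into [0, 24).
    rng = random.Random(seed)
    peaks = [(8.0, 0.7), (23.5, 0.7)]
    return [rng.gauss(mu, sd) % 24.0 for mu, sd in peaks for _ in range(n_per_peak)]

hours = make_hours()

try:
    from circleclust.circleclust import CircleClust
except ImportError:
    CircleClust = None  # package not installed; skip the fitting part

if CircleClust is not None:
    model = CircleClust(period=24, random_seed=0)
    model.fit(hours)  # returns the fitted instance
    print(model.get_centroids(as_list=True))
    print(model.predict([7.5, 23.9, 15.0]))  # -1 marks outliers
```

Passing period=24 matters here: the values are in hours, so the second cluster wraps across midnight.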

predict(x, width_scale=1.0)[source]

Predicts cluster labels by nearest circular distance to centers.

Parameters:
  • x (Iterable[float]) – Input values (wrapped into [0, period)). Units must match the period used during fitting (e.g., radians, minutes, or fraction of circle).

  • width_scale (float, default 1.0) – Multiplies the peak width (std) to define the distance threshold for assignment.

Returns:

Integer labels in [0, n_centers-1], or -1 for outliers.

Return type:

np.ndarray
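The assignment rule described for predict can be sketched as follows. This is an illustration under assumptions, not the library's code: `circular_distance` and `assign_labels` are hypothetical names, plain lists stand in for NumPy arrays, and a point is assumed to be an outlier when its distance to the nearest center exceeds width_scale times that center's std.

```python
import math

def circular_distance(a, b, period):
    # Shortest distance between two points on a circle of the given period.
    d = abs(a - b) % period
    return min(d, period - d)

def assign_labels(xs, centers, stds, period=2 * math.pi, width_scale=1.0):
    if not centers:
        return [-1] * len(xs)
    labels = []
    for x in xs:
        dists = [circular_distance(x % period, c, period) for c in centers]
        best = min(range(len(centers)), key=dists.__getitem__)
        # Assumed threshold: width_scale times the std of the nearest center.
        labels.append(best if dists[best] <= width_scale * stds[best] else -1)
    return labels

labels = assign_labels([23.8, 12.4, 6.0], centers=[0.5, 12.0], stds=[1.0, 1.0], period=24)
print(labels)  # [0, 1, -1]
```

The value 23.8 is assigned to the center at 0.5 because the distance is measured across midnight (0.7 hours), while 6.0 lies more than one std from both centers and is labeled -1.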

get_peaks(as_list=False)[source]

Returns detected peaks.

Parameters:

as_list (bool, default False) – If True, returns a list of dicts [{'centroid': c, 'std': s}, ...]. If False, returns a dict with numpy arrays {'centroid': array, 'std': array}.

Returns:

Peaks information as a dict of arrays or a list of per-peak dicts.

Return type:

dict | list
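The two return shapes differ only in layout. A small sketch of converting one into the other, with plain lists standing in for the NumPy arrays and `peaks_as_list` being a hypothetical helper rather than a library function:

```python
def peaks_as_list(peaks):
    # Turn {'centroid': [...], 'std': [...]} into one dict per peak.
    return [
        {"centroid": c, "std": s}
        for c, s in zip(peaks["centroid"], peaks["std"])
    ]

as_dict = {"centroid": [0.5, 12.0], "std": [0.8, 1.1]}
as_list = peaks_as_list(as_dict)
print(as_list)
```

The dict-of-arrays form suits vectorized processing, while the list form is convenient for iterating over peaks one at a time.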

get_centroids(as_list=False)[source]

Returns detected centroids.

Parameters:

as_list (bool, default False) – If True, returns a list of dicts [{'centroid': c, 'std': s}, ...]. If False, returns a dict with numpy arrays {'centroid': array, 'std': array}.

Returns:

Centroids information as a dict of arrays or a list of per-centroid dicts.

Return type:

dict | list

get_clusters(as_list=False)[source]

Returns detected clusters.

Parameters:

as_list (bool, default False) – If True, returns a list of dicts [{'centroid': c, 'std': s}, ...]. If False, returns a dict with numpy arrays {'centroid': array, 'std': array}.

Returns:

Clusters information as a dict of arrays or a list of per-cluster dicts.

Return type:

dict | list

show_peaks(color='#2980BA', title='CircleClust peaks', output=None)[source]

Shows peaks in the histogram.

Parameters:
  • color (Optional[str], default "#2980BA") – Color of the peaks.

  • title (str, default "CircleClust peaks") – Title of the plot.

  • output (Optional[str], default None) – Output file name. If None, shows plot in a window.

show_centroids(output=None)[source]

Shows centroids in the histogram.

Parameters:

output (Optional[str], default None) – Output file name. If None, shows plot in a window.

show_clusters(output=None)[source]

Shows clusters in the histogram.

Parameters:

output (Optional[str], default None) – Output file name. If None, shows plot in a window.