circleclust.circleclust

Classes

CircleClust([data, period, window, ...])

Circular clustering with automatic detection of centroids as distribution peaks.

class circleclust.circleclust.CircleClust(data=None, period=None, window=None, max_screen_divisor=32, max_screen_iter=2, train_frac=0.7, random_seed=0, verbose=False, debug=False)

Bases: object

Circular clustering with automatic detection of centroids as distribution peaks.

This class clusters circular data (e.g., times of day, hues) by:

- Binning data on a circle with periodic boundaries.
- Applying a fixed Hann smoothing window of 9 bins with circular padding.
- Screening bin counts via k = 1..max_screen_divisor, where the number of bins is B = 9*k and the window width (in radians on the unit circle) is 2π/k.
- Selecting the binning that minimizes test RMS between histogram and smoothed envelope, averaged over max_screen_iter random splits.
- Detecting circular local maxima on the smoothed histogram as cluster centers.
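Example (illustrative sketch, not the library's implementation): the binning, smoothing and peak-detection steps listed above can be written in plain Python as follows. The helper names (circular_histogram, hann_weights, smooth_circular, circular_local_maxima), the exact Hann formula without zero endpoints, and the tie-breaking rule for flat peaks are assumptions made for this example.

```python
import math

def circular_histogram(values, period, n_bins):
    """Count values into n_bins equal bins on [0, period), wrapping inputs."""
    counts = [0] * n_bins
    for v in values:
        wrapped = v % period  # periodic boundary: 25 h becomes 1 h when period is 24
        counts[int(wrapped / period * n_bins) % n_bins] += 1
    return counts

def hann_weights(width=9):
    """Hann-shaped weights normalized to sum to 1 (assumed form, no zero endpoints)."""
    raw = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 1) / (width + 1))
           for i in range(width)]
    total = sum(raw)
    return [w / total for w in raw]

def smooth_circular(counts, weights):
    """Convolve with the window, indexing modulo n instead of zero padding."""
    n, half = len(counts), len(weights) // 2
    return [sum(w * counts[(i + j - half) % n] for j, w in enumerate(weights))
            for i in range(n)]

def circular_local_maxima(smoothed):
    """Indices higher than the left neighbour and not lower than the right one."""
    n = len(smoothed)
    return [i for i in range(n)
            if smoothed[i] > smoothed[(i - 1) % n]
            and smoothed[i] >= smoothed[(i + 1) % n]]

# Times of day in hours: one group around midnight, one around noon.
hours = [23.8, 0.1, 0.3, 0.0, 0.2, 11.8, 12.0, 12.1, 12.3, 11.9]
n_bins = 9 * 4  # k = 4, so B = 36 bins
counts = circular_histogram(hours, period=24, n_bins=n_bins)
smoothed = smooth_circular(counts, hann_weights())
peaks = circular_local_maxima(smoothed)
```

Because the smoothing indexes bins modulo B, the group that straddles midnight forms a single peak instead of being split between the first and last bins.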

Parameters:

data (Iterable[float] | None) – Optional data to fit immediately upon construction.
period (float, default 2π) – Period of the input values; inputs are wrapped into [0, period). Set this to match the units of your data, for example:
- 1 if your data period is 1 (e.g. color hues in HLS or HSV space)
- 24 if your data is measured in hours (e.g. times of day)
- 360 if your data is measured in degrees
- 2π if your data is measured in radians (e.g. angles)
window (float | None) – Manual override for the smoothing window width (in radians on the unit circle). If provided, screening is skipped and the nearest k ≈ 2π/window is used.
max_screen_divisor (int, default 32) – Maximum divisor k in screening; B = 9*k.
max_screen_iter (int, default 2) – Number of screening repetitions with different train/test splits; RMS is averaged across repetitions before selecting k.
train_frac (float, default 0.7) – Fraction of samples for training during screening.
random_seed (int, default 0) – Base random seed for reproducible splits.
verbose (bool, default False) – Enable informational prints via verbose_print.
debug (bool)
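Example (illustrative sketch, not the library's implementation): one way the screening parameters could interact. For each k the data is split max_screen_iter times with seeds derived from random_seed, a smoothed envelope is built from the train_frac portion, and its RMS difference to the held-out histogram is averaged. The function names, the density normalization and the seed scheme (random_seed + iteration) are assumptions made for this example.

```python
import math
import random

def hann_weights(width=9):
    """Hann-shaped weights normalized to sum to 1 (assumed form)."""
    raw = [0.5 - 0.5 * math.cos(2 * math.pi * (i + 1) / (width + 1))
           for i in range(width)]
    total = sum(raw)
    return [w / total for w in raw]

def density(values, period, n_bins):
    """Circular histogram scaled so that the mean bin height is 1."""
    counts = [0] * n_bins
    for v in values:
        counts[int((v % period) / period * n_bins) % n_bins] += 1
    n = max(len(values), 1)
    return [c * n_bins / n for c in counts]

def smooth_circular(hist, weights):
    """Circular convolution of the histogram with the window."""
    n, half = len(hist), len(weights) // 2
    return [sum(w * hist[(i + j - half) % n] for j, w in enumerate(weights))
            for i in range(n)]

def screen_divisor(values, period, max_screen_divisor=32, max_screen_iter=2,
                   train_frac=0.7, random_seed=0):
    """Return (k, mean RMS) for the divisor with the lowest averaged test RMS."""
    values = list(values)
    weights = hann_weights()
    best_k, best_rms = None, math.inf
    for k in range(1, max_screen_divisor + 1):
        n_bins = 9 * k  # B = 9*k
        rms_sum = 0.0
        for it in range(max_screen_iter):
            shuffled = values[:]
            random.Random(random_seed + it).shuffle(shuffled)  # reproducible split
            cut = int(len(shuffled) * train_frac)
            envelope = smooth_circular(density(shuffled[:cut], period, n_bins), weights)
            held_out = density(shuffled[cut:], period, n_bins)
            rms_sum += math.sqrt(
                sum((e - h) ** 2 for e, h in zip(envelope, held_out)) / n_bins)
        mean_rms = rms_sum / max_screen_iter
        if mean_rms < best_rms:
            best_k, best_rms = k, mean_rms
    return best_k, best_rms

# Synthetic times of day (hours) drawn around 08:00 and 20:00.
rng = random.Random(42)
sample = ([rng.gauss(8.0, 0.7) % 24 for _ in range(200)]
          + [rng.gauss(20.0, 1.0) % 24 for _ in range(200)])
best_k, best_rms = screen_divisor(sample, period=24, max_screen_divisor=8)
```

Averaging over several splits makes the selected k less sensitive to one particular train/test partition, and a fixed base seed makes the selection repeatable.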

__init__(data=None, period=None, window=None, max_screen_divisor=32, max_screen_iter=2, train_frac=0.7, random_seed=0, verbose=False, debug=False)

Parameters:

data (Iterable[float] | None)
period (float)
window (float | None)
max_screen_divisor (int)
max_screen_iter (int)
train_frac (float)
random_seed (int)
verbose (bool)
debug (bool)

set_window(window)

Sets a manually specified smoothing window.

Parameters:

window (float) – Smoothing window width (in radians on the unit circle).
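Example (illustrative sketch): the window is expressed in radians on the unit circle, so a width given in data units must be rescaled by 2π/period first. The helpers below (window_in_radians, window_to_divisor) are hypothetical and only mirror the documented relations k ≈ 2π/window and B = 9*k; clamping k to at least 1 is an assumption.

```python
import math

def window_in_radians(width, period):
    """Convert a window width in data units to radians on the unit circle."""
    return width * 2 * math.pi / period

def window_to_divisor(window):
    """Return (k, B) with k the nearest integer to 2*pi/window and B = 9*k."""
    if window <= 0:
        raise ValueError("window must be positive")
    k = max(1, round(2 * math.pi / window))  # assumed lower bound of 1
    return k, 9 * k

# A 3-hour window on a 24-hour clock.
k, n_bins = window_to_divisor(window_in_radians(3, 24))
```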

fit(data, period=None, verbose=None)

Fits the model by selecting the smoothing binning and detecting cluster centers.

Parameters:

data (Iterable[float]) – Input data to fit.
period (float, default 2π) – Period of the input values; inputs are wrapped into [0, period). Set this to match the units of your data, for example:
- 1 if your data period is 1 (e.g. color hues in HLS or HSV space)
- 24 if your data is measured in hours (e.g. times of day)
- 360 if your data is measured in degrees
- 2π if your data is measured in radians (e.g. angles)
verbose (bool | None, default None) – Whether to print verbose output. If None, uses self.verbose.

Returns:

self – The fitted instance with detected peak_idx, peak_std, centroid and centroid_std arrays.

Return type:

CircleClust
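Example (illustrative sketch): what wrapping into [0, period) means for the period argument. The helpers wrap and to_unit_circle are hypothetical names for this example and are not part of the documented API.

```python
import math

def wrap(values, period=2 * math.pi):
    """Map every value into the half-open interval [0, period)."""
    return [v % period for v in values]

def to_unit_circle(values, period):
    """Convert values with the given period to angles in radians."""
    return [(v % period) * 2 * math.pi / period for v in values]

wrapped_hours = wrap([25.0, -1.0, 24.0], period=24)  # hours past midnight
angles = to_unit_circle([90.0, 450.0], period=360)   # degrees to radians
```

With a mismatched period (for example, hours passed with the default 2π), values would wrap at the wrong point and distinct times of day would collapse onto each other.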

predict(x, width_scale=1.0)

Predicts cluster labels by nearest circular distance to centers.

Parameters:

x (Iterable[float]) – Input values (wrapped to [0, period)). Units must match the period used during fitting (e.g., radians, minutes, or fraction of circle).
width_scale (float, default 1.0) – Multiplies the peak width (std) to define the distance threshold for assignment.

Returns:

Integer labels in [0, n_centers-1], or -1 for outliers.

Return type:

np.ndarray
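Example (illustrative sketch, not the library's implementation): nearest-center assignment by circular distance with an outlier threshold. The function names and the exact rule (a point is kept when its distance is at most width_scale times the std of its nearest center) are assumptions made for this example.

```python
def circular_distance(a, b, period):
    """Shortest distance between two points on a circle of the given period."""
    d = abs(a - b) % period
    return min(d, period - d)

def assign_labels(xs, centroids, stds, period, width_scale=1.0):
    """Label each x with the index of its nearest centroid, or -1 if too far."""
    labels = []
    for x in xs:
        if not centroids:
            labels.append(-1)  # no centers detected: everything is an outlier
            continue
        dists = [circular_distance(x % period, c, period) for c in centroids]
        nearest = min(range(len(centroids)), key=dists.__getitem__)
        within = dists[nearest] <= width_scale * stds[nearest]
        labels.append(nearest if within else -1)
    return labels

# Centers at midnight and noon (hours), each with a std of one hour.
labels = assign_labels([23.5, 0.4, 12.9, 6.0, 36.2],
                       centroids=[0.0, 12.0], stds=[1.0, 1.0], period=24)
```

Here 23.5 is assigned to the midnight center because the circular distance is 0.5, not 23.5, while 6.0 is six hours from both centers and is marked as an outlier.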

get_peaks(as_list=False)

Returns detected peaks.

Parameters:

as_list (bool, default False) – If True, returns a list of dicts [{'centroid': c, 'std': s}, …]. If False, returns a dict of NumPy arrays {'centroid': array, 'std': array}.

Returns:

Peaks information as a dict of arrays or a list of per-peak dicts.

Return type:

dict | list
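Example (illustrative sketch): how the two documented return shapes relate to each other. The helpers peaks_to_list and peaks_to_dict are hypothetical, and plain Python lists stand in for the NumPy arrays.

```python
def peaks_to_list(peaks):
    """Turn {'centroid': [...], 'std': [...]} into a list of per-peak dicts."""
    return [{"centroid": c, "std": s}
            for c, s in zip(peaks["centroid"], peaks["std"])]

def peaks_to_dict(peak_list):
    """Turn a list of per-peak dicts back into a dict of parallel sequences."""
    return {"centroid": [p["centroid"] for p in peak_list],
            "std": [p["std"] for p in peak_list]}

as_dict = {"centroid": [0.2, 12.1], "std": [0.5, 0.8]}
as_list = peaks_to_list(as_dict)
```

The dict form suits vectorized computation over all peaks at once; the list form is easier to iterate over or serialize peak by peak.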

get_centroids(as_list=False)

Returns detected centroids.

Parameters:

as_list (bool, default False) – If True, returns a list of dicts [{'centroid': c, 'std': s}, …]. If False, returns a dict of NumPy arrays {'centroid': array, 'std': array}.

Returns:

Centroids information as a dict of arrays or a list of per-centroid dicts.

Return type:

dict | list

get_clusters(as_list=False)

Returns detected clusters.

Parameters:

as_list (bool, default False) – If True, returns a list of dicts [{'centroid': c, 'std': s}, …]. If False, returns a dict of NumPy arrays {'centroid': array, 'std': array}.

Returns:

Clusters information as a dict of arrays or a list of per-cluster dicts.

Return type:

dict | list

show_peaks(color='#2980BA', title='CircleClust peaks', output=None)

Shows peaks in the histogram.

Parameters:

color (Optional[str], default "#2980BA") – Color of the peaks.
title (str, default "CircleClust peaks") – Title of the plot.
output (Optional[str], default None) – Output file name. If None, shows plot in a window.

show_centroids(output=None)

Shows centroids in the histogram.

Parameters:

output (Optional[str], default None) – Output file name. If None, shows plot in a window.

show_clusters(output=None)

Shows clusters in the histogram.

Parameters:

output (Optional[str], default None) – Output file name. If None, shows plot in a window.