circleclust.circleclust
Classes
CircleClust – Circular clustering with automatic detection of centroids as distribution peaks.
- class circleclust.circleclust.CircleClust(data=None, period=None, window=None, max_screen_divisor=32, max_screen_iter=2, train_frac=0.7, random_seed=0, verbose=False, debug=False)
Bases: object
Circular clustering with automatic detection of centroids as distribution peaks.
This class clusters circular data (e.g., times of day, hues) by:
- Binning data on a circle with periodic boundaries.
- Applying a fixed Hann smoothing window of 9 bins with circular padding.
- Screening bin counts via k = 1..max_screen_divisor, where the bin count is B = 9*k and the window width (in radians on the unit circle) is 2π/k.
- Selecting the binning that minimizes test RMS between histogram and smoothed envelope, averaged over max_screen_iter random splits.
- Detecting circular local maxima on the smoothed histogram as cluster centers.
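The binning, smoothing, and peak-detection steps above can be sketched in plain Python. This is a minimal illustration rather than the library's implementation: the function name `circular_peaks`, the normalization of the Hann window to unit sum, and the tie-breaking rule for plateaus are all assumptions, and the screening over k is omitted (the bin count is passed in directly).

```python
import math

def circular_peaks(values, period, n_bins):
    """Hypothetical helper: smoothed circular histogram and its local maxima."""
    # Wrap each value into [0, period) and count it in its bin.
    counts = [0] * n_bins
    for v in values:
        idx = int((v % period) / period * n_bins) % n_bins
        counts[idx] += 1

    # 9-point Hann window, normalized to unit sum (assumed normalization).
    w = [0.5 - 0.5 * math.cos(2 * math.pi * i / 8) for i in range(9)]
    total = sum(w)
    w = [x / total for x in w]

    # Circular convolution: indices wrap modulo n_bins, which is
    # equivalent to padding the histogram with its own opposite end.
    smooth = [
        sum(w[j] * counts[(i + j - 4) % n_bins] for j in range(9))
        for i in range(n_bins)
    ]

    # A circular local maximum is strictly above its left neighbour and
    # not below its right neighbour; bin 0 and bin n_bins-1 are adjacent.
    peaks = [
        i for i in range(n_bins)
        if smooth[i] > smooth[i - 1] and smooth[i] >= smooth[(i + 1) % n_bins]
    ]
    return smooth, peaks

# Times of day in hours: one group straddles midnight, one sits near noon.
hours = [23.0, 23.5, 0.0, 0.5, 1.0, 11.5, 12.0, 12.5]
smooth, peaks = circular_peaks(hours, period=24, n_bins=36)  # k = 4, B = 9*k
```

Because the indices wrap, the group around midnight yields a single peak instead of being split across the two ends of the histogram.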
- Parameters:
data (Iterable[float] | None) – Optional data to fit immediately upon construction.
period (float, default 2π) – Period of the input values; inputs are wrapped into [0, period). Provide the period that matches the range of your data, for example:
- 1 if your data period is 1 (e.g. color hues in HLS or HSV space)
- 24 if your data is measured in hours (e.g. times of day)
- 360 if your data is measured in degrees
- 2π if your data is measured in radians (e.g. angles)
window (float | None) – Manual override for the smoothing window width (in radians on the unit circle). If provided, screening is skipped and the nearest integer k ≈ 2π/window is used.
max_screen_divisor (int, default 32) – Maximum divisor k in screening; B = 9*k.
max_screen_iter (int, default 2) – Number of screening repetitions with different train/test splits; RMS is averaged across repetitions before selecting k.
train_frac (float, default 0.7) – Fraction of samples for training during screening.
random_seed (int, default 0) – Base random seed for reproducible splits.
verbose (bool, default False) – Enable informational prints via verbose_print.
debug (bool)
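To illustrate why period matters, the sketch below wraps raw values into [0, period) and rescales them to radians on the unit circle, the unit in which window is expressed. The helper name `to_unit_circle` is hypothetical, and whether the class performs this rescaling internally in exactly this way is an assumption.

```python
import math

def to_unit_circle(values, period=2 * math.pi):
    """Hypothetical helper: wrap into [0, period), then map to [0, 2π)."""
    scale = 2 * math.pi / period
    return [(v % period) * scale for v in values]

# Hours of the day: 25 wraps to 1, -1 wraps to 23.
angles = to_unit_circle([25.0, -1.0, 6.0], period=24)
```

With the wrong period (for example, leaving the default 2π for data measured in hours), values that are far apart on the clock would be wrapped onto each other.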
- __init__(data=None, period=None, window=None, max_screen_divisor=32, max_screen_iter=2, train_frac=0.7, random_seed=0, verbose=False, debug=False)
- set_window(window)
Sets a manually specified smoothing window.
- Parameters:
window (float) – Smoothing window width (in radians on unit circle).
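The relationship between a manual window and the binning follows from the class description (k ≈ 2π/window, B = 9*k). A small sketch of that arithmetic is shown here; the function name `window_to_binning` and the lower clamp to k = 1 are assumptions, not documented behavior.

```python
import math

def window_to_binning(window):
    """Hypothetical helper: nearest divisor k and bin count B = 9*k."""
    # Clamp to at least 1 so very wide windows still give a valid binning (assumed).
    k = max(1, round(2 * math.pi / window))
    return k, 9 * k

k, n_bins = window_to_binning(math.pi / 4)  # a window of one eighth of the circle
```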
- fit(data, period=None, verbose=None)
Fits the model by selecting the smoothing binning and finding the cluster centers.
- Parameters:
data (Iterable[float]) – Input data to fit.
period (float, default 2π) – Period of the input values; inputs are wrapped into [0, period). Provide the period that matches the range of your data, for example:
- 1 if your data period is 1 (e.g. color hues in HLS or HSV space)
- 24 if your data is measured in hours (e.g. times of day)
- 360 if your data is measured in degrees
- 2π if your data is measured in radians (e.g. angles)
verbose (bool | None, default None) – Whether to print verbose output. If None, uses self.verbose.
- Returns:
self – The fitted instance with detected peak_idx, peak_std, centroid and centroid_std arrays.
- Return type:
- predict(x, width_scale=1.0)
Predicts cluster labels by nearest circular distance to centers.
- Parameters:
- Returns:
Integer labels in [0, n_centers-1], or -1 for outliers.
- Return type:
np.ndarray
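Assignment by nearest circular distance can be sketched as follows. This is an illustration only: the helper names and the `max_dist` cutoff used to produce the -1 outlier label are hypothetical, and how predict actually derives its outlier threshold (for instance from width_scale) is not stated here.

```python
def circular_distance(a, b, period):
    """Shortest distance between a and b on a circle of the given period."""
    d = abs(a - b) % period
    return min(d, period - d)

def nearest_center_labels(xs, centers, period, max_dist=None):
    """Hypothetical helper: label each x by its nearest center, or -1 if too far."""
    labels = []
    for x in xs:
        dists = [circular_distance(x, c, period) for c in centers]
        best = min(range(len(centers)), key=dists.__getitem__)
        # max_dist is an assumed cutoff; None disables outlier labelling.
        if max_dist is not None and dists[best] > max_dist:
            labels.append(-1)
        else:
            labels.append(best)
    return labels

# 23.5 h is only 1 h away from a center at 0.5 h once wrap-around is considered.
labels = nearest_center_labels([23.5, 11.0, 6.0], centers=[0.5, 12.0], period=24, max_dist=2.0)
```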
- show_peaks(color='#2980BA', title='CircleClust peaks', output=None)
Shows peaks in the histogram.