Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
clustering::kmeans::AutoSeeder< T, Mode > Class Template Reference

Seeder that picks between greedy k-means++ and AFK-MC2 against workload shape.

#include <clustering/kmeans/policy/auto_seeder.h>

Public Member Functions

void run (const NDArray< T, 2 > &X, std::size_t k, std::uint64_t seed, math::Pool pool, NDArray< T, 2 > &outCentroids)
 Seed outCentroids with the dispatched seeder; see the class docs for the dispatch rule.

Static Public Attributes

static constexpr std::size_t afkmc2NThreshold = 500000
 n threshold at or above which AFK-MC2 is preferred over greedy k-means++.
static constexpr std::size_t afkmc2KFloor = AfkMc2Seeder<T>::kFloor
 Minimum k at which AFK-MC2 is considered; mirrors AfkMc2Seeder::kFloor.
static constexpr std::size_t afkmc2BestOfNThreshold = 5000
 Best-of restart threshold on n at or above which AFK-MC2 is preferred.
static constexpr std::size_t afkmc2BestOfDThreshold = 8
 Best-of restart threshold on d at or above which AFK-MC2 is preferred.
static constexpr std::size_t afkmc2BestOfKFloor = 16
 Minimum k for best-of restart AFK-MC2 dispatch.
static constexpr std::size_t afkmc2BestOfWorkThreshold = 80000
 Minimum n * d work envelope for best-of restart AFK-MC2 dispatch.
static constexpr std::size_t afkmc2ChainLengthDefault = AfkMc2Seeder<T>::chainLengthDefault
 Default Markov-chain length passed through to AFK-MC2.

Detailed Description

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
class clustering::kmeans::AutoSeeder< T, Mode >

Seeder that picks between greedy k-means++ and AFK-MC2 against workload shape.

The default AutoSeederMode::kSingleRun mode keeps AFK-MC2 on the large-k envelope (n >= afkmc2NThreshold and k >= afkmc2KFloor). AutoSeederMode::kBestOf additionally enables AFK-MC2 at the tuned restart envelope (n >= afkmc2BestOfNThreshold, d >= afkmc2BestOfDThreshold, k >= afkmc2BestOfKFloor, and n * d >= afkmc2BestOfWorkThreshold). Shapes outside these envelopes are seeded with greedy k-means++. The choice between the two seeders is re-made lazily, only when the (n, d, k) shape changes between run calls, so repeated fits at a stable shape preserve the held seeder's scratch.
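The dispatch rule described above can be sketched as a small free function. This is an illustration only, not the library's implementation: pickSeeder, SeederChoice, and the local AutoSeederMode stand-in are hypothetical names, and kFloor is taken as a parameter because the value of AfkMc2Seeder<T>::kFloor is not stated on this page. The numeric thresholds are copied from the static attributes listed above.

```cpp
#include <cstddef>

// Hypothetical stand-ins for illustration; not the library's declarations.
enum class AutoSeederMode { kSingleRun, kBestOf };
enum class SeederChoice { kGreedyKmpp, kAfkMc2 };

// Returns which seeder the documented rule would select for shape (n, d, k).
// kFloor stands in for afkmc2KFloor, whose value is defined by AfkMc2Seeder<T>.
constexpr SeederChoice pickSeeder(AutoSeederMode mode, std::size_t n,
                                  std::size_t d, std::size_t k,
                                  std::size_t kFloor) {
  // Large envelope: active in both modes.
  if (n >= 500000 && k >= kFloor) {
    return SeederChoice::kAfkMc2;
  }
  // Tuned restart envelope: only enabled in kBestOf mode.
  // n * d may overflow for extreme shapes; this sketch ignores that case.
  if (mode == AutoSeederMode::kBestOf && n >= 5000 && d >= 8 && k >= 16 &&
      n * d >= 80000) {
    return SeederChoice::kAfkMc2;
  }
  return SeederChoice::kGreedyKmpp;
}
```

Under this sketch, a shape such as n = 10000, d = 8, k = 16 selects AFK-MC2 only in kBestOf mode, since it falls inside the restart envelope but well below afkmc2NThreshold.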

Template Parameters
T     Element type of the point cloud.
Mode  Dispatch envelope to use for this seeder instance.

Definition at line 30 of file auto_seeder.h.

Member Function Documentation

◆ run()

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
void clustering::kmeans::AutoSeeder< T, Mode >::run ( const NDArray< T, 2 > & X,
std::size_t k,
std::uint64_t seed,
math::Pool pool,
NDArray< T, 2 > & outCentroids )
inline

Seed outCentroids with the dispatched seeder; see the class docs for the dispatch rule.

Definition at line 102 of file auto_seeder.h.

Member Data Documentation

◆ afkmc2BestOfDThreshold

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
std::size_t clustering::kmeans::AutoSeeder< T, Mode >::afkmc2BestOfDThreshold = 8
static constexpr

Best-of restart threshold on d at or above which AFK-MC2 is preferred.

Definition at line 70 of file auto_seeder.h.

◆ afkmc2BestOfKFloor

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
std::size_t clustering::kmeans::AutoSeeder< T, Mode >::afkmc2BestOfKFloor = 16
static constexpr

Minimum k for best-of restart AFK-MC2 dispatch.

Definition at line 82 of file auto_seeder.h.

◆ afkmc2BestOfNThreshold

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
std::size_t clustering::kmeans::AutoSeeder< T, Mode >::afkmc2BestOfNThreshold = 5000
static constexpr

Best-of restart threshold on n at or above which AFK-MC2 is preferred.

Definition at line 57 of file auto_seeder.h.

◆ afkmc2BestOfWorkThreshold

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
std::size_t clustering::kmeans::AutoSeeder< T, Mode >::afkmc2BestOfWorkThreshold = 80000
static constexpr

Minimum n * d work envelope for best-of restart AFK-MC2 dispatch.

Definition at line 95 of file auto_seeder.h.
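Because the restart envelope requires both n >= afkmc2BestOfNThreshold and n * d >= afkmc2BestOfWorkThreshold, the work threshold is the binding constraint at low dimension. The sketch below works through that arithmetic at compile time; minBestOfN is a hypothetical helper introduced for illustration and is not part of the library.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: smallest n that satisfies both
// n >= 5000 (afkmc2BestOfNThreshold) and
// n * d >= 80000 (afkmc2BestOfWorkThreshold) for a given d > 0.
constexpr std::size_t minBestOfN(std::size_t d) {
  constexpr std::size_t nThreshold = 5000;
  constexpr std::size_t workThreshold = 80000;
  const std::size_t fromWork = (workThreshold + d - 1) / d;  // ceil(80000 / d)
  return std::max(nThreshold, fromWork);
}

// At d = 8 (the minimum allowed d) the work envelope dominates.
static_assert(minBestOfN(8) == 10000);
// From d = 16 upward the plain n threshold takes over.
static_assert(minBestOfN(16) == 5000);
static_assert(minBestOfN(32) == 5000);
```

In other words, for 8 <= d < 16 the effective minimum n lies between 5000 and 10000, and for d >= 16 it is afkmc2BestOfNThreshold itself.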

◆ afkmc2ChainLengthDefault

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
std::size_t clustering::kmeans::AutoSeeder< T, Mode >::afkmc2ChainLengthDefault = AfkMc2Seeder<T>::chainLengthDefault
static constexpr

Default Markov-chain length passed through to AFK-MC2.

Definition at line 99 of file auto_seeder.h.

◆ afkmc2KFloor

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
std::size_t clustering::kmeans::AutoSeeder< T, Mode >::afkmc2KFloor = AfkMc2Seeder<T>::kFloor
static constexpr

Minimum k at which AFK-MC2 is considered; mirrors AfkMc2Seeder::kFloor.

Definition at line 46 of file auto_seeder.h.

◆ afkmc2NThreshold

template<class T, AutoSeederMode Mode = AutoSeederMode::kSingleRun>
std::size_t clustering::kmeans::AutoSeeder< T, Mode >::afkmc2NThreshold = 500000
static constexpr

n threshold at or above which AFK-MC2 is preferred over greedy k-means++.

Definition at line 42 of file auto_seeder.h.


The documentation for this class was generated from the following file: