Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
Loading...
Searching...
No Matches
clustering::hdbscan::AutoMstBackend< T > Class Template Reference

MST backend that dispatches between Prim, Boruvka, and NN-Descent on input shape. More...

#include <clustering/hdbscan/policy/auto_mst_backend.h>

Public Member Functions

 AutoMstBackend ()=default
void run (const NDArray< T, 2 > &X, std::size_t minSamples, math::Pool pool, MstOutput< T > &out)
 Fit a backend arm chosen on the input shape and delegate run to it.
std::size_t heldIndex () const noexcept
 Index of the currently held variant arm.
std::size_t peekArm (std::size_t n, std::size_t d)
 Resolve the variant arm for (n, d) without running the pipeline.

Static Public Member Functions

static constexpr bool primFitsBudget (std::size_t n) noexcept
 Whether the Prim regime applies at n under the dense-MRD byte budget.

Static Public Attributes

static constexpr std::size_t boruvkaLowDimCeil = 16
 Low-dimensional ceiling at or below which Boruvka is preferred regardless of N.
static constexpr std::size_t boruvkaDimCeil = 60
 Dimensional ceiling above which KDTree Boruvka gives way to NN-Descent when Prim is out of byte budget.

Detailed Description

template<class T>
class clustering::hdbscan::AutoMstBackend< T >

MST backend that dispatches between Prim, Boruvka, and NN-Descent on input shape.

The dispatcher holds a std::variant over the three concrete backends and re-emplaces the active arm whenever the (n, d) shape of the input changes between run calls. Each arm runs its own run on the held point cloud; the condensed-tree pipeline downstream is monomorphic and reads from MstOutput.

Selection rules, total over the full (N, d) domain:

The dimensional ceiling is overridable via CLUSTERING_HDBSCAN_BORUVKA_DIM_CEIL. The low-d ceiling is overridable via CLUSTERING_HDBSCAN_BORUVKA_LOW_DIM_CEIL. The Prim regime is gated directly by the byte budget so no override choice can push Prim above the documented memory ceiling.

Staleness uses an (n, d) shape tuple only, mirroring AutoSeeder. The dispatcher does not cache data-dependent state across calls: every run rebuilds the held backend's data-dependent index (KDTree, kNN graph), and the variant arm itself is reconstructed when a shape change crosses a dispatch boundary.

Template Parameters
TElement type of the point cloud. Only float is supported.

Definition at line 48 of file auto_mst_backend.h.

Constructor & Destructor Documentation

◆ AutoMstBackend()

template<class T>
clustering::hdbscan::AutoMstBackend< T >::AutoMstBackend ( )
default

Member Function Documentation

◆ heldIndex()

template<class T>
std::size_t clustering::hdbscan::AutoMstBackend< T >::heldIndex ( ) const
inlinenodiscardnoexcept

Index of the currently held variant arm.

Returns 0 for Boruvka, 1 for Prim, and 2 for NN-Descent, matching the variant's declared order. Useful for dispatcher-level tests that need to confirm which backend was selected for an input shape without running the full pipeline.

Definition at line 124 of file auto_mst_backend.h.

◆ peekArm()

template<class T>
std::size_t clustering::hdbscan::AutoMstBackend< T >::peekArm ( std::size_t n,
std::size_t d )
inline

Resolve the variant arm for (n, d) without running the pipeline.

Emplaces the correct variant arm per the dispatch rules and returns heldIndex. No MST is built; minSamples and pool are not consulted. Intended for dispatch-verification tests and benchmark harnesses that want to know which backend a shape would hit without paying the full-fit cost.

Definition at line 134 of file auto_mst_backend.h.

◆ primFitsBudget()

template<class T>
constexpr bool clustering::hdbscan::AutoMstBackend< T >::primFitsBudget ( std::size_t n)
inlinestaticconstexprnoexcept

Whether the Prim regime applies at n under the dense-MRD byte budget.

Prim materialises an n*n matrix of T, so the admissible set is n*n*sizeof(T) <= kPrimMrdMatrixByteBudget. Exposed as a static helper so callers and tests share the exact boundary the dispatcher uses.

Definition at line 94 of file auto_mst_backend.h.

◆ run()

template<class T>
void clustering::hdbscan::AutoMstBackend< T >::run ( const NDArray< T, 2 > & X,
std::size_t minSamples,
math::Pool pool,
MstOutput< T > & out )
inline

Fit a backend arm chosen on the input shape and delegate run to it.

Parameters
XPoint cloud; caller retains ownership for the call's lifetime.
minSamplesCore-distance neighbour count; forwarded to the dispatched backend.
poolWorker pool handle; forwarded to the dispatched backend.
outDestination for edges and core distances; written by the dispatched backend.

Definition at line 112 of file auto_mst_backend.h.

Member Data Documentation

◆ boruvkaDimCeil

template<class T>
std::size_t clustering::hdbscan::AutoMstBackend< T >::boruvkaDimCeil = 60
staticconstexpr

Dimensional ceiling above which KDTree Boruvka gives way to NN-Descent when Prim is out of byte budget.

Definition at line 80 of file auto_mst_backend.h.

◆ boruvkaLowDimCeil

template<class T>
std::size_t clustering::hdbscan::AutoMstBackend< T >::boruvkaLowDimCeil = 16
staticconstexpr

Low-dimensional ceiling at or below which Boruvka is preferred regardless of N.

Definition at line 66 of file auto_mst_backend.h.


The documentation for this class was generated from the following file: