|
Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
|
MST backend that dispatches between Prim, Boruvka, and NN-Descent on input shape. More...
#include <clustering/hdbscan/policy/auto_mst_backend.h>
Public Member Functions | |
| AutoMstBackend ()=default | |
| void | run (const NDArray< T, 2 > &X, std::size_t minSamples, math::Pool pool, MstOutput< T > &out) |
Fit a backend arm chosen on the input shape and delegate run to it. | |
| std::size_t | heldIndex () const noexcept |
| Index of the currently held variant arm. | |
| std::size_t | peekArm (std::size_t n, std::size_t d) |
Resolve the variant arm for (n, d) without running the pipeline. | |
Static Public Member Functions | |
| static constexpr bool | primFitsBudget (std::size_t n) noexcept |
Whether the Prim regime applies at n under the dense-MRD byte budget. | |
Static Public Attributes | |
| static constexpr std::size_t | boruvkaLowDimCeil = 16 |
Low-dimensional ceiling at or below which Boruvka is preferred regardless of N. | |
| static constexpr std::size_t | boruvkaDimCeil = 60 |
| Dimensional ceiling above which KDTree Boruvka gives way to NN-Descent when Prim is out of byte budget. | |
MST backend that dispatches between Prim, Boruvka, and NN-Descent on input shape.
The dispatcher holds a std::variant over the three concrete backends and re-emplaces the active arm whenever the (n, d) shape of the input changes between run calls. Each arm runs its own run on the held point cloud; the condensed-tree pipeline downstream is monomorphic and reads from MstOutput.
Selection rules, total over the full (N, d) domain:
d <= AutoMstBackend::boruvkaLowDimCeil -> BoruvkaMstBackend (KDTree pruning dominates at every n; dense Prim's quadratic overhead does not amortise).d > AutoMstBackend::boruvkaLowDimCeil and N*N*sizeof(T) <= kPrimMrdMatrixByteBudget -> PrimMstBackend (streaming dense Prim beats KDTree fan-out once AABB pruning decays with d).d <= AutoMstBackend::boruvkaDimCeil and Prim is out of budget -> BoruvkaMstBackend (KDTree-accelerated, exact; AABB pruning still fires enough at moderate d).d > AutoMstBackend::boruvkaDimCeil and Prim is out of budget -> NnDescentMstBackend (approximate kNN-graph + Kruskal with a connectivity fallback).The dimensional ceiling is overridable via CLUSTERING_HDBSCAN_BORUVKA_DIM_CEIL. The low-d ceiling is overridable via CLUSTERING_HDBSCAN_BORUVKA_LOW_DIM_CEIL. The Prim regime is gated directly by the byte budget so no override choice can push Prim above the documented memory ceiling.
Staleness uses an (n, d) shape tuple only, mirroring AutoSeeder. The dispatcher does not cache data-dependent state across calls: every run rebuilds the held backend's data-dependent index (KDTree, kNN graph), and the variant arm itself is reconstructed when a shape change crosses a dispatch boundary.
| T | Element type of the point cloud. Only float is supported. |
Definition at line 48 of file auto_mst_backend.h.
|
default |
|
inlinenodiscardnoexcept |
Index of the currently held variant arm.
Returns 0 for Boruvka, 1 for Prim, and 2 for NN-Descent, matching the variant's declared order. Useful for dispatcher-level tests that need to confirm which backend was selected for an input shape without running the full pipeline.
Definition at line 124 of file auto_mst_backend.h.
|
inline |
Resolve the variant arm for (n, d) without running the pipeline.
Emplaces the correct variant arm per the dispatch rules and returns heldIndex. No MST is built; minSamples and pool are not consulted. Intended for dispatch-verification tests and benchmark harnesses that want to know which backend a shape would hit without paying the full-fit cost.
Definition at line 134 of file auto_mst_backend.h.
|
inlinestaticconstexprnoexcept |
Whether the Prim regime applies at n under the dense-MRD byte budget.
Prim materialises an n*n matrix of T, so the admissible set is n*n*sizeof(T) <= kPrimMrdMatrixByteBudget. Exposed as a static helper so callers and tests share the exact boundary the dispatcher uses.
Definition at line 94 of file auto_mst_backend.h.
|
inline |
Fit a backend arm chosen on the input shape and delegate run to it.
| X | Point cloud; caller retains ownership for the call's lifetime. |
| minSamples | Core-distance neighbour count; forwarded to the dispatched backend. |
| pool | Worker pool handle; forwarded to the dispatched backend. |
| out | Destination for edges and core distances; written by the dispatched backend. |
Definition at line 112 of file auto_mst_backend.h.
|
staticconstexpr |
Dimensional ceiling above which KDTree Boruvka gives way to NN-Descent when Prim is out of byte budget.
Definition at line 80 of file auto_mst_backend.h.
|
staticconstexpr |
Low-dimensional ceiling at or below which Boruvka is preferred regardless of N.
Definition at line 66 of file auto_mst_backend.h.