|
Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
|
KDTree-accelerated Boruvka MST backend over mutual-reachability distances. More...
#include <clustering/hdbscan/policy/boruvka_mst_backend.h>
Public Member Functions | |
| BoruvkaMstBackend ()=default | |
| void | run (const NDArray< T, 2 > &X, std::size_t minSamples, math::Pool pool, MstOutput< T > &out) |
Build the MRD-weighted minimum spanning tree of X. | |
KDTree-accelerated Boruvka MST backend over mutual-reachability distances.
Given a point matrix X and core-distance parameter minSamples, the backend:
X.minSamples-th nearest-neighbour squared distance (self-excluded) via KDTree::knnQuery.clustering::hdbscan::detail::nearestOutComponent to find the minimum-MRD-weight out-of-component edge per component via per-point KDTree traversals pruned on each subtree's axis-aligned bounding box, then unions those edges into the spanning tree under the union-find component structure.The backend satisfies MstBackendStrategy and holds no persistent state across calls; data-dependent state (the KDTree) is built fresh each run in line with the HDBSCAN pipeline contract that forbids cross-call caching of pointer-keyed data.
| T | Element type of the point cloud. Only float is supported; a double specialization is out of scope in line with HDBSCAN's top-level restriction. |
Definition at line 41 of file boruvka_mst_backend.h.
|
default |
|
inline |
Build the MRD-weighted minimum spanning tree of X.
minSamples is positive and strictly less than X.dim(0). std::int32_t; the HDBSCAN class shell asserts this before a call reaches the backend.| X | Contiguous (n x d) dataset; caller retains ownership. |
| minSamples | Neighbour count driving the core-distance definition. |
| pool | Worker pool; forwarded to the kNN query and the per-round KDTree traversal. A null pool runs single-threaded. |
| out | Destination; edges filled with n - 1 entries in ascending-weight order and coreDistances sized to n. |
Definition at line 63 of file boruvka_mst_backend.h.