|
Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
|
Exact minimum-spanning-tree backend over mutual-reachability distance, streaming Prim. More...
#include <clustering/hdbscan/policy/prim_mst_backend.h>
Public Member Functions | |
| PrimMstBackend ()=default | |
| void | run (const NDArray< T, 2 > &X, std::size_t minSamples, math::Pool pool, MstOutput< T > &out) |
Build the MRD-weighted minimum spanning tree of X. | |
Exact minimum-spanning-tree backend over mutual-reachability distance, streaming Prim.
Given a point matrix X and a core-distance parameter minSamples, the backend:
minSamples-th nearest-neighbour distance (self-excluded, squared). The large-n / small-minSamples path uses an exact dense symmetric pass; other shapes fall back to KDTree::knnQuery.0. Per popped vertex, the relax step recomputes the d-wide squared Euclidean distance to every unvisited point on the fly (no n*n matrix is ever materialised) and lifts to MRD via the per-pair max with both core distances. The dense-core path reuses row norms and dot products for the squared-distance identity.edgeWeight.n*n distance matrix is materialised.X sequentially in d-wide rows. Total bytes touched per iteration is n * d * sizeof(T), which fits in L2 for the dispatcher's (n, d) window.The backend satisfies MstBackendStrategy and carries no persistent state.
| T | Element type of the point cloud. Only float is supported. |
Definition at line 84 of file prim_mst_backend.h.
|
default |
|
inline |
Build the MRD-weighted minimum spanning tree of X.
minSamples is positive and strictly less than X.dim(0). kPrimMaxN. A violation fires CLUSTERING_ALWAYS_ASSERT before any allocation.| X | Contiguous (n x d) dataset; caller retains ownership. |
| minSamples | Neighbour count driving the core-distance definition. |
| pool | Worker pool; forwarded to the KDTree build and kNN query. |
| out | Destination; edges filled with n - 1 entries in insertion order and coreDistances sized to n. |
Definition at line 104 of file prim_mst_backend.h.