Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
Loading...
Searching...
No Matches
clustering::hdbscan::PrimMstBackend< T > Class Template Reference

Exact minimum-spanning-tree backend over mutual-reachability distance, streaming Prim. More...

#include <clustering/hdbscan/policy/prim_mst_backend.h>

Public Member Functions

 PrimMstBackend ()=default
void run (const NDArray< T, 2 > &X, std::size_t minSamples, math::Pool pool, MstOutput< T > &out)
 Build the MRD-weighted minimum spanning tree of X.

Detailed Description

template<class T>
class clustering::hdbscan::PrimMstBackend< T >

Exact minimum-spanning-tree backend over mutual-reachability distance, streaming Prim.

Given a point matrix X and a core-distance parameter minSamples, the backend:

  1. Computes per-point core distances as the minSamples-th nearest-neighbour distance (self-excluded, squared). The large-n / small-minSamples path uses an exact dense symmetric pass; other shapes fall back to KDTree::knnQuery.
  2. Runs Prim's algorithm rooted at vertex 0. Per popped vertex, the relax step recomputes the d-wide squared Euclidean distance to every unvisited point on the fly (no n*n matrix is ever materialised) and lifts to MRD via the per-pair max with both core distances. The dense-core path reuses row norms and dot products for the squared-distance identity.
  3. Selects the next vertex while relaxing the current row on the serial path, avoiding a second full scan over edgeWeight.
Memory
Working state is O(n) plus O(n * minSamples) while dense core distances are computed. No n*n distance matrix is materialised.
Cache
Each Prim iteration streams X sequentially in d-wide rows. Total bytes touched per iteration is n * d * sizeof(T), which fits in L2 for the dispatcher's (n, d) window.

The backend satisfies MstBackendStrategy and carries no persistent state.

Template Parameters
TElement type of the point cloud. Only float is supported.

Definition at line 84 of file prim_mst_backend.h.

Constructor & Destructor Documentation

◆ PrimMstBackend()

template<class T>
clustering::hdbscan::PrimMstBackend< T >::PrimMstBackend ( )
default

Member Function Documentation

◆ run()

template<class T>
void clustering::hdbscan::PrimMstBackend< T >::run ( const NDArray< T, 2 > & X,
std::size_t minSamples,
math::Pool pool,
MstOutput< T > & out )
inline

Build the MRD-weighted minimum spanning tree of X.

Precondition
minSamples is positive and strictly less than X.dim(0).
X.dim(0) does not exceed kPrimMaxN. A violation fires CLUSTERING_ALWAYS_ASSERT before any allocation.
Parameters
XContiguous (n x d) dataset; caller retains ownership.
minSamplesNeighbour count driving the core-distance definition.
poolWorker pool; forwarded to the KDTree build and kNN query.
outDestination; edges filled with n - 1 entries in insertion order and coreDistances sized to n.

Definition at line 104 of file prim_mst_backend.h.


The documentation for this class was generated from the following file: