Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
Loading...
Searching...
No Matches
clustering::hdbscan::NnDescentMstBackend< T > Class Template Reference

Approximate minimum-spanning-tree backend over mutual-reachability distance via an NN-Descent-built kNN graph plus Kruskal, with a connectivity fallback that closes any disconnected components. More...

#include <clustering/hdbscan/policy/nn_descent_mst_backend.h>

Public Member Functions

 NnDescentMstBackend ()=default
 Default-construct with the default NnDescentMstConfig.
 NnDescentMstBackend (NnDescentMstConfig config)
 Construct with an explicit tuning config.
void run (const NDArray< T, 2 > &X, std::size_t minSamples, math::Pool pool, MstOutput< T > &out)
 Build the approximate MRD-weighted minimum spanning tree of X.

Detailed Description

template<class T>
class clustering::hdbscan::NnDescentMstBackend< T >

Approximate minimum-spanning-tree backend over mutual-reachability distance via an NN-Descent-built kNN graph plus Kruskal, with a connectivity fallback that closes any disconnected components.

Given a point matrix X and a core-distance parameter minSamples, the backend:

  1. Builds (or reuses) an clustering::index::NnDescentIndex at k = minSamples + kExtra.
  2. Extracts per-point core distances as the minSamples-th nearest squared distance from the kNN graph.
  3. Promotes the kNN edges to an MRD-weighted undirected edge list: MRD(i, j) = max(coreDist[i], coreDist[j], sqDist(i, j)).
  4. Runs Kruskal on the sorted MRD edge list to produce a spanning forest.
  5. When the forest spans fewer than n components, enumerates minimum-weight bridges between every pair of disjoint components and adds c - 1 bridges Kruskal-style until the tree is connected.

The fallback enumerates n_a * n_b * d arithmetic for each pair of components; it is expected-rare on typical inputs where k = minSamples + kExtra yields a connected kNN graph.

Satisfies MstBackendStrategy. The kNN graph is data-dependent and is rebuilt on every run call; caching it across fits would risk silent failure under in-place mutation of the caller's buffer. Only shape-free scratch (out buffers sized to the current input) amortises across calls at fixed shape.

Template Parameters
TElement type of the point cloud. Only float is supported.

Definition at line 72 of file nn_descent_mst_backend.h.

Constructor & Destructor Documentation

◆ NnDescentMstBackend() [1/2]

template<class T>
clustering::hdbscan::NnDescentMstBackend< T >::NnDescentMstBackend ( )
default

Default-construct with the default NnDescentMstConfig.

Required for the MstBackendStrategy concept. Callers who need custom tuning use NnDescentMstBackend(NnDescentMstConfig).

◆ NnDescentMstBackend() [2/2]

template<class T>
clustering::hdbscan::NnDescentMstBackend< T >::NnDescentMstBackend ( NnDescentMstConfig config)
inlineexplicit

Construct with an explicit tuning config.

Distinct from the zero-arg default ctor to avoid the default-constructor ambiguity in [class.default.ctor]/2.

Definition at line 92 of file nn_descent_mst_backend.h.

Member Function Documentation

◆ run()

template<class T>
void clustering::hdbscan::NnDescentMstBackend< T >::run ( const NDArray< T, 2 > & X,
std::size_t minSamples,
math::Pool pool,
MstOutput< T > & out )
inline

Build the approximate MRD-weighted minimum spanning tree of X.

Precondition
minSamples is positive and strictly less than X.dim(0).
Parameters
XContiguous (n x d) dataset; caller retains ownership.
minSamplesNeighbour count driving the core-distance definition.
poolWorker pool; forwarded to the inner NnDescentIndex build.
outDestination; edges receives the n - 1 MST edges in insertion order and coreDistances is sized to n.

Definition at line 105 of file nn_descent_mst_backend.h.


The documentation for this class was generated from the following file: