Classes
struct	MstEdge
	One edge of the minimum spanning tree of mutual-reachability distances. More...
struct	MstOutput
	Frozen output contract of every MST backend. More...
class	AutoMstBackend
	MST backend that dispatches between Prim, Boruvka, and NN-Descent on input shape. More...
class	BoruvkaMstBackend
	KDTree-accelerated Boruvka MST backend over mutual-reachability distances. More...
struct	NnDescentMstConfig
	Tuning knobs for the NN-Descent MST backend. More...
class	NnDescentMstBackend
	Approximate minimum-spanning-tree backend over mutual-reachability distance via an NN-Descent-built kNN graph plus Kruskal, with a connectivity fallback that closes any disconnected components. More...
class	PrimMstBackend
	Exact minimum-spanning-tree backend over mutual-reachability distance, streaming Prim. More...

Concepts
concept	MstBackendStrategy
	Contract for an MST backend satisfying the frozen clustering::hdbscan::MstOutput shape.

Enumerations
enum class	ClusterSelectionMethod : std::uint8_t { kEom , kLeaf }
	Cluster extraction method on the condensed tree. More...
enum class	MinSamplesConvention : std::uint8_t { kSklearn , kCampello }
	Semantics of the `minSamples` parameter at core-distance extraction. More...

Variables
constexpr std::size_t	kPrimMaxN = std::size_t{16384}
	Compute budget that gates the streaming Prim backend, expressed as the maximum point count it will accept.
constexpr std::size_t	kPrimMrdMatrixByteBudget = kPrimMaxN * kPrimMaxN * sizeof(float)
	Equivalent byte-budget phrasing of kPrimMaxN, kept so callers that gate on `nnsizeof`(T) <= kPrimMrdMatrixByteBudget compile unchanged.
constexpr std::size_t	kPrimDenseCoreMinN = 1024
	Thresholds gating the dense symmetric core-distance pass.
constexpr std::size_t	kPrimDenseCoreMinD = 17
constexpr std::size_t	kPrimDenseCoreMaxMinSamples = 64
constexpr std::size_t	kPrimPersistentRelaxMinWorkers = 8
constexpr std::size_t	kPrimPersistentRelaxMinOpsPerWorker = std::size_t{1} << 14

Enumeration Type Documentation

◆ ClusterSelectionMethod

enum class clustering::hdbscan::ClusterSelectionMethod : std::uint8_t

strong

Cluster extraction method on the condensed tree.

The two methods differ in traversal behaviour, not in output shape. Excess-of-mass is the default and optimizes per-cluster stability over the full tree; leaf returns every leaf of the condensed tree as its own cluster.

Enumerator
kEom	Excess-of-mass selection (the published default).
kLeaf	Leaf-cluster selection; every condensed-tree leaf becomes a cluster.

Definition at line 37 of file hdbscan.h.

◆ MinSamplesConvention

enum class clustering::hdbscan::MinSamplesConvention : std::uint8_t

strong

Semantics of the minSamples parameter at core-distance extraction.

Campello 2015 defines core_k(p) as the distance from p to its k-th nearest neighbour excluding p itself. scikit-learn and the original hdbscan package count p itself as one of the k neighbours, so for a given min_samples = k those implementations return what Campello would call core_{k-1}. Both conventions are widely deployed and produce different MSTs on high-dimensional near-uniform inputs; the enum is the API boundary between them.

Enumerator

kSklearn

minSamples counts the query point itself as one of the k neighbours.

The core distance returned is the Campello core_{minSamples-1}. Matches scikit-learn and the hdbscan package.

kCampello

minSamples counts only non-self neighbours, matching Campello 2015 directly.

Definition at line 52 of file hdbscan.h.

Variable Documentation

◆ kPrimDenseCoreMaxMinSamples

std::size_t clustering::hdbscan::kPrimDenseCoreMaxMinSamples = 64

inlineconstexpr

Definition at line 53 of file prim_mst_backend.h.

◆ kPrimDenseCoreMinD

std::size_t clustering::hdbscan::kPrimDenseCoreMinD = 17

inlineconstexpr

Definition at line 52 of file prim_mst_backend.h.

◆ kPrimDenseCoreMinN

std::size_t clustering::hdbscan::kPrimDenseCoreMinN = 1024

inlineconstexpr

Thresholds gating the dense symmetric core-distance pass.

The dense pass computes all n*(n-1)/2 squared distances once and feeds both endpoints into a per-row top-minSamples tracker. Its total work is O(n^2 * d) distance compute plus O(n^2 * minSamples) top-k bookkeeping. It beats the KDTree kNN fallback once minSamples is small enough for the top-k write amp to stay bounded and (n, d) is large enough to amortise the symmetric scan's fixed cost. Below these bounds the KDTree path wins; above kPrimDenseCoreMaxMinSamples the top-k rescan per update dominates.

Definition at line 51 of file prim_mst_backend.h.

◆ kPrimMaxN

std::size_t clustering::hdbscan::kPrimMaxN = std::size_t{16384}

inlineconstexpr

Compute budget that gates the streaming Prim backend, expressed as the maximum point count it will accept.

The streaming Prim variant materialises only O(n) state (no n*n MRD matrix), but the inner relax recomputes d-wide squared-Euclidean distances per popped vertex, costing O(n^2 * d) total scalar work. Beyond this n the dispatcher should prefer NN-Descent (high d) or Boruvka (low d), both of which scale better than O(n^2).

Definition at line 34 of file prim_mst_backend.h.

◆ kPrimMrdMatrixByteBudget

std::size_t clustering::hdbscan::kPrimMrdMatrixByteBudget = kPrimMaxN * kPrimMaxN * sizeof(float)

inlineconstexpr

Equivalent byte-budget phrasing of kPrimMaxN, kept so callers that gate on n*n*sizeof(T) <= kPrimMrdMatrixByteBudget compile unchanged.

Numerically equal to kPrimMaxN squared, scaled by sizeof(float).

Definition at line 39 of file prim_mst_backend.h.

◆ kPrimPersistentRelaxMinOpsPerWorker

std::size_t clustering::hdbscan::kPrimPersistentRelaxMinOpsPerWorker = std::size_t{1} << 14

inlineconstexpr

Definition at line 55 of file prim_mst_backend.h.

◆ kPrimPersistentRelaxMinWorkers

std::size_t clustering::hdbscan::kPrimPersistentRelaxMinWorkers = 8

inlineconstexpr

Definition at line 54 of file prim_mst_backend.h.

Classes

Concepts

Enumerations

Variables

Enumeration Type Documentation

◆ ClusterSelectionMethod

◆ MinSamplesConvention

Variable Documentation

◆ kPrimDenseCoreMaxMinSamples

◆ kPrimDenseCoreMinD

◆ kPrimDenseCoreMinN

◆ kPrimMaxN

◆ kPrimMrdMatrixByteBudget

◆ kPrimPersistentRelaxMinOpsPerWorker

◆ kPrimPersistentRelaxMinWorkers