Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
clustering::DBSCAN< T, QueryModel > Class Template Reference

Density-based clustering over the eps-neighborhood graph produced by a clustering::index::RangeIndex backend.

#include <clustering/dbscan.h>

Public Member Functions

 DBSCAN (T eps, std::size_t minPts, std::size_t nJobs=0)
 Construct a reusable DBSCAN fitter.
 DBSCAN (const DBSCAN &)=delete
DBSCAN & operator= (const DBSCAN &)=delete
 DBSCAN (DBSCAN &&)=delete
DBSCAN & operator= (DBSCAN &&)=delete
 ~DBSCAN ()=default
void run (const NDArray< T, 2 > &X)
 Fit to X.
const NDArray< std::int32_t, 1 > & labels () const noexcept
 Per-point cluster labels after run; NOISY marks outliers.
std::size_t nClusters () const noexcept
 Total number of clusters discovered by the most recent run.
void reset ()
 Release every scratch buffer. The next run call reallocates against its shape.

Static Public Attributes

static constexpr std::int32_t UNCLASSIFIED = -2
 Sentinel for a point not yet visited.
static constexpr std::int32_t NOISY = -1
 Label assigned to points that no cluster claimed.

Detailed Description

template<class T, class QueryModel = index::AutoRangeIndex<T>>
requires index::RangeIndex<QueryModel, T>
class clustering::DBSCAN< T, QueryModel >

Density-based clustering over the eps-neighborhood graph produced by a clustering::index::RangeIndex backend.

DBSCAN groups points whose eps-ball contains at least minPts neighbors into density-reachable clusters; points that fall outside any cluster are noise. The backend surfaces the whole adjacency in one call so this class never touches pairwise distances directly.

Note
DBSCAN does NOT own X. The caller must keep the NDArray alive for the duration of every run call. Construction is stateless in X, so a single DBSCAN instance can be reused across fits. Repeated runs reallocate the label buffer only when n changes, and the lazily created thread pool, once spawned for the first parallel-eligible shape, is kept for later runs.
Template Parameters
T Element type of the point cloud.
QueryModel Range-index backend. Defaults to clustering::index::AutoRangeIndex, which picks a KD-Tree below bruteForceDimFloor and a blocked pairwise sweep at or above it.

Definition at line 38 of file dbscan.h.
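
The default backend choice described for QueryModel can be pictured as a simple threshold test. The sketch below is illustrative only: the Backend enum and chooseBackend function are hypothetical names, the threshold is assumed to be compared against the point dimension d, and the real value of bruteForceDimFloor is not stated on this page.

```cpp
#include <cstddef>

// Hypothetical names for illustration; not part of the documented API.
enum class Backend { KdTree, BlockedBruteForce };

// dimFloor stands in for bruteForceDimFloor, whose actual value is not
// given here. Assumption: the comparison is made on the dimension d.
constexpr Backend chooseBackend(std::size_t d, std::size_t dimFloor) {
  // Strictly below the floor: tree-based index.
  // At or above the floor: blocked pairwise sweep.
  return d < dimFloor ? Backend::KdTree : Backend::BlockedBruteForce;
}
```

The boundary case matters: a dimension equal to the floor already selects the pairwise sweep.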

Constructor & Destructor Documentation

◆ DBSCAN() [1/3]

template<class T, class QueryModel = index::AutoRangeIndex<T>>
clustering::DBSCAN< T, QueryModel >::DBSCAN ( T eps,
std::size_t minPts,
std::size_t nJobs = 0 )
inline explicit

Construct a reusable DBSCAN fitter.

Parameters
eps Radius of the density neighbourhood used to test reachability.
minPts Minimum neighbour count (including self) that marks a core point.
nJobs Worker count for the range-index backend. A value of 0 is clamped upward to std::thread::hardware_concurrency() so the pool is always usable by the math::Pool helpers.

Definition at line 52 of file dbscan.h.
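
The nJobs rule above can be sketched as a small helper. This is a minimal sketch, not the library's code: resolveJobs is a hypothetical name, and the fallback to one worker when std::thread::hardware_concurrency() reports 0 is an assumption of this sketch, since the standard allows that return value when the count cannot be determined.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>

// Hypothetical helper, not part of the documented API: maps the requested
// worker count to the count that would actually be used.
inline std::size_t resolveJobs(std::size_t nJobs) {
  if (nJobs != 0) {
    return nJobs;  // an explicit request is kept as-is
  }
  // hardware_concurrency() may return 0 when the value is not computable,
  // so this sketch falls back to a single worker (an assumption).
  const std::size_t hw = std::thread::hardware_concurrency();
  return std::max<std::size_t>(hw, 1);
}
```

With this rule, passing the default nJobs=0 never yields an empty pool.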

◆ DBSCAN() [2/3]

template<class T, class QueryModel = index::AutoRangeIndex<T>>
clustering::DBSCAN< T, QueryModel >::DBSCAN ( const DBSCAN< T, QueryModel > & )
delete

◆ DBSCAN() [3/3]

template<class T, class QueryModel = index::AutoRangeIndex<T>>
clustering::DBSCAN< T, QueryModel >::DBSCAN ( DBSCAN< T, QueryModel > && )
delete

◆ ~DBSCAN()

template<class T, class QueryModel = index::AutoRangeIndex<T>>
clustering::DBSCAN< T, QueryModel >::~DBSCAN ( )
default

Member Function Documentation

◆ labels()

template<class T, class QueryModel = index::AutoRangeIndex<T>>
const NDArray< std::int32_t, 1 > & clustering::DBSCAN< T, QueryModel >::labels ( ) const
inline nodiscard noexcept

Per-point cluster labels after run; NOISY marks outliers.

Definition at line 121 of file dbscan.h.
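
As an illustration of consuming the label array, the sketch below counts noise points and per-cluster sizes. It uses std::vector as a stand-in for NDArray< std::int32_t, 1 >, and it assumes cluster ids are contiguous in [0, nClusters), which this page does not state. LabelSummary, summarize and kNoisy are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the documented DBSCAN::NOISY sentinel value.
constexpr std::int32_t kNoisy = -1;

struct LabelSummary {
  std::size_t noise = 0;                  // points labelled kNoisy
  std::vector<std::size_t> clusterSizes;  // clusterSizes[c] = points labelled c
};

// Assumption: cluster ids lie in [0, nClusters). Any other label that is
// not kNoisy (for example an unvisited sentinel) is ignored.
inline LabelSummary summarize(const std::vector<std::int32_t>& labels,
                              std::size_t nClusters) {
  LabelSummary s;
  s.clusterSizes.assign(nClusters, 0);
  for (const std::int32_t label : labels) {
    if (label == kNoisy) {
      ++s.noise;
    } else if (label >= 0 && static_cast<std::size_t>(label) < nClusters) {
      ++s.clusterSizes[static_cast<std::size_t>(label)];
    }
  }
  return s;
}
```

In real use the two arguments would come from labels() and nClusters() after a run call.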

◆ nClusters()

template<class T, class QueryModel = index::AutoRangeIndex<T>>
std::size_t clustering::DBSCAN< T, QueryModel >::nClusters ( ) const
inline nodiscard noexcept

Total number of clusters discovered by the most recent run.

Definition at line 124 of file dbscan.h.

◆ operator=() [1/2]

template<class T, class QueryModel = index::AutoRangeIndex<T>>
DBSCAN & clustering::DBSCAN< T, QueryModel >::operator= ( const DBSCAN< T, QueryModel > & )
delete

◆ operator=() [2/2]

template<class T, class QueryModel = index::AutoRangeIndex<T>>
DBSCAN & clustering::DBSCAN< T, QueryModel >::operator= ( DBSCAN< T, QueryModel > && )
delete

◆ reset()

template<class T, class QueryModel = index::AutoRangeIndex<T>>
void clustering::DBSCAN< T, QueryModel >::reset ( )
inline

Release every scratch buffer. The next run call reallocates against its shape.

Definition at line 127 of file dbscan.h.

◆ run()

template<class T, class QueryModel = index::AutoRangeIndex<T>>
void clustering::DBSCAN< T, QueryModel >::run ( const NDArray< T, 2 > & X)
inline

Fit to X.

Queries the backend for the full eps-neighbourhood adjacency, derives core-point flags from per-row degrees, then expands clusters sequentially via graph BFS over the adjacency. Remaining unclassified points are marked NOISY.

Parameters
X Contiguous n x d dataset. The caller retains ownership; X must outlive this run call.
Warning
X must remain alive and unchanged for the full duration of this call.

Definition at line 78 of file dbscan.h.
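
The four steps described for run (adjacency, core flags, BFS expansion, noise marking) can be sketched in standard C++ as follows. This is a simplified illustration and not the code in dbscan.h: it uses a brute-force 2-D neighbourhood search instead of a RangeIndex backend, assumes Euclidean distance with an inclusive radius, and all names (Point, epsAdjacency, dbscanSketch) are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

constexpr std::int32_t kUnclassified = -2;  // mirrors DBSCAN::UNCLASSIFIED
constexpr std::int32_t kNoisy = -1;         // mirrors DBSCAN::NOISY

struct Point { double x, y; };

// Step 1: full eps-neighbourhood adjacency. Brute force; every row
// contains the point itself, so the row size counts self.
inline std::vector<std::vector<std::size_t>> epsAdjacency(
    const std::vector<Point>& X, double eps) {
  std::vector<std::vector<std::size_t>> adj(X.size());
  for (std::size_t i = 0; i < X.size(); ++i) {
    for (std::size_t j = 0; j < X.size(); ++j) {
      const double dx = X[i].x - X[j].x;
      const double dy = X[i].y - X[j].y;
      if (dx * dx + dy * dy <= eps * eps) adj[i].push_back(j);
    }
  }
  return adj;
}

inline std::vector<std::int32_t> dbscanSketch(const std::vector<Point>& X,
                                              double eps, std::size_t minPts) {
  const auto adj = epsAdjacency(X, eps);
  const std::size_t n = X.size();

  // Step 2: core flags from per-row degrees.
  std::vector<bool> core(n);
  for (std::size_t i = 0; i < n; ++i) core[i] = adj[i].size() >= minPts;

  // Step 3: sequential BFS from each unvisited core point.
  std::vector<std::int32_t> labels(n, kUnclassified);
  std::int32_t next = 0;
  for (std::size_t seed = 0; seed < n; ++seed) {
    if (!core[seed] || labels[seed] != kUnclassified) continue;
    labels[seed] = next;
    std::queue<std::size_t> frontier;
    frontier.push(seed);
    while (!frontier.empty()) {
      const std::size_t u = frontier.front();
      frontier.pop();
      for (const std::size_t v : adj[u]) {
        if (labels[v] != kUnclassified) continue;
        labels[v] = next;
        if (core[v]) frontier.push(v);  // border points join but do not expand
      }
    }
    ++next;
  }

  // Step 4: anything still unvisited is noise.
  for (auto& label : labels) {
    if (label == kUnclassified) label = kNoisy;
  }
  return labels;
}
```

Only core points are pushed onto the frontier, so a border point takes the label of the first cluster that reaches it and never pulls further points in.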

Member Data Documentation

◆ NOISY

template<class T, class QueryModel = index::AutoRangeIndex<T>>
std::int32_t clustering::DBSCAN< T, QueryModel >::NOISY = -1
static constexpr

Label assigned to points that no cluster claimed.

Definition at line 41 of file dbscan.h.

◆ UNCLASSIFIED

template<class T, class QueryModel = index::AutoRangeIndex<T>>
std::int32_t clustering::DBSCAN< T, QueryModel >::UNCLASSIFIED = -2
static constexpr

Sentinel for a point not yet visited.

Definition at line 40 of file dbscan.h.

