Clustering
C++20 header-only: DBSCAN, HDBSCAN, k-means.
Loading...
Searching...
No Matches
clustering::KDTree< T, distanceType, LeafSize, AllocT > Class Template Reference

Implements a KDTree data structure. More...

#include <clustering/index/kdtree.h>

Public Types

using value_type = T
 Element type of the indexed point cloud.

Public Member Functions

 KDTree (const NDArray< T, 2 > &points)
 Constructs a KDTree using a given set of points.
std::vector< std::size_t > query (const NDArray< T, 1 > &query_point, T radius, std::int64_t limit=-1) const
 Finds points within a specified radius of a query point.
std::vector< std::vector< std::int32_t > > query (T radius, math::Pool pool) const
 Returns the full radius-neighborhood adjacency over the indexed point cloud.
std::pair< NDArray< std::int32_t, 2 >, NDArray< T, 2 > > knnQuery (std::int32_t k, math::Pool pool) const
 Returns the k nearest neighbours of every indexed point, self-excluded.
std::pair< std::span< const T >, std::span< const T > > nodeBounds (const KDTreeNode *node) const noexcept
 Axis-aligned bounding box of the points routed through node.
std::span< const std::size_t > indexPermutation () const noexcept
 Permutation from reordered-slot index to original point index.
std::span< const T > reorderedPoints () const noexcept
 Points in reordered (tree-build) order as a flat row-major buffer.
std::size_t nodeCount () const noexcept
 Total node count, equal to one past the largest m_id assigned during construction.
const KDTreeNoderoot () const noexcept
 Root of the tree; nullptr for an empty point set.
std::size_t dim () const noexcept
 Dimension count of the indexed point cloud.
 ~KDTree ()
 Destroys the KDTree, deallocating its nodes.

Detailed Description

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
class clustering::KDTree< T, distanceType, LeafSize, AllocT >

Implements a KDTree data structure.

KDTree is a space-partitioning data structure for organizing points in a K-dimensional space. It is efficient in range-search and nearest neighbor search. This implementation contains a vector of points and an allocator for KDTree nodes, with the root node representing the tree's starting point.

Template Parameters
TData type of the points.
LeafSizeMaximum number of points in a leaf node before splitting (default 16).
AllocTAllocator type for KDTree nodes, defaults to LinearAllocator<KDTreeNode>.
Warning
The KDTree does not manage the lifecycle of the points array. Ensure the array remains valid during KDTree's lifetime.
Usage Example
NDArray<float, 2> points = ...;
KDTree<float> kdtree(points);
auto indices = kdtree.query(query_point, radius);
for (size_t index : indices) {
std::cout << index << std::endl;
}
KDTree(const NDArray< T, 2 > &points)
Constructs a KDTree using a given set of points.
Definition kdtree.h:108
Represents a multidimensional array (NDArray) of a fixed number of dimensions N and element type T.
Definition ndarray.h:136

Definition at line 92 of file kdtree.h.

Member Typedef Documentation

◆ value_type

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
using clustering::KDTree< T, distanceType, LeafSize, AllocT >::value_type = T

Element type of the indexed point cloud.

Definition at line 94 of file kdtree.h.

Constructor & Destructor Documentation

◆ KDTree()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
clustering::KDTree< T, distanceType, LeafSize, AllocT >::KDTree ( const NDArray< T, 2 > & points)
inline

Constructs a KDTree using a given set of points.

Initializes the KDTree with a given array of points and prepares the allocator. This constructor builds the KDTree by sorting the points and constructing nodes accordingly.

Parameters
pointsNDArray of points to build the KDTree.
Note
The points array must remain valid for the lifetime of the KDTree, as the tree does not manage the array's lifecycle.

Definition at line 108 of file kdtree.h.

◆ ~KDTree()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
clustering::KDTree< T, distanceType, LeafSize, AllocT >::~KDTree ( )
inline

Destroys the KDTree, deallocating its nodes.

Deallocates all nodes of the KDTree, if the allocator supports deallocation. This is to ensure that no memory leaks occur from dynamically allocated nodes.

Definition at line 352 of file kdtree.h.

Member Function Documentation

◆ dim()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
std::size_t clustering::KDTree< T, distanceType, LeafSize, AllocT >::dim ( ) const
inlinenodiscardnoexcept

Dimension count of the indexed point cloud.

Definition at line 343 of file kdtree.h.

◆ indexPermutation()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
std::span< const std::size_t > clustering::KDTree< T, distanceType, LeafSize, AllocT >::indexPermutation ( ) const
inlinenodiscardnoexcept

Permutation from reordered-slot index to original point index.

Element k is the original row index of the point stored at reordered slot k. A leaf with m_index = base and m_dim = count owns slots [base, base + count); its original indices are permutation[base .. base + count - 1]. Stable for the tree's lifetime; pointer invalidation follows the tree's move and destruction.

Returns
Length- N span over the permutation buffer.

Definition at line 310 of file kdtree.h.

◆ knnQuery()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
std::pair< NDArray< std::int32_t, 2 >, NDArray< T, 2 > > clustering::KDTree< T, distanceType, LeafSize, AllocT >::knnQuery ( std::int32_t k,
math::Pool pool ) const
inlinenodiscard

Returns the k nearest neighbours of every indexed point, self-excluded.

For every row i of the original point cloud, the method finds the k original-index points minimizing squared Euclidean distance to row i (excluding i itself) and writes the result as two parallel (n x k) arrays: indices[i][j] is the original index of the j-th closest neighbour, sqDists[i][j] is its squared distance. Each row is sorted ascending by sqDist. Ties in distance resolve on smaller neighbour index so results are reproducible bit-for-bit across runs at matched input.

Traversal is depth-first with the bounded max-heap's current worst retained distance serving as the pruning bound (heap.top().first). Until the heap fills, the bound is std::numeric_limits<T>::max() and pruning is inactive.

Precondition
k >= 1 and k < n.
Parameters
kNumber of neighbours per point (self-excluded). The signed 32-bit argument mirrors the neighbour-index type carried through the (indices, sqDists) output; the preconditions clamp it to [1, n).
poolParallelism injection for the outer per-point loop; follows the shouldParallelize policy the radius-query path uses.
Returns
Pair of two arrays: .first is an (n x k) std::int32_t array of neighbour indices; .second is an (n x k) T array of squared distances.

Definition at line 241 of file kdtree.h.

◆ nodeBounds()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
std::pair< std::span< const T >, std::span< const T > > clustering::KDTree< T, distanceType, LeafSize, AllocT >::nodeBounds ( const KDTreeNode * node) const
inlinenodiscardnoexcept

Axis-aligned bounding box of the points routed through node.

For a leaf, the bounds enclose every point stored at the leaf. For an internal node, the bounds enclose the union of the child boxes plus the pivot. Callers consume the result as two d-element spans into the tree's contiguous bounds buffer; the pointers are valid for the tree's lifetime.

Parameters
nodeNode to query; must belong to this tree. Passing nullptr is a precondition violation.
Returns
Pair of spans (min, max), each of length d.

Definition at line 293 of file kdtree.h.

◆ nodeCount()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
std::size_t clustering::KDTree< T, distanceType, LeafSize, AllocT >::nodeCount ( ) const
inlinenodiscardnoexcept

Total node count, equal to one past the largest m_id assigned during construction.

Callers that side-table per-node state (e.g. the Boruvka per-round single-component cache) size their buffer to this count and index by node->m_id.

Definition at line 335 of file kdtree.h.

◆ query() [1/2]

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
std::vector< std::size_t > clustering::KDTree< T, distanceType, LeafSize, AllocT >::query ( const NDArray< T, 1 > & query_point,
T radius,
std::int64_t limit = -1 ) const
inline

Finds points within a specified radius of a query point.

This method returns the indices of points within a given radius from the query point. It searches the KDTree efficiently by pruning branches that fall outside the search radius.

Parameters
query_pointThe point around which to search.
radiusThe search radius.
limitMaximum number of points to return. If -1, returns all points within the radius.
Returns
Vector of indices of points within the search radius.

Definition at line 155 of file kdtree.h.

◆ query() [2/2]

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
std::vector< std::vector< std::int32_t > > clustering::KDTree< T, distanceType, LeafSize, AllocT >::query ( T radius,
math::Pool pool ) const
inlinenodiscard

Returns the full radius-neighborhood adjacency over the indexed point cloud.

Walks the tree once per row, fanning the outer loop out over pool. Lets DBSCAN consume the whole neighbor graph in one call rather than threading a per-point query loop through the caller.

Parameters
radiusNon-negative neighbourhood radius; comparison runs on the squared distance.
poolParallelism injection used to fan the outer row loop out across workers.
Returns
Length-n vector where element i lists every j with ||x_i - x_j||^2 <= radius^2.

Definition at line 183 of file kdtree.h.

◆ reorderedPoints()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
std::span< const T > clustering::KDTree< T, distanceType, LeafSize, AllocT >::reorderedPoints ( ) const
inlinenodiscardnoexcept

Points in reordered (tree-build) order as a flat row-major buffer.

The element at flat offset (slot * d + j) is the j-th coordinate of the point stored at reordered slot slot; this equals originalPoints(permutation[slot], j). Consumers that already have a KDTreeNode in hand can index through this buffer contiguously rather than chasing the permutation; leaf ranges are therefore a count x d block of neighbouring cache lines with no scatter on the leaf scan.

Returns
Length- N*d span over the reordered-points buffer.

Definition at line 325 of file kdtree.h.

◆ root()

template<class T, KDTreeDistanceType distanceType = KDTreeDistanceType::kEucledian, std::size_t LeafSize = 16, class AllocT = LinearAllocator<KDTreeNode>>
const KDTreeNode * clustering::KDTree< T, distanceType, LeafSize, AllocT >::root ( ) const
inlinenodiscardnoexcept

Root of the tree; nullptr for an empty point set.

Definition at line 340 of file kdtree.h.


The documentation for this class was generated from the following file: