8. Computational Geometry II

Range Searching

Introduction

There are a large number of geometric objects. How do you find all objects intersecting/within a given area/volume?

We assume:

1D range searching

Determine which points lie in [rmin,rmax][r_{min}, r_{max}]:

matches = [x for x in nums if rmin <= x <= rmax]

To improve performance, sort the items: O(nlogn)O(n\log{n}). Each query will use binary search to find the smallest point in the range, and then iterate through the list until the item is out of the range. Hence, the cost of each query is O(logn+m)O(\log{n} + m), where mm is the number of points in the range.

However, this approach does not generalize to multiple dimensions.

1D Binary Search Tree (1D k-d tree)

To create the search tree, sort the elements; the left subtree should contain the lower half of the list, plus element with index len(list)//2 - 1. Repeat the process with the subtrees until you reach a leaf node.

from collections import namedtuple
Node = namedtuple("Node", ["value""left""right" # Cheap way to make a class

def search_tree(nums, sorted=False):
  if not sorted:
    nums = sorted(nums)

  if len(nums) == 1:
    tree = Node(nums[0], NoneNone# Leaf node
    else:
      midpoint = len(nums) // 2 # Halfway, rounding down
      left = search_tree(nums[:midpoint], True)
      # Up to but not including midpoint
      right = search_tree(nums[midpoint:], True)
      # Including the midpoint and the end
      tree = Node(nums[midpoint - 1], left, right)
      # Left child will be equal to parent, which meets the
      # less than or equal to criteria

    return tree
def range_query(tree, low, high):
  # Inclusive range
  matches = []
  if tree.left is None and tree.right is None:
    if low <= tree.value <= high:
      matches.append(tree.value)
  else:
    if low <= node.value and tree.left is not None:
      matches += query(tree.left, low, high)
    if node.value < high and tree.right is not None:
      # Range may include both children, so use if and not else
      matches += range_query(tree.right, low, high)
  return matches

The algorithm visits at least one leaf node, so the minimum cost is O(logn)O(\log{n}).

If it visits mm leaves, then it also traversed m/2m/2 parents, m/4m/4, grandparents, \dots, 1 root node.

m2+m4+m8+2m\frac{m}{2} + \frac{m}{4} + \frac{m}{8} + \dots \approx 2m, so the order of complexity is O(logn+m)O(\log{n+m}).

k-d trees

Spatial Subdivision

Regular grid: regular divisions.

Quad tree: regular grid, but each cell contains a sufficiently small number of cells; if it has more, recursively subdivide it into quarters.

k-d tree: divide cell into halves along the median (each half contains the same number of cells) on one axis. For each half, divide it again into equal halves again, but this time on the other axis. Rinse and repeat until there a sufficiently small number of cells in each cell or a maximum depth is reached.

Differences:

def kd_tree(points, depth = 0):
  if len(points) <= threshold or depth >= max_depth:
    return Leaf(points)
  else:
    axis = depth % num_axes
    points.sort(key=lambda point: point[axis])
    mid_index = len(points) // 2
    midpoint = points[mid_index - 1][axis]
    # Subtract 1 since left <= midpoint, left does not include midpoint
    left =  kd_tree(points[:mid_index], depth + 1)
    right = kd_tree(points[mid_index:], depth + 1)
    return Node(axis, midpoint, left, right)

k-d Range Search:

def points_in_range(node, query_shape):
  if node is leaf:
    matches = points in node.points within shape
  else:
    matches = []
    if intersection of shape and lower half space not empty:
      matches += points_in_range(node.left, query_shape)
    if intersection of space and upper half space not empty:
      matches += points_in_range(node.right, query_shape)
  
  return matches

Unlike in the one dimensional case, multiple points can have the same value for a given dimension as when the split occurs on another dimension, it could end up on either side. Hence, both the lower and upper half of the shape can include points on the line.

Closest Neighbor

For a given point (that does not need to be in the tree), find the closest point within the tree.

Quadtrees

This only works in 2D. Performance:

Line Sweep

Data structures:

Closest Point-Pair

nn points on a plane: find the points PP and QQ such that they are the minimum distance from each other.

In the 1D case, simply sort then search through the list sequentially: O(nlogn)O(n\log{n}).

2D Case

Idea:

Data structure:

At each step:

The frontier set must be sorted by yy to make it efficient and have efficient addition and removal. Hence, a self-balancing binary tree or SortedContainer could be used.

Complexity: