2

Given a set of 2D points X, I would like to find a convex hull consisting of maximum n points. Of course, this is not always possible. Therefore, I am looking for an approximate convex hull consisting of max. n points that maximally covers the set of points X.

Stated more formally, if F(H,X) returns the amount of points of X the convex hull H covers, where |H| is the amount of points out of which the hull is constructed, then I seek the following quantity:

H_hat = argmax_H F(H,X), s.t |H|<=n

Another way to regard the problem is the task of finding the polygon consisting of max. n corners of a set X such that it maximally covers said set X.

The way I've come up with is the following:

X = getSet() \\ Get the set of 2D points
H = convexHull(X) \\ Compute the convex hull
while |H| > n do
    n_max = 0
    for h in H:
        H_ = remove(h,H) \\ Remove one point of the convex hull
        n_c = computeCoverage(X,H_) \\ Compute the coverage of the modified convex hull.
        \\ Save which point can be removed such that the coverage is reduced minimally.
        if n_c > n_max:
            n_max = n_c
            h_toremove = h
    \\ Remove the point and recompute the convex hull.
    X = remove(h,X)
    H = convexHull(X)

However, this is a very slow way of doing this. I have a large set of points and n (the constraint) is small. This means that the original convex hull contains a lot of points, hence the while-loop iterates for a very long time. Is there a more efficient way of doing this? Or are there any other suggestions for approximate solutions?

9
  • Does H_hat have to consist of original points or can you add new ones?
    – kafman
    Commented Feb 8, 2019 at 12:50
  • 1
    Hey, I added computational-geometry as a tag since I know of at least one expert who follows it specifically. You did a great job tagging, so I had to drop out "convex". Hope that's OK; you can make adjustments as desired. Commented Feb 8, 2019 at 14:41
  • A major difficulty is that the size of the hull does not decrease monotonically when you remove points.
    – user1196549
    Commented Feb 8, 2019 at 15:10
  • I doubt that the optimal solution can be found incrementally. Think of the case H=3 and assume that you found a triangle covering 5 points. Maybe there is another at a completely different place enclosing 100 of them.
    – user1196549
    Commented Feb 8, 2019 at 15:15
  • 1
    @DavidEisenstat: what I mean is that the problem doesn't seem to have nice local properties. Changing one vertex might change 90% of the coverage.
    – user1196549
    Commented Feb 8, 2019 at 15:21

3 Answers 3

2

A couple of ideas.

  1. The points that we exclude when deleting a vertex lie in the triangle formed by that vertex and the two vertices adjacent to it on the convex hull. Deleting any other vertex does not affect the set of potentially excluded points. Therefore, we only have to recompute coverage twice for each deleted vertex.

  2. Speaking of recomputing coverage, we don't necessarily have to look at every point. This idea doesn't improve the worst case, but I think it should be a big improvement in practice. Maintain an index of points as follows. Pick a random vertex to be the "root" and group the points by which triangle formed by the root and two other vertices contains them (O(m log m) time with a good algorithm). Whenever we delete a non-root vertex, we unite and filter the two point sets for the triangles involving the deleted vertex. Whenever we recompute coverage, we can scan only points in two relevant triangles. If we ever delete this root, choose a new one and redo the index. The total cost of this maintenance will be O(m log^2 m) in expectation where m is the number of points. It's harder to estimate the cost of computing coverage, though.

  3. If the points are reasonably uniformly distributed within the hull, maybe use area as a proxy for coverage. Store the vertices in a priority queue ordered by the area of their triangle formed by their neighbors (ear). Whenever we delete a point, update the ear area of its two neighbors. This is an O(m log m) algorithm.

1

May be the following approach could work for you: Initially, compute the convex hull H. Then select a subset of n points at random from H, which constructs a new polygon that might not cover all the points, so let's call it quasi-convex hull Q. Count how many points are contained in Q (inliers). Repeat this for a certain amount of times and keep the Q proposal with the most inliers.

This seems a bit related to RANSAC, but for this task, we don't really have a notion of what an outlier is, so we can't really estimate the outlier ratio. Hence, I don't know how good the approximation will be or how many iterations you need to get a reasonable result. May be you can add some heuristics instead of choosing the n points purely at random or you can have a threshold of how many points should at least be contained in Q and then stop when you reach that threshold.

Edit

Actually after thinking about it, you could use a RANSAC approach:

max_inliers = 0
best_Q = None
while(true):
    points = sample_n_points(X)
    Q = construct_convex_hull(points)
    n_inliers = count_inliers(Q, X)
    if n_inliers > max_inliers:
        max_inliers = n_inliers
        best_Q = Q
    if some_condition:
        break

The advantage of this is that the creation of the convex hull is faster than in your approach, as it only uses a maximum of n points. Also, checking the amount of inliers should be fast as it can be just a bunch of sign comparisons with each leg of the convex hull.

1

The following does not solve the question in the way I have formulated it, but it solved the problem which spawned the question. Therefore I wanted to add it in case anybody else encounters something similar.

I investigated two approaches:

1) Regards the convex hull as a polygon and apply a polygon simplification algorithm on it. The specific algorithm I investigated was the Ramer-Douglas-Peucker Algorithm.

2) Apply the algorithm described in in the question, without recomputing the convex hull.

Neither approaches will give you (as far as I can tell), the desired solution to the optimization problem stated, but for my tasks they worked sufficiently well.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.