-2

This question was moved to SE Overflow. Link


Problem

Given are $n=10^4$ points and $m=10^3$ line segments in a plane. The task is to determine the closest segment to each point. A fast solution in Python is preferred.

Specifications

Line segments are defined by their endpoints and the distance is Euclidean. Lines can cross. The minimum distance should be calculated to the entire segment, not just the endpoints. The distance to the nearest line segment is unnecessary.

Graphical example

This picture demonstrates how to find the closest line segment for a small number of points and segments. The points are colored based on their nearest line segment.

enter image description here

Possible solution strategy

One possible method to achieve this is by using a Voronoi tessellation of line segments and determining which cell a point lies in (link). However, calculating this tessellation is not easy and would require external packages. Also crossing lines could be a problem. Additionally, I am not concerned with the exact shape of the Voronoi cells. Although I do not want to exclude this method, it appears that there may be a simpler solution by vectorizing the distance calculation.

The picture shows the Voronoi tessellation of the previous example. Voronoi cells are colored based on line segment color.

enter image description here

Hints for solution

A solution has already been presented that is fast. For the given CPU it takes $0.2$s for $n=10^4,m=10^3$ and $7.5$s for $n=10^5,m=10^4$.

Can a substantially faster algorithm be implemented in Python, achieving at least a $30\%$ improvement in speed over the existing solution? More precisely, if the existing solution (function dist_fast) takes time $x$ on an individual CPU, then the improved solution should take at most time $0.7x$. For better comparison the case $n=10^4,m=10^3$ and the case $n=10^5,m=10^4$ can be tested.

  • What do you mean by “vectorize”? – joriki Jan 25 '24 at 08:37
  • https://www.mathworks.com/help/matlab/matlab_prog/vectorization.html – granular_bastard Jan 25 '24 at 15:51
  • As phrased, the problem that you are asking people to solve is a programming problem, not a mathematics problem. This not really a good fit for Math SE. We expect problems here to be fundamentally mathematical in nature. – Xander Henderson Feb 02 '24 at 13:58
  • Some days ago I flagged this question and suggested that maybe SE Overflow would fit better and then the question could be moved by a moderator. But this flag was declined. But strictly speaking this problem is between mathematics and programming, as mathematical methods could help to find another solution that might speed up the code. The tag Computational geometry shows that it is about programming in mathematics. – granular_bastard Feb 02 '24 at 15:04
  • I have consulted with the moderators at SO. Their impression is that this question might be on-topic there, without the self-answer. They are, however, concerned that it would be closed as a duplicate of https://stackoverflow.com/q/849211/. Because I do not expect this question to be well-received on SO, I will not be migrating it. You are welcome to try your luck with deleting this question and reposting it to SO. I have no idea if it will be well-received there. – Xander Henderson Feb 04 '24 at 21:09

2 Answers2

1

Vectorization of one loop can increase speed by 100 times

By modifying the Python algorithm from this post I was able to achieve a speed improvement of a factor of 100 (function dist_fast in the code below). I vectorized over m and calculated repeating steps in advance.

For n=$10^4$ and m=$10^3$, the given CPU takes $20$s when the original algorithm is looped over n and m, but only $0.2$s when we loop only over m.


import numpy as np, time

def dist_fast(p,segment): n=p.shape[1] segment_closest=np.zeros(n,dtype=int) dxs=segment[2,:]-segment[0,:] dys=segment[3,:]-segment[1,:] q=dxs2+dys2;dxsq=dxs/q;dysq=dys/q for i in range(n): dx=p[0,i]-segment[0];dy=p[1,i]-segment[1] u=np.clip(dxdxsq+dydysq,0,1) sqdist = (dx-udxs)2+(dy-udys)**2 segment_closest[i]=np.argmin(sqdist) return segment_closest

def dist_slow(p,segment):

#use distance function from https://stackoverflow.com/a/2233538 def dist(x1, y1, x2, y2, x3, y3): px = x2-x1 py = y2-y1 norm = pxpx + pypy u = ((x3 - x1) * px + (y3 - y1) * py) / float(norm) if u > 1: u = 1 elif u < 0: u = 0 x = x1 + u * px y = y1 + u * py dx = x - x3 dy = y - y3 sqdist = dxdx + dydy return sqdist

n=p.shape[1];m=segment.shape[1] segment_closest=np.zeros(n,dtype=int) d=np.zeros(m) for i in range(n): x3,y3=p[0,i],p[1,i] for j in range(m): x1,y1,x2,y2=segment[0,j],segment[1,j],segment[2,j],segment[3,j] d[j]=dist(x1,y1,x2,y2,x3,y3) segment_closest[i]=np.argmin(d) return segment_closest

n=104 #number of points
m=10
3 #number of line segments

#define repeatable random numbers np.random.seed(0) #create random points and line segments for testing p=np.random.rand(2,n) #x,y of points segment=np.random.rand(4,m) #x1,y1,x2,y2 start and end of line segments

#fast algorithm, vectorized, 1 loop t1=time.time()
segment_closest1=dist_fast(p,segment) time_fast=time.time()-t1

#slow algorithm, 2 loops t1=time.time()
segment_closest2=dist_slow(p,segment) time_slow=time.time()-t1

print('Time vectorized algorithm: ',time_fast) print('Time non-vectorized algorithm: ',time_slow) print('Results are equal?',np.array_equal(segment_closest1,segment_closest2))

  • Just for the record, you offer some reputation bounty for a 30% improvement in speed and answer your query yourself with a 100 times faster solution, an improvement of 9900% -- some advertisement or a chance find? – m-stgt Jan 30 '24 at 16:29
  • achieving at least a 30% improvement in speed over the existing solution? means: If the already existing fast solution takes $x$ milliseconds, then the improved solution should take at most $0.7x$ milliseconds. – granular_bastard Jan 30 '24 at 16:32
0

A quick modification of your code using numba.

pip install numba==0.58.1
import numpy as np, time

#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< from numba import jit

@jit def jit_dist_fast(p,segment): n=p.shape[1] segment_closest=np.zeros(n,dtype=int) dxs=segment[2,:]-segment[0,:] dys=segment[3,:]-segment[1,:] q=dxs2+dys2;dxsq=dxs/q;dysq=dys/q for i in range(n): dx=p[0,i]-segment[0];dy=p[1,i]-segment[1] u=np.clip(dxdxsq+dydysq,0,1) sqdist = (dx-udxs)2+(dy-udys)**2 segment_closest[i]=np.argmin(sqdist) return segment_closest #>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

def dist_fast(p,segment): n=p.shape[1] segment_closest=np.zeros(n,dtype=int) dxs=segment[2,:]-segment[0,:] dys=segment[3,:]-segment[1,:] q=dxs2+dys2;dxsq=dxs/q;dysq=dys/q for i in range(n): dx=p[0,i]-segment[0];dy=p[1,i]-segment[1] u=np.clip(dxdxsq+dydysq,0,1) sqdist = (dx-udxs)2+(dy-udys)**2 segment_closest[i]=np.argmin(sqdist) return segment_closest

def dist_slow(p,segment):

#use distance function from https://stackoverflow.com/a/2233538 def dist(x1, y1, x2, y2, x3, y3): px = x2-x1 py = y2-y1 norm = pxpx + pypy u = ((x3 - x1) * px + (y3 - y1) * py) / float(norm) if u > 1: u = 1 elif u < 0: u = 0 x = x1 + u * px y = y1 + u * py dx = x - x3 dy = y - y3 sqdist = dxdx + dydy return sqdist

n=p.shape[1];m=segment.shape[1] segment_closest=np.zeros(n,dtype=int) d=np.zeros(m) for i in range(n): x3,y3=p[0,i],p[1,i] for j in range(m): x1,y1,x2,y2=segment[0,j],segment[1,j],segment[2,j],segment[3,j] d[j]=dist(x1,y1,x2,y2,x3,y3) segment_closest[i]=np.argmin(d) return segment_closest

n=104 #number of points
m=10
3 #number of line segments

#define repeatable random numbers np.random.seed(0) #create random points and line segments for testing p=np.random.rand(2,n) #x,y of points segment=np.random.rand(4,m) #x1,y1,x2,y2 start and end of line segments

#fast algorithm, vectorized, 1 loop t1=time.time()
segment_closest1=dist_fast(p,segment) time_fast=time.time()-t1

#slow algorithm, 2 loops t1=time.time()
segment_closest2=dist_slow(p,segment) time_slow=time.time()-t1 #>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

jitted algorithm

t1=time.time()
segment_closest3=jit_dist_fast(p,segment) time_jitted_first_run=time.time()-t1

t1=time.time()
segment_closest3=jit_dist_fast(p,segment) time_faster=time.time()-t1 #>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> print('Time vectorized algorithm: ',time_fast) print('Time non-vectorized algorithm: ',time_slow)

#<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< print('Time first run of jitted vectorized algorithm: ',time_jitted_first_run) print('Time jitted vectorized algorithm: ',time_faster) #>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> print('Results are equal?',np.array_equal(segment_closest1,segment_closest2)) print('Results are equal?',np.array_equal(segment_closest1,segment_closest3))

Using jit basically compiles the python like c, so the first run will cause some overhead. Later runs will be faster (because compiled). I find around 75% improvement over the original. If you intention is to run this algorithm many times, then using jit definitely gains.

output:

Time vectorized algorithm:  0.2614469528198242
Time non-vectorized algorithm:  17.305463314056396
Time first run of jitted vectorized algorithm:  2.0804057121276855
Time jitted vectorized algorithm:  0.048833370208740234
Results are equal? True
Results are equal? True
Yimin
  • 3,311
  • 17
  • 32
  • Could the first run be carried out with dummy variables with few entries? – granular_bastard Jan 30 '24 at 18:04
  • yes. you can do that! but the overhead should be more or less similar.@granular_bastard – Yimin Jan 30 '24 at 18:16
  • another way is to use something like cffi or cpython to compile your code into lib based on generated c code and then call the lib. – Yimin Jan 30 '24 at 18:19
  • there is a way to avoid the compiling during run time. see https://numba.pydata.org/numba-doc/dev/user/pycc.html – Yimin Jan 30 '24 at 18:21