The problem statement
Given a set of geometries G1, find the k nearest neighbours for each of these geometries from a second set of geometries G2. For simplicity, I assume G1 and G2 to be sets of points which are stored as simple POINT geometries in PostGIS. Note that G1 and G2 could be the same set, I will use this in the example code.
CREATE TABLE knntst
(
gid serial NOT NULL,
geom geometry,
CONSTRAINT knntst_pk PRIMARY KEY (gid)
);
CREATE INDEX knntst_geom_gist
ON knntst
USING gist
(geom);
INSERT INTO knntst (geom) VALUES ('POINT (0 0)'::geometry), ('POINT (1 0)'::geometry), ('POINT (0 1)'::geometry), ('POINT (1 1)'::geometry), ('POINT (3 0)'::geometry), ('POINT (4 0)'::geometry), ('POINT (3 3)'::geometry), ('POINT (4 5)'::geometry);
Using the KNN operator
There are many ways for solving this problem in PostGIS, but the trouble is to find a way that actually uses the spatial index and that does not require to set a maximum distance beforehand (and that provides the complete output, i.e. all k nearest neighbours for each of the points). The solution is to use the KNN operator "<->", which may be included in the ORDER BY clause. However, one argument to that operator needs to be constant in order to actually take advantage of the spatial index. So, for the given problem, a LATERAL join is required in the following way:
SELECT g1.gid, g2.gid, ST_AsText(g1.geom), ST_AsText(g2.geom), ST_Distance(g1.geom, g2.geom) FROM knntst AS g1 CROSS JOIN LATERAL (SELECT g3.gid, g3.geom FROM knntst AS g3 WHERE g1.gid!=g3.gid ORDER BY g3.geom <-> g1.geom LIMIT 3) AS g2The output table contains the 3 nearest neighbours to each of the points in the source table, excluding the point itself.
Keine Kommentare:
Kommentar veröffentlichen