geomcompare package
GeomCompare
The GeomCompare package provides multiple tools for comparing two independent sets of geometrical features.
Documentation for GeomCompare is available in the form of docstrings provided with the code, as well as on the project’s homepage https://geomcompare.readthedocs.io/en/latest/.
Available submodules
- geomrefdb
Defines the main classes of the library used for comparing geometry datasets.
- io
Provides a set of tools for I/O operations, extracting geometrical features from disk or from a PostGIS database, as well as writing a dataset of geometries to disk.
- comparefunc
Defines a few comparison functions to use with the geomrefdb’s main classes.
- geomutils
Defines a few functions and types to work with shapely geometrical objects.
- stats
Defines functions for computing classifier metrics (e.g. when comparing a result dataset from a machine learning model with a reference dataset).
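The function names of the stats submodule are not listed on this page, so the following is only a standalone sketch of the kind of metrics involved, not the library's API. It assumes that the counts of matching input features, non-matching input features and unmatched reference features play the roles of true positives, false positives and false negatives. The helper name classifier_metrics is hypothetical.

```python
def classifier_metrics(n_true_pos, n_false_pos, n_missing):
    """Precision, recall and F1 score from match counts (hypothetical helper).

    n_missing is the number of reference features without any matching
    input feature, used here as the false-negative count (an assumption).
    """
    predicted = n_true_pos + n_false_pos
    reference = n_true_pos + n_missing
    precision = n_true_pos / predicted if predicted else 0.0
    recall = n_true_pos / reference if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


metrics = classifier_metrics(n_true_pos=8, n_false_pos=2, n_missing=2)
```

Several input features may match the same reference feature, so the two sides of the comparison do not always yield the same true-positive count. The sketch ignores this subtlety.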
Submodules
geomcompare.comparefunc module
geomcompare.geomrefdb module
- class geomcompare.geomrefdb.PostGISGeomRefDB(PG_params, PG_schema, PG_table, PG_geoms_column)[source]
Bases:
geomcompare._geomrefdb_abc.GeomRefDB
- true_positives(geoms_iter, geoms_EPSG, same_geoms_func)[source]
Return an iterable of the input geometries that match geometries of the GeomRefDB instance.
- class geomcompare.geomrefdb.RtreeGeomRefDB(geoms_iter, geoms_EPSG)[source]
Bases:
geomcompare._geomrefdb_abc.GeomRefDB
- true_positives(geoms_iter, geoms_EPSG, same_geoms_func)[source]
Return an iterable of the input geometries that match geometries of the GeomRefDB instance.
- geomcompare.geomrefdb.SpatialiteGeomType = typing.Literal['Point', 'LineString', 'Polygon', 'MultiPoint', 'MultiLineString', 'MultiPolygon', 'GeometryCollection']
Type:
_LiteralGenericAlias
Geometry types supported by the SQLiteGeomRefDB class.
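Because the alias is a typing.Literal, its allowed values can be read back at runtime with typing.get_args. The sketch below redefines the alias locally, so that it runs without geomcompare installed, and uses it to validate a geometry type name. The helper check_geom_type is hypothetical and not part of the library.

```python
from typing import Literal, get_args

# Local copy of the documented alias (the real one lives in geomcompare.geomrefdb).
SpatialiteGeomType = Literal[
    "Point", "LineString", "Polygon", "MultiPoint",
    "MultiLineString", "MultiPolygon", "GeometryCollection",
]


def check_geom_type(name: str) -> str:
    """Return name if it is a supported geometry type, else raise ValueError."""
    supported = get_args(SpatialiteGeomType)  # tuple of the literal strings
    if name not in supported:
        raise ValueError(f"{name!r} is not one of {supported}")
    return name
```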
- class geomcompare.geomrefdb.SQLiteGeomRefDB(filename=None, default_epsg=None, geoms_iter=None, geoms_tab_name=None, geom_type=None, geoms_epsg=None, in_ram=True, logger=None, logger_name=None, logging_level=20)[source]
Bases:
geomcompare._geomrefdb_abc.GeomRefDB
Concrete implementation of the GeomRefDB ABC using SQLite.
SQLiteGeomRefDB is a concrete implementation of the interface defined by the GeomRefDB abstract base class. It can load an existing SQLite database (or create a new one), in which geometry datasets are stored and compared, using geometry similarity functions, with geometrical features from an external dataset. Instances of this class can handle multiple reference datasets simultaneously, with various geometry types (see supported_geom_types) and spatial reference systems.
- Parameters
filename (str, optional) – Path to an existing spatialite database.
default_epsg (int, optional) – Default EPSG code of the geometrical features that will be added to the database. If specified, this EPSG code is the default value of the geoms_epsg parameter for any subsequent call of the add_geometries() method.
geoms_iter (iterable of GeomObject, optional) – Iterable of the geometrical features to add to this SQLiteGeomRefDB instance. Such features can also be added later to the class instance with the add_geometries() method.
geoms_tab_name (str, optional) – Name of the table where the geometrical features are to be stored. If the geoms_iter parameter is not given, geoms_tab_name is ignored.
geom_type (SpatialiteGeomType, optional) – Geometry type of the geometrical features passed as argument to the geoms_iter parameter.
geoms_epsg (int, optional) – EPSG code of the geometrical features passed as argument to the geoms_iter parameter. If specified, it overrides the default_epsg parameter during the instance construction.
in_ram (bool, default: True) – Set to True to create/load the database in RAM for faster access. Set to False for larger-than-RAM databases.
logger (logging.Logger, optional) – Logger instance to use for logging outputs.
logger_name (str, optional) – Name of the logging.Logger object to create for logging outputs. This parameter is ignored if a Logger instance is passed to the logger parameter.
logging_level (int, default: logging.INFO) – Logging level of the logging output. For more information, please see the documentation of the logging module.
- Raises
ValueError – If in_ram=False and filename=None.
Notes
This class makes use of the spatialite extension of SQLite, and as such, spatialite must be installed and available in order to work with instances of this class.
- class property supported_geom_types
Types supported by SQLiteGeomRefDB.
- Type
list of supported geometry types
- property filename
Path of the opened database file. The attribute is set to None if a new database was created in RAM for this instance.
- property in_ram
True if the database is created/loaded in RAM. False if the instance is connected to a database file on disk.
- property default_epsg
Default EPSG code of the geometrical features that are added to the database.
- property logger
Preconfigured Logger instance used for logging outputs.
- save_db(filename, overwrite=True)[source]
Save the internal SQLite database to disk.
The function saves the internal SQLite database, together with all the geometrical features added with add_geometries(), to disk. The path of the resulting output file can later be passed to the filename argument of the SQLiteGeomRefDB constructor to load the saved database with all its features. This function is only useful for saving databases loaded in RAM: geometrical features added to a SQLiteGeomRefDB instance that is connected to a database residing on disk are saved automatically and persist after the instance is destroyed.
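For readers unfamiliar with persisting an in-RAM SQLite database, the general technique can be sketched with the standard library alone, using sqlite3.Connection.backup. This is not necessarily how save_db is implemented. The helper save_in_memory_db, the table layout and the behaviour chosen for overwrite=False (raising FileExistsError) are all assumptions made for the illustration.

```python
import os
import sqlite3
import tempfile


def save_in_memory_db(src: sqlite3.Connection, filename: str, overwrite: bool = True) -> None:
    """Copy every table of src into a database file (hypothetical helper)."""
    if os.path.exists(filename):
        if not overwrite:
            # Assumed behaviour; the documented save_db may react differently.
            raise FileExistsError(filename)
        os.remove(filename)
    dst = sqlite3.connect(filename)
    try:
        src.backup(dst)  # page-by-page copy of the whole database
    finally:
        dst.close()


# Build a small database in RAM.
ram_db = sqlite3.connect(":memory:")
ram_db.execute("CREATE TABLE default_table (id INTEGER PRIMARY KEY, wkt TEXT)")
ram_db.execute("INSERT INTO default_table (wkt) VALUES ('POINT (1 2)')")
ram_db.commit()

# Persist it, then reopen the file to check that the rows survived.
out_path = os.path.join(tempfile.mkdtemp(), "reference.sqlite")
save_in_memory_db(ram_db, out_path)

disk_db = sqlite3.connect(out_path)
rows = disk_db.execute("SELECT wkt FROM default_table").fetchall()
disk_db.close()
ram_db.close()
```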
- add_geometries(geoms_iter, geom_type=None, geoms_epsg=None, geoms_tab_name=None)[source]
Add geometrical features to the internal SQLite database.
The function adds geometrical features to the internal SQLite database, which can then be used as a “reference dataset” when running other public methods of the SQLiteGeomRefDB instance.
- Parameters
geoms_iter (iterable of GeomObject) – Iterable of the geometrical features to add to this SQLiteGeomRefDB instance.
geom_type (SpatialiteGeomType, optional) – Geometry type of the input geometrical features. If geom_type is not specified, the function assumes that the input features have the same geometry type as the features already stored in the destination table.
geoms_epsg (int, optional) – EPSG code of the input geometrical features. If geoms_epsg is not specified, the function assumes that the input features are in the same spatial reference system as the features already stored in the destination table. If the input features are to be stored in a new table of the database and geoms_epsg is omitted, the SQLiteGeomRefDB instance uses the EPSG code stored in the default_epsg attribute (if set).
geoms_tab_name (str, optional) – Name of the table where the input geometrical features are to be stored in the internal SQLite database. If no argument is passed to the geoms_tab_name parameter, the function tries to store the input geometrical features in a table named default_table, which is created if it does not already exist in the database.
- Raises
ValueError – If geom_type is not specified, in the case of a new database/table.
ValueError – If geoms_epsg is not specified, in the case of a new database/table and if the default_epsg attribute is not set.
ValueError – If the argument passed to the geom_type parameter does not match the geometry type of the features already stored in the destination table.
Warning
The geometry type must be the same for all input features, as they are stored in the same geometry column of the same table, and spatialite does not allow geometry columns to have mixed geometry types.
- Return type
None
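The fallback order described for geoms_epsg can be summarised in a few lines. The function below is a hypothetical illustration of that order as read from the parameter description (explicit argument, then the destination table's EPSG code, then default_epsg). It is not code taken from the library.

```python
from typing import Optional


def resolve_epsg(
    geoms_epsg: Optional[int],
    table_epsg: Optional[int],
    default_epsg: Optional[int],
) -> int:
    """Pick the EPSG code to use for a batch of input features.

    table_epsg is the EPSG code of an already existing destination table,
    or None when the table still has to be created.
    """
    if geoms_epsg is not None:
        return geoms_epsg      # explicit argument wins
    if table_epsg is not None:
        return table_epsg      # same SRS as the features already stored
    if default_epsg is not None:
        return default_epsg    # new table: fall back to the instance default
    raise ValueError("geoms_epsg is required for a new table when default_epsg is not set")
```

For example, resolve_epsg(None, None, 25833) returns 25833, while resolve_epsg(None, None, None) raises ValueError, in line with the second entry of the Raises list.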
- get_geometries(aoi_geom=None, aoi_epsg=None, geoms_tab_name=None, output_epsg=None)[source]
Get geometrical features from the internal SQLite database.
Generator function which yields geometrical features stored in the internal database. The user can specify the table, or define a limited area to yield the features from. In addition, the spatial reference system of the output geometries can also be specified.
- Parameters
aoi_geom (GeomObject, optional) – Area of interest, where the geometrical features lie.
aoi_epsg (int, optional) – EPSG code of the area of interest geometry/ies.
geoms_tab_name (str, optional) – Name of the table where the geometrical features are stored in the internal SQLite database. If no argument is passed to the geoms_tab_name parameter, the function tries to yield geometrical features from a table named default_table.
output_epsg (int, optional) – EPSG code of the yielded geometrical features. This parameter can be used to transform the yielded geometries to a different Spatial Reference System from the one used in the internal database.
- Yields
GeomObject – Geometrical features from the internal SQLite database.
- Raises
ValueError – If geoms_tab_name is not specified and no table named default_table exists in the database.
- db_geom_info(to_stdout=False, count_features=False)[source]
Get information on features stored in the internal SQLite database.
Get information on the geometrical features such as the name of the table(s) where they are stored, their geometry type(s), spatial reference system(s) and the number of features per table. This information can be returned as a dict instance, or printed to stdout.
- Parameters
to_stdout (bool, default: False) – If set to False, the information is returned as a dict. If set to True, the information is written to stdout.
count_features (bool, default: False) – If set to True, the function also returns the number of features/rows per table. If set to False, the features are not counted.
- Returns
If to_stdout=False, returns a dict whose keys are the table name(s) and whose values are information on the individual table(s). This information is itself structured as a dict, whose key/value pairs indicate for each table the geometry type (key: geom_type), the spatial reference system (key: srid), and optionally (if count_features=True) the feature count (key: count). The function returns None if to_stdout=True.
- Return type
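The nested dict described under Returns could look like the hand-written value below. The table names, SRIDs and counts are invented for the illustration, and the value types (str for geom_type, int for srid and count) are assumptions. The two helpers are hypothetical and only show how such a structure can be queried.

```python
# Hypothetical return value of db_geom_info(count_features=True).
info = {
    "default_table": {"geom_type": "Polygon", "srid": 25833, "count": 1200},
    "roads": {"geom_type": "LineString", "srid": 4326, "count": 87},
}


def tables_with_srid(info, srid):
    """Names of the tables stored in the given spatial reference system."""
    return sorted(name for name, meta in info.items() if meta["srid"] == srid)


def total_features(info):
    """Sum of the per-table counts; the count key is absent if features were not counted."""
    return sum(meta.get("count", 0) for meta in info.values())
```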
- true_positives(geoms_iter, aoi_geom=None, geoms_epsg=None, geoms_tab_name=None, geoms_match=None, get_search_frame=None, ncores=None)[source]
Identify matching input geometries.
The function takes geometrical features as input, and searches one table of the internal database for reference features whose geometries are considered to match those of the input features. All input features whose geometry matches the geometry of at least one reference feature are yielded back by the function.
- Parameters
geoms_iter (iterable of GeomObject) – Iterable of input geometrical features to compare to the features of the internal SQLite database.
aoi_geom (GeomObject, optional) – Area of interest, within which the database’s features must lie.
geoms_epsg (int, optional) – EPSG code of the input geometrical features (including aoi_geom if specified). If geoms_epsg is not specified, the function assumes that the input features are in the same spatial reference system as the reference features.
geoms_tab_name (str, optional) – Name of the table where the database’s features that will be used as reference are stored. If no argument is passed to the geoms_tab_name parameter, the function searches for reference features in a table named default_table.
geoms_match (callable, optional) – Comparison function that takes two positional arguments:
gtest: input geometry (GeomObject)
gref: reference geometry (GeomObject)
The function returns True if it finds that both geometries match, else returns False. If this parameter is omitted, an input geometrical feature is always considered a match when its search frame (see the get_search_frame parameter) intersects with one of the features from the database’s table.
get_search_frame (callable, optional) – Function that takes an input geometry (GeomObject) as its single argument and returns its search frame (GeomObject). If this parameter is omitted, the search frame is the same as the input geometry.
ncores (int, optional) – Number of cores to use for running the function. If unspecified, the function runs in a single process.
- Yields
GeomObject – Matching input geometrical features.
Notes
If the spatial reference system of the input geometrical features is different from that of the database’s features, the input features’ coordinates are reprojected on-the-fly, before being compared to features stored in the database. If an input feature is considered to be a match, it is yielded back unchanged (its coordinates in the original spatial reference system).
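How geoms_match and get_search_frame interact can be illustrated with a toy model in pure Python. It uses axis-aligned boxes (xmin, ymin, xmax, ymax) in place of shapely geometries and a plain list in place of the database table. The sketch reflects one reading of the parameter descriptions (the search frame pre-selects candidate reference features, then geoms_match decides). It is not the library's implementation, and every function name in it is hypothetical.

```python
def intersects(a, b):
    """True if two boxes (xmin, ymin, xmax, ymax) overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def toy_true_positives(geoms_iter, reference, geoms_match=None, get_search_frame=None):
    """Yield the input boxes that match at least one reference box."""
    for gtest in geoms_iter:
        # Without get_search_frame, the search frame is the input geometry itself.
        frame = get_search_frame(gtest) if get_search_frame else gtest
        candidates = [gref for gref in reference if intersects(frame, gref)]
        if geoms_match is None:
            matched = bool(candidates)  # any intersecting candidate counts as a match
        else:
            matched = any(geoms_match(gtest, gref) for gref in candidates)
        if matched:
            yield gtest  # yielded back unchanged


def buffer_box(box, margin=1.0):
    """Example search frame: the box grown by margin on every side."""
    return (box[0] - margin, box[1] - margin, box[2] + margin, box[3] + margin)


def similar_area(gtest, gref, tolerance=0.8):
    """Example comparison function: areas within 20 % of each other."""
    def area(b):
        return (b[2] - b[0]) * (b[3] - b[1])
    small, large = sorted((area(gtest), area(gref)))
    return large > 0 and small / large >= tolerance


reference = [(0, 0, 10, 10)]
inputs = [(11, 0, 12, 1), (0, 0, 9, 10), (50, 50, 51, 51)]

plain = list(toy_true_positives(inputs, reference))
framed = list(toy_true_positives(inputs, reference, get_search_frame=buffer_box))
strict = list(toy_true_positives(inputs, reference, similar_area, buffer_box))
```

With the search frame alone, the small box next to the reference box becomes a match. Adding the comparison function rejects it again because its area differs too much from that of the reference box.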
- false_positives(geoms_iter, aoi_geom=None, geoms_epsg=None, geoms_tab_name=None, geoms_match=None, get_search_frame=None, ncores=None)[source]
Identify non-matching input geometries.
The function takes geometrical features as input, and searches one table of the internal database for reference features whose geometries are considered to match those of the input features. All input features whose geometry DOES NOT match the geometry of any reference feature are yielded back by the function.
- Parameters
geoms_iter (iterable of GeomObject) – Iterable of input geometrical features to compare to the features of the internal SQLite database.
aoi_geom (GeomObject, optional) – Area of interest, within which the database’s features must lie.
geoms_epsg (int, optional) – EPSG code of the input geometrical features (including aoi_geom if specified). If geoms_epsg is not specified, the function assumes that the input features are in the same spatial reference system as the reference features.
geoms_tab_name (str, optional) – Name of the table where the database’s features that will be used as reference are stored. If no argument is passed to the geoms_tab_name parameter, the function searches for reference features in a table named default_table.
geoms_match (callable, optional) – Comparison function that takes two positional arguments:
gtest: input geometry (GeomObject)
gref: reference geometry (GeomObject)
The function returns True if it finds that both geometries match, else returns False. If this parameter is omitted, an input geometrical feature is always considered a match when its search frame (see the get_search_frame parameter) intersects with one of the features from the database’s table.
get_search_frame (callable, optional) – Function that takes an input geometry (GeomObject) as its single argument and returns its search frame (GeomObject). If this parameter is omitted, the search frame is the same as the input geometry.
ncores (int, optional) – Number of cores to use for running the function. If unspecified, the function runs in a single process.
- Yields
GeomObject – Non-matching input geometrical features.
Notes
If the spatial reference system of the input geometrical features is different from that of the database’s features, the input features’ coordinates are reprojected on-the-fly, before being compared to features stored in the database. If an input feature is NOT considered to be a match, it is yielded back unchanged (its coordinates in the original spatial reference system).
- missing_geometries(geoms_iter, geom_type=None, aoi_geom=None, geoms_epsg=None, geoms_tab_name=None, geoms_match=None, get_search_frame=None, ncores=None)[source]
Identify (missing) non-matching reference geometries.
The function takes geometrical features as input, and searches one table of the internal database for reference features whose geometries are NOT considered to match the geometry of any feature from the input set. All reference features whose geometry DOES NOT match the geometry of any input feature are yielded by the function.
- Parameters
geoms_iter (iterable of GeomObject) – Iterable of input geometrical features to compare to the features of the internal SQLite database.
geom_type (SpatialiteGeomType, optional) – Geometry type of the input geometrical features. If geom_type is not specified, the function assumes that the input features have the same geom_type as the reference features.
aoi_geom (GeomObject, optional) – Area of interest, within which the database’s features must lie.
geoms_epsg (int, optional) – EPSG code of the input geometrical features (including aoi_geom if specified). If geoms_epsg is not specified, the function assumes that the input features are in the same spatial reference system as the reference features.
geoms_tab_name (str, optional) – Name of the table where the database’s features that will be used as reference are stored. If no argument is passed to the geoms_tab_name parameter, the function searches for reference features in a table named default_table.
geoms_match (callable, optional) – Comparison function that takes two positional arguments:
gtest: input geometry (GeomObject)
gref: reference geometry (GeomObject)
The function returns True if it finds that both geometries match, else returns False. If this parameter is omitted, an input geometrical feature is always considered a match when its search frame (see the get_search_frame parameter) intersects with one of the features from the database’s table.
get_search_frame (callable, optional) – Function that takes an input geometry (GeomObject) as its single argument and returns its search frame (GeomObject). If this parameter is omitted, the search frame is the same as the input geometry.
ncores (int, optional) – Number of cores to use for running the function. If unspecified, the function runs in a single process.
- Yields
GeomObject – Non-matching reference geometrical features.
Notes
If the spatial reference system of the input geometrical features is different from that of the database’s features, the input features’ coordinates are reprojected on-the-fly, before being compared to features stored in the database.
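The three comparison methods are typically run on the same input dataset. The sketch below shows one way to collect their counts. It relies only on the documented method names true_positives, false_positives and missing_geometries. The helper count_matches and the stub class are hypothetical, and the stub stands in for a real SQLiteGeomRefDB instance so that the example runs without spatialite.

```python
def count_matches(ref_db, geoms, **kwargs):
    """Count the results of the three comparison methods for one input dataset."""
    # Materialise the input first: a generator would be exhausted by the first call.
    geoms = list(geoms)
    return {
        "true_positives": sum(1 for _ in ref_db.true_positives(geoms, **kwargs)),
        "false_positives": sum(1 for _ in ref_db.false_positives(geoms, **kwargs)),
        "missing": sum(1 for _ in ref_db.missing_geometries(geoms, **kwargs)),
    }


class StubRefDB:
    """Stand-in exposing the three documented method names; matches by equality."""

    def __init__(self, reference):
        self._reference = list(reference)

    def true_positives(self, geoms_iter, **kwargs):
        return (g for g in geoms_iter if g in self._reference)

    def false_positives(self, geoms_iter, **kwargs):
        return (g for g in geoms_iter if g not in self._reference)

    def missing_geometries(self, geoms_iter, **kwargs):
        inputs = list(geoms_iter)
        return (r for r in self._reference if r not in inputs)


counts = count_matches(StubRefDB(["a", "b", "c"]), iter(["b", "c", "d"]))
```

With a real instance, keyword arguments such as geoms_epsg or geoms_match would be forwarded unchanged to each method through kwargs.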
geomcompare.geomutils module
- geomcompare.geomutils.GeomObject = typing.Union[shapely.geometry.polygon.LinearRing, shapely.geometry.linestring.LineString, shapely.geometry.multilinestring.MultiLineString, shapely.geometry.multipoint.MultiPoint, shapely.geometry.multipolygon.MultiPolygon, shapely.geometry.point.Point, shapely.geometry.polygon.Polygon]
Type:
_UnionGenericAlias
Type for shapely geometrical objects.
- geomcompare.geomutils.to_2D(geom)[source]
Remove the third dimension of a geometrical object’s coordinates.
- Parameters
geom (GeomObject) – Shapely geometrical object with XYZ-coordinates.
- Returns
Geometrical object with its Z-coordinates removed.
- Return type
- geomcompare.geomutils.get_transform_func(epsg_in, epsg_out)[source]
Get function to transform a geometrical object to another SRS.
Create and return a function that transforms the XY-coordinates of GeomObject instances from one spatial reference system to another. The function identifies input and output spatial reference systems by their EPSG codes.
- Parameters
- Returns
Function that takes one GeomObject as positional argument and returns the GeomObject with its XY-coordinates transformed to the output spatial reference system.
- Return type
geomcompare.io module
- namedtuple geomcompare.io.ConnectionParameters(host, dbname, user, password, port=5432)[source]
Bases:
NamedTuple
Parameters to open a connection to a PostGIS database.
Instances of this class are intended to be passed as arguments to the fetch_geoms_from_pg function.
- namedtuple geomcompare.io.SchemaTableColumn(schema, table, column)[source]
Bases:
NamedTuple
Location of a geometry column in a PostGIS database.
Instances of this class are intended to be passed as arguments to the fetch_geoms_from_pg function.
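Both classes are plain named tuples, so they can be built positionally or by keyword and copied with _replace. The sketch below re-declares them locally from the documented signatures, so that it runs without geomcompare installed. The field types are assumptions, and the host, database, user and table names are placeholders.

```python
from typing import NamedTuple


class ConnectionParameters(NamedTuple):
    """Local mirror of the documented signature (field types assumed)."""
    host: str
    dbname: str
    user: str
    password: str
    port: int = 5432


class SchemaTableColumn(NamedTuple):
    """Local mirror of the documented signature (field types assumed)."""
    schema: str
    table: str
    column: str


# port falls back to its documented default of 5432.
params = ConnectionParameters(host="localhost", dbname="gis", user="reader", password="secret")
location = SchemaTableColumn("public", "buildings", "geom")

# Named tuples are immutable; _replace returns a modified copy.
other_port = params._replace(port=5433)
```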
- geomcompare.io.fetch_geoms_from_pg(conn=None, conn_params=None, sql_query=None, geoms_col_loc=None, aoi=None, aoi_epsg=None, output_epsg=None)[source]
Fetch geometrical features from a PostGIS database.
Generator function which opens a connection, or uses an existing one, to a PostGIS database, and yields geometrical features from a specified geometry column (optionally within a given area), or based on a user-defined SQL query. If the connection to the database is opened by the function, it is closed automatically after the last geometrical feature is yielded.
- Parameters
conn (psycopg2.extensions.connection, optional) – Pre-opened connection to the PostGIS database.
conn_params (ConnectionParameters, optional) – Parameters to open a connection to the PostGIS database.
sql_query (str, optional) – SQL query to use to extract geometrical features from the PostGIS database.
geoms_col_loc (SchemaTableColumn, optional) – Geometry column location within the PostGIS database.
aoi (GeomObject, optional) – Area of interest, where the geometrical features lie.
aoi_epsg (int, optional) – EPSG code of the area of interest geometry/ies.
output_epsg (int, optional) – EPSG code of the yielded geometrical features. This parameter can be used to transform the yielded geometries to a different Spatial Reference System from the one used in the PostGIS database.
- Yields
GeomObject – Geometrical features from the PostGIS database.
- Raises
ValueError – If both the conn and conn_params parameters are None.
ValueError – If both the sql_query and geoms_col_loc parameters are None.
Notes
If the sql_query parameter is given, the parameters geoms_col_loc, aoi, aoi_epsg and output_epsg are ignored, as SQL queries can include filtering and reprojection.
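The argument rules stated under Raises and Notes can be restated as a small pre-flight check. The function below is a hypothetical helper written from those rules only. It does not connect to any database and is not part of geomcompare.

```python
def check_fetch_args(conn=None, conn_params=None, sql_query=None,
                     geoms_col_loc=None, aoi=None, aoi_epsg=None, output_epsg=None):
    """Validate an argument combination and return the names that would be ignored."""
    if conn is None and conn_params is None:
        raise ValueError("either conn or conn_params must be given")
    if sql_query is None and geoms_col_loc is None:
        raise ValueError("either sql_query or geoms_col_loc must be given")
    if sql_query is None:
        return []
    # A user-defined query takes over filtering and reprojection.
    others = (("geoms_col_loc", geoms_col_loc), ("aoi", aoi),
              ("aoi_epsg", aoi_epsg), ("output_epsg", output_epsg))
    return [name for name, value in others if value is not None]
```

Calling check_fetch_args(conn_params=object(), sql_query="SELECT geom FROM t", output_epsg=4326) returns ["output_epsg"], a reminder that the reprojection has to be written into the query itself in that case.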
- geomcompare.io.LayerID = typing.Union[str, int]
Type:
_UnionGenericAlias
Type for identifying layers.
- namedtuple geomcompare.io.LayerFilter(layer_id=None, aoi=None, aoi_epsg=None, attr_filter=None, fids=None)[source]
Bases:
NamedTuple
Filter for extraction of geometrical features from file.
Instances of this class are intended to be passed as arguments to the extract_geoms_from_file function, for filtering and choosing the geometrical features to extract.
- Fields
layer_id (Union[str, int, None]) – Name or index of the layer the filter will be applied to. If set to None, the filter is applied to all layers.
aoi (Union[LinearRing, LineString, MultiLineString, MultiPoint, MultiPolygon, Point, Polygon, None]) – GeomObject, optional: Area of interest, where the geometrical features lie. All features lying outside the area of interest are filtered out (not extracted).
aoi_epsg (Optional[int]) – EPSG code of the area of interest geometry/ies. If set to None, the same Spatial Reference System as the layer is used.
attr_filter (Optional[str]) – Valid string representation of an attribute filter (e.g. "attr_name = 'value'").
fids (Optional[Sequence[int]]) – IDs of the features to extract from the layer. This parameter is ignored if either the aoi or the attr_filter parameter is specified.
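The precedence between fids and the other two filters is easy to overlook. The sketch below mirrors the documented fields in a local named tuple, with aoi typed as a generic object so that shapely is not required, and adds a hypothetical helper, effective_fids, that applies the documented rule.

```python
from typing import NamedTuple, Optional, Sequence, Union


class LayerFilter(NamedTuple):
    """Local mirror of the documented fields and defaults."""
    layer_id: Union[str, int, None] = None
    aoi: Optional[object] = None  # a shapely geometry in the real class
    aoi_epsg: Optional[int] = None
    attr_filter: Optional[str] = None
    fids: Optional[Sequence[int]] = None


def effective_fids(layer_filter: LayerFilter) -> Optional[Sequence[int]]:
    """Return the feature IDs that take effect, or None when they are ignored."""
    if layer_filter.aoi is not None or layer_filter.attr_filter is not None:
        return None
    return layer_filter.fids


by_id = LayerFilter(layer_id="buildings", fids=[1, 5, 9])
by_attr = LayerFilter(layer_id=0, attr_filter="height > 10", fids=[1, 5, 9])
```

Here effective_fids(by_id) returns [1, 5, 9], whereas effective_fids(by_attr) returns None because the attribute filter takes precedence.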
- geomcompare.io.extract_geoms_from_file(filename, driver_name, layers=None, layer_filters=None)[source]
Extract geometrical features from a GDAL/OGR-readable file.
Generator function which opens a file located on disk, with one of the existing GDAL/OGR drivers, and yields geometrical features, from one or several layers. The function also permits the use of filters to allow for fine-grained extraction of the geometrical features.
- Parameters
filename (str) – Path to the file to extract the geometrical features from.
driver_name (str) – Name of the GDAL/OGR driver to use for opening the file.
layers (sequence of LayerID, optional) – Layers from which the geometrical features will be extracted. If set to None (default), geometrical features are extracted from all layers.
layer_filters (sequence of LayerFilter, optional) – Filters to apply to the layer(s) when extracting the geometrical features.
- Yields
GeomObject – Geometrical features from the file.
- Raises
NotImplementedError – If GDAL/OGR is not installed or not importable.
- geomcompare.io.write_geoms_to_file(filename, driver_name, geoms_iter, geoms_epsg=None, layer=None, mode='update')[source]
Write multiple geometrical features to disk.
The function takes as input an iterable of geometrical features and writes them to disk using one of the existing GDAL/OGR drivers.
- Parameters
filename (str) – Path to the output file where the geometrical features will be written.
driver_name (str) – Name of the GDAL/OGR driver to use for writing the file.
geoms_iter (iterable of GeomObject) – Iterable of the geometrical features to write.
geoms_epsg (int, optional) – EPSG code of the input geometrical features. If the Spatial Reference System of the input geometrical features is specified and differs from that of the layer they will be written to (in case of an update, see the mode parameter), the coordinates of the geometries are reprojected to the layer’s Spatial Reference System. Defaults to None (no Spatial Reference System).
layer (LayerID, optional) – Layer name/index on which to write the input geometries. In case of a file update (see the mode parameter), the index of an existing layer can be passed as argument. If layer is set to None (default), the geometrical features are written, in update mode, to the first layer available (at index 0), if any. If no layer is available, or in overwrite mode, setting the layer parameter to None results in the function writing the input geometries to a layer named default (if the driver supports named layers).
mode ({"update", "overwrite"}) – If set to "update", the function updates an existing file, or creates it if it does not exist. If set to "overwrite", the function deletes any file at the path given by the filename parameter, and creates a new file at this same location.
- Return type
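The interplay between the layer and mode parameters of write_geoms_to_file can be condensed into a small decision function. It is a hypothetical restatement of the parameter descriptions, not code from the library. The name resolve_target_layer and the existing_layers argument (the layers already present in the file) are assumptions.

```python
def resolve_target_layer(layer, mode, existing_layers):
    """Return the layer name or index the geometries would be written to."""
    if mode not in ("update", "overwrite"):
        raise ValueError(f"unknown mode: {mode!r}")
    if layer is not None:
        return layer  # explicit name or index wins
    if mode == "update" and existing_layers:
        return 0  # first available layer of the existing file
    # No layer available, or the file is recreated from scratch.
    return "default"
```

In overwrite mode the existing layers play no role, since the file is deleted before writing.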