geomcompare package

GeomCompare

The GeomCompare package provides multiple tools for comparing two independent sets of geometrical features.

Documentation for GeomCompare is available in the form of docstrings provided with the code, as well as on the project’s homepage https://geomcompare.readthedocs.io/.

Available submodules

geomrefdb

Defines the main classes of the library used for comparing geometry datasets.

io

Provides a set of tools for I/O operations, extracting geometrical features from disk or from a PostGIS database, as well as writing a dataset of geometries to disk.

comparefunc

Defines a few comparison functions to use with the geomrefdb’s main classes.

geomutils

Defines a few functions and types to work with shapely geometrical objects.

stats

Defines functions for computing classifier metrics (e.g. when comparing a result dataset from a machine learning model with a reference dataset).
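As an illustration of the kind of classifier metrics mentioned for stats, the sketch below derives precision, recall and F1 score from three counts: matching input features, non-matching input features, and reference features left without a match. The function name classifier_metrics and its signature are hypothetical; they are not taken from the stats module.

```python
def classifier_metrics(n_true_pos, n_false_pos, n_missing):
    """Return (precision, recall, f1) from raw counts.

    Hypothetical helper for illustration; not part of geomcompare.stats.
    """
    predicted = n_true_pos + n_false_pos  # all input features
    # Approximates the number of reference features: several input
    # features may match the same reference feature.
    expected = n_true_pos + n_missing
    precision = n_true_pos / predicted if predicted else 0.0
    recall = n_true_pos / expected if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


precision, recall, f1 = classifier_metrics(n_true_pos=8, n_false_pos=2, n_missing=2)
```

The zero checks keep the helper usable on empty datasets, where the ratios would otherwise be undefined.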

Submodules

geomcompare.comparefunc module

geomcompare.geomrefdb module

class geomcompare.geomrefdb.PostGISGeomRefDB(PG_params, PG_schema, PG_table, PG_geoms_column)

Bases: geomcompare._geomrefdb_abc.GeomRefDB

true_positives(geoms_iter, geoms_EPSG, same_geoms_func)

Return an iterable of the input geometries that match geometries of the GeomRefDB instance.

false_positives(geoms_iter, geoms_EPSG, same_geoms_func)

Return an iterable of the input geometries that do not match any geometry of the GeomRefDB instance.

missing_geometries(geoms_iter, AOI_geom, geoms_EPSG, same_geoms_func)

Return an iterable of the geometries of the GeomRefDB that do not match any of the input geometries.

class geomcompare.geomrefdb.RtreeGeomRefDB(geoms_iter, geoms_EPSG)

Bases: geomcompare._geomrefdb_abc.GeomRefDB

true_positives(geoms_iter, geoms_EPSG, same_geoms_func)

Return an iterable of the input geometries that match geometries of the GeomRefDB instance.

false_positives(geoms_iter, geoms_EPSG, same_geoms_func)

Return an iterable of the input geometries that do not match any geometry of the GeomRefDB instance.

missing_geometries(geoms_iter, AOI_geom, geoms_EPSG, same_geoms_func)

Return an iterable of the geometries of the GeomRefDB that do not match any of the input geometries.

geomcompare.geomrefdb.SpatialiteGeomType = typing.Literal['Point', 'LineString', 'Polygon', 'MultiPoint', 'MultiLineString', 'MultiPolygon', 'GeometryCollection']

Type:    _LiteralGenericAlias

Geometry types supported by the SQLiteGeomRefDB class.

class geomcompare.geomrefdb.SQLiteGeomRefDB(filename=None, default_epsg=None, geoms_iter=None, geoms_tab_name=None, geom_type=None, geoms_epsg=None, in_ram=True, logger=None, logger_name=None, logging_level=20)

Bases: geomcompare._geomrefdb_abc.GeomRefDB

Concrete implementation of the GeomRefDB ABC using SQLite.

SQLiteGeomRefDB is a concrete implementation of the interface defined by the GeomRefDB abstract base class. It can load an existing SQLite database or create a new one, in which geometry datasets can be stored and compared (using geometry similarity functions) with geometrical features from an external dataset. Instances of this class can handle multiple reference datasets simultaneously, with various geometry types (see supported_geom_types) and spatial reference systems.

Parameters
  • filename (str, optional) – Path to an existing spatialite database.

  • default_epsg (int, optional) – Default EPSG code of the geometrical features that will be added to the database. If specified, the EPSG code will be the default value of the geoms_epsg parameter for any subsequent call of the add_geometries() method.

  • geoms_iter (iterable of GeomObject, optional) – Iterable of the geometrical features to add to this SQLiteGeomRefDB instance. Such features can also be added later to the class instance with the add_geometries() method.

  • geoms_tab_name (str, optional) – Name of the table where the geometrical features are to be stored. If the geoms_iter parameter is not given, geoms_tab_name will be ignored.

  • geom_type (SpatialiteGeomType, optional) – Geometry type of the geometrical features passed as argument to the geoms_iter parameter.

  • geoms_epsg (int, optional) – EPSG code of the geometrical features passed as argument to the geoms_iter parameter. If specified, it overrides the default_epsg parameter during the instance construction.

  • in_ram (bool, default: True) – Set to True to create/load the database in RAM for faster access. Set to False for larger-than-RAM databases.

  • logger (logging.Logger, optional) – Logger instance to use for logging outputs.

  • logger_name (str, optional) – Name of the logging.Logger object to create for logging outputs. This parameter will be ignored if a Logger instance is passed to the logger parameter.

  • logging_level (int, default: logging.INFO) – Logging level of the logging output. For more information, please see the documentation of the logging module.

Raises

ValueError – If in_ram=False and filename=None.

Notes

This class makes use of the spatialite extension of SQLite, and as such, spatialite must be installed and available in order to work with instances of this class.
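The constructor parameters above can be combined as in the following sketch. The wrapper open_reference_db is a hypothetical helper, not part of the library; it repeats the documented in_ram/filename check so that a bad combination fails before the spatialite-dependent import is attempted. It assumes geomcompare and spatialite are installed when the function is actually called with valid arguments.

```python
def open_reference_db(filename=None, in_ram=True, default_epsg=None):
    """Open (or create) a SQLiteGeomRefDB, failing early on a bad combination.

    Hypothetical convenience wrapper written for this example.
    """
    if not in_ram and filename is None:
        # Same condition the constructor is documented to reject.
        raise ValueError("an on-disk database (in_ram=False) needs a filename")
    # Imported lazily: requires geomcompare and the spatialite extension.
    from geomcompare.geomrefdb import SQLiteGeomRefDB

    return SQLiteGeomRefDB(
        filename=filename, default_epsg=default_epsg, in_ram=in_ram
    )


# Example calls (the EPSG code and file path are illustrative values):
# db = open_reference_db(default_epsg=25833)               # new database in RAM
# db = open_reference_db("reference.sqlite", in_ram=False)  # work on disk
```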

class property supported_geom_types

Types supported by SQLiteGeomRefDB.

Type

list of supported geometry types

property filename

Path of the opened database file. The attribute is set to None if a new database was created in RAM for this instance.

property in_ram

True if the database is created/loaded in RAM. False if the instance is connected to a database file on disk.

property default_epsg

Default EPSG code of the geometrical features that are added to the database.

property logger

Preconfigured Logger instance used for logging outputs.

save_db(filename, overwrite=True)

Save the internal SQLite database to disk.

The function saves the internal SQLite database, together with all the geometrical features added with add_geometries(), to disk. The path of the resulting output file can later be passed to the filename argument of the SQLiteGeomRefDB class’ constructor to load the saved database with all its features. This function is only useful for saving databases loaded in RAM: geometrical features added to a SQLiteGeomRefDB instance that is connected to a database file on disk are saved automatically and persist after the instance is destroyed.

Parameters
  • filename (str) – Path of the output database file.

  • overwrite (bool, default: True) – True if the output file should overwrite any existing file at path filename, else False.

Return type

None

add_geometries(geoms_iter, geom_type=None, geoms_epsg=None, geoms_tab_name=None)

Add geometrical features to the internal SQLite database.

The function adds geometrical features to the internal SQLite database, which can then be used as a “reference dataset” when running other public methods of the SQLiteGeomRefDB instance.

Parameters
  • geoms_iter (iterable of GeomObject) – Iterable of the geometrical features to add to this SQLiteGeomRefDB instance.

  • geom_type (SpatialiteGeomType, optional) – Geometry type of the input geometrical features. If the geom_type is not specified by the user, the function will assume that the input features have the same geometry type as the features already stored in the destination table.

  • geoms_epsg (int, optional) – EPSG code of the input geometrical features. If the geoms_epsg is not specified by the user, the function will assume that the input features are in the same spatial reference system as the features already stored in the destination table. Also, if the input features are to be stored in a new table of the database and the geoms_epsg is omitted, the SQLiteGeomRefDB instance will use the EPSG code stored in the default_epsg attribute (if set).

  • geoms_tab_name (str, optional) – Name of the table where the input geometrical features are to be stored in the internal SQLite database. If no argument is passed to the geoms_tab_name parameter, the function will try to store the input geometrical features into a table named default_table. The default_table table will be created if it does not already exist in the database.

Raises
  • ValueError – If geom_type is not specified, in the case of a new database/table.

  • ValueError – If geoms_epsg is not specified, in the case of a new database/table and if the default_epsg attribute is not set.

  • ValueError – If the argument passed to the geom_type parameter does not match the geometry type of the features already stored in the destination table.

Warning

The geometry type must be the same for all input features as they are to be stored in the same geometry column of the same table, and spatialite does not allow geometry columns to have mixed geometry types.

Return type

None
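Because a table can only hold one geometry type, a mixed collection has to be split before it is added. The sketch below groups features by their shapely geom_type attribute and stores each group in its own table. The helper add_by_geom_type, the table_prefix parameter and the resulting table names are choices made for this example only; db is assumed to be a SQLiteGeomRefDB instance.

```python
from collections import defaultdict


def add_by_geom_type(db, geoms, geoms_epsg, table_prefix="ref_"):
    """Store mixed-type features in one table per geometry type.

    Hypothetical helper; `db` only needs an add_geometries() method with
    the signature documented above.
    """
    groups = defaultdict(list)
    for geom in geoms:
        # shapely geometries report 'Point', 'Polygon', 'MultiPolygon', ...
        # Note that 'LinearRing' is not listed in SpatialiteGeomType.
        groups[geom.geom_type].append(geom)
    for geom_type, group in groups.items():
        db.add_geometries(
            group,
            geom_type=geom_type,
            geoms_epsg=geoms_epsg,
            geoms_tab_name=f"{table_prefix}{geom_type.lower()}",
        )
    return sorted(groups)
```

The returned list of geometry types makes it easy to see which tables were written.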

get_geometries(aoi_geom=None, aoi_epsg=None, geoms_tab_name=None, output_epsg=None)

Get geometrical features from the internal SQLite database.

Generator function which yields geometrical features stored in the internal database. The user can specify the table, or define a limited area to yield the features from. In addition, the spatial reference system of the output geometries can also be specified.

Parameters
  • aoi_geom (GeomObject, optional) – Area of interest, where the geometrical features lie.

  • aoi_epsg (int, optional) – EPSG code of the area of interest geometry/ies.

  • geoms_tab_name (str, optional) – Name of the table where the geometrical features are stored in the internal SQLite database. If no argument is passed to the geoms_tab_name parameter, the function will try to yield geometrical features from a table named default_table.

  • output_epsg (int, optional) – EPSG code of the yielded geometrical features. This parameter can be used to transform the yielded geometries to a different Spatial Reference System from the one used in the internal database.

Yields

GeomObject – Geometrical features from the internal SQLite database.

Raises

ValueError – If geoms_tab_name is not specified and no table named default_table exists in the database.

db_geom_info(to_stdout=False, count_features=False)

Get information on features stored in the internal SQLite database.

Get information on the geometrical features such as the name of the table(s) where they are stored, their geometry type(s), spatial reference system(s) and the number of features per table. This information can be returned as a dict instance, or printed to stdout.

Parameters
  • to_stdout (bool, default: False) – If set to False, the information is returned as a dict. If set to True, the information is written to stdout.

  • count_features (bool, default: False) – If set to True, the function will also return the number of features/rows per table. If set to False, the features will not be counted.

Returns

If to_stdout=False, returns a dict whose keys are the table name(s), and whose values are information on the individual table(s). This information is itself structured as a dict, whose key/value pairs indicate for each table the geometry type (key: geom_type), the spatial reference system (key: srid), and optionally (if count_features=True) the features count (key: count). The function returns None if to_stdout=True.

Return type

dict or None
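The nested dict described above can be consumed as in this sketch. The helper describe_tables is hypothetical, and the sample values (table names, SRIDs, counts, and the exact form of the geom_type and srid values) are illustrative assumptions shaped after the documented keys.

```python
def describe_tables(info):
    """Format the dict returned by db_geom_info(to_stdout=False) as text lines."""
    lines = []
    for table, props in sorted(info.items()):
        line = f"{table}: {props['geom_type']} (SRID {props['srid']})"
        if "count" in props:  # only expected when count_features=True
            line += f", {props['count']} features"
        lines.append(line)
    return lines


# Illustrative values only, shaped like the documented return value.
example_info = {
    "default_table": {"geom_type": "Polygon", "srid": 25833, "count": 120},
    "roads": {"geom_type": "LineString", "srid": 4326},
}
lines = describe_tables(example_info)
```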

true_positives(geoms_iter, aoi_geom=None, geoms_epsg=None, geoms_tab_name=None, geoms_match=None, get_search_frame=None, ncores=None)

Identify matching input geometries.

The function takes as input geometrical features, and searches for reference features in one table of the internal database whose geometries are considered to match that of the input features. All input features that have a geometry that matches the geometry of at least one of the reference features will be yielded back by the function.

Parameters
  • geoms_iter (iterable of GeomObject) – Iterable of input geometrical features to compare to the features of the internal SQLite database.

  • aoi_geom (GeomObject, optional) – Area of interest, within which the database’s features must lie.

  • geoms_epsg (int, optional) – EPSG code of the input geometrical features (including aoi_geom if specified). If the geoms_epsg is not specified by the user, the function will assume that the input features are in the same spatial reference system as the reference features.

  • geoms_tab_name (str, optional) – Name of the table where database’s features that will be used as reference are stored. If no argument is passed to the geoms_tab_name parameter, the function will search for reference features in a table named default_table.

  • geoms_match (callable, optional) – Comparison function that takes two positional arguments, the two geometries to compare, and returns True if it finds that both geometries match, else False. If this parameter is omitted, the input geometrical feature will always be considered as a match in the case where its search frame (see get_search_frame parameter) intersects with one of the features from the database’s table.

  • get_search_frame (callable, optional) – Function that takes as single argument an input geometry (GeomObject) and returns its search frame (GeomObject). If this parameter is omitted, the search frame will be the same as the input geometry.

  • ncores (int, optional) – Number of cores to use for running the function. If unspecified, the function will run in a single process.

Yields

GeomObject – Matching input geometrical features.

Notes

If the spatial reference system of the input geometrical features is different from that of the database’s features, the input features’ coordinates are reprojected on-the-fly, before being compared to features stored in the database. If an input feature is considered to be a match, it is yielded back unchanged (its coordinates in the original spatial reference system).
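The geoms_match and get_search_frame callables can be built as in the sketch below. It uses an intersection-over-union threshold as the matching rule and a buffer as the search frame; both are example choices, not defaults of the library. The helpers iou_match, buffered_frame and matching_inputs are hypothetical names. The callables rely only on the intersection(), union(), buffer() methods and the area attribute that shapely geometries provide, and because the IoU test is symmetric it does not depend on the order in which the two geometries are passed.

```python
def iou_match(min_iou=0.5):
    """Build a geoms_match callable based on intersection over union."""

    def geoms_match(geom_a, geom_b):
        union_area = geom_a.union(geom_b).area
        if union_area == 0:
            return False  # points or lines: an area-based test cannot apply
        return geom_a.intersection(geom_b).area / union_area >= min_iou

    return geoms_match


def buffered_frame(distance):
    """Build a get_search_frame callable that pads the input geometry."""

    def get_search_frame(geom):
        return geom.buffer(distance)  # distance is in the units of the SRS

    return get_search_frame


def matching_inputs(db, geoms, geoms_epsg=None):
    """Collect the input features that db.true_positives() yields."""
    return list(
        db.true_positives(
            geoms,
            geoms_epsg=geoms_epsg,
            geoms_match=iou_match(0.5),
            get_search_frame=buffered_frame(2.0),
        )
    )
```

The threshold of 0.5 and the buffer distance of 2.0 are placeholders to tune for the dataset at hand.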

false_positives(geoms_iter, aoi_geom=None, geoms_epsg=None, geoms_tab_name=None, geoms_match=None, get_search_frame=None, ncores=None)

Identify non-matching input geometries.

The function takes as input geometrical features, and searches for reference features in one table of the internal database whose geometries are considered to match that of the input features. All input features that DO NOT have a geometry that matches the geometry of any reference features will be yielded back by the function.

Parameters
  • geoms_iter (iterable of GeomObject) – Iterable of input geometrical features to compare to the features of the internal SQLite database.

  • aoi_geom (GeomObject, optional) – Area of interest, within which the database’s features must lie.

  • geoms_epsg (int, optional) – EPSG code of the input geometrical features (including aoi_geom if specified). If the geoms_epsg is not specified by the user, the function will assume that the input features are in the same spatial reference system as the reference features.

  • geoms_tab_name (str, optional) – Name of the table where database’s features that will be used as reference are stored. If no argument is passed to the geoms_tab_name parameter, the function will search for reference features in a table named default_table.

  • geoms_match (callable, optional) – Comparison function that takes two positional arguments, the two geometries to compare, and returns True if it finds that both geometries match, else False. If this parameter is omitted, the input geometrical feature will always be considered as a match in the case where its search frame (see get_search_frame parameter) intersects with one of the features from the database’s table.

  • get_search_frame (callable, optional) – Function that takes as single argument an input geometry (GeomObject) and returns its search frame (GeomObject). If this parameter is omitted, the search frame will be the same as the input geometry.

  • ncores (int, optional) – Number of cores to use for running the function. If unspecified, the function will run in a single process.

Yields

GeomObject – Non-matching input geometrical features.

Notes

If the spatial reference system of the input geometrical features is different from that of the database’s features, the input features’ coordinates are reprojected on-the-fly, before being compared to features stored in the database. If an input feature is NOT considered to be a match, it is yielded back unchanged (its coordinates in the original spatial reference system).

missing_geometries(geoms_iter, geom_type=None, aoi_geom=None, geoms_epsg=None, geoms_tab_name=None, geoms_match=None, get_search_frame=None, ncores=None)

Identify (missing) non-matching reference geometries.

The function takes as input geometrical features, and searches for reference features in one table of the internal database whose geometries are NOT considered to match the geometry of any feature from the input set. All reference features that DO NOT have a geometry that matches the geometry of any input features will be yielded by the function.

Parameters
  • geoms_iter (iterable of GeomObject) – Iterable of input geometrical features to compare to the features of the internal SQLite database.

  • geom_type (SpatialiteGeomType, optional) – Geometry type of the input geometrical features. If the geom_type is not specified by the user, the function will assume that the input features have the same geom_type as the reference features.

  • aoi_geom (GeomObject, optional) – Area of interest, within which the database’s features must lie.

  • geoms_epsg (int, optional) – EPSG code of the input geometrical features (including aoi_geom if specified). If the geoms_epsg is not specified by the user, the function will assume that the input features are in the same spatial reference system as the reference features.

  • geoms_tab_name (str, optional) – Name of the table where database’s features that will be used as reference are stored. If no argument is passed to the geoms_tab_name parameter, the function will search for reference features in a table named default_table.

  • geoms_match (callable, optional) – Comparison function that takes two positional arguments, the two geometries to compare, and returns True if it finds that both geometries match, else False. If this parameter is omitted, the input geometrical feature will always be considered as a match in the case where its search frame (see get_search_frame parameter) intersects with one of the features from the database’s table.

  • get_search_frame (callable, optional) – Function that takes as single argument an input geometry (GeomObject) and returns its search frame (GeomObject). If this parameter is omitted, the search frame will be the same as the input geometry.

  • ncores (int, optional) – Number of cores to use for running the function. If unspecified, the function will run in a single process.

Yields

GeomObject – Non-matching reference geometrical features.

Notes

If the spatial reference system of the input geometrical features is different from that of the database’s features, the input features’ coordinates are reprojected on-the-fly, before being compared to features stored in the database.
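Taken together, true_positives(), false_positives() and missing_geometries() give the three counts needed to evaluate an input dataset against a reference table. The sketch below gathers them in one pass each. The helper confusion_counts is hypothetical; db is assumed to be a SQLiteGeomRefDB instance, and compare_kwargs is assumed to hold only keyword arguments accepted by all three methods (for example geoms_epsg, geoms_tab_name, geoms_match, get_search_frame).

```python
def confusion_counts(db, input_geoms, **compare_kwargs):
    """Return (n_true_pos, n_false_pos, n_missing) for one input dataset."""
    # Materialised because the input features are iterated three times,
    # which a one-shot generator would not survive.
    inputs = list(input_geoms)
    n_true_pos = sum(1 for _ in db.true_positives(inputs, **compare_kwargs))
    n_false_pos = sum(1 for _ in db.false_positives(inputs, **compare_kwargs))
    n_missing = sum(1 for _ in db.missing_geometries(inputs, **compare_kwargs))
    return n_true_pos, n_false_pos, n_missing
```

Counting with sum() over the generators avoids keeping the yielded features in memory when only the totals are needed.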

geomcompare.geomutils module

geomcompare.geomutils.GeomObject = typing.Union[shapely.geometry.polygon.LinearRing, shapely.geometry.linestring.LineString, shapely.geometry.multilinestring.MultiLineString, shapely.geometry.multipoint.MultiPoint, shapely.geometry.multipolygon.MultiPolygon, shapely.geometry.point.Point, shapely.geometry.polygon.Polygon]

Type:    _UnionGenericAlias

Type for shapely geometrical objects.

geomcompare.geomutils.to_2D(geom)

Remove the third dimension of a geometrical object’s coordinates.

Parameters

geom (GeomObject) – Shapely geometrical object with XYZ-coordinates.

Returns

Geometrical object with its Z-coordinates removed.

Return type

GeomObject
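To make the effect of to_2D concrete, the sketch below applies the same idea to bare coordinate tuples rather than shapely objects. The function drop_z is a hypothetical stand-in and says nothing about how to_2D is implemented.

```python
def drop_z(coords):
    """Return the coordinates with any third (Z) value discarded."""
    return [(x, y) for x, y, *_ in coords]


# A closed ring with XYZ coordinates (illustrative values).
ring_xyz = [(0.0, 0.0, 5.0), (1.0, 0.0, 5.5), (1.0, 1.0, 6.0), (0.0, 0.0, 5.0)]
ring_xy = drop_z(ring_xyz)
```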

geomcompare.geomutils.get_transform_func(epsg_in, epsg_out)

Get function to transform a geometrical object to another SRS.

Create and return a function that transforms the XY-coordinates of GeomObject instances from one spatial reference system to another. The function identifies input and output spatial reference systems by the EPSG code.

Parameters
  • epsg_in (int) – EPSG code of the input spatial reference system.

  • epsg_out (int) – EPSG code of the output spatial reference system.

Returns

Function that takes one GeomObject as positional argument and returns the GeomObject with its XY-coordinates transformed to the output spatial reference system.

Return type

callable
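Since get_transform_func returns a reusable callable, it is natural to build it once and apply it to every feature, as in this sketch. The helper reproject_all and its make_transform parameter are hypothetical; the parameter exists only so the transform factory can be swapped out, and by default the documented function is imported lazily (which assumes geomcompare is installed).

```python
def reproject_all(geoms, epsg_in, epsg_out, make_transform=None):
    """Reproject an iterable of geometries from epsg_in to epsg_out."""
    if make_transform is None:
        # Imported lazily: requires geomcompare to be installed.
        from geomcompare.geomutils import get_transform_func as make_transform
    # Build the transformation once and reuse it for every feature.
    transform = make_transform(epsg_in, epsg_out)
    return [transform(geom) for geom in geoms]


# Example call (EPSG codes are illustrative):
# web_mercator_geoms = reproject_all(wgs84_geoms, 4326, 3857)
```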

geomcompare.io module

namedtuple geomcompare.io.ConnectionParameters(host, dbname, user, password, port=5432)

Bases: NamedTuple

Parameters to open a connection to a PostGIS database.

Instances of this class are intended to be passed as an argument to the fetch_geoms_from_pg function.

Fields
  1.  host (str) – Database host address.

  2.  dbname (str) – Database name.

  3.  user (str) – User name used to authenticate.

  4.  password (str) – Password used to authenticate.

  5.  port (int) – Connection port number.

namedtuple geomcompare.io.SchemaTableColumn(schema, table, column)

Bases: NamedTuple

Location of a geometry column in a PostGIS database.

Instances of this class are intended to be passed as an argument to the fetch_geoms_from_pg function.

Fields
  1.  schema (str) – Schema name of the PostGIS database, where the table containing the geometrical features is located.

  2.  table (str) – Table name, where the geometrical features can be found.

  3.  column (str) – Column name, where the geometrical features can be found.

geomcompare.io.fetch_geoms_from_pg(conn=None, conn_params=None, sql_query=None, geoms_col_loc=None, aoi=None, aoi_epsg=None, output_epsg=None)

Fetch geometrical features from a PostGIS database.

Generator function which connects or uses an existing connection to a PostGIS database, and yields geometrical features from a specified geometry column (within a given area or not), or based on a user-defined SQL query. If the connection to the database is opened by the function, it will be closed automatically after the last geometrical feature is yielded.

Parameters
  • conn (psycopg2.extensions.connection, optional) – Pre-opened connection to the PostGIS database.

  • conn_params (ConnectionParameters, optional) – Parameters to open a connection to the PostGIS database.

  • sql_query (str, optional) – SQL query to use to extract geometrical features from the PostGIS database.

  • geoms_col_loc (SchemaTableColumn, optional) – Geometry column location within the PostGIS database.

  • aoi (GeomObject, optional) – Area of interest, where the geometrical features lie.

  • aoi_epsg (int, optional) – EPSG code of the area of interest geometry/ies.

  • output_epsg (int, optional) – EPSG code of the yielded geometrical features. This parameter can be used to transform the yielded geometries to a different Spatial Reference System from the one used in the PostGIS database.

Yields

GeomObject – Geometrical features from the PostGIS database.

Raises
  • ValueError – If neither the conn nor the conn_params parameter is passed an argument other than None.

  • ValueError – If neither the sql_query nor the geoms_col_loc parameter is passed an argument other than None.

Notes

In the case where the sql_query parameter is given, the parameters geoms_col_loc, aoi, aoi_epsg and output_epsg will be ignored, as SQL queries can include filtering and reprojection.
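A possible way to combine ConnectionParameters, SchemaTableColumn and fetch_geoms_from_pg is sketched below. The wrapper stream_reference_geoms and its gc_io parameter are hypothetical; gc_io only allows the module to be substituted, and by default geomcompare.io is imported lazily (which assumes geomcompare and its PostGIS dependencies are installed).

```python
def stream_reference_geoms(host, dbname, user, password,
                           schema, table, column,
                           output_epsg=None, gc_io=None):
    """Yield the geometries of one PostGIS geometry column."""
    if gc_io is None:
        # Imported lazily: requires geomcompare to be installed.
        import geomcompare.io as gc_io
    # port is omitted, so the documented default (5432) applies.
    params = gc_io.ConnectionParameters(
        host=host, dbname=dbname, user=user, password=password
    )
    location = gc_io.SchemaTableColumn(schema=schema, table=table, column=column)
    # No `conn` is passed: the function opens the connection itself and,
    # as described above, closes it after the last feature is yielded.
    yield from gc_io.fetch_geoms_from_pg(
        conn_params=params, geoms_col_loc=location, output_epsg=output_epsg
    )
```

Because the wrapper is itself a generator, nothing is fetched until the caller starts iterating.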

geomcompare.io.LayerID = typing.Union[str, int]

Type:    _UnionGenericAlias

Type for identifying layers.

namedtuple geomcompare.io.LayerFilter(layer_id=None, aoi=None, aoi_epsg=None, attr_filter=None, fids=None)

Bases: NamedTuple

Filter for extraction of geometrical features from file.

Instances of this class are intended to be passed as arguments to the extract_geoms_from_file function, for filtering and choosing the geometrical features to extract.

Fields
  1.  layer_id (Union[str, int, None]) – Name or index of the layer the filter will be applied to. If set to None, the filter will be applied on all layers.

  2.  aoi (Union[LinearRing, LineString, MultiLineString, MultiPoint, MultiPolygon, Point, Polygon, None]) – Area of interest, where the geometrical features lie. All features lying outside the area of interest will be filtered out (not extracted).

  3.  aoi_epsg (Optional[int]) – EPSG code of the area of interest geometry/ies. If set to None, the same Spatial Reference System as the layer will be used.

  4.  attr_filter (Optional[str]) – Valid string representation of an attribute filter (e.g. "attr_name = 'value'").

  5.  fids (Optional[Sequence[int]]) – IDs of the features to extract from the layer. This parameter will be ignored if either the aoi or the attr_filter parameters are specified by the user.

geomcompare.io.extract_geoms_from_file(filename, driver_name, layers=None, layer_filters=None)

Extract geometrical features from a GDAL/OGR-readable file.

Generator function which opens a file located on disk, with one of the existing GDAL/OGR drivers, and yields geometrical features, from one or several layers. The function also permits the use of filters to allow for fine-grained extraction of the geometrical features.

Parameters
  • filename (str) – Path to the file to extract the geometrical features from.

  • driver_name (str) – Name of the GDAL/OGR driver to use for opening the file.

  • layers (sequence of LayerID, optional) – Layers from which the geometrical features will be extracted. If set to None (default), geometrical features will be extracted from all layers.

  • layer_filters (sequence of LayerFilter, optional) – Filters to apply to the layer(s) when extracting the geometrical features.

Yields

GeomObject – Geometrical features from the file.

Raises

NotImplementedError – If GDAL/OGR is not installed or not importable.

geomcompare.io.write_geoms_to_file(filename, driver_name, geoms_iter, geoms_epsg=None, layer=None, mode='update')

Write multiple geometrical features to disk.

The function takes as input an iterable of geometrical features and writes them to disk using one of the existing GDAL/OGR drivers.

Parameters
  • filename (str) – Path to the output file where the geometrical features will be written to.

  • driver_name (str) – Name of the GDAL/OGR driver to use for writing the file.

  • geoms_iter (iterable of GeomObject) – Iterable of the geometrical features to write.

  • geoms_epsg (int, optional) – EPSG code of the input geometrical features. If the Spatial Reference System of the input geometrical features is specified and differs from that of the layer they will be written to (in case of an update, see the mode parameter), the coordinates of the geometries will be reprojected to the layer’s Spatial Reference System. It is set to None by default (no Spatial Reference System).

  • layer (LayerID, optional) – Layer name/index on which to write the input geometries. In case of a file update (see mode parameter), the index of an existing layer can be passed as argument. If layer is set to None (default), the geometrical features will be written, in update mode, on the first layer available (at index 0), if any. If no layer is available, as well as in overwrite mode, the layer parameter set to None will result in the function writing the input geometries to a layer named default (if the driver supports named layers).

  • mode ({"update", "overwrite"}) – If set to "update", the function will update an existing file, or will create it if it does not exist. If set to "overwrite", the function will delete any file at the path set to the filename parameter, and will create a new file at this same location.

Return type

None
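The reading and writing functions of this module can be chained, for instance to copy a filtered subset of features from one file to another. The sketch below is one way to do it. The helper copy_filtered and its gc_io parameter are hypothetical; gc_io only allows the module to be substituted, and by default geomcompare.io is imported lazily (which assumes geomcompare and GDAL/OGR are installed). Using the same driver for input and output is a simplification made for the example.

```python
def copy_filtered(src, dst, driver_name, attr_filter,
                  geoms_epsg=None, gc_io=None):
    """Copy the features of `src` that pass `attr_filter` into `dst`."""
    if gc_io is None:
        # Imported lazily: requires geomcompare and GDAL/OGR.
        import geomcompare.io as gc_io
    # layer_id=None applies the filter to every layer of the source file.
    layer_filter = gc_io.LayerFilter(layer_id=None, attr_filter=attr_filter)
    geoms = gc_io.extract_geoms_from_file(
        src, driver_name, layer_filters=[layer_filter]
    )
    # mode="overwrite" deletes any existing file at `dst` before writing.
    gc_io.write_geoms_to_file(
        dst, driver_name, geoms, geoms_epsg=geoms_epsg, mode="overwrite"
    )
```

Passing the generator straight to write_geoms_to_file keeps the copy streaming, so the features never need to fit in memory at once.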

geomcompare.stats module