COG ImageMosaic from local storage to S3
Introduction
This tutorial explains how to update an existing ImageMosaic built on top of local granules into a COG ImageMosaic whose granules are stored in an S3 bucket. It is aimed at users who want to move the COG granules of an ImageMosaic to a remote bucket without re-harvesting the whole collection of granules.
Assumptions
An ImageMosaic store already exists, with its index based on a DB (e.g. PostGIS).
Local GeoTIFF granules are already valid COGs.
The user has experience uploading data to S3.
Verifying data is valid COG
You can verify that a sample GeoTIFF is a valid COG using the COG Validator service.
Store a sample GeoTIFF in the target bucket (or on the server location) you will use for remote serving and copy its full URL, e.g. https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/2018.01.01.tif.
Go to the COG Validator.
Paste the sample COG URL in the text box and hit the submit button.
In case the sample file is a valid COG, you will get a message like this:
Cloud Optimized GeoTIFF Validator: result Validation succeeded !
https://sample.s3.eu-central-1.amazonaws.com/test/cog.tif
is a valid Cloud Optimized GeoTIFF.
In case the file isn’t a valid COG, you can use GDAL 3.1 or above to convert your file to COG format. See the related GDAL documentation for further details.
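As a rough sketch of the conversion step, the snippet below builds a gdal_translate command that uses the COG output driver available since GDAL 3.1. The file names and the DEFLATE compression choice are illustrative assumptions, not values taken from this tutorial. The command is only executed if you call convert_to_cog explicitly, which assumes gdal_translate is available on the PATH.

```python
import subprocess


def build_cog_command(src, dst, compress="DEFLATE"):
    """Return the gdal_translate argument list that writes dst as a COG."""
    # -of COG selects the COG driver (GDAL 3.1+); COMPRESS is a creation option.
    return ["gdal_translate", "-of", "COG", "-co", f"COMPRESS={compress}", src, dst]


def convert_to_cog(src, dst, compress="DEFLATE"):
    """Run the conversion; raises CalledProcessError if gdal_translate fails."""
    subprocess.run(build_cog_command(src, dst, compress), check=True)


# Hypothetical file names, used only to show the resulting command line.
print(" ".join(build_cog_command("granule.tif", "granule_cog.tif")))
```

After converting, it is worth validating one of the output files again before uploading the whole collection.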
Once the data has been verified, upload all of your granules to the S3 bucket.
ImageMosaic update
The next step is to update both the ImageMosaic configuration and its index.
ImageMosaic configuration update
A few new properties need to be added to the ImageMosaic configuration to support COG.
Locate the .properties file containing the mosaic configuration. It’s usually a .properties file with the same name as the parent folder.
You can recognize it because it is usually autogenerated during the first ImageMosaic configuration and contains this header: #-Automagically created from GeoTools-
For simplicity, the rest of this tutorial assumes it is named mosaic.properties.
Once located, edit that file by adding these new properties:
Cog=true
SuggestedSPI=it.geosolutions.imageioimpl.plugins.cog.CogImageReaderSpi
When your granules are stored in a public bucket, you can stick with the default RangeReader implementation, so no other flags are needed and you can skip to the ImageMosaic index update section.
If you are using a private bucket instead, you need to add these properties to the mosaic.properties file:
CogRangeReader=it.geosolutions.imageioimpl.plugins.cog.S3RangeReader
CogUser=S3AccessKeyID
CogPassword=S3SecretAccessKey
Here, S3AccessKeyID and S3SecretAccessKey stand for the actual values needed to access that bucket.
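The edit described above can also be scripted. The following is a minimal sketch, assuming the file content has already been read into a string. The helper name add_cog_properties and the sample content are hypothetical, and only the property names listed in this section are used. Credentials are added only when both are supplied, which corresponds to the private bucket case.

```python
def add_cog_properties(text, access_key=None, secret_key=None):
    """Append the COG settings to the content of a mosaic.properties file.

    Keys that are already present are left untouched. Pass both credentials
    only when the granules live in a private bucket.
    """
    wanted = {
        "Cog": "true",
        "SuggestedSPI": "it.geosolutions.imageioimpl.plugins.cog.CogImageReaderSpi",
    }
    if access_key is not None and secret_key is not None:
        wanted["CogRangeReader"] = "it.geosolutions.imageioimpl.plugins.cog.S3RangeReader"
        wanted["CogUser"] = access_key
        wanted["CogPassword"] = secret_key

    # Collect the keys already defined, skipping comments and blank lines.
    existing = set()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and "=" in stripped:
            existing.add(stripped.split("=", 1)[0].strip())

    lines = text.splitlines()
    lines += [f"{key}={value}" for key, value in wanted.items() if key not in existing]
    return "\n".join(lines) + "\n"


# Hypothetical starting content; a real file holds more properties.
sample = "#-Automagically created from GeoTools-\nPathType=RELATIVE\n"
print(add_cog_properties(sample))
```

Because existing keys are skipped, running the helper twice on the same content does not duplicate any property.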
ImageMosaic index update
The next step is updating the ImageMosaic index, which is a catalog of all the granules composing the mosaic. We need to update the location values so that they refer to remote URLs instead of local paths on disk. The location attribute initially contains the path of each granule on disk, which can be either a relative or an absolute path. Relative paths are relative to the ImageMosaic parent configuration folder, whilst absolute paths are full filesystem paths.
The mosaic.properties file contains a PathType property set to RELATIVE or ABSOLUTE.
On old mosaics, that property might be missing, and an AbsolutePath property with a boolean value (true/false) exists instead.
As a consequence, all the paths of a given mosaic are either relative or absolute.
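To decide which kind of update query applies, the two properties mentioned above can be checked programmatically. This is a minimal sketch, assuming the file content is available as a string. The helper names parse_properties and path_type are hypothetical, and only the RELATIVE and ABSOLUTE values described in this section are handled.

```python
def parse_properties(text):
    """Parse key=value lines, ignoring comments and blank lines."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def path_type(text):
    """Return 'RELATIVE' or 'ABSOLUTE' for the given mosaic.properties content."""
    props = parse_properties(text)
    if "PathType" in props:
        return props["PathType"].upper()
    # Older mosaics: fall back to the boolean AbsolutePath property.
    if "AbsolutePath" in props:
        return "ABSOLUTE" if props["AbsolutePath"].lower() == "true" else "RELATIVE"
    raise ValueError("neither PathType nor AbsolutePath is set")


print(path_type("PathType=RELATIVE\n"))
print(path_type("AbsolutePath=true\n"))
```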
For example, an ImageMosaic stored at /var/data/imageMosaic/mosaic with a granule at /var/data/imageMosaic/mosaic/2018.01.01.tif may have a record in the database with the location attribute equal to:
2018.01.01.tif in case of relative path
/var/data/imageMosaic/mosaic/2018.01.01.tif in case of absolute path
The type of path affects the query to be executed to update the index.
Note
Make sure to back up your table so that you can recover quickly if the update query goes wrong.
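One simple way to take such a backup is to copy the index table into a second table with CREATE TABLE ... AS SELECT, a statement that PostgreSQL also supports. The sketch below runs it against an in-memory SQLite database used purely as a stand-in for the PostGIS index so that it is self-contained. The table names mosaic_index and mosaic_index_backup and the second granule name are hypothetical. Note that this kind of copy carries the rows but not the indexes or constraints of the original table.

```python
import sqlite3

# In-memory SQLite database standing in for the real PostGIS index (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mosaic_index (location TEXT)")
conn.executemany(
    "INSERT INTO mosaic_index (location) VALUES (?)",
    [("2018.01.01.tif",), ("2018.01.17.tif",)],
)

# Copy every row into a backup table before touching the location column.
conn.execute("CREATE TABLE mosaic_index_backup AS SELECT * FROM mosaic_index")

# Both counts should match if the copy is complete.
original = conn.execute("SELECT COUNT(*) FROM mosaic_index").fetchone()[0]
backup = conn.execute("SELECT COUNT(*) FROM mosaic_index_backup").fetchone()[0]
print(original, backup)
```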
For this example, we are going to use the same public datasets from S3 URLs used in the previous ImageMosaic example with Modis COG datasets section.
For locations with relative paths, a simple replacement query could look like this:
UPDATE schema.table SET location=CONCAT(
'https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/', location);
So we are basically prepending the S3 bucket URL to the location value.
This way, based on the above example, location=2018.01.01.tif will become location=https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/2018.01.01.tif
For locations with absolute paths, a replacement query for our example could look like this:
UPDATE schema.table SET location=REPLACE(location,'/var/data/imageMosaic/mosaic/',
'https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/');
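Before running either query on the real index, it can help to rehearse it on throwaway data. The sketch below does that with in-memory SQLite tables standing in for the PostGIS index. The table name mosaic_index, the helper functions, and the second granule name are hypothetical. The relative case uses the standard || concatenation operator instead of CONCAT so that the statement runs in SQLite as well as PostgreSQL, and it adds a WHERE guard (an addition, not part of the queries above) so that rows already pointing to a URL are not prefixed twice.

```python
import sqlite3

BASE_URL = "https://modis-vi-nasa.s3-us-west-2.amazonaws.com/MOD13A1.006/"
MOSAIC_DIR = "/var/data/imageMosaic/mosaic/"


def make_index(paths):
    """Create a throwaway index table holding the given location values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mosaic_index (location TEXT)")
    conn.executemany("INSERT INTO mosaic_index VALUES (?)", [(p,) for p in paths])
    return conn


def locations(conn):
    """Return all location values, sorted for stable comparison."""
    return [row[0] for row in conn.execute("SELECT location FROM mosaic_index ORDER BY location")]


# Relative paths: prepend the bucket URL, skipping rows that are already URLs.
relative = make_index(["2018.01.01.tif", "2018.01.17.tif"])
relative.execute(
    "UPDATE mosaic_index SET location = ? || location WHERE location NOT LIKE 'http%'",
    (BASE_URL,),
)

# Absolute paths: swap the local folder prefix for the bucket URL.
absolute = make_index([MOSAIC_DIR + "2018.01.01.tif", MOSAIC_DIR + "2018.01.17.tif"])
absolute.execute(
    "UPDATE mosaic_index SET location = REPLACE(location, ?, ?)",
    (MOSAIC_DIR, BASE_URL),
)

print(locations(relative))
print(locations(absolute))
```

Both rehearsals should end with every location starting with the bucket URL. Checking that on the real table after the update is a cheap sanity test before reloading GeoServer.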
GeoServer reload
Once everything is done, reload the GeoServer configuration.