I’m attempting to use TileDB as a backend for a mapserver. With the way my data works I need to add new images every 10 minutes (or more frequently) for the same data. I was hoping to be able to use a “time” dimension in TileDB, append new images to an array, and remove old images (to save space). From my research this doesn’t seem possible. Is that right?
I mention mapserver because TileDB has a mapserver example, so I’m wondering whether anyone has looked at supporting a TileDB dimension from mapserver.
Indeed, TileDB currently does not support deleting previous images the way you want. The quick workaround for now is the following:
Add a time dimension to the array schema. Each 2D image should be written as a dense slice like A[t, x_range, y_range] or A[x_range, y_range, t], where t is the timestamp of insertion. Make sure to set the tile extent of the time dimension to 1, or to a larger number if you write and/or slice more than one image at a time.
TileDB stores the data of each write operation in a timestamped subdirectory inside the array directory. Each such subdirectory has a name of the form __t1_t2_uuid, where t1 and t2 are timestamps (which will be equal in your case if you do not run consolidation). Those timestamps are milliseconds elapsed since 1970-01-01 00:00:00 +0000 (UTC). By listing the subdirectories in the console you can see them sorted in time (based on the name) and simply remove the older ones that you do not need. These deletions will not corrupt the array, and you will see only your most recent images, as desired.
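To make the cleanup step concrete, here is a minimal Python sketch of it using only the standard library. It assumes the on-disk layout described above (one __t1_t2_uuid subdirectory per write, directly inside the array directory); the function names and the dry_run flag are made up for illustration and are not part of any TileDB API. Newer TileDB versions may lay out the array directory differently, so check what your version actually writes before deleting anything.

```python
import shutil
from pathlib import Path


def parse_fragment_name(name):
    """Return (t1, t2) in ms since the Unix epoch for a name shaped like
    __t1_t2_uuid, or None if the name does not have that shape."""
    if not name.startswith("__"):
        return None
    parts = name[2:].split("_")
    if len(parts) < 3:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:
        return None


def prune_fragments(array_dir, keep_after_ms, dry_run=True):
    """List (and, if dry_run is False, delete) write subdirectories whose
    end timestamp t2 is older than keep_after_ms.

    Hypothetical helper: it relies on the naming convention only, and it
    leaves every file and every non-matching directory untouched.
    """
    selected = []
    for entry in sorted(Path(array_dir).iterdir()):
        if not entry.is_dir():
            continue  # e.g. the array schema file
        stamps = parse_fragment_name(entry.name)
        if stamps is None:
            continue  # not a timestamped write subdirectory
        _t1, t2 = stamps
        if t2 < keep_after_ms:
            selected.append(entry.name)
            if not dry_run:
                shutil.rmtree(entry)
    return selected
```

For example, a cutoff of int((time.time() - 24 * 3600) * 1000) would select everything written more than a day ago. Running with dry_run=True first lets you inspect the list before anything is removed.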
I hope this helps. We are happy to add a more explicit API function to carry this out for you. If you could post this as a suggested feature on http://feedback.tiledb.com/ we will review it and schedule it.
That is an interesting use case you suggest, supporting a TileDB dimension from Mapserver. I have used Mapserver a lot in the past (https://mapserver.org/ogc/wcs_format.html) for serving time series data and I can look into creating a time series Mapserver demo with TileDB.
Are you able to share your existing map file so I can see what you are currently doing? You can also email me directly (norman tiledb.com), but let’s keep the general discussion on this forum.
The main issue (besides my lack of experience with mapserver and tiledb) is that the data I want to serve is in a non-EPSG projection and I would like to avoid resampling it. Another issue is not knowing what’s possible when pointing to a TileDB backend. I liked what I saw from TileDB and where it seems to be going, so I wanted to try it out. The last is not really understanding if or how mapserver supports updating data sources in the time dimension (wms_timeextent).
I based what I have so far on the existing example in the TileDB docs:
LAYER
  NAME "goes16_fldk_true_color"
  TYPE RASTER
  DATA "/data/goes16_fldk_true_color"
  OFFSITE 0 0 0
  PROJECTION
    # "+proj=geos +sweep=x +lon_0=-75 +h=35786023 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs"
    AUTO
  END
  METADATA
    "wms_title" "CSPP Geo Geo2Grid GOES-16 Full Disk True Color"
    "wms_timeextent" "2019-01-01/2021-01-01"
    #"wms_timedefault" "2019-05-01 10:11:12"
    "wms_timeitem" "TIFFTAG_DATETIME"
    "wms_enable_request" "*"
  END
  FILTER (`[TIFFTAG_DATETIME]`=`2019-02-01 01:02:03`)
END
If I have to make a shapefile to specify the time indexing then that’s too bad but I understand. I wonder if the TileDB GDAL driver could be updated/changed (if needed) to filter queries by dimension (like time or elevation) like PostGIS can (from what I’ve read).