TileDB cache management

Hello!
I've been wondering how to optimize tile fetching and caching for arrays.

From what I've gathered, the only way to cache a cloud array for now is through the sm.tile_cache_size parameter, which controls an in-memory LRU cache of uncompressed fetched tiles, scoped to the current session.
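For reference, a minimal sketch of tuning that parameter via the TileDB Python API (this assumes the `tiledb` package; the 512 MiB figure and the array URI are arbitrary placeholders):

```python
import tiledb

# Config fragment: enlarge the session-scoped, in-memory LRU tile cache.
# The value is in bytes; 512 MiB here is an arbitrary example.
cfg = tiledb.Config({"sm.tile_cache_size": str(512 * 1024**2)})
ctx = tiledb.Ctx(cfg)

# Any array opened with this context shares the cache for the lifetime
# of the session; the cache is discarded when the process exits.
# with tiledb.open("s3://my-bucket/my-array", ctx=ctx) as A:
#     ...
```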

The issue is that sometimes you might want to persist a "generated" cache from one session to the next (i.e., commit the cache to the client filesystem). For example: on compute instances with long-lived sessions that you might have to restart at some point without losing the cache, or when multiple TileDB clients should share the same cache, etc.

I found an old issue relating to that problem.

But I’m not sure whether it has been addressed at some point and I’m missing something.

My current workaround is using the MinIO S3 gateway:

docker run \
--name minio-s3-gateway \
-d \
-e MINIO_CACHE="on" \
-p 9000:9000 \
-v minio-cache:/data \
minio/minio gateway s3

(You can tune additional parameters for the caching policy.)
This uses MinIO as a local cache for the fetched objects.
It has the added benefit of faster metadata fetches when opening an array.

I guess the question is: is there a way to create a longer-lived cache of fetched tiles that isn’t cleared when the client closes, and that can be reused between TileDB Embedded clients?

Thanks !

Hi @tiphaineruy, this would indeed be a nice feature (and your workaround is valid). It’s not on our roadmap currently, but we could work on it in the future.

If you don’t mind, could you please add it to our feature request site? Our team will add a ticket to our backlog.

Done @stravos: Support Array caching | Voters | TileDB

That said, it’s not a really high-priority issue, as the MinIO gateway workaround does the job. But I’m sure that, long term, some edge cases may require tighter integration with TileDB Embedded.