I have a (geospatial) dataset: a dense 2D array of 21,600 x 43,200 float values. From this array I would like to read ~100,000 coordinate pairs (corresponding to locations) and get back an array of 100,000 floats. Performance is important in this use case.
I have been using zarr and its get_coordinate_selection function for this task, and it works pretty well - in particular, it seems to treat the request as a 'batch' so that each chunk needed is decompressed only once. However, I would like to compare the performance with TileDB, and I am also interested in database integration (especially Trino). I looked at multi_index, but it seems to take the cross product of the per-dimension ranges, which is a bit different (see the example at the end of this post).
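For reference, here is a minimal sketch of what I am doing now with zarr (the store path and coordinates are made up, and I am assuming float32):

```python
import numpy as np
import zarr

# Hypothetical path; the real array is 21,600 x 43,200 floats, chunked.
z = zarr.open("data.zarr", mode="r")

# ~100,000 (row, col) index pairs, e.g. derived from lat/lon locations.
rng = np.random.default_rng(0)
rows = rng.integers(0, 21_600, size=100_000)
cols = rng.integers(0, 43_200, size=100_000)

# Batched point lookup: zarr decompresses each needed chunk only once
# and returns one float per (row, col) pair.
values = z.get_coordinate_selection((rows, cols))  # shape (100_000,)
```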
How best to achieve this? I am using the Python API, but C#/C++ would also be fine. I have an uneasy feeling that I am just missing something in the documentation, in which case apologies in advance!
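For clarity, here is a throwaway example of the multi_index behaviour I mean, using a tiny 4 x 4 array and a made-up local URI:

```python
import numpy as np
import tiledb

uri = "demo_dense"  # throwaway local array, purely for illustration

# Tiny 4x4 dense array so the cross-product behaviour is easy to see.
dom = tiledb.Domain(
    tiledb.Dim(name="row", domain=(0, 3), tile=2, dtype=np.int64),
    tiledb.Dim(name="col", domain=(0, 3), tile=2, dtype=np.int64),
)
schema = tiledb.ArraySchema(
    domain=dom,
    sparse=False,
    attrs=[tiledb.Attr(name="v", dtype=np.float32)],
)
tiledb.Array.create(uri, schema)

with tiledb.open(uri, mode="w") as A:
    A[:] = np.arange(16, dtype=np.float32).reshape(4, 4)

with tiledb.open(uri, mode="r") as A:
    # I want exactly two values, at (0, 1) and (2, 3)...
    res = A.multi_index[[0, 2], [1, 3]]
    # ...but multi_index returns the cross product {0, 2} x {1, 3}:
    # four values at (0,1), (0,3), (2,1), (2,3).
    print(res["v"].shape)  # (2, 2), not (2,)
```

What I am after is the zipped/paired behaviour of get_coordinate_selection rather than this cross product.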