Hi All,
I’m building an API that serves data from a TileDB database stored on AWS S3. Everything is working, but very slowly.
The problem is confined to the following code block.
    with tiledb.open(array_dir, "r") as TileDB_array:  # Takes 1.7s
        data = TileDB_array[start_idx:end_idx]  # Takes 9.8s
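
For anyone wanting to reproduce per-step timings like the ones in the comments above, here is a minimal sketch using only the standard library. The `timed` helper and the `timings` dict are hypothetical names, and the list operations are stand-ins for the real TileDB calls so the snippet runs without S3 access; it is not necessarily how the numbers above were measured.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}

# Stand-ins for the real calls. In the API these would be
# tiledb.open(array_dir, "r") and TileDB_array[start_idx:end_idx].
with timed("open", timings):
    fake_array = list(range(1_000_000))
with timed("slice", timings):
    data = fake_array[1000:2000]

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f}s")
```

Timing the open and the slice separately makes it clear which of the two dominates the total.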
I’ve already tried a number of things to get the timings down to what they are above. First I reduced the tile size from a year to a day, which led to a 4x speed-up; then I moved my API server to the same AWS region, which led to a 2x speed-up.
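
A rough sketch of why the tile size matters so much: reads fetch whole tiles, so a slice pulls in every tile it overlaps. The sketch assumes a dense one-dimensional array with tiles aligned to index 0, half-open slices, and one cell per minute; the resolution and the function names are illustrative assumptions, not details of the actual array.

```python
def tiles_touched(start_idx, end_idx, tile_extent):
    """Count the tiles overlapped by the half-open slice [start_idx, end_idx)."""
    if end_idx <= start_idx:
        return 0
    first_tile = start_idx // tile_extent
    last_tile = (end_idx - 1) // tile_extent
    return last_tile - first_tile + 1


def cells_fetched(start_idx, end_idx, tile_extent):
    """Cells read from storage if every overlapped tile is fetched in full."""
    return tiles_touched(start_idx, end_idx, tile_extent) * tile_extent


# Assumed resolution: one cell per minute.
CELLS_PER_DAY = 24 * 60
CELLS_PER_YEAR = 365 * CELLS_PER_DAY

# A one-week slice starting on day 100.
start_idx = 100 * CELLS_PER_DAY
end_idx = 107 * CELLS_PER_DAY

for name, extent in [("year", CELLS_PER_YEAR), ("day", CELLS_PER_DAY)]:
    n_tiles = tiles_touched(start_idx, end_idx, extent)
    n_cells = cells_fetched(start_idx, end_idx, extent)
    print(f"tile extent = one {name}: {n_tiles} tile(s), {n_cells} cells fetched")
```

Under these assumptions, the year-sized tile forces a read of the whole year to answer a one-week query, while day-sized tiles fetch only the seven days requested.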
The problem is that, despite these speed-ups, it still takes ~11s to serve the data, which is far too slow for my requirements and far slower than the previous SQL set-up I had running.
Are there any general tips/tricks that can be used to speed up slicing and accessing data? I’m using a time series index, but I can’t imagine that slows things down too much, as it’s stored as integer values.
Thanks for all help and guidance!