Preparing array for reading itself takes some time

TileDbUser · June 26, 2019, 11:39pm

I am following the original TileDB C++ API example, which has code that prepares the array for reading before the query is submitted:

Array array(ctx, array_name, TILEDB_WRITE);

I have a fairly large dense array with two float attributes (5000 x 5M), and the slice I am reading is 5k x 100k. I am reading one attribute at a time.

But the line above, which is not even reading from the array, takes a considerable amount of time (15-20 seconds) every time I call my reader function on the array. Is this expected, or am I doing something wrong?

stavros · June 27, 2019, 12:08am

Just to clarify, you meant to prepare the array for reading (with TILEDB_READ, and not TILEDB_WRITE), right?

You are observing that because that statement “opens” the array, loading all the fragment metadata. If your array is large, then the fragment metadata are expected to be fairly large as well, so it takes time to load and decompress them. Please note that this is a one-off cost: you should create the array object once, and then create multiple query objects passing that single array instance to avoid reloading the fragment metadata.
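To illustrate the "open once, reuse for every query" advice, here is a minimal sketch. It deliberately makes no TileDB calls: `MockArray`, `Slice`, `read_slice`, `read_reopening`, `read_all_reusing`, and `open_count` are all hypothetical names standing in for the real array object, the query, and the reader function, so treat it as a picture of the pattern rather than TileDB code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Counts how many times the simulated, expensive "open" step has run.
static int open_count = 0;

// Hypothetical stand-in for the array object. Its constructor models the
// costly part: opening the array and loading the fragment metadata.
struct MockArray {
    explicit MockArray(const std::string& array_name) : name(array_name) {
        ++open_count;
    }
    std::string name;
};

// Hypothetical description of one slice: an inclusive column range.
struct Slice {
    int col_start;
    int col_end;
};

// Stand-in for creating and submitting one query on an already open array.
// It only returns the number of columns the slice covers.
int read_slice(const MockArray& array, const Slice& s) {
    assert(!array.name.empty());
    return s.col_end - s.col_start + 1;
}

// Pattern to avoid: the reader function opens the array on every call,
// so the open cost is paid once per read.
int read_reopening(const std::string& array_name, const Slice& s) {
    MockArray array(array_name);
    return read_slice(array, s);
}

// Suggested pattern: the caller opens the array once and the same instance
// serves every query, so the open cost is paid a single time.
int read_all_reusing(const MockArray& array, const std::vector<Slice>& slices) {
    int total = 0;
    for (const Slice& s : slices) {
        total += read_slice(array, s);
    }
    return total;
}
```

With three slices, calling `read_reopening` for each one runs the open step three times, while constructing one `MockArray` and passing it to `read_all_reusing` runs it once. In real code the same structure would apply, with the array object created a single time in TILEDB_READ mode and shared by all query objects.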

Also, please note that we have just added an optimization for loading the fragment metadata. It is already in the dev branch and scheduled for the 1.6 release over the next couple of days. Specifically, we now load the fragment metadata lazily during the reads (i.e., during the query submission).

Finally, we are planning to move to a full “out-of-core” fragment metadata implementation soon, where only a small fraction of relevant metadata is loaded upon a query, instead of all metadata as we do currently. So please expect this to be substantially improved over the next couple of months.

TileDbUser · June 27, 2019, 12:42am

Yes, it is being read in “TILEDB_READ” mode. The rest of your explanation makes sense. Thank you for your help on this. I am looking forward to the newer version.
