How can we boost write/read performance in our scenario?


We are evaluating TileDB for one of our projects.
During this analysis we ran into some difficulties improving TileDB read/write performance in our testbed, so we would like to share our findings with you and ask for your suggestions.


Test environment

Processor: Intel(R) Xeon(R) W-2295 CPU @ 3.00 GHz, 18 cores, 36 logical processors
OS: Microsoft Windows Server 2019 Standard (Version 10.0.17763 Build 17763)
Installed physical memory (RAM): 128 GB
SSD: 1 TB

TileDB Organization

Type: Dense array (tile order: row-major, cell order: row-major)
Array size: 500 rows x 500,000 columns, with 1 float attribute per cell
Tile: 1 row x 500,000 columns
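For completeness, a schema with this layout can be created with the C++ API roughly as follows (the array URI and the attribute name "a" are placeholders, not from our actual code):

```cpp
#include <tiledb/tiledb>

int main() {
  tiledb::Context ctx;

  // 500 rows x 500,000 columns; each tile spans one full row.
  tiledb::Domain domain(ctx);
  domain
      .add_dimension(tiledb::Dimension::create<int>(ctx, "rows", {{1, 500}}, 1))
      .add_dimension(
          tiledb::Dimension::create<int>(ctx, "cols", {{1, 500000}}, 500000));

  tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
  schema.set_domain(domain)
      .set_tile_order(TILEDB_ROW_MAJOR)
      .set_cell_order(TILEDB_ROW_MAJOR);
  schema.add_attribute(tiledb::Attribute::create<float>(ctx, "a"));

  tiledb::Array::create("test_array_1", schema);  // placeholder URI
  return 0;
}
```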

Test scenario

We completely filled 23 such arrays using the C++ API (version 2.2.9) and measured the write/read performance.
After each fragment write we consolidated and vacuumed the array to boost read performance.

Writing performance (measured at the 500th fragment of each array):

WRITING SINGLE FRAGMENT (milliseconds): 49
CONSOLIDATING (milliseconds): 13473
VACUUMING (milliseconds): 237

For each fragment we wrote [1 row, 500,000 columns] to match the tile size (GLOBAL_ORDER layout).
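A single-fragment write in our test looks roughly like the sketch below (using the 2.2.x-style `set_buffer` call; the URI, attribute name "a", and row index are placeholders):

```cpp
#include <tiledb/tiledb>
#include <string>
#include <vector>

// Write one row-shaped fragment [row, row] x [1, 500000] in global order.
void write_fragment(tiledb::Context& ctx, const std::string& uri, int row,
                    std::vector<float>& data) {
  tiledb::Array array(ctx, uri, TILEDB_WRITE);
  tiledb::Query query(ctx, array, TILEDB_WRITE);

  std::vector<int> subarray = {row, row, 1, 500000};
  query.set_subarray(subarray)
      .set_layout(TILEDB_GLOBAL_ORDER)
      .set_buffer("a", data);

  query.submit();
  query.finalize();  // required to flush global-order writes
  array.close();
}
```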

Reading performance

We summed the elapsed time required to retrieve the first [1 Row, 500000 Columns] from each of the 23 arrays (GLOBAL_ORDER layout). The total elapsed time is ~14 s.


  1. During writing we noticed that the consolidation time increases linearly as the array grows.
    We think this is normal, since the bigger the array, the longer consolidation takes.
    However, after the first 10 fragments consolidation took 349 ms, while at the last fragment it took ~13 s. Is this large gap [349 ms - 13 s] expected? How can we reduce it?
  2. The total time to read the first [100 Rows, 500000 Columns] of each array was ~14 s (so reading the whole content of the 23 arrays would take ~25 min). Is this a good value for this scenario? How can we reduce it?

We have tried to apply all of your performance-improvement suggestions (e.g., dumping the TileDB statistics shows “Percentage of useful cells read: 100%”), but we have probably underestimated some factors, as the read and write performance falls somewhat short of our expectations.

That being said, we would appreciate any suggestions for improving our performance.

Thanks for your support and for the time you will spend analysing this topic.


Hello Giordano, thanks for reaching out!

Some questions:

  1. Why are you storing the data in 23 separate arrays? Would it be better to store the data in a single array with 23 attributes? That would speed up performance if the slicing query across all arrays is the same.
  2. Can you please try consolidating the fragment metadata instead of the fragments? That is instant and in your particular use case it will have the same effect as consolidating the actual fragments.
  3. Can you please retry the experiment after upgrading to 2.3 (just released)? We have performed various optimizations across the board.
  4. Can you please share the stats with us? That will allow us to understand where the time is spent, in case we are missing something obvious.
  5. Can you please share the array schema (using the dump command)? We will try to reproduce everything locally.
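To illustrate suggestions 2, 4, and 5: fragment-metadata consolidation (as opposed to consolidating the fragments themselves) can be requested through the config, and stats/schema dumps can be produced as sketched below (the array URI is a placeholder):

```cpp
#include <cstdio>
#include <tiledb/tiledb>

int main() {
  // Suggestion 2: consolidate only the fragment metadata, not the fragments.
  tiledb::Config config;
  config["sm.consolidation.mode"] = "fragment_meta";
  tiledb::Context ctx(config);
  tiledb::Array::consolidate(ctx, "test_array_1");  // placeholder URI

  // Suggestions 4 and 5: dump the stats and the array schema to share.
  tiledb::Stats::enable();
  tiledb::Array array(ctx, "test_array_1", TILEDB_READ);
  array.schema().dump(stdout);
  array.close();
  tiledb::Stats::dump(stdout);
  return 0;
}
```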

Thanks for the detailed description. We look forward to fixing the read issues you are observing. Also please feel free to reach out directly to or