Improved performance

I recently saw @stavros pointing to a tutorial for the new dataframe interactions.

I really like the new .df way of directly returning a dataframe from a TileDB array,
and especially how TileDB now takes advantage of Arrow!

uri = 'BTC_USDT'

t1, t2 = np.datetime64('2020-12-20'), np.datetime64('2021-12-25')

with tiledb.open(uri, 'r') as A:
    df = A.query(attrs=('price', 'volume', 'is_buyer_maker',
                        'first_trade_ID', 'last_trade_ID'),
                 dims=('date',), index_col='date', use_arrow=True
                 ).df[:, t1:t2]

This reads 9,612,287 rows × 5 columns in 6.31 s.

Array-Schema

ArraySchema(
  domain=Domain(*[
    Dim(name='agg_ID', domain=(0, 9000000000000000000), tile=1000000, dtype='int64'),
    Dim(name='date', domain=(numpy.datetime64('1980-01-01T00:00:00.000000000'), numpy.datetime64('2100-01-01T00:00:00.000000000')), tile=86400000000000 nanoseconds, dtype='datetime64[ns]'),
  ]),
  attrs=[
    Attr(name='price', dtype='float64', var=False, filters=FilterList([ZstdFilter(level=9), ])),
    Attr(name='volume', dtype='float64', var=False, filters=FilterList([ZstdFilter(level=9), ])),
    Attr(name='first_trade_ID', dtype='int64', var=False, filters=FilterList([ZstdFilter(level=9), ])),
    Attr(name='last_trade_ID', dtype='int64', var=False, filters=FilterList([ZstdFilter(level=9), ])),
    Attr(name='is_buyer_maker', dtype='int8', var=False, filters=FilterList([BitShuffleFilter(), ZstdFilter(level=9), ])),
    Attr(name='is_best_price_match', dtype='int8', var=False, filters=FilterList([BitShuffleFilter(), ZstdFilter(level=9), ])),
  ],
  cell_order='row-major',
  tile_order='row-major',
  capacity=10000000,
  sparse=True,
  allows_duplicates=False,
  coords_filters=FilterList([ZstdFilter(level=-1), ])
)

Stats-dump 1

TileDB Embedded Version: (2, 1, 3)
TileDB-Py Version: 0.7.4
==== READ ====

  • Number of read queries: 3

  • Number of attempts until results are found: 9

  • Number of attributes read: 15

    • Number of fixed-sized attributes read: 15
  • Number of dimensions read: 3

    • Number of fixed-sized dimensions read: 3
  • Number of logical tiles overlapping the query: 167

  • Number of physical tiles read: 3006

    • Number of physical fixed-sized tiles read: 3006
  • Number of cells read: 19516358

  • Number of result cells: 17023669

  • Percentage of useful cells read: 87.2277%

  • Number of bytes read: 163402447 bytes (0.15218 GB)

  • Number of read operations: 10186

  • Number of bytes unfiltered: 693603523 bytes (0.645969 GB)

  • Unfiltering inflation factor: 4.24476x

  • Time to compute estimated result size: 0.043015 secs

    • Time to compute tile overlap: 0.0366245 secs

      Time to compute relevant fragments: 0.00434861 secs
      Time to load relevant fragment R-trees: 0.00656483 secs
      Time to compute relevant fragment tile overlap: 0.0247598 secs

  • Time to open array: 0.038626 secs

    • Time to load array schema: 0.000264519 secs
    • Time to load consolidated fragment metadata: 1.986e-06 secs
    • Time to load fragment metadata: 0.0113568 secs
  • Total metadata read: 229077 bytes (0.000213345 GB)

    • Array schema: 413 bytes (3.84636e-07 GB)
    • Fragment metadata: 175552 bytes (0.000163496 GB)
    • R-tree: 4224 bytes (3.93391e-06 GB)
    • Fixed-sized tile offsets: 48888 bytes (4.55305e-05 GB)
  • Time to load array metadata: 7.5849e-05 secs

  • Time to initialize the read state: 0.000125849 secs

  • Read time: 5.36799 secs

    • Time to compute next partition: 0.0339634 secs
    • Time to split current partition: 0.00350552 secs
    • Time to compute result coordinates: 4.82893 secs

      Time to compute sparse result tiles: 0.000779751 secs
      Time to read coordinate tiles: 0.0181086 secs
      Time to unfilter coordinate tiles: 0.243352 secs
      Time to compute range result coordinates: 3.89938 secs

    • Time to compute sparse result cell slabs: 0.0495635 secs
    • Time to copy result attribute values: 0.400744 secs

      Time to read attribute tiles: 0.0356251 secs
      Time to unfilter attribute tiles: 0.272747 secs
      Time to copy fixed-sized attribute values: 0.0487621 secs

    • Time to copy result coordinates: 0.0172377 secs

      Time to copy fixed-sized coordinates: 0.0142504 secs

  • Total read query time (array open + init state + read): 5.36812 secs
    ==== Python Stats ====

  • TileDB-Py Indexing Time: 6.25372

    • TileDB-Py query execution time: 6.13007

      TileDB C++ Core initial query submit time: 0.964374
      TileDB C++ Core incomplete resubmit(s) time: 4.40392
      TileDB-Py incomplete buffer updates: 0.730236

    • TileDB-Py buffer conversion time: 0.119553

But by increasing the buffer size, I get an almost 100% read-performance improvement!

cfg = tiledb.Ctx().config()
cfg.update(
  {
    'py.init_buffer_bytes': 1024**2 * 400
  }
)
tiledb.default_ctx(cfg)

The recommended default of 1024**2 * 100 showed no improvement, but setting it to 1024**2 * 400 did the trick for me, reducing the read time to 3.27 s.
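A plausible reason 400 MiB works here (my own back-of-envelope estimate, not something the stats state directly): the five attributes plus the date coordinate come to roughly 41 bytes per row, so the ~9.6 M result rows need around 376 MiB of buffer space. That fits in a single 400 MiB allocation — consistent with stats dump 2 showing one read query and one attempt — but overflows the 100 MiB default, which would force incomplete-query resubmits as in stats dump 1.

```python
# Rough per-row width for the query above (sizes from the schema dtypes).
n_rows = 9_612_287
bytes_per_row = (
    8      # price: float64
    + 8    # volume: float64
    + 1    # is_buyer_maker: int8
    + 8    # first_trade_ID: int64
    + 8    # last_trade_ID: int64
    + 8    # date coordinate: datetime64[ns]
)
result_bytes = n_rows * bytes_per_row
print(result_bytes / 1024**2)  # roughly 376 MiB

# Fits in the 400 MiB buffer, but not in the 100 MiB default:
print(result_bytes < 1024**2 * 400)  # True
print(result_bytes > 1024**2 * 100)  # True
```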

Stats-dump 2

TileDB Embedded Version: (2, 1, 3)
TileDB-Py Version: 0.7.4
==== READ ====

  • Number of read queries: 1

  • Number of attempts until results are found: 1

  • Number of attributes read: 5

    • Number of fixed-sized attributes read: 5
  • Number of dimensions read: 1

    • Number of fixed-sized dimensions read: 1
  • Number of logical tiles overlapping the query: 90

  • Number of physical tiles read: 540

    • Number of physical fixed-sized tiles read: 540
  • Number of cells read: 9634194

  • Number of result cells: 9624052

  • Percentage of useful cells read: 99.8947%

  • Number of bytes read: 117662004 bytes (0.109581 GB)

  • Number of read operations: 10099

  • Number of bytes unfiltered: 472129351 bytes (0.439705 GB)

  • Unfiltering inflation factor: 4.01259x

  • Time to compute estimated result size: 0.00702653 secs

    • Time to compute tile overlap: 0.00654431 secs

      Time to compute relevant fragments: 0.000313903 secs
      Time to load relevant fragment R-trees: 0.00271106 secs
      Time to compute relevant fragment tile overlap: 0.0034636 secs

  • Time to open array: 0.0323722 secs

    • Time to load array schema: 0.000369341 secs
    • Time to load consolidated fragment metadata: 2.077e-06 secs
    • Time to load fragment metadata: 0.00840535 secs
  • Total metadata read: 230241 bytes (0.000214429 GB)

    • Array schema: 413 bytes (3.84636e-07 GB)
    • Fragment metadata: 176396 bytes (0.000164282 GB)
    • R-tree: 4320 bytes (4.02331e-06 GB)
    • Fixed-sized tile offsets: 49112 bytes (4.57391e-05 GB)
  • Time to load array metadata: 7.5414e-05 secs

  • Time to initialize the read state: 0.000172829 secs

  • Read time: 3.3564 secs

    • Time to compute next partition: 0.000490527 secs
    • Time to compute result coordinates: 2.98954 secs

      Time to compute sparse result tiles: 0.000242224 secs
      Time to read coordinate tiles: 0.00611595 secs
      Time to unfilter coordinate tiles: 0.0957798 secs
      Time to compute range result coordinates: 2.38976 secs

    • Time to compute sparse result cell slabs: 0.026101 secs
    • Time to copy result attribute values: 0.319114 secs

      Time to read attribute tiles: 0.0188366 secs
      Time to unfilter attribute tiles: 0.203411 secs
      Time to copy fixed-sized attribute values: 0.0476533 secs

    • Time to copy result coordinates: 0.0125715 secs

      Time to copy fixed-sized coordinates: 0.0122299 secs

  • Total read query time (array open + init state + read): 3.35657 secs
    ==== Python Stats ====

  • TileDB-Py Indexing Time: 3.46947

    • TileDB-Py query execution time: 3.36533

      TileDB C++ Core initial query submit time: 3.35662

    • TileDB-Py buffer conversion time: 0.101548

Anyway, great work and looking forward to future improvements!


@Mtrl_Scientist

Excellent!

By using the parameters you provided, I could reduce the read time in my benchmark (Benchmarking tiledb read performance) from ~16 s to ~8 s, now only slower than Parquet by a very small margin.

Thanks for the tip!


Great to know folks, thanks for the update! More optimizations and tutorials are coming up.

Stavros
