A few questions regarding efficient writes

Getting familiar with TileDB - I am at the point where most things are working, but I’m trying to figure out if I can do things more efficiently / faster.

The current pain point is write speed. I am using the python interface, and I am writing to the following schema:

dom = tiledb.Domain(
    tiledb.Dim(name="positions", domain=(0, total_size), tile=10000, dtype=np.uint32),
    tiledb.Dim(name="samples", domain=(0, len(signal_files)), tile=10, dtype=np.uint32),
)

filters = [tiledb.LZ4Filter()]

schema = tiledb.ArraySchema(
    domain=dom,
    sparse=False,
    cell_order='C',
    tile_order='C',
    attrs=[tiledb.Attr(name="value", dtype=np.float32, filters=filters)],
)

The total array is 3 billion positions (human genome) for dim1, and an expected 100-1000 for dim2 (samples).
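For scale, here's a quick back-of-envelope on the raw (uncompressed) footprint — just a rough sketch assuming float32 cells throughout:

```python
# Rough raw size of the dense array at the expected sample counts.
positions = 3_000_000_000   # dim1: ~3 billion positions (human genome)
cell_bytes = 4              # float32 "value" attribute

for samples in (100, 1000):  # expected range for dim2
    total_tb = positions * samples * cell_bytes / 1e12
    print(f"{samples:>4} samples: {total_tb:.1f} TB raw")
```

That's 1.2 TB raw at the low end and 12 TB at the high end, before compression.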

The data itself is dense - signal of various assays along the chromosomes.

I’ve read most of the documentation (which is very nice, though the topic is relatively complex, with tiles and making sense of the global order of cells) - does the above look optimal? Typical reads query a few regions (each around 500 cells long) along dim1, retrieving all samples from dim2 at once in most cases.

I profiled the writes and most of the time is spent inside TileDB - I am sending data in row-major order (the numpy default), which should fit the schema exactly, and I’m writing multiple exact tiles at once (again in schema order).

For the writing itself I am doing:

A[start_dim1:stop_dim1, start_dim2:stop_dim2] = data

The writing is very slow (~20 hours for the full thing, which ends up being around 50 GB compressed) and I’m wondering if there’s a way to speed it up - it’s only using one of my cores, and activity on the hard drive only spikes once every 30 s for a write. I don’t really know what it’s doing in the meantime - with or without compression filters the time is mostly the same, so it’s not that either. I’m hypothesizing that it’s doing a lot of internal reordering to fit global order.

Something I observed is that if I try to make even smaller tiles (1000 for example) things get even slower.

The docs say that the fastest writing is to provide a buffer that’s directly in global order, but I don’t think the Python interface has this option anywhere.

Tips appreciated :)

Yeah, so the tile size on dim1 apparently has a big impact - making it smaller (1000 instead of 10000) hurts write speed, while making it larger (100000) makes writes much faster.

I guess this will, however, affect read efficiency negatively - the target intervals on that axis are 100-500 cells long.
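To put rough numbers on that worry (just a sketch; assumes float32 cells, the dim2 tile extent of 10 from the schema, and 1000 samples): a dense read has to decompress every tile it intersects in full, so the minimum data touched by a 500-cell query grows with the dim1 tile extent:

```python
import math

def bytes_decompressed(query_len, tile1, tile2, n_samples, cell_bytes=4):
    """Rough lower bound on data decompressed by one query reading
    `query_len` cells along dim1 and all `n_samples` along dim2.
    The query touches at least ceil(query_len / tile1) tiles along dim1
    (one more in the worst case if it straddles a tile boundary) and
    every tile row along dim2; each tile is decompressed whole."""
    tiles1 = math.ceil(query_len / tile1)
    tiles2 = math.ceil(n_samples / tile2)
    tile_bytes = tile1 * tile2 * cell_bytes
    return tiles1 * tiles2 * tile_bytes

for extent in (1000, 10000, 100000):
    mb = bytes_decompressed(500, extent, 10, 1000) / 1e6
    print(f"dim1 extent {extent:>6}: ~{mb:,.0f} MB decompressed per 500-cell query")
```

So at extent 100000 every small query would drag in roughly 100x more data than at extent 1000.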

So large (100000-cell) tiles write much faster than small (1000-cell) ones - 12 seconds vs 190 seconds for the same amount of data (sending 500 MB worth of tiles at once).
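If I'm counting right, that difference roughly tracks how many tiles each 500 MB batch decomposes into (assuming float32 cells and the dim2 tile extent of 10 from the schema):

```python
# How many tiles does a 500 MB write batch decompose into at each dim1 extent?
batch_bytes = 500 * 1024**2
for extent in (1000, 100000):
    tile_bytes = extent * 10 * 4   # dim1 extent x dim2 extent 10 x float32
    n_tiles = batch_bytes // tile_bytes
    print(f"extent {extent:>6}: {tile_bytes:>9,} B/tile -> ~{n_tiles:,} tiles per batch")
```

That's roughly 13,000 tiles per batch at extent 1000 versus ~130 at extent 100000 - two orders of magnitude more tiles for the same data, which lines up with the ~13,000 write operations in the stats below.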

Here’s the output of the performance logs for 1 write using a tile size of 1000 (my actual target, but terrible write performance).

==== WRITE ====

  • Number of write queries: 1
  • Number of attributes written: 1
    • Number of fixed-sized attributes written: 1
  • Number of bytes written: 6511200 bytes (0.00606403 GB)
  • Number of write operations: 13027
  • Number of bytes filtered: 521248104 bytes (0.48545 GB)
  • Filtering deflation factor: 80.0541x
  • Total metadata written: 38354 bytes (3.57199e-05 GB)
    • Fragment metadata footer: 206 bytes (1.91852e-07 GB)
    • R-tree: 99 bytes (9.22009e-08 GB)
    • Fixed-sized tile offsets: 33113 bytes (3.08389e-05 GB)
    • Var-sized tile offsets: 2468 bytes (2.2985e-06 GB)
    • Var-sized tile sizes: 2468 bytes (2.2985e-06 GB)
  • Write time: 179.897 secs
    • Time to split the coordinates buffer: 6e-07 secs
    • Time to check out-of-bounds coordinates: 5e-07 secs
    • Time to initialize dense cell range iterators: 0.0028157 secs
    • Time to compute cell ranges: 1.86062 secs
    • Time to prepare and filter tiles: 1.14174 secs
    • Time to write tiles: 176.45 secs
    • Time to write fragment metadata: 0.418166 secs
  • Time to finalize write query: 6e-07 secs

All the time is spent ‘writing tiles’ - 6511200 bytes worth after compression. How is this possible?
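If I'm reading the stats right, the unfiltered byte count works out to almost exactly one write operation per tile (with tile extents 1000 x 10 and float32 cells):

```python
# Cross-checking the stats dump above.
bytes_filtered = 521_248_104   # "Number of bytes filtered" (pre-compression)
write_ops = 13_027             # "Number of write operations"
tile_bytes = 1000 * 10 * 4     # raw bytes in one 1000 x 10 float32 tile

print(f"~{bytes_filtered / tile_bytes:.0f} tiles filtered vs {write_ops} write ops")
print(f"~{bytes_filtered / write_ops:.0f} raw bytes per op, i.e. ~one tile each")
```

So each tile seems to cost its own write operation even though the compressed output is only ~6.5 MB in total, which would explain why small tiles are so much slower.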

Thanks Michael! We are looking into this, we’ll get back with an update very soon.

Thanks - same behavior on a slow HD and a fast SSD - same time as well. I have the impression that TileDB is doing some heavy reordering of the numpy arrays that I send, and maybe if I could just provide cells in global order it would help, but there doesn’t seem to be a way to do that with the Python interface.

I wanted to make sure I was writing along tiles - however, there are only statistics for tiles you read, not for writes (more write statistics would be very useful!).

So instead I used the read statistics afterwards, using the same intervals I’m using for writing, and can confirm that I’m hitting tiles exactly when writing.

However, I noted something that might be odd - I tried writing literally 1 tile (1000 x 10; I’m hitting only 1 tile, confirmed by the above method) and the write statistics report 28 ‘write operations’ to write 5873 bytes (after compression). I guess a few of those are metadata related.

Yeah, this seems like a bug affecting performance; it is next on our list. We’ll definitely enhance the write stats as well.

Good, let me know if there’s anything you need!

For now it’d be helpful to know total_size and len(signal_files) so that we can create the same 2D array (its contents probably don’t matter right now for debugging performance in this use case).

3.2 billion cells for total size, 130 for signal_files at the moment, but would go bigger.

I tried having long tiles (not chunking dim2), but it’s the same so far.

No worries, we’ll try various configurations. We’ll post an update soon. Thanks again for bringing up the issue.