Seeking Advice to Optimize Writing Speed for Large Int16_t Arrays

Hello TileDB Community,

I’m working on a project where I need to write two dense arrays with dimensions 19995 x 9216 of type int16_t in under 0.07 seconds. Unfortunately, my current performance does not meet this target. Below are the details of my setup and attempts:

  • Arrays : I have two 1D arrays (data1 and data2), each with a length of 184273920 elements (equivalent to 19995 x 9216).
  • Frames : I process a total of 8 frames; after processing each frame, I receive new data1 and data2 buffers.
  • Dimensions : I’ve set up the dimensions as follows:
    • const auto dimFrame = tiledb::Dimension::create<int32_t>(ctx, "frame", {1, totalFrames}, 1);
    • const auto dimHeight = tiledb::Dimension::create<int32_t>(ctx, "height", {1, height}, height);
    • const auto dimWidth = tiledb::Dimension::create<int32_t>(ctx, "width", {1, width}, width);
  • Attributes : I’ve set up the array schema as follows:
    • tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
    • schema.set_domain(domain)
    • .set_cell_order(TILEDB_ROW_MAJOR)
    • .set_tile_order(TILEDB_ROW_MAJOR)
    • .add_attribute(tiledb::Attribute::create<int16_t>(ctx, "data1"))
    • .add_attribute(tiledb::Attribute::create<int16_t>(ctx, "data2"));
  • Write Frame : I’ve set up the per-frame write as follows:
    • const tiledb::Context ctx;
    • tiledb::Array array(ctx, arrayName, TILEDB_WRITE);
    • tiledb::Query query(ctx, array);
    • query.set_layout(TILEDB_ROW_MAJOR);
    • query.set_subarray({frameNumber, frameNumber, 1, height, 1, width});
    • query.set_buffer("data1", const_cast<int16_t*>(data1), width * height);
    • query.set_buffer("data2", const_cast<int16_t*>(data2), width * height);
    • query.submit();
    • array.close();

Here are the methods I’ve tried and their results:

  • Per Frame Writing : Writing after processing each frame took 0.9 – 1.1 seconds per frame.
  • Batch Writing : Writing all 8 frames at once took about 13 – 15 seconds.
  • Parallel Batch Writing : Writing in batch parallel (8 frames, 8 threads) took about 3 – 5 seconds.
  • Tile Sizes : Experimenting with different tile sizes showed that 465 x 512 offers the best performance, but it’s still not sufficient.
  • Configuration Tweaks : Adjusting the config parameters (sm.num_threads, sm.io_concurrency_level) had no significant impact on writing speed.
  • Compression Filter : Applying a compression filter before writing slowed down the process even more.
  • Async Writing : Attempting asynchronous writing (query.submit_async()) resulted in an excessively long completion time for even a single frame.
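
Since each frame's data only becomes available after processing, one general pattern that might help, independent of TileDB, is double buffering: write frame N on a background thread while frame N+1 is being processed, so write latency hides behind compute. A minimal sketch, where `runDoubleBuffered` and `writeFrame` are hypothetical names and `writeFrame` stands in for the actual per-frame TileDB submit:

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Double-buffering sketch: while a background thread writes buffer (f-1)%2,
// the producer fills buffer f%2. writeFrame is a placeholder for the real
// per-frame write. Returns the number of frames handed to writeFrame.
int runDoubleBuffered(int totalFrames, std::size_t frameElems,
                      const std::function<void(const std::vector<int16_t>&)>& writeFrame) {
    std::array<std::vector<int16_t>, 2> buffers{
        std::vector<int16_t>(frameElems), std::vector<int16_t>(frameElems)};
    std::thread writer;
    int written = 0;
    for (int f = 0; f < totalFrames; ++f) {
        std::vector<int16_t>& buf = buffers[f % 2];
        // The pending writer (if any) is flushing the *other* buffer, so this
        // one can be filled while that write is still in flight.
        std::fill(buf.begin(), buf.end(), static_cast<int16_t>(f));  // stand-in for frame processing
        if (writer.joinable()) writer.join();  // keep at most one write in flight
        writer = std::thread([&buf, &writeFrame, &written] {
            writeFrame(buf);
            ++written;  // safe: only one writer thread exists at a time
        });
    }
    if (writer.joinable()) writer.join();
    return written;
}
```

This would not raise peak throughput, but if processing a frame takes comparable time to writing one, it roughly halves the end-to-end wall clock versus strictly alternating process-then-write.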

It appears that data write speed is too slow for my application. Any suggestions on how to improve the write speed would be most welcome.

Thank you in advance for your help!

Hi @juhwan,

I believe the target throughput here is ~5GB/s? We can look at some optimization possibilities, but I have a few questions to clarify:

  • is this target a sustained rate? for how long?
  • can you buffer writes – if so, how much?
  • is the target hardware capable of this rate (i.e., testing raw parallelized writes or copies)?
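
For reference, a quick back-of-envelope of the rate implied by the post's numbers (19995 x 9216 int16_t cells per array, 0.07 s budget); my arithmetic, worth double-checking:

```cpp
#include <cstdint>

// Back-of-envelope check of the required write rate, using the figures
// from the post: 19995 x 9216 cells of int16_t per array, 0.07 s budget.
constexpr long long kHeight = 19995;
constexpr long long kWidth  = 9216;

constexpr long long bytesPerArray() {
    return kHeight * kWidth * static_cast<long long>(sizeof(int16_t));  // 368,547,840 B (~368.5 MB)
}

constexpr double requiredGBps(int numArrays, double budgetSeconds = 0.07) {
    return numArrays * static_cast<double>(bytesPerArray()) / budgetSeconds / 1e9;
}

// requiredGBps(1) ≈ 5.26 GB/s  -- matches the ~5 GB/s figure, per array
// requiredGBps(2) ≈ 10.53 GB/s -- if the 0.07 s budget must cover both arrays
```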

(if you want to discuss directly, please email and reference this thread)


Thank you for looking into the question. I will take a look and come up with more data and answers to your questions.