Integrating with a Flask/Dash application (Python)

Dave_L · April 21, 2023, 2:23pm

If I want to use TileDB as my primary database and have a Dash (Plotly) application visualize the data stored in the array, what is the best way to handle the following?
1.) Is there an existing callback in TileDB that indicates when new samples or changes are written to a TileDB array?
2.) Making the array globally available to the Dash app when the app’s layout and callback components are organized in separate Python modules. Should I have another child process that accepts requests for the TileDB array (i.e., a sidecar or server of sorts)? What is the best way to do this while minimizing network calls?
3.) My array is currently small but could grow over time. What is the best way to avoid loading too many samples from the array at once, so that the application doesn’t consume a lot of memory?

ihnorton · April 26, 2023, 12:26pm

Hi @Dave_L,

1.) No, this would need to be implemented at the application layer.
2.) I’m not quite sure I follow this one; as a rule of thumb, running in-process will avoid the overhead of transfer/serialization between processes, but this can be mitigated with shared-memory data structures.
3.) Running fragment metadata consolidation will reduce array open latency; running fragment consolidation can help reduce I/O by providing better data locality.
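
For the first point, one way to do this at the application layer is to poll the array's fragment info and compare it with what was seen last time. The sketch below is only an illustration: `latest_fragment_timestamp` and `ArrayWatcher` are hypothetical names, and it assumes that TileDB-Py's `tiledb.array_fragments()` is available and that every write produces a fragment with a newer end timestamp.

```python
from typing import Callable, Optional


def latest_fragment_timestamp(uri: str) -> Optional[int]:
    """Return the end timestamp of the newest fragment, or None if there are none."""
    # Imported lazily so the watcher below can be exercised without TileDB installed.
    import tiledb

    fragments = tiledb.array_fragments(uri)
    ends = [end for _start, end in fragments.timestamp_range]
    return max(ends) if ends else None


class ArrayWatcher:
    """Hypothetical helper: remembers the last stamp seen and reports changes."""

    def __init__(
        self,
        uri: str,
        get_stamp: Callable[[str], Optional[int]] = latest_fragment_timestamp,
    ):
        self.uri = uri
        self._get_stamp = get_stamp
        # Record the state at start-up so the first check only reports later writes.
        self._last = get_stamp(uri)

    def has_changed(self) -> bool:
        """Return True if the stamp differs from the one seen on the previous check."""
        current = self._get_stamp(self.uri)
        changed = current != self._last
        self._last = current
        return changed
```

In a Dash app, `has_changed()` could be called from a callback driven by a `dcc.Interval` component, so that the figure is only rebuilt when the array has actually been written to.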

Hope that helps,
Isaiah

Dave_L · May 1, 2023, 1:39pm

Hi @ihnorton,
Thank you for the reply, and apologies for the poorly worded question. For number 2, I was trying to say that I would like to minimize network calls when the app fetches data. Currently, TileDB sits on the same EC2 server that runs the web app, so I guess there are no network calls at all. However, in the future I may separate these services (i.e., put the TileDB array on S3) and was wondering what the best approach would be (won’t reads from the S3 URI be quite slow?). From what you wrote, are you saying that I should wrap the whole Dash app in with tiledb.open()? Sorry if I misunderstand; web dev is not something I do every day.

Thank you again,

Dave

ihnorton · May 2, 2023, 11:53am

Hi @Dave_L,

This will be influenced by your array layout and write/query patterns, but consolidation can help to minimize network requests. Fragment metadata consolidation will reduce the number of listing operations; fragment (data) consolidation can help to reduce the number of I/O operations for reads.
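
As a rough sketch of the two consolidation steps mentioned above, something like the following could be run as a periodic maintenance job. The helper names `consolidation_config` and `consolidate_and_vacuum` are hypothetical, and the sketch assumes the TileDB-Py calls `tiledb.consolidate()` and `tiledb.vacuum()` together with the `sm.consolidation.mode` and `sm.vacuum.mode` config keys; how often to run it depends on your write pattern.

```python
def consolidation_config(mode: str) -> dict:
    """Build the config entries for one consolidation mode (hypothetical helper)."""
    allowed = {"fragment_meta", "fragments"}
    if mode not in allowed:
        raise ValueError(f"mode must be one of {sorted(allowed)}, got {mode!r}")
    # Use the same mode for vacuuming so the superseded files are cleaned up afterwards.
    return {"sm.consolidation.mode": mode, "sm.vacuum.mode": mode}


def consolidate_and_vacuum(uri: str, mode: str) -> None:
    """Consolidate the array at `uri` in the given mode, then vacuum."""
    import tiledb  # imported lazily; requires the tiledb package

    cfg = tiledb.Config(consolidation_config(mode))
    tiledb.consolidate(uri, config=cfg)
    tiledb.vacuum(uri, config=cfg)


def maintain(uri: str) -> None:
    """Run fragment metadata consolidation first, then fragment (data) consolidation."""
    for mode in ("fragment_meta", "fragments"):
        consolidate_and_vacuum(uri, mode)
```

Calling `maintain("s3://...")` would work the same way as for a local path, provided the S3 credentials are configured; the bucket path here is a placeholder.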

Best,
Isaiah
