Dask Dataframe Integration

Hi TileDB community, has any work been done on integrating Dask DataFrames for lazy computation? I saw in another forum that it was a backlog item, and I'm curious where it stands at the moment. I would be happy to test :slight_smile:


Hi @mronda, no - but I’m curious what use-case(s) you have in mind?

Hi @ihnorton, is this something on the horizon?

I work with large tabular data and have been using Dask DataFrames to load it lazily. I also run data science workloads on these Dask DataFrames across large clusters. My use case would be to store sparse dataframes in TileDB and then lazily load them as Dask DataFrames (ddfs). Is there a way I can do this at the moment?

Hi @mronda,

Thanks for reaching out about this. It’s definitely on our radar, but unfortunately we are not actively working on it at the moment. We are prioritizing work on distributed dataframe computations using our serverless task graphs on TileDB Cloud. We hope to revisit tighter integration with Dask DataFrames in 2023. Cheers!
