TileDB looks extremely appealing to me except for one glaring issue: people have to learn a relatively complex system to get at their data. To me that is an unacceptably high barrier to entry, and a red flag that the data might become hard to recover in the future. Are there any plans for functionality where a tiledb file could be mounted in /etc/fstab or with FUSE, so that the data is accessible as directories of read-only delimited text files and/or some other standard human-readable serialization format with wide parsing support across software platforms?
Thanks for the comment. We aim to make TileDB accessible to a wide range of users and use-cases by building integrations with as many languages and libraries as possible. The postcard version of using TileDB from Python is as simple as
    pip install tiledb

and then:

    import numpy as np
    import tiledb

    a = np.array(...)
    tiledb.from_numpy("/path/array.tiledb", a)

    with tiledb.open("/path/array.tiledb") as T:
        print(T[:])  # read and print the array
Using TileDB from R looks very similar. From there, moving data to the cloud can be done with a change of target URL, for example s3://my-bucket/my-array to write to AWS S3 (we also support Azure Blob Storage and Google Cloud Storage natively).
Regarding data longevity, there are a few steps we’ve taken to address this:
(1) the TileDB Embedded library is open source under the MIT License, as are all of the language integrations.
(2) the TileDB format is versioned, and we publish a format specification in the format_spec directory (dev branch) of the TileDB-Inc/TileDB repository on GitHub.
(3) starting from TileDB 1.4 we maintain tested backward compatibility for reading (we test reading of arrays written with all previous versions before release).
I suspect a FUSE<>txt interface is unlikely in the near future, but we do have many additional integrations as well as improvements to the TileDB CLI tooling in our roadmap.