There's a section in the docs that says TileDB uses checksums to verify data integrity.
Can you give some high-level details about how these are used?
I'm trying to hash array contents to check two arrays for equality. Is it possible to get the SHA-256 hash of an array? I could not find anything in the docs or the API reference.
Encryption support is for the security of data at rest. It is applied during the write process and isn't exposed at the API level.
Checksums are intended to verify data integrity on read. They are applied as part of the "filter pipeline" that TileDB runs on each tile before it is written, similar to a compression step. The checksum is calculated for each tile and stored alongside the tile data during the write. On read, the checksum is recomputed for each tile and compared against the stored value, and an error is raised if they do not match. Because the checksum is applied at the tile level, the stored hash depends on the array's cell order (row-major, column-major, or Hilbert). For sparse arrays it also depends on the dimension coordinates, because cells are reordered into the TileDB global order before tiles are written. So checksums are not really designed for array equality comparison, and they are not user-visible at this time.
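To make the layout dependence concrete, here is a minimal sketch using only the Python standard library. It is a toy model, not TileDB's implementation: `tile_checksums` and `content_hash` are hypothetical helpers, the "tiles" are just fixed-size runs of cells, and values are assumed to be 64-bit integers. It shows that per-tile hashes change when the same logical data is laid out in a different order, while a hash computed over cells sorted by coordinate does not. For real arrays, the same idea would apply after reading both arrays into memory and feeding the cells to the hash in one agreed order.

```python
import hashlib
import struct


def tile_checksums(cells, tile_size):
    """SHA-256 of each fixed-size run of cell values, in the order given.

    A stand-in for per-tile checksums: the result depends on how cells
    are ordered and grouped, not only on what the array contains.
    """
    digests = []
    for start in range(0, len(cells), tile_size):
        tile = cells[start:start + tile_size]
        payload = b"".join(struct.pack("<q", value) for _, value in tile)
        digests.append(hashlib.sha256(payload).hexdigest())
    return digests


def content_hash(cells):
    """SHA-256 over (row, col, value) triples sorted by coordinate.

    Sorting first gives a canonical order, so the digest only depends
    on the logical contents of the array.
    """
    h = hashlib.sha256()
    for (row, col), value in sorted(cells):
        h.update(struct.pack("<qqq", row, col, value))
    return h.hexdigest()


# The same 2x2 logical array, with cells listed in two different orders.
row_major = [((0, 0), 1), ((0, 1), 2), ((1, 0), 3), ((1, 1), 4)]
col_major = [((0, 0), 1), ((1, 0), 3), ((0, 1), 2), ((1, 1), 4)]

# Per-tile hashes differ because each tile holds different cells.
layout_dependent = tile_checksums(row_major, 2) != tile_checksums(col_major, 2)

# The canonical content hash is identical for both layouts.
layout_independent = content_hash(row_major) == content_hash(col_major)
```

For an equality check, the canonical hash is the one to compare. It costs a full read of both arrays, but it does not depend on tile extents, cell order, or the filters configured on either array.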