Hi,
As mentioned in other threads, the documentation is a bit lacking when it comes to using Azure and GCS as storage backends.
While waiting for such documentation to be created, I'd like to share the problem I'm facing, in the hope of a swift fix.
Problem: Reading from and writing to an array on Azure Blob Storage is not working.
I am using the following simple script (account secret and SAS token redacted) to try to create a simple array:
import tiledb
import numpy as np

def create_array(x_dim, y_dim, array_name, tiling=1):
    print("Creating array")
    time_dim = tiledb.Dim(name="time", domain=(0, x_dim), tile=x_dim, dtype=np.int32)
    attribute_dim = tiledb.Dim(name="attribute", domain=(0, y_dim), tile=tiling, dtype=np.int32)
    dom = tiledb.Domain(time_dim, attribute_dim)
    attr = tiledb.Attr(name="value", dtype=np.int32)
    schema = tiledb.ArraySchema(domain=dom, sparse=False, attrs=[attr], tile_order='col-major')
    tiledb.Array.create(array_name, schema)

# Set up config
config = tiledb.Config()
config["vfs.azure.use_https"] = "true"
config["vfs.azure.storage_account_name"] = "tiledbtest"
config["vfs.azure.storage_account_key"] = "<my-account-secret>"
config["vfs.azure.storage_sas_token"] = "https://tiledbtest.blob.core.windows.net/25504f45-1975-4294-86d6-f90f7cde738f?sp=racwdli&st=2022-09-27T08:53:11Z&se=2022-09-28T16:53:11Z&spr=https&sv=2021-06-08&sr=c&sig=<sas-token-signature>"

# Define a TileDB context
ctx = tiledb.Ctx(config=config)
storage_acc = "tiledbtest"
container = "25504f45-1975-4294-86d6-f90f7cde738f"
azure_uri_manual = f"azure://{storage_acc}.blob.core.windows.net/{container}/"

# Create an array
x, y = 2, 1
create_array(x, y, azure_uri_manual)
It seems to time out (during blob reading) and then fails with this error message:
  File "/workspace/tiledb/tiledb_azure.py", line 12, in create_array
    tiledb.Array.create(array_name, schema)
  File "tiledb/libtiledb.pyx", line 3552, in tiledb.libtiledb.Array.create
  File "tiledb/libtiledb.pyx", line 575, in tiledb.libtiledb._raise_ctx_err
  File "tiledb/libtiledb.pyx", line 560, in tiledb.libtiledb._raise_tiledb_error
tiledb.cc.TileDBError: [TileDB::Azure] Error: List blobs failed on: azure://tiledbtest.blob.core.windows.net/25504f45-1975-4294-86d6-f90f7cde738f/__schema/
I have verified that both the secret and the SAS token can be used to list and create blobs using Azure's Python API, independently of TileDB.
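For reference, here is how the SAS URL I am passing breaks down into its parts. This is only a minimal sketch using the standard library; `split_sas_url` is a hypothetical helper name and the signature is replaced by a placeholder. I have not confirmed whether `vfs.azure.storage_sas_token` expects the full URL or only the query-string part, so treat the split as an assumption worth checking rather than a fix.

```python
from urllib.parse import urlsplit, parse_qs


def split_sas_url(sas_url):
    """Split a container-level SAS URL into its parts (hypothetical helper)."""
    parts = urlsplit(sas_url)
    token = parts.query  # everything after the "?"
    fields = parse_qs(token)
    return {
        "endpoint": f"{parts.scheme}://{parts.netloc}",
        "account": parts.netloc.split(".")[0],
        "container": parts.path.strip("/").split("/")[0],
        "token": token,
        "permissions": fields.get("sp", [None])[0],
        "expiry": fields.get("se", [None])[0],
    }


# Same shape as the URL in the script above; the signature is a placeholder.
example = split_sas_url(
    "https://tiledbtest.blob.core.windows.net/25504f45-1975-4294-86d6-f90f7cde738f"
    "?sp=racwdli&st=2022-09-27T08:53:11Z&se=2022-09-28T16:53:11Z"
    "&spr=https&sv=2021-06-08&sr=c&sig=REDACTED"
)
for key, value in example.items():
    print(f"{key}: {value}")
```

The `expiry` field is also a quick way to rule out an expired token before digging further.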
I have three questions:
- Do you have a working example of using Azure Blob Storage as a backend with SAS tokens?
- Is there additional verbosity I can toggle to help with troubleshooting?
- Is there a flag to control whether access to Azure uses the account key or the SAS token? Or does it simply try the SAS token if present, and otherwise the account key?
Things that might be clues:
- TileDB does not seem to be using the credentials properly, or is attempting to access an endpoint that does not exist: passing an invalid combination of storage account and credentials does not produce an immediate 4XX response, it simply times out in the same manner.
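To tell "the service rejected the request" apart from "no HTTP response at all" outside of TileDB, a probe along these lines could be used. It is a rough sketch with the standard library only; `probe` and its `opener` parameter are names I made up for illustration, and it says nothing about which requests TileDB itself sends.

```python
import socket
import urllib.error
import urllib.request


def probe(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return a short label describing how a request to `url` ends."""
    try:
        with opener(url, timeout=timeout) as response:
            return f"HTTP {response.status}"
    except urllib.error.HTTPError as err:
        # The service answered, but with an error status (e.g. 403 or 404).
        return f"HTTP {err.code}"
    except (urllib.error.URLError, socket.timeout, TimeoutError) as err:
        # DNS failure, refused connection, or timeout: no HTTP status at all.
        return f"no HTTP response ({err})"
```

Calling `probe("https://tiledbtest.blob.core.windows.net/<container>?restype=container&comp=list")` without credentials should come back with an HTTP status if the endpoint is reachable, whereas a wrong host name would end up in the "no HTTP response" branch.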
Let me know if you require additional information / debug traces.
Best Regards,
David