Failed to Load Dataset - Unknown Error

I encountered an issue while attempting to load a dataset in TileDB. The error message I received is as follows:

{
    "detail": "Failed to load dataset: Unknown error: Wrapping {1925|1-154585209-C-T_ANY|ADAR|103|ANY|0.8390|0.0003|0.4829|0.0000|0.0000|1|0|splice_region_variant|ADAR:ENST00000368474.9:c.3443+8G>A:p.?|BENIGN|BA1|ORPHA:51|\"Aicardi-Gouti�res_syndrome\"} failed"
}

Hello, please provide more information about how you reached this error. There’s no clear connection to anything we can debug from this message.

Hi Mahesh,
My first instinct is to look with suspicion at the Unicode character è in Aicardi-Goutières_syndrome. Normally that would not cause a problem during ingestion (I tried it myself), but maybe something else is going on there. As an experiment, can you try either of the following:

  1. removing that particular variant, 1-154585209-C-T, from the BCF file
  2. changing Aicardi-Goutières_syndrome to “AG_syndrome” or something else that is plain ASCII
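
For the second option, a rough sketch of how the offending characters could be located and folded to plain ASCII is below. It assumes the BCF has first been dumped to VCF text (for example with `bcftools view`) and the lines read into Python. The helper names `find_non_ascii` and `ascii_fold` are hypothetical and are not part of TileDB-VCF.

```python
import unicodedata


def find_non_ascii(lines):
    """Return (line_number, offending_characters) for each line with non-ASCII text."""
    hits = []
    for number, line in enumerate(lines, start=1):
        bad = sorted({ch for ch in line if ord(ch) > 127})
        if bad:
            hits.append((number, bad))
    return hits


def ascii_fold(text):
    """Strip accents (è becomes e) and drop any other character that is not ASCII."""
    # NFKD splits an accented letter into its base letter plus a combining mark;
    # encoding with errors="ignore" then discards the mark and anything else non-ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    # Hypothetical annotation text modeled on the record in the error message.
    sample = ['ORPHA:51|"Aicardi-Goutières_syndrome"', "ADAR|103|ANY"]
    print(find_non_ascii(sample))
    print(ascii_fold(sample[0]))
```

Note that if the file already contains a replacement character (�) instead of è, folding simply drops it, so the original letter cannot be recovered this way.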

Thanks,
Jeremy

I am trying to create a TileDB dataset using the following code:

import tiledbvcf # type: ignore
import shutil

# Define the URI for the TileDB-VCF dataset
uri = ".../tiledbvcf/tiledb/assets/my_vcf_dataset"

# Paths to the individual VCF files
sample = [".../tiledbvcf/tiledb/assets/assets/40506300907A.bcf"]

# Create the dataset
ds = tiledbvcf.Dataset(uri, mode='w')
try:
    shutil.rmtree(uri)
    print("Deleted existing array")
except FileNotFoundError:
    pass

ds.create_dataset()

# Ingest the samples
ds.ingest_samples(sample_uris=sample)
ds = tiledbvcf.Dataset(uri, mode='r')
print(ds.samples())

For some files, the dataset is created successfully. However, when using this new file, I encounter the error mentioned above.

Could you provide more details on how to address this issue or where I might need to look to debug the problem further?

Hi Mahesh,
Let me know if you were able to run that experiment.

Thanks,
Jeremy