Error registering samples; a file has more than 1 sample. Ingestion from cVCF is not supported

Hi
I’m trying to ingest vcf.gz files using the code below, and I get a "not supported" error.

Code:

import tempfile

small_ds.ingest_samples(
    ['./chr11-prefix.vcf.gz',
     './chr1-prefix.vcf.gz'],
    scratch_space_path=tempfile.gettempdir(),
    scratch_space_size=10
)

Error Messages:

RuntimeError Traceback (most recent call last)
in ()
4 local_vcfs,
5 scratch_space_path = tempfile.gettempdir(),
----> 6 scratch_space_size=10
7 )

/home/ec2-user/SageMaker/tileDB-new/tiledbvcf/lib/python3.7/site-packages/tiledbvcf/dataset.py in ingest_samples(self, sample_uris, extra_attrs, checksum_type, allow_duplicates, scratch_space_path, scratch_space_size)
212 # Create is a no-op if the dataset already exists.
213 self.writer.create_dataset()
--> 214 self.writer.register_samples()
215 self.writer.ingest_samples()
216

RuntimeError: TileDB-VCF exception: Error registering samples; a file has more than 1 sample. Ingestion from cVCF is not supported.

Thanks

Thanks for posting. The error means that at least one of your input files contains more than one sample. TileDB-VCF does not currently support direct ingestion of multi-sample (combined) VCF files, although it’s on our roadmap. If you don’t have access to the original single-sample VCF files, bcftools provides a plugin for splitting a VCF file by sample: bcftools +split.
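
As a rough sketch of that workaround (not an official TileDB-VCF recipe), the helpers below first count the sample columns in a file's #CHROM header line, then call bcftools +split and index each output. The function names are made up for illustration. The sketch assumes bcftools is on your PATH with its plugins available, and that the plugin names its outputs after the samples with a .vcf.gz suffix when -Oz is used, so please check the results on your own data.

```python
import gzip
import subprocess
from pathlib import Path


def count_samples(vcf_gz_path):
    """Count sample columns in the #CHROM header line of a gzipped VCF."""
    with gzip.open(vcf_gz_path, "rt") as handle:
        for line in handle:
            if line.startswith("#CHROM"):
                columns = line.rstrip("\n").split("\t")
                # CHROM..INFO are 8 fixed columns, FORMAT is the 9th,
                # and every column after that is one sample.
                return max(len(columns) - 9, 0)
            if not line.startswith("#"):
                break  # reached the records without seeing a header line
    raise ValueError(f"no #CHROM header line found in {vcf_gz_path}")


def build_split_command(vcf_path, out_dir):
    """Build the bcftools +split call: one compressed VCF per sample."""
    # -Oz asks for compressed VCF output; -o names the output directory.
    return ["bcftools", "+split", str(vcf_path), "-Oz", "-o", str(out_dir)]


def split_by_sample(vcf_path, out_dir):
    """Split a multi-sample VCF and index each per-sample file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_split_command(vcf_path, out_dir), check=True)
    # Assumption: outputs are written as <sample>.vcf.gz in out_dir.
    sample_files = sorted(out_dir.glob("*.vcf.gz"))
    for path in sample_files:
        # Write an index next to each per-sample file.
        subprocess.run(["bcftools", "index", str(path)], check=True)
    return [str(path) for path in sample_files]
```

If count_samples reports more than one sample for a file, the list returned by split_by_sample can be passed to ingest_samples in place of the original file. Use a separate output directory for each input file: the output names come from the sample names, so splitting chr1 and chr11 files for the same samples into one directory would overwrite files.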