I am trying out the 101 Population Genomics tutorial.
After installing the package through conda and pip, I tried to run the first snippet, which contains:
vcf_bucket = "s3://tiledb-inc-demo-data/examples/notebooks/vcfs/1kg-dragen"
samples_to_ingest = ["HG00096_chr21.gvcf.gz",
"HG00097_chr21.gvcf.gz",
"HG00099_chr21.gvcf.gz",
"HG00100_chr21.gvcf.gz",
"HG00101_chr21.gvcf.gz"]
sample_uris = [f"{vcf_bucket}/{s}" for s in samples_to_ingest]
# The URIs of the samples to be ingested should look like this:
sample_uris
and I get:
miniconda3/envs/tiledb-vcf-tutorial/lib/python3.8/site-packages/tiledb/cloud/config.py:96: UserWarning: You must first login before you can run commands. Please run tiledb.cloud.login.
warnings.warn(
Got ERROR: "Could not open mysql.plugin table: "Table 'mysql.plugin' doesn't exist". Some plugins may be not loaded" errno: 2000
Got ERROR: "Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist" errno: 2000
Got ERROR: "Can't open the mysql.func table. Please run mysql_upgrade to create it." errno: 2000
What should I run before the list comprehension so that I can connect to a TileDB Cloud instance?
I have an Ubuntu machine and the PyCharm (Community) IDE.
The snippet you shared from that tutorial does not make any calls to tiledb-cloud, which is what’s showing up in your traceback, so I’m confused about why that’s happening there.
That snippet just runs plain Python built-ins as preparation for working with tiledbvcf. To double-check that tiledb-cloud isn’t interfering, comment out any import tiledb.cloud statements at the top of your script/module/notebook.
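As a quick check of whether tiledb.cloud has been pulled in (directly or by another import), one option is to look at sys.modules after your imports have run. This is only a sketch; the helper name cloud_modules_loaded is made up for illustration.

```python
import sys


def cloud_modules_loaded(modules=None):
    """Return the names of any tiledb.cloud modules that are currently imported."""
    if modules is None:
        modules = sys.modules
    return sorted(
        name
        for name in modules
        if name == "tiledb.cloud" or name.startswith("tiledb.cloud.")
    )


# Run this after the imports at the top of your script or notebook.
loaded = cloud_modules_loaded()
if loaded:
    print("tiledb.cloud is loaded:", loaded)
else:
    print("tiledb.cloud has not been imported")
```

An empty result means nothing in the session has imported tiledb.cloud so far, so the login warning should not come from it.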
The latter portion of the tutorial discusses how to use TileDB Cloud. You can sign up for a free account and request some introductory credits if you’d like! For that you will need to log in, which you can do using tiledb.cloud.login (ref), passing your credentials (either your username and password or, preferably, your API token).
So in summary, since the portion you’re referencing is completely open source, you do not need to log in to TileDB Cloud programmatically. To take full advantage of TileDB and explore our Cloud platform, you can sign up for a free account and log in as described above.
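For the Cloud part of the tutorial, a minimal sketch of the login step might look like the following. It assumes the credentials are kept in environment variables rather than in the script; the variable names used here are chosen for illustration, and login_kwargs is a hypothetical helper, not part of the tiledb-cloud package.

```python
import os


def login_kwargs(env=None):
    """Build keyword arguments for tiledb.cloud.login, or None if no credentials are set."""
    if env is None:
        env = os.environ
    # Environment variable names below are assumptions for this sketch.
    token = env.get("TILEDB_REST_TOKEN")
    if token:
        return {"token": token}  # API token is the preferred option
    username = env.get("TILEDB_USERNAME")
    password = env.get("TILEDB_PASSWORD")
    if username and password:
        return {"username": username, "password": password}
    return None


if __name__ == "__main__":
    kwargs = login_kwargs()
    if kwargs is None:
        print("No credentials found; skipping TileDB Cloud login")
    else:
        import tiledb.cloud  # imported only when a login is actually attempted

        tiledb.cloud.login(**kwargs)
```

Keeping the import inside the branch means the open-source part of the tutorial still runs without tiledb.cloud being loaded.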
I have removed the tiledb cloud imports and could indeed run some more code.
However when I come to the point of:
# Ignore any warnings
db = tiledb.sql.connect()
pd.read_sql(sql=f"select * from `{variant_stats_uri}` where pos >= 5030025 and pos <= 5030087", con=db)
running the SQL connect call brings up:
Process finished with exit code 139 (interrupted by signal 11:SIGSEGV)
Regarding the conda env that I use in this PyCharm project:
I created an empty conda env and then installed the dependencies (as explained at the start of the tutorial post).
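One way to get more detail out of a crash like the exit code 139 above is Python's standard faulthandler module, which dumps the Python-level traceback when the process receives SIGSEGV. The sketch below writes that dump to a log file; the file name is an arbitrary choice for illustration.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "tiledb_segfault_trace.log")

# The file must stay open for as long as the handler is active.
crash_log = open(log_path, "w")
faulthandler.enable(file=crash_log, all_threads=True)

print(f"Fault handler enabled; a crash traceback would be written to {log_path}")

# Then run the failing step, for example:
# db = tiledb.sql.connect()
```

If the segfault happens again, the log should show which Python line was executing at the time, which is useful to include when reporting the problem.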
Hi @damianos.melidis, upon a quick test I’m not able to reproduce the segfault you are seeing. I’d like to confirm the package versions so I can test with something closer to your setup. Would you mind running conda list -e and posting the output? I’ll then be able to try to reproduce and diagnose what you are seeing.
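If the full conda list -e output is awkward to share, a shorter alternative is to print only the versions of the relevant packages from inside the same interpreter that PyCharm runs. This is a sketch using the standard importlib.metadata module; the distribution names in the list are assumptions and may differ from what your environment reports (conda-installed packages are not always visible this way).

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is not found."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names are assumptions; adjust them to match your environment.
packages = ["tiledb", "tiledbvcf", "tiledb-cloud", "pandas", "numpy"]
for name, version in installed_versions(packages).items():
    print(f"{name}: {version or 'not found'}")
```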
Hi @seth, I would have liked to provide a text file containing the output of conda list -e, but the "upload" function does not accept text files, so here you go: