I am trying to create a TileDB array on a local MinIO S3 bucket using the tiledb-presto Docker image.
I used the command below to start the container:
docker run -e AWS_ACCESS_KEY_ID=<minio_user> -e AWS_SECRET_ACCESS_KEY=<minio_pwd> -it --rm -v ./data/:/data/local_array tiledb/tiledb-presto
To create an array, I am using the command below:
CREATE TABLE region( regionkey bigint WITH (dimension=true), name varchar, comment varchar ) WITH (uri = 's3://<IP>:<Port>/testbucket/region');
This gives the error below:
Query 20200819_145407_00007_mp58g failed: [TileDB::S3] Error: Failed to create multipart request for object '/testbucket/region/__array_schema.tdb
Error message: Unable to connect to endpoint
If I query an existing TileDB array on the MinIO S3 bucket with the command below, I get a "does not exist" error:
SELECT * FROM tiledb.tiledb."s3://<IP>:<Port>/testbucket/processed/csv_dowjones";
Query 20200819_150351_00008_mp58g failed: line 1:15: Table tiledb.tiledb.s3://IP:Port/testbucket/processed/csv_dowjones does not exist
SELECT * FROM tiledb.tiledb."s3://IP:Port/testbucket/processed/csv_dowjones"
Can anyone help me access a local MinIO S3 bucket from tiledb-presto?
There is no option to set the endpoint URL for MinIO, so I am currently unable to run any query against MinIO S3. This can be fixed by adding a line of code in src/main/java/io/prestosql/plugin/tiledb/TileDBClient.java at line 80.
@Shashikant255 did adding the settings fix the problem?
I’ve got two pull requests open for TileDB-Presto: one to update the TileDB-Java (and core TileDB) version, and a second to add support for arbitrary configs.
I plan to have them merged by tomorrow and cut a new release. We are also in the process of updating our Presto driver for the TileDB 2.0 features such as string dimensions and heterogeneous dimensions. We expect to have the remaining features updated by mid-September. Please let us know if you have any other problems or feature requests, and we will try to accommodate them as we make changes.
Lastly, a small question: what version of PrestoDB are you using? We are also looking to separately update our target version for the driver, so we are interested to know which version you are on.
Thanks for your reply, @seth.
Yes, adding the setting fixed the problem.
One more issue I am facing: if I create a TileDB array using a TileDB function and then try to query that array through tiledb-presto, it throws an error like
Query 20200825_073220_00002_g9pwj failed: line 1:15: Table tiledb.tiledb.s3://testbucket/processed/csv_dowjones does not exist
select * from tiledb.tiledb."s3://testbucket/processed/csv_dowjones"
I am able to create, insert rows into, and query a table that was created with the tiledb-presto SQL command.
I don’t know what I am missing here.
We are currently in the experimentation phase and have no PrestoDB version in production as of now.
@Shashikant255, we’ve released a new version of TileDB-Presto, 1.6.0. I’d appreciate it if you would give that a try and see if it solves the s3 problem.
Details of the tiledb_config session parameter have been added to our docs. This should replace your need for altering the code. You can set the config parameters with
set session tiledb.tiledb_config="vfs.s3.scheme=http,vfs.s3.endpoint_override=IP:Port,vfs.s3.use_virtual_addressing=false"
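If you generate this statement from a script, the comma-separated key=value string can be assembled from a dictionary. The following is only a minimal sketch: the helper name build_tiledb_config_statement is hypothetical (it is not part of TileDB-Presto), the quoting simply mirrors the statement shown above, and IP:Port is still a placeholder for your MinIO address.

```python
def build_tiledb_config_statement(params):
    """Assemble a 'set session tiledb.tiledb_config=...' statement.

    Hypothetical helper for illustration; `params` maps TileDB config
    keys to string values.
    """
    for key, value in params.items():
        # Commas separate the pairs and double quotes delimit the whole
        # value in the statement above, so neither can appear inside.
        for text in (key, value):
            if "," in text or '"' in text:
                raise ValueError(f"unsupported character in {text!r}")
    pairs = ",".join(f"{key}={value}" for key, value in params.items())
    return f'set session tiledb.tiledb_config="{pairs}"'


# The three settings from the post above, for a MinIO endpoint.
minio_params = {
    "vfs.s3.scheme": "http",
    "vfs.s3.endpoint_override": "IP:Port",  # placeholder, replace with your endpoint
    "vfs.s3.use_virtual_addressing": "false",
}

statement = build_tiledb_config_statement(minio_params)
print(statement)
```

The printed statement can then be pasted into the Presto CLI session before running queries against the bucket.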
Yes, I am able to connect to minio. Thanks for your help.