TileDB AWS API does not work for regions other than us-east-1

I have the following code:

#include <tiledb/tiledb>
#include <iostream>
#include <vector>

using namespace tiledb;

// Name of array.

Config config;

// Set a configuration parameter
std::string array_name("s3://kunal-test-bucketing/quickstart_dense_array");

void create_array() {
  // Create a TileDB context.
  tiledb::Config cfg;

  Context ctx(cfg);

  // The array will be 4x4 with dimensions "d1" and "d2", with domain [1,4].
  Domain domain(ctx);
  domain.add_dimension(Dimension::create<int>(ctx, "d1", {{1, 4}}, 4))
      .add_dimension(Dimension::create<int>(ctx, "d2", {{1, 4}}, 4));

  cfg["vfs.s3.region"] = "us-west-1";
  // The array will be dense.
  ArraySchema schema(ctx, TILEDB_DENSE);
  schema.set_domain(domain)
      .set_order({{TILEDB_ROW_MAJOR, TILEDB_ROW_MAJOR}});

  // Add a single attribute "a" so each (i,j) cell can store an integer.
  schema.add_attribute(Attribute::create<int>(ctx, "a"));

  // Create the (empty) array on disk.
  Array::create(array_name, schema);
}

void write_array() {
  Context ctx;

  // Prepare some data for the array
  std::vector<int> data = {
      1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};

  // Open the array for writing and create the query.
  Array array(ctx, array_name, TILEDB_WRITE);
  Query query(ctx, array);
  query.set_layout(TILEDB_ROW_MAJOR)
      .set_buffer("a", data);

  // Perform the write and close the array.
  query.submit();
  array.close();
}

void read_array() {
  Context ctx;

  // Prepare the array for reading
  Array array(ctx, array_name, TILEDB_READ);

  // Slice only rows 1, 2 and cols 2, 3, 4
  const std::vector<int> subarray = {1, 2, 2, 4};

  // Prepare the vector that will hold the result (of size 6 elements)
  std::vector<int> data(6);

  // Prepare the query
  Query query(ctx, array);
  query.set_subarray(subarray)
      .set_layout(TILEDB_ROW_MAJOR)
      .set_buffer("a", data);

  // Submit the query and close the array.
  query.submit();
  array.close();

  // Print out the results.
  for (auto d : data)
    std::cout << d << " ";
  std::cout << "\n";
}

int main()
{
  create_array();
  write_array();
  read_array();
  return 0;
}
As you can see, I am basically using the code as-is from the examples on the TileDB website. I have the following setup.

My access key and secret are set up in the credentials file. I am able to run this code and create and read from the array just fine AS LONG AS the bucket is in us-east-1.

Please note that the region in my ~/.aws/config file and the AWS_DEFAULT_REGION environment variable are both set to us-west-1. Also, as you can see in the code, I am trying to override any other value by setting cfg["vfs.s3.region"] = "us-west-1";

After setting the region in every place that could possibly provide it, I was hoping the code would be able to connect to a bucket in the us-west-1 region, but unfortunately it did not.

However, if I change the S3 link to a bucket in us-east-1, which is the TileDB default, everything works fine. This tells me that the TileDB API is not able to override the region.

Any thoughts?

The issue is that the Config (cfg) object where you set us-west-1 must be passed to the Context constructor:

Context ctx;
needs to be:
Context ctx(cfg);

@ihnorton I updated my code (as you can see in the updated question), but it is not changing the region and I get the same error. Just FYI, the error I am getting is:

Error message: Unable to parse ExceptionName: AuthorizationHeaderMalformed Message: The authorization header is malformed; the region 'us-east-1' is wrong; expecting 'us-west-1' with address : 52.219.116.66
Abort trap: 6
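
For anyone debugging the same failure, the error text itself names the region S3 expects. Below is a minimal sketch of pulling that value out of the message; `expected_region` is a hypothetical helper (not part of TileDB or the AWS SDK), and it assumes the message uses plain ASCII single quotes around the region names.

```cpp
#include <string>

// Hypothetical helper, not a TileDB or AWS SDK function.
// Returns the region quoted after "expecting '" in an
// AuthorizationHeaderMalformed message, or an empty string if the
// message does not contain that pattern.
std::string expected_region(const std::string& message) {
  const std::string marker = "expecting '";
  const std::string::size_type start = message.find(marker);
  if (start == std::string::npos) return "";

  // The region name runs from just after the marker to the closing quote.
  const std::string::size_type begin = start + marker.size();
  const std::string::size_type end = message.find('\'', begin);
  if (end == std::string::npos) return "";

  return message.substr(begin, end - begin);
}
```

One way to use this, assuming the failure surfaces as a C++ exception, is to wrap the array calls in a try/catch, pass the exception's what() text to the helper, and log the result so that the mismatch between the configured and expected region is obvious.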

Ok, the code did not build as posted due to some unrelated errors, so I started from the quickstart_dense example (which I believe you started from), and modified it a little bit to work with S3. The only real change is to initialize the Context in main, using the Config object with the vfs.s3.region setting, and pass the configured context to each of the functions.

Hope this helps to get started!

#include <iostream>
#include <tiledb/tiledb>

using namespace tiledb;

// Name of array.
std::string array_name("s3://some-bucket-us-west-1/my_array");

void create_array(Context ctx) {
  // The array will be 4x4 with dimensions "rows" and "cols", with domain [1,4].
  Domain domain(ctx);
  domain.add_dimension(Dimension::create<int>(ctx, "rows", {{1, 4}}, 4))
      .add_dimension(Dimension::create<int>(ctx, "cols", {{1, 4}}, 4));

  // The array will be dense.
  ArraySchema schema(ctx, TILEDB_DENSE);
  schema.set_domain(domain).set_order({{TILEDB_ROW_MAJOR, TILEDB_ROW_MAJOR}});

  // Add a single attribute "a" so each (i,j) cell can store an integer.
  schema.add_attribute(Attribute::create<int>(ctx, "a"));

  // Create the (empty) array on disk.
  Array::create(array_name, schema);
}

void write_array(Context ctx) {
  // Prepare some data for the array
  std::vector<int> data = {
      1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};

  // Open the array for writing and create the query.
  Array array(ctx, array_name, TILEDB_WRITE);
  Query query(ctx, array, TILEDB_WRITE);
  query.set_layout(TILEDB_ROW_MAJOR).set_buffer("a", data);

  // Perform the write and close the array.
  query.submit();
  array.close();
}

void read_array(Context ctx) {
  // Prepare the array for reading
  Array array(ctx, array_name, TILEDB_READ);

  // Slice only rows 1, 2 and cols 2, 3, 4
  const std::vector<int> subarray = {1, 2, 2, 4};

  // Prepare the vector that will hold the result (of size 6 elements)
  std::vector<int> data(6);

  // Prepare the query
  Query query(ctx, array, TILEDB_READ);
  query.set_subarray(subarray)
      .set_layout(TILEDB_ROW_MAJOR)
      .set_buffer("a", data);

  // Submit the query and close the array.
  query.submit();
  array.close();

  // Print out the results.
  for (auto d : data)
    std::cout << d << " ";
  std::cout << "\n";
}

int main() {
  // Create a TileDB context.
  tiledb::Config cfg;
  cfg["vfs.s3.region"] = "us-west-1";

  Context ctx(cfg);

  if (Object::object(ctx, array_name).type() != Object::Type::Array) {
    create_array(ctx);
    write_array(ctx);
  }

  read_array(ctx);
  return 0;
}

For a diff of all of the changes, please see below:

diff ex.cc orig.cc
7c7,11
< std::string array_name("s3://some-bucket-us-west-1/test1");
---
> std::string array_name("quickstart_dense_array");
>
> void create_array() {
>   // Create a TileDB context.
>   Context ctx;
9d12
< void create_array(Context ctx) {
26c29,31
< void write_array(Context ctx) {
---
> void write_array() {
>   Context ctx;
>
41c46,48
< void read_array(Context ctx) {
---
> void read_array() {
>   Context ctx;
>
68,72c75
<   // Create a TileDB context.
<   tiledb::Config cfg;
<   cfg["vfs.s3.region"] = "us-west-1";
<
<   Context ctx(cfg);
---
>   Context ctx;
75,76c78,79
<     create_array(ctx);
<     write_array(ctx);
---
>     create_array();
>     write_array();
79c82
<   read_array(ctx);
---
>   read_array();
81c84
< }
\ No newline at end of file
---
> }

@ihnorton The code you posted worked, and I tried to understand what the difference is. I think the major difference is that cfg["vfs.s3.region"] = "us-west-1"; needs to be set before the config is passed to the context, and you are also using a single context for all of the functions.

Is that what it boils down to? Also, I wanted to confirm one other thing: I do have the region set in both my ~/.aws/config file and my environment variable. It sounds like the TileDB AWS API is able to pick up the credentials (access key, secret) from the environment variables or the credentials file. However, for the region, the only way is to set it in the code. Is that a correct understanding?

Correct
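
The ordering point confirmed above can be pictured with a toy model. This is only a sketch: `ToyConfig` and `ToyContext` are made-up stand-ins written for illustration, not the TileDB classes, and they assume what this thread describes, namely that a context takes its settings when it is constructed and does not see later changes to the config object.

```cpp
#include <map>
#include <string>

// Toy stand-ins for illustration only; these are NOT the TileDB classes.
struct ToyConfig {
  std::map<std::string, std::string> params;
};

class ToyContext {
 public:
  // A default-constructed context has no region setting at all.
  ToyContext() = default;

  // Copies the settings once, at construction time. Later edits to
  // `cfg` are invisible to this context.
  explicit ToyContext(const ToyConfig& cfg) : params_(cfg.params) {}

  // Region this context would use; falls back to us-east-1 when the
  // setting is missing or empty.
  std::string region() const {
    const auto it = params_.find("vfs.s3.region");
    if (it == params_.end() || it->second.empty()) return "us-east-1";
    return it->second;
  }

 private:
  std::map<std::string, std::string> params_;
};
```

In this model, a context built before the region is assigned (as in create_array() in the question) and a default-constructed context (as in write_array() and read_array()) both report us-east-1. Only a context built after the assignment reports us-west-1, which is why the working version sets the region first and shares one context.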

Correct, the AWS SDK picks up the credentials from the config file, as long as you don’t override them in the TileDB config. But it does not pick up the default region, and the SDK defaults to us-east-1 when vfs.s3.region is empty.
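
Since the region is not read from the environment in the setup discussed here, one possible workaround is to read AWS_DEFAULT_REGION yourself and hand it to the TileDB config. The sketch below is an assumption-laden example, not an official recipe: `pick_region` and `region_from_env` are hypothetical helper names, and the us-east-1 fallback simply mirrors the default mentioned above.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns `env_value` when it is set and non-empty,
// otherwise `fallback` (us-east-1 mirrors the default described above).
std::string pick_region(const char* env_value,
                        const std::string& fallback = "us-east-1") {
  if (env_value != nullptr && *env_value != '\0') return env_value;
  return fallback;
}

// Hypothetical helper: reads AWS_DEFAULT_REGION from the process environment.
std::string region_from_env() {
  return pick_region(std::getenv("AWS_DEFAULT_REGION"));
}

// Intended use with TileDB (shown as a comment so the block stays
// self-contained; it follows the working example in this thread):
//   tiledb::Config cfg;
//   cfg["vfs.s3.region"] = region_from_env();
//   tiledb::Context ctx(cfg);  // construct only after the region is set
```

Keeping the lookup in a small pure function makes the fallback behaviour easy to check without touching the real environment.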


Thank you for your help on this one. It was really helpful.
