We are currently evaluating whether we can use TileDB in our project, but we are unhappy with the performance: slicing from point cloud data seems to take quite a bit of time. Our current test case consists of a point cloud with 161059 points (xyz data), and we are using the C++ API to read the data from TileDB. The schema of our data looks like this:
ArraySchema(
  domain=Domain(*[
    Dim(name='z', domain=(-1000.0, 1000.0), tile=2000.0, dtype='float32'),
    Dim(name='y', domain=(-1000.0, 1000.0), tile=2000.0, dtype='float32'),
    Dim(name='x', domain=(-1000.0, 1000.0), tile=2000.0, dtype='float32'),
  ]),
  attrs=[
    Attr(name='floor', dtype='uint8', var=False, nullable=False),
    Attr(name='outlier', dtype='uint8', var=False, nullable=False),
  ],
  cell_order='hilbert',
  tile_order=None,
  capacity=10000,
  sparse=True,
  allows_duplicates=False,
)
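For reference, this is roughly what the same schema looks like when created through the C++ API (a sketch reconstructed from the schema dump above, not necessarily the exact creation code):

// Sketch: create the sparse array with the schema shown above.
#include <tiledb/tiledb>

void createArray(const std::string &array_name)
{
    tiledb::Context ctx;
    tiledb::Domain domain(ctx);
    domain.add_dimension(tiledb::Dimension::create<float>(ctx, "z", {{-1000.0f, 1000.0f}}, 2000.0f))
        .add_dimension(tiledb::Dimension::create<float>(ctx, "y", {{-1000.0f, 1000.0f}}, 2000.0f))
        .add_dimension(tiledb::Dimension::create<float>(ctx, "x", {{-1000.0f, 1000.0f}}, 2000.0f));

    tiledb::ArraySchema schema(ctx, TILEDB_SPARSE);
    schema.set_domain(domain)
        .set_cell_order(TILEDB_HILBERT) // cell_order='hilbert'
        .set_capacity(10000)            // capacity=10000
        .set_allows_dups(false);        // allows_duplicates=False
    schema.add_attribute(tiledb::Attribute::create<uint8_t>(ctx, "floor"))
        .add_attribute(tiledb::Attribute::create<uint8_t>(ctx, "outlier"));

    tiledb::Array::create(array_name, schema);
}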
The bounding box of the data is:
non_empty_domain x: -31.0248 50.0115
non_empty_domain y: -23.6003 32.5186
non_empty_domain z: -3.32503 6.89223
Reading all of that data from the database takes around 250-800 ms, which feels a bit slow, but the real problem appears when we increase the size of the database by copying the input data around the bounding box above, e.g. shifting all the data points to the right by adding maxX - minX to the x coordinates, and doing the same in every direction so that the data is duplicated all around the original bounding box (a rough sketch of this duplication is shown a bit further down). The new database then has a bounding box of:
non_empty_domain x: -598.279 617.266
non_empty_domain y: -416.433 425.351
non_empty_domain z: -33.9768 37.544
Everything else stays the same, schema and all; only the amount of data is increased (and even that database is rather small for our purposes). Now, reading the same original data, using the first bounding box as the slicing area, so essentially the same slice as before but from this larger database, takes around 7-8 seconds, which is too slow for our purposes. Is this the expected performance? If it is not, how can the performance be improved? Typical slicing dimensions for us are around 100 x 100 x 10 meters, maybe twice that.
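For reference, the duplication of the test data mentioned above is done roughly like this (a sketch, not our exact code; the helper name and buffer handling are illustrative):

// Sketch: write one shifted copy of the original points into the same array.
// This is repeated for shifts in all directions around the original bounding box.
void writeShiftedCopy(tiledb::Context &ctx, const std::string &array_name,
                      std::vector<float> x, std::vector<float> y, std::vector<float> z,
                      std::vector<uint8_t> &floorAttr, std::vector<uint8_t> &outlierAttr,
                      float dx, float dy, float dz)
{
    // Shift the coordinates; dx/dy/dz are multiples of the original bbox extents.
    for (size_t i = 0; i < x.size(); ++i)
    {
        x[i] += dx;
        y[i] += dy;
        z[i] += dz;
    }
    tiledb::Array array(ctx, array_name, TILEDB_WRITE);
    tiledb::Query query(ctx, array, TILEDB_WRITE);
    query.set_layout(TILEDB_UNORDERED) // sparse, unordered write
        .set_data_buffer("x", x)
        .set_data_buffer("y", y)
        .set_data_buffer("z", z)
        .set_data_buffer("floor", floorAttr)
        .set_data_buffer("outlier", outlierAttr);
    query.submit();
    array.close();
}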
We have already consolidated and vacuumed the data; there used to be 104 fragments, but after consolidation and vacuuming there are only 4. However, the consolidation/vacuuming seemed to have no effect on read performance: reading the larger-bbox data took around 8 s before consolidation and about the same after it.
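For reference, the consolidation and vacuuming were done roughly like this (default consolidation config, as far as I remember):

// Sketch: consolidate the fragments, then vacuum away the consolidated ones.
tiledb::Context ctx;
tiledb::Array::consolidate(ctx, array_name);
tiledb::Array::vacuum(ctx, array_name);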
I got a slight improvement in read time when I switched from libtiledb 2.14 to 2.15; it shaved off around 1 s, dropping the time to around 6.8 s.
Here is the code we use to read the data (and to do the timing measurements):
pcl::PointCloud<pcl::PointXYZ>::Ptr readData(const tiledb::Config &config,
                                             const std::string &array_name,
                                             const std::array<float, 2> &xrange,
                                             const std::array<float, 2> &yrange,
                                             const std::array<float, 2> &zrange,
                                             const std::string &attribute = "",
                                             const uint8_t attributeValue = 0,
                                             const tiledb_query_condition_op_t queryComparison = TILEDB_EQ)
{
    tiledb::Context ctx(config);
    // Prepare the array for reading
    tiledb::Array array(ctx, array_name, TILEDB_READ);
    tiledb::Subarray subarray(ctx, array);
    subarray.add_range("z", zrange[0], zrange[1])
        .add_range("y", yrange[0], yrange[1])
        .add_range("x", xrange[0], xrange[1]);

    // Will hold the query return status. If the input buffers are not large enough
    // to hold the return values, then this will be TILEDB_INCOMPLETE and we must
    // consume our data from the buffers and then resubmit the query to get the
    // rest of the data.
    tiledb::Query::Status status;

    // Prepare the query
    tiledb::Query query(ctx, array);
    query.set_subarray(subarray);
    //.set_layout(TILEDB_COL_MAJOR);

    // Get estimates for the fixed-length attributes and dimensions (in bytes)
    uint64_t est_size = query.est_result_size("floor");
    std::cout << "estimated buffer size for floor: " << est_size << std::endl;
    est_size = query.est_result_size("outlier");
    std::cout << "estimated buffer size for outlier: " << est_size << std::endl;
    est_size = query.est_result_size("x");
    std::cout << "estimated buffer size for x: " << est_size << std::endl;
    est_size = query.est_result_size("y");
    std::cout << "estimated buffer size for y: " << est_size << std::endl;
    est_size = query.est_result_size("z");
    std::cout << "estimated buffer size for z: " << est_size << std::endl;

    const int buffer_max_limit = 1024 * 1024 * 256; // bytes
    if (est_size > buffer_max_limit)
    {
        est_size = buffer_max_limit;
        std::cout << "Limiting buffer size to 256MB (" << est_size << ")" << std::endl;
    }

    // If an attribute is given, we pass the condition all the way down to TileDB
    // so that it only returns the points that meet the attribute condition.
    // This makes things easier, since we don't have to check or filter on
    // attribute values ourselves.
    // With this we can e.g. get only floor points from the given xyz intervals.
    tiledb::QueryCondition querycond(ctx);
    if (attribute.length() > 0)
    {
        std::cout << "Using " << attribute << qc_op2str(queryComparison) << (attributeValue > 0) << " as query condition" << std::endl;
        querycond.init(attribute, &attributeValue, sizeof(uint8_t), queryComparison);
        query.set_condition(querycond);
    }

    // Prepare the vectors that will hold the result.
    // We take an upper bound on the result size, as we do not know
    // a priori how big it is (since the array is sparse).
    // Maximum number of points the sliced dataset can contain:
    int maxPoints = (est_size / sizeof(float));
    std::cout << "points: " << maxPoints << std::endl;
    std::vector<float> datax(maxPoints, 0.1337f);
    std::vector<float> datay(maxPoints, 0.1337f);
    std::vector<float> dataz(maxPoints, 0.1337f);
    // attributes (metadata) for each point
    std::vector<uint8_t> dataf(maxPoints, 250);
    std::vector<uint8_t> datao(maxPoints, 250);

    auto start = std::chrono::high_resolution_clock::now(); // start the timer
    query.set_data_buffer("z", dataz)
        .set_data_buffer("y", datay)
        .set_data_buffer("x", datax)
        .set_data_buffer("floor", dataf)
        .set_data_buffer("outlier", datao);

    pcl::PointCloud<pcl::PointXYZ>::Ptr cloudPtr(new pcl::PointCloud<pcl::PointXYZ>);
    int floorCount = 0;
    int restCount = 0;

    // Submit the query (possibly several times) and close the array.
    do
    {
        // tiledb::Stats::enable();
        query.submit();
        // tiledb::Stats::dump(stdout);
        // tiledb::Stats::disable();
        status = query.query_status();

        // IMPORTANT: check if there are any results, as the buffer
        // could have been too small to fit even a single result
        auto result_num = query.result_buffer_elements()["z"].second;
        if (status == tiledb::Query::Status::INCOMPLETE && (result_num == 0))
        {
            // We need to reallocate the buffers, otherwise
            // we will get an infinite loop
            std::cout << "Initial size for buffer is too small to even contain one data point! " << datax.capacity()
                      << std::endl;
            datax.resize(datax.capacity() * 2);
            datay.resize(datay.capacity() * 2);
            dataz.resize(dataz.capacity() * 2);
            dataf.resize(dataf.capacity() * 2);
            datao.resize(datao.capacity() * 2);
            query.set_data_buffer("z", dataz)
                .set_data_buffer("y", datay)
                .set_data_buffer("x", datax)
                .set_data_buffer("floor", dataf)
                .set_data_buffer("outlier", datao);
        }
        else if (result_num > 0)
        {
            // Do something with the results.
            // We could set new buffers on the query here.
            std::cout << "Data contained " << result_num << " points and status is "
                      << status << std::endl;
            // Next, store the data into the PCL data structure.
            for (size_t i = 0; i < result_num; i++)
            {
                // std::cout << datax[i] << " " << datay[i] << " " << dataz[i] << std::endl;
                cloudPtr->points.push_back(pcl::PointXYZ(datax[i], datay[i], dataz[i]));
                if (dataf[i] > 0)
                    floorCount++;
                else
                    restCount++;
            }
        }
    } while (status == tiledb::Query::Status::INCOMPLETE);

    auto stop = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
    std::cout << "Actual query duration: " << duration.count() << "ms" << std::endl;

    auto non_empty_domain_x = array.non_empty_domain<float>("x");
    auto non_empty_domain_y = array.non_empty_domain<float>("y");
    auto non_empty_domain_z = array.non_empty_domain<float>("z");
    std::cout << "non_empty_domain x: " << non_empty_domain_x.first << " " << non_empty_domain_x.second << std::endl;
    std::cout << "non_empty_domain y: " << non_empty_domain_y.first << " " << non_empty_domain_y.second << std::endl;
    std::cout << "non_empty_domain z: " << non_empty_domain_z.first << " " << non_empty_domain_z.second << std::endl;

    array.close();
    std::cout << "FloorCount: " << floorCount << " restCount: " << restCount << std::endl;
    return cloudPtr;
}
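The call site looks roughly like this (a sketch; the config is essentially default and the ranges are the original bounding box given above):

// Sketch of how readData() is called for the timings below.
tiledb::Config config; // nothing tuned here yet
auto cloud = readData(config, array_name,
                      {-31.0248f, 50.0115f},  // xrange
                      {-23.6003f, 32.5186f},  // yrange
                      {-3.32503f, 6.89223f}); // zrange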
What also confuses us is the size of the read buffers (datax, datay, dataz, etc. in the code). When the size of the database increases, the required size of those read buffers increases as well. This is a bit strange, since the slice stays the same and we should always get 161059 points back. And we do; it is just that the read buffers need to be larger when the TileDB database is larger.
So when there are only 161059 points in the database, the read output looks like this:
estimated buffer size for floor: 161059
estimated buffer size for outlier: 161059
estimated buffer size for x: 644236 // in bytes, in vector<float> this is 161059 floats
estimated buffer size for y: 644236
estimated buffer size for z: 644236
points: 161059
Data contained 161055 points and status is COMPLETE
Actual query duration: 385ms
non_empty_domain x: -31.0248 50.0115
non_empty_domain y: -23.6003 32.5186
non_empty_domain z: -3.32503 6.89223
FloorCount: 83050 restCount: 78005
But with the larger dataset:
estimated buffer size for floor: 3925053
estimated buffer size for outlier: 3925053
estimated buffer size for x: 15700211
estimated buffer size for y: 15700211
estimated buffer size for z: 15700211
points: 3925052 // the size of this has increased a lot, and still we could
                // not read all the data in one iteration; had to read it twice
Data contained 51516 points and status is INCOMPLETE
Data contained 109541 points and status is COMPLETE
Actual query duration: 8304ms
non_empty_domain x: -598.279 617.266
non_empty_domain y: -416.433 425.351
non_empty_domain z: -33.9768 37.544
FloorCount: 83051 restCount: 78006
Basically we are still getting the same number of points back; it just takes a much larger read buffer to hold the intermediate result. Why is this? I would imagine that the slice size is the same no matter how much data there is outside the slice volume, unless this comes down to the capacity setting somehow.
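One workaround we have been considering is to ignore est_result_size() entirely and use fixed-size buffers, relying on the existing INCOMPLETE loop to drain the result in chunks, roughly like this (a sketch; the batch size is arbitrary):

// Sketch: fixed-size buffers instead of est_result_size()-based allocation;
// the do/while INCOMPLETE loop in readData() then drains the result in batches.
const size_t maxPoints = 1 << 20; // ~1M points per batch, picked arbitrarily
std::vector<float> datax(maxPoints), datay(maxPoints), dataz(maxPoints);
std::vector<uint8_t> dataf(maxPoints), datao(maxPoints);
query.set_data_buffer("z", dataz)
    .set_data_buffer("y", datay)
    .set_data_buffer("x", datax)
    .set_data_buffer("floor", dataf)
    .set_data_buffer("outlier", datao);
// ...then the same do { query.submit(); ... } while (status == INCOMPLETE) loop as above.

But it is unclear to us whether that would make the read itself any faster.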
All these tests are run locally on an HP EliteBook laptop with an 8th-gen Core i7 and 32 GB of RAM.
Br.
Jarkko
P.S. The read data seems to be missing some points. The input data had 161059 points, and when slicing with the bounding box as the slicing area we are missing 4 of them, since the output only has 161055 points. Maybe this is due to rounding when the bounding box is returned by array.non_empty_domain("x") etc. We have not investigated which points are left out.
I do know that all the data points are there: if I increase the slicing volume to be a tad bigger than the data bounding box in TileDB, then I get all 161059 points.
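If it is just float rounding at the bounding-box edges, padding the ranges outward by one ULP before slicing would probably be enough, something like this (a sketch, not verified to be the actual cause):

// Sketch: pad a slicing range outward by one float ULP so that points lying
// exactly on the non-empty-domain boundary are not lost to rounding.
#include <array>
#include <cmath>
#include <limits>

std::array<float, 2> padRange(std::array<float, 2> r)
{
    r[0] = std::nextafter(r[0], -std::numeric_limits<float>::infinity());
    r[1] = std::nextafter(r[1], std::numeric_limits<float>::infinity());
    return r;
}
// e.g. subarray.add_range("x", paddedX[0], paddedX[1]); and similarly for y and z.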