I have a sparse array written to disk: 6 dimensions (INT32) with a single attribute. The ranges for each dimension are as follows:
1:60k, 1:100m, 1:2, 1:2, 1:1m, 1:10k
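To give a sense of scale, here is a rough sketch of the domain size these ranges imply. It assumes "k" means thousand and "m" means million; the variable names are just for illustration and are not part of any library.

```python
# Upper bounds of the six dimensions as listed above,
# assuming "k" = thousand and "m" = million (every range starts at 1).
dim_upper_bounds = [60_000, 100_000_000, 2, 2, 1_000_000, 10_000]

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold

# Check that every upper bound fits in an INT32 dimension.
fits_int32 = all(upper <= INT32_MAX for upper in dim_upper_bounds)

# Because each range starts at 1, the number of possible cells
# is simply the product of the upper bounds.
total_cells = 1
for upper in dim_upper_bounds:
    total_cells *= upper

print(fits_int32)
print(f"{total_cells:.1e}")
```

So the domain has on the order of 10^23 possible cells, far more than 6GB of data could ever fill, which is why the array is so sparse.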
Queries work well in most cases: selecting a single value for a dimension, say, returns all results fast. Range queries are much slower. The array is about 6GB uncompressed, and loading the whole array (the limiting case of a range query) takes a long time: over 10 minutes on a Linux machine with 160GB of RAM and 40 cores. For comparison, R's 'data.table' package (using the fread function) can load the CSV equivalent in seconds.
I have played around with the config, to no effect, and fear I am misunderstanding something more fundamental.
All help appreciated, guys. Thanks in advance.