DenseReader: Cannot process a single tile, increase memory budget

Thanks for the details @kunaal_desai. If I look at the array with 114 fragments (“the C# array”), I see that the tile extent for the rows dimension is 114. As discussed, this makes the whole dataset a single tile, and we always write a full tile to disk for a write, even if you only write one row. I see that the data is not compressed, so the size of each fragment is easy to compute: 114 rows * 642824 cols * 4 bytes per float * 2 attributes = ~560MB. Times 114 fragments, you get roughly your 68GB. If you had the tile extent set to 1 for rows, I would expect the data to take about 560MB in total (one row per fragment, no padding). What C# code did you use to create the 114 fragment array? That could be where the problem lies.
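For reference, here is a minimal sketch of what schema creation could look like with the rows tile extent set to 1, using the C++ API (the C# API mirrors it closely). The array URI, dimension names, and attribute names are placeholders; only the domain sizes come from your numbers above:

```cpp
#include <tiledb/tiledb>

int main() {
  tiledb::Context ctx;

  // Tile extent 1 on "rows": a single-row write produces a ~5MB fragment
  // instead of a zero-padded ~560MB full tile.
  tiledb::Domain domain(ctx);
  domain.add_dimension(tiledb::Dimension::create<int>(ctx, "rows", {{1, 114}}, 1))
      .add_dimension(tiledb::Dimension::create<int>(ctx, "cols", {{1, 642824}}, 642824));

  tiledb::ArraySchema schema(ctx, TILEDB_DENSE);
  schema.set_domain(domain);

  // Two float attributes, matching the 2 * 4 bytes per cell above
  // ("a1"/"a2" are placeholder names).
  schema.add_attribute(tiledb::Attribute::create<float>(ctx, "a1"));
  schema.add_attribute(tiledb::Attribute::create<float>(ctx, "a2"));

  tiledb::Array::create("my_dense_array", schema);
  return 0;
}
```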

Now, I noticed that you don’t use a compression filter for your data… Is there a reason? I didn’t expect the data on disk to be so large, because the values we pad a full tile with are zeros, so I assumed they would compress down to a very small size. But of course that’s not the case without any filters: with no compression and the wrong tile extent, your on-disk data will be very large.
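Adding a compression filter is a small change at schema creation time. Here’s a minimal sketch with Zstd (any of the built-in compressors works the same way); the attribute name is again a placeholder:

```cpp
#include <tiledb/tiledb>

int main() {
  tiledb::Context ctx;

  // Zstd filter list; GZIP, LZ4, etc. are configured the same way.
  tiledb::FilterList filters(ctx);
  filters.add_filter(tiledb::Filter(ctx, TILEDB_FILTER_ZSTD));

  // Attach the filter list to the attribute before adding it to the schema,
  // so both real data and any tile padding are compressed on disk.
  tiledb::Attribute a1 = tiledb::Attribute::create<float>(ctx, "a1");
  a1.set_filter_list(filters);
  return 0;
}
```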

If I look at the array you shared a dump for (“the C++ array”), I see that the tile extent is properly set to 1 for rows, which is why it has a much smaller footprint on disk. So there you’re probably passing in the tile extent correctly at schema creation time.
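If you want to double-check what an existing array was created with, you can load and dump its schema the same way that dump was produced; the output lists each dimension’s tile extent. The URI below is a placeholder:

```cpp
#include <tiledb/tiledb>
#include <cstdio>

int main() {
  tiledb::Context ctx;

  // Load the schema of an existing array and print it; the dump shows each
  // dimension's tile extent, so you can confirm "rows" has extent 1.
  tiledb::ArraySchema schema = tiledb::Array::load_schema(ctx, "my_dense_array");
  schema.dump(stdout);
  return 0;
}
```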

Thank you, that was insightful and helpful. I changed the tile extent to 1 and the disk size is now much smaller. I will also try turning on compression.

Thanks for the feedback @kunaal_desai! I’m glad it helped! Let us know if there’s anything else you’d like us to look at.

Best,
Luc