Ran out of memory during vacuuming

I ran out of memory during consolidation & vacuuming:

[ERROR/2020-11-23 14:22:13,534] root: Error: Error: Internal TileDB uncaught exception; std::bad_alloc, ../Material Indicators/mnt/volume-nbg1-1/trade_data3/Binance/ANT_USDT
Traceback (most recent call last):
  File "<ipython-input-3-4d265a553e82>", line 61, in f
    tiledb.consolidate(file,ctx=ctx)
  File "tiledb/libtiledb.pyx", line 4902, in tiledb.libtiledb.consolidate
  File "tiledb/libtiledb.pyx", line 469, in tiledb.libtiledb._raise_ctx_err
  File "tiledb/libtiledb.pyx", line 454, in tiledb.libtiledb._raise_tiledb_error
tiledb.libtiledb.TileDBError: Error: Internal TileDB uncaught exception; std::bad_alloc

Now I'm getting tons of errors about missing .ok files:

[ERROR/2020-11-23 17:24:18,598] root: Error: [TileDB::IO] Error: Cannot delete file '/root/mnt3/Material Indicators/mnt/volume-nbg1-1/indicator_data/Binance/CVD/raw/VITE_USDT/__1605762118400_1605762118400_c94337e75f474eac99c27c885a47605b_5.ok'; No such file or directory, ../Material Indicators/mnt/volume-nbg1-1/indicator_data/Binance/CVD/raw/VITE_USDT
Traceback (most recent call last):
  File "<ipython-input-3-4d265a553e82>", line 62, in f
    tiledb.vacuum(file,config=config)
  File "tiledb/libtiledb.pyx", line 6155, in tiledb.libtiledb.vacuum
  File "tiledb/libtiledb.pyx", line 484, in tiledb.libtiledb.check_error
  File "tiledb/libtiledb.pyx", line 480, in tiledb.libtiledb._raise_ctx_err
  File "tiledb/libtiledb.pyx", line 465, in tiledb.libtiledb._raise_tiledb_error
tiledb.libtiledb.TileDBError: [TileDB::IO] Error: Cannot delete file '/root/mnt3/Material Indicators/mnt/volume-nbg1-1/indicator_data/Binance/CVD/raw/VITE_USDT/__1605762118400_1605762118400_c94337e75f474eac99c27c885a47605b_5.ok'; No such file or directory

Code used to consolidate & vacuum:

import tiledb

config = tiledb.Config({
    "sm.tile_cache_size": str(120_000_000),            # ~120 MB tile cache
    "sm.consolidation.step_min_frags": "2",            # min fragments per step
    "sm.consolidation.step_max_frags": "60",           # max fragments per step
    "sm.consolidation.steps": "100",                   # max consolidation steps
    "sm.consolidation.buffer_size": str(120_000_000),  # ~120 MB consolidation buffer
    "sm.consolidation.step_size_ratio": "0.5",         # min size ratio of adjacent fragments
})

ctx = tiledb.Ctx(config)

# "file" holds the URI of the array being consolidated
tiledb.consolidate(file, ctx=ctx)
tiledb.vacuum(file, config=config)

It seems I can still read from and write to the array, but the error appears every time I try to consolidate & vacuum. What should I do now?
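For example, a basic read like the following still succeeds (a sketch with a placeholder path; the real arrays live under the paths shown in the logs above):

import tiledb

uri = "path/to/affected_array"  # placeholder, e.g. one of the Binance arrays

# Opening the array and inspecting it still works despite the vacuum errors.
with tiledb.open(uri, mode="r") as A:
    print(A.schema)
    print(A.nonempty_domain())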

Link to faulty example array:
https://1drv.ms/u/s!ArP7_EkyioIBwuJ8lydSoSFqzN7knw?e=IlfGBv

Note:
It’s not just one array. It’s hundreds that have been affected by this…

@Mtrl_Scientist

We’ve encountered a similar scenario in the past and fixed the issue with the following Python script. It attempts to get the array back into a good state so the vacuum can be retried. Please back up your array before using the script.

I’d suggest saving the following snippet as sanitize_array.py and running it with python sanitize_array.py <array name>.

import os
import shutil
import sys

if len(sys.argv) < 2:
  print("usage: python sanitize_array.py <array name>")
  sys.exit(1)

array_dir = sys.argv[1]

# Gather sets for fragment file names and ok-fragment file names.
fragments_set = set()
fragments_ok_set = set()
for file_name in os.listdir(array_dir):
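  # Skip data files (.tdb), vacuum files (.vac), and consolidated fragment metadata (.meta).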
  if file_name.endswith(".tdb") or \
      file_name.endswith(".vac") or \
      file_name.endswith(".meta"):
    continue

  if file_name.startswith("__"):
    if file_name.endswith(".ok"):
      fragments_ok_set.add(file_name)
    else:
      fragments_set.add(file_name)

print("found " + str(len(fragments_set)) + " fragments")
print("found " + str(len(fragments_ok_set)) + " ok-fragments")

# Remove all fragments without a "__fragment_metadata.tdb" file.
removed_fragments = []
for fragment in fragments_set:
  fragment_path = os.path.join(array_dir, fragment)
  metadata_path = os.path.join(fragment_path, "__fragment_metadata.tdb")
  if not os.path.isfile(metadata_path):
    shutil.rmtree(fragment_path)
    removed_fragments.append(fragment)

# Remove the removed fragment file names from `fragments_set`.
print("removed " + str(len(removed_fragments)) + " fragments")
for removed_fragment in removed_fragments:
  fragments_set.remove(removed_fragment)

# Create missing ok-fragment files.
added_fragment_ok_files = 0
for fragment in fragments_set:
  fragment_ok = fragment + ".ok"
  fragment_ok_path = os.path.join(array_dir, fragment_ok)
  if not os.path.isfile(fragment_ok_path):
    open(fragment_ok_path, "a").close()
    added_fragment_ok_files += 1

print("added " + str(added_fragment_ok_files) + " ok-fragments")

It seems to work, thanks!