Error when using QueryCondition on a Unicode attribute

Hey,
I’m getting an error when trying to use a query condition on a "U"-type (Unicode) attribute.

Here is a code snippet to reproduce the issue:

import pandas as pd
import tiledb
from tiledb import QueryCondition

data = [
    ["str","str"]
]

df = pd.DataFrame(data, columns = ["Stype","Utype"])

df.Stype = df.Stype.astype("S0")

uri = "/tmp/test"
tiledb.from_pandas(uri, dataframe=df)

with tiledb.open(uri) as A:
    print(A.attr(0))
    print(A.attr(1))

with tiledb.open(uri) as A:
    qc = QueryCondition("Stype=='str'")
    A.query(attr_cond=qc).df[:]

"""
No errors
"""

with tiledb.open(uri) as A:
    qc = QueryCondition("Utype=='str'")
    A.query(attr_cond=qc).df[:]

"""
Traceback (most recent call last):
  File "/homefolder/roya/anaconda3/envs/tiledbclient/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3397, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-16-27f929b3f9ff>", line 3, in <cell line: 1>
    A.query(attr_cond=qc).df[:]
  File "/homefolder/roya/anaconda3/envs/tiledbclient/lib/python3.9/site-packages/tiledb/multirange_indexing.py", line 210, in __getitem__
    return self if self.return_incomplete else self._run_query()
  File "/homefolder/roya/anaconda3/envs/tiledbclient/lib/python3.9/site-packages/tiledb/multirange_indexing.py", line 341, in _run_query
    self.pyquery.submit()
tiledb.cc.TileDBError: [TileDB::QueryCondition] Error: Value node non-empty attribute may only be var-sized for ASCII strings: Utype
"""

Also, I found this thread:

But sadly it didn’t help

Any help would be much appreciated.

Best,
Roy

Hi @royassis,

The "U0" NumPy dtype maps internally to TILEDB_STRING_UTF8, which is not supported for query conditions. To set the attribute dtype to TILEDB_STRING_ASCII in from_pandas, use the column_types argument and map the attribute to "ascii". I’ve also modified your code to set Stype to "S0" (TILEDB_CHAR) the same way.

import pandas as pd
import tiledb
from tiledb import QueryCondition

data = [["str", "str"]]

df = pd.DataFrame(data, columns=["Stype", "Utype"])

uri = "/tmp/test"
tiledb.from_pandas(
    uri,
    dataframe=df,
    column_types={"Stype": "S0", "Utype": "ascii"},
)

with tiledb.open(uri) as A:
    print(A.attr(0))
    print(A.attr(1))

# no longer errors out
with tiledb.open(uri) as A:
    qc = QueryCondition("Stype=='str'")
    A.query(attr_cond=qc).df[:]
    qc = QueryCondition("Utype=='str'")
    A.query(attr_cond=qc).df[:]

Hey @nguyenv,
Thank you for your answer. That is what I did. I ran into problems with a few attributes because they contained Unicode characters, but that wasn’t a big issue.
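
For anyone hitting the same non-ASCII problem: one possible approach is to check which columns contain only ASCII strings before mapping them to "ascii" in column_types. The sketch below is plain Python and does not call TileDB at all; the helper names ascii_column_types and non_ascii_values are hypothetical, made up for this illustration, and the input is assumed to be a dict of column name to list of values (for a DataFrame, something like df.to_dict("list")).

```python
def ascii_column_types(columns):
    """Build a column_types-style dict for columns holding only ASCII strings.

    `columns` maps column name -> iterable of values (hypothetical input shape,
    e.g. the result of df.to_dict("list") for a pandas DataFrame).
    """
    column_types = {}
    for name, values in columns.items():
        # str.isascii() is available from Python 3.7 onwards.
        if all(isinstance(v, str) and v.isascii() for v in values):
            column_types[name] = "ascii"
    return column_types


def non_ascii_values(columns):
    """Map each column name to the string values containing non-ASCII characters."""
    offenders = {}
    for name, values in columns.items():
        bad = [v for v in values if isinstance(v, str) and not v.isascii()]
        if bad:
            offenders[name] = bad
    return offenders


# Made-up sample data: "Utype" contains one non-ASCII value.
sample = {"Stype": ["str", "abc"], "Utype": ["str", "café"]}
print(ascii_column_types(sample))  # {'Stype': 'ascii'}
print(non_ascii_values(sample))    # {'Utype': ['café']}
```

Columns reported by non_ascii_values would then need to be cleaned, or left out of the "ascii" mapping (and therefore out of query conditions), before calling from_pandas.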