A couple of comments – happy to dig in further if I misunderstood what you mean by “from scratch” here:
You only need to specify fill values for the columns that contain NAs.
If you are using the Pandas read_csv implementation directly, there are a variety of options for handling NAs, including overrides for interpreting other strings as NA (see here).
For Pandas dataframes, you can call pandas.DataFrame.fillna to accomplish the same thing on the dataframe itself.
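To make the two Pandas options concrete, here is a minimal sketch. The sample data, the column names, the extra NA strings ("missing" and "-"), and the fill values are all made up for illustration, so adjust them to your own file:

```python
import io
import pandas as pd

# Hypothetical CSV content; "missing", "-" and the empty field stand in for NAs.
csv_text = "id,score,label\n1,3.5,a\n2,missing,b\n3,,-\n"

# Option 1: tell read_csv which extra strings count as NA, per column.
# Empty fields are already treated as NA by default.
df = pd.read_csv(
    io.StringIO(csv_text),
    na_values={"score": ["missing"], "label": ["-"]},
)

# Option 2: fill NAs on the dataframe itself, only for the columns that have them.
filled = df.fillna({"score": 0.0, "label": ""})

print(df.isna().sum())
print(filled)
```

Passing a dict to fillna leaves the other columns untouched, which matches the point above about only handling the columns that actually contain NAs.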
Indeed! Another highlight is the “Hilbert” tiling feature, which will simplify array creation and provide significant performance boosts. There is an active list of upcoming features and improvements here: