Pandas dataframe to GDB
I have a Pandas dataframe that I want to convert to a GDB. I am using a dataframe because I have text and numeric data fields.
If df is my dataframe, df.dtypes returns:
SurveyName object (this is how Pandas reports my string fields)
Job int64
Record int64
Date int64
...
dtype: object
When I try to write the line with
gdb.write_line('L0', df, df.columns)
I get the error:
File "...\geosoft\gxpy\gdb.py", line 2146, in write_line
self.write_channel(line, cs, data[:, np_index: np_index + w], fid=fid)
File "...\geosoft\gxpy\gdb.py", line 2027, in write_channel
cs = self.new_channel(channel, data.dtype, array=_va_width(data))
File "...\geosoft\gxpy\gdb.py", line 1189, in new_channel
gxu.gx_dtype(dtype),
File "...\geosoft\gxpy\utility.py", line 566, in gx_dtype
return _np2gx_type[str(dtype)]
KeyError: 'object'
I also tried explicitly converting each text dataframe column to strings, but it doesn't help:
for column in df.select_dtypes(include=['object']):
    df[column] = df[column].astype('|S')
Is it possible to go directly from a Pandas dataframe to a GDB? Or do I need to use low level functions to write each channel and manually specify the type?
Thanks,
Nick
Comments
It looks like a small change to the gx_dtype function in gxpy's utility.py avoids the error: adding an np.object_ check, as below:
if dtype.type is np.str_:
    # x4 to allow for full UTF-8 characters
    return -int(dtype.str[2:])*4
elif dtype.type is np.object_:
    # My edit, assign length 80 to all strings
    return -int(80)
I assume this is not a complete solution. Any ideas how to do this properly?
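One alternative that avoids patching gxpy might be to convert the dataframe on the caller's side, so that no object-dtype array ever reaches gx_dtype. The sketch below turns every non-numeric column into a fixed-width NumPy unicode array (dtype.type is np.str_, which the existing branch in gx_dtype already handles) and writes the columns one channel at a time. The helper names frame_to_channel_arrays and write_frame are made up for illustration. The write_channel(line, channel, data) call is taken from the traceback above, and I am assuming it accepts a 1-D string array; I have not checked that against every gxpy release.

```python
import numpy as np
import pandas as pd


def frame_to_channel_arrays(df):
    """Return {column name: 1-D NumPy array} with no object-dtype arrays."""
    arrays = {}
    for name in df.columns:
        data = df[name].to_numpy()
        if data.dtype.kind not in "iuf":
            # Anything that is not int/uint/float is treated as text and
            # rebuilt as a fixed-width unicode array, so dtype.type is
            # np.str_ instead of np.object_.
            data = np.array([str(v) for v in data], dtype=np.str_)
        arrays[str(name)] = data
    return arrays


def write_frame(gdb, line, df):
    """Write each column of df to `line` as its own channel.

    Assumption: `gdb` is an open geosoft.gxpy.gdb.Geosoft_gdb whose
    write_channel(line, channel, data) accepts these arrays.
    """
    for name, data in frame_to_channel_arrays(df).items():
        gdb.write_channel(line, name, data)
```

With an open database this would be called as write_frame(gdb, 'L0', df). The string width is taken from the longest value in each column rather than a fixed 80, and writing per channel sidesteps the mixed-type 2-D array that write_line would otherwise have to slice.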