Interfacing with the Pandas Package
pandas is a package for high-performance data analysis of table-like structures and is complementary to the Table class in astropy.
To exchange data between the Table class and the pandas.DataFrame class (the main data structure in pandas), the Table class includes two methods, to_pandas() and from_pandas().
Example
To demonstrate, we can create a minimal table:
>>> from astropy.table import Table
>>> t = Table()
>>> t['a'] = [1, 2, 3, 4]
>>> t['b'] = ['a', 'b', 'c', 'd']
We can then convert this table to a DataFrame:
>>> df = t.to_pandas()
>>> df
   a  b
0  1  a
1  2  b
2  3  c
3  4  d
>>> type(df)
<class 'pandas.core.frame.DataFrame'>
It is also possible to create a table from a DataFrame:
>>> t2 = Table.from_pandas(df)
>>> t2
<Table length=4>
  a      b
int64 string8
----- -------
    1       a
    2       b
    3       c
    4       d
The conversions to and from pandas are subject to the following caveats:
- The DataFrame structure does not support multidimensional columns, so Table objects with multidimensional columns cannot be converted to DataFrame.
- Masked tables can be converted, but in columns of float or string values the resulting DataFrame uses numpy.nan to indicate missing values. For float columns, the conversion therefore does not necessarily round-trip when converting back to an astropy table, because the distinction between numpy.nan and masked values is lost. This is not a problem for integer columns.
- Tables with Mixin Columns cannot be converted.