Version 0.9.0 (October 7, 2012)
This is a major release from 0.8.1 and includes several new features and enhancements along with a large number of bug fixes. New features include vectorized unicode encoding/decoding for Series.str, a to_latex method for DataFrame, more flexible parsing of boolean values, and the ability to download options data from Yahoo! Finance.
New features
Add encode and decode for unicode handling to vectorized string processing methods in Series.str (GH1706)
Add DataFrame.to_latex method (GH1735)
Add convenient expanding window equivalents of all rolling_* ops (GH1785)
Add Options class to pandas.io.data for fetching options data from Yahoo! Finance (GH1748, GH1739)
More flexible parsing of boolean values (Yes, No, TRUE, FALSE, etc) (GH1691, GH1295)
Add level parameter to Series.reset_index
TimeSeries.between_time can now select times across midnight (GH1871)
Series constructor can now handle generator as input (GH1679)
DataFrame.dropna can now take multiple axes (tuple/list) as input (GH924)
Enable skip_footer parameter in ExcelFile.parse (GH1843)
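Several of the additions listed above can be tried out as follows. This is a minimal sketch that assumes a recent pandas rather than 0.9.0 itself: the expanding window uses the method-style `.expanding()` spelling instead of the function-style counterparts to rolling_* that these notes describe, and the Yes/No mapping goes through the `true_values` and `false_values` options of `read_csv`. All sample data and variable names are made up for illustration.

```python
import io

import pandas as pd

# Vectorized unicode handling: encode each string to bytes, then decode back.
words = pd.Series(["café", "naïve"])
encoded = words.str.encode("utf-8")
decoded = encoded.str.decode("utf-8")

# Expanding window: a mean over all rows seen so far.
# (.expanding() is the current-pandas spelling, assumed here.)
running_mean = pd.Series([1.0, 2.0, 3.0, 4.0]).expanding().mean()

# Boolean parsing: TRUE/FALSE are recognized without any options ...
upper = pd.read_csv(io.StringIO("flag\nTRUE\nFALSE\n"))
# ... while Yes/No are mapped explicitly in this sketch.
yes_no = pd.read_csv(
    io.StringIO("flag\nYes\nNo\n"),
    true_values=["Yes"],
    false_values=["No"],
)

# Series.reset_index with level: move only the "inner" level into a column.
levels = pd.MultiIndex.from_product([["a", "b"], [1, 2]], names=["outer", "inner"])
nested = pd.Series([10, 20, 30, 40], index=levels, name="value")
flattened = nested.reset_index(level="inner")

# between_time across midnight: the start time is later than the end time.
stamps = pd.DatetimeIndex(
    [pd.Timestamp("2012-10-07") + pd.Timedelta(hours=h) for h in range(24)]
)
hourly = pd.Series(range(24), index=stamps)
overnight = hourly.between_time("22:00", "02:00")

# Series constructor fed by a generator expression.
squares = pd.Series(x * x for x in range(4))
```

Because each value in `hourly` equals its hour of day, `overnight` makes it easy to see which hours fall inside the 22:00 to 02:00 window.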
API changes
The default column names used when header=None and no column names are passed to functions like read_csv have changed to be more Pythonic and amenable to attribute access:
In [1]: import io
In [2]: data = """
...: 0,0,1
...: 1,1,0
...: 0,1,0
...: """
...:
In [3]: df = pd.read_csv(io.StringIO(data), header=None)
In [4]: df
Out[4]:
   0  1  2
0  0  0  1
1  1  1  0
2  0  1  0
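The generated labels depend on the pandas version in use, so code that relies on attribute access may prefer to name the columns itself. A small sketch, assuming a recent pandas; the labels a, b and c are hypothetical and do not come from these notes:

```python
import io

import pandas as pd

data = "0,0,1\n1,1,0\n0,1,0\n"

# No header row and no names: pandas generates the column labels itself.
auto = pd.read_csv(io.StringIO(data), header=None)

# Explicit names (hypothetical labels) give stable, attribute-friendly columns.
named = pd.read_csv(io.StringIO(data), header=None, names=["a", "b", "c"])
second_column = named.b
```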
Creating a Series from another Series, passing an index, will cause reindexing to happen inside rather than treating the Series like an ndarray. Technically improper usages like Series(df[col1], index=df[col2]) that worked before “by accident” (this was never intended) will lead to all NA Series in some cases. To be perfectly clear:
In [5]: s1 = pd.Series([1, 2, 3])
In [6]: s1
Out[6]:
0    1
1    2
2    3
dtype: int64
In [7]: s2 = pd.Series(s1, index=["foo", "bar", "baz"])
In [8]: s2
Out[8]:
foo   NaN
bar   NaN
baz   NaN
dtype: float64
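When the intent is to relabel the values rather than align them, one possible workaround is to pass the underlying array instead of the Series. The sketch below assumes a recent pandas and uses made-up variable names; it contrasts the aligned and positional forms and shows what happens when only some labels match.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3])

# Labels that do not occur in s1's index: alignment finds nothing to copy.
relabelled_wrong = pd.Series(s1, index=["foo", "bar", "baz"])

# Passing the underlying array attaches the new labels by position instead.
relabelled = pd.Series(s1.values, index=["foo", "bar", "baz"])

# Partially matching labels: matched values are kept, the rest become NaN.
partial = pd.Series(s1, index=[2, 0, 5])
```

For the df[col1] and df[col2] case mentioned above, the same idea would apply: hand the constructor plain arrays so that no alignment takes place.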
Deprecated day_of_year API removed from PeriodIndex, use dayofyear (GH1723)
Don’t modify NumPy suppress printoption to True at import time
The internal HDF5 data arrangement for DataFrames has been transposed. Legacy files will still be readable by HDFStore (GH1834, GH1824)
Legacy cruft removed: pandas.stats.misc.quantileTS
Use ISO8601 format for Period repr: monthly, daily, and on down (GH1776)
Empty DataFrame columns are now created as object dtype. This will prevent a class of TypeErrors that was occurring in code where the dtype of a column would depend on the presence of data or not (e.g. a SQL query having results) (GH1783)
Setting parts of DataFrame/Panel using ix now aligns input Series/DataFrame (GH1630)
first and last methods in GroupBy no longer drop non-numeric columns (GH1809)
Resolved inconsistencies in specifying custom NA values in text parser. na_values of type dict no longer override default NAs unless keep_default_na is set to false explicitly (GH1657)
DataFrame.dot will not do data alignment, and also work with Series (GH1915)
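Three of the changes in this list lend themselves to a quick check. The following is a sketch under the assumption of a recent pandas, with invented sample data: it compares a dict-valued na_values with and without keep_default_na, shows that first and last keep a string column, and multiplies a DataFrame by a Series whose index matches the frame's columns in the same order.

```python
import io

import pandas as pd

raw = "a,b\nNA,x\nmissing,y\n1,NA\n"

# Default: the per-column dict adds to the built-in NA strings.
merged = pd.read_csv(io.StringIO(raw), na_values={"a": ["missing"]})

# keep_default_na=False: only the strings listed in the dict count as NA.
strict = pd.read_csv(
    io.StringIO(raw),
    na_values={"a": ["missing"]},
    keep_default_na=False,
)

# GroupBy.first/last keep the non-numeric "label" column.
frame = pd.DataFrame(
    {"key": ["a", "a", "b"], "label": ["x", "y", "z"], "n": [1, 2, 3]}
)
firsts = frame.groupby("key").first()
lasts = frame.groupby("key").last()

# DataFrame.dot with a Series indexed by the frame's column labels.
weights = pd.Series([10, 1], index=["x", "y"])
products = pd.DataFrame([[1, 2], [3, 4]], columns=["x", "y"]).dot(weights)
```

In `strict`, the literal string "NA" survives in both columns because the built-in NA markers are switched off.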
See the full release notes or issue tracker on GitHub for a complete list.