As part of PyData Amsterdam 2020 (tickets on sale), I was asked to prepare a session on getting started with testing in Python.
The goal was to empower participants to contribute to open source development, on libraries such as pandas.
This is always a daunting and humbling task, even more so now that I basically run the Xebia Academy full-time, with no time to code — or to write tests.
So while readying my talk, I thought: wouldn’t it be a great challenge if I could not only teach participants something new but also contribute a real test to pandas?
Lo and behold: the pandas developers are ready to merge my contribution!
import pytest

import pandas as pd
from pandas import DataFrame
import pandas._testing as tm


@pytest.mark.parametrize(
    "grouping,_index",
    [
        (
            {"level": 0},
            pd.MultiIndex.from_tuples(
                [(0, 0), (0, 0), (1, 1), (1, 1), (1, 1)], names=[None, None]
            ),
        ),
        (
            {"by": "X"},
            pd.MultiIndex.from_tuples(
                [(0, 0), (1, 0), (2, 1), (3, 1), (4, 1)], names=["X", None]
            ),
        ),
    ],
)
def test_rolling_positional_argument(grouping, _index, raw):
    # GH 34605
    # `raw` is a pytest fixture supplied by the pandas test suite (True/False).

    def scaled_sum(*args):
        # Fail loudly if the extra positional argument was not forwarded.
        if len(args) < 2:
            raise ValueError("The function needs two arguments")
        array, scale = args
        return array.sum() / scale

    df = DataFrame(data={"X": range(5)}, index=[0, 0, 1, 1, 1])

    expected = DataFrame(data={"X": [0.0, 0.5, 1.0, 1.5, 2.0]}, index=_index)
    result = df.groupby(**grouping).rolling(1).apply(scaled_sum, raw=raw, args=(2,))
    tm.assert_frame_equal(result, expected)
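To see what the test exercises, here is a minimal sketch that drops the groupby step. It assumes only that `Rolling.apply` forwards the `args` tuple to the function as extra positional arguments. The ungrouped frame and the two-parameter `scaled_sum` are simplifications for illustration and are not part of the pandas test suite.

```python
import pandas as pd


def scaled_sum(window, scale):
    """Sum the values in one rolling window, then divide by `scale`."""
    return window.sum() / scale


df = pd.DataFrame({"X": range(5)})
expected = pd.DataFrame({"X": [0.0, 0.5, 1.0, 1.5, 2.0]})

# raw=True hands each window to the function as a NumPy array,
# raw=False as a Series; both support .sum().
for raw in (True, False):
    # args=(2,) is passed on to scaled_sum as its `scale` argument.
    result = df.rolling(1).apply(scaled_sum, raw=raw, args=(2,))
    pd.testing.assert_frame_equal(result, expected)
```

With a window of size 1, each window holds a single value, so the result is simply every value of `X` divided by 2. The real test adds the groupby step and checks both `level=0` and `by="X"` grouping.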
I learned tons of things about new features of pytest, how the pandas code base is structured, and more!