superleaf.dataframe.selection

Functions

`dfilter`(df, *filters, **col_filters)	Filters a DataFrame by applying provided conditions.
`partition`(df, *filters, **col_filters)	Partitions a DataFrame into two subsets based on provided filtering conditions.
`reorder_columns`(df, columns[, back, after, ...])	Reorders columns in a DataFrame based on the provided parameters.

superleaf.dataframe.selection.dfilter(df: DataFrame, *filters, **col_filters) → DataFrame

Filters a DataFrame by applying provided conditions.

Parameters:

df (pd.DataFrame) – The DataFrame to filter.
*filters – Variable positional arguments, each of which can be:
- An instance of ColOp.
- A callable applied row-wise, returning boolean values.
- An iterable of boolean values indicating row selection.
**col_filters – Keyword arguments mapping column names to conditions, each of which can be:
- A value (equality filter).
- An instance of ColOp.
- A callable applied element-wise to the specified column.

Returns:

A copy of the DataFrame containing only rows satisfying all filters.

Return type:

pd.DataFrame

Examples

>>> filtered_df = dfilter(df, Col('age') > 30, status='active')
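The call above can be read as building one boolean mask per condition and keeping only the rows where every mask is true. The following is a minimal plain-pandas sketch of that behavior, not superleaf's implementation; the sample data and the names age_mask, status_mask, and row_mask are made up for illustration.

```python
import pandas as pd

# Illustrative data; not taken from the superleaf documentation.
df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cy", "Di"],
        "age": [25, 41, 37, 52],
        "status": ["active", "active", "inactive", "active"],
    }
)

# Stands in for the positional filter Col('age') > 30.
age_mask = df["age"] > 30
# Stands in for the keyword equality filter status='active'.
status_mask = df["status"] == "active"

# Rows must satisfy all filters; .copy() mirrors the documented "copy" return.
filtered_df = df[age_mask & status_mask].copy()

# A row-wise callable expressing the same two conditions on each row.
row_mask = df.apply(
    lambda row: row["age"] > 30 and row["status"] == "active", axis=1
).astype(bool)
same_df = df[row_mask].copy()

print(filtered_df)
```

Because the result is a copy, modifying filtered_df in this sketch leaves df unchanged.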

superleaf.dataframe.selection.partition(df: DataFrame, *filters, **col_filters) → tuple[DataFrame, DataFrame]

Partitions a DataFrame into two subsets based on provided filtering conditions.

Parameters:

df (pd.DataFrame) – The DataFrame to partition.
*filters – Variable positional arguments (see dfilter documentation).
**col_filters – Keyword arguments (see dfilter documentation).

Returns:

First DataFrame contains rows matching the provided filters.
Second DataFrame contains rows that do not match.

Return type:

tuple[pd.DataFrame, pd.DataFrame]

Examples

>>> passed_df, failed_df = partition(df, score=lambda x: x > 50)
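As a rough illustration of the two return values, the same split can be written in plain pandas with a mask and its complement. This is a sketch under assumed sample data, not the library's own code; the student column and the scores are hypothetical.

```python
import pandas as pd

# Illustrative data; not taken from the superleaf documentation.
df = pd.DataFrame(
    {"student": ["a", "b", "c", "d"], "score": [72, 50, 35, 91]}
)

# Element-wise callable on the 'score' column, as in score=lambda x: x > 50.
mask = df["score"].map(lambda x: x > 50).astype(bool)

passed_df = df[mask].copy()   # rows matching the filter
failed_df = df[~mask].copy()  # rows that do not match

print(passed_df)
print(failed_df)
```

Every row of df lands in exactly one of the two frames. Note that a score of exactly 50 does not satisfy x > 50, so that row ends up in failed_df.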

superleaf.dataframe.selection.reorder_columns(df: DataFrame, columns: str | Sequence[str], back=False, after=None, before=None) → DataFrame

Reorders columns in a DataFrame based on the provided parameters.

Parameters:

df (pd.DataFrame) – The DataFrame whose columns to reorder.
columns (Union[str, Sequence[str]]) – Column name or sequence of column names to reorder.
back (bool, optional) – If True, moves specified columns to the end. Default is False.
after (str, optional) – Column name after which the specified columns should be placed. Default is None.
before (str, optional) – Column name before which the specified columns should be placed. Default is None.

Returns:

A new DataFrame with reordered columns.

Return type:

pd.DataFrame

Notes

At most one of back, after, or before can be used at a time.

Raises:

ValueError – If more than one of back, after, or before is provided.

Examples

>>> reordered_df = reorder_columns(df, ['age', 'name'], after='id')
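To make the after= placement concrete, here is a small plain-pandas sketch of one way such a reordering could work. The helper move_after is hypothetical and is not part of superleaf. It assumes the moved columns keep the order given in columns, which the documentation above does not state.

```python
import pandas as pd


def move_after(df, columns, after):
    """Hypothetical helper: return df with `columns` placed right after `after`."""
    # Accept a single column name or a sequence of names.
    columns = [columns] if isinstance(columns, str) else list(columns)
    # Columns that stay put, in their original order.
    remaining = [c for c in df.columns if c not in columns]
    # Position just past the anchor column.
    pos = remaining.index(after) + 1
    new_order = remaining[:pos] + columns + remaining[pos:]
    # Selecting with a list of labels returns a new DataFrame.
    return df[new_order]


# Illustrative data; not taken from the superleaf documentation.
df = pd.DataFrame({"name": ["Ann"], "city": ["Oslo"], "id": [1], "age": [25]})
reordered_df = move_after(df, ["age", "name"], after="id")

print(list(reordered_df.columns))  # ['city', 'id', 'age', 'name']
```

The original df keeps its column order, since the helper builds a new frame instead of modifying the input.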
