superleaf.dataframe.selection
Functions
dfilter | Filters a DataFrame by applying provided conditions. |
partition | Partitions a DataFrame into two subsets based on provided filtering conditions. |
reorder_columns | Reorders columns in a DataFrame based on the provided parameters. |
- superleaf.dataframe.selection.dfilter(df: DataFrame, *filters, **col_filters) → DataFrame
Filters a DataFrame by applying provided conditions.
- Parameters:
df (pd.DataFrame) – The DataFrame to filter.
*filters – Variable positional arguments that can be:
- Instances of ColOp.
- Callables applied row-wise, returning boolean values.
- Iterables of boolean values indicating row selection.
**col_filters – Keyword arguments mapping column names to conditions, which can be:
- Values (equality filter).
- Instances of ColOp.
- Callables applied element-wise to the specified column.
- Returns:
A copy of the DataFrame containing only rows satisfying all filters.
- Return type:
pd.DataFrame
Examples
>>> filtered_df = dfilter(df, Col('age') > 30, status='active')
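The filtering semantics described above (every condition must hold, and a copy is returned) can be approximated in plain pandas. This is a minimal sketch, not superleaf's implementation: the sample data and the helper name `plain_filter` are hypothetical, and it covers only boolean masks and equality filters, not ColOp instances or callables.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy", "Dee"],
    "age": [25, 41, 35, 52],
    "status": ["active", "active", "inactive", "active"],
})


def plain_filter(frame, *masks, **equals):
    """Keep rows where every boolean mask and every column == value test holds."""
    # Start with "keep everything", then AND in one condition at a time.
    keep = pd.Series(True, index=frame.index)
    for mask in masks:
        keep &= pd.Series(mask, index=frame.index)
    for column, value in equals.items():
        keep &= frame[column] == value
    # Return a copy so later edits do not touch the original frame.
    return frame[keep].copy()


# Plain-pandas stand-in for: dfilter(df, Col('age') > 30, status='active')
filtered_df = plain_filter(df, df["age"] > 30, status="active")
print(filtered_df["name"].tolist())
```

Here the positional mask `df["age"] > 30` stands in for `Col('age') > 30`, and the keyword `status="active"` is treated as an equality filter, matching the description of `**col_filters`.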
- superleaf.dataframe.selection.partition(df: DataFrame, *filters, **col_filters) → tuple[DataFrame, DataFrame]
Partitions a DataFrame into two subsets based on provided filtering conditions.
- Parameters:
df (pd.DataFrame) – The DataFrame to partition.
*filters – Variable positional arguments (see dfilter documentation).
**col_filters – Keyword arguments (see dfilter documentation).
- Returns:
First DataFrame contains rows matching the provided filters.
Second DataFrame contains rows that do not match.
- Return type:
tuple[pd.DataFrame, pd.DataFrame]
Example
>>> passed_df, failed_df = partition(df, score=lambda x: x > 50)
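As a rough illustration of the two-way split, the same result can be obtained in plain pandas with a boolean mask and its negation. This is a sketch under assumptions, not superleaf's code: the sample data is hypothetical, and the element-wise callable is applied with `Series.apply` to mirror the `score=lambda x: x > 50` keyword.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "student": ["a", "b", "c", "d"],
    "score": [72, 50, 31, 88],
})

# Apply the condition element-wise to the "score" column.
mask = df["score"].apply(lambda x: x > 50)

passed_df = df[mask].copy()   # rows matching the condition
failed_df = df[~mask].copy()  # rows that do not match

print(passed_df["student"].tolist(), failed_df["student"].tolist())
```

Because the second frame is built from the negated mask, every row of the input lands in exactly one of the two outputs.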
- superleaf.dataframe.selection.reorder_columns(df: DataFrame, columns: str | Sequence[str], back=False, after=None, before=None) → DataFrame
Reorders columns in a DataFrame based on the provided parameters.
- Parameters:
df (pd.DataFrame) – The DataFrame whose columns to reorder.
columns (Union[str, Sequence[str]]) – Column name or sequence of column names to reorder.
back (bool, optional) – If True, moves specified columns to the end. Default is False.
after (str, optional) – Column name after which the specified columns should be placed. Default is None.
before (str, optional) – Column name before which the specified columns should be placed. Default is None.
- Returns:
A new DataFrame with reordered columns.
- Return type:
pd.DataFrame
Notes
Only one of back, after, or before can be used at a time.
- Raises:
ValueError – If more than one of the back, after, or before parameters is provided simultaneously.
Examples
>>> reordered_df = reorder_columns(df, ['age', 'name'], after='id')
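To make the `after` placement concrete, here is a plain-pandas sketch of one way such a reordering could work. It is not superleaf's implementation: the helper name `move_after` and the sample data are hypothetical, and it assumes the moved columns keep the order in which they are listed.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "id": [1, 2],
    "name": ["Ann", "Bob"],
    "city": ["Oslo", "Rome"],
    "age": [25, 41],
})


def move_after(frame, columns, after):
    """Return a new frame with `columns` placed directly after the `after` column."""
    # Accept a single column name or a sequence of names.
    moved = [columns] if isinstance(columns, str) else list(columns)
    # Columns that stay put, in their original order.
    rest = [c for c in frame.columns if c not in moved]
    pos = rest.index(after) + 1
    # Selecting with a list of labels returns a new DataFrame.
    return frame[rest[:pos] + moved + rest[pos:]]


# Plain-pandas stand-in for: reorder_columns(df, ['age', 'name'], after='id')
reordered_df = move_after(df, ["age", "name"], after="id")
print(list(reordered_df.columns))
```

The input frame is left unchanged, which matches the documented behaviour of returning a new DataFrame.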