superleaf.dataframe.transform
Functions
expand_dict_to_cols – Expand dictionary-like values in one or more DataFrame columns into new, flat columns.
- superleaf.dataframe.transform.expand_dict_to_cols(df: DataFrame, cols, fields=None, with_col_prefix=True, prefix='', prefix_fun=None, sep='_', drop=True, dropna=False, col_renamer=None, recursive=False, uniform_keys=False, default=nan) → DataFrame
Expand dictionary-like values in one or more DataFrame columns into new, flat columns.
For each column in cols, dictionary-like entries are unpacked into separate columns (one per key). You can control which keys to expand, how to name the new columns, whether to drop the original columns, and whether nested dictionaries should be expanded recursively.
- Parameters:
df (pd.DataFrame) – The input DataFrame.
cols (str or Sequence[str]) – Column name or list of column names whose values are dicts to expand.
fields (str or Sequence[str], optional) – Specific keys to extract from each dict. If None, all keys encountered (or, if uniform_keys=True, the keys from the first non-null dict) will be used.
with_col_prefix (bool, optional) – If True, prefix new column names with the source column name and sep. If False, only prefix (if provided) is used.
prefix (str, optional) – A string to prepend to all new column names.
prefix_fun (callable, optional) – Function mapping (col_name, current_prefix) to a new prefix.
sep (str, optional) – Separator between prefix and field name. Default is '_'.
drop (bool, optional) – If True, drop the original columns from the output DataFrame.
dropna (bool, optional) – If True, skip creating a new column when all values for a given key are null.
col_renamer (dict or callable, optional) – Mapping or function to rename generated column names.
recursive (bool, optional) – If True, expand nested dict values recursively.
uniform_keys (bool, optional) – If True, assume all dicts have the same keys and extract from the first non-null entry.
default (scalar, optional) – Value to use when a key is missing in a particular row.
- Returns:
A new DataFrame with each specified dict-column expanded into its own column(s).
- Return type:
pd.DataFrame
- Raises:
TypeError – If a non-dict-like value is encountered when expanding.
Examples
>>> import pandas as pd
>>> from superleaf.dataframe.transform import expand_dict_to_cols
>>> df = pd.DataFrame({
...     'metadata': [
...         {'id': 1, 'score': 9},
...         {'id': 2, 'score': 7, 'extra': 5},
...         None
...     ]
... })
>>> expand_dict_to_cols(df, 'metadata')
   metadata_id  metadata_score  metadata_extra
0            1             9.0             NaN
1            2             7.0             5.0
2          NaN             NaN             NaN
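
The naming and fill rules described above can be illustrated without pandas. The following is a minimal pure-Python sketch, not superleaf's implementation: the helper name expand_dict_rows is hypothetical, it works on a list of row dicts rather than a DataFrame, and it covers only the default prefixing (source column name plus sep), the default fill value, and recursive. Joining nested keys with sep when recursive=True is an assumption about the naming convention, not something the documentation above states.

```python
import math


def expand_dict_rows(rows, col, sep="_", recursive=False, default=math.nan):
    """Hypothetical illustration of the expansion rules; not superleaf's code."""

    def flatten(value, prefix):
        # Mirror the documented TypeError for non-dict-like values.
        if not isinstance(value, dict):
            raise TypeError(f"cannot expand non-dict value: {value!r}")
        flat = {}
        for key, val in value.items():
            name = f"{prefix}{sep}{key}"
            if recursive and isinstance(val, dict):
                # Assumption: nested keys are joined with the same separator.
                flat.update(flatten(val, name))
            else:
                flat[name] = val
        return flat

    # A null entry contributes no keys, so all of its new columns get `default`.
    expanded = [{} if row[col] is None else flatten(row[col], col) for row in rows]

    # New columns appear in order of first occurrence across all rows.
    columns = []
    for flat in expanded:
        for name in flat:
            if name not in columns:
                columns.append(name)

    result = []
    for row, flat in zip(rows, expanded):
        # Leave out the source column, analogous to drop=True.
        out = {k: v for k, v in row.items() if k != col}
        for name in columns:
            out[name] = flat.get(name, default)
        result.append(out)
    return result


rows = [
    {"metadata": {"id": 1, "score": 9}},
    {"metadata": {"id": 2, "score": 7, "extra": 5}},
    {"metadata": None},
]
flat_rows = expand_dict_rows(rows, "metadata")
nested_rows = expand_dict_rows([{"m": {"a": {"b": 1}}}], "m", recursive=True)
```

Here flat_rows has the keys metadata_id, metadata_score and metadata_extra in every row, with the default filling any key a row lacks. nested_rows shows the assumed recursive naming, where the nested key becomes m_a_b.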