allensdk.core.dataframe_utils module

A collection of utilities to manipulate pandas DataFrames.

allensdk.core.dataframe_utils.INT_NULL = -99

Value used by enforce_df_int_typing to fill NaN/None entries in integer columns.

allensdk.core.dataframe_utils.enforce_df_column_order(input_df: DataFrame, column_order: List[str]) → DataFrame

Return the DataFrame with its columns reordered.

Parameters:

input_df: pandas.DataFrame: DataFrame with columns to be ordered.
column_order: list of str: Ordering of column names to enforce. Columns not specified are shifted to the end but retain their relative order. A specified column that is not in the DataFrame is ignored.

Returns:

output_df: pandas.DataFrame: The same DataFrame as the input but with columns reordered.
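The ordering rules above can be illustrated with plain pandas. This is a minimal sketch, not the allensdk implementation: the helper name reorder_columns_sketch and the sample column names are hypothetical.

```python
import pandas as pd


def reorder_columns_sketch(input_df, column_order):
    """Hypothetical stand-in that follows the documented ordering rules."""
    # Requested columns that actually exist, in the requested order.
    leading = [col for col in column_order if col in input_df.columns]
    # Every other column, kept in its original relative order.
    trailing = [col for col in input_df.columns if col not in leading]
    return input_df[leading + trailing]


df = pd.DataFrame({"a": [1], "b": [2], "c": [3], "d": [4]})
# "missing" is not a column of df, so it is ignored.
reordered = reorder_columns_sketch(df, ["c", "missing", "a"])
print(list(reordered.columns))  # ['c', 'a', 'b', 'd']
```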

allensdk.core.dataframe_utils.enforce_df_int_typing(input_df: DataFrame, int_columns: List[str], use_pandas_type: object = False) → DataFrame

Enforce integer typing for columns that may have lost int typing when combined into the final DataFrame.

Parameters:

input_df: pandas.DataFrame: DataFrame with typing to enforce.
int_columns: list of str: Columns on which to enforce int typing. Any NaN/None values are filled with the value of INT_NULL defined in this module. Requested columns not in the DataFrame are ignored.
use_pandas_type: bool: If True, use the pandas type Int64 instead of filling with INT_NULL to enforce integer typing. This type can have issues converting to numpy/array type values.

Returns:

output_df: pandas.DataFrame: DataFrame with the specified columns hard typed to Int64 to allow NA values without resorting to float type.
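The two typing strategies described above can be compared in a short pandas example. This is a sketch under stated assumptions rather than the library's code: enforce_int_sketch is a hypothetical helper, the local INT_NULL mirrors the documented constant, and copying the input is a choice made here for clarity (the source does not say whether the real function copies).

```python
import numpy as np
import pandas as pd

INT_NULL = -99  # mirrors the module-level constant documented above


def enforce_int_sketch(input_df, int_columns, use_pandas_type=False):
    """Hypothetical stand-in showing both typing strategies."""
    output_df = input_df.copy()  # assumption: work on a copy
    for col in int_columns:
        if col not in output_df.columns:
            continue  # requested columns that are absent are ignored
        if use_pandas_type:
            # Nullable integer type: missing values become pd.NA.
            output_df[col] = output_df[col].astype("Int64")
        else:
            # Plain integer type: missing values become INT_NULL.
            output_df[col] = output_df[col].fillna(INT_NULL).astype(int)
    return output_df


# A NaN forces pandas to store this id column as float.
df = pd.DataFrame({"session_id": [1.0, np.nan, 3.0], "score": [0.5, 0.7, 0.9]})

filled = enforce_int_sketch(df, ["session_id", "not_present"])
nullable = enforce_int_sketch(df, ["session_id"], use_pandas_type=True)

print(filled["session_id"].tolist())   # [1, -99, 3]
print(nullable["session_id"].dtype)    # Int64
```

The fill-value approach keeps an ordinary integer dtype at the cost of a sentinel that callers must recognize, while Int64 preserves the missing value explicitly.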

allensdk.core.dataframe_utils.patch_df_from_other(target_df: DataFrame, source_df: DataFrame, columns_to_patch: List[str], index_column: str) → DataFrame

Overwrite column values in target_df from column values in source_df in rows where the two dataframes share a value of index_column.

Parameters:

target_df: pd.DataFrame: The dataframe whose columns will get overwritten
source_df: pd.DataFrame: The dataframe from which correct values are to be read
columns_to_patch: List[str]: The columns to be overwritten
index_column: str: The column to join the dataframes on

Returns:

patched_df: pd.DataFrame: target_df except with the specified columns and rows overwritten.

Notes

If any of the columns_to_patch are not in target_df, they will be added.

This function starts by creating a copy of target_df, so it will not alter the argument in-place.
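The patching behavior described above can be sketched with plain pandas. This is an illustration only, not the allensdk implementation: patch_sketch is a hypothetical helper, it assumes the values of index_column are unique in source_df, and leaving unmatched rows of a newly added column as missing is an assumption the source does not confirm.

```python
import pandas as pd


def patch_sketch(target_df, source_df, columns_to_patch, index_column):
    """Hypothetical stand-in for the documented patching behavior."""
    patched_df = target_df.copy()  # the caller's DataFrame is left untouched
    # Assumption: index_column values are unique in source_df.
    source_lookup = source_df.set_index(index_column)
    # Rows of the target whose index_column value also occurs in the source.
    shared = patched_df[index_column].isin(source_lookup.index)
    keys = patched_df.loc[shared, index_column].tolist()
    for col in columns_to_patch:
        if col not in patched_df.columns:
            patched_df[col] = None  # add the column; unmatched rows stay missing
        patched_df.loc[shared, col] = source_lookup.loc[keys, col].to_numpy()
    return patched_df


target = pd.DataFrame({"id": [1, 2, 3], "age": [10, 20, 30]})
source = pd.DataFrame({"id": [2, 3, 4], "age": [21, 31, 41], "sex": ["F", "M", "F"]})

patched = patch_sketch(target, source, ["age", "sex"], "id")
print(patched["age"].tolist())  # [10, 21, 31]
```

Only the rows with id 2 and 3 are overwritten, because those are the only values of the index column shared by both frames. The sex column is added because it was not present in the target.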

allensdk.core.dataframe_utils.return_one_dataframe_row_only(input_table: DataFrame, index_value: int, table_name: str) → Series

Look up and return exactly one row from the DataFrame, with an informative error if zero or multiple rows match the given index.

This method is used mainly to give a more informative error when attempting to retrieve metadata from the behavior cache metadata tables.

Parameters:

input_table: pandas.DataFrame: Input DataFrame to retrieve the row from.
index_value: int: Index of the row to return. Must match an index value in the input DataFrame/table, for example in the case of ecephys_session_table or behavior_session_table.
table_name: str: Name of the table being queried. Used to output the table name in case of error.

Returns:

row: pandas.Series: Row corresponding to the input index.
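The single-row lookup can be sketched as follows. This is a minimal illustration, not the allensdk implementation: one_row_sketch is a hypothetical helper, the sample table is made up, and the use of RuntimeError and the wording of the message are assumptions, since the source does not name the error type.

```python
import pandas as pd


def one_row_sketch(input_table, index_value, table_name):
    """Hypothetical stand-in: return exactly one row or fail loudly."""
    matches = input_table[input_table.index == index_value]
    if len(matches) != 1:
        # Assumption: the error type and message are illustrative only.
        raise RuntimeError(
            f"Expected exactly one row in {table_name} for index "
            f"{index_value}, found {len(matches)}."
        )
    return matches.iloc[0]  # a pandas.Series for the matching row


table = pd.DataFrame(
    {"mouse_id": [101, 102]},
    index=pd.Index([5, 6], name="behavior_session_id"),
)

row = one_row_sketch(table, 5, "behavior_session_table")
print(row["mouse_id"])  # 101
```

Calling the helper with an index that is absent from the table, such as 99, raises the error instead of returning an empty result.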
