allensdk.core.dataframe_utils module

allensdk.core.dataframe_utils.INT_NULL = -99

A collection of utilities to manipulate pandas DataFrames.

allensdk.core.dataframe_utils.enforce_df_column_order(input_df: DataFrame, column_order: List[str]) → DataFrame

Return the data frame with its columns reordered according to column_order.

input_df: pd.DataFrame

Data frame whose columns are to be ordered.

column_order: list of str

Ordering of column names to enforce. Columns not specified are shifted to the end of the order but retain their order amongst others not specified. If a specified column is not in the DataFrame it is ignored.


Returns a DataFrame identical to the input but with its columns reordered.
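The ordering rule described above can be illustrated with a minimal pandas sketch. This is not the allensdk implementation; the helper name reorder_columns_sketch and the example column names are hypothetical and exist only to show the documented behavior (requested columns first, unknown names ignored, remaining columns kept in their original relative order).

```python
import pandas as pd


def reorder_columns_sketch(input_df, column_order):
    """Hypothetical stand-in that mimics the documented ordering rule."""
    # Requested columns that actually exist, in the requested order.
    leading = [col for col in column_order if col in input_df.columns]
    # Everything else keeps its original relative order and goes last.
    trailing = [col for col in input_df.columns if col not in leading]
    return input_df[leading + trailing]


df = pd.DataFrame({"c": [1], "a": [2], "d": [3], "b": [4]})
# "missing" is not a column of df, so it is ignored.
result = reorder_columns_sketch(df, ["a", "b", "missing"])
print(list(result.columns))  # ['a', 'b', 'c', 'd']
```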

allensdk.core.dataframe_utils.enforce_df_int_typing(input_df: DataFrame, int_columns: List[str], use_pandas_type: object = False) → DataFrame

Enforce integer typing for columns that may have lost int typing when combined into the final DataFrame.

input_df: pd.DataFrame

DataFrame with typing to enforce.

int_columns: list of str

Columns on which to enforce int typing. Any NaN/None values in these columns are filled with INT_NULL, defined in this module. Requested columns not in the dataframe are ignored.

use_pandas_type: object, default False

If set, use the pandas type Int64 to enforce integer typing instead of filling with the value INT_NULL. This type can have issues converting to numpy/array type values.


Returns a DataFrame with the specified columns hard-typed to Int64 to allow NA values without resorting to a float type.
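The two modes described above (fill with INT_NULL versus the nullable pandas Int64 type) can be compared with a short sketch. It is written against plain pandas rather than allensdk; the helper name enforce_int_typing_sketch and the column names are hypothetical, and the copy made at the start is an assumption of the sketch, not a documented property of the library function.

```python
import numpy as np
import pandas as pd

INT_NULL = -99  # same sentinel value as documented for this module


def enforce_int_typing_sketch(input_df, int_columns, use_pandas_type=False):
    """Hypothetical stand-in that mimics the documented typing rule."""
    output_df = input_df.copy()  # assumption: the sketch works on a copy
    for col in int_columns:
        if col not in output_df.columns:
            continue  # requested columns that are absent are ignored
        if use_pandas_type:
            # Nullable integer type: missing values become <NA>.
            output_df[col] = output_df[col].astype("Int64")
        else:
            # Plain integers: missing values become the sentinel.
            output_df[col] = output_df[col].fillna(INT_NULL).astype(int)
    return output_df


df = pd.DataFrame({"session_id": [1.0, np.nan, 3.0], "score": [0.5, 0.7, 0.9]})
filled = enforce_int_typing_sketch(df, ["session_id", "not_a_column"])
nullable = enforce_int_typing_sketch(df, ["session_id"], use_pandas_type=True)

print(filled["session_id"].tolist())  # [1, -99, 3]
print(nullable["session_id"].dtype)   # Int64
```

The sentinel approach keeps a plain numpy-compatible integer column at the cost of a magic value that callers must filter out; the Int64 approach keeps missing values explicit.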

allensdk.core.dataframe_utils.patch_df_from_other(target_df: DataFrame, source_df: DataFrame, columns_to_patch: List[str], index_column: str) → DataFrame

Overwrite column values in target_df from column values in source_df in rows where the two dataframes share a value of index_column.

target_df: pd.DataFrame

The dataframe whose columns will get overwritten

source_df: pd.DataFrame

The dataframe from which correct values are to be read

columns_to_patch: List[str]

The columns to be overwritten

index_column: str

The column to join the dataframes on

patched_df: pd.DataFrame

target_df except with the specified columns and rows overwritten.


If any of the columns_to_patch are not in target_df, they will be added.

This function starts by creating a copy of target_df, so it will not alter the argument in-place.
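A minimal sketch of the patching behavior described above, using plain pandas. It is not the allensdk implementation: the helper name patch_sketch and the example data are hypothetical, and the sketch assumes that the values of index_column are unique within source_df.

```python
import pandas as pd


def patch_sketch(target_df, source_df, columns_to_patch, index_column):
    """Hypothetical stand-in that mimics the documented patching rule."""
    patched = target_df.copy(deep=True)  # never modify the argument in place
    # Assumption: index_column values are unique in source_df.
    source_indexed = source_df.set_index(index_column)
    keys = patched[index_column]
    shared = keys.isin(source_indexed.index)  # rows present in both frames
    for col in columns_to_patch:
        if col not in patched.columns:
            patched[col] = None  # add columns missing from the target
        # Look up each shared key in the source and overwrite those rows.
        patched.loc[shared, col] = keys[shared].map(source_indexed[col]).to_numpy()
    return patched


target = pd.DataFrame({"id": [1, 2, 3], "label": ["a", "b", "c"]})
source = pd.DataFrame(
    {"id": [2, 3, 4], "label": ["B", "C", "D"], "extra": ["x", "y", "z"]}
)
patched = patch_sketch(target, source, ["label", "extra"], "id")

print(patched["label"].tolist())  # ['a', 'B', 'C']
```

Rows whose id appears only in the target (id 1 here) keep their original label and receive a missing value in the newly added extra column; rows that appear only in the source (id 4) are not added.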

allensdk.core.dataframe_utils.return_one_dataframe_row_only(input_table: DataFrame, index_value: int, table_name: str) → Series

Look up and return exactly one row from the DataFrame, reporting an informative error if zero or multiple rows match the given index.

This method is used mainly to give a more informative error when attempting to retrieve values from the behavior cache metadata tables.

input_table: pd.DataFrame

Input dataframe to retrieve the row from.

index_value: int

Index of the row to return. Must match an index in the input dataframe/table. i.e. in the case of ecephys_session_table or

table_name: str

Name of the table being returned. Used to output the table name in case of error.


Returns the row (a Series) corresponding to the input index.
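The single-row lookup described above can be sketched with plain pandas as follows. The helper name one_row_sketch, the table contents, and the table name are hypothetical, and raising RuntimeError is an assumption of the sketch; the exception type used by the library is not stated here.

```python
import pandas as pd


def one_row_sketch(input_table, index_value, table_name):
    """Hypothetical stand-in that mimics the documented lookup rule."""
    matches = input_table.loc[input_table.index == index_value]
    if len(matches) != 1:
        # Assumption: RuntimeError; the real exception type may differ.
        raise RuntimeError(
            f"Expected exactly one row in {table_name} for index "
            f"{index_value}, found {len(matches)}."
        )
    return matches.iloc[0]  # a pandas Series


# Hypothetical metadata table indexed by a session id.
table = pd.DataFrame(
    {"mouse_id": ["m1", "m2"]},
    index=pd.Index([10, 20], name="session_id"),
)

row = one_row_sketch(table, 20, "example_session_table")
print(row["mouse_id"])  # m2

try:
    one_row_sketch(table, 99, "example_session_table")
    error_message = None
except RuntimeError as err:
    error_message = str(err)
```

Naming the table in the message makes a failed lookup easier to trace than the bare KeyError that direct indexing would produce.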