allensdk.core.dataframe_utils module
- allensdk.core.dataframe_utils.INT_NULL = -99
A collection of utilities to manipulate pandas DataFrames.
- allensdk.core.dataframe_utils.enforce_df_column_order(input_df: DataFrame, column_order: List[str]) -> DataFrame
Return the DataFrame with its columns reordered according to column_order.
- Parameters:
- input_df: pandas.DataFrame
Data frame with columns to be ordered.
- column_order: list of str
Ordering of column names to enforce. Columns not specified are shifted to the end of the order but retain their order amongst others not specified. If a specified column is not in the DataFrame it is ignored.
- Returns:
- output_df: pandas.DataFrame
DataFrame the same as the input but with columns reordered.
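The ordering rules described above can be sketched in plain pandas. This is a minimal illustration rather than the library's implementation; the function name `reorder_columns_sketch` and the example column names are hypothetical.

```python
import pandas as pd


def reorder_columns_sketch(input_df, column_order):
    """Hypothetical re-implementation of the documented ordering rules."""
    # Requested columns that exist, in the requested order;
    # requested columns missing from the DataFrame are ignored.
    requested = [col for col in column_order if col in input_df.columns]
    # Unspecified columns go to the end, keeping their relative order.
    remaining = [col for col in input_df.columns if col not in requested]
    return input_df[requested + remaining]


df = pd.DataFrame({"c": [1], "a": [2], "d": [3], "b": [4]})
reordered = reorder_columns_sketch(df, ["a", "b", "missing"])
print(list(reordered.columns))  # ['a', 'b', 'c', 'd']
```

Here "missing" is silently skipped, and "c" and "d" keep their original relative order after the requested columns.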
- allensdk.core.dataframe_utils.enforce_df_int_typing(input_df: DataFrame, int_columns: List[str], use_pandas_type: object = False) -> DataFrame
Enforce integer typing for columns that may have lost int typing when combined into the final DataFrame.
- Parameters:
- input_df: pandas.DataFrame
DataFrame with typing to enforce.
- int_columns: list of str
Columns on which to enforce int typing. Any NaN/None values are filled with the value of INT_NULL defined in this module. Requested columns not in the DataFrame are ignored.
- use_pandas_type: bool
Instead of filling with the value INT_NULL to enforce integer typing, use the pandas nullable type Int64. This type can have issues converting to numpy/array type values.
- Returns:
- output_df: pandas.DataFrame
DataFrame with the specified columns hard typed as integers (pandas Int64 when use_pandas_type is True), so that missing values do not force a float type.
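The two modes described above can be illustrated with plain pandas. This is a sketch under stated assumptions, not the library's code: the function name `enforce_int_sketch` and the column name `session_id` are hypothetical, and copying the input is a choice made for this example only.

```python
import numpy as np
import pandas as pd

INT_NULL = -99  # same sentinel value the module documents


def enforce_int_sketch(input_df, int_columns, use_pandas_type=False):
    """Hypothetical re-implementation of the documented behaviour."""
    output_df = input_df.copy()  # assumption: this sketch never edits in place
    for col in int_columns:
        if col not in output_df.columns:
            continue  # requested columns that are absent are ignored
        if use_pandas_type:
            # Nullable integer dtype: missing values become <NA>.
            output_df[col] = output_df[col].astype("Int64")
        else:
            # Replace missing values with the sentinel, then cast to int.
            output_df[col] = output_df[col].fillna(INT_NULL).astype(int)
    return output_df


df = pd.DataFrame({"session_id": [1.0, np.nan, 3.0]})
filled = enforce_int_sketch(df, ["session_id", "absent"])
nullable = enforce_int_sketch(df, ["session_id"], use_pandas_type=True)
print(filled["session_id"].tolist())  # [1, -99, 3]
print(nullable["session_id"].dtype)   # Int64
```

The sentinel approach yields an ordinary integer column that converts cleanly to numpy, at the cost of a magic value that callers must treat as missing; the Int64 approach keeps missing values explicit.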
- allensdk.core.dataframe_utils.patch_df_from_other(target_df: DataFrame, source_df: DataFrame, columns_to_patch: List[str], index_column: str) -> DataFrame
Overwrite column values in target_df from column values in source_df in rows where the two dataframes share a value of index_column.
- Parameters:
- target_df: pd.DataFrame
The dataframe whose columns will get overwritten
- source_df: pd.DataFrame
The dataframe from which correct values are to be read
- columns_to_patch: List[str]
The columns to be overwritten
- index_column: str
The column to join the dataframes on
- Returns:
- patched_df: pd.DataFrame
target_df except with the specified columns and rows overwritten.
Notes
If any of the columns_to_patch are not in target_df, they will be added.
This function starts by creating a copy of target_df, so it will not alter the argument in-place.
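The patching behaviour described above can be sketched in plain pandas. This is an illustration only, not the library's implementation: the name `patch_sketch` and the example columns are hypothetical, the sketch assumes values of index_column are unique in source_df, and leaving newly added columns empty for unmatched rows is an assumption of this example.

```python
import pandas as pd


def patch_sketch(target_df, source_df, columns_to_patch, index_column):
    """Hypothetical re-implementation of the documented behaviour."""
    patched = target_df.copy()  # the argument is not altered in place
    # Assumption: index_column values are unique in source_df.
    lookup = source_df.set_index(index_column)
    # Rows of the target whose index_column value also occurs in the source.
    shared = patched[index_column].isin(lookup.index)
    for col in columns_to_patch:
        if col not in patched.columns:
            patched[col] = None  # columns missing from the target are added
        new_values = patched[index_column].map(lookup[col])
        patched.loc[shared, col] = new_values[shared]
    return patched


target = pd.DataFrame({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})
source = pd.DataFrame(
    {"id": [2, 3, 4], "value": [20.0, 30.0, 40.0], "label": ["b", "c", "d"]}
)
patched = patch_sketch(target, source, ["value", "label"], "id")
print(patched["value"].tolist())  # [0.1, 20.0, 30.0]
```

Only the rows with id 2 and 3 are overwritten, because those are the values of `id` that both frames share; the row with id 4 exists only in the source and is not appended.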
- allensdk.core.dataframe_utils.return_one_dataframe_row_only(input_table: DataFrame, index_value: int, table_name: str) -> Series
Look up and return exactly one row from the DataFrame, reporting an informative error if no row or multiple rows match the given index.
This method is used mainly to give a more informative error when attempting to retrieve metadata from the behavior cache metadata tables.
- Parameters:
- input_table: pandas.DataFrame
Input dataframe to retrieve row from.
- index_value: int
Index of the row to return. Must match an index value in the input dataframe/table, e.g. in ecephys_session_table or behavior_session_table.
- table_name: str
Name of the table being returned. Used to output the table name in case of error.
- Returns:
- row: pandas.Series
Row corresponding to the input index.
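A lookup with this one-row guarantee can be sketched in plain pandas. This is a minimal illustration, not the library's code: the name `one_row_sketch`, the example table, and the choice of RuntimeError as the exception type are all assumptions.

```python
import pandas as pd


def one_row_sketch(input_table, index_value, table_name):
    """Hypothetical re-implementation of the documented behaviour."""
    # Select every row whose index equals the requested value.
    matches = input_table[input_table.index == index_value]
    if len(matches) != 1:
        # Assumption: the exception type is chosen for this sketch.
        raise RuntimeError(
            f"Expected exactly one row in {table_name} for index "
            f"{index_value}, found {len(matches)}."
        )
    return matches.iloc[0]  # a pandas.Series for the single matching row


sessions = pd.DataFrame(
    {"mouse": ["A", "B", "B2"]},
    index=pd.Index([10, 20, 20], name="session_id"),
)
row = one_row_sketch(sessions, 10, "sessions")
print(row["mouse"])  # A
```

Calling `one_row_sketch(sessions, 20, "sessions")` or `one_row_sketch(sessions, 99, "sessions")` raises the error in this sketch, since the first index matches two rows and the second matches none.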