allensdk.api.warehouse_cache.cache module

class allensdk.api.warehouse_cache.cache.Cache(manifest=None, cache=True, version=None, **kwargs)

Bases: object

add_manifest_paths(manifest_builder)

Add cache-class-specific paths to the manifest. Derived classes that override this method should call the superclass implementation.

build_manifest(file_name)

Create the default path specifications.

Parameters:
file_name : string

where to save the manifest

static cache_csv()
static cache_csv_dataframe()
static cache_csv_json()
static cache_json()
static cache_json_dataframe()
static cacher(fn, *args, **kwargs)

Make an RMA query, save it, and return the dataframe.

Parameters:
fn : function reference

makes the actual query using kwargs.

path : string

where to save the data

strategy : string or None, optional

‘create’ always generates the data, ‘file’ loads from disk, ‘lazy’ queries the server if no file exists, None generates the data and bypasses all caching behavior

pre : function

df|json->df|json, takes one data argument and returns filtered version, None for pass-through

post : function

df|json->?, takes one data argument and returns Object

reader : function, optional

path -> data, default NOP

writer : function, optional

path, data -> None, default NOP

kwargs : objects

passed through to the query function

Returns:
Object or None

data type depends on fn, reader and/or post methods.
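
The four strategy values described above can be illustrated with a minimal, standard-library-only sketch. It is not the allensdk implementation: `cached_call`, `read_json`, `write_json`, and `fake_query` are hypothetical names, and reloading through the reader after a ‘create’ is an assumption made for the example.

```python
import json
import os
import tempfile


def read_json(path):
    with open(path) as f:
        return json.load(f)


def write_json(path, data):
    with open(path, "w") as f:
        json.dump(data, f)


def cached_call(fn, path, strategy=None, pre=None, post=None,
                reader=None, writer=None, **kwargs):
    """Hypothetical stand-in that mimics the documented strategies."""
    if strategy not in (None, "create", "file", "lazy"):
        raise ValueError("unknown strategy: %r" % (strategy,))
    if strategy == "lazy":
        # 'lazy' behaves like 'file' when the file exists, else 'create'.
        strategy = "file" if os.path.exists(path) else "create"

    data = None
    if strategy in (None, "create"):
        data = fn(**kwargs)            # generate the data
        if pre is not None:
            data = pre(data)           # optional filtering step
    if strategy == "create" and writer is not None:
        writer(path, data)             # persist to disk
    if strategy in ("create", "file") and reader is not None:
        data = reader(path)            # assumption: reload from the saved file
    return post(data) if post is not None else data


calls = []


def fake_query(n=3):
    calls.append(n)                    # records each simulated server query
    return [{"id": i} for i in range(n)]


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "cells.json")
    first = cached_call(fake_query, target, strategy="lazy",
                        reader=read_json, writer=write_json, n=2)
    second = cached_call(fake_query, target, strategy="lazy",
                         reader=read_json, writer=write_json, n=2)
```

In this sketch the second ‘lazy’ call finds the file on disk, so `fake_query` runs only once.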

static csv_writer(pth, gen)
get_cache_path(file_name, manifest_key, *args)

Helper method for accessing path specs from manifest keys.

Parameters:
file_name : string
manifest_key : string
args : ordered parameters
Returns:
string or None

path

static json_remove_keys(data, keys)
static json_rename_columns(data, new_old_name_tuples=None)

Convenience method to rename columns in a pandas dataframe.

Parameters:
data : dataframe

edited in place.

new_old_name_tuples : list of string tuples (new, old)
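
The (new, old) ordering of each tuple is easy to reverse by accident. The sketch below illustrates it on JSON-style records, assuming the data is a list of dicts; `rename_record_keys` is a hypothetical helper written for this example, not the library method.

```python
def rename_record_keys(records, new_old_name_tuples=None):
    """Rename keys in place; each tuple is (new_name, old_name)."""
    if new_old_name_tuples is None:
        return records
    for record in records:
        for new_name, old_name in new_old_name_tuples:
            if old_name in record:
                # Move the value to the new key and drop the old one.
                record[new_name] = record.pop(old_name)
    return records


rows = [{"id": 1, "acronym": "VISp"}, {"id": 2, "acronym": "VISl"}]
rename_record_keys(rows, [("structure_id", "id")])
```

After the call, each record carries `structure_id` in place of `id`, and the original list object has been modified.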
load_csv(path, rename=None, index=None)

Read a csv file as a pandas dataframe.

Parameters:
rename : list of string tuples (new, old), optional

columns to rename

index : string, optional

post-rename column to use as the row label.

load_json(path, rename=None, index=None)

Read a json file as a pandas dataframe.

Parameters:
rename : list of string tuples (new, old), optional

columns to rename

index : string, optional

post-rename column to use as the row label.

load_manifest(file_name, version=None)

Read a keyed collection of path specifications.

Parameters:
file_name : string

path to the manifest file

Returns:
Manifest
manifest_dataframe()

Convenience method to view manifest as a pandas dataframe.

static nocache_dataframe()
static nocache_json()
static pathfinder(file_name_position, secondary_file_name_position=None, path_keyword=None)

Helper method to find the path argument in legacy methods written prior to the @cacheable decorator. Do not use for new @cacheable methods.

Parameters:
file_name_position : integer

zero-indexed position in the decorated method args where the file path may be found.

secondary_file_name_position : integer

zero-indexed position in the decorated method args where the file path may be found.

path_keyword : string

kwarg that may have the file path.

Notes

This method is only intended to provide backward compatibility for some methods that otherwise do not follow the path conventions of the @cacheable decorator.
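
As a rough illustration of the idea, the sketch below builds a lookup function from the three parameters. The name `make_pathfinder` is hypothetical, and the lookup order (keyword first, then primary position, then secondary position) is an assumption rather than documented behavior.

```python
def make_pathfinder(file_name_position, secondary_file_name_position=None,
                    path_keyword=None):
    """Hypothetical sketch: return a function that locates a file path."""
    def find_path(*args, **kwargs):
        # Assumed order: the keyword argument wins when it is supplied.
        if path_keyword is not None and kwargs.get(path_keyword) is not None:
            return kwargs[path_keyword]
        # Otherwise try the primary, then the secondary positional slot.
        for position in (file_name_position, secondary_file_name_position):
            if (position is not None and position < len(args)
                    and args[position] is not None):
                return args[position]
        return None
    return find_path


# Position 0 stands in for `self` in a decorated legacy method.
find_path = make_pathfinder(1, secondary_file_name_position=2,
                            path_keyword="file_name")
from_position = find_path("self", "a.json")
from_keyword = find_path("self", file_name="b.json")
from_secondary = find_path("self", None, "c.json")
missing = find_path("self")
```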

static remove_keys(data, keys=None)

DataFrame version of json_remove_keys.

static rename_columns(data, new_old_name_tuples=None)

Convenience method to rename columns in a pandas dataframe.

Parameters:
data : dataframe

edited in place.

new_old_name_tuples : list of string tuples (new, old)
wrap(fn, path, cache, save_as_json=True, return_dataframe=False, index=None, rename=None, **kwargs)

Make an RMA query, save it, and return the dataframe.

Parameters:
fn : function reference

makes the actual query using kwargs.

path : string

where to save the data

cache : boolean

True will make the query, False just loads from disk

save_as_json : boolean, optional

True (default) will save data as json, False as csv

return_dataframe : boolean, optional

True will cast the return value to a pandas dataframe, False (default) will not

index : string, optional

column to use as the pandas index

rename : list of string tuples, optional

(new, old) columns to rename

kwargs : objects

passed through to the query function

Returns:
dict or DataFrame

data type depends on return_dataframe option.

Notes

Column renaming happens after the file is reloaded for json.

allensdk.api.warehouse_cache.cache.cacheable(strategy=None, pre=None, writer=None, reader=None, post=None, pathfinder=None)

Decorator for RMA queries that saves the result and returns the dataframe.

Parameters:
fn : function reference

makes the actual query using kwargs.

path : string

where to save the data

strategy : string or None, optional

‘create’ always gets the data from the source (server or generated), ‘file’ loads from disk, ‘lazy’ creates the data and saves to file if no file exists, None queries the server and bypasses all caching behavior

pre : function

df|json->df|json, takes one data argument and returns filtered version, None for pass-through

post : function

df|json->?, takes one data argument and returns Object

reader : function, optional

path -> data, default NOP

writer : function, optional

path, data -> None, default NOP

kwargs : objects

passed through to the query function

Returns:
dict or DataFrame

data type depends on dataframe option.

Notes

Column renaming happens after the file is reloaded for json.
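
To show the decorator shape, here is a minimal sketch of a decorator factory that wraps a query function. It is an illustration under stated assumptions, not the allensdk code: `cacheable_sketch` is a hypothetical name, the `path` keyword supplied at call time is an assumed convention, and the `pre`, `post`, and `pathfinder` hooks are omitted.

```python
import functools
import json
import os
import tempfile


def cacheable_sketch(strategy=None, reader=None, writer=None):
    """Hypothetical decorator factory; not the allensdk implementation."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, path=None, **kwargs):
            # None (or no path) bypasses caching entirely.
            if strategy is None or path is None:
                return fn(*args, **kwargs)
            # 'file' always reads; 'lazy' reads only when the file exists.
            if strategy == "file" or (strategy == "lazy"
                                      and os.path.exists(path)):
                return reader(path)
            # 'create', or 'lazy' with no file yet: run the query, then save.
            data = fn(*args, **kwargs)
            writer(path, data)
            return data
        return wrapper
    return decorator


def read_json(path):
    with open(path) as f:
        return json.load(f)


def write_json(path, data):
    with open(path, "w") as f:
        json.dump(data, f)


query_log = []


@cacheable_sketch(strategy="lazy", reader=read_json, writer=write_json)
def get_structures(graph_id):
    query_log.append(graph_id)  # stands in for a server round trip
    return [{"id": 997, "graph_id": graph_id}]


with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "structures.json")
    fresh = get_structures(1, path=cache_file)    # runs the query and saves
    cached = get_structures(1, path=cache_file)   # served from the file
```

The decorated function keeps its original signature for query arguments; only the extra `path` keyword controls where the result is cached.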

allensdk.api.warehouse_cache.cache.get_default_manifest_file(cache_name)
allensdk.api.warehouse_cache.cache.memoize(f)

Creates an unbounded cache of function calls and results. Note that arguments of different types are not cached separately (so f(3.0) and f(3) are not treated as distinct calls).

Arguments to the cached function must be hashable.

View the cache size with f.cache_size(). Clear the cache with f.cache_clear(). Access the underlying function with f.__wrapped__.
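
The behavior described above can be reproduced with a small dictionary-backed decorator. This is a sketch for illustration only: `memoize_sketch` is a hypothetical name, and a plain dict keyed on the call arguments is an assumption, not necessarily how the library implements it.

```python
import functools


def memoize_sketch(f):
    """Hypothetical unbounded memoizer keyed on the call arguments."""
    cache = {}

    @functools.wraps(f)  # also sets wrapper.__wrapped__ to f
    def wrapper(*args, **kwargs):
        # Arguments must be hashable to serve as a dictionary key.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = f(*args, **kwargs)
        return cache[key]

    wrapper.cache_size = lambda: len(cache)
    wrapper.cache_clear = cache.clear
    return wrapper


evaluated = []


@memoize_sketch
def square(x):
    evaluated.append(x)  # records each real evaluation
    return x * x


int_result = square(3)
# 3 == 3.0 and hash(3) == hash(3.0), so this call reuses the cached entry.
float_result = square(3.0)
```

Because the integer and float arguments compare equal and hash identically, the second call returns the value computed for the first and the underlying function runs once.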