allensdk.internal.brain_observatory.roi_filter module

class allensdk.internal.brain_observatory.roi_filter.ROIClassifier(model_data=None)[source]

Bases: object

Wrapper for machine learning classifier.

Provides an underlying classifier model implementing fit, score, and predict. It also tracks the information needed to construct the feature array from input data streams, along with the training data used and any cross-validation scores generated.

Parameters:
model_data : dictionary

Dictionary of classifier properties:

sklearn_version: Version of sklearn used for training.
model: Underlying classifier.
training_features: Feature set used to train model.
training_labels: Label set used to train model.
trimmed_features: Features to remove from input data.
structure_ids: Structure ID set used for training.
drivers: Driver set used for training.
reporters: Reporter set used for training.
other_appended_labels: Labels appended outside model.
cross_validation_scores: Cross-validation scores, if generated.

create_feature_array(self, object_data, depth, structure_id, drivers, reporters)[source]

Creates feature array from input data.

See also

create_feature_array
Create a feature array given the model and inputs.

cross_validate(self, features, labels, n_folds=5, n_jobs=1)[source]

Generate cross-validation scores for the classifier.

Parameters:
features : pandas.DataFrame

Set of features for classification.

labels : pandas.DataFrame

Set of ground truth labels for training and evaluation.

n_folds : int

Number of folds for K-Fold cross-validation.

n_jobs : int

Number of CPUs to use.

Returns:
numpy.ndarray

Array of n_folds cross-validation scores, one per fold.
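
As a rough illustration of what K-fold cross-validation returns, the sketch below splits a toy dataset into n_folds contiguous folds using NumPy only and scores a deliberately simple threshold classifier on each held-out fold. It is not the ROIClassifier implementation, which wraps an sklearn model; the helper name k_fold_scores, the toy data, and the threshold rule are all assumptions made for the example.

```python
import numpy as np


def k_fold_scores(features, labels, n_folds=5):
    """Return one accuracy score per fold (hypothetical helper, not allensdk API)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Contiguous, unshuffled folds; a real setup would often shuffle or stratify.
    folds = np.array_split(np.arange(len(labels)), n_folds)
    scores = []
    for test_idx in folds:
        train_mask = np.ones(len(labels), dtype=bool)
        train_mask[test_idx] = False
        x_train, y_train = features[train_mask], labels[train_mask]
        # Toy classifier: threshold halfway between the two class means.
        threshold = (x_train[y_train == 0].mean() + x_train[y_train == 1].mean()) / 2.0
        predictions = (features[test_idx] > threshold).astype(int)
        # Accuracy on the held-out fold.
        scores.append(np.mean(predictions == labels[test_idx]))
    return np.array(scores)


x = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = k_fold_scores(x, y, n_folds=5)
print(scores.shape)
```

The returned array has one entry per fold, matching the documented return type of numpy.ndarray.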

fit(self, features, labels)[source]

Fit model to data.

Parameters:
features : pandas.DataFrame

Training feature set.

labels : pandas.DataFrame

Training labels.

static from_file(filename)[source]

Load an ROIClassifier from file.

get_labels(self, object_data, depth, structure_id, drivers, reporters)[source]

Generate labels from input data.

label_names

Return label names for the classifier.

model_data

The classifier properties as a dictionary.

predict(self, features)[source]

Generate classification labels given features.

save(self, filename)[source]

Save the classifier to file by pickling.
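
save and from_file together form a pickle round trip. A minimal sketch of that pattern using only the standard library follows; the dictionary contents and the file name are placeholders, and the real methods may store additional state.

```python
import os
import pickle
import tempfile

# Placeholder properties; a real model_data would hold a fitted classifier.
model_data = {"sklearn_version": "placeholder", "trimmed_features": ["example_feature"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    filename = os.path.join(tmp_dir, "classifier.pkl")
    # Save: serialize the properties to disk.
    with open(filename, "wb") as f:
        pickle.dump(model_data, f)
    # Load: read the same properties back.
    with open(filename, "rb") as f:
        restored = pickle.load(f)

print(restored == model_data)
```

Pickle files should only be loaded from trusted sources, and pickled sklearn models are generally only reliable when loaded with the sklearn version that produced them, which is presumably why sklearn_version is tracked.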

score(self, features, labels)[source]

Calculate classifier score on data.

allensdk.internal.brain_observatory.roi_filter.apply_labels(rois, label_array, label_names)[source]

Apply labels to rois.

Parameters:
rois : list

List of RoiMask objects sorted to label_array order.

label_array : numpy.ndarray

Label array output from classifier.

label_names : list

Names to apply to columns of label_array.

Returns:
list

List of ROIs with labels appended.
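
One plausible reading of the parameters above is that row i of label_array belongs to rois[i] and column j corresponds to label_names[j]. The sketch below follows that reading; ExampleRoi is a hypothetical stand-in for RoiMask, and the labels attribute and the truthiness test are assumptions rather than the library's confirmed behavior.

```python
import numpy as np


class ExampleRoi:
    """Hypothetical stand-in for RoiMask; only a labels list is modeled."""

    def __init__(self):
        self.labels = []


def apply_labels_sketch(rois, label_array, label_names):
    # Row i of label_array belongs to rois[i]; column j belongs to label_names[j].
    for roi, row in zip(rois, label_array):
        for flag, name in zip(row, label_names):
            if flag:
                roi.labels.append(name)
    return rois


rois = [ExampleRoi(), ExampleRoi()]
label_array = np.array([[1, 0], [1, 1]])
labeled = apply_labels_sketch(rois, label_array, ["label_a", "label_b"])
print([roi.labels for roi in labeled])
```

This is why rois must be sorted to label_array order: the pairing is purely positional.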

allensdk.internal.brain_observatory.roi_filter.create_feature_array(model_data, object_data, depth, structure_id, drivers, reporters)[source]

Create feature array from input data.

This creates the feature array with column ordering matching what the classifier was trained on.

Parameters:
model_data : dictionary

Dictionary containing information about the machine learning model and training set.

object_data : pandas.DataFrame

Object list data.

depth : float

Imaging depth of the experiment.

structure_id : string

Targeted structure id.

drivers : list

List of drivers for the mouse.

reporters : list

List of reporters for the mouse.
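
The column-ordering requirement can be illustrated with pandas alone. In this sketch the column names and values are hypothetical, and filling missing columns with 0 is an assumption about how absent features might be handled, not a statement of what create_feature_array does.

```python
import pandas as pd

# Hypothetical column order recorded at training time.
training_features = ["area", "depth", "structure_A", "structure_B"]

# Incoming data with columns in a different order and one column missing.
incoming = pd.DataFrame(
    {"depth": [175.0, 175.0], "structure_A": [1, 1], "area": [120.0, 95.0]}
)

# Reorder to the training layout; columns absent from the input are filled with 0.
aligned = incoming.reindex(columns=training_features, fill_value=0)
print(list(aligned.columns))
```

Matching the training column order matters because sklearn estimators fitted on plain arrays interpret features by position, not by name.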

allensdk.internal.brain_observatory.roi_filter.get_unexpected_features(model_data, object_data, structure_id, drivers, reporters)[source]

Get the list of incoming features that weren’t in the training data.

Parameters:
model_data : dictionary

Dictionary containing information about the machine learning model and training set.

object_data : pandas.DataFrame

Object list data.

structure_id : string

Targeted structure id.

drivers : list

List of drivers for the mouse.

reporters : list

List of reporters for the mouse.
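
Conceptually, this check compares incoming values against what the model saw during training. A minimal sketch of such a comparison for a single category (drivers) is shown below; the helper name and the driver names are placeholders, and the real function may combine several categories or format its output differently.

```python
def unexpected_values(seen_in_training, incoming):
    """Return incoming values absent from training, preserving input order."""
    known = set(seen_in_training)
    return [value for value in incoming if value not in known]


# Placeholder driver names, not real driver lines.
training_drivers = ["driver_a", "driver_b"]
incoming_drivers = ["driver_b", "driver_c"]

unexpected = unexpected_values(training_drivers, incoming_drivers)
print(unexpected)
```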

allensdk.internal.brain_observatory.roi_filter.label_unions_and_duplicates(rois, overlap_threshold)[source]

Detect unions and duplicates and label ROIs.

allensdk.internal.brain_observatory.roi_filter.mean_gray_to_sigma(meanInt0, snpoffsetstdv)[source]

Calculate intensity variation used in prior code.

Parameters:
meanInt0 : pandas.Series

Array of intensity averages.

snpoffsetstdv : pandas.Series

Array of soma-neuropil standard deviations.

Returns:
pandas.Series

meanInt0 / snpoffsetstdv, with infinite values replaced by 0.
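
The documented return value can be reproduced with a few lines of pandas. This is a sketch under the assumption that only infinite results are zeroed; the function name and sample values are illustrative, and the library's handling of other edge cases such as 0/0 (which yields NaN, not Inf) is not specified here.

```python
import numpy as np
import pandas as pd


def mean_gray_to_sigma_sketch(mean_int, snp_offset_stdv):
    # Element-wise ratio; a nonzero value divided by zero yields +/-inf in pandas.
    ratio = mean_int / snp_offset_stdv
    # Replace infinite results with 0 so downstream code sees finite values.
    return ratio.replace([np.inf, -np.inf], 0.0)


mean_int = pd.Series([10.0, 6.0, 4.0])
snp_offset_stdv = pd.Series([2.0, 0.0, 8.0])
result = mean_gray_to_sigma_sketch(mean_int, snp_offset_stdv)
print(result.tolist())
```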