LFP Analysis

Tutorial overview

This Jupyter notebook will demonstrate how to access and analyze LFP data from the Neuropixels Visual Coding dataset. LFP stands for "local field potential" and captures low-frequency (0.1-500 Hz) voltage fluctuations around each recording site. It is complementary to the spiking activity, and can be analyzed on its own or in conjunction with spikes.

This tutorial will cover the following topics:

* Learning about the available probes
* Loading LFP data
* Aligning LFP data to a stimulus
* Aligning LFP data to units
* Current source density

This tutorial assumes you've already created a data cache, or are working with the files on AWS. If you haven't reached that step yet, we recommend going through the data access tutorial first.

Functions related to analyzing spike data will be covered in other tutorials. For a full list of available tutorials, see the SDK documentation.

Let's start by creating an EcephysProjectCache object, and pointing it to a new or existing manifest file:

If you're not sure what a manifest file is or where to put it, please check out this tutorial before going further.

Let's load the sessions table and grab the data for the first experiment in the list:

Learning about the available probes

The session object contains all of the spike data for one recording session, as well as information about the stimulus, mouse behavior, and probes. To prevent the underlying NWB file from becoming too large, the LFP data is stored separately, on a probe-by-probe basis. Before loading the LFP, let's take a look at the probes that are available for this session.

There are six probes available, and all of them have LFP data. Note that a small subset of probes across the entire dataset have only spike data, due to high levels of electrical noise contaminating the LFP band. To see which sessions are missing LFP data, you can check the probes table from the EcephysProjectCache:

Returning to the current session, let's get some information about which brain regions each probe is recording from before we select LFP data to look at in more detail.

Each probe has a unique ID, but the "description" of each probe is assigned based on its location on the recording rig. The Allen Institute Neuropixels rigs have six slots for probes, which are named A, B, C, D, E, and F. The probes are arranged in a ring centered on visual cortex. Probe A is in the anterior/medial position, and the letters proceed clockwise around the ring.

To visualize what this looks like, we can access the spatial information in this session's channels table:

As you can see, the channels table mainly contains columns related to the physical locations, either relative to the probe shank (probe_vertical_position and probe_horizontal_position), or relative to the Allen Common Coordinate Framework (anterior_posterior_ccf_coordinate, dorsal_ventral_ccf_coordinate, and left_right_ccf_coordinate). Let's use a combination of these to visualize the locations of the probes in this recording.
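To make the plotting step concrete, here is a minimal sketch using a synthetic channels table. The column names mirror the real AllenSDK channels table, but every coordinate value below is a random stand-in, not a real insertion geometry:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Synthetic channels table: column names follow the AllenSDK channels table,
# but the values are made up for illustration
rng = np.random.default_rng(42)
n_channels = 300
channels = pd.DataFrame({
    "anterior_posterior_ccf_coordinate": rng.uniform(7000, 9500, n_channels),
    "left_right_ccf_coordinate": rng.uniform(6500, 9500, n_channels),
    "probe_vertical_position": rng.uniform(0, 3840, n_channels),  # microns from tip
})

fig, ax = plt.subplots(figsize=(5, 5))
# Top-down view: anterior/posterior vs. left/right CCF coordinates,
# colored by position along the probe axis
sc = ax.scatter(channels["left_right_ccf_coordinate"],
                channels["anterior_posterior_ccf_coordinate"],
                c=channels["probe_vertical_position"],
                cmap="viridis", s=4)
ax.set_xlabel("left/right CCF coordinate (um)")
ax.set_ylabel("anterior/posterior CCF coordinate (um)")
fig.colorbar(sc, ax=ax, label="probe_vertical_position (um)")
fig.savefig("channel_locations.png")
```

With the real table, the same scatter call produces the top-down view discussed next; only the DataFrame source changes.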

This is a top-down view of the locations of each channel, with the color corresponding to distance along the probe axis (darker = deeper in the brain). A few things to note about this plot:

To figure out which probe is missing, we can check the probes table for this session:

It looks like probeB was not registered to the CCF. That means that probeA is in the upper right, probeC is in the lower right, and D, E, and F go clockwise from there. It's helpful to keep these descriptions in mind, because probes with the same descriptions enter the brain at roughly the same angle across experiments. Although the precise target point for each probe depends on the retinotopic map for each mouse, the overall rig geometry remains the same across sessions.

Let's look at the structures each of these probes passes through:

In general, probes tend to be inserted through cortex (VIS structures), into the hippocampus (CA1, CA3, DG), and down into the thalamus or midbrain. If there's no CCF registration available (e.g., for probeB in this example), subcortical structures are marked as grey.

Loading LFP data

Now that we've seen a general overview of how the probes are oriented and what structures they pass through, let's choose one and load its associated LFP data:

If you haven't tried to access this data previously, you'll have to wait while the LFP NWB file downloads. Even if you already have the data stored locally, it may still take a minute to load, since the LFP data is quite large.

Once the data is loaded, we can take a closer look at the lfp object:

The LFP data is stored as an xarray.DataArray object, with coordinates of time and channel. The xarray library simplifies the process of working with N-dimensional data arrays, by keeping track of the meaning of each axis. If this is your first time encountering xarrays, we strongly recommend reading through the documentation before going further. Getting used to xarrays can be frustrating, especially when they don't behave like numpy arrays. But they are designed to prevent common mistakes when analyzing multidimensional arrays, so they are well worth learning more about. Plus, the syntax is modeled on that of the pandas library, so if you're familiar with that you already have a head start.

The print-out above already tells us a lot about what the lfp object contains. It stores an array with around 12 million points along the time axis and 95 points along the channel axis. The time axis ranges from 13.5 to around 9600 seconds, while the channel axis ranges from 850258492 to 850259244 (these are the unique IDs for each channel).

Let's use the DataArray.sel() method to select a slice through this array between 100 and 101 seconds:
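To make the selection semantics concrete, here is a minimal sketch on a synthetic DataArray with the same layout as the real LFP (the sampling rate, time span, and channel IDs below are stand-ins):

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the session's LFP DataArray: ~1250 Hz sampling,
# 95 channels, with fake channel IDs modeled on the real ones
fs = 1250.0
times = np.arange(13.5, 20.0, 1.0 / fs)          # seconds
channels = np.arange(850258492, 850258492 + 95)  # illustrative channel IDs
lfp = xr.DataArray(
    np.random.randn(times.size, channels.size).astype(np.float32),
    dims=("time", "channel"),
    coords={"time": times, "channel": channels},
)

# .sel() indexes by coordinate *value*, not integer position, so we can
# ask for a time range in seconds directly
lfp_slice = lfp.sel(time=slice(14.0, 15.0))
print(lfp_slice.sizes)  # ~1250 time samples, all 95 channels
```

The same `.sel(time=slice(100, 101))` call on the real `lfp` object gives the one-second slice used in the rest of this section.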

We see that this new DataArray is smaller than before; it contains the same number of channels, but only 1250 samples, due to the LFP sample rate of ~1250 Hz.

Let's plot the data for one of the channels:

Alternatively, we can visualize this slice of data using matplotlib's imshow method:

Note that we've transposed the original array to place the time dimension along the x-axis. We've also configured the plot so that the origin of the array is in the lower left, so that channels closer to the probe tip appear lower in the image.
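The transpose-and-origin logic can be sketched as follows on random stand-in data (the array values, color limits, and colormap here are illustrative):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

# Hypothetical one-second LFP slice: rows = time samples, columns = channels
data = np.random.randn(1250, 95)

fig, ax = plt.subplots(figsize=(8, 4))
# Transpose so time runs along the x-axis; origin='lower' puts channel 0
# (closest to the probe tip) at the bottom of the image
img = ax.imshow(data.T, aspect="auto", origin="lower",
                vmin=-3, vmax=3, cmap="PuOr")
ax.set_xlabel("Sample number")
ax.set_ylabel("Channel index")
fig.colorbar(img, ax=ax, label="LFP (a.u.)")
fig.savefig("lfp_image.png")
```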

A few things to note about this plot:

Let's do some additional data selection to look at just the hippocampal channels from this recording.

Suggestions for further analysis

* Can you use supervised learning to train a model to accurately identify brain regions based on LFP signals alone? What happens when you use this model to classify brain regions in recordings without CCF registration?

Aligning LFP data to a stimulus

In the above example, we selected LFP data based on an arbitrary time span (100 to 101 seconds). For many analyses, however, you'll want to align the data to the onset of a particular type of stimulus.

The AllenSDK provides a number of convenience functions for aligning spikes to visual stimuli. We are planning to implement similar functions for LFP data in the near future. Until then, the steps below will show you how to perform this alignment.

First, we need to select some stimulus presentations to use:

Now, we can align the LFP to these presentation times using some xarray magic:

aligned_lfp is a DataArray with dimensions of channels x trials x time. It's been downsampled to 500 Hz by changing the time step in the trial_window variable.

Because we're using xarrays, the alignment operation is fast, and doesn't require any for loops! There's a lot going on here, so we recommend referring to the pandas and xarray documentation if anything is confusing.
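The alignment step can be sketched on synthetic data as follows. The presentation times, window, and channel count below are stand-ins, and this version uses a plain numpy reshape after one vectorized `.sel()` call (other implementations, e.g. with a pandas MultiIndex and `unstack`, are equally valid):

```python
import numpy as np
import xarray as xr

# Synthetic LFP: 1250 Hz, 20 channels, 100 seconds of fake data
fs = 1250.0
times = np.arange(0.0, 100.0, 1.0 / fs)
channels = np.arange(20)
lfp = xr.DataArray(np.random.randn(times.size, channels.size),
                   dims=("time", "channel"),
                   coords={"time": times, "channel": channels})

# Hypothetical stimulus onsets, and a trial window sampled at 500 Hz
presentation_times = np.arange(5.0, 95.0, 3.0)   # one onset every 3 s
trial_window = np.arange(-0.25, 0.5, 1.0 / 500)  # seconds around each onset

# Build every (trial, offset) timestamp, grab the nearest LFP sample for
# all of them in a single vectorized .sel() call (no for loop over trials),
# then reshape into trials x time-within-trial
time_selection = np.concatenate([trial_window + t for t in presentation_times])
aligned = lfp.sel(time=time_selection, method="nearest").values.reshape(
    len(presentation_times), len(trial_window), len(channels))

aligned_lfp = xr.DataArray(
    aligned,
    dims=("trial", "time_from_onset", "channel"),
    coords={"trial": np.arange(len(presentation_times)),
            "time_from_onset": trial_window,
            "channel": channels})
print(aligned_lfp.sizes)
```

The dimension ordering here (trials x time x channels) is a choice; transposing to channels x trials x time, as in the text above, is a one-line `aligned_lfp.transpose(...)`.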

Now we can visualize the average LFP, aligned to the trial onset:

Here we see the effect of a 250 ms flash stimulus on the LFP. There are two large responses in cortex (channel index 60-80), one corresponding to the stimulus onset (around sample 280), and one corresponding to the stimulus offset (around sample 400).

Note that the voltage range is an order of magnitude less than it was for the single-trial LFP, around -100 to +100 microvolts.

You can use the code sample above to align the LFP to any type of event (e.g. spike times, running onset, optogenetic stimuli) just by changing the trial_window and time_selection variables.

Suggestions for further analysis

* How do the time delays of stimulus-evoked LFP deflections vary across areas and depths? Are these delays different for different stimulus conditions?
* Are there consistent patterns in the LFP when aligning to the start of running, or the onset of an eye movement?

Aligning LFP data to units

The previous section demonstrated how to align the LFP in time. What if we want to extract the LFP at a particular location in space, corresponding to the location of a unit we're analyzing?

Let's start by finding a well-isolated, high-firing rate unit in visual cortex from the probe we're currently working with. For more information about using quality metrics to assess unit isolation quality, check out this tutorial.
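The unit selection itself is ordinary pandas filtering. Here is a sketch on a tiny hypothetical units table; the column names follow the AllenSDK units table, but the rows, thresholds, and IDs are made up for illustration:

```python
import pandas as pd

# Hypothetical units table (column names mirror the AllenSDK units table;
# the values and thresholds below are illustrative, not recommendations)
units = pd.DataFrame({
    "ecephys_structure_acronym": ["VISp", "VISp", "CA1", "VISp"],
    "snr": [4.5, 1.2, 5.0, 3.8],
    "isi_violations": [0.01, 0.4, 0.02, 0.05],
    "firing_rate": [12.0, 2.0, 8.0, 20.0],
    "peak_channel_id": [850258100, 850258104, 850258200, 850258108],
})

# Well-isolated, high-firing-rate units in primary visual cortex
good_units = units[(units.ecephys_structure_acronym == "VISp")
                   & (units.snr > 2)
                   & (units.isi_violations < 0.1)
                   & (units.firing_rate > 5)]
print(good_units.peak_channel_id.tolist())
```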

Once we've selected a unit of interest, the xarray library makes it easy to find the associated LFP channel.

There are four units that meet our criteria. Let's choose one and find its associated channel index.

Note that this is the channel index relative to the original probe (out of 384 possible channels), rather than the index of the LFP DataArray.

We can now use the channel index to find the unique channel ID for this unit:

Using unit_id and channel_id, we can select the spikes and LFP within an arbitrary time interval. Note that we need to use method='nearest' when selecting the LFP data channel, since not every electrode is included in the LFP DataArray.
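The `method='nearest'` trick can be sketched on synthetic data like this. The LFP file keeps only a subset of the probe's 384 electrodes (every fourth channel in this sketch; IDs and spacing are stand-ins), so a unit's peak channel may not be present exactly:

```python
import numpy as np
import xarray as xr

# Full probe has 384 channels; the LFP keeps only every fourth one here
all_channel_ids = np.arange(850258000, 850258000 + 384)  # illustrative IDs
lfp_channel_ids = all_channel_ids[::4]

fs = 1250.0
times = np.arange(0.0, 10.0, 1.0 / fs)
lfp = xr.DataArray(np.random.randn(times.size, lfp_channel_ids.size),
                   dims=("time", "channel"),
                   coords={"time": times, "channel": lfp_channel_ids})

# Suppose the unit's peak channel is index 101 on the probe; that exact ID
# is absent from the LFP, so ask for the nearest channel that was recorded
peak_channel_id = all_channel_ids[101]
unit_lfp = lfp.sel(channel=peak_channel_id, method="nearest")
print(int(unit_lfp.channel))  # the nearest channel actually in the LFP
```

Without `method='nearest'`, the same `.sel()` call would raise a KeyError for any channel ID not present in the DataArray's coordinates.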

Finally, we can plot the spike times along with the LFP for this interval:

Suggestions for further analysis

* What does the spike-triggered LFP look like for different units? Is it possible to classify units based on the properties of the spike-triggered LFP?
* How well can you predict spike rates from the nearby LFP signals? Does this vary across different brain regions?

Current Source Density

LFP data is commonly used to generate current source density (CSD) plots, which show the location of current sources and sinks along the probe axis. CSD analysis benefits from high spatial resolution, since it involves taking the second spatial derivative of the data. The dense site spacing of Neuropixels probes makes them well suited to computing the CSD. However, the LFP data available through the AllenSDK has been spatially downsampled prior to NWB packaging.
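The core of the standard CSD estimate is just a second difference along the depth axis. Here is a minimal sketch on synthetic data, omitting the smoothing and interpolation a production pipeline would apply, and omitting the tissue conductivity constant (so the units are arbitrary); the channel count and spacing below are illustrative:

```python
import numpy as np

# Synthetic LFP snippet: channels (along the probe) x time samples
n_channels, n_samples = 96, 500
spacing_um = 40.0  # vertical site spacing in microns (illustrative value)
lfp = np.random.randn(n_channels, n_samples)

# Second difference along the channel (depth) axis approximates d2V/dz2;
# CSD is proportional to its negative, so sinks and sources get the
# conventional signs. The result has two fewer rows than the input.
csd = -np.diff(lfp, n=2, axis=0) / (spacing_um * 1e-6) ** 2
print(csd.shape)
```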

To provide access to a high-resolution CSD plot, we've pre-computed the CSD in response to a flash stimulus for all probes with LFP.

The CSD object is a DataArray with dimensions of channels x time. Note that the channels are actually "virtual channels," based on interpolated signals along the central axis of the probe, with 10 micron inter-site spacing.

This CSD shows a clear stimulus-evoked response in cortex (2600-3500 microns), with an even earlier response in a subcortical region (700-1200 microns).

We can use the channels table to figure out where this signal is coming from:

It looks like this region of the probe is on the border between LGd and LP nuclei in the thalamus.

Suggestions for further analysis

* How does the cortical CSD vary across visual areas? Are there consistent patterns across all areas?
* Are the locations of sources and sinks correlated with other features of the LFP, such as power in different frequency bands?