Identifying experiments and sessions of interest using the data manifest¶
This Jupyter notebook illustrates what data are available as part of the Visual Behavior Neuropixels dataset and briefly describes the experimental design and the dimensions of the dataset. It demonstrates how to identify experiments and sessions that you may be interested in analyzing, using the data manifests provided by the VisualBehaviorNeuropixelsProjectCache, and explores the metadata columns that describe the experimental conditions, including transgenic lines, targeted areas, and session types.
Contents¶
- Introduction to the metadata tables
- Ecephys Sessions Table
- Behavior Sessions Table
- Units, Probes and Channels Table
We will first install allensdk into your environment by running the appropriate commands below.
Install AllenSDK into your local environment¶
You can install AllenSDK locally with:
!pip install allensdk
Install AllenSDK into your notebook environment (good for Google Colab)¶
You can install AllenSDK into your notebook environment by executing the cell below.
If using Google Colab, click the RESTART RUNTIME button that appears at the end of the output when this cell completes. Note that running this cell produces a long list of outputs and some error messages; clicking RESTART RUNTIME at the end resolves these issues. You can minimize the cell when you are done to hide the output.
!pip install --upgrade pip
!pip install allensdk
Import necessary packages¶
import numpy as np
import pandas as pd
from pathlib import Path
import allensdk
from allensdk.brain_observatory.behavior.behavior_project_cache.\
behavior_neuropixels_project_cache \
import VisualBehaviorNeuropixelsProjectCache
First, load the project cache - your access point for all tables and data¶
# Update this to a valid directory in your filesystem. This is where the data will be stored.
output_dir = "/path/to/vbn_cache"
cache = VisualBehaviorNeuropixelsProjectCache.from_s3_cache(
cache_dir=Path(output_dir))
Introduction to the metadata tables¶
The data manifest is comprised of 5 types of tables:
- ecephys_sessions_table (64 kB)
- behavior_sessions_table (531 kB)
- units_table (130 MB)
- probes_table (127 kB)
- channels_table (28 MB)
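As a rough sketch, the five tables can be collected into a dictionary of pandas DataFrames through getters on the cache. Only get_ecephys_session_table is used later in this notebook; the other four getter names are assumed to follow the same naming pattern, and load_metadata_tables is a hypothetical helper, not part of AllenSDK.

```python
from typing import Dict

import pandas as pd

# Getter names on the project cache. Only get_ecephys_session_table appears
# in this notebook; the other four are assumed to follow the same pattern.
TABLE_GETTERS = {
    "ecephys_sessions": "get_ecephys_session_table",
    "behavior_sessions": "get_behavior_session_table",
    "units": "get_unit_table",
    "probes": "get_probe_table",
    "channels": "get_channel_table",
}


def load_metadata_tables(cache) -> Dict[str, pd.DataFrame]:
    """Hypothetical helper: call each getter on `cache` and return the tables by name."""
    return {name: getattr(cache, getter)() for name, getter in TABLE_GETTERS.items()}
```

With the cache created above, tables = load_metadata_tables(cache) would give access to, for example, tables["units"].shape. Note that the units table is by far the largest of the five, so loading it takes the longest.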
The ecephys_sessions_table contains metadata for every Neuropixels recording session in the dataset. We attempted to insert 6 probes for each experiment, but occasionally individual insertions failed. The probe_count column tells you how many probes were inserted for a given session. The structure_acronyms column indicates which brain areas were targeted. For the majority of mice, there are two recording sessions. These were run on consecutive days with two different image sets, G and H. The experience_level column tells you whether the image set used for a particular recording was the same as the training image set (Familiar), or different from the training image set (Novel).
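The columns described above lend themselves to simple boolean filtering. The sketch below uses a small made-up DataFrame as a stand-in for ecephys_sessions_table (the IDs and values are invented, and the ecephys_session_id index name is an assumption) to select Novel sessions in which all 6 probe insertions succeeded.

```python
import pandas as pd

# Toy stand-in for ecephys_sessions_table; all values are made up for illustration.
toy_sessions = pd.DataFrame(
    {
        "mouse_id": [1001, 1001, 1002, 1002],
        "image_set": ["G", "H", "G", "H"],
        "experience_level": ["Familiar", "Novel", "Novel", "Familiar"],
        "probe_count": [6, 5, 6, 6],
    },
    index=pd.Index([11, 12, 21, 22], name="ecephys_session_id"),
)

# Keep Novel sessions in which all 6 probe insertions succeeded.
is_novel = toy_sessions["experience_level"] == "Novel"
has_all_probes = toy_sessions["probe_count"] == 6
novel_full = toy_sessions[is_novel & has_all_probes]

print(novel_full)
```

The same two masks should work unchanged on the real table, provided the column names match those listed for it.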
The behavior_sessions_table contains metadata for each behavior session. Some behavior sessions have Neuropixels data associated with them, while others took place during training in the behavior facility. The different training stages that mice progressed through are described by the session_type.
The units_table contains metadata for every unit in the release. Each unit can be linked to the corresponding recording session, probe and channel by the ecephys_session_id, ecephys_probe_id and ecephys_channel_id columns. This table also contains a number of helpful quality metrics, which can be used to filter out contaminated units before analysis. For more guidance on how to use these metrics, check out this tutorial.
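To make the linking concrete, here is a minimal sketch that attaches channel metadata to each unit through ecephys_channel_id. The two DataFrames are made-up stand-ins for the units and channels tables; the unit_id index name, the choice of ecephys_channel_id as the channels index, and the structure_acronym column are assumptions for illustration.

```python
import pandas as pd

# Toy stand-ins for the units and channels tables; all values are made up.
toy_units = pd.DataFrame(
    {
        "ecephys_session_id": [11, 11, 12],
        "ecephys_probe_id": [101, 102, 103],
        "ecephys_channel_id": [5001, 5002, 5003],
    },
    index=pd.Index([1, 2, 3], name="unit_id"),
)
toy_channels = pd.DataFrame(
    {"structure_acronym": ["VISp", "CA1", "VISl"]},
    index=pd.Index([5001, 5002, 5003], name="ecephys_channel_id"),
)

# Look up each unit's channel row by matching the unit's ecephys_channel_id
# column against the index of the channels table. The unit index is preserved.
units_with_area = toy_units.join(toy_channels, on="ecephys_channel_id")

# Example: keep only the units recorded on channels assigned to VISp.
visp_units = units_with_area[units_with_area["structure_acronym"] == "VISp"]
print(visp_units)
```

The same pattern applies to ecephys_probe_id and ecephys_session_id when probe-level or session-level metadata is needed alongside each unit.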
The probes_table contains metadata for each probe insertion.
The channels_table contains metadata for each channel recorded during an ephys session. This table provides useful information about where a particular channel is located in the Allen Common Coordinate Framework, as well as its relative position on the probe.
Now let's look at a few of these tables in more detail to get a better sense of the dataset.
Ecephys Sessions Table¶
Load the ecephys_sessions_table
from the cache¶
First let's just look at the columns to see what metadata is provided for each session:
ecephys_sessions_table = cache.get_ecephys_session_table()
ecephys_sessions_table.columns
Index(['behavior_session_id', 'date_of_acquisition', 'equipment_name', 'session_type', 'mouse_id', 'genotype', 'sex', 'project_code', 'age_in_days', 'unit_count', 'probe_count', 'channel_count', 'structure_acronyms', 'image_set', 'prior_exposures_to_image_set', 'session_number', 'experience_level', 'prior_exposures_to_omissions', 'file_id', 'abnormal_histology', 'abnormal_activity'], dtype='object')
This table gives us lots of useful metadata about each recording session, including the genotype, sex and age of the mouse that was run, what brain areas were recorded and some important info about the stimulus.
To demystify a few of these columns, let's briefly review the experimental design. Each mouse was trained with one of two image sets (G
or H
). For the majority of mice, we recorded two sessions: one with the trained 'familiar' image set and one with a 'novel' image set. Note that two of the eight images were shared across these two image sets as diagrammed below for an example mouse. For this mouse, image set G
(images on blue and purple backgrounds) was used in training and was therefore 'familiar', while image set H
(the two holdover images on purple background plus six novel images on red background) was 'novel'.
So, each recording session can be defined by a few parameters, including the image_set
used (G or H), the experience_level
of the mouse (indicating whether the mouse had seen the image set in previous training sessions) and the session_number
(indicating whether it was the first or second recording day for the mouse). In the bottom bubble of the diagram above, you can see the three different training/recording trajectories that mice in this dataset took:
- Train on G; see G on first recording day; see H on second recording day
- Train on G; see H on first recording day; see G on second recording day
- Train on H; see H on first recording day; see G on second recording day
The numbers in the recording session cells indicate how many of each session type exist in this dataset. The first number is what the SDK returns by default. The second number (in parentheses) is what the SDK returns without filtering for abnormalities (see below as well as the Data Access tutorial).
Let's look at how many sessions we have of each type in our ecephys_sessions_table to corroborate these numbers:
sessions_by_imageset_experience_day = ecephys_sessions_table.pivot_table(index=['session_number', 'experience_level'],
columns=['image_set'],
values='behavior_session_id', aggfunc=len)
display(sessions_by_imageset_experience_day)
session_number | experience_level | G | H |
---|---|---|---|
1 | Familiar | 38.0 | 10.0 |
1 | Novel | NaN | 3.0 |
2 | Familiar | 3.0 | NaN |
2 | Novel | 10.0 | 39.0 |
From this table we can see that for most mice, image set G
was used for training (and therefore Familiar
) and was also shown during the first recording session (38 sessions in the left cell of the first row above). For these mice, image set H
was novel
and shown on the second recording day (right cell of last row). Then there were 3 mice for which image set G
was used in training, but the novel image set H
was shown on the first recording day and G
on the second (rows 2 and 3). Finally, there were 10 mice trained on the H
image set. All 10 of these saw H
on their first recording day and G
on their second (rows 1 and 4).
Keep in mind that although we aimed to record two sessions from each mouse, not every recording session passed quality control. Thus for some mice, only one of the two recording days is represented in the table. This explains the discrepancy between the first and last rows of the table above.
Also note that the probes are retracted after every recording (these are acute recordings). Moreover, we shift the probes by roughly 100 microns between session 1 and session 2, so it is not possible to match neurons across the two recording days (and it is unlikely that we are recording from the same neurons).
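Since some mice contribute only one recording day, it can be handy to find them before running any paired Familiar/Novel comparison. Here is a minimal sketch of one way to do that. It uses a small made-up table (`toy_sessions`) in place of the real `ecephys_sessions_table`, so the IDs and values are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for ecephys_sessions_table (all values are made up)
toy_sessions = pd.DataFrame(
    {
        "mouse_id": [1001, 1001, 1002, 1003, 1003],
        "session_number": [1, 2, 2, 1, 2],
        "experience_level": ["Familiar", "Novel", "Novel", "Familiar", "Novel"],
    },
    index=pd.Index([11, 12, 13, 14, 15], name="ecephys_session_id"),
)

# Count how many recording sessions each mouse contributes
sessions_per_mouse = toy_sessions.groupby("mouse_id").size()

# Mice with exactly one session in the table
single_session_mice = sessions_per_mouse[sessions_per_mouse == 1].index.tolist()

# Keep only sessions from mice that have both recording days
paired_sessions = toy_sessions[
    toy_sessions["mouse_id"].map(sessions_per_mouse) == 2
]

print("Mice with a single session:", single_session_mice)
print(paired_sessions)
```

The same pattern should carry over to the real table, since `mouse_id` is one of its columns.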
The transgenic line determines which neurons are labeled with ChR2 for identification by optotagging.¶
We can also use the ecephys_sessions
table to find the genotype of the mouse used for each recording session. Across the dataset, we used mice of three genotypes:
- C57Bl6J (wt)
- Sst-IRES-Cre;Ai32 to optotag putative Sst-expressing interneurons
- Vip-IRES-Cre;Ai32 to optotag putative Vip-expressing interneurons
Somatostatin-positive neurons (Sst) and Vasoactive Intestinal Polypeptide neurons (Vip) constitute two of the three major cortical inhibitory cell classes, and by crossing these lines to the Ai32 reporter line, we can identify these neurons by optotagging.
We also record from C57Bl6J wt/wt
mice.
print('the different transgenic lines included in this dataset are:\n')
print(np.sort(ecephys_sessions_table.genotype.unique()))
the different transgenic lines included in this dataset are: ['Sst-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt' 'Vip-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt' 'wt/wt']
#Number of sessions per genotype/sex
sessions_by_genotype_sex = ecephys_sessions_table.pivot_table(values='session_number', index='genotype',
columns='sex', aggfunc=len)
display(sessions_by_genotype_sex.rename(columns={'session_number': 'session_count'}))
sex | F | M |
---|---|---|
genotype | ||
Sst-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt | 24 | 19 |
Vip-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt | 4 | 18 |
wt/wt | 15 | 23 |
Now let's see how many mice we have for each genotype/sex. The column mouse_id
gives us a unique identifier for each mouse in the dataset.
#Number of mice per genotype/sex
mice_by_genotype_sex = ecephys_sessions_table.pivot_table(values='mouse_id', index='genotype',
columns='sex', aggfunc=lambda x: len(np.unique(x)))
display(mice_by_genotype_sex)
sex | F | M |
---|---|---|
genotype | ||
Sst-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt | 12 | 11 |
Vip-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt | 2 | 9 |
wt/wt | 8 | 12 |
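The full genotype strings are long, so it can be convenient to derive a short label before grouping or filtering. The sketch below shows one possible approach on a made-up table. The `line` column name and the toy values are assumptions and not part of the dataset. It also counts mice with `groupby(...).nunique()`, which is an alternative to the lambda used above.

```python
import pandas as pd

# Toy stand-in for ecephys_sessions_table (all values are made up)
toy_sessions = pd.DataFrame(
    {
        "mouse_id": [1, 1, 2, 3, 3, 4],
        "sex": ["F", "F", "M", "M", "M", "F"],
        "genotype": [
            "Sst-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt",
            "Sst-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt",
            "Vip-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt",
            "wt/wt",
            "wt/wt",
            "wt/wt",
        ],
    }
)

# Short label: the text before the first '-' for Cre lines, 'wt' for wild type.
# 'line' is a hypothetical column name introduced for this example.
toy_sessions["line"] = (
    toy_sessions["genotype"].str.split("-").str[0].replace({"wt/wt": "wt"})
)

# Unique mice per line and sex
mice_per_line_sex = (
    toy_sessions.groupby(["line", "sex"])["mouse_id"].nunique().unstack(fill_value=0)
)

# Select only the sessions from Sst mice
sst_sessions = toy_sessions[toy_sessions["line"] == "Sst"]

print(mice_per_line_sex)
print(sst_sessions)
```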
Lastly, there are two columns that flag potential abnormalities in brain tissue or electrical activity. Because many analyses may not be affected by these issues, we decided to release these sessions anyway. By default, however, cache.get_ecephys_session_table() excludes them. You can get all of the sessions with this call:
ecephys_sessions_no_filter = cache.get_ecephys_session_table(filter_abnormalities=False)
print(f'Number of sessions returned by default: {len(ecephys_sessions_table)}')
print(f'Number of sessions returned without filtering abnormalities: {len(ecephys_sessions_no_filter)}')
Number of sessions returned by default: 103
Number of sessions returned without filtering abnormalities: 153
The abnormal_histology
column indicates where for each mouse we noted possible bleeding or insertion damage. This will be a list of brain regions.
The abnormal_activity
column indicates when during the session we noted possible epileptiform activity. This will be a list of times in seconds.
Let's grab one of these 'abnormal' sessions to see what these columns look like:
# grab a session that was flagged for both tissue damage and epileptiform activity
ecephys_sessions_no_filter[['abnormal_histology', 'abnormal_activity']]\
[~ecephys_sessions_no_filter['abnormal_histology'].isnull() &
~ecephys_sessions_no_filter['abnormal_activity'].isnull()].iloc[0]
abnormal_histology ['Thalamus'] abnormal_activity [8899] Name: 1044389060, dtype: object
For this example session, it looks like we annotated potential damage in the Thalamus, and irregular firing activity 8899 seconds into the session. For more details about how these abnormalities are flagged, check out the technical white paper.
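Depending on your analysis, you may want to tolerate one kind of flag but not the other. The sketch below shows how the two columns could be used to build custom filters. It runs on a made-up table (`toy_sessions`) and assumes, as in the cell above, that unflagged sessions hold null values in these columns.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the unfiltered sessions table (all values are made up)
toy_sessions = pd.DataFrame(
    {
        "abnormal_histology": [np.nan, ["Thalamus"], np.nan, ["VISp", "CA1"]],
        "abnormal_activity": [np.nan, [8899], [1200, 3400], np.nan],
    },
    index=pd.Index([101, 102, 103, 104], name="ecephys_session_id"),
)

# Boolean masks: True where nothing was flagged
no_histology_flag = toy_sessions["abnormal_histology"].isnull()
no_activity_flag = toy_sessions["abnormal_activity"].isnull()

# Sessions with no flags at all
clean_sessions = toy_sessions[no_histology_flag & no_activity_flag]

# Sessions with clean histology, regardless of flagged activity
histology_ok_sessions = toy_sessions[no_histology_flag]

print(clean_sessions.index.tolist())
print(histology_ok_sessions.index.tolist())
```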
Behavior Sessions Table¶
In this dataset, mice are trained on a visual change detection task. This task involves a continuous stream of stimuli, and mice learn to lick in response to a change in the stimulus identity to earn a water reward. There are different stages of training in this task, described below. The metadata for each behavior session in the dataset can be found in the behavior_sessions_table
and can be used to build a training history for each mouse.
Load the behavior_sessions_table
from the cache¶
behavior_sessions = cache.get_behavior_session_table()
print(f"Total number of behavior sessions: {len(behavior_sessions)}")
behavior_sessions.head()
Total number of behavior sessions: 3424
equipment_name | genotype | mouse_id | sex | age_in_days | session_number | prior_exposures_to_session_type | prior_exposures_to_image_set | prior_exposures_to_omissions | ecephys_session_id | date_of_acquisition | session_type | image_set | file_id | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
behavior_session_id | ||||||||||||||
1055198750 | NP.1 | wt/wt | 531237 | F | 158 | 45 | 5 | 38 | 0 | 1055182255 | 2020-10-07 09:40:30.407000+00:00 | HABITUATION_5_images_G_handoff_ready_3uL_reward | G | 1022 |
1052162513 | NP.1 | Vip-IRES-Cre/wt;Ai32(RCL-ChR2(H134R)_EYFP)/wt | 530862 | M | 143 | 41 | 2 | 29 | 0 | 1051331372 | 2020-09-18 10:18:25.503000+00:00 | HABITUATION_5_images_G_handoff_ready_3uL_reward | G | 1023 |
1056100734 | NP.1 | wt/wt | 531237 | F | 163 | 48 | 8 | 41 | 0 | 1056089509 | 2020-10-12 12:31:57.133000+00:00 | HABITUATION_5_images_G_handoff_ready_3uL_reward | G | 1024 |
1055397049 | NP.1 | wt/wt | 531237 | F | 159 | 46 | 6 | 39 | 0 | 1055386631 | 2020-10-08 10:46:16.623000+00:00 | HABITUATION_5_images_G_handoff_ready_3uL_reward | G | 1025 |
1052162543 | NP.1 | wt/wt | 533537 | M | 128 | 30 | 0 | 23 | 0 | 1052096186 | 2020-09-22 10:54:33.351000+00:00 | HABITUATION_5_images_G_handoff_ready_5uL_reward | G | 1026 |
What columns does the behavior_session table have and what values can they take?¶
behavior_sessions.columns
Index(['equipment_name', 'genotype', 'mouse_id', 'sex', 'age_in_days', 'session_number', 'prior_exposures_to_session_type', 'prior_exposures_to_image_set', 'prior_exposures_to_omissions', 'ecephys_session_id', 'date_of_acquisition', 'session_type', 'image_set', 'file_id'], dtype='object')
Behavior sessions can take place on different experimental systems¶
print('behavior data could be recorded on these experimental systems:\n')
print(np.sort(behavior_sessions.equipment_name.unique()))
behavior data could be recorded on these experimental systems: ['BEH.B-Box1' 'BEH.B-Box2' 'BEH.B-Box3' 'BEH.B-Box4' 'BEH.B-Box5' 'BEH.B-Box6' 'BEH.D-Box1' 'BEH.D-Box2' 'BEH.D-Box3' 'BEH.D-Box4' 'BEH.D-Box5' 'BEH.D-Box6' 'BEH.F-Box1' 'BEH.F-Box2' 'BEH.F-Box3' 'BEH.F-Box4' 'BEH.F-Box5' 'BEH.F-Box6' 'BEH.G-Box1' 'BEH.G-Box2' 'BEH.G-Box3' 'BEH.G-Box4' 'BEH.G-Box5' 'BEH.G-Box6' 'NP.0' 'NP.1']
equipment_name
values starting with 'BEH' indicate behavioral training in the behavior facility, while values starting with 'NP' indicate behavior sessions that took place on an experimental Neuropixels rig.
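Based on this naming convention, a session's location can be derived from the prefix of `equipment_name`. Here is a small sketch on made-up rows. The `location` column and its labels are hypothetical names chosen for this example.

```python
import pandas as pd

# Toy stand-in for the behavior sessions table (IDs are made up)
toy_behavior = pd.DataFrame(
    {"equipment_name": ["BEH.G-Box5", "BEH.B-Box1", "NP.0", "NP.1"]},
    index=pd.Index([1, 2, 3, 4], name="behavior_session_id"),
)

# True for sessions run on a Neuropixels rig
on_np_rig = toy_behavior["equipment_name"].str.startswith("NP")

# Hypothetical helper column labelling where each session took place
toy_behavior["location"] = on_np_rig.map(
    {True: "neuropixels_rig", False: "behavior_facility"}
)

print(toy_behavior)
```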
The mouse_id
is a 6-digit unique identifier for each experimental animal in the dataset
print('there are', len(behavior_sessions.mouse_id.unique()), 'mice in the dataset')
there are 81 mice in the dataset
Other mouse specific metadata includes sex
, age_in_days
and genotype
.
Session Type - a very important piece of information¶
The session_type
for each behavior session indicates the behavioral training stage or Neuropixels experiment conditions for that particular session. This determines what stimuli were shown and what task parameters were used.
print('the session_types available in this dataset are:\n')
print(np.sort(behavior_sessions.session_type.unique()))
the session_types available in this dataset are: ['EPHYS_1_images_G_3uL_reward' 'EPHYS_1_images_G_5uL_reward' 'EPHYS_1_images_H_3uL_reward' 'EPHYS_1_images_H_5uL_reward' 'HABITUATION_5_images_G_handoff_ready_3uL_reward' 'HABITUATION_5_images_G_handoff_ready_5uL_reward' 'HABITUATION_5_images_H_handoff_ready_3uL_reward' 'HABITUATION_5_images_H_handoff_ready_5uL_reward' 'TRAINING_0_gratings_autorewards_15min' 'TRAINING_0_gratings_autorewards_15min_0uL_reward' 'TRAINING_1_gratings' 'TRAINING_1_gratings_10uL_reward' 'TRAINING_2_gratings_flashed' 'TRAINING_2_gratings_flashed_10uL_reward' 'TRAINING_3_images_G_10uL_reward' 'TRAINING_3_images_H_10uL_reward' 'TRAINING_4_images_G_training' 'TRAINING_4_images_G_training_7uL_reward' 'TRAINING_4_images_H_training_7uL_reward' 'TRAINING_5_images_G_epilogue' 'TRAINING_5_images_G_epilogue_5uL_reward' 'TRAINING_5_images_G_handoff_lapsed_5uL_reward' 'TRAINING_5_images_G_handoff_ready_5uL_reward' 'TRAINING_5_images_H_epilogue_5uL_reward' 'TRAINING_5_images_H_handoff_lapsed_5uL_reward' 'TRAINING_5_images_H_handoff_ready_5uL_reward']
You can see that each session type is prepended with an indicator of when this session was run in the training sequence (for example TRAINING_0
or TRAINING_1
). Mice progress through a series of training stages to shape their behavior prior to recording. Mice are automatically advanced between stages depending on their behavioral performance. For a detailed description of the change detection task and advancement criteria, please see the technical whitepaper.
Training with the change detection task begins with simple static grating stimuli, changing between 0 and 90 degrees in orientation. On the very first day, mice are automatically given a water reward when the orientation of the stimulus changes (TRAINING_0_gratings_autorewards_15min
). On subsequent days, mice must lick following the change in order to receive a water reward (TRAINING_1_gratings
). In the next stage, stimuli are flashed, separated by a 500 ms inter-stimulus interval during which a mean-luminance gray screen is shown (TRAINING_2_gratings_flashed
).
Once mice perform the task well with gratings, they are transitioned to natural image stimuli. Different groups of mice are trained with different sets of images, image set G
or H
(described above). In the following description, we use X
as a placeholder for G
or H
in the session_type
name. Training with images begins with a 10ul water reward volume (TRAINING_3...
), which is then decreased to 7ul once mice perform the task consistently with images (TRAINING_4...
). If mice continue to perform well, they are advanced to TRAINING_5_images_X_epilogue_5uL_reward
, during which they are exposed to the receptive field mapping stimulus that will be used during Neuropixels recordings and the reward is further reduced to 5 ul. When mice have reached criterion to be transferred to the Neuropixels portion of the experiment, they are labeled as 'handoff_ready' (TRAINING_5_images_X_handoff_ready_5uL_reward
). If behavior performance falls back below criterion before they are handed off, they are labeled as 'handoff_lapsed' (TRAINING_5_images_X_handoff_lapsed_5uL_reward
). You may notice inconsistencies in the suffix for a few of these stage names. This reflects a minor change we made early on during data collection to reduce the reward volume from 7ul for TRAINING_5
to 5ul. After that, we added the volume explicitly to the stage name.
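Because the stage names follow a regular pattern, their parts can be pulled into separate columns with regular expressions. The following is a rough sketch that works on a handful of the names listed above. The output column names (`phase`, `stage`, `stimulus`, `reward_uL`) are our own choices, and the patterns are only assumed to cover the names shown in this notebook.

```python
import pandas as pd

# A few session_type values copied from the list above
session_types = pd.Series(
    [
        "TRAINING_0_gratings_autorewards_15min",
        "TRAINING_3_images_G_10uL_reward",
        "HABITUATION_5_images_H_handoff_ready_3uL_reward",
        "EPHYS_1_images_G_3uL_reward",
    ]
)

# Phase (TRAINING/HABITUATION/EPHYS), stage number and stimulus class
head = session_types.str.extract(
    r"^(?P<phase>[A-Z]+)_(?P<stage>\d+)_(?P<stimulus>gratings|images)"
)
# Image set letter, present only for image stages
image_set = session_types.str.extract(r"_images_(?P<image_set>[GH])(?:_|$)")
# Reward volume, present only when the name ends in '<N>uL_reward'
reward = session_types.str.extract(r"_(?P<reward_uL>\d+)uL_reward$")

parsed = pd.concat([head, image_set, reward], axis=1)
parsed["reward_uL"] = pd.to_numeric(parsed["reward_uL"])

print(parsed)
```

Names without an explicit volume (such as the first one here) end up with a missing `reward_uL`, which mirrors the suffix inconsistency described above.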
So now, let's look at the training history for 1 mouse to see how this unfolds:
training_history = behavior_sessions[behavior_sessions['mouse_id']==556016]
training_history = training_history.sort_values(by='date_of_acquisition')
training_history[['session_type', 'equipment_name', 'date_of_acquisition', 'prior_exposures_to_image_set', 'prior_exposures_to_omissions']]
session_type | equipment_name | date_of_acquisition | prior_exposures_to_image_set | prior_exposures_to_omissions | |
---|---|---|---|---|---|
behavior_session_id | |||||
1079461839 | TRAINING_0_gratings_autorewards_15min_0uL_reward | BEH.G-Box5 | 2021-01-29 11:20:57.848000+00:00 | <NA> | 0 |
1080309252 | TRAINING_1_gratings_10uL_reward | BEH.G-Box5 | 2021-02-01 10:30:41.010000+00:00 | <NA> | 0 |
1080567296 | TRAINING_1_gratings_10uL_reward | BEH.G-Box5 | 2021-02-02 10:41:01.736000+00:00 | <NA> | 0 |
1081055727 | TRAINING_1_gratings_10uL_reward | BEH.G-Box5 | 2021-02-03 10:08:22.073000+00:00 | <NA> | 0 |
1081407988 | TRAINING_1_gratings_10uL_reward | BEH.G-Box5 | 2021-02-04 10:57:59.323000+00:00 | <NA> | 0 |
1081665901 | TRAINING_2_gratings_flashed_10uL_reward | BEH.G-Box5 | 2021-02-05 10:17:01.333000+00:00 | <NA> | 0 |
1082287921 | TRAINING_3_images_G_10uL_reward | BEH.G-Box5 | 2021-02-08 10:26:15.260000+00:00 | 0 | 0 |
1082721365 | TRAINING_3_images_G_10uL_reward | BEH.G-Box5 | 2021-02-09 10:11:40.992000+00:00 | 1 | 0 |
1082978971 | TRAINING_3_images_G_10uL_reward | BEH.G-Box5 | 2021-02-10 10:36:09.169000+00:00 | 2 | 0 |
1083179250 | TRAINING_4_images_G_training_7uL_reward | BEH.G-Box5 | 2021-02-11 09:55:53.764000+00:00 | 3 | 0 |
1083988326 | TRAINING_5_images_G_epilogue_5uL_reward | BEH.G-Box5 | 2021-02-15 11:08:10.871000+00:00 | 4 | 0 |
1084214262 | TRAINING_5_images_G_epilogue_5uL_reward | BEH.G-Box5 | 2021-02-16 10:40:55.939000+00:00 | 5 | 0 |
1084416013 | TRAINING_5_images_G_epilogue_5uL_reward | BEH.G-Box5 | 2021-02-17 10:05:26.648000+00:00 | 6 | 0 |
1084925549 | TRAINING_5_images_G_handoff_ready_5uL_reward | BEH.G-Box5 | 2021-02-18 10:30:15.669000+00:00 | 7 | 0 |
1085100426 | TRAINING_5_images_G_handoff_ready_5uL_reward | BEH.G-Box5 | 2021-02-19 10:22:15.645000+00:00 | 8 | 0 |
1085697640 | HABITUATION_5_images_G_handoff_ready_5uL_reward | NP.0 | 2021-02-22 10:35:13.367000+00:00 | 9 | 0 |
1085945525 | HABITUATION_5_images_G_handoff_ready_5uL_reward | NP.0 | 2021-02-23 09:55:39.738000+00:00 | 10 | 0 |
1086167535 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-02-24 10:03:34.439000+00:00 | 11 | 0 |
1086376321 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-02-25 09:28:17.677000+00:00 | 12 | 0 |
1086799944 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-02-26 10:16:30.479000+00:00 | 13 | 0 |
1087320527 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-03-01 10:01:49.984000+00:00 | 14 | 0 |
1087522745 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-03-02 11:24:41.073000+00:00 | 15 | 0 |
1087696440 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-03-03 09:40:18.913000+00:00 | 16 | 0 |
1087922676 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-03-04 09:47:42.840000+00:00 | 17 | 0 |
1088237976 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-03-05 10:26:29.594000+00:00 | 18 | 0 |
1088815471 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-03-08 12:57:46.523000+00:00 | 19 | 0 |
1089049814 | HABITUATION_5_images_G_handoff_ready_3uL_reward | NP.0 | 2021-03-09 12:56:02.899000+00:00 | 20 | 0 |
1089343256 | EPHYS_1_images_G_3uL_reward | NP.0 | 2021-03-10 14:24:06.841000+00:00 | 21 | 0 |
1089636567 | EPHYS_1_images_H_3uL_reward | NP.0 | 2021-03-11 15:14:20.619000+00:00 | 0 | 1 |
This table shows that mouse 556016 completed 29 sessions, starting with non-contingent rewards for changes in grating orientation during TRAINING_0
, and ending with two Neuropixels recording sessions running the change detection task with flashing images (the EPHYS
stages). All sessions before the HABITUATION
stage were run in behavior boxes. From HABITUATION
on, sessions were run on the experimental Neuropixels rig NP.0
.
The prior_exposures_to_image_set
column indicates how many times the mouse had seen the image set used in a particular session. For example, by the time the mouse above reached its first recording day (EPHYS_1_images_G_3uL_reward
), it had already seen the G
image set in 21 previous sessions. On the second recording day, it was exposed to the H
image set for the first time.
The EPHYS
sessions run during Neuropixels recordings are the first time the mouse encounters omitted stimuli. During these sessions, we omit a little under 5% of the stimulus flashes to investigate temporal expectation signals. The prior_exposures_to_omissions
column indicates whether the mouse has encountered omissions in a previous recording session. Note that it is '0' for all but the second recording day.
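To build some intuition for how an exposure count like `prior_exposures_to_image_set` behaves, here is a sketch that recomputes a similar counter from a made-up session history. This is our own illustration and not necessarily how the SDK derives the column. In particular, it only counts sessions present in the table it is given.

```python
import pandas as pd

# Toy session history for one mouse (all values are made up)
toy_history = pd.DataFrame(
    {
        "mouse_id": [7, 7, 7, 7, 7],
        "date_of_acquisition": pd.to_datetime(
            ["2021-02-05", "2021-02-08", "2021-02-09", "2021-03-10", "2021-03-11"]
        ),
        # Grating sessions have no image set
        "image_set": [None, "G", "G", "G", "H"],
    }
)

toy_history = toy_history.sort_values("date_of_acquisition")

# Only sessions that used an image set can have an exposure count
has_images = toy_history["image_set"].notna()

# Within each mouse and image set, number the sessions 0, 1, 2, ... in date order.
# Rows without an image set are left as NaN by index alignment.
toy_history["recomputed_exposures"] = (
    toy_history[has_images].groupby(["mouse_id", "image_set"]).cumcount()
)

print(toy_history)
```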
Units, Probes and Channels Tables¶
Now let's look at the units, probes and channels tables in a bit more detail. We'll start with the units table, which contains info about every unit recorded in this dataset:
units = cache.get_unit_table()
print(f'This dataset contains {len(units)} total units')
units.head()
This dataset contains 319013 total units
ecephys_channel_id | ecephys_probe_id | ecephys_session_id | amplitude_cutoff | anterior_posterior_ccf_coordinate | dorsal_ventral_ccf_coordinate | left_right_ccf_coordinate | cumulative_drift | d_prime | structure_acronym | ... | valid_data | amplitude | waveform_duration | waveform_halfwidth | PT_ratio | recovery_slope | repolarization_slope | spread | velocity_above | velocity_below | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
unit_id | |||||||||||||||||||||
1157005856 | 1157001834 | 1046469925 | 1046166369 | 0.500000 | 8453.0 | 3353.0 | 6719.0 | 140.32 | 6.088133 | MB | ... | True | 286.132665 | 0.151089 | 0.096147 | 0.310791 | -0.227726 | 0.961313 | 20.0 | -0.457845 | NaN |
1157005853 | 1157001834 | 1046469925 | 1046166369 | 0.323927 | 8453.0 | 3353.0 | 6719.0 | 239.76 | 4.635583 | MB | ... | True | 181.418835 | 0.357119 | 0.192295 | 0.531490 | -0.150522 | 0.732741 | 30.0 | 2.060302 | -2.060302 |
1157005720 | 1157001786 | 1046469925 | 1046166369 | 0.044133 | 8575.0 | 3842.0 | 6590.0 | 263.32 | 5.691955 | MRN | ... | True | 180.866205 | 0.521943 | 0.178559 | 0.612217 | -0.024239 | 0.539687 | 80.0 | 0.000000 | 0.863364 |
1157006074 | 1157001929 | 1046469925 | 1046166369 | 0.000583 | 8212.0 | 2477.0 | 6992.0 | 154.64 | 6.049284 | NOT | ... | True | 574.984215 | 0.343384 | 0.192295 | 0.470194 | -0.356670 | 2.258649 | 40.0 | 1.373534 | 0.000000 |
1157006072 | 1157001929 | 1046469925 | 1046166369 | 0.500000 | 8212.0 | 2477.0 | 6992.0 | 242.58 | 4.745499 | NOT | ... | True | 315.794115 | 0.329648 | 0.164824 | 0.488276 | -0.210010 | 1.320270 | 70.0 | 0.412060 | 0.343384 |
5 rows × 34 columns
For more information about many of the metrics included in this table and how to use them to guide your analysis, see our quality metrics tutorial. For now, here's a brief description of each column:
General Metadata
ecephys_channel_id
: unique ID of channel on which unit's peak waveform occurred
ecephys_probe_id
: unique ID for probe on which unit was recorded
ecephys_session_id
: unique ID for session during which unit was recorded
anterior_posterior_ccf_coordinate
: CCF coord in the AP axis
dorsal_ventral_ccf_coordinate
: CCF coord in the DV axis
left_right_ccf_coordinate
: CCF coord in the left/right axis
structure_acronym
: CCF acronym for area to which unit was assigned
structure_id
: CCF structure ID for the area to which unit was assigned
probe_horizontal_position
: Horizontal (perpendicular to shank) probe position of each unit's peak channel in microns
probe_vertical_position
: Vertical (along shank) probe position of each unit's peak channel in microns
Waveform metrics: Look here for more detail on these metrics and the code that computes them. For the below descriptions the '1D waveform' is defined as the waveform on the peak channel. The '2D waveform' is the waveform across channels centered on the peak channel.
amplitude
: Peak to trough amplitude for mean 1D waveform in microvolts
waveform_duration
: Time from trough to peak for 1D waveform in milliseconds
waveform_halfwidth
: Width of 1D waveform at half-amplitude in milliseconds
PT_ratio
: Ratio of the max (peak) to the min (trough) amplitudes for 1D waveform
recovery_slope
: Slope of recovery of 1D waveform to baseline after repolarization (coming down from peak)
repolarization_slope
: Slope of repolarization of 1D waveform to baseline after trough
spread
: Range of channels for which the spike amplitude was above 12% of the peak channel amplitude
velocity_above
: Slope of spike propagation velocity traveling in dorsal direction from soma (note to avoid infinite values, this is actually the inverse of velocity: ms/mm)
velocity_below
: Slope of spike propagation velocity traveling in ventral direction from soma (note to avoid infinite values, this is actually the inverse of velocity: ms/mm)
snr
: signal-to-noise ratio for 1D waveform
Quality metrics: Look here for more detail on these metrics and the code that computes them.
amplitude_cutoff
: estimate of miss rate based on amplitude histogram (i.e., the fraction of spikes estimated to have been below detection threshold)
cumulative_drift
: cumulative change in spike depth along probe throughout the recording
d_prime
: classification accuracy based on LDA
firing_rate
: Mean firing rate over entire recording
isi_violations
: Ratio of refractory violation rate to total spike rate
isolation_distance
: Distance to nearest cluster in Mahalanobis space
l_ratio
: The Mahalanobis distance and chi-squared inverse cdf are used to find the probability of cluster membership for each spike.
max_drift
: Maximum change in unit depth across recording
nn_hit_rate
: Fraction of nearest neighbors in PCA space for spikes in unit cluster that are also in unit cluster
nn_miss_rate
: Fraction of nearest neighbors for spikes outside unit cluster that are in unit cluster
presence_ratio
: Fraction of time during session for which a unit was spiking
silhouette_score
: Standard metric for cluster overlap, computed in PCA space
quality
: Label assigned based on waveform shape as described here. Either 'good' for physiological waveforms or 'noise' for artifactual waveforms.
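As a quick illustration of how these quality metrics might be combined, the sketch below filters a made-up units table. The threshold values are examples only and are not a recommendation from this notebook. The right cutoffs depend on your analysis, so treat them as assumptions to adjust.

```python
import pandas as pd

# Toy stand-in for the units table (all values are made up)
toy_units = pd.DataFrame(
    {
        "isi_violations": [0.1, 0.9, 0.2, 0.05],
        "amplitude_cutoff": [0.05, 0.02, 0.5, 0.01],
        "presence_ratio": [0.99, 0.95, 0.97, 0.60],
        "quality": ["good", "good", "good", "noise"],
    },
    index=pd.Index([1, 2, 3, 4], name="unit_id"),
)

# Illustrative thresholds only; tune these for your own analysis
MAX_ISI_VIOLATIONS = 0.5
MAX_AMPLITUDE_CUTOFF = 0.1
MIN_PRESENCE_RATIO = 0.9

passes_qc = (
    (toy_units["isi_violations"] < MAX_ISI_VIOLATIONS)
    & (toy_units["amplitude_cutoff"] < MAX_AMPLITUDE_CUTOFF)
    & (toy_units["presence_ratio"] > MIN_PRESENCE_RATIO)
    & (toy_units["quality"] == "good")
)

filtered_units = toy_units[passes_qc]
print(f"{len(filtered_units)} of {len(toy_units)} toy units pass")
```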
Note that each unit can be traced to an experiment (ecephys_session_id
), probe (ecephys_probe_id
) and channel (ecephys_channel_id
). Let's filter this table to see all of the units recorded for one ecephys_session from our ecephys_sessions_table:
#grab the ecephys session id for one experiment; these session ids are the indices of the ecephys_sessions_table
session_id = ecephys_sessions_table.index.values[1]
session_units = units[units['ecephys_session_id']==session_id]
session_units.head()
ecephys_channel_id | ecephys_probe_id | ecephys_session_id | amplitude_cutoff | anterior_posterior_ccf_coordinate | dorsal_ventral_ccf_coordinate | left_right_ccf_coordinate | cumulative_drift | d_prime | structure_acronym | ... | valid_data | amplitude | waveform_duration | waveform_halfwidth | PT_ratio | recovery_slope | repolarization_slope | spread | velocity_above | velocity_below | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
unit_id | |||||||||||||||||||||
1053043842 | 1053041745 | 1044791095 | 1044594870 | 0.500000 | 8921.0 | 3358.0 | 7602.0 | 294.86 | 4.335807 | MRN | ... | True | 149.555445 | 0.357119 | 0.219765 | 0.567588 | -0.130128 | 0.574141 | 120.0 | 0.529792 | -1.079206 |
1053044108 | 1053041974 | 1044791095 | 1044594870 | 0.031346 | 9327.0 | 786.0 | 7877.0 | 0.00 | 3.455122 | VISp | ... | True | 430.525778 | 0.604355 | 0.206030 | 0.408889 | -0.135231 | 1.397335 | 70.0 | 0.961474 | -0.137353 |
1053044102 | 1053041974 | 1044791095 | 1044594870 | 0.000429 | 9327.0 | 786.0 | 7877.0 | 5.84 | 5.424172 | VISp | ... | True | 640.387114 | 0.645561 | 0.192295 | 0.405572 | -0.246621 | 1.997241 | 70.0 | 0.755444 | 0.480737 |
1053044101 | 1053041974 | 1044791095 | 1044594870 | 0.025936 | 9327.0 | 786.0 | 7877.0 | 14.03 | 7.543906 | VISp | ... | True | 593.836035 | 0.673032 | 0.206030 | 0.448527 | -0.175423 | 0.968392 | 50.0 | 0.480737 | 0.686767 |
1053044067 | 1053041974 | 1044791095 | 1044594870 | 0.001713 | 9327.0 | 786.0 | 7877.0 | 154.46 | 4.693757 | VISp | ... | True | 314.246010 | 0.274707 | 0.164824 | 0.562850 | -0.277458 | 1.396135 | 50.0 | 0.000000 | -0.343384 |
5 rows × 34 columns
# Count the probes that contributed units to this session
session_probes_from_units_table = np.sort(session_units.ecephys_probe_id.unique())
print(f'We recorded from {len(session_probes_from_units_table)} probes this session')
We recorded from 5 probes this session
Let's grab the probes table and check that when we filter by this ecephys session id, we get the same probes as above:
probes = cache.get_probe_table()
session_probes = probes[probes.ecephys_session_id==session_id].index.values
np.all(session_probes_from_units_table==session_probes)
True
The probes table has a bit more metadata about the probe type (Neuropixels 1.0), the areas that each probe passed through, and the unit count and sampling rates:
probes.head()
ecephys_session_id | name | sampling_rate | lfp_sampling_rate | phase | has_lfp_data | unit_count | channel_count | structure_acronyms | file_id | |
---|---|---|---|---|---|---|---|---|---|---|
ecephys_probe_id | ||||||||||
1044506933 | 1044385384 | probeB | 30000.178402 | 2500.014867 | 1.0 | True | 701 | 384 | ['CA1', 'DG', 'LP', 'POL', 'PoT', 'VISpm', 'ro... | 0 |
1044506934 | 1044385384 | probeC | 30000.049852 | 2500.004154 | 1.0 | True | 307 | 384 | ['MB', 'MRN', 'POST', 'SCig', 'VISp', 'root'] | 1 |
1044506935 | 1044385384 | probeD | 30000.029115 | 2500.002426 | 1.0 | True | 521 | 384 | ['CA1', 'CA3', 'DG', 'LGv', 'MB', 'TH', 'VISl'... | 2 |
1044506936 | 1044385384 | probeE | 30000.075851 | 2500.006321 | 1.0 | True | 282 | 384 | ['CA1', 'DG', 'MB', 'MGd', 'MGm', 'MRN', 'SGN'... | 3 |
1044506937 | 1044385384 | probeF | 29999.959578 | 2499.996631 | 1.0 | True | 368 | 384 | ['CA1', 'DG', 'LP', 'MRN', 'POL', 'PoT', 'SGN'... | 4 |
Now let's get the channels table:
channels = cache.get_channel_table()
channels.head()
ecephys_probe_id | ecephys_session_id | probe_channel_number | probe_vertical_position | probe_horizontal_position | anterior_posterior_ccf_coordinate | dorsal_ventral_ccf_coordinate | left_right_ccf_coordinate | structure_acronym | unit_count | valid_data | |
---|---|---|---|---|---|---|---|---|---|---|---|
ecephys_channel_id | |||||||||||
1049365509 | 1048089911 | 1047969464 | 0 | 20.0 | 43.0 | 8445.0 | 4013.0 | 6753.0 | MRN | 0 | True |
1049365511 | 1048089911 | 1047969464 | 1 | 20.0 | 11.0 | 8443.0 | 4005.0 | 6755.0 | MRN | 5 | True |
1049365512 | 1048089911 | 1047969464 | 2 | 40.0 | 59.0 | 8441.0 | 3997.0 | 6757.0 | MRN | 0 | True |
1049365513 | 1048089911 | 1047969464 | 3 | 40.0 | 27.0 | 8439.0 | 3989.0 | 6759.0 | MRN | 5 | True |
1049365514 | 1048089911 | 1047969464 | 4 | 60.0 | 43.0 | 8438.0 | 3981.0 | 6761.0 | MRN | 7 | True |
channels.probe_channel_number.max()
383
We can join the channels and units tables to get full CCF info about every unit.
#first let's merge the units and channels tables
session_units_channels = session_units.merge(channels, left_on='ecephys_channel_id', right_index=True)
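One thing to be aware of with this merge: the units and channels tables share several column names (for example `ecephys_probe_id` and the CCF coordinates), so pandas appends `_x` (units) and `_y` (channels) to the overlapping columns. The sketch below demonstrates this on two tiny made-up tables, along with two possible alternatives: explicit suffixes, or merging only the columns that the units table lacks.

```python
import pandas as pd

# Toy stand-ins for the units and channels tables (all values are made up)
toy_units = pd.DataFrame(
    {
        "ecephys_channel_id": [10, 11],
        "ecephys_probe_id": [500, 500],
        "structure_acronym": ["VISp", "CA1"],
    },
    index=pd.Index([1, 2], name="unit_id"),
)
toy_channels = pd.DataFrame(
    {
        "ecephys_probe_id": [500, 500],
        "structure_acronym": ["VISp", "CA1"],
        "probe_channel_number": [0, 1],
    },
    index=pd.Index([10, 11], name="ecephys_channel_id"),
)

# Default behavior: overlapping columns get '_x' (left) and '_y' (right)
default_merge = toy_units.merge(
    toy_channels, left_on="ecephys_channel_id", right_index=True
)

# Option 1: choose more descriptive suffixes
labelled_merge = toy_units.merge(
    toy_channels,
    left_on="ecephys_channel_id",
    right_index=True,
    suffixes=("_unit", "_channel"),
)

# Option 2: bring in only the channel columns the units table does not have
new_columns = toy_channels.columns.difference(toy_units.columns)
slim_merge = toy_units.merge(
    toy_channels[new_columns], left_on="ecephys_channel_id", right_index=True
)

print(list(default_merge.columns))
print(list(labelled_merge.columns))
print(list(slim_merge.columns))
```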
Now let's use this info to plot each unit's CCF position grouped by probe for our example session:
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
fig.set_size_inches([14,8])
ax = fig.add_subplot(111, projection='3d')
def plot_probe_coords(probe_group):
    ax.scatter(probe_group['left_right_ccf_coordinate_x'],
               probe_group['anterior_posterior_ccf_coordinate_x'],
               -probe_group['dorsal_ventral_ccf_coordinate_x'], #reverse the z coord so that down is into the brain
               )
    return probe_group['ecephys_probe_id_x'].values[0]
probe_ids = session_units_channels.groupby('ecephys_probe_id_x').apply(plot_probe_coords)
ax.set_zlabel('D/V')
ax.set_xlabel('Left/Right')
ax.set_ylabel('A/P')
ax.legend(probe_ids)
ax.view_init(elev=55, azim=70)
You can see that these probe trajectories wiggle a bit. That's because we're plotting them in CCF space. When we warp the brains into this space, the probe trajectories can bend.