NWB Export

The jabs-cli convert-to-nwb command converts a JABS pose estimation HDF5 file to NWB (Neurodata Without Borders) format using the ndx-pose extension.

Optional dependency

NWB support is not installed by default. Install the nwb extra before use:

pip install "jabs-behavior-classifier[nwb]"

The extra adds pynwb and ndx-pose as dependencies.

Two output modes are available. Choose the mode based on how the files will be used:

| Mode | When to use |
| --- | --- |
| Combined (default) | Local analysis, sharing with collaborators who can parse JABS-specific fields |
| Per-identity (--per-identity) | DANDI archive upload, tools that expect one subject per file |

Command usage

jabs-cli convert-to-nwb INPUT_PATH OUTPUT [OPTIONS]

| Argument / Option | Description |
| --- | --- |
| INPUT_PATH | JABS pose HDF5 file, any version v2–v8. Format version is inferred automatically from the filename (e.g. _pose_est_v6.h5). |
| OUTPUT | Destination .nwb file. In --per-identity mode, used as a naming template; the file itself is not created directly. |
| --per-identity | Write one NWB file per identity instead of a single combined file. |
| --session-description TEXT | NWB session description string. Defaults to 'JABS PoseEstimation Data'. |
| --subjects PATH | Path to a JSON file with per-animal biological metadata. |
| --session-metadata PATH | Path to a JSON file with NWB session-level metadata (start time, experimenter, etc.). |
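
The filename-based version inference described for INPUT_PATH can be approximated in a few lines of Python. This is only a sketch: the regular expression and the function name infer_pose_version are assumptions made for illustration, not the pattern jabs-cli actually uses.

```python
import re
from pathlib import Path

# Hypothetical pattern: matches names ending in "_pose_est_v<N>.h5".
_POSE_VERSION_RE = re.compile(r"_pose_est_v(\d+)\.h5$")


def infer_pose_version(path):
    """Return the pose format version encoded in the filename, or None."""
    match = _POSE_VERSION_RE.search(Path(path).name)
    return int(match.group(1)) if match else None


print(infer_pose_version("session_pose_est_v6.h5"))  # 6
print(infer_pose_version("notes.txt"))               # None
```

A check like this can be useful before a batch conversion, to skip files whose names do not carry a version suffix.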

Examples

# Single combined file — all identities in one NWB file
jabs-cli convert-to-nwb session_pose_est_v6.h5 session.nwb

# One NWB file per identity (recommended for DANDI upload)
jabs-cli convert-to-nwb session_pose_est_v6.h5 session.nwb --per-identity

# Include per-animal metadata
jabs-cli convert-to-nwb session_pose_est_v6.h5 session.nwb --subjects subjects.json

# Specify session start time and experimenter
jabs-cli convert-to-nwb session_pose_est_v6.h5 session.nwb --session-metadata session.json

Subjects JSON format

Pass a JSON file to --subjects to attach per-animal biological metadata to the NWB output. Keys are identity names: use external IDs from the pose file when present (e.g. "mouse_a"), or subject_1, subject_2, … when the pose file has no external IDs.

DANDI requires species, sex, and either age or date_of_birth on every subject. All other fields are optional.

{
  "subject_1": {
    "subject_id": "M123",
    "sex": "M",
    "species": "Mus musculus",
    "age": "P70D",
    "genotype": "WT",
    "strain": "C57BL/6J"
  },
  "subject_2": {
    "subject_id": "M124",
    "sex": "F",
    "species": "Mus musculus",
    "age": "P72D",
    "genotype": "Shank3+/-",
    "strain": "C57BL/6J"
  }
}

| Field | Type | Notes |
| --- | --- | --- |
| subject_id | string | Lab identifier for the animal |
| sex | string | Required by DANDI. "M", "F", "U", or "O" |
| species | string | Required by DANDI. Latin binomial, e.g. "Mus musculus" |
| age | string | Required by DANDI (or date_of_birth). ISO 8601 duration, e.g. "P70D" (70 days) |
| date_of_birth | string | Alternative to age. ISO 8601 datetime, e.g. "2024-01-15T00:00:00+00:00" |
| genotype | string | Genetic background, e.g. "Shank3B+/-" |
| strain | string | Inbred strain, e.g. "C57BL/6J" |
| weight | string | Body weight, e.g. "25g" |
| description | string | Free-text notes |

In per-identity mode, subject metadata is written to both the standard NWBFile.subject field and the jabs_metadata scratch field. If no --subjects file is provided, a minimal subject with subject_id set to the identity name is written automatically. In combined mode, subject metadata is written only to jabs_metadata (see below).
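
Before running the export, it can help to check a subjects dictionary against the DANDI requirements listed above (species, sex, and either age or date_of_birth). The following is a minimal sketch using only the standard library; the helper name missing_dandi_fields is hypothetical and not part of jabs-cli.

```python
import json

# Fields the text above lists as required by DANDI.
REQUIRED = ("species", "sex")
AGE_FIELDS = ("age", "date_of_birth")


def missing_dandi_fields(subject):
    """Return the DANDI-required fields that are absent or empty."""
    missing = [key for key in REQUIRED if not subject.get(key)]
    if not any(subject.get(key) for key in AGE_FIELDS):
        missing.append("age or date_of_birth")
    return missing


subjects = {
    "subject_1": {"subject_id": "M123", "sex": "M",
                  "species": "Mus musculus", "age": "P70D"},
    "subject_2": {"subject_id": "M124", "sex": "F"},  # incomplete on purpose
}

for name, subject in subjects.items():
    print(name, missing_dandi_fields(subject))

# Serialize once every subject passes; save this text as the --subjects file.
subjects_json = json.dumps(subjects, indent=2)
```

The check only looks for presence; it does not validate that age is a well-formed ISO 8601 duration.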


Session metadata JSON format

Pass a JSON file to --session-metadata to set NWB session-level fields. This is the primary way to specify session_start_time, which is not currently stored in JABS pose files and otherwise defaults to the time the export was run.

{
  "session_start_time": "2024-03-15T10:30:00-05:00",
  "experimenter": ["Jane Smith", "John Doe"],
  "lab": "Kumar Lab",
  "institution": "The Jackson Laboratory",
  "experiment_description": "Open field test",
  "session_id": "session_001"
}

| Key | Type | Description |
| --- | --- | --- |
| session_start_time | ISO 8601 string | Recording start time. Should include a UTC offset (e.g. -05:00, +00:00, or Z). If no offset is provided, the time is assumed UTC and a warning is emitted. Defaults to export time. |
| experimenter | string or list of strings | Name(s) of the experimenter(s). |
| lab | string | Lab name. |
| institution | string | Institution name. |
| experiment_description | string | Free-text description of the experiment. |
| session_id | string | Lab-specific session identifier. |

All fields are optional. Unknown keys are ignored with a warning.
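
The offset rule for session_start_time (a missing UTC offset is treated as UTC) can be checked ahead of time with the standard library. This sketch mirrors the documented behaviour only; parse_session_start is a hypothetical helper, not the exporter's own code.

```python
from datetime import datetime, timezone


def parse_session_start(text):
    """Parse an ISO 8601 string; treat a missing offset as UTC.

    Returns (aware_datetime, offset_was_missing).
    """
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit offset first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        return parsed.replace(tzinfo=timezone.utc), True
    return parsed, False


start, offset_missing = parse_session_start("2024-03-15T10:30:00-05:00")
print(start.isoformat(), offset_missing)
```

If offset_missing comes back True for your own timestamps, adding the offset to the JSON file avoids the warning and removes any ambiguity about the recording's local time.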


Output modes

Combined file (default)

All identities from the recording session are written into a single NWB file.

session.nwb
  └── all identities, all objects

This is a non-standard use of NWB. The standard subject field (NWBFile.subject) supports only one subject per file, so combined files cannot populate it. Instead, all per-animal metadata is stored in the jabs_metadata scratch field (a JSON string). Tools that do not know about jabs_metadata will not see any subject metadata.

DANDI upload: Combined files will fail DANDI validation because NWBFile.subject is not set. Use per-identity mode for DANDI.

Per-identity (--per-identity)

One NWB file is written per animal. The OUTPUT path is used as a naming template; files are written as {output_stem}_{identity_name}.nwb in the same directory.

session_subject_1.nwb   ← identity 0 + all objects
session_subject_2.nwb   ← identity 1 + all objects
session_subject_3.nwb   ← identity 2 + all objects

This is the more standard output. Each file contains exactly one animal, so NWBFile.subject is populated with that animal's biological metadata (when provided via --subjects). Any standard NWB tool — including the DANDI archive — can read the subject field directly without knowing anything about JABS.

Identity names in the filenames come from external_ids in the pose file (sanitized for filesystem compatibility), or fall back to subject_1, subject_2, … when no external IDs are present. Static and dynamic objects are written to every per-identity file identically, since they are session-level data.
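
The naming template {output_stem}_{identity_name}.nwb can be reproduced with pathlib to predict which files a per-identity export will create. This is a sketch under one assumption: the identity names passed in are already sanitized, since the sanitization rules are not described here. The function name per_identity_paths is hypothetical.

```python
from pathlib import Path


def per_identity_paths(output, identity_names):
    """Expand OUTPUT into one path per identity, in the same directory."""
    output = Path(output)
    return [output.with_name(f"{output.stem}_{name}.nwb") for name in identity_names]


for path in per_identity_paths("out/session.nwb", ["subject_1", "subject_2"]):
    print(path.name)
```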

Reading per-identity files

The JABS reader re-assembles per-identity files transparently. Point it at any one sibling file; it detects the per_identity_files flag in jabs_metadata, finds all siblings, and merges them into a single result with all identities in their original order.
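
For tools that do not use the JABS reader, the same re-assembly idea can be sketched from the jabs_metadata keys alone: sort the siblings by source_identity_index and confirm that split_subject_count files are present. The code below is an illustration of that idea, not the JABS reader's implementation; order_siblings is a hypothetical name, and each input is assumed to be an already-decoded jabs_metadata dict.

```python
def order_siblings(sibling_metadata):
    """Sort per-identity metadata dicts into original identity order.

    Raises ValueError if the set is empty, incomplete, or inconsistent.
    """
    if not sibling_metadata:
        raise ValueError("no sibling metadata supplied")
    if not all(meta.get("per_identity_files") for meta in sibling_metadata):
        raise ValueError("not every file is a per-identity export")
    expected = sibling_metadata[0]["split_subject_count"]
    ordered = sorted(sibling_metadata, key=lambda meta: meta["source_identity_index"])
    indices = [meta["source_identity_index"] for meta in ordered]
    if indices != list(range(expected)):
        raise ValueError(f"expected identities 0..{expected - 1}, found {indices}")
    return ordered


siblings = [
    {"per_identity_files": True, "source_identity_index": 1, "split_subject_count": 2},
    {"per_identity_files": True, "source_identity_index": 0, "split_subject_count": 2},
]
print([meta["source_identity_index"] for meta in order_siblings(siblings)])
```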

Subject metadata and NWBFile.subject

| Mode | NWBFile.subject | jabs_metadata.subjects |
| --- | --- | --- |
| Combined | Not set | Set (all identities) |
| Per-identity | Set for this file's identity | Set (all identities) |

jabs_metadata.subjects always carries the full dict for all identities, even in per-identity files. This makes each file self-contained: the JABS reader can recover complete subject metadata from any sibling without loading the others.


NWB file structure

For the full format specification — including all field definitions, jabs_metadata keys, and worked examples for static and dynamic objects — see File Formats — NWB Pose File.

The layout below shows a combined file with two animal identities, two static objects (corners, lixit), and one dynamic object (fecal_boli).

NWBFile
├── subject/                               [Subject] per-identity mode only
├── processing/
│   └── behavior/                          [ProcessingModule]
│       ├── Skeletons/                     [Skeletons container]
│       │   ├── subject/                   Skeleton — animal keypoints + edges
│       │   ├── corners/                   Skeleton — static object (4 nodes)
│       │   ├── lixit/                     Skeleton — static object (1 or 3 nodes)
│       │   └── fecal_boli/                Skeleton — dynamic object (max_count nodes)
│       │
│       ├── subject_1/                     [PoseEstimation] animal identity 0
│       │   ├── nose/                      [PoseEstimationSeries] num_frames timestamps
│       │   ├── left_ear/
│       │   └── ...
│       │
│       ├── subject_2/                     [PoseEstimation] animal identity 1
│       │   ├── nose/
│       │   └── ...
│       │
│       ├── corners/                       [PoseEstimation] static object
│       │   ├── corners_0/                 [PoseEstimationSeries] 1 timestamp
│       │   ├── corners_1/
│       │   ├── corners_2/
│       │   └── corners_3/
│       │
│       ├── lixit/                         [PoseEstimation] static object
│       │   └── lixit_0/                   [PoseEstimationSeries] 1 timestamp
│       │
│       ├── fecal_boli/                    [PoseEstimation] dynamic object
│       │   ├── fecal_boli_0/              [PoseEstimationSeries] n_predictions timestamps
│       │   ├── fecal_boli_1/
│       │   └── ...
│       │
│       ├── jabs_identity_mask             [TimeSeries] uint8 identity presence mask
│       ├── jabs_bounding_boxes_subject_1  [TimeSeries] optional, one per identity
│       └── jabs_bounding_boxes_subject_2  [TimeSeries] optional, one per identity
└── scratch/
    └── jabs_metadata/                     [ScratchData] JSON string (see below)

In a per-identity file the layout is identical, except:

  • NWBFile.subject is populated (when subject metadata is provided)
  • Only one animal identity container is present
  • jabs_identity_mask / jabs_bounding_boxes_<identity> cover that identity only


Animal pose

Each animal identity is a PoseEstimation container in processing/behavior. The container name is the sanitized external ID from the pose file, or subject_1, subject_2, … (1-based) when no external IDs are available.

A single Skeleton named subject is shared by all animal identities and stored in the Skeletons container.

PoseEstimationSeries fields (per keypoint)

| Field | Value |
| --- | --- |
| name | Keypoint name (e.g. "nose", "left_ear") |
| data | Shape (num_frames, 2); (x, y) coordinates in pixels |
| rate | Frames per second (float) |
| unit | "pixels" |
| reference_frame | "Top-left corner of video frame, x increases rightward, y increases downward" |
| confidence | Shape (num_frames,); 0.0 = missing keypoint, > 0.0 = valid |
| confidence_definition | "0.0=invalid/missing keypoint, >0.0=valid keypoint" |
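
Because a confidence of 0.0 marks a missing keypoint, downstream code should mask those rows before using the coordinates. A minimal sketch, assuming the data and confidence arrays have been loaded into plain Python sequences (mask_invalid is a hypothetical helper):

```python
def mask_invalid(points, confidence):
    """Replace keypoints whose confidence is 0.0 with None."""
    return [pt if conf > 0.0 else None for pt, conf in zip(points, confidence)]


nose_xy = [(101.5, 220.0), (0.0, 0.0), (103.0, 221.5)]
nose_conf = [0.92, 0.0, 0.88]
print(mask_invalid(nose_xy, nose_conf))
```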

Identity mask

jabs_identity_mask is a TimeSeries that records whether each identity is present in each frame.

| Mode | Shape stored in file | Shape returned by reader |
| --- | --- | --- |
| Combined | (num_frames, num_identities) | (num_identities, num_frames) |
| Per-identity | (num_frames,) | (1, num_frames) |
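
The two layouts in the table above differ only by a transpose (combined) or an added leading axis (per-identity). The sketch below shows the conversion with plain lists so it runs without extra dependencies; to_reader_layout is a hypothetical name. With NumPy the same operations would be mask.T and mask[None, :].

```python
def to_reader_layout(stored):
    """Convert a stored identity mask to (num_identities, num_frames).

    Combined files store one row per frame: (num_frames, num_identities).
    Per-identity files store a flat sequence: (num_frames,).
    """
    if stored and isinstance(stored[0], (list, tuple)):
        # Transpose frames x identities into identities x frames.
        return [list(column) for column in zip(*stored)]
    # A flat per-identity mask becomes a single-row result.
    return [list(stored)]


combined = [[1, 0], [1, 1], [0, 1]]      # 3 frames, 2 identities
print(to_reader_layout(combined))         # [[1, 1, 0], [0, 1, 1]]
print(to_reader_layout([1, 1, 0]))        # [[1, 1, 0]]
```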

Bounding boxes (optional)

When the pose file contains bounding box data, one TimeSeries per identity is written with the name jabs_bounding_boxes_{identity_name}.

| Property | Value |
| --- | --- |
| Name | jabs_bounding_boxes_{identity_name} |
| Shape stored in file | (num_frames, 2, 2) |
| Shape returned by reader | (num_identities, num_frames, 2, 2) |

Format: [[upper_left_x, upper_left_y], [lower_right_x, lower_right_y]] in pixels.
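
Given the corner format above, the width and height of a box follow directly from the two corners. A small sketch for a single frame's box (bbox_size is a hypothetical helper):

```python
def bbox_size(box):
    """Return (width, height) in pixels for [[ulx, uly], [lrx, lry]]."""
    (ulx, uly), (lrx, lry) = box
    return lrx - ulx, lry - uly


print(bbox_size([[10.0, 20.0], [50.0, 80.0]]))  # (40.0, 60.0)
```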


Static objects

Static objects are fixed-position spatial landmarks that do not move during a session. They are read from static_objects/ in JABS pose HDF5 files (pose format v5+).

Common static objects:

| Object | Shape | Description |
| --- | --- | --- |
| corners | (4, 2) | Four corners of the arena |
| lixit | (1, 2) or (3, 2) | Water spout: single tip, or tip + left + right |
| food_hopper | (4, 2) | Four corners of the food hopper opening |

Each static object is a PoseEstimation container with a single timestamp (t = 0.0 s), one PoseEstimationSeries per keypoint, and a dedicated Skeleton. Nodes are named {object_name}_{i} (zero-indexed).

Each PoseEstimationSeries for a static object has data shape (1, 2): one row for the single timestamp and two columns for (x, y). Because the time dimension (1) is shorter than the spatial dimension (2), the DANDI validator will emit an NWBI.check_data_orientation warning for each static keypoint:

[NWBI.check_data_orientation] — Data may be in the wrong orientation. Time should be
in the first dimension, and is usually the longest dimension. Here, another dimension
is longer.

These warnings are expected and can be ignored. The check is a heuristic designed to catch transposed animal pose arrays; it fires a false positive for static objects, which legitimately have only one timestamp by definition.


Dynamic objects

Dynamic objects are objects whose position or count may change over time. Unlike animal pose, predictions are made only for a sparse subset of frames. Dynamic objects are available from JABS pose format v7+.

Each dynamic object is a PoseEstimation container with n_predictions irregular timestamps. One PoseEstimationSeries is written per instance slot × keypoint combination.

Instance slot occupancy is encoded in the confidence field:

  • confidence = 1.0 — slot is occupied at this prediction
  • confidence = 0.0 — slot is unoccupied; coordinate values are padding and must be ignored
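
The occupancy rule above means coordinates must always be filtered by confidence before use. The sketch below assumes each node's series has been loaded into a dict with "data" and "confidence" lists of length n_predictions; that in-memory layout and the name occupied_points are assumptions made for illustration.

```python
def occupied_points(slots, prediction_index):
    """Collect (node_name, (x, y)) pairs for slots occupied at one prediction."""
    return [
        (name, series["data"][prediction_index])
        for name, series in slots.items()
        # 0.0 marks padding; the exporter writes 1.0 for occupied slots.
        if series["confidence"][prediction_index] > 0.0
    ]


slots = {
    "fecal_boli_0": {"data": [(10.0, 12.0), (11.0, 13.0)], "confidence": [1.0, 1.0]},
    "fecal_boli_1": {"data": [(0.0, 0.0), (40.0, 41.0)], "confidence": [0.0, 1.0]},
}
print(occupied_points(slots, 0))
print(len(occupied_points(slots, 1)))
```

Counting the returned pairs gives the number of instances detected at that prediction.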

Node naming:

| Condition | Node name pattern | Example |
| --- | --- | --- |
| Single keypoint per instance | {name}_{slot} | fecal_boli_0 |
| Multiple keypoints per instance | {name}_{slot}_{kp} | door_0_0, door_0_1 |
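
The two naming patterns can be generated from an object's name and its [max_count, n_keypoints] shape. In this sketch the slot-major ordering of the returned list and the function name dynamic_node_names are assumptions; only the name patterns themselves come from the table above.

```python
def dynamic_node_names(name, max_count, n_keypoints):
    """List node names for a dynamic object with the given shape."""
    if n_keypoints == 1:
        return [f"{name}_{slot}" for slot in range(max_count)]
    return [
        f"{name}_{slot}_{kp}"
        for slot in range(max_count)
        for kp in range(n_keypoints)
    ]


print(dynamic_node_names("fecal_boli", 3, 1))  # ['fecal_boli_0', 'fecal_boli_1', 'fecal_boli_2']
print(dynamic_node_names("door", 1, 2))        # ['door_0_0', 'door_0_1']
```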

jabs_metadata scratch field

Every JABS NWB file contains a ScratchData object named jabs_metadata in the NWB scratch space. Its data field is a JSON string carrying JABS-specific metadata needed for a lossless round-trip. Standard NWB fields alone are insufficient because pynwb returns PoseEstimationSeries in alphabetical order from HDF5, which would otherwise scramble the keypoint ordering.

Tools that do not use the JABS reader can parse this JSON directly to recover ordered keypoint names, identity ordering, subject metadata, and object classification.

Keys

| Key | Type | Present | Description |
| --- | --- | --- | --- |
| format_version | int | Always | JABS NWB format version. Currently 1. |
| identity_names | list[str] | Always | Ordered list of animal identity container names. Defines identity order on read. |
| num_identities | int | Always | Total number of animal identities in the recording session. |
| body_parts | list[str] | Always | Ordered list of keypoint names for animal skeletons. |
| cm_per_pixel | float \| null | Always | Pixel-to-centimetre scale factor. null if not available. |
| external_ids | list[str] \| null | Always | Original external identity names from the pose file. null if the pose file had no external IDs. |
| subjects | dict[str, dict] \| null | Always | Per-identity subject metadata keyed by identity name, for all identities. null if no subject metadata is available. Fields: subject_id, sex, species, age, date_of_birth, genotype, strain, weight, description. DANDI requires species, sex, and either age or date_of_birth. |
| metadata | dict | Always | Provenance from the source pose file: source_file, pose_format_version, and optionally source_file_hash. |
| static_object_names | list[str] | When static objects present | Names of all static object PoseEstimation containers. |
| dynamic_object_names | list[str] | When dynamic objects present | Names of all dynamic object PoseEstimation containers. |
| dynamic_object_shapes | dict[str, [int, int]] | When dynamic objects present | Maps each dynamic object name to [max_count, n_keypoints]. |
| per_identity_files | bool | Per-identity mode only | true if this file is one of a set of per-identity NWB files. |
| source_identity_index | int | Per-identity mode only | Zero-based index of the identity in this file. |
| split_subject_count | int | Per-identity mode only | Total number of subjects in the session across all split files. |

Example — combined file

{
  "format_version": 1,
  "identity_names": ["subject_1", "subject_2"],
  "num_identities": 2,
  "body_parts": ["nose", "left_ear", "right_ear", "base_neck", "left_front_paw",
                 "right_front_paw", "center_spine", "left_rear_paw", "right_rear_paw",
                 "base_tail", "mid_tail", "tip_tail"],
  "cm_per_pixel": 0.043,
  "external_ids": null,
  "subjects": {
    "subject_1": {
      "subject_id": "M123",
      "sex": "M",
      "species": "Mus musculus",
      "age": "P70D",
      "genotype": "WT",
      "strain": "C57BL/6J",
      "weight": null,
      "description": null
    },
    "subject_2": {
      "subject_id": "M124",
      "sex": "F",
      "species": "Mus musculus",
      "age": "P72D",
      "genotype": "Shank3+/-",
      "strain": "C57BL/6J",
      "weight": null,
      "description": null
    }
  },
  "metadata": {
    "source_file": "/data/session_pose_est_v7.h5",
    "pose_format_version": 7,
    "source_file_hash": "a3f1c8..."
  },
  "static_object_names": ["corners", "lixit"],
  "dynamic_object_names": ["fecal_boli"],
  "dynamic_object_shapes": {
    "fecal_boli": [3, 1]
  }
}
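
For tools that parse jabs_metadata directly, restoring keypoint order amounts to re-indexing the alphabetically ordered series by the body_parts list. The sketch below assumes the JSON string has already been read from the scratch field and that the series are held in a dict keyed by keypoint name; order_keypoints is a hypothetical helper, and the placeholder strings stand in for real series objects.

```python
import json


def order_keypoints(series_by_name, jabs_metadata_json):
    """Return (name, series) pairs in the order recorded in body_parts."""
    metadata = json.loads(jabs_metadata_json)
    return [(name, series_by_name[name]) for name in metadata["body_parts"]]


# Series as they might come back from the file: alphabetical order.
alphabetical = {"base_neck": "series_c", "left_ear": "series_b", "nose": "series_a"}
metadata_json = json.dumps({"body_parts": ["nose", "left_ear", "base_neck"]})

for name, series in order_keypoints(alphabetical, metadata_json):
    print(name, series)
```

The same approach applies to identity ordering, using identity_names in place of body_parts.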

Coordinate system

All coordinates in JABS NWB files use the following convention:

| Property | Value |
| --- | --- |
| Origin | Top-left corner of the video frame |
| x axis | Increases rightward (column direction) |
| y axis | Increases downward (row direction) |
| Units | Pixels |

This applies to animal keypoints, static object points, and dynamic object points. NWB files always store coordinates in (x, y) order.
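
Since all coordinates are stored in pixels, converting to physical units means applying the cm_per_pixel value from jabs_metadata. A minimal sketch, assuming cm_per_pixel is the number of centimetres per pixel as its name suggests (pixels_to_cm is a hypothetical helper):

```python
def pixels_to_cm(point, cm_per_pixel):
    """Scale an (x, y) pixel coordinate to centimetres."""
    if cm_per_pixel is None:
        # jabs_metadata stores null when no scale is available.
        raise ValueError("cm_per_pixel is null; no scale available")
    x, y = point
    return x * cm_per_pixel, y * cm_per_pixel


print(pixels_to_cm((100.0, 200.0), 0.043))
```

The origin and axis directions are unchanged by this scaling: (0, 0) remains the top-left corner of the frame.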


Data not exported

Pose files v6 and later may contain instance segmentation data. This data is not included in the NWB output. See File Formats — Data not exported to NWB for the full list of omitted fields. If you need segmentation data, read it directly from the source JABS pose HDF5 file.