File Formats

This section documents the format of JABS output files that may be needed for downstream analysis.

Prediction File

A prediction file represents the predicted classes for each identity present in one video file.

Location

The prediction files are saved in <JABS project dir>/jabs/predictions/<video_name>.h5 if they were generated by the JABS GUI.

The jabs-classify script saves prediction files in <out-dir>/<video_name>_behavior.h5.
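The two locations above can be built programmatically. This is a minimal sketch: both helper names are hypothetical, and it assumes <video_name> is the video file name without its extension, which this section does not state explicitly.

```python
from pathlib import Path


def gui_prediction_path(project_dir, video_name):
    """Path of a prediction file written by the JABS GUI."""
    return Path(project_dir) / "jabs" / "predictions" / f"{video_name}.h5"


def classify_prediction_path(out_dir, video_name):
    """Path of a prediction file written by the jabs-classify script."""
    return Path(out_dir) / f"{video_name}_behavior.h5"


# Assumption: video_name is the video file name with the extension removed.
print(gui_prediction_path("my_project", "video_1").as_posix())
# my_project/jabs/predictions/video_1.h5
print(classify_prediction_path("results", "video_1").as_posix())
# results/video_1_behavior.h5
```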

Attributes

The root of the file contains the following attributes:

pose_file: filename of the pose file used during prediction
pose_hash: blake2b hash of pose file
version: prediction output version

Each behavior prediction group contains the following attributes:

classifier_file: filename of the classifier file used to predict
classifier_hash: blake2b hash of the classifier file
app_version: JABS application version used to make predictions
prediction_date: date when predictions were made

predicted_class

dtype: 8-bit integer
shape: #identities x #frames

This dataset contains the predicted class. Each element contains one of three values:

0: "not behavior"
1: "behavior"
-1: no prediction
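A common downstream step is to collapse one identity's row of predicted_class into bouts. The sketch below assumes the row has already been loaded into a plain Python sequence (reading it from the HDF5 file is not shown). find_bouts is a hypothetical helper, not part of JABS.

```python
from itertools import groupby


def find_bouts(predicted_class_row, target=1):
    """Return (start_frame, end_frame_exclusive) for each run equal to target."""
    bouts = []
    frame = 0
    for value, run in groupby(predicted_class_row):
        length = sum(1 for _ in run)
        if value == target:
            bouts.append((frame, frame + length))
        frame += length
    return bouts


# -1 = no prediction, 0 = "not behavior", 1 = "behavior"
row = [-1, -1, 0, 1, 1, 1, 0, 0, 1, -1]
print(find_bouts(row))  # [(3, 6), (8, 9)]
```

Frames with -1 break a bout in the same way as "not behavior" frames. Whether that is appropriate depends on the analysis.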

predicted_class_postprocessed [optional]

dtype: 8-bit integer
shape: #identities x #frames

This dataset contains the predicted class after applying postprocessing. Each element contains one of three values:

0: "not behavior"
1: "behavior"
-1: no prediction

probabilities

dtype: 32-bit floating point
shape: #identities x #frames

This dataset contains the probability (0.0-1.0) of each prediction. If there is no prediction (the identity doesn't exist at a given frame) then the prediction probability is 0.0.
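The probabilities can be combined with predicted_class to keep only confident behavior frames. This sketch assumes both rows for one identity are already loaded as Python sequences and reads each probability as the probability of the class that was predicted. The helper name and the 0.8 threshold are arbitrary illustrations.

```python
def confident_behavior_frames(predicted_class_row, probability_row, min_probability=0.8):
    """Return frame indices predicted as "behavior" with at least min_probability."""
    return [
        frame
        for frame, (cls, prob) in enumerate(zip(predicted_class_row, probability_row))
        if cls == 1 and prob >= min_probability
    ]


classes = [-1, 0, 1, 1, 1]
probabilities = [0.0, 0.9, 0.95, 0.6, 0.8]
print(confident_behavior_frames(classes, probabilities))  # [2, 4]
```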

identity_to_track [optional]

dtype: 32-bit integer
shape: #identities x #frames

This dataset is only present when using version 3 of the JABS pose estimation format. It maps each JABS-assigned identity back to the original track ID from the pose file at each frame. -1 indicates the identity does not map to a track for that frame. For Pose File Version 4 and greater, JABS uses the identity assignment contained in the pose file, and identity_to_track is omitted from the prediction file.
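To see which original tracks make up an identity, one row of identity_to_track can be summarized as contiguous segments. This is a sketch over an already-loaded row; track_segments is a hypothetical helper.

```python
from itertools import groupby


def track_segments(identity_to_track_row):
    """Return (track_id, start_frame, end_frame_exclusive) for each mapped run."""
    segments = []
    frame = 0
    for track_id, run in groupby(identity_to_track_row):
        length = sum(1 for _ in run)
        if track_id != -1:  # -1 means the identity has no track at this frame
            segments.append((track_id, frame, frame + length))
        frame += length
    return segments


row = [-1, 4, 4, 4, -1, 7, 7]
print(track_segments(row))  # [(4, 1, 4), (7, 5, 7)]
```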

Feature File

A feature file represents features calculated by JABS for a single animal in a video file.

Location

Feature files are saved per identity at <JABS project dir>/jabs/features/<video_name>/<identity>/features.h5.
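Because there is one features.h5 per identity directory, the feature files for a video can be discovered by globbing. This is a sketch under the layout described above; feature_files is a hypothetical helper, and the identity key is simply the directory name.

```python
from pathlib import Path


def feature_files(project_dir, video_name):
    """Map each identity directory name to its features.h5 path."""
    video_dir = Path(project_dir) / "jabs" / "features" / video_name
    if not video_dir.is_dir():
        return {}
    return {
        path.parent.name: path
        for path in sorted(video_dir.glob("*/features.h5"))
    }
```

Calling feature_files(project_dir, video_name) returns an empty dict when no features have been generated for that video.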

The H5 file contains feature data described in the feature documentation. Features used in JABS classifiers are located within the features group, further separated by per_frame and window_features_<window_size> groups. Features not used in JABS classifiers are located outside the features group.

All features are a vector of data containing the feature value for each frame in the video.

The root of the file contains the following attributes:

distance_scale_factor: scale factor used when converting from pixel space to cm space
identity: identity value from the original pose file
num_frames: number of frames in the video
pose_hash: blake2b hash of pose file
version: feature version used when generating this feature file

Per-Frame Feature Names

Per-frame features are named <feature module> <feature name>. Feature modules cannot contain spaces, while feature names can.

Window Feature Names

Window features are named <feature module> <window operation> <feature name>. Feature modules and window operations cannot contain spaces, while feature names can.
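Since only the feature name may contain spaces, both naming schemes can be split with a bounded split. The sketch below uses made-up feature names purely for illustration; they are not real JABS features.

```python
def parse_per_frame_name(name):
    """Split '<feature module> <feature name>' into its two parts."""
    module, feature = name.split(" ", 1)
    return module, feature


def parse_window_name(name):
    """Split '<feature module> <window operation> <feature name>' into three parts."""
    module, operation, feature = name.split(" ", 2)
    return module, operation, feature


# Hypothetical names, chosen only to show that the feature name keeps its spaces.
print(parse_per_frame_name("module_a some feature"))
# ('module_a', 'some feature')
print(parse_window_name("module_a mean some feature"))
# ('module_a', 'mean', 'some feature')
```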

NWB Pose File

JABS can export pose estimation data to NWB (Neurodata Without Borders) format using the ndx-pose 0.2 extension. The resulting files contain all animal keypoints, static and dynamic environmental objects, and supporting metadata needed for a lossless round-trip back into JABS.

See NWB Export for the CLI command and output-mode options.

Output modes

JABS writes NWB in two modes:

Mode	When to use
Per-identity (default)	DANDI archive upload; tools that expect one subject per file
Multisubject (`--multisubject`)	A single shareable file holding every subject (via the ndx-multisubjects extension)

In per-identity mode (the default), one file is written per animal, named {output_stem}_{identity_name}.nwb; the OUTPUT path itself is not created. Static and dynamic objects are session-level data, so they are written identically to every per-identity file. The JABS reader re-assembles per-identity files transparently: point it at any sibling and it merges all siblings into a single result in the original identity order.

In multisubject mode, all identities are written into a single self-contained .nwb file (an NdxMultiSubjectsNWBFile with a SubjectsTable) using the ndx-multisubjects extension.
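The per-identity naming rule can be sketched as follows. The helper is hypothetical, and it assumes {output_stem} is the OUTPUT file name with its suffix removed.

```python
from pathlib import Path


def per_identity_paths(output, identity_names):
    """Expected per-identity file paths for a given OUTPUT path."""
    output = Path(output)
    return [
        output.with_name(f"{output.stem}_{name}.nwb")
        for name in identity_names
    ]


paths = per_identity_paths("out/session.nwb", ["subject_1", "subject_2"])
print([p.as_posix() for p in paths])
# ['out/session_subject_1.nwb', 'out/session_subject_2.nwb']
```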

Full file layout

The layout below shows a multisubject file with two animal identities, two static objects (corners, lixit), and one dynamic object (fecal_boli).

NdxMultiSubjectsNWBFile
├── acquisition/
│   └── SubjectsTable                      [DynamicTable] multisubject mode only — one row per subject
├── processing/
│   └── behavior/                          [ProcessingModule]
│       ├── Skeletons/                     [Skeletons container]
│       │   ├── subject/                   Skeleton — animal keypoints + edges
│       │   ├── corners/                   Skeleton — static object (4 nodes)
│       │   ├── lixit/                     Skeleton — static object (1 or 3 nodes)
│       │   └── fecal_boli/                Skeleton — dynamic object (max_count nodes)
│       │
│       ├── subject_1/                     [PoseEstimation] animal identity 0
│       │   ├── nose/                      [PoseEstimationSeries] num_frames timestamps
│       │   ├── left_ear/
│       │   └── ...
│       │
│       ├── subject_2/                     [PoseEstimation] animal identity 1
│       │   ├── nose/
│       │   └── ...
│       │
│       ├── corners/                       [PoseEstimation] static object
│       │   ├── corners_0/                 [PoseEstimationSeries] 1 timestamp
│       │   ├── corners_1/
│       │   ├── corners_2/
│       │   └── corners_3/
│       │
│       ├── lixit/                         [PoseEstimation] static object
│       │   └── lixit_0/                   [PoseEstimationSeries] 1 timestamp
│       │
│       ├── fecal_boli/                    [PoseEstimation] dynamic object
│       │   ├── fecal_boli_0/              [PoseEstimationSeries] n_predictions timestamps
│       │   ├── fecal_boli_1/
│       │   └── ...
│       │
│       ├── jabs_identity_mask             [TimeSeries] uint8 identity presence mask
│       ├── jabs_bounding_boxes_subject_1  [TimeSeries] optional, one per identity
│       └── jabs_bounding_boxes_subject_2  [TimeSeries] optional, one per identity
│
└── scratch/
    └── jabs_metadata/                     [ScratchData] JSON string (see below)

A per-identity file (the default) uses a plain NWBFile with the same processing/behavior layout, except there is no SubjectsTable, NWBFile.subject is populated (when subject metadata is provided), only one animal identity container is present, and jabs_identity_mask / jabs_bounding_boxes_<identity> cover that identity only.

Animal pose

Each animal identity is a PoseEstimation container in processing/behavior. The container name is the sanitized external ID from the pose file (e.g. a cage ID), or subject_{i} when no external IDs are available. A single Skeleton named subject is shared by all animal identities and stored in the Skeletons container.

The default mouse skeleton has 12 keypoints:

nose, left_ear, right_ear, base_neck,
left_front_paw, right_front_paw, center_spine,
left_rear_paw, right_rear_paw,
base_tail, mid_tail, tip_tail

PoseEstimationSeries fields (per keypoint)

Field	Value
`name`	Keypoint name (e.g. `"nose"`, `"left_ear"`)
`data`	shape `(num_frames, 2)` — `(x, y)` coordinates in pixels
`rate`	Frames per second (float)
`unit`	`"pixels"`
`reference_frame`	`"Top-left corner of video frame, x increases rightward, y increases downward"`
`confidence`	shape `(num_frames,)` — `0.0` = missing keypoint, `> 0.0` = valid
`confidence_definition`	`"0.0=invalid/missing keypoint, >0.0=valid keypoint"`
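Consumers typically mask out keypoints whose confidence is 0.0 before using the coordinates. The sketch below works on already-loaded data and confidence values for a single keypoint; valid_points is a hypothetical helper.

```python
def valid_points(data, confidence):
    """Return (x, y) per frame, or None where the keypoint is missing."""
    return [
        (x, y) if conf > 0.0 else None
        for (x, y), conf in zip(data, confidence)
    ]


data = [(10.0, 20.0), (0.0, 0.0), (12.5, 21.0)]  # shape (num_frames, 2)
confidence = [0.9, 0.0, 0.4]                     # shape (num_frames,)
print(valid_points(data, confidence))
# [(10.0, 20.0), None, (12.5, 21.0)]
```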

Identity mask

jabs_identity_mask is a TimeSeries (dtype uint8) recording whether each identity is present in each frame.

Mode	Shape stored in file	Shape returned by reader
Multisubject	`(num_frames, num_identities)`	`(num_identities, num_frames)`
Per-identity	`(num_frames,)`	`(1, num_frames)`
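The relationship between the stored and returned shapes in the table above is a transpose, with a leading axis added in per-identity mode. This is an illustration using nested lists, not the JABS reader's actual code.

```python
def to_reader_layout(stored_mask):
    """Convert a stored mask to (num_identities, num_frames) layout."""
    # Per-identity files store a flat (num_frames,) sequence.
    if stored_mask and not isinstance(stored_mask[0], (list, tuple)):
        return [list(stored_mask)]
    # Multisubject files store (num_frames, num_identities): transpose it.
    return [list(column) for column in zip(*stored_mask)]


multisubject = [[1, 0], [1, 1], [0, 1]]  # 3 frames, 2 identities
print(to_reader_layout(multisubject))    # [[1, 1, 0], [0, 1, 1]]

per_identity = [1, 1, 0]                 # 3 frames, 1 identity
print(to_reader_layout(per_identity))    # [[1, 1, 0]]
```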

Bounding boxes (optional)

When the source pose file contains bounding box data, one TimeSeries per identity is written with the name jabs_bounding_boxes_{identity_name}.

Property	Value
Shape stored in file	`(num_frames, 2, 2)`
Shape returned by reader	`(num_identities, num_frames, 2, 2)` (all stacked)
Unit	`"pixels"`

Format: [[upper_left_x, upper_left_y], [lower_right_x, lower_right_y]].
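Given that corner format, a box's width and height in pixels follow directly. box_size is a hypothetical helper shown only to make the corner order concrete.

```python
def box_size(box):
    """Return (width, height) in pixels for one bounding box."""
    (upper_left_x, upper_left_y), (lower_right_x, lower_right_y) = box
    return lower_right_x - upper_left_x, lower_right_y - upper_left_y


print(box_size([[10, 20], [110, 70]]))  # (100, 50)
```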

Static objects

Static objects are fixed-position spatial landmarks that do not move during a session. They are read from static_objects/ in JABS pose HDF5 files (pose format v5+).

Common static objects:

Object	Shape	Description
`corners`	`(4, 2)`	Four corners of the arena
`lixit`	`(1, 2)` or `(3, 2)`	Water spout — single tip, or tip + left + right
`food_hopper`	`(4, 2)`	Four corners of the food hopper opening

Each static object is a PoseEstimation container with a single timestamp (t = 0.0 s), one PoseEstimationSeries per keypoint, and a dedicated Skeleton. Nodes are named {object_name}_{i} (zero-indexed). Confidence is always 1.0 for static objects and should be ignored by consumers.

Example — `corners` (4 keypoints)

Skeletons/
  corners/
    nodes: ["corners_0", "corners_1", "corners_2", "corners_3"]

processing/behavior/
  corners/                         PoseEstimation
    corners_0/                     PoseEstimationSeries
      data:       [[10.0, 20.0]]   shape (1, 2)
      timestamps: [0.0]
      confidence: [1.0]
    corners_1/
      data:       [[300.0, 20.0]]
      ...

Example — `lixit` (3-keypoint variant)

Skeletons/
  lixit/
    nodes: ["lixit_0", "lixit_1", "lixit_2"]

processing/behavior/
  lixit/                           PoseEstimation
    lixit_0/                       tip
      data:       [[62.0, 166.0]]
      timestamps: [0.0]
      confidence: [1.0]
    lixit_1/                       left side
      data:       [[65.0, 160.0]]
      ...
    lixit_2/                       right side
      data:       [[60.0, 172.0]]
      ...

Dynamic objects

Dynamic objects are objects whose position or count may change over time. Unlike animal pose, predictions are not made every frame: only a sparse subset of frames is sampled. Dynamic objects were introduced in JABS pose format v7.

The source HDF5 pose file stores dynamic objects under dynamic_objects/{name}/:

Dataset	Shape	Description
`points`	`(n_predictions, max_count, 2)` single-keypoint; `(n_predictions, max_count, n_kp, 2)` multi-keypoint	Keypoint coordinates
`counts`	`(n_predictions,)`	Number of valid object instances at each prediction
`sample_indices`	`(n_predictions,)`	Frame indices at which predictions were made

The points dataset carries an optional HDF5 attribute axis_order ("xy" or "yx", default "yx"). JABS normalizes all coordinates to (x, y) on read.

In NWB, each dynamic object is a PoseEstimation container with n_predictions irregular timestamps (one per sampled frame). One PoseEstimationSeries is written per instance slot × keypoint combination.

Timestamps

timestamps[p] = sample_indices[p] / fps

# Recover on read:
sample_indices[p] = round(timestamps[p] * fps)
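The same conversion written as runnable Python, with a round-trip check. The function names are illustrative, and fps is assumed to be known (for example from the rate of the animal pose series).

```python
def to_timestamps(sample_indices, fps):
    """Convert sampled frame indices to timestamps in seconds."""
    return [index / fps for index in sample_indices]


def to_sample_indices(timestamps, fps):
    """Recover frame indices; round() absorbs floating-point error."""
    return [round(t * fps) for t in timestamps]


fps = 30.0
indices = [0, 15, 30, 53999]
timestamps = to_timestamps(indices, fps)
print(timestamps[:3])                              # [0.0, 0.5, 1.0]
print(to_sample_indices(timestamps, fps) == indices)  # True
```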

Instance slot validity via confidence

Not all max_count slots are occupied at every prediction. Occupancy is encoded in the confidence field:

confidence[p] = 1.0   if counts[p] > slot_index
              = 0.0   otherwise

Coordinate values in empty slots are meaningless padding and must not be used. counts can be recovered on read by summing slots where confidence > 0 at each prediction.
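The encoding and its inverse can be sketched in a few lines. Both helpers are hypothetical and operate on plain lists rather than on NWB objects.

```python
def slot_confidence(counts, slot_index):
    """Confidence series for one slot: 1.0 where the slot is occupied."""
    return [1.0 if count > slot_index else 0.0 for count in counts]


def recover_counts(confidence_by_slot):
    """Sum occupied slots at each prediction to rebuild counts."""
    return [
        sum(1 for conf in per_prediction if conf > 0)
        for per_prediction in zip(*confidence_by_slot)
    ]


counts = [2, 1, 0]  # three predictions
max_count = 3
by_slot = [slot_confidence(counts, slot) for slot in range(max_count)]
print(by_slot)
# [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(recover_counts(by_slot))  # [2, 1, 0]
```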

Node naming

Condition	Node name pattern	Example
`n_keypoints == 1`	`{name}_{slot}`	`fecal_boli_0`
`n_keypoints > 1`	`{name}_{slot}_{kp}`	`door_0_0`, `door_0_1`, `door_1_0`
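The naming rule in the table can be expressed as a small generator of node names. node_names is a hypothetical helper; the slot-major ordering for multi-keypoint objects is taken from the door example in this section.

```python
def node_names(name, max_count, n_keypoints):
    """Node names for a dynamic object, ordered slot by slot."""
    if n_keypoints == 1:
        return [f"{name}_{slot}" for slot in range(max_count)]
    return [
        f"{name}_{slot}_{kp}"
        for slot in range(max_count)
        for kp in range(n_keypoints)
    ]


print(node_names("fecal_boli", 3, 1))
# ['fecal_boli_0', 'fecal_boli_1', 'fecal_boli_2']
print(node_names("door", 2, 2))
# ['door_0_0', 'door_0_1', 'door_1_0', 'door_1_1']
```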

Example — `fecal_boli` (single keypoint per instance, up to 3 instances)

50 predictions were made; up to 3 fecal boli visible at once.

Skeletons/
  fecal_boli/
    nodes: ["fecal_boli_0", "fecal_boli_1", "fecal_boli_2"]

processing/behavior/
  fecal_boli/                      PoseEstimation
    fecal_boli_0/                  slot 0
      data:       shape (50, 2)
      timestamps: [t_0, t_1, ..., t_49]
      confidence: [1.0, 1.0, 0.0, ...]    # 1.0 where counts > 0
    fecal_boli_1/                  slot 1
      data:       shape (50, 2)
      timestamps: [t_0, t_1, ..., t_49]
      confidence: [1.0, 0.0, 0.0, ...]    # 1.0 where counts > 1
    fecal_boli_2/                  slot 2
      data:       shape (50, 2)
      timestamps: [t_0, t_1, ..., t_49]
      confidence: [0.0, 0.0, 0.0, ...]    # 1.0 where counts > 2

At prediction p=0, counts[0]=2: slots 0 and 1 are valid, slot 2 is padding.

Example — multi-keypoint dynamic object (`door`: 2 keypoints, up to 2 instances)

Skeletons/
  door/
    nodes: ["door_0_0", "door_0_1", "door_1_0", "door_1_1"]
    #          slot 0     slot 0     slot 1     slot 1
    #          kp 0       kp 1       kp 0       kp 1

processing/behavior/
  door/                            PoseEstimation
    door_0_0/                      slot 0, keypoint 0 — left edge
      data:       shape (30, 2)
      timestamps: [t_0, ..., t_29]
      confidence: [1.0, 1.0, ...]  # 1.0 where counts > 0
    door_0_1/                      slot 0, keypoint 1 — right edge
      data:       shape (30, 2)
      timestamps: [t_0, ..., t_29]
      confidence: [1.0, 1.0, ...]  # same as door_0_0 — slot-level validity
    door_1_0/                      slot 1, keypoint 0
      data:       shape (30, 2)
      timestamps: [t_0, ..., t_29]
      confidence: [0.0, 1.0, ...]  # 1.0 where counts > 1
    door_1_1/                      slot 1, keypoint 1
      data:       shape (30, 2)
      timestamps: [t_0, ..., t_29]
      confidence: [0.0, 1.0, ...]  # same as door_1_0

`jabs_metadata` scratch field

Every JABS NWB file contains a ScratchData object named jabs_metadata in the NWB scratch space. Its data field is a JSON string carrying JABS-specific metadata needed for a lossless round-trip. Tools that do not use the JABS reader can parse this JSON directly to recover identity ordering, subject metadata, and object classification. Note that pynwb returns PoseEstimationSeries in alphabetical order from HDF5, which scrambles the keypoint ordering; keypoint ordering is not stored in jabs_metadata, and the JABS reader restores it from the canonical keypoint index.

Keys

Key	Type	Present	Description
`format_version`	`int`	Always	JABS NWB format version. Currently `1`.
`identity_names`	`list[str]`	Always	Ordered list of animal identity container names. Defines identity order on read.
`num_identities`	`int`	Always	Total number of animal identities in the recording session.
`cm_per_pixel`	`float | null`	Always	Pixel-to-centimetre scale factor. `null` if not available.
`external_ids`	`list[str] | null`	Always	Original external identity names from the pose file (e.g. cage IDs). `null` if the pose file had no external IDs.
`subjects`	`dict[str, dict] | null`	Always	Per-identity biological metadata keyed by identity name, for all identities. Fields: `subject_id`, `sex`, `genotype`, `strain`, `age`, `weight`, `species`, `description`. `null` if none provided.
`metadata`	`dict`	Always	Provenance from the source pose file: `source_file`, `pose_format_version`, and optionally `source_file_hash`.
`static_object_names`	`list[str]`	When static objects present	Names of all static object `PoseEstimation` containers.
`dynamic_object_names`	`list[str]`	When dynamic objects present	Names of all dynamic object `PoseEstimation` containers.
`dynamic_object_shapes`	`dict[str, [int, int]]`	When dynamic objects present	Maps each dynamic object name to `[max_count, n_keypoints]`. Required to reconstruct the 4-D points array on read.
`multisubject`	`bool`	Multisubject mode only	`true` if this is a single multi-subject file written with the ndx-multisubjects extension.
`per_identity_files`	`bool`	Per-identity mode only	`true` if this file is one of a set of per-identity NWB files.
`source_identity_index`	`int`	Per-identity mode only	Zero-based index of the identity in this file. Used to restore original order when merging siblings.
`split_subject_count`	`int`	Per-identity mode only	Total number of subjects in the session across all split files. Used to validate all siblings are present before merging.

Example — multisubject file

{
  "format_version": 1,
  "multisubject": true,
  "identity_names": ["subject_1", "subject_2"],
  "num_identities": 2,
  "cm_per_pixel": 0.043,
  "external_ids": null,
  "subjects": {
    "subject_1": {
      "subject_id": "M123",
      "sex": "M",
      "genotype": "WT",
      "strain": "C57BL/6J",
      "age": "P70D",
      "weight": null,
      "species": "Mus musculus",
      "description": null
    },
    "subject_2": {
      "subject_id": "M124",
      "sex": "F",
      "genotype": "Shank3+/-",
      "strain": "C57BL/6J",
      "age": "P72D",
      "weight": null,
      "species": "Mus musculus",
      "description": null
    }
  },
  "metadata": {
    "source_file": "/data/session_pose_est_v7.h5",
    "pose_format_version": 7,
    "source_file_hash": "a3f1c8..."
  },
  "static_object_names": ["corners", "lixit"],
  "dynamic_object_names": ["fecal_boli"],
  "dynamic_object_shapes": {
    "fecal_boli": [3, 1]
  }
}

Example — per-identity file (identity 1 of 3)

{
  "format_version": 1,
  "identity_names": ["subject_1"],
  "num_identities": 3,
  "cm_per_pixel": 0.043,
  "external_ids": null,
  "subjects": {
    "subject_1": { "subject_id": "M123", "sex": "M", "genotype": "WT" },
    "subject_2": { "subject_id": "M124", "sex": "F", "genotype": "Shank3+/-" },
    "subject_3": { "subject_id": "M125", "sex": "M", "genotype": "WT" }
  },
  "metadata": { "source_file": "...", "pose_format_version": 7 },
  "static_object_names": ["corners", "lixit"],
  "dynamic_object_names": ["fecal_boli"],
  "dynamic_object_shapes": { "fecal_boli": [3, 1] },
  "per_identity_files": true,
  "source_identity_index": 0,
  "split_subject_count": 3
}

Per-identity files store the full subjects dict for all identities, not just the identity in that file, so each file is self-contained.
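A tool that merges per-identity files without the JABS reader could use source_identity_index and split_subject_count as sketched below. This illustrates the idea only and is not the JABS reader's implementation; order_siblings is a hypothetical helper that takes the jabs_metadata JSON strings already read from each sibling file.

```python
import json


def order_siblings(metadata_blobs):
    """Validate a set of sibling files and return identity names in original order."""
    metas = [json.loads(blob) for blob in metadata_blobs]
    if not metas:
        return []
    expected = metas[0]["split_subject_count"]
    indices = sorted(meta["source_identity_index"] for meta in metas)
    if indices != list(range(expected)):
        raise ValueError(f"expected identities 0..{expected - 1}, found {indices}")
    metas.sort(key=lambda meta: meta["source_identity_index"])
    return [meta["identity_names"][0] for meta in metas]


# Three siblings, deliberately listed out of order.
blobs = [
    json.dumps({
        "identity_names": [f"subject_{i + 1}"],
        "per_identity_files": True,
        "source_identity_index": i,
        "split_subject_count": 3,
    })
    for i in (2, 0, 1)
]
print(order_siblings(blobs))  # ['subject_1', 'subject_2', 'subject_3']
```

Passing only two of the three siblings raises ValueError, which mirrors the completeness check described for split_subject_count.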

Container classification

All PoseEstimation containers in processing/behavior are classified using the three explicit lists in jabs_metadata:

all PoseEstimation containers in behavior
    ├── name in identity_names        →  animal identity
    ├── name in static_object_names   →  static object
    └── name in dynamic_object_names  →  dynamic object

Using explicit lists rather than inference rules ensures classification remains correct if new container types are added in future format versions.
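The lookup can be sketched as follows, given the parsed jabs_metadata dict and the container names found in processing/behavior. The "unknown" fallback is an addition for illustration, and the object lists are read with a default because those keys are only present when such objects exist.

```python
def classify_containers(container_names, metadata):
    """Map each PoseEstimation container name to its kind."""
    kinds = {}
    for key, kind in (
        ("identity_names", "animal identity"),
        ("static_object_names", "static object"),
        ("dynamic_object_names", "dynamic object"),
    ):
        for name in metadata.get(key, []):
            kinds[name] = kind
    # "unknown" is a hypothetical label for names in none of the three lists.
    return {name: kinds.get(name, "unknown") for name in container_names}


metadata = {
    "identity_names": ["subject_1", "subject_2"],
    "static_object_names": ["corners", "lixit"],
    "dynamic_object_names": ["fecal_boli"],
}
print(classify_containers(["subject_1", "corners", "fecal_boli"], metadata))
# {'subject_1': 'animal identity', 'corners': 'static object', 'fecal_boli': 'dynamic object'}
```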

Coordinate system

All coordinates in JABS NWB files use the following convention:

Property	Value
Origin	Top-left corner of the video frame
x axis	Increases rightward (column direction)
y axis	Increases downward (row direction)
Units	Pixels

This applies to animal keypoints, static object points, and dynamic object points. NWB files always store coordinates in (x, y) order. Dynamic object coordinates may be stored as (y, x) in the HDF5 pose source (controlled by the axis_order attribute); JABS flips them to (x, y) before writing NWB.
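The axis normalization amounts to swapping each coordinate pair when the source is in (y, x) order. A minimal sketch on a flat list of points, with "yx" as the default per the axis_order description in the dynamic objects section; to_xy is a hypothetical helper.

```python
def to_xy(points, axis_order="yx"):
    """Return points as (x, y) tuples regardless of the source axis order."""
    if axis_order == "xy":
        return [(x, y) for x, y in points]
    if axis_order == "yx":
        return [(x, y) for y, x in points]
    raise ValueError(f"unsupported axis_order: {axis_order!r}")


print(to_xy([(166.0, 62.0)]))                   # [(62.0, 166.0)]
print(to_xy([(62.0, 166.0)], axis_order="xy"))  # [(62.0, 166.0)]
```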

Data not exported to NWB

The following data present in JABS pose HDF5 files is not included in the NWB output:

Data	Pose version	HDF5 location	Notes
Instance segmentation masks	v6+	`poseest/seg_data`	Per-frame per-identity binary or instance masks
Long-term segmentation IDs	v6+	`poseest/longterm_seg_id`	Identity tracking via segmentation
Instance segmentation IDs	v6+	`poseest/instance_seg_id`	Frame-level instance assignments
Segmentation external flags	v6+	`poseest/seg_external_flag`	Internal/external identity classification

There is no standard NWB or ndx-pose representation for instance segmentation masks. If you need this data downstream, read it directly from the source JABS pose HDF5 file.
