SignDATA: Data Pipeline for Sign Language Translation

Requires Python 3.11+.

A config-driven, modular pipeline for preprocessing multiple Sign Language (SL) datasets. Supports two landmark extractors (MediaPipe Holistic and MMPose RTMPose3D) and two output modes (pose landmarks and video clips).


✨ Key Features

📝 Config-Driven

YAML job configs, experiment configs, and CLI overrides

🦴 Two Extractors

MediaPipe Holistic (553 keypoints) and MMPose RTMPose3D (133 keypoints)

🎬 Two Pipeline Modes

pose (landmarks) and video (clip extraction)

🧩 Registry Architecture

Add datasets, processors, and extractors via decorators

⚡ Parallel Processing

Multi-worker extraction, normalization, and clipping

📦 WebDataset Output

Sharded tar archives for efficient training data loading
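As an illustration of the decorator-based registry idea, here is a minimal sketch. It is not the project's actual API: `EXTRACTORS`, `register_extractor`, `build_extractor`, and `DummyExtractor` are hypothetical names chosen for this example, and the real registry may be organized differently.

```python
from typing import Callable, Dict, Type

# Hypothetical registry: maps a short name to an extractor class.
EXTRACTORS: Dict[str, Type] = {}


def register_extractor(name: str) -> Callable[[Type], Type]:
    """Return a class decorator that records the class under `name`."""
    def decorator(cls: Type) -> Type:
        if name in EXTRACTORS:
            raise ValueError(f"extractor {name!r} is already registered")
        EXTRACTORS[name] = cls
        return cls  # the class itself is returned unchanged
    return decorator


@register_extractor("dummy")
class DummyExtractor:
    """Stand-in extractor used only for this illustration."""

    def extract(self, frames):
        # Produce one empty landmark list per input frame.
        return [[] for _ in frames]


def build_extractor(name: str):
    """Look up a registered class by name and instantiate it."""
    if name not in EXTRACTORS:
        raise KeyError(f"unknown extractor {name!r}; known: {sorted(EXTRACTORS)}")
    return EXTRACTORS[name]()
```

With this pattern, a config file only needs to carry the string name; importing the module that defines the class is enough to make it available to `build_extractor`.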

📖 New? See the Installation Guide to get started.


Installation

git clone https://github.com/balaboom123/signdata-slt.git
cd signdata-slt
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Optional: MMPose (GPU required)

MediaPipe works on CPU out of the box. MMPose requires a CUDA-capable GPU and additional dependencies -- see the Installation Guide for full setup instructions.


Quick Start

# Download YouTube-ASL videos, extract MediaPipe landmarks, normalize, and package into WebDataset shards
python -m signdata run configs/jobs/youtube_asl/mediapipe.yaml

# Extract MMPose landmarks from pre-downloaded How2Sign data (CUDA required)
python -m signdata run configs/jobs/how2sign/mmpose.yaml

# Override any config value from the command line (e.g. more workers, stop after extraction)
python -m signdata run configs/jobs/youtube_asl/mediapipe.yaml \
  --override processing.max_workers=8 stop_at=extract
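The dotted `key=value` overrides above can be thought of as edits to a nested config dictionary. The sketch below shows one way such overrides might be applied. It is an assumption about the mechanism, not the project's implementation; `parse_value` and `apply_overrides` are hypothetical helpers, and only the keys `processing.max_workers` and `stop_at` come from the commands above.

```python
import copy
from typing import Any, Dict, List


def parse_value(text: str) -> Any:
    """Convert an override string to bool, int, or float when it looks like one."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text  # fall back to the raw string


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Return a copy of `config` with dotted `key=value` overrides applied."""
    result = copy.deepcopy(config)  # leave the caller's config untouched
    for item in overrides:
        key, sep, raw = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        *parents, leaf = key.split(".")
        node = result
        for part in parents:
            # Walk down the nesting, creating intermediate dicts as needed.
            node = node.setdefault(part, {})
        node[leaf] = parse_value(raw)
    return result


# Example values are illustrative, not the project's defaults.
config = {"processing": {"max_workers": 4}, "stop_at": None}
updated = apply_overrides(config, ["processing.max_workers=8", "stop_at=extract"])
print(updated)  # {'processing': {'max_workers': 8}, 'stop_at': 'extract'}
```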

Output

Both modes produce WebDataset tar shards for efficient training data loading. See Pipeline Stages for detailed output formats and data shapes.
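For readers unfamiliar with the format: a WebDataset shard is an ordinary tar archive in which files sharing a basename form one sample. The sketch below writes and reads such a shard using only the standard library. The sample key, the `.json` and `.txt` extensions, and the helper names are assumptions made for illustration; the pipeline's actual member names and array encoding are described in Pipeline Stages.

```python
import io
import json
import tarfile


def add_member(tar: tarfile.TarFile, name: str, payload: bytes) -> None:
    """Add an in-memory byte string to the archive under `name`."""
    info = tarfile.TarInfo(name=name)
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))


def write_shard(fileobj, samples) -> None:
    """Write (key, landmarks, text) samples as grouped tar members."""
    with tarfile.open(fileobj=fileobj, mode="w") as tar:
        for key, landmarks, text in samples:
            # Members with the same basename belong to the same sample.
            add_member(tar, f"{key}.json", json.dumps(landmarks).encode("utf-8"))
            add_member(tar, f"{key}.txt", text.encode("utf-8"))


# Hypothetical sample: one frame with one 3D landmark, plus its caption.
buffer = io.BytesIO()
write_shard(buffer, [("clip_000001", [[0.1, 0.2, 0.3]], "hello")])

buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    names = tar.getnames()
    caption = tar.extractfile("clip_000001.txt").read().decode("utf-8")

print(names)    # ['clip_000001.json', 'clip_000001.txt']
print(caption)  # hello
```

Writing to a file opened with `open("shard-000000.tar", "wb")` instead of the in-memory buffer yields a shard on disk.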


Supported Datasets

| Dataset | Venue | Description | License |
| --- | --- | --- | --- |
| YouTube-ASL | NeurIPS 2023 | 11,000+ videos, 73,000+ segments -- open-domain ASL-English parallel corpus | Apache-2.0 |
| How2Sign | CVPR 2021 | 80+ hours of instructional ASL in a controlled studio environment | CC BY-NC 4.0 |

For paper-aligned preprocessing methodology, see Research-Aligned Preprocessing.


Documentation

License

The MIT license in this repository applies to the code and documentation in this project. Use of external datasets, research artifacts, and upstream repos referenced above must comply with their original licenses and usage terms.

MIT -- see LICENSE.

About

Modular, config-driven pipeline for preprocessing Sign Language datasets with pose and video outputs using MediaPipe, MMPose, and YOLO.
