shadowbatcode/VisionCart-Perception-Suite

VisionCart Perception Suite

VisionCart Perception Suite is a two-part computer-vision engineering project for a small visual robot car:

  1. maze-mapping: color-coded maze perception, grid mapping, stable ROI voting, and control-message packaging.
  2. roi-segmentation-classification: OpenART Plus / OpenMV target segmentation and three-route classification for real camera scenes.

The repository holds source code, reports, and documentation. Large datasets and model binaries live under artifacts/ and are ignored by git apart from their README files. Fill in the dataset and weight URLs below before publishing.

| artifact | URL |
| --- | --- |
| Dataset archive | TODO: paste dataset download URL here |
| Model weights archive | TODO: paste weight download URL here |

ROI segmentation demo

Engineering Scope

| area | implementation detail | evidence |
| --- | --- | --- |
| Maze mapping | LAB color thresholds, pixel-to-grid calibration, ROI majority voting, stable map updates, UART-style frame builder | maze-mapping/main.py, maze-mapping/test_lab_vote.py |
| Segmentation | lightweight ROI mask model with int8 TFLite export and geometry filtering | roi-segmentation-classification/tools/train/train_cameral_rect_seg.py |
| Classification | split routing for empty, digit, and cartoon targets to reduce task interference | roi-segmentation-classification/tools/train/ |
| Deployment | callable OpenMV scripts, no auto-run loop, lazy model loading, framebuffer memory release | roi-segmentation-classification/deploy/openmv/ |
| Evaluation | PC-side reports for mask quality, routing, final labels, video stability, wrong intervals | roi-segmentation-classification/reports/ |
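
To make the "UART-style frame builder" entry concrete, here is a minimal sketch of how a control message could be framed for a serial link. The layout (two sync bytes, a one-byte length, the payload, an additive checksum) and the names HEADER, build_frame, and parse_frame are assumptions for illustration only. They are not taken from maze-mapping/main.py, which may use a different format.

```python
HEADER = b"\xAA\x55"  # hypothetical sync bytes, not taken from the repository


def build_frame(payload: bytes) -> bytes:
    """Wrap payload as: header | length | payload | checksum (illustrative layout)."""
    if len(payload) > 255:
        raise ValueError("payload must fit a one-byte length field")
    body = bytes([len(payload)]) + payload
    checksum = sum(body) & 0xFF  # additive checksum over length + payload
    return HEADER + body + bytes([checksum])


def parse_frame(frame: bytes) -> bytes:
    """Validate a frame produced by build_frame and return its payload."""
    if len(frame) < 4 or frame[:2] != HEADER:
        raise ValueError("bad header")
    length = frame[2]
    if len(frame) != length + 4:  # 2 header bytes + length byte + checksum byte
        raise ValueError("bad length")
    body = frame[2:-1]
    if sum(body) & 0xFF != frame[-1]:
        raise ValueError("bad checksum")
    return frame[3:-1]
```

A length field plus a checksum lets the receiver reject truncated or corrupted frames instead of acting on a partial map update.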

Results Snapshot

| module | result |
| --- | --- |
| ROI segmenter | val IoU 96.71%, Dice 97.97%, int8 size 107 KB |
| cartoon classifier | int8 val 257/259, 99.23%, int8 size 90 KB |
| digit classifier | int8 val 31/31, 100.00%, int8 size 48 KB |
| latest three-type set | detection 1280/1280, type accuracy 99.92%, final label accuracy 99.69% |
| sampled video evaluation | 14601 frames, accepted accuracy 99.92%, stable wrong intervals 2 |
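
For readers unfamiliar with the mask metrics above, the sketch below computes IoU and Dice for one pair of binary masks using their standard definitions. The function name iou_and_dice and the flat-list mask representation are assumptions for illustration. The repository's evaluation scripts may aggregate differently (for example, averaging per image), so this is not a reproduction of the reported numbers.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two equal-length binary masks given as flat sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - inter
    if union == 0:
        # Both masks are empty: treat as a perfect match by convention.
        return 1.0, 1.0
    iou = inter / union
    dice = 2 * inter / (pred_area + target_area)
    return iou, dice
```

For example, masks [1, 1, 0, 0] and [1, 0, 1, 0] share one foreground pixel out of three in the union, giving IoU 1/3 and Dice 0.5.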

Repository Layout

maze-mapping/                         Maze grid perception and mapping logic
roi-segmentation-classification/       Segmentation, classification, deployment, reports
artifacts/
  datasets/                            Local datasets, ignored by git except README
  weights/                             Local model binaries, ignored by git except README
PROJECT_MANIFEST.md                    Project inventory and publication checklist

Artifact Layout

Expected local asset layout:

artifacts/datasets/raw-train-images/
artifacts/datasets/roi-segmentation-classification-local/
artifacts/weights/openmv/
artifacts/weights/segmentation/
artifacts/weights/cartoon-classifier/
artifacts/weights/digit-classifier/

See artifacts/README.md for the exact files and the URL placeholders.

Quick Start

Maze mapping tests:

cd maze-mapping
python -m unittest discover -p "test_*.py"

ROI segmentation and classification evaluation:

cd roi-segmentation-classification
pip install -r requirements.txt
python tools/eval/eval_three_type_latest.py

OpenMV deployment package:

Copy scripts from roi-segmentation-classification/deploy/openmv/
Copy int8 TFLite models from artifacts/weights/openmv/
Place the files together at the OpenART Plus SD-card root.
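
The deployment scripts are described as using lazy model loading and explicit memory release. The pattern can be sketched in plain Python as below. The class name LazyModel and the injected loader callable are hypothetical. On the device, the loader would wrap whatever model-loading call the OpenMV firmware provides, which is deliberately not shown here.

```python
import gc


class LazyModel:
    """Load a model on first use and allow explicit release (illustrative pattern)."""

    def __init__(self, path, loader):
        self.path = path
        self._loader = loader  # hypothetical callable: path -> model object
        self._model = None

    def get(self):
        # Defer the expensive load until a frame actually needs the model.
        if self._model is None:
            self._model = self._loader(self.path)
        return self._model

    def release(self):
        # Drop the reference and ask the collector to reclaim the memory.
        self._model = None
        gc.collect()
```

Keeping one model resident at a time this way trades a reload cost for a lower peak memory footprint, which matters on a microcontroller-class board.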

Design Notes

  • The maze system treats color recognition as a stability problem, not a single-pixel lookup. It combines calibrated LAB thresholds, ROI voting, confidence margins, and previous-state hints.
  • The real-scene recognition system avoids a single monolithic classifier. Empty boxes are handled by deterministic color/texture rules, numbers use a small digit classifier, and cartoon cards use a separate 10-class classifier.
  • Deployment scripts are designed around OpenMV memory and API constraints: one callable frame pass, no SD writes during inference, and explicit model-memory release in the segmentation pipeline.
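
The first design note (ROI voting, confidence margins, previous-state hints) can be sketched as follows. This is a minimal illustration under stated assumptions, not the logic in maze-mapping/main.py. The function name vote_roi, the parameters min_margin and hint_bonus, and their default values are all hypothetical.

```python
from collections import Counter


def vote_roi(labels, previous=None, min_margin=0.2, hint_bonus=0.1):
    """Pick a stable color label for one grid cell from per-pixel labels.

    labels     -- per-pixel color labels sampled inside the ROI
    previous   -- label accepted for this cell on the last update, if any
    min_margin -- required gap between the top two vote shares (assumed value)
    hint_bonus -- vote share added to the previous label (assumed value)
    """
    if not labels:
        return previous
    total = len(labels)
    scores = {label: n / total for label, n in Counter(labels).items()}
    if previous in scores:
        scores[previous] += hint_bonus  # bias toward the last stable state
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_label, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if best - runner_up < min_margin:
        return previous  # too close to call: keep the old map entry
    return best_label
```

With seven "red" and three "blue" pixels and no history, the margin is 0.4 and the cell becomes "red". With a six-to-four split against a previous "blue", the hint narrows the margin to 0.1 and the cell stays "blue" until the evidence is stronger.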
