VisionCart Perception Suite is a two-part computer-vision engineering project for a small visual robot car:

- maze-mapping: color-coded maze perception, grid mapping, stable ROI voting, and control-message packaging.
- roi-segmentation-classification: OpenART Plus / OpenMV target segmentation and three-route classification for real camera scenes.

The repository is packaged for GitHub display while keeping large datasets and model binaries under artifacts/. Fill in the dataset and weight URLs below before publishing.

| artifact | URL |
|---|---|
| Dataset archive | TODO: paste dataset download URL here |
| Model weights archive | TODO: paste weight download URL here |

| area | implementation detail | evidence |
|---|---|---|
| Maze mapping | LAB color thresholds, pixel-to-grid calibration, ROI majority voting, stable map updates, UART-style frame builder | maze-mapping/main.py, maze-mapping/test_lab_vote.py |
| Segmentation | lightweight ROI mask model with int8 TFLite export and geometry filtering | roi-segmentation-classification/tools/train/train_cameral_rect_seg.py |
| Classification | split routing for empty, digit, and cartoon targets to reduce task interference | roi-segmentation-classification/tools/train/ |
| Deployment | callable OpenMV scripts, no auto-run loop, lazy model loading, framebuffer memory release | roi-segmentation-classification/deploy/openmv/ |
| Evaluation | PC-side reports for mask quality, routing, final labels, video stability, wrong intervals | roi-segmentation-classification/reports/ |

| module | result |
|---|---|
| ROI segmenter | val IoU 96.71%, Dice 97.97%, int8 size 107 KB |
| cartoon classifier | int8 val 257/259, 99.23%, int8 size 90 KB |
| digit classifier | int8 val 31/31, 100.00%, int8 size 48 KB |
| latest three-type set | detection 1280/1280, type accuracy 99.92%, final label accuracy 99.69% |
| sampled video evaluation | 14601 frames, accepted accuracy 99.92%, stable wrong intervals 2 |

    maze-mapping/                      Maze grid perception and mapping logic
    roi-segmentation-classification/   Segmentation, classification, deployment, reports
    artifacts/
      datasets/                        Local datasets, ignored by git except README
      weights/                         Local model binaries, ignored by git except README
    PROJECT_MANIFEST.md                Project inventory and publication checklist

Expected local asset layout:

    artifacts/datasets/raw-train-images/
    artifacts/datasets/roi-segmentation-classification-local/
    artifacts/weights/openmv/
    artifacts/weights/segmentation/
    artifacts/weights/cartoon-classifier/
    artifacts/weights/digit-classifier/

See artifacts/README.md for the exact files and the URL placeholders.
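Before running the evaluation scripts, it can help to confirm that the local assets are where the layout above expects them. The following is a minimal sketch, not part of the repository: the function name `missing_asset_dirs` and the constant `EXPECTED_ASSET_DIRS` are hypothetical, and the check only looks at directories, not at the individual files listed in artifacts/README.md.

```python
from pathlib import Path

# Directories taken from the expected local asset layout above.
EXPECTED_ASSET_DIRS = (
    "artifacts/datasets/raw-train-images",
    "artifacts/datasets/roi-segmentation-classification-local",
    "artifacts/weights/openmv",
    "artifacts/weights/segmentation",
    "artifacts/weights/cartoon-classifier",
    "artifacts/weights/digit-classifier",
)


def missing_asset_dirs(repo_root="."):
    """Return the expected asset directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_ASSET_DIRS if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_asset_dirs()
    if missing:
        print("Missing asset directories:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected asset directories are present.")
```

Run from the repository root, the script prints any directory that still needs to be downloaded or unpacked.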
Maze mapping tests:

    cd maze-mapping
    python -m unittest discover -p "test_*.py"

ROI segmentation and classification evaluation:

    cd roi-segmentation-classification
    pip install -r requirements.txt
    python tools/eval/eval_three_type_latest.py

OpenMV deployment package:

1. Copy scripts from roi-segmentation-classification/deploy/openmv/
2. Copy int8 TFLite models from artifacts/weights/openmv/
3. Place the files together at the OpenART Plus SD-card root.
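The three deployment steps can be scripted on the PC side. This is a rough sketch under stated assumptions: `stage_sd_card` is a hypothetical helper that does not exist in the repository, and it assumes the deployable scripts are `*.py` files and the models are `*.tflite` files. Check artifacts/README.md for the exact file names before relying on the glob patterns.

```python
import shutil
from pathlib import Path


def stage_sd_card(repo_root, sd_root):
    """Copy OpenMV scripts and int8 TFLite models side by side into sd_root.

    Assumption: scripts match *.py and models match *.tflite.
    Returns the names of the copied files.
    """
    repo = Path(repo_root)
    target = Path(sd_root)
    target.mkdir(parents=True, exist_ok=True)

    sources = [
        (repo / "roi-segmentation-classification" / "deploy" / "openmv", "*.py"),
        (repo / "artifacts" / "weights" / "openmv", "*.tflite"),
    ]
    copied = []
    for folder, pattern in sources:
        # sorted() keeps the copy order deterministic across platforms.
        for src in sorted(folder.glob(pattern)):
            shutil.copy2(src, target / src.name)
            copied.append(src.name)
    return copied
```

Calling `stage_sd_card(".", "/path/to/sd")` from the repository root places everything flat at the card root, which matches the third step above; the SD-card path is a placeholder.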
- The maze system treats color recognition as a stability problem, not a single-pixel lookup. It combines calibrated LAB thresholds, ROI voting, confidence margins, and previous-state hints.
- The real-scene recognition system avoids a single monolithic classifier. Empty boxes are handled by deterministic color/texture rules, numbers use a small digit classifier, and cartoon cards use a separate 10-class classifier.
- Deployment scripts are designed around OpenMV memory and API constraints: one callable frame pass, no SD writes during inference, and explicit model-memory release in the segmentation pipeline.
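
The first point, treating color recognition as a stability problem, can be illustrated with a small voting routine. This is a simplified sketch and not the code in maze-mapping/main.py: the function name `vote_cell`, the `min_margin` default, and the exact fallback rule are assumptions chosen for illustration. It assumes each sampled pixel in the ROI has already been mapped to a color label by the LAB thresholds.

```python
from collections import Counter


def vote_cell(pixel_labels, previous=None, min_margin=0.2):
    """Pick a stable label for one ROI from per-pixel color labels.

    pixel_labels: iterable of hashable labels, one per sampled pixel.
    previous:     label accepted for this cell on the last update, if any.
    min_margin:   required lead of the winner over the runner-up, as a
                  fraction of all votes (hypothetical default).
    """
    counts = Counter(pixel_labels)
    total = sum(counts.values())
    if total == 0:
        return previous  # nothing to vote on, keep the old state

    ranked = counts.most_common(2)
    winner, top = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0
    margin = (top - runner_up) / total

    if margin >= min_margin:
        return winner
    # Ambiguous vote: prefer the previous state over a flickering update.
    return previous if previous is not None else winner


print(vote_cell(["red"] * 7 + ["blue"] * 3))                    # red
print(vote_cell(["red"] * 11 + ["blue"] * 9, previous="blue"))  # blue
```

In the second call the lead of "red" is only 10% of the votes, so the previous label is kept instead of flipping the cell on a near-tie.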
