🎓 A comprehensive web-based evaluation system for testing minimal sentence pairs with multiple language models. Supports Ollama (local LLMs) and HuggingFace models, with interactive visualisations and bulk processing. Built for the AIS710 course.
nlp flask machine-learning natural-language-processing ai evaluation transformers pytorch language-models blimp huggingface grammaticality ollama semantic-plausibility
Updated Dec 3, 2025 - Python
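
For readers unfamiliar with minimal-pair evaluation, the sketch below shows the general idea in the BLiMP style: a model is counted as correct on a pair when it scores the acceptable sentence higher than the unacceptable one. This is an illustration only, not this project's code. The names `minimal_pair_accuracy`, `toy_score`, and `PAIRS` are hypothetical, and `toy_score` is a stand-in for a real model score such as a sentence log-probability from an Ollama or HuggingFace model.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical type: any function mapping a sentence to a score, where a
# higher score means "more acceptable" (e.g. a model's total log-probability).
Scorer = Callable[[str], float]


def minimal_pair_accuracy(pairs: Iterable[Tuple[str, str]], score: Scorer) -> float:
    """Return the fraction of (good, bad) pairs where the good sentence scores strictly higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one sentence pair is required")
    correct = sum(1 for good, bad in pairs if score(good) > score(bad))
    return correct / len(pairs)


def toy_score(sentence: str) -> float:
    """Stand-in scorer for illustration: it only penalises one known agreement error."""
    return -1.0 if "cats is" in sentence else 0.0


# Each tuple is (acceptable sentence, unacceptable sentence).
PAIRS = [
    ("The cats are sleeping.", "The cats is sleeping."),
    ("The dog barks.", "The dog bark."),
]

accuracy = minimal_pair_accuracy(PAIRS, toy_score)
print(f"accuracy = {accuracy:.2f}")
```

The toy scorer separates the first pair but ties on the second, so a tie counts as a miss under the strict comparison. Swapping in a real model only requires passing a different `score` function.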