This web application lets users upload an image and receive a text description of its content, generated by a pre-trained image captioning (image-to-text) model from Hugging Face.
| Model Name | Model ID | Task |
|---|---|---|
| BLIP Large | Salesforce/blip-image-captioning-large | image-to-text |
| GIT Large COCO | microsoft/git-large-coco | image-to-text |
| ViT-GPT2 | nlpconnect/vit-gpt2-image-captioning | image-to-text |
| BLIP Base | Salesforce/blip-image-captioning-base | image-to-text |
| GIT Base COCO | microsoft/git-base-coco | image-to-text |
| ViT-GPT2 COCO | ydshieh/vit-gpt2-coco-en | image-to-text |
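As a rough illustration of how the model IDs in the table might be used, the sketch below sends raw image bytes to the Hugging Face Inference API and pulls the caption out of the response. It is not taken from this repository: the helper names (`buildModelUrl`, `extractCaption`, `describeImage`, `FetchLike`), the endpoint pattern in `API_BASE`, and the `[{ generated_text }]` response shape are assumptions that should be checked against the current Hugging Face documentation.

```typescript
// Model IDs copied from the table above.
const MODELS = {
  "BLIP Large": "Salesforce/blip-image-captioning-large",
  "GIT Large COCO": "microsoft/git-large-coco",
  "ViT-GPT2": "nlpconnect/vit-gpt2-image-captioning",
  "BLIP Base": "Salesforce/blip-image-captioning-base",
  "GIT Base COCO": "microsoft/git-base-coco",
  "ViT-GPT2 COCO": "ydshieh/vit-gpt2-coco-en",
} as const;

type ModelName = keyof typeof MODELS;

// Assumption: hosted Inference API endpoint pattern; verify before relying on it.
const API_BASE = "https://api-inference.huggingface.co/models";

function buildModelUrl(name: ModelName): string {
  return `${API_BASE}/${MODELS[name]}`;
}

// Assumption: image-to-text responses look like [{ generated_text: "..." }].
function extractCaption(payload: unknown): string {
  if (Array.isArray(payload) && payload.length > 0) {
    const first = payload[0] as { generated_text?: unknown } | null;
    if (first && typeof first.generated_text === "string") {
      return first.generated_text.trim();
    }
  }
  throw new Error("Unexpected response shape from the Inference API");
}

// Minimal fetch-like signature (hypothetical) so the HTTP call can be stubbed.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: Uint8Array },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function describeImage(
  image: Uint8Array,
  name: ModelName,
  apiKey: string,
  fetchImpl: FetchLike,
): Promise<string> {
  const res = await fetchImpl(buildModelUrl(name), {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/octet-stream",
    },
    body: image,
  });
  if (!res.ok) {
    throw new Error(`Inference API request failed with status ${res.status}`);
  }
  return extractCaption(await res.json());
}
```

In the application, `fetchImpl` would presumably be a thin wrapper around the runtime's `fetch`, called from server-side code so the API key never reaches the browser. Injecting it as a parameter keeps the function easy to exercise with a stub.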

Tech stack:

- Next.js
- React
- Tailwind CSS
- shadcn/ui
- Hugging Face Inference API

Prerequisites:

- Bun installed on your system
- A Hugging Face account and API key

Setup:

- Clone the repository: `git clone https://github.com/yourusername/image-description-poc.git`, then `cd image-description-poc`
- Install dependencies: `bun install`
- Set up environment variables by copying the example environment file: `cp .env.example .env`
- Open `.env` and add your Hugging Face API key: `HUGGING_FACE_API_KEY=your_api_key_here`
- Start the development server: `bun dev`
- Open http://localhost:3000 in your browser to see the application.
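
Once the steps above are done, server-side code needs to read `HUGGING_FACE_API_KEY` from the environment. The sketch below shows one way to do that defensively, failing early if the key is missing or still set to the placeholder from `.env.example`. The helper name `getApiKey` is hypothetical and not part of this repository.

```typescript
// Hypothetical helper: reads and validates the key defined in .env.
function getApiKey(env: Record<string, string | undefined>): string {
  const key = env.HUGGING_FACE_API_KEY?.trim();
  // Treat an empty value or the untouched placeholder as "not configured".
  if (!key || key === "your_api_key_here") {
    throw new Error(
      "HUGGING_FACE_API_KEY is not set; copy .env.example to .env and add your key",
    );
  }
  return key;
}
```

On the server this would typically be called as `getApiKey(process.env)`. Taking the environment as a parameter keeps the check easy to test without touching real configuration.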