Deepseek-Dataset-Generator creates conversational datasets for LLM fine-tuning via DeepSeek API. Supports various formats (ChatML, ShareGPT, Alpaca, JSON, CSV), easy configuration via YAML and detailed logs. Ideal for generating realistic and customized data quickly.
python nlp open-source machine-learning dataset-generation data-augmentation conversational-ai synthetic-data ai-tools prompt-engineering sharegpt chatml llm-finetuning deepseek alpaca-format
-
Updated
Jun 2, 2025 - Python