Datasets for Instruction Tuning of Large Language Models
Updated Nov 30, 2023
Uses free LLM APIs together with your private-domain data to generate SFT training data; supports the training-data formats of tools such as LLaMA-Factory. Synthetic data.
Exports a chat as a ShareGPT dataset
Proxy server that automatically stores messages exchanged between any OAI-compatible frontend and backend as a ShareGPT dataset to be used for training/finetuning.
Genshin Impact character chat models, fine-tuned on LLMs with LoRA
Deepseek-Dataset-Generator creates conversational datasets for LLM fine-tuning via the DeepSeek API. Supports multiple formats (ChatML, ShareGPT, Alpaca, JSON, CSV), easy YAML configuration, and detailed logging. Ideal for generating realistic, customized data quickly.
Fork of GeoAnima's Claude.ai chat exporter userscript, improving button UI and exporting directly to ShareGPT-format JSON
A JSON viewer/editor for multi-line string values that lets you render and edit strings in plain mode (handling escaping/unescaping). Ideal for editing ShareGPT- or Alpaca-style LLM training examples.
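Several of the tools above read or emit the ShareGPT and Alpaca record layouts. The sketch below illustrates both and a single-turn conversion between them; the field names follow the common convention used by training toolkits such as LLaMA-Factory (a "conversations" list of turns with "from"/"value" keys for ShareGPT; "instruction"/"input"/"output" keys for Alpaca), and the sample strings are illustrative only.

```python
import json

# Minimal sketch of a ShareGPT-style record: a list of turns, each with a
# "from" role ("human" or "gpt") and a "value" string.
sharegpt_record = {
    "conversations": [
        {"from": "human", "value": "What is instruction tuning?"},
        {"from": "gpt", "value": "Fine-tuning an LLM on instruction-response pairs."},
    ]
}

def sharegpt_to_alpaca(record):
    """Convert one single-turn ShareGPT record to Alpaca format."""
    turns = record["conversations"]
    return {
        "instruction": turns[0]["value"],  # the human turn
        "input": "",                       # no separate context in this example
        "output": turns[1]["value"],       # the model turn
    }

print(json.dumps(sharegpt_to_alpaca(sharegpt_record), indent=2))
```

Multi-turn ShareGPT conversations do not map one-to-one onto Alpaca's single instruction/output pair, which is why tools above offer both formats rather than converting between them blindly.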