Popular repositories
- llm-awq (forked from mit-han-lab/llm-awq)
  AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
  Python · 1 star
- neural-compressor (forked from intel/neural-compressor)
  Intel® Neural Compressor (formerly Intel® Low Precision Optimization Tool), aiming to provide unified APIs for network compression technologies such as low-precision quantization, spar…
  Python
- qlora (forked from artidoro/qlora)
  QLoRA: Efficient Finetuning of Quantized LLMs
  Jupyter Notebook
- peft (forked from huggingface/peft)
  🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
  Python
- OmniQuant (forked from OpenGVLab/OmniQuant)
  OmniQuant is a simple and powerful quantization technique for LLMs.
  Python
Repositories
- free-cluely (forked from Prat011/free-cluely)
  Cluely: an invisible desktop assistant that provides real-time insights, answers, and support during meetings, interviews, presentations, and professional conversations.
- compressa-deploy
- compressa-perf
- compressa-ai.github.io
- pluely (forked from iamsrikanthnani/pluely)
  The open-source alternative to Cluely: a lightning-fast, privacy-first AI assistant that works seamlessly during meetings, interviews, and conversations without anyone knowing. Built with Tauri for native performance, at just 10 MB. Completely undetectable in video calls, screen shares, and recordings.
- compressa-unstructured-api (forked from Unstructured-IO/unstructured-api)
  unstructured-api fork with GPU inference support
- compressa-guidance (forked from guidance-ai/guidance)
  A guidance language for controlling large language models (Qwen compatible).
- vllm (forked from vllm-project/vllm)
  A high-throughput and memory-efficient inference and serving engine for LLMs