diff --git a/providers/anthropic.mdx b/providers/anthropic.mdx
index ed7952b..31127c2 100644
--- a/providers/anthropic.mdx
+++ b/providers/anthropic.mdx
@@ -1,60 +1,43 @@
---
title: "Anthropic"
-description: "Learn how to configure and use Anthropic Claude models with CodinIT. Covers API key setup, model selection, and advanced features like prompt caching."
+description: "Configure Anthropic Claude models with CodinIT for advanced reasoning and code generation."
---
**Website:** [https://www.anthropic.com/](https://www.anthropic.com/)
-### Getting an API Key
+## Getting an API Key
-1. **Sign Up/Sign In:** Go to the [Anthropic Console](https://console.anthropic.com/). Create an account or sign in.
-2. **Navigate to API Keys:** Go to the [API keys](https://console.anthropic.com/settings/keys) section.
-3. **Create a Key:** Click "Create Key". Give your key a descriptive name (e.g., "CodinIT").
-4. **Copy the Key:** **Important:** Copy the API key _immediately_. You will not be able to see it again. Store it securely.
+1. Go to [Anthropic Console](https://console.anthropic.com/) and sign in
+2. Navigate to [API keys](https://console.anthropic.com/settings/keys)
+3. Click "Create Key" and name it (e.g., "CodinIT")
+4. Copy the key immediately - you won't see it again
-### Supported Models
+## Configuration
-CodinIT supports the following Anthropic Claude models:
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Anthropic" as the API Provider
+3. Paste your API key
+4. Choose your model
-- `claude-haiku-4-5-20251001`
-- `claude-opus-4-1-20250805`
-- `claude-opus-4-20250514`
-- `anthropic/claude-sonnet-4.5` (Recommended)
-- `claude-3-7-sonnet-20250219`
-- `claude-3-5-sonnet-20241022`
-- `claude-3-5-haiku-20241022`
-- `claude-3-opus-20240229`
-- `claude-3-haiku-20240307`
+## Supported Models
-See [Anthropic's Model Documentation](https://docs.anthropic.com/en/about-claude/models) for more details on each model's capabilities.
+- `anthropic/claude-sonnet-4.5` (Recommended)
+- `claude-opus-4-1-20250805`
+- `claude-3-7-sonnet-20250219`
+- `claude-3-5-sonnet-20241022`
+- `claude-3-5-haiku-20241022`
-### Configuration in CodinIT
+See [Anthropic's documentation](https://docs.anthropic.com/en/about-claude/models) for full model details.
-1. **Open CodinIT Settings:** Click the settings icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "Anthropic" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your Anthropic API key into the "Anthropic API Key" field.
-4. **Select Model:** Choose your desired Claude model from the "Model" dropdown.
-5. **(Optional) Custom Base URL:** If you need to use a custom base URL for the Anthropic API, check "Use custom base URL" and enter the URL. Most users won't need to adjust this setting.
+## Extended Thinking
-### Extended Thinking
+Enable enhanced reasoning for complex tasks by checking "Enable Extended Thinking" in CodinIT settings. Available for Claude Opus 4, Sonnet 4.5, and Sonnet 3.7.
-Anthropic models offer an "Extended Thinking" feature, designed to give them enhanced reasoning capabilities for complex tasks. This feature allows the model to output its step-by-step thought process before delivering a final answer, providing transparency and enabling more thorough analysis for challenging prompts.
+Learn more in the [Extended Thinking documentation](https://docs.anthropic.com/en/build-with-claude/extended-thinking).
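The request shape behind this setting can be sketched as follows. This is a hedged illustration based on Anthropic's Messages API: the model ID, token budget, and `max_tokens` margin are illustrative assumptions, not CodinIT's actual values.

```python
# Illustrative sketch of an Extended Thinking request payload (Anthropic
# Messages API). Model ID and budgets are assumptions for illustration.
def build_thinking_payload(prompt: str, budget_tokens: int = 10_000) -> dict:
    return {
        "model": "claude-sonnet-4-5",         # illustrative model ID
        "max_tokens": budget_tokens + 4_000,  # must exceed the thinking budget
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_thinking_payload("Refactor this parser to be iterative.")
print(payload["thinking"])
```

With this shape, the API returns `thinking` content blocks alongside the final answer; billing counts the full thinking tokens.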
-When extended thinking is in CodinIT, the model generates `thinking` content blocks that detail its internal reasoning. These insights are then incorporated into its final response.
-CodinIT users can leverage this by checking the `Enable Extended Thinking` box below the model selection menu after selecting a Claude Model from any provider.
+## Notes
-**Key Aspects of Extended Thinking:**
-
-- **Supported Models:** This feature is available for select models, including Claude Opus 4, Claude Sonnet 4.5, and Claude Sonnet 3.7.
-- **Summarized Thinking (Claude 4):** For Claude 4 and 4.5 models, the API returns a summary of the full thinking process to balance insight with efficiency and prevent misuse. You are billed for the full thinking tokens, not just the summary.
-- **Streaming:** Extended thinking responses, including the `thinking` blocks, can be streamed.
-- **Tool Use & Prompt Caching:** Extended thinking interacts with tool use (requiring thinking blocks to be passed back) and prompt caching (with specific behaviors around cache invalidation and context).
-
-For comprehensive details on how extended thinking works, including API examples, interaction with tool use, prompt caching, and pricing, please refer to the [official Anthropic documentation on Extended Thinking](https://docs.anthropic.com/en/build-with-claude/extended-thinking).
-
-### Tips and Notes
-
-- **Prompt Caching:** Claude 3 models support [prompt caching](https://docs.anthropic.com/en/build-with-claude/prompt-caching), which can significantly reduce costs and latency for repeated prompts.
-- **Context Window:** Claude models have large context windows (200,000 tokens), allowing you to include a significant amount of code and context in your prompts.
-- **Pricing:** Refer to the [Anthropic Pricing](https://www.anthropic.com/pricing) page for the latest pricing information.
-- **Rate Limits:** Anthropic has strict rate limits based on [usage tiers](https://docs.anthropic.com/en/api/rate-limits#requirements-to-advance-tier). If you're repeatedly hitting rate limits, consider contacting Anthropic sales or accessing Claude through a different provider like [OpenRouter](/providers/openrouter) or [Requesty](/providers/openrouter).
\ No newline at end of file
+- **Context Window:** 200,000 tokens
+- **Prompt Caching:** Reduces costs for repeated prompts
+- **Rate Limits:** Based on [usage tiers](https://docs.anthropic.com/en/api/rate-limits). If you repeatedly hit rate limits, consider [OpenRouter](/providers/openrouter)
+- **Pricing:** See [Anthropic Pricing](https://www.anthropic.com/pricing)
\ No newline at end of file
diff --git a/providers/aws-bedrock.mdx b/providers/aws-bedrock.mdx
index 8456081..e8f6afb 100644
--- a/providers/aws-bedrock.mdx
+++ b/providers/aws-bedrock.mdx
@@ -1,136 +1,79 @@
---
title: "AWS Bedrock"
sidebarTitle: "API Key"
-description: "Set up AWS Bedrock with CodinIT using Bedrock API Keys. Simplest setup for individual developers to access frontier models."
+description: "Set up AWS Bedrock with CodinIT using API Keys to access frontier models like Claude and Amazon Nova."
---
-### Overview
+Access leading AI models through AWS Bedrock with a simplified API key setup.
-- **AWS Bedrock:** A fully managed service that offers access to leading generative AI models (e.g., Anthropic Claude, Amazon Nova) through AWS.\
- [Learn more about AWS Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html).
-- **CodinIT:** A VS Code extension that acts as a coding assistant by integrating with AI models—empowering developers to generate code, debug, and analyze data.
-- **Developer Focus:** This guide is tailored for individual developers that want to enable access to frontier models via AWS Bedrock with a simplified setup using API Keys.
+**Website:** [https://docs.aws.amazon.com/bedrock/](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
----
-
-### Step 1: Prepare Your AWS Environment
-
-#### 1.1 Individual user setup - Create a Bedrock API Key
+## Setup Steps
-For more detailed instructions check the [documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/api-keys.html).
+### 1. Create Bedrock API Key
-1. **Sign in to the AWS Management Console:**\
- [AWS Console](https://aws.amazon.com/console)
-2. **Access Bedrock Console:**
- - [Bedrock Console](https://console.aws.amazon.com/bedrock)
- - Create a new Long Lived API Key. This API Key will have by default the `AmazonBedrockLimitedAccess` IAM policy
- [View AmazonBedrockLimitedAccess Policy Details](https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html#managed-policies)
+1. **Sign in:** [AWS Console](https://aws.amazon.com/console)
+2. **Access Bedrock:** Go to [Bedrock Console](https://console.aws.amazon.com/bedrock)
+3. **Create API Key:** Create a new long-lived API key
+ - Default policy: `AmazonBedrockLimitedAccess`
+ - [View policy details](https://docs.aws.amazon.com/bedrock/latest/userguide/security-iam.html#managed-policies)
-#### 1.2 Create or Modify the Policy
+### 2. Configure IAM Permissions
-To ensure CodinIT can interact with AWS Bedrock, your IAM user or role needs specific permissions. While the `AmazonBedrockLimitedAccess` managed policy provides comprehensive access, for a more restricted and secure setup adhering to the principle of least privilege, the following minimal permissions are sufficient for CodinIT's core model invocation functionality:
+**Minimal permissions required:**
+```json
+{
+ "Version": "2012-10-17",
+ "Statement": [{
+ "Effect": "Allow",
+ "Action": [
+ "bedrock:InvokeModel",
+ "bedrock:InvokeModelWithResponseStream",
+ "bedrock:CallWithBearerToken"
+ ],
+ "Resource": "*"
+ }]
+}
+```
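For tighter scoping, the wildcard `"Resource": "*"` can be replaced with specific foundation-model ARNs; the region and model ID below are illustrative.

```json
"Resource": [
  "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
]
```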
-- `bedrock:InvokeModel`
-- `bedrock:InvokeModelWithResponseStream`
-- `bedrock:CallWithBearerToken`
+Create a custom policy with these permissions and attach it to the IAM user associated with your API key.
-You can create a custom IAM policy with these permissions and attach it to your IAM user or role.
+**Important:**
+- For model listing in CodinIT, add the `bedrock:ListFoundationModels` permission
+- For AWS Marketplace models (e.g., Anthropic Claude), use the `AmazonBedrockLimitedAccess` policy
+- For Anthropic models, submit the First Time Use (FTU) form via the [Playground](https://console.aws.amazon.com/bedrock/home#/text-generation-playground)
-1. In the AWS IAM console, create a new policy.
-2. Use the JSON editor to add the following policy document:
- ```json
- {
- "Version": "2012-10-17",
- "Statement": [
- {
- "Effect": "Allow",
- "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream", "bedrock:CallWithBearerToken"],
- "Resource": "*" // For enhanced security, scope this to specific model ARNs if possible.
- }
- ]
- }
- ```
-3. Name the policy (e.g., `CodinITBedrockInvokeAccess`) and attach it to the IAM user associated with the key you created. The IAM user and the API key have the same prefix.
+### 3. Choose Region
-**Important Considerations:**
-
-- **Model Listing in CodinIT:** The minimal permissions (`bedrock:InvokeModel`, `bedrock:InvokeModelWithResponseStream`) are sufficient for CodinIT to _use_ a model if you specify the model ID directly in CodinIT's settings. If you rely on CodinIT to dynamically list available Bedrock models, you might need additional permissions like `bedrock:ListFoundationModels`.
-- **AWS Marketplace Subscriptions:** For third-party models (e.g., Anthropic Claude), the **`AmazonBedrockLimitedAccess`** policy grants you the necessary permissions to subscribe via the AWS Marketplace. There is no explicit access to be enabled. For Anthropic models you are still required to submit a First Time Use (FTU) form via the Console. If you get the following message in the CodinIT chat `[ERROR] Failed to process response: Model use case details have not been submitted for this account. Fill out the Anthropic use case details form before using the model.` then open the [Playground in the AWS Bedrock Console](https://console.aws.amazon.com/bedrock/home?#/text-generation-playground), select any Anthropic model and fill in the form (you might need to send a prompt first)
-
----
-
-### Step 2: Verify Regional and Model Access
-
-#### 2.1 Choose and Confirm a Region
-
-1. **Select a Region:**\
- AWS Bedrock is available in multiple regions (e.g., US East, Europe, Asia Pacific). Choose the region that meets your latency and compliance needs.\
- [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/)
-2. **Verify Model Access:**
- - **Note:** Some models are only accessible via an [Inference Profile](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html). In such case check the box "Cross Region Inference".
-
----
+Select a region based on latency and compliance needs:
+- `us-east-1` (N. Virginia)
+- `us-west-2` (Oregon)
+- `eu-west-1` (Ireland)
+- `ap-southeast-1` (Singapore)
-### Step 3: Configure the CodinIT VS Code Extension
+**Note:** Some models require an [Inference Profile](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html); check the "Cross Region Inference" box if needed.
-#### 3.1 Install and Open CodinIT
+### 4. Configure CodinIT
-1. **Install VS Code:**\
- Download from the [VS Code website](https://code.visualstudio.com).
-2. **Install the CodinIT Extension:**
- - Open VS Code.
- - Go to the Extensions Marketplace (`Ctrl+Shift+X` or `Cmd+Shift+X`).
- - Search for **CodinIT** and install it.
+1. Install CodinIT extension in VS Code
+2. Click settings icon (⚙️)
+3. Select **AWS Bedrock** as API Provider
+4. Enter your **API Key**
+5. Specify **AWS Region** (e.g., `us-east-1`)
+6. Select **Model** (e.g., `anthropic.claude-3-5-sonnet-20241022-v2:0`)
+7. Save and test
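To sanity-check Bedrock access outside CodinIT, a request can be built with `boto3`. This is a hedged sketch: it assumes the Bedrock API key is exported as `AWS_BEARER_TOKEN_BEDROCK` (verify against current AWS docs), and the model ID and region are illustrative. The network call is commented out so the sketch runs offline.

```python
# Sketch: build an InvokeModel request for Claude on Bedrock.
# Model ID, region, and env-var convention are assumptions to verify.
import json

def build_invoke_args(prompt: str,
                      model_id: str = "anthropic.claude-3-5-sonnet-20241022-v2:0") -> dict:
    body = {
        "anthropic_version": "bedrock-2023-05-31",  # required for Claude on Bedrock
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    return {"modelId": model_id, "body": json.dumps(body)}

args = build_invoke_args("Generate a Python function to check if a number is prime.")
# import boto3  # requires `pip install boto3` and AWS_BEARER_TOKEN_BEDROCK set
# client = boto3.client("bedrock-runtime", region_name="us-east-1")
# response = client.invoke_model(**args)
# print(json.loads(response["body"].read())["content"][0]["text"])
```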
-#### 3.2 Configure CodinIT Settings
+## Security Best Practices
-1. **Open CodinIT Settings:**
- - Click on the settings ⚙️ to select your API Provider.
-2. **Select AWS Bedrock as the API Provider:**
- - From the API Provider dropdown, choose **AWS Bedrock**.
-3. **Enter Your AWS API Key:**
- - Input your **API Key**
- - Specify the correct **AWS Region** (e.g., `us-east-1` or your enterprise-approved region).
-4. **Select a Model:**
- - Choose an on-demand model (e.g., **anthropic.claude-3-5-sonnet-20241022-v2:0**).
-5. **Save and Test:**
- - Click **Done/Save** to apply your settings.
- - Test the integration by sending a simple prompt (e.g., "Generate a Python function to check if a number is prime.").
+1. **Secure access:** Prefer AWS SSO/federated roles over long-lived API keys when possible
+2. **Network security:** Consider [AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/userguide/endpoint-services-overview.html)
+3. **Monitoring:** Enable CloudTrail for API logging and CloudWatch for metrics
+4. **Cost management:** Use AWS Cost Explorer and set billing alerts
+5. **Regular audits:** Review IAM roles and CloudTrail logs periodically
----
-
-### Step 4: Security, Monitoring, and Best Practices
-
-1. **Secure Access:**
- - Prefer AWS SSO/federated roles over long-lived API Key when possible.
- - [AWS IAM Best Practices](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html)
-2. **Enhance Network Security:**
- - Consider setting up [AWS PrivateLink](https://docs.aws.amazon.com/vpc/latest/userguide/endpoint-services-overview.html) to securely connect to Bedrock.
-3. **Monitor and Log Activity:**
- - Enable AWS CloudTrail to log Bedrock API calls.
- - Use CloudWatch to monitor metrics like invocation count, latency, and token usage.
- - Set up alerts for abnormal activity.
-4. **Handle Errors and Manage Costs:**
- - Implement exponential backoff for throttling errors.
- - Use AWS Cost Explorer and set billing alerts to track usage.\
- [AWS Cost Management](https://docs.aws.amazon.com/cost-management/latest/userguide/what-is-aws-cost-management.html)
-5. **Regular Audits and Compliance:**
- - Periodically review IAM roles and CloudTrail logs.
- - Follow internal data privacy and governance policies.
-
----
-
-### Conclusion
-
-By following these steps, you can quickly integrate AWS Bedrock with the CodinIT VS Code extension to accelerate development:
-
-1. **Prepare Your AWS Environment:** Create a Bedrock API Key with the necessary permissions.
-2. **Verify Region and Model Access:** Confirm that your selected region supports your required models.
-3. **Configure CodinIT in VS Code:** Install and set up CodinIT with your AWS API Key and choose an appropriate model.
-4. **Implement Security and Monitoring:** Use best practices for IAM, network security, monitoring, and cost management.
-
-For further details, consult the [AWS Bedrock Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html). Happy coding!
-
----
+## Notes
-_This guide will be updated as AWS Bedrock and CodinIT evolve. Always refer to the latest documentation and internal policies for up-to-date practices._
\ No newline at end of file
+- **Pricing:** Usage-based, see [AWS Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)
+- **Compliance:** HIPAA and SOC 2 Type II compliant
+- **Documentation:** [AWS Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
+- **IAM Best Practices:** [AWS IAM Best Practices](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html)
diff --git a/providers/cloud-providers.mdx b/providers/cloud-providers.mdx
index d27665d..8f9cb5b 100644
--- a/providers/cloud-providers.mdx
+++ b/providers/cloud-providers.mdx
@@ -3,227 +3,133 @@ title: 'Providers'
description: 'Connect CodinIT with 19+ AI providers including cloud models, local inference, and specialized services.'
---
-### Enterprise & Research Models
+## Enterprise & Research Models
- Claude models with advanced reasoning capabilities
+ Claude models with advanced reasoning
+
+
+ GPT-5 and o-series models
+
+
+ Gemini models via GCP Vertex AI
-
-
- GPT-5 and GPT-4 models for versatile AI assistance
-
-
-
- Gemini models with multimodal capabilities
-
-
- Advanced reasoning models for complex tasks
+ Advanced reasoning models
-### Specialized & Fast Inference
+## Fast Inference & Specialized
- Ultra-fast inference with LPU technology
+ Ultra-fast LPU inference
+
+
+ 50+ open-source models
+
+
+ Optimized open-source inference
+
+
+ AI with integrated web search
-
-
- Access to 50+ open-source models
-
-
-
- Optimized inference for open-source models
-
-
-
- AI models with integrated web search
-
-
- X.AI's Grok models with real-time knowledge
+ Grok models with large context
+
+
+ Fast inference, 40+ models
-### Open Source & Community
+## Open Source & Community
- Command R series models for coding and analysis
+ Command R series models
+
+
+ Thousands of community models
+
+
+ Mistral and Codestral models
-
-
- Open-source model hub with community models
-
-
-
- Open-source and commercial Mistral models
-
-
- Chinese language models with Kimi series
+ Kimi series, Chinese language
-### Unified & Routing
+## Unified & Routing
- Access multiple models through a unified API
+ Multiple models, unified API
-
- Connect to any OpenAI-compatible API endpoint
+ Any OpenAI-compatible endpoint
-### Cloud & Enterprise
+## Cloud & Enterprise
- Enterprise-grade AI models through AWS infrastructure
+ Enterprise AI via AWS
-
- Access OpenAI and other models through GitHub
+ Models through GitHub platform
-### Local & Private
+## Local & Private
- Run open-source models locally with Ollama
+ Run models locally with Ollama
-
- Desktop app for running models locally
+ Desktop app for local models
-## Choosing the Right Provider
-
-With 19+ AI providers available, selecting the right model depends on your specific needs. Consider these key factors:
-
-
-
- * **Ultra-fast inference**: Groq (LPU technology), Together AI
- * **Best reasoning**: Anthropic Claude, DeepSeek, OpenAI o1
- * **Balanced performance**: OpenAI GPT-4, Google Gemini, Cohere
- * **Local speed**: Ollama, LM Studio (no network latency)
-
-
-
- * **Free/Low-cost**: Local models (Ollama, LM Studio), OpenRouter * **Budget-friendly**: Together AI, HuggingFace,
- Hyperbolic * **Premium**: Anthropic, OpenAI, Google (higher quality) * **Enterprise**: AWS Bedrock, GitHub Models
- (included benefits)
-
-
-
- * **Maximum privacy**: Local models (Ollama, LM Studio) - data never leaves your device * **Enterprise-grade**: AWS
- Bedrock, Anthropic (SOC 2 compliant) * **Cloud security**: OpenAI, Google, Cohere (encrypted transmission) *
- **Specialized**: Perplexity (search integration with privacy considerations)
-
-
-
- * **Code generation**: All providers support coding, specialized: Cohere, Together AI, GitHub * **Multimodal**: Google
- Gemini, OpenAI GPT-4 Vision, Moonshot * **Long context**: Claude (200K+), Gemini (1M+), GPT-4 (128K) * **Function
- calling**: OpenAI, Anthropic, Google, Cohere * **Search integration**: Perplexity (real-time web search) *
- **Multilingual**: Cohere, Google, Moonshot (Chinese), Mistral
-
-
-
- * **Rapid prototyping**: Groq, Together AI (fast iteration)
- * **Production applications**: Anthropic, OpenAI, AWS Bedrock
- * **Research & analysis**: DeepSeek, Perplexity, Cohere
- * **Offline development**: Ollama, LM Studio
- * **Enterprise integration**: AWS Bedrock, GitHub Models
- * **Cost optimization**: Hyperbolic, HuggingFace, OpenRouter
-
-
+## Choosing a Provider
-## Quick Start
+**Performance & Speed:**
+- Ultra-fast: Groq, Together AI, Fireworks
+- Best reasoning: Anthropic Claude, DeepSeek, OpenAI o1
+- Balanced: OpenAI GPT-4, Google Gemini, Cohere
-
-
- Select from 19+ providers based on your needs: speed, cost, capabilities, or privacy requirements
-
+**Cost:**
+- Free/Low-cost: Local models (Ollama, LM Studio), OpenRouter
+- Budget-friendly: Together AI, HuggingFace, Hyperbolic
+- Premium: Anthropic, OpenAI, Google
-
- For cloud providers: Sign up and get API keys. For local providers: Download and install the software
-
+**Privacy:**
+- Maximum privacy: Local models (Ollama, LM Studio)
+- Enterprise-grade: AWS Bedrock, Anthropic
+- Cloud security: OpenAI, Google, Cohere
-
- Add your credentials in CodinIT's settings under AI Providers or use provider-specific setup prompts
-
+**Capabilities:**
+- Code generation: all providers; specialized strength: Cohere, Together AI
+- Multimodal: Google Gemini, OpenAI GPT-4 Vision, Moonshot
+- Long context: Claude (200K+), Gemini (1M+), GPT-4 (128K)
+- Search integration: Perplexity
+- Multilingual: Cohere, Google, Moonshot (Chinese)
-
- Choose from available models within your selected provider, considering context limits and capabilities
-
+## Quick Start
-
- Begin using AI assistance in your development workflow with the configured provider
-
+
+ Select based on needs: speed, cost, capabilities, or privacy
+ Sign up and get API keys (cloud) or install software (local)
+ Add credentials in CodinIT settings
+ Choose from available models
+ Begin using AI in your workflow
-## Configuration Tips
-
-
- **Multi-Provider Setup**: Configure multiple providers simultaneously and switch between them based on task
- requirements, cost considerations, or performance needs.
-
-
-
- **API Key Security**: Your API keys are stored locally and never transmitted to CodinIT servers. They are only used to
- communicate directly with your chosen AI provider.
-
-
-
- **Rate Limits**: Each provider has different rate limits and usage quotas. Monitor your usage and consider provider
- switching for high-volume workloads.
-
-
-
- **Provider Switching**: Easily switch between providers mid-project. CodinIT maintains separate contexts for different
- providers, allowing you to leverage specialized capabilities as needed.
-
-
-
- **Local vs Cloud**: Local providers (Ollama, LM Studio) offer maximum privacy but require hardware resources. Cloud
- providers offer convenience and advanced features but involve data transmission.
-
-
-## Next Steps
-
-
-
- Learn about context windows and model parameters
-
-
-
- Compare different models and their capabilities
-
-
-
- Set up local models for complete privacy
-
-
-
- Optimize your prompts for better results
-
-
-
- Optimize costs and performance across providers
-
-
-
- Connect with databases, deployments, and APIs
-
-
+## Notes
-
- **Provider Ecosystem**: With 19+ AI providers, you can choose the perfect model for every task - from rapid
- prototyping to production deployment, from cost optimization to maximum privacy.
-
+- **Multi-provider:** Configure multiple providers and switch between them
+- **API security:** Keys stored locally, never transmitted to CodinIT servers
+- **Rate limits:** Each provider has different limits
+- **Local vs Cloud:** Local offers privacy but requires hardware; cloud offers convenience and advanced features
diff --git a/providers/cohere.mdx b/providers/cohere.mdx
index b50da2b..74467fd 100644
--- a/providers/cohere.mdx
+++ b/providers/cohere.mdx
@@ -1,185 +1,40 @@
---
title: Cohere
-description: Access Cohere's powerful Command R series language models for advanced reasoning, code generation, and multilingual text analysis.
+description: Configure Cohere's Command R series models for reasoning, code generation, and multilingual tasks.
---
-Cohere provides advanced language models that excel at understanding context, generating human-like text, and performing complex reasoning tasks. Their models are particularly strong in areas like code generation, analysis, and multilingual capabilities.
+**Website:** [https://cohere.com/](https://cohere.com/)
-## Overview
+## Getting an API Key
-Cohere's AI models are designed to be helpful, truthful, and scalable. They offer a range of models from lightweight options to powerful enterprise-grade solutions, making them suitable for everything from quick prototyping to production applications.
+1. Go to [Cohere Dashboard](https://dashboard.cohere.com/) and sign in
+2. Navigate to API Keys section
+3. Create a new API key
+4. Copy the key immediately
-
-
- Latest generation models with enhanced reasoning capabilities
-
-
- Strong performance across multiple languages
-
-
- Specialized models for programming and code analysis
-
-
+## Configuration
-## Available Models
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Cohere" as the API Provider
+3. Paste your API key
+4. Choose your model
-
-
- ### Command R Plus Latest
- The most advanced model in Cohere's lineup, offering superior reasoning and code generation capabilities.
+## Supported Models
- - **Context Window**: 128,000 tokens
- - **Best for**: Complex reasoning, code generation, analysis
- - **Pricing**: Higher cost, premium performance
+- `command-r-plus` (Latest) - 128K context
+- `command-r` (Latest) - 128K context
+- `command` - 4K context
+- `command-light` - 4K context
+- `aya-expanse` - Multilingual, 8K context
-
+## Features
-
- ### Command R Latest
- Balanced performance with excellent reasoning capabilities for most use cases.
+- **Tool calling:** External tools and APIs
+- **RAG support:** Retrieval-augmented generation
+- **Multilingual:** Strong multi-language performance
+- **Code intelligence:** Programming specialized
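The tool-calling feature can be illustrated with a request payload in the style of Cohere's v2 chat API. This is a sketch under assumptions: the field names follow Cohere's v2 schema as best understood, and `get_weather` is a hypothetical tool invented for illustration; verify the schema against current Cohere docs.

```python
# Hedged sketch of a Cohere v2-style chat request with a tool attached.
# Tool schema and field names are assumptions; `get_weather` is hypothetical.
def build_cohere_request(prompt: str, model: str = "command-r-plus") -> dict:
    weather_tool = {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [weather_tool],
    }

req = build_cohere_request("What's the weather in Berlin?")
print(req["tools"][0]["function"]["name"])
```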
- - **Context Window**: 128,000 tokens
- - **Best for**: General AI tasks, analysis, writing
- - **Pricing**: Moderate cost, good performance balance
+## Notes
-
-
-
- ### Command R Plus
- Enterprise-grade model with enhanced capabilities for complex tasks.
-
- - **Context Window**: 128,000 tokens
- - **Best for**: Advanced reasoning, technical writing
- - **Pricing**: Premium tier
-
-
-
-
- ### Command R
- Reliable workhorse model for most AI applications.
-
- - **Context Window**: 128,000 tokens
- - **Best for**: General purpose AI tasks
- - **Pricing**: Standard tier
-
-
-
-
- ### Command
- Earlier generation model, still powerful for many tasks.
-
- - **Context Window**: 4,096 tokens
- - **Best for**: Basic text generation, simpler tasks
- - **Pricing**: Lower cost option
-
-
-
-
- ### Command Light Series
- Faster, more efficient models for simpler tasks and cost optimization.
-
- - **Context Window**: 4,096 tokens
- - **Best for**: Quick responses, basic analysis
- - **Pricing**: Most affordable option
-
-
-
-
- ### Aya Expanse Series
- Specialized multilingual models for global applications.
-
- - **Context Window**: 8,000 tokens
- - **Best for**: Multilingual content, global applications
- - **Pricing**: Moderate cost
-
-
-
-
-## Setup Instructions
-
-
- Visit [Cohere Dashboard](https://dashboard.cohere.com/) and create an account
- Navigate to API Keys section and create a new API key
- Add your API key to the Cohere provider settings in Codinit
- Select a Cohere model and test it with a simple prompt
-
-
-## Key Features
-
-
- Advanced Reasoning
- Code Generation
- Multilingual
- Tool Calling
- RAG Support
-
-
-### Advanced Capabilities
-
-- **Tool Calling**: Can use external tools and APIs
-- **Retrieval Augmented Generation**: Enhanced with external knowledge
-- **Multilingual Support**: Strong performance in multiple languages
-- **Code Intelligence**: Specialized for programming tasks
-- **Reasoning**: Advanced logical reasoning capabilities
-
-## Use Cases
-
-
-
- ### Programming Assistance
- Perfect for code generation, debugging, and technical documentation.
-
- - Generate complete functions and classes
- - Debug existing code
- - Write technical documentation
- - Code review and analysis
-
-
-
-
- ### Writing and Analysis
- Excellent for content creation and analytical tasks.
-
- - Technical writing
- - Business analysis
- - Research summaries
- - Creative content generation
-
-
-
-
- ### Global Applications
- Strong performance for international and multilingual use cases.
-
- - Translation assistance
- - Multilingual content creation
- - Cross-cultural communication
- - Global business applications
-
-
-
-
-## Pricing Information
-
-Cohere offers flexible pricing based on usage:
-
-- **Free Tier**: Limited usage for testing
-- **Pay-as-you-go**: Based on tokens processed
-- **Enterprise**: Custom pricing for high-volume usage
-
-
-
-
-
-
-
-
-
- **Rate Limits**: Cohere implements rate limits based on your account tier. Free accounts have lower limits than paid
- accounts.
-
-
-
- **Best Practices**: Start with Command R models for most applications. Use Command Light for simple tasks where speed
- is prioritized over quality.
-
+- **Pricing:** See [Cohere Pricing](https://cohere.com/pricing)
+- **Free tier:** Available for testing
diff --git a/providers/deepseek.mdx b/providers/deepseek.mdx
index d5febdb..1e5cd1f 100644
--- a/providers/deepseek.mdx
+++ b/providers/deepseek.mdx
@@ -1,33 +1,29 @@
---
title: "DeepSeek"
-description: "Learn how to configure and use DeepSeek models like deepseek-chat and deepseek-reasoner with CodinIT."
+description: "Configure DeepSeek models for coding and reasoning tasks with CodinIT."
---
-CodinIT supports accessing models through the DeepSeek API, including `deepseek-chat` and `deepseek-reasoner`.
-
**Website:** [https://platform.deepseek.com/](https://platform.deepseek.com/)
-### Getting an API Key
-
-1. **Sign Up/Sign In:** Go to the [DeepSeek Platform](https://platform.deepseek.com/). Create an account or sign in.
-2. **Navigate to API Keys:** Find your API keys in the [API keys](https://platform.deepseek.com/api_keys) section of the platform.
-3. **Create a Key:** Click "Create new API key". Give your key a descriptive name (e.g., "CodinIT").
-4. **Copy the Key:** **Important:** Copy the API key _immediately_. You will not be able to see it again. Store it securely.
+## Getting an API Key
-### Supported Models
+1. Go to [DeepSeek Platform](https://platform.deepseek.com/) and sign in
+2. Navigate to [API keys](https://platform.deepseek.com/api_keys)
+3. Click "Create new API key" and name it (e.g., "CodinIT")
+4. Copy the key immediately - you won't see it again
-CodinIT supports the following DeepSeek models:
+## Configuration
-- `deepseek-v3-0324` (Recommended for coding tasks)
-- `deepseek-r1` (Recommended for reasoning tasks)
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "DeepSeek" as the API Provider
+3. Paste your API key
+4. Choose your model
-### Configuration in CodinIT
+## Supported Models
-1. **Open CodinIT Settings:** Click the ⚙️ icon in the CodinIT panel.
-2. **Select Provider:** Choose "DeepSeek" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your DeepSeek API key into the "DeepSeek API Key" field.
-4. **Select Model:** Choose your desired model from the "Model" dropdown.
+- `deepseek-chat` (DeepSeek-V3, recommended for coding)
+- `deepseek-reasoner` (DeepSeek-R1, recommended for reasoning)
-### Tips and Notes
+## Notes
-- **Pricing:** Refer to the [DeepSeek Pricing](https://api-docs.deepseek.com/quick_start/pricing/) page for details on model costs.
\ No newline at end of file
+- **Pricing:** See [DeepSeek Pricing](https://api-docs.deepseek.com/quick_start/pricing/)
\ No newline at end of file
diff --git a/providers/fireworks.mdx b/providers/fireworks.mdx
index 7f74116..579c517 100644
--- a/providers/fireworks.mdx
+++ b/providers/fireworks.mdx
@@ -1,131 +1,44 @@
---
title: "Fireworks AI"
-description: "Configure Fireworks AI's lightning-fast inference platform with CodinIT for up to 4x faster performance and access to 40+ optimized models."
+description: "Configure Fireworks AI for fast inference with 40+ optimized models."
---
-Fireworks AI is a leading infrastructure platform for generative AI that focuses on delivering exceptional performance through optimized inference capabilities. With up to 4x faster inference speeds than alternative platforms and support for over 40 different AI models, Fireworks eliminates the operational complexity of running AI models at scale.
+Fireworks AI provides optimized inference with up to 4x faster performance than alternative platforms and access to 40+ models.
**Website:** [https://fireworks.ai/](https://fireworks.ai/)
-### Getting an API Key
+## Getting an API Key
-1. **Sign Up/Sign In:** Go to [Fireworks AI](https://fireworks.ai/) and create an account or sign in.
-2. **Navigate to API Keys:** Access the API keys section in your dashboard.
-3. **Create a Key:** Generate a new API key. Give it a descriptive name (e.g., "CodinIT").
-4. **Copy the Key:** Copy the API key immediately. Store it securely.
+1. Go to [Fireworks AI](https://fireworks.ai/) and sign in
+2. Navigate to API Keys in your dashboard
+3. Create a new API key and name it (e.g., "CodinIT")
+4. Copy the key immediately
-### Supported Models
+## Configuration
-Fireworks AI supports a wide variety of models across different categories. Popular models include:
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Fireworks" as the API Provider
+3. Paste your API key
+4. Enter the model ID (e.g., "accounts/fireworks/models/llama-v3p1-70b-instruct")
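Fireworks model IDs are fully qualified paths under `accounts/fireworks/models/`. A small sketch of how a short model name expands into the ID you paste into CodinIT (the base URL comes from Fireworks' OpenAI-compatible API docs and is an assumption here):

```python
# Sketch: compose a full Fireworks model ID from a short name.
# The base URL is assumed from Fireworks' API docs; confirm there.
FIREWORKS_BASE = "https://api.fireworks.ai/inference/v1"

def full_model_id(short_name: str) -> str:
    """Expand e.g. 'llama-v3p1-70b-instruct' to the full Fireworks ID."""
    if short_name.startswith("accounts/"):
        return short_name  # already fully qualified
    return f"accounts/fireworks/models/{short_name}"

print(full_model_id("llama-v3p1-70b-instruct"))
# accounts/fireworks/models/llama-v3p1-70b-instruct
```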
-**Text Generation Models:**
-- Llama 3.1 series (8B, 70B, 405B)
-- Mixtral 8x7B and 8x22B
-- Qwen 2.5 series
-- DeepSeek models with reasoning capabilities
-- Code Llama models for programming tasks
+## Supported Models
-**Vision Models:**
-- Llama 3.2 Vision models
-- Qwen 2-VL models
+- Llama 3.1 series (8B, 70B, 405B)
+- Mixtral 8x7B and 8x22B
+- Qwen 2.5 series
+- DeepSeek models
+- Code Llama models
+- Vision models (Llama 3.2, Qwen 2-VL)
-**Embedding Models:**
-- Various text embedding models for semantic search
+## Key Features
-The platform curates, optimizes, and deploys models with custom kernels and inference optimizations for maximum performance.
+- **Ultra-fast inference:** Up to 4x faster than alternatives
+- **Custom optimizations:** Advanced kernels for maximum performance
+- **40+ models:** Wide selection of optimized models
+- **Fine-tuning:** Available for custom models
+- **OpenAI compatible:** Standard API format
-### Configuration in CodinIT
+## Notes
-1. **Open CodinIT Settings:** Click the settings icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "Fireworks" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your Fireworks API key into the "Fireworks API Key" field.
-4. **Enter Model ID:** Specify the model you want to use (e.g., "accounts/fireworks/models/llama-v3p1-70b-instruct").
-5. **Configure Tokens:** Optionally set max completion tokens and context window size.
-
-### Fireworks AI's Performance Focus
-
-Fireworks AI's competitive advantages center on performance optimization and developer experience:
-
-#### Lightning-Fast Inference
-- **Up to 4x faster inference** than alternative platforms
-- **250% higher throughput** compared to open source inference engines
-- **50% faster speed** with significantly reduced latency
-- **6x lower cost** than HuggingFace Endpoints with 2.5x generation speed
-
-#### Advanced Optimization Technology
-- **Custom kernels** and inference optimizations increase throughput per GPU
-- **Multi-LoRA architecture** enables efficient resource sharing
-- **Hundreds of fine-tuned model variants** can run on shared base model infrastructure
-- **Asset-light model** focuses on optimization software rather than expensive GPU ownership
-
-#### Comprehensive Model Support
-- **40+ different AI models** curated and optimized for performance
-- **Multiple GPU types** supported: A100, H100, H200, B200, AMD MI300X
-- **Pay-per-GPU-second billing** with no extra charges for start-up times
-- **OpenAI API compatibility** for seamless integration
-
-### Pricing Structure
-
-Fireworks AI uses a usage-based pricing model with competitive rates:
-
-#### Text and Vision Models (2025)
-| Parameter Count | Price per 1M Input Tokens |
-|---|---|
-| Less than 4B parameters | $0.10 |
-| 4B - 16B parameters | $0.20 |
-| More than 16B parameters | $0.90 |
-| MoE 0B - 56B parameters | $0.50 |
-
-#### Fine-Tuning Services
-| Base Model Size | Price per 1M Training Tokens |
-|---|---|
-| Up to 16B parameters | $0.50 |
-| 16.1B - 80B parameters | $3.00 |
-| DeepSeek R1 / V3 | $10.00 |
-
-#### Dedicated Deployments
-| GPU Type | Price per Hour |
-|---|---|
-| A100 80GB | $2.90 |
-| H100 80GB | $5.80 |
-| H200 141GB | $6.99 |
-| B200 180GB | $11.99 |
-| AMD MI300X | $4.99 |
-
-### Special Features
-
-#### Fine-Tuning Capabilities
-Fireworks offers sophisticated fine-tuning services accessible through CLI interface, supporting JSON-formatted data from databases like MongoDB Atlas. Fine-tuned models cost the same as base models for inference.
-
-#### Developer Experience
-- **Browser playground** for direct model interaction
-- **REST API** with OpenAI compatibility
-- **Comprehensive cookbook** with ready-to-use recipes
-- **Multiple deployment options** from serverless to dedicated GPUs
-
-#### Enterprise Features
-- **HIPAA and SOC 2 Type II compliance** for regulated industries
-- **Self-serve onboarding** for developers
-- **Enterprise sales** for larger deployments
-- **Post-paid billing options** and Business tier
-
-#### Reasoning Model Support
-Advanced support for reasoning models with `` tag processing and reasoning content extraction, making complex multi-step reasoning practical for real-time applications.
-
-### Performance Advantages
-
-Fireworks AI's optimization delivers measurable improvements:
-- **250% higher throughput** vs open source engines
-- **50% faster speed** with reduced latency
-- **6x cost reduction** compared to alternatives
-- **2.5x generation speed** improvement per request
-
-### Tips and Notes
-
-- **Model Selection:** Choose models based on your specific use case - smaller models for speed, larger models for complex reasoning.
-- **Performance Focus:** Fireworks excels at making AI inference fast and cost-effective through advanced optimizations.
-- **Fine-Tuning:** Leverage fine-tuning capabilities to improve model accuracy with your proprietary data.
-- **Compliance:** HIPAA and SOC 2 Type II compliance enables use in regulated industries.
-- **Pricing Model:** Usage-based pricing scales with your success rather than traditional seat-based models.
-- **Developer Resources:** Extensive documentation and cookbook recipes accelerate implementation.
-- **GPU Options:** Multiple GPU types available for dedicated deployments based on performance needs.
\ No newline at end of file
+- **Pricing:** Usage-based, see [Fireworks Pricing](https://fireworks.ai/pricing)
+- **Compliance:** HIPAA and SOC 2 Type II certified
diff --git a/providers/github.mdx b/providers/github.mdx
index 582cbaf..ede2fe5 100644
--- a/providers/github.mdx
+++ b/providers/github.mdx
@@ -1,160 +1,37 @@
---
title: GitHub Models
-description: Access OpenAI GPT-4, o1, and other advanced AI models through GitHub's secure platform with seamless integration and enterprise features.
+description: Access OpenAI GPT-4, o1, and other AI models through GitHub's platform.
---
-GitHub Models provides access to leading AI models through GitHub's infrastructure, offering a convenient way to use OpenAI's GPT series, o1 models, and other advanced AI systems directly within the GitHub ecosystem.
+GitHub Models provides access to leading AI models, including OpenAI's GPT and o1 series, through GitHub's infrastructure.
-## Overview
+**Website:** [https://github.com/marketplace/models](https://github.com/marketplace/models)
-GitHub Models serves as a gateway to premium AI models, allowing developers to access cutting-edge AI capabilities without managing complex API integrations. It's particularly useful for teams already using GitHub and wanting seamless AI integration.
+## Setup
-
-
- Direct access to GPT-4, GPT-4o, and o1 models
-
-
- Seamless integration with GitHub workflows
-
-
- Suitable for teams and enterprise use
-
-
+1. **GitHub account:** Ensure you have appropriate permissions
+2. **Create token:** Go to [GitHub Settings > Personal Access Tokens](https://github.com/settings/personal-access-tokens)
+3. **Configure permissions:** Enable Models API access
+4. **Add to CodinIT:** Enter token in GitHub provider settings
+5. **Test models:** Verify connection
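Step 5 can be done outside CodinIT too. A hedged sketch of the request shape for GitHub Models' OpenAI-compatible chat endpoint — the endpoint URL below is an assumption drawn from GitHub's marketplace code samples, so confirm it in the model's "Use this model" panel:

```python
import json
import os

# Sketch of a GitHub Models chat request; the endpoint URL is an
# assumption from GitHub's marketplace samples, not a guarantee.
ENDPOINT = "https://models.inference.ai.azure.com/chat/completions"

headers = {
    "Content-Type": "application/json",
    # Fine-grained PAT with Models access, exported as GITHUB_TOKEN
    "Authorization": f"Bearer {os.environ.get('GITHUB_TOKEN', '')}",
}
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "ping"}],
}
body = json.dumps(payload)
```

POSTing `body` with those headers should return a standard OpenAI-format completion if the token has Models access.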
## Available Models
-
-
- ### GPT-4o Models
- OpenAI's latest multimodal models with enhanced reasoning and creativity.
+- **GPT-4o series:** GPT-4o, GPT-4o Mini (128K context)
+- **o1 series:** o1, o1-preview, o1-mini (200K context)
+- **GPT-4.1 series:** GPT-4.1, GPT-4.1-mini (1M+ context)
+- **DeepSeek:** DeepSeek-R1 (reasoning)
- - **GPT-4o**: Most advanced model with 128K context
- - **GPT-4o Mini**: Faster, cost-effective version
- - **Best for**: Complex reasoning, creative tasks, analysis
+## Features
-
+- **Large context:** Up to 1M+ tokens
+- **Multimodal:** Text and image support
+- **Advanced reasoning:** Specialized o1 models
+- **GitHub integration:** Native GitHub workflows
+- **Enterprise security:** GitHub's security features
-
- ### o1 Series
- OpenAI's specialized reasoning models for complex problem-solving.
+## Notes
- - **o1**: Advanced reasoning with 200K context window
- - **o1-preview**: Preview version of o1 capabilities
- - **o1-mini**: Efficient reasoning for technical tasks
- - **Best for**: Mathematics, coding, scientific reasoning
-
-
-
-
- ### GPT-4.1 Models
- Latest generation with massive context windows.
-
- - **GPT-4.1**: 1M+ context window for large documents
- - **GPT-4.1-mini**: Efficient version with large context
- - **Best for**: Document analysis, long-form content
-
-
-
-
- ### DeepSeek-R1
- Advanced reasoning model from DeepSeek.
-
- - **DeepSeek-R1**: Specialized reasoning capabilities
- - **Best for**: Technical analysis, research tasks
-
-
-
-
-## Setup Instructions
-
-
- Ensure you have a GitHub account with appropriate permissions
-
- Go to [GitHub Settings > Personal Access Tokens](https://github.com/settings/personal-access-tokens) and create a
- new token
-
- Enable the necessary permissions for Models API access
- Enter your GitHub token in the GitHub provider settings
- Verify connection by testing different available models
-
-
-## Key Features
-
-
- Large Context Windows
- Multimodal Support
- Reasoning Models
- GitHub Integration
- Enterprise Security
-
-
-### Advanced Capabilities
-
-- **Massive Context**: Up to 1M+ tokens for large documents
-- **Multimodal Input**: Text, images, and other media types
-- **Advanced Reasoning**: Specialized models for complex problem-solving
-- **Enterprise Security**: GitHub's security and compliance features
-- **API Compatibility**: Standard OpenAI API format
-
-## Use Cases
-
-
-
- ### Software Development
- Perfect for coding assistance and technical problem-solving.
-
- - Code generation and completion
- - Debugging and error analysis
- - Architecture design
- - Technical documentation
-
-
-
-
- ### Research Tasks
- Excellent for complex analysis and research work.
-
- - Document analysis and summarization
- - Research paper review
- - Data interpretation
- - Scientific reasoning
-
-
-
-
- ### Business Use Cases
- Suitable for enterprise-grade AI applications.
-
- - Business analysis and reporting
- - Customer service automation
- - Content generation
- - Process optimization
-
-
-
-
-## Pricing Information
-
-GitHub Models pricing is based on token usage:
-
-
-
-
-
-
-
-
-
-
- **Billing**: Usage is billed through your GitHub account. Monitor usage in GitHub's billing section.
-
-
-
- **Model Selection**: Choose GPT-4o for most tasks, o1 models for complex reasoning, and GPT-4o Mini for cost-effective
- general use.
-
-
-
- **Token Limits**: Be aware of context window limits. GPT-4.1 supports up to 1M tokens, while others have smaller
- limits.
-
+- **Pricing:** Token-based, billed through GitHub account
+- **Context windows:** Vary by model (128K to 1M+ tokens)
+- **Use cases:** Code development, research, business applications
diff --git a/providers/google.mdx b/providers/google.mdx
index 1b6f11a..3290f73 100644
--- a/providers/google.mdx
+++ b/providers/google.mdx
@@ -1,201 +1,82 @@
---
title: "Google Gemini"
-description: "Configure GCP Vertex AI with CodinIT to access leading generative AI models like Claude 4.5 Sonnet v2. This guide covers GCP environment setup."
+description: "Configure GCP Vertex AI to access Gemini and Claude models through Google Cloud."
---
-### Overview
+Access leading AI models like Gemini and Claude 4.5 Sonnet through Google Cloud's Vertex AI platform.
-**GCP Vertex AI:**\
-A fully managed service that provides access to leading generative AI models—such as Anthropic's Claude 4.5 Sonnet v2—through Google Cloud.\
-[Learn more about GCP Vertex AI](https://cloud.google.com/vertex-ai).
+**Website:** [https://cloud.google.com/vertex-ai](https://cloud.google.com/vertex-ai)
-This guide is tailored for organizations with established GCP environments (leveraging IAM roles, service accounts, and best practices in resource management) to ensure secure and compliant usage.
+## Prerequisites
----
-
-### Step 1: Prepare Your GCP Environment
-
-#### 1.1 Create or Use a GCP Project
-
-- **Sign in to the GCP Console:**\
- [Google Cloud Console](https://console.cloud.google.com/)
-- **Select or Create a Project:**\
- Use an existing project or create a new one dedicated to Vertex AI.
-
-#### 1.2 Set Up IAM Permissions and Service Accounts
-
-- **Assign Required Roles:**
-
- - Grant your user (or service account) the **Vertex AI User** role (`roles/aiplatform.user`)
- - For service accounts, also attach the **Vertex AI Service Agent** role (`roles/aiplatform.serviceAgent`) to enable certain operations
- - Consider additional predefined roles as needed:
- - Vertex AI Platform Express Admin
- - Vertex AI Platform Express User
- - Vertex AI Migration Service User
-
-- **Cross-Project Resource Access:**
- - For BigQuery tables in different projects, assign the **BigQuery Data Viewer** role
- - For Cloud Storage buckets in different projects, assign the **Storage Object Viewer** role
- - For external data sources, refer to the [GCP Vertex AI Access Control documentation](https://cloud.google.com/vertex-ai/general/access-control)
-
----
-
-### Step 2: Verify Regional and Model Access
-
-#### 2.1 Choose and Confirm a Region
-
-Vertex AI supports multiple regions. Select a region that meets your latency, compliance, and capacity needs. Examples include:
+- GCP account with billing enabled
+- GCP project created
+- IAM permissions configured
-- **us-east5 (Columbus, Ohio)**
-- **us-central1 (Iowa)**
-- **europe-west1 (Belgium)**
-- **europe-west4 (Netherlands)**
-- **asia-southeast1 (Singapore)**
-- **global (Global)**
-
-The Global endpoint may offer higher availability and reduce resource exhausted errors. Only Gemini models are supported.
-
-#### 2.2 Enable the Claude 4.5 Sonnet v2 Model
-
-- **Open Vertex AI Model Garden:**\
- In the Cloud Console, navigate to **Vertex AI → Model Garden**
-- **Enable Claude 4.5 Sonnet v2:**\
- Locate the model card for Claude 4.5 Sonnet v2 and click **Enable**
-
----
-
-
-#### 3.1 Install and Open CodinIT
-
-- **Download VS Code:**\
- [Download Visual Studio Code](https://code.visualstudio.com/)
-- **Install the CodinIT Extension:**
- - Open VS Code
- - Navigate to the Extensions Marketplace (Ctrl+Shift+X or Cmd+Shift+X)
- - Search for **Github** and install the extension & Clone the repository
-
-#### 3.2 Configure CodinIT Settings
-
-- **Open CodinIT Settings:**\
- Click the settings ⚙️ icon within the CodinIT extension
-- **Set API Provider:**\
- Choose **GCP Vertex AI** from the API Provider dropdown
-- **Enter Your Google Cloud Project ID:**\
- Provide the project ID you set up earlier
-- **Select the Region:**\
- Choose one of the supported regions (e.g., `us-east5`)
-- **Select the Model:**\
- From the available list, choose **Claude 4.5 Sonnet v2**
-- **Save and Test:**\
- Save your settings and test by sending a simple prompt (e.g., "Generate a Python function to check if a number is prime.")
-
----
+## Setup Steps
-### Step 4: Authentication and Credentials Setup
+### 1. Prepare GCP Environment
-#### Option A: Using Your Google Account (User Credentials)
+1. **Sign in:** [Google Cloud Console](https://console.cloud.google.com/)
+2. **Create/select project:** Use existing or create new project
+3. **Set up IAM:**
+ - Grant **Vertex AI User** role (`roles/aiplatform.user`)
+ - For service accounts, add **Vertex AI Service Agent** role (`roles/aiplatform.serviceAgent`)
-1. **Install the Google Cloud CLI:**\
- Follow the [installation guide](https://cloud.google.com/sdk/install)
-2. **Initialize and Authenticate:**
+### 2. Choose Region and Enable Models
- ```bash
- gcloud init
- gcloud auth application-default login
- ```
+1. **Select region:** Choose region for latency/compliance needs (e.g., `us-east5`, `us-central1`, `europe-west1`)
+ - Use `global` endpoint for higher availability (Gemini only)
+2. **Enable models:** Go to Vertex AI → Model Garden and enable desired models (e.g., Claude 4.5 Sonnet v2)
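Your region choice also determines the API host: regional endpoints use `{region}-aiplatform.googleapis.com`, while `global` uses the bare `aiplatform.googleapis.com`. A sketch of how a publisher-model endpoint is composed — the path layout follows Vertex AI's REST reference and the model name shown is hypothetical, so treat both as assumptions:

```python
# Compose a Vertex AI publisher-model endpoint from project/region/model.
# Path layout assumed from Vertex AI's REST reference; verify there.
def vertex_endpoint(project: str, region: str, publisher: str, model: str) -> str:
    host = ("aiplatform.googleapis.com" if region == "global"
            else f"{region}-aiplatform.googleapis.com")
    return (f"https://{host}/v1/projects/{project}/locations/{region}/"
            f"publishers/{publisher}/models/{model}")

# Hypothetical model name for illustration only
print(vertex_endpoint("my-project", "us-east5", "anthropic", "claude-sonnet"))
```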
- - This sets up Application Default Credentials (ADC) using your Google account
+### 3. Configure CodinIT
-3. **Restart VS Code:**\
- Ensure VS Code is restarted so that the CodinIT extension picks up the new credentials
+1. Install CodinIT extension in VS Code
+2. Click settings icon (⚙️)
+3. Select **GCP Vertex AI** as API Provider
+4. Enter your **Google Cloud Project ID**
+5. Select your **Region**
+6. Choose your **Model** (e.g., Claude 4.5 Sonnet v2)
+7. Save and test
-#### Option B: Using a Service Account (JSON Key)
+### 4. Authentication
-1. **Create a Service Account:**
+**Option A: User Credentials**
+```bash
+gcloud init
+gcloud auth application-default login
+```
+Restart VS Code after authentication.
- - In the GCP Console, navigate to **IAM & Admin > Service Accounts**
- - Create a new service account (e.g., "vertex-ai-client")
+**Option B: Service Account**
+1. Create service account in GCP Console
+2. Assign Vertex AI User and Service Agent roles
+3. Generate JSON key
+4. Set environment variable:
+ ```bash
+ export GOOGLE_APPLICATION_CREDENTIALS="/path/to/key.json"
+ ```
+5. Launch VS Code from terminal with this variable set
-2. **Assign Roles:**
-
- - Attach **Vertex AI User** (`roles/aiplatform.user`)
- - Attach **Vertex AI Service Agent** (`roles/aiplatform.serviceAgent`)
- - Optionally, add other roles as required
-
-3. **Generate a JSON Key:**
-
- - In the Service Accounts section, manage keys for your service account and download the JSON key
-
-4. **Set the Environment Variable:**
-
- ```bash
- export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-key.json"
- ```
-
- - This instructs Google Cloud client libraries (and CodinIT) to use this key
-
-5. **Restart VS Code:**\
- Launch VS Code from a terminal where the `GOOGLE_APPLICATION_CREDENTIALS` variable is set
-
----
-
-### Step 5: Security, Monitoring, and Best Practices
-
-#### 5.1 Enforce Least Privilege
-
-- **Principle of Least Privilege:**\
- Only grant the minimum necessary permissions. Custom roles can offer finer control compared to broad predefined roles
-- **Best Practices:**\
- Refer to [GCP IAM Best Practices](https://cloud.google.com/iam/best-practices)
-
-#### 5.2 Manage Resource Access
-
-- **Project vs. Resource-Level Access:**\
- Access can be managed at both levels. Note that resource-level permissions (e.g., for BigQuery or Cloud Storage) add to, but do not override, project-level policies
-
-#### 5.3 Monitor Usage and Quotas
-
-- **Model Observability Dashboard:**
-
- - In the Vertex AI Console, navigate to the **Model Observability** dashboard
- - Monitor metrics such as request throughput, latency, and error rates (including 429 quota errors)
-
-- **Quota Management:**
- - If you encounter 429 errors, check the **IAM & Admin > Quotas** page
- - Request a quota increase if necessary\
- [Learn more about GCP Vertex AI Quotas](https://cloud.google.com/vertex-ai/quotas)
-
-#### 5.4 Service Agents and Cross-Project Considerations
-
-- **Service Agents:**\
- Be aware of the different service agents:
-
- - Vertex AI Service Agent
- - Vertex AI RAG Data Service Agent
- - Vertex AI Custom Code Service Agent
- - Vertex AI Extension Service Agent
-
-- **Cross-Project Access:**\
- For resources in other projects (e.g., BigQuery, Cloud Storage), ensure that the appropriate roles (BigQuery Data Viewer, Storage Object Viewer) are assigned
-
----
+## Supported Regions
-### Conclusion
+- `us-east5` (Columbus, Ohio)
+- `us-central1` (Iowa)
+- `europe-west1` (Belgium)
+- `europe-west4` (Netherlands)
+- `asia-southeast1` (Singapore)
+- `global` (Global - Gemini only)
-By following these steps, your enterprise team can securely integrate GCP Vertex AI with the CodinIT VS Code extension to harness the power of **Claude 4.5 Sonnet v2**:
+## Notes
-- **Prepare Your GCP Environment:**\
- Create or use a project, configure IAM with least privilege, and ensure necessary roles (including the Vertex AI Service Agent role) are attached
-- **Verify Regional and Model Access:**\
- Confirm that your chosen region supports Claude 4.5 Sonnet v2 and that the model is enabled
-- **Configure CodinIT in VS Code:**\
- Install CodinIT, enter your project ID, select the appropriate region, and choose the model
-- **Set Up Authentication:**\
- Use either user credentials (via `gcloud auth application-default login`) or a service account with a JSON key
-- **Implement Security and Monitoring:**\
- Adhere to best practices for IAM, manage resource access carefully, and monitor usage with the Model Observability dashboard
+- **First-time use:** Enable models (e.g., Anthropic's Claude) in Model Garden before first use
+- **Quotas:** If you hit 429 errors, check **IAM & Admin > Quotas** and request an increase if needed
+- **Monitoring:** Use the Vertex AI **Model Observability** dashboard to track throughput, latency, and error rates
+- **Security:** Follow [GCP IAM Best Practices](https://cloud.google.com/iam/best-practices)
-For further details, please consult the [GCP Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs) and your internal security policies.\
-Happy coding!
+## Resources
-_This guide will be updated as GCP Vertex AI and CodinIT evolve. Always refer to the latest documentation for current practices._
\ No newline at end of file
+- [GCP Vertex AI Documentation](https://cloud.google.com/vertex-ai/docs)
+- [Access Control](https://cloud.google.com/vertex-ai/general/access-control)
+- [Quotas](https://cloud.google.com/vertex-ai/quotas)
diff --git a/providers/groq.mdx b/providers/groq.mdx
index aa7d106..f7a034f 100644
--- a/providers/groq.mdx
+++ b/providers/groq.mdx
@@ -1,80 +1,45 @@
---
title: "Groq"
-description: "Learn how to configure and use Groq's lightning-fast inference to access models from OpenAI, Meta, DeepSeek, and more with Groq."
+description: "Configure Groq's ultra-fast LPU inference for models from OpenAI, Meta, and DeepSeek."
---
-Groq provides ultra-fast AI inference through their custom LPU™ (Language Processing Unit) architecture, purpose-built for inference rather than adapted from training hardware. Groq hosts open-source models from various providers including OpenAI, Meta, DeepSeek, Moonshot AI, and others.
+Groq provides ultra-fast AI inference through its custom LPU™ (Language Processing Unit) architecture, hosting open-source models from OpenAI, Meta, DeepSeek, and others.
**Website:** [https://groq.com/](https://groq.com/)
-### Getting an API Key
+## Getting an API Key
-1. **Sign Up/Sign In:** Go to [Groq](https://groq.com/) and create an account or sign in.
-2. **Navigate to Console:** Go to the [Groq Console](https://console.groq.com/) to access your dashboard.
-3. **Create a Key:** Navigate to the API Keys section and create a new API key. Give your key a descriptive name (e.g., "CodinIT").
-4. **Copy the Key:** Copy the API key immediately. You will not be able to see it again. Store it securely.
+1. Go to [Groq Console](https://console.groq.com/) and sign in
+2. Navigate to API Keys section
+3. Create a new API key and name it (e.g., "CodinIT")
+4. Copy the key immediately - you won't see it again
-### Supported Models
+## Configuration
-CodinIT supports the following Groq models:
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Groq" as the API Provider
+3. Paste your API key
+4. Choose your model
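Groq exposes an OpenAI-compatible API; per Groq's docs the base URL is `https://api.groq.com/openai/v1` (confirm against their current reference). A stdlib sketch that builds a request to list the models available to your key:

```python
import os
import urllib.request

# Build (but do not send) a GET request to Groq's model-listing
# endpoint; base URL assumed from Groq's OpenAI-compatibility docs.
def models_request() -> urllib.request.Request:
    return urllib.request.Request(
        "https://api.groq.com/openai/v1/models",
        headers={"Authorization": f"Bearer {os.environ.get('GROQ_API_KEY', '')}"},
    )

req = models_request()
print(req.full_url)
```

Opening the request with `urllib.request.urlopen(req)` should return a JSON list of model IDs you can paste into CodinIT's model field.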
-- `llama-3.3-70b-versatile` (Meta) - Balanced performance with 131K context
-- `llama-3.1-8b-instant` (Meta) - Fast inference with 131K context
-- `openai/gpt-oss-120b` (OpenAI) - Featured flagship model with 131K context
-- `openai/gpt-oss-20b` (OpenAI) - Featured compact model with 131K context
-- `moonshotai/kimi-k2-instruct` (Moonshot AI) - 1 trillion parameter model with prompt caching
-- `deepseek-r1-distill-llama-70b` (DeepSeek/Meta) - Reasoning-optimized model
-- `qwen/qwen3-32b` (Alibaba Cloud) - Enhanced for Q&A tasks
-- `meta-llama/llama-4-maverick-17b-128e-instruct` (Meta) - Latest Llama 4 variant
-- `meta-llama/llama-4-scout-17b-16e-instruct` (Meta) - Latest Llama 4 variant
+## Supported Models
-### Configuration in CodinIT
+- `llama-3.3-70b-versatile` (Meta) - 131K context
+- `openai/gpt-oss-120b` (OpenAI) - 131K context
+- `moonshotai/kimi-k2-instruct` - 1T parameters with caching
+- `deepseek-r1-distill-llama-70b` - Reasoning optimized
+- `qwen/qwen3-32b` (Alibaba) - Q&A enhanced
+- `meta-llama/llama-4-maverick-17b-128e-instruct`
-1. **Open CodinIT Settings:** Click the settings icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "Groq" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your Groq API key into the "Groq API Key" field.
-4. **Select Model:** Choose your desired model from the "Model" dropdown.
+## Key Features
-### Groq's Speed Revolution
+- **Ultra-fast inference:** Sub-millisecond latency with LPU architecture
+- **Large context:** Up to 131K tokens
+- **Prompt caching:** Available on select models
+- **Vision support:** Available on select models
-Groq's LPU architecture delivers several key advantages over traditional GPU-based inference:
+Learn more about [LPU architecture](https://groq.com/blog/inside-the-lpu-deconstructing-groq-speed).
-#### LPU Architecture
-Unlike GPUs that are adapted from training workloads, Groq's LPU is purpose-built for inference. This eliminates architectural bottlenecks that create latency in traditional systems.
+## Notes
-#### Unmatched Speed
-- **Sub-millisecond latency** that stays consistent across traffic, regions, and workloads
-- **Static scheduling** with pre-computed execution graphs eliminates runtime coordination delays
-- **Tensor parallelism** optimized for low-latency single responses rather than high-throughput batching
-
-#### Quality Without Tradeoffs
-- **TruePoint numerics** reduce precision only in areas that don't affect accuracy
-- **100-bit intermediate accumulation** ensures lossless computation
-- **Strategic precision control** maintains quality while achieving 2-4× speedup over BF16
-
-#### Memory Architecture
-- **SRAM as primary storage** (not cache) with hundreds of megabytes on-chip
-- **Eliminates DRAM/HBM latency** that plagues traditional accelerators
-- **Enables true tensor parallelism** by splitting layers across multiple chips
-
-Learn more about Groq's technology in their [LPU architecture blog post](https://groq.com/blog/inside-the-lpu-deconstructing-groq-speed).
-
-### Special Features
-
-#### Prompt Caching
-The Kimi K2 model supports prompt caching, which can significantly reduce costs and latency for repeated prompts.
-
-#### Vision Support
-Select models support image inputs and vision capabilities. Check the model details in the Groq Console for specific capabilities.
-
-#### Reasoning Models
-Some models like DeepSeek variants offer enhanced reasoning capabilities with step-by-step thought processes.
-
-### Tips and Notes
-
-- **Model Selection:** Choose models based on your specific use case and performance requirements.
-- **Speed Advantage:** Groq excels at single-request latency rather than high-throughput batch processing.
-- **OSS Model Provider:** Groq hosts open-source models from multiple providers (OpenAI, Meta, DeepSeek, etc.) on their fast infrastructure.
-- **Context Windows:** Most models offer large context windows (up to 131K tokens) for including substantial code and context.
-- **Pricing:** Groq offers competitive pricing with their speed advantages. Check the [Groq Pricing](https://groq.com/pricing) page for current rates.
-- **Rate Limits:** Groq has generous rate limits, but check their documentation for current limits based on your usage tier.
\ No newline at end of file
+- **Speed:** Optimized for single-request latency
+- **Pricing:** See [Groq Pricing](https://groq.com/pricing)
\ No newline at end of file
diff --git a/providers/huggingface.mdx b/providers/huggingface.mdx
index 6e0c3d6..6314f4e 100644
--- a/providers/huggingface.mdx
+++ b/providers/huggingface.mdx
@@ -1,206 +1,41 @@
---
title: Hugging Face
-description: Access thousands of open-source AI models including Qwen, Llama, and CodeLlama through Hugging Face's cost-effective inference API.
+description: Access thousands of open-source AI models through Hugging Face's inference API.
---
-## Overview
+**Website:** [https://huggingface.co/](https://huggingface.co/)
-Hugging Face democratizes AI by providing easy access to cutting-edge models. Their platform hosts models for text generation, code completion, analysis, and more, all accessible through a simple API interface.
+## Getting an API Key
-
-
- Access to thousands of community models
-
-
- Latest models from top AI research labs
-
-
- Affordable pricing for open-source models
-
-
+1. Go to [Hugging Face](https://huggingface.co/) and sign in
+2. Navigate to [Settings > Access Tokens](https://huggingface.co/settings/tokens)
+3. Create a new token with "Read" permissions
+4. Copy the token immediately
-## Available Models
+## Configuration
-
-
- ### Qwen Series
- High-quality models from Alibaba Cloud, excellent for coding and general tasks.
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Hugging Face" as the API Provider
+3. Paste your token
+4. Choose your model
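Hugging Face's serverless Inference API addresses models by repository ID in the URL. A minimal sketch — the URL pattern (`api-inference.huggingface.co/models/<repo-id>`) is taken from HF's Inference API docs and should be confirmed there:

```python
import json
import os
import urllib.request

# Build (but do not send) a request to HF's serverless Inference API.
# URL pattern assumed from Hugging Face's docs; HF_TOKEN must be set.
def build_request(repo_id: str, prompt: str) -> urllib.request.Request:
    url = f"https://api-inference.huggingface.co/models/{repo_id}"
    return urllib.request.Request(
        url,
        data=json.dumps({"inputs": prompt}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('HF_TOKEN', '')}",
        },
        method="POST",
    )

req = build_request("Qwen/Qwen2.5-Coder-32B-Instruct", "def fib(n):")
print(req.full_url)
```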
- - **Qwen2.5-Coder-32B**: Specialized for code generation
- - **Qwen2.5-72B**: General purpose large model
- - **Best for**: Code completion, technical writing, analysis
+## Supported Models
-
+- **Qwen series:** Qwen2.5-Coder-32B, Qwen2.5-72B
+- **Llama series:** Llama-3.1-70B, Llama-3.1-405B
+- **CodeLlama:** CodeLlama-34B
+- **Yi series:** Yi-1.5-34B
+- **Hermes series:** Hermes-3-Llama-3.1-8B
-
- ### Meta Llama Series
- Industry-leading open-source models from Meta.
+## Features
- - **Llama-3.1-70B**: Powerful general-purpose model
- - **Llama-3.1-405B**: Massive model for complex tasks
- - **Best for**: Advanced reasoning, creative tasks
+- **Open source:** All models openly available
+- **Community driven:** Constantly updated
+- **Research access:** Latest models from top labs
+- **Cost effective:** Affordable pricing
+- **Free tier:** Available for testing
-
+## Notes
-
- ### CodeLlama Series
- Specialized models for programming and code-related tasks.
-
- - **CodeLlama-34B**: Large coding model
- - **Best for**: Code generation, debugging, technical analysis
-
-
-
-
- ### Yi Series
- High-performance models from 01.AI.
-
- - **Yi-1.5-34B**: Balanced performance and capability
- - **Best for**: General AI tasks, analysis, writing
-
-
-
-
- ### Hermes Series
- Fine-tuned models optimized for helpfulness and reasoning.
-
- - **Hermes-3-Llama-3.1-8B**: Efficient and capable
- - **Best for**: Conversational AI, helpful responses
-
-
-
-
-## Setup Instructions
-
-
-
- Visit [Hugging Face](https://huggingface.co/) and create a free account
-
-
- Go to [Settings > Access Tokens](https://huggingface.co/settings/tokens) and create a new token
-
- Ensure your token has "Read" permissions for model inference
- Enter your Hugging Face token in the provider settings
- Try different models to find the best fit for your needs
-
-
-## Key Features
-
-
- Open Source
- Community Driven
- Research Models
- Cost Effective
- Diverse Models
-
-
-### Platform Advantages
-
-- **Open Source**: All models are openly available and auditable
-- **Community Driven**: Constantly updated by global AI community
-- **Research Access**: Latest models from top research institutions
-- **Flexible Pricing**: Pay only for what you use
-- **Wide Selection**: Models for every use case and skill level
-
-## Use Cases
-
-
-
- ### Programming Tasks
- Specialized models for software development and coding assistance.
-
- - Code generation and completion
- - Code review and analysis
- - Debugging assistance
- - Technical documentation
-
-
-
-
- ### Academic Research
- Powerful models for research, analysis, and academic work.
-
- - Scientific paper analysis
- - Research summarization
- - Data interpretation
- - Academic writing assistance
-
-
-
-
- ### Creative Writing
- Models for content creation and creative tasks.
-
- - Creative writing
- - Content generation
- - Language translation
- - Educational content
-
-
-
-
- ### Enterprise Use
- Suitable for business and productivity applications.
-
- - Business analysis
- - Report generation
- - Customer communication
- - Process automation
-
-
-
-
-## Pricing Information
-
-Hugging Face offers flexible pricing based on model size and usage:
-
-
-
-
-
-
-
-
-**Free Tier**: Hugging Face offers a generous free tier for testing and light usage.
-
-
- **Model Selection**: Start with smaller models for testing, then scale up to larger models for production use.
-
-
-
- **Rate Limits**: Free tier has usage limits. Paid plans offer higher rate limits and priority access.
-
-
-## Model Performance Notes
-
-
-
- ### Speed Considerations
- Model size affects response time and resource usage.
-
- - **Small models**: Fast responses, lower resource usage
- - **Large models**: Slower responses, higher resource usage
- - **Consider trade-offs**: Speed vs. quality based on your needs
-
-
-
-
- ### Token Limits
- Different models have varying context window sizes.
-
- - **Most models**: 4K-8K token context windows
- - **Specialized models**: May have different limits
- - **Check documentation**: Verify limits for your chosen model
-
-
-
-
- ### Staying Current
- Hugging Face models are frequently updated by the community.
-
- - **Regular updates**: New model versions released frequently
- - **Version pinning**: Specify exact model versions for consistency
- - **Community contributions**: New models added regularly
-
-
-
+- **Pricing:** Based on model size and usage
+- **Rate limits:** Free tier has usage limits
diff --git a/providers/hyperbolic.mdx b/providers/hyperbolic.mdx
index aa6fe3b..0d6b0e6 100644
--- a/providers/hyperbolic.mdx
+++ b/providers/hyperbolic.mdx
@@ -1,130 +1,39 @@
---
title: Hyperbolic
-description: Access high-performance open-source AI models through Hyperbolic's optimized infrastructure with fast inference and competitive pricing.
+description: Access high-performance open-source AI models through Hyperbolic's optimized infrastructure.
---
-Hyperbolic provides access to cutting-edge open-source AI models through their optimized cloud infrastructure, offering fast inference speeds and a wide selection of models from leading AI research organizations.
+**Website:** [https://app.hyperbolic.xyz/](https://app.hyperbolic.xyz/)
-## Overview
+## Getting an API Key
-Hyperbolic specializes in running open-source AI models with enterprise-grade performance and reliability. Their platform offers models from Qwen, DeepSeek, and other leading AI labs, optimized for speed and cost-effectiveness.
+1. Go to [Hyperbolic](https://app.hyperbolic.xyz/) and sign in
+2. Navigate to Settings
+3. Create a new API key
+4. Copy the key immediately
-
-
- Optimized infrastructure for fast inference
-
-
- Access to leading open-source AI models
-
-
- Competitive pricing for enterprise use
-
-
+## Configuration
-## Available Models
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Hyperbolic" as the API Provider
+3. Paste your API key
+4. Choose your model
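+
+Hyperbolic's API is OpenAI-compatible, so you can sanity-check your key with a direct request (a sketch; the `https://api.hyperbolic.xyz/v1` base URL and the exact model identifier are assumptions - confirm them in Hyperbolic's dashboard):
+
+```bash
+curl -s https://api.hyperbolic.xyz/v1/chat/completions \
+  -H "Authorization: Bearer $HYPERBOLIC_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"model": "Qwen/Qwen2.5-Coder-32B-Instruct", "messages": [{"role": "user", "content": "Hello"}]}'
+```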
-
-
- ### Qwen Series
- Advanced models from Alibaba Cloud, excellent for coding and general AI tasks.
+## Supported Models
- - **Qwen2.5-Coder-32B**: Specialized code generation model
- - **Qwen2.5-72B**: Large general-purpose model
- - **QwQ-32B-Preview**: Advanced reasoning model
- - **Qwen2-VL-72B**: Multimodal model with vision capabilities
+- **Qwen series:** Qwen2.5-Coder-32B, Qwen2.5-72B, QwQ-32B-Preview
+- **Qwen Vision:** Qwen2-VL-72B (multimodal)
+- **DeepSeek:** DeepSeek-V2.5
-
+## Features
-
- ### DeepSeek Series
- Efficient and capable models from DeepSeek AI.
+- **High performance:** Optimized infrastructure for fast inference
+- **Open source focus:** Latest open-source models
+- **Multimodal:** Vision and text capabilities
+- **Cost effective:** Competitive pricing
+- **Free credits:** Available for new accounts
- - **DeepSeek-V2.5**: Balanced performance and efficiency
- - **Best for**: General AI tasks, analysis, reasoning
+## Notes
-
-
-
-## Setup Instructions
-
-
- Visit [Hyperbolic](https://app.hyperbolic.xyz/) and create an account
- Navigate to Settings and create a new API key
- Add your API key to the Hyperbolic provider settings
- Choose from available models based on your needs
-
-
-## Key Features
-
-
- High Performance
- Open Source
- Multimodal Support
- Enterprise Ready
- Cost Optimized
-
-
-### Platform Advantages
-
-- **Optimized Infrastructure**: Fast inference speeds with low latency
-- **Open Source Focus**: Access to the latest open-source models
-- **Multimodal Capabilities**: Models with vision and text understanding
-- **Enterprise Features**: Reliability and scalability for business use
-- **Competitive Pricing**: Cost-effective compared to proprietary models
-
-## Use Cases
-
-
-
- ### Programming Tasks
- Perfect for software development and technical work.
-
- - Code generation and completion
- - Technical problem-solving
- - Code review and analysis
- - Documentation generation
-
-
-
-
- ### Research Applications
- Suitable for research, analysis, and academic work.
-
- - Scientific research assistance
- - Data analysis and interpretation
- - Academic writing support
- - Complex reasoning tasks
-
-
-
-
- ### Enterprise Use
- Reliable for business and productivity applications.
-
- - Business intelligence
- - Content creation
- - Customer service automation
- - Process optimization
-
-
-
-
-## Pricing Information
-
-Hyperbolic offers flexible pricing based on model usage:
-
-
-
-
-
-
-
-**Free Credits**: New accounts receive free credits for testing and evaluation.
-
-
- **Model Selection**: Choose Qwen models for coding tasks and DeepSeek for general-purpose applications.
-
-
-
- **Rate Limits**: Monitor your usage to stay within rate limits. Enterprise plans offer higher limits.
-
+- **Pricing:** Usage-based, competitive rates
+- **Rate limits:** Monitor usage to stay within limits
diff --git a/providers/lmstudio.mdx b/providers/lmstudio.mdx
index b977922..1a29f89 100644
--- a/providers/lmstudio.mdx
+++ b/providers/lmstudio.mdx
@@ -1,265 +1,73 @@
---
title: LM Studio
-description: Run AI models locally with LM Studio's user-friendly interface for privacy, speed, and offline development capabilities.
+description: Run AI models locally with LM Studio's user-friendly interface for privacy and offline development.
---
-LM Studio provides a user-friendly way to run large language models locally on your computer, offering privacy, speed, and offline capabilities without requiring an internet connection.
+LM Studio provides a user-friendly way to run AI models locally with privacy, speed, and offline capabilities.
-## Overview
+**Website:** [https://lmstudio.ai](https://lmstudio.ai)
-LM Studio bridges the gap between powerful AI models and local computing, allowing you to run advanced AI models directly on your machine. It's perfect for users who want privacy, speed, and control over their AI interactions.
+## Setup
-
-
- Run AI models directly on your computer
-
-
- Keep conversations and data completely private
-
-
- Work without internet connectivity
-
-
+1. **Download:** Visit [lmstudio.ai](https://lmstudio.ai) and download for your OS
+2. **Install and launch:** Open LM Studio
+3. **Download a model:** Go to the "Discover" tab and download a model
+   - **Recommended:** Qwen3 Coder 30B A3B Instruct for the best CodinIT experience
+4. **Start server:** Go to the "Developer" tab and toggle the server to "Running" (runs at `http://localhost:51732`)
+5. **Configure model settings:**
+   - **Context Length:** Set to 262,144 (the model's maximum)
+   - **KV Cache Quantization:** Leave unchecked (critical for consistent performance)
+ - **Flash Attention:** Enable if available
-## How It Works
+## Configuration in CodinIT
-LM Studio downloads and runs AI models locally using your computer's resources. It provides a simple interface to manage models, start local servers, and connect to various applications including Codinit.
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "LM Studio" as the API Provider
+3. Set server URL to `http://localhost:51732`
+4. Choose your model
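+
+LM Studio serves an OpenAI-compatible API, so you can verify the server is reachable before connecting CodinIT (uses the server URL from the setup above):
+
+```bash
+curl -s http://localhost:51732/v1/models
+# Lists the models currently loaded in LM Studio
+```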
-
-
- ### Downloading Models
- Choose from thousands of available models in various sizes and capabilities.
+## Quantization Guide
- - **Model Library**: Browse and download models from Hugging Face
- - **Size Options**: From small 1GB models to large 100GB+ models
- - **Format Support**: GGUF, SafeTensor, and other formats
- - **Automatic Updates**: Stay current with latest model versions
+Choose based on available RAM:
+- **32GB RAM:** 4-bit quantization (~17GB download)
+- **64GB RAM:** 8-bit quantization (~32GB download)
+- **128GB+ RAM:** Full precision or larger models
-
+## Model Format
-
- ### Running AI Locally
- Start a local API server that applications can connect to.
+- **Mac (Apple Silicon):** Use MLX format
+- **Windows/Linux:** Use GGUF format
- - **One-Click Setup**: Start local server with single button
- - **API Compatibility**: OpenAI-compatible API endpoints
- - **Multi-Platform**: Windows, macOS, and Linux support
- - **Resource Management**: Monitor CPU/GPU usage and memory
+## Features
-
-
-
- ### Optimization Settings
- Fine-tune performance based on your hardware capabilities.
-
- - **GPU Acceleration**: Utilize NVIDIA/AMD GPUs when available
- - **CPU Optimization**: Efficient CPU inference for all systems
- - **Memory Management**: Control RAM usage and model loading
- - **Quantization**: Balance speed vs. quality with different precisions
-
-
-
-
-## Setup Instructions
-
-
-
- Visit [lmstudio.ai](https://lmstudio.ai) and download the application for your operating system.
-
- 
-
-
- Install LM Studio and launch the application. You'll see four tabs on the left:
- - **Chat**: Interactive chat interface
- - **Developer**: Where you will start the server
- - **My Models**: Your downloaded models storage
- - **Discover**: Browse and add new models
-
-
- Navigate to the "Discover" tab, browse available models, and download your preferred model. Wait for the download to complete.
-
- **Recommended**: Use **Qwen3 Coder 30B A3B Instruct** for the best experience with CodinIT. This model delivers strong coding performance and reliable tool use.
-
-
- Navigate to the "Developer" tab and toggle the server switch to "Running". The server will run at `http://localhost:51732`.
-
- 
-
-
- After loading your model in the Developer tab, configure these critical settings:
- - **Context Length**: Set to 262,144 (the model's maximum)
- - **KV Cache Quantization**: Leave unchecked (critical for consistent performance)
- - **Flash Attention**: Enable if available (improves performance)
-
-
- Set the server URL in CodinIT settings and verify the connection to start using local AI models.
-
-
-
-### Quantization Guide
-
-Choose quantization based on your available RAM:
-
-- **32GB RAM**: Use 4-bit quantization (~17GB download)
-- **64GB RAM**: Use 8-bit quantization (~32GB download) for better quality
-- **128GB+ RAM**: Consider full precision or larger models
-
-### Model Format
-
-- **Mac (Apple Silicon)**: Use MLX format for optimized performance
-- **Windows/Linux**: Use GGUF format
-
-## Key Features
-
-
- Local Execution
- Privacy Focused
- Offline Capable
- Cost Free
- Customizable
-
-
-### Platform Advantages
-
-- **Complete Privacy**: All conversations stay on your device
-- **No API Costs**: Run unlimited AI interactions for free
-- **Offline Operation**: Work without internet connectivity
-- **Hardware Flexibility**: Run on any modern computer
-- **Model Variety**: Access thousands of different AI models
-
-## Use Cases
-
-
-
- ### Secure Development
- Perfect for sensitive development work and private projects.
-
- - Code review without sharing code externally
- - Private documentation and analysis
- - Secure brainstorming and planning
- - Confidential business applications
-
-
-
-
- ### Offline Productivity
- Continue working with AI assistance even without internet.
-
- - Travel and remote work scenarios
- - Limited connectivity environments
- - Data-sensitive offline processing
- - Emergency backup AI capabilities
-
-
-
-
- ### Budget-Friendly AI
- Access advanced AI capabilities without ongoing costs.
-
- - Unlimited usage without API fees
- - No per-token or per-request charges
- - One-time setup, ongoing free usage
- - Cost-effective for heavy AI users
-
-
-
-
- ### Educational Use
- Learn about AI and experiment with different models.
-
- - Study different model architectures
- - Compare model performance and capabilities
- - Learn prompt engineering techniques
- - Understand AI model behaviors
-
-
-
+- **Complete privacy:** All data stays on your device
+- **No API costs:** Unlimited free usage
+- **Offline operation:** Works without internet
+- **Hardware flexibility:** Runs on any modern computer
## System Requirements
-
-
- ### Basic Setup
- Requirements for running small to medium models.
-
- - **RAM**: 8GB minimum, 16GB recommended
- - **Storage**: 10GB free space for models and application
- - **OS**: Windows 10+, macOS 10.15+, Ubuntu 18.04+
- - **CPU**: Modern multi-core processor
-
-
-
-
- ### Optimal Performance
- Recommended specifications for large models and best performance.
-
- - **RAM**: 32GB or more for large models
- - **GPU**: NVIDIA GPU with 8GB+ VRAM (optional but recommended)
- - **Storage**: SSD with 50GB+ free space
- - **CPU**: Multi-core processor with AVX2 support
-
-
-
-
- ### Hardware Acceleration
- Utilize GPU acceleration for faster inference speeds.
-
- - **NVIDIA GPUs**: CUDA support for maximum performance
- - **AMD GPUs**: ROCm support on Linux
- - **Apple Silicon**: Native acceleration on M1/M2/M3 Macs
- - **CPU Fallback**: Automatic fallback to CPU when GPU unavailable
+**Minimum:**
+- 8GB RAM (16GB recommended)
+- 10GB free storage
+- Modern multi-core CPU
-
-
-
-## Model Selection Guide
-
-
-
- ### Choosing Model Size
- Balance between performance and resource requirements.
-
- - **Small Models (1-3GB)**: Fast, basic capabilities, good for simple tasks
- - **Medium Models (3-7GB)**: Balanced performance, good for most applications
- - **Large Models (7-20GB)**: High quality, slower but more capable
- - **XL Models (20GB+)**: Maximum quality, requires powerful hardware
-
-
-
-
- ### Specialized Models
- Choose models based on your specific needs.
-
- - **Code Models**: Code generation, debugging, technical writing
- - **General Chat**: Conversation, analysis, creative writing
- - **Math/Science**: Mathematical reasoning, scientific analysis
- - **Multilingual**: Support for multiple languages and cultures
-
-
-
-
-**Free Forever**: LM Studio is completely free to use. No subscriptions or hidden costs.
-
-
- **Start Small**: Begin with smaller models to test your setup, then upgrade to larger models as needed.
-
-
-
- **Resource Intensive**: Large models require significant RAM and may run slowly on lower-end hardware.
-
+**Recommended:**
+- 32GB+ RAM for large models
+- NVIDIA GPU with 8GB+ VRAM (optional)
+- SSD with 50GB+ free space
## Troubleshooting
-If CodinIT can't connect to LM Studio:
-
-1. Verify LM Studio server is running (check Developer tab)
+If CodinIT can't connect:
+1. Verify LM Studio server is running
2. Ensure a model is loaded
-3. Check your system meets hardware requirements
-4. Confirm the server URL matches in CodinIT settings
+3. Check system meets hardware requirements
+4. Confirm server URL matches in CodinIT settings
-## Important Notes
+## Notes
- Start LM Studio before using with CodinIT
- Keep LM Studio running in background
-- First model download may take several minutes depending on size
+- First model download may take several minutes
- Models are stored locally after download
diff --git a/providers/mistral-ai.mdx b/providers/mistral-ai.mdx
index ffa5d3b..ef075a6 100644
--- a/providers/mistral-ai.mdx
+++ b/providers/mistral-ai.mdx
@@ -1,53 +1,43 @@
---
title: "Mistral"
-description: "Learn how to configure and use Mistral AI models, including Codestral, with CodinIT. Covers API key setup and model selection."
+description: "Configure Mistral AI models including Codestral for code generation with CodinIT."
---
-CodinIT supports accessing models through the Mistral AI API, including both standard Mistral models and the code-specialized Codestral model.
-
**Website:** [https://mistral.ai/](https://mistral.ai/)
-### Getting an API Key
-
-1. **Sign Up/Sign In:** Go to the [Mistral Platform](https://console.mistral.ai/). Create an account or sign in. You may need to go through a verification process.
-2. **Create an API Key:**
- - [La Plateforme API Key](https://console.mistral.ai/api-keys/) and/or
- - [Codestral API Key](https://console.mistral.ai/codestral)
+## Getting an API Key
-### Supported Models
+1. Go to [Mistral Console](https://console.mistral.ai/) and sign in
+2. Create API keys:
+ - [La Plateforme API Key](https://console.mistral.ai/api-keys/) for standard models
+ - [Codestral API Key](https://console.mistral.ai/codestral) for Codestral models
+3. Copy the key immediately
-CodinIT supports the following Mistral models:
+## Configuration
-- pixtral-large-2411
-- ministral-3b-2410
-- ministral-8b-2410
-- mistral-small-latest
-- mistral-medium-latest
-- mistral-small-2501
-- pixtral-12b-2409
-- open-mistral-nemo-2407
-- open-codestral-mamba
-- codestral-2501
-- devstral-small-2505
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Mistral" as the API Provider
+3. Paste your API key
+4. Choose your model
-**Note:** Model availability and specifications may change.
-Refer to the [Mistral AI documentation](https://docs.mistral.ai/api/) and [Mistral Model Overview](https://docs.mistral.ai/getting-started/models/models_overview/) for the most current information.
+## Supported Models
-### Configuration in CodinIT
+- `pixtral-large-2411`
+- `mistral-small-2501`
+- `ministral-8b-2410`
+- `codestral-2501` (Code specialized)
+- `devstral-small-2505` (Code specialized)
+- `open-codestral-mamba`
-1. **Open CodinIT Settings:** Click the settings icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "Mistral" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your Mistral API key into the "Mistral API Key" field if you're using a standard `mistral` model. If you intend to use `codestral-latest`, see the "Using Codestral" section below.
-4. **Select Model:** Choose your desired model from the "Model" dropdown.
+See [Mistral Models](https://docs.mistral.ai/getting-started/models/models_overview/) for full details.
-### Using Codestral
+## Using Codestral
-[Codestral](https://docs.mistral.ai/capabilities/code_generation/) is a model specifically designed for code generation and interaction.
-For Codestral, you can use different endpoints (Default: codestral.mistral.ai).
-If using the La Plateforme API Key for Codestral, change the **Codestral Base Url** to: `https://api.mistral.ai`
+For Codestral models:
+- Use a Codestral API key from `codestral.mistral.ai`
+- Or use a La Plateforme API key and set **Codestral Base URL** to `https://api.mistral.ai`
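+
+Both endpoints expose an OpenAI-style chat completions API; a quick key check against La Plateforme looks like this (swap the base URL for `https://codestral.mistral.ai` when using a Codestral key):
+
+```bash
+curl -s https://api.mistral.ai/v1/chat/completions \
+  -H "Authorization: Bearer $MISTRAL_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"model": "codestral-2501", "messages": [{"role": "user", "content": "Write a hello world in Python"}]}'
+```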
-To use Codestral with CodinIT:
+## Notes
-1. **Select "Mistral" as the API Provider in CodinIT Settings.**
-2. **Select a Codestral Model** (e.g., `codestral-latest`) from the "Model" dropdown.
-3. **Enter your Codestral API Key** (from `codestral.mistral.ai`) or your La Plateforme API Key (from `api.mistral.ai`) into the appropriate API key field in CodinIT.
\ No newline at end of file
+- **Pricing:** See [Mistral Pricing](https://mistral.ai/pricing)
+- **Documentation:** [Mistral API Docs](https://docs.mistral.ai/api/)
diff --git a/providers/moonshot.mdx b/providers/moonshot.mdx
index c9a4875..2a1e797 100644
--- a/providers/moonshot.mdx
+++ b/providers/moonshot.mdx
@@ -1,196 +1,38 @@
---
title: Moonshot
-description: Access advanced Chinese AI models including Kimi series through Moonshot's platform with excellent multilingual and reasoning capabilities.
+description: Configure Moonshot's Kimi series models for Chinese language and multilingual tasks.
---
-Moonshot provides access to advanced Chinese AI models, including the popular Kimi series, offering strong performance in both Chinese and English languages with competitive pricing.
+**Website:** [https://platform.moonshot.ai/](https://platform.moonshot.ai/)
-## Overview
+## Getting an API Key
-Moonshot specializes in Chinese language AI models while maintaining strong English capabilities. Their platform offers a range of models from lightweight options to powerful enterprise-grade solutions, with particular strength in Chinese language processing and cultural understanding.
+1. Go to [Moonshot Platform](https://platform.moonshot.ai/) and sign in
+2. Navigate to the API console
+3. Create a new API key
+4. Copy the key immediately
-
-
- Specialized models for Chinese language processing
-
-
- Popular conversational AI models
-
-
- Strong performance in both Chinese and English
-
-
+## Configuration
-## Available Models
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Moonshot" as the API Provider
+3. Paste your API key
+4. Choose your model
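+
+Moonshot's API follows the OpenAI chat completion format; you can verify your key with a direct request (a sketch - the `https://api.moonshot.ai/v1` base URL and `kimi-latest` model name are assumptions, so check the platform console for your account's endpoint):
+
+```bash
+curl -s https://api.moonshot.ai/v1/chat/completions \
+  -H "Authorization: Bearer $MOONSHOT_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"model": "kimi-latest", "messages": [{"role": "user", "content": "Hello"}]}'
+```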
-
-
- ### Kimi Conversational Models
- Moonshot's flagship conversational AI models, known for natural interactions.
+## Supported Models
- - **Kimi Latest**: Most advanced conversational model
- - **Kimi K2 Preview**: Advanced reasoning and analysis
- - **Kimi K2 Turbo**: Fast and efficient for quick tasks
- - **Kimi Thinking**: Enhanced reasoning capabilities
+- **Kimi series:** Kimi Latest, K2 Preview, K2 Turbo, Kimi Thinking
+- **Moonshot v1:** 8K, 32K, 128K context variants
+- **Vision models:** Moonshot v1 Vision (8K, 32K, 128K)
-
+## Features
-
- ### Moonshot Vision Models
- Multimodal models with vision capabilities and large context windows.
+- **Chinese language:** Specialized for Chinese text processing
+- **Multimodal:** Vision and text understanding
+- **Large context:** Up to 128K tokens
+- **Cultural intelligence:** Chinese cultural context understanding
- - **Moonshot v1 8K**: Basic model with 8K context
- - **Moonshot v1 32K**: Extended context for longer conversations
- - **Moonshot v1 128K**: Large context for complex tasks
- - **Moonshot v1 Auto**: Adaptive model selection
+## Notes
-
-
-
- ### Multimodal Vision Series
- Models capable of understanding both text and images.
-
- - **Moonshot v1 8K Vision**: Basic multimodal capabilities
- - **Moonshot v1 32K Vision**: Extended context with vision
- - **Moonshot v1 128K Vision**: Large context multimodal model
-
-
-
-
-## Setup Instructions
-
-
- Visit [Moonshot Platform](https://platform.moonshot.ai/) and create an account
- Navigate to the API console in your dashboard
- Create a new API key for accessing the models
- Add your Moonshot API key to the provider settings
- Try different models to find the best fit for your needs
-
-
-## Key Features
-
-
- Chinese Language
- Multimodal
- Large Context
- Conversational AI
- Cultural Understanding
-
-
-### Platform Advantages
-
-- **Chinese Language Excellence**: Superior performance in Chinese text processing
-- **Multimodal Capabilities**: Vision and text understanding combined
-- **Large Context Windows**: Handle complex, lengthy conversations
-- **Cultural Intelligence**: Better understanding of Chinese cultural context
-- **Competitive Pricing**: Cost-effective for Chinese language AI
-
-## Use Cases
-
-
-
- ### Chinese Language Applications
- Perfect for Chinese language content creation and processing.
-
- - Chinese content generation and translation
- - Cultural content creation
- - Chinese business communication
- - Educational content in Chinese
-
-
-
-
- ### Chat and Interaction
- Excellent for natural conversational interfaces.
-
- - Customer service chatbots
- - Personal assistants
- - Interactive learning systems
- - Social conversation applications
-
-
-
-
- ### Vision and Text
- Combine image understanding with text processing.
-
- - Image description and analysis
- - Visual content creation
- - Document processing with images
- - Multimodal content generation
-
-
-
-
- ### Enterprise Use
- Suitable for business applications requiring Chinese language support.
-
- - Chinese market analysis
- - International business communication
- - Cross-cultural content creation
- - Multilingual customer support
-
-
-
-
-## Pricing Information
-
-Moonshot offers flexible pricing based on model usage and context length:
-
-
-
-
-
-
-
-
-
- **Context Pricing**: Pricing scales with context window size. Larger context models cost more per token.
-
-
-
- **Model Selection**: Use Kimi models for conversational tasks and Moonshot v1 models for complex reasoning or
- multimodal work.
-
-
-
- **Language Optimization**: While Moonshot models excel at Chinese, they also perform well in English for general
- tasks.
-
-
-## Model Capabilities
-
-
-
- ### Multilingual Performance
- Strong capabilities across multiple languages with Chinese specialization.
-
- - **Primary Language**: Chinese (Mandarin, simplified/traditional)
- - **Secondary Languages**: English, Japanese, Korean
- - **Cultural Context**: Deep understanding of Chinese culture and context
- - **Code Switching**: Natural switching between languages
-
-
-
-
- ### Context Window Management
- Different models offer varying context window sizes for different needs.
-
- - **Short Context (8K)**: Quick interactions, simple tasks
- - **Medium Context (32K)**: Complex conversations, document analysis
- - **Long Context (128K)**: Large documents, extended conversations
- - **Adaptive Models**: Automatic context optimization
-
-
-
-
- ### Advanced Capabilities
- Unique features that set Moonshot models apart.
-
- - **Cultural Intelligence**: Understanding of Chinese cultural nuances
- - **Vision Integration**: Image understanding and description
- - **Reasoning Enhancement**: Improved logical reasoning in Chinese contexts
- - **Conversational Memory**: Better context retention in conversations
-
-
-
+- **Pricing:** Scales with context window size
+- **Languages:** Excellent for Chinese, strong English support
diff --git a/providers/ollama.mdx b/providers/ollama.mdx
index b7b4f14..b6a1f39 100644
--- a/providers/ollama.mdx
+++ b/providers/ollama.mdx
@@ -1,79 +1,46 @@
---
title: "Ollama"
-description: "Set up Ollama to run AI models locally with CodinIT for enhanced privacy, offline access, and complete control over your development."
+description: "Run AI models locally with Ollama for privacy and offline access."
---
-CodinIT supports running models locally using Ollama. This approach offers privacy, offline access, and potentially reduced costs. It requires some initial setup and a sufficiently powerful computer. Because of the present state of consumer hardware, it's not recommended to use Ollama with CodinIT as performance will likely be poor for average hardware configurations.
+Run models locally using Ollama for privacy, offline access, and control. Requires initial setup and sufficient hardware.
**Website:** [https://ollama.com/](https://ollama.com/)
-### Setting up Ollama
+## Setup
-1. **Download and Install Ollama:**
- Obtain the Ollama installer for your operating system from the [Ollama website](https://ollama.com/) and follow their installation guide. Ensure Ollama is running. You can typically start it with:
+1. **Install Ollama:** Download from [ollama.com](https://ollama.com/) and install
+2. **Start Ollama:** Run `ollama serve` in a terminal
+3. **Download a model:**
+ ```bash
+ ollama pull qwen2.5-coder:32b
+ ```
+4. **Configure context window:**
+ ```bash
+ ollama run qwen2.5-coder:32b
+ /set parameter num_ctx 32768
+ /save your_custom_model_name
+ ```
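+
+You can confirm Ollama is running and that your saved custom model is available by querying its local API:
+
+```bash
+curl -s http://localhost:11434/api/tags
+# Returns JSON listing every locally available model
+```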
- ```bash
- ollama serve
- ```
+## Configuration in CodinIT
-2. **Download a Model:**
- Ollama supports a wide variety of models. A list of available models can be found on the [Ollama model library](https://ollama.com/library). Some models recommended for coding tasks include:
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "ollama" as the API Provider
+3. Enter your saved model name
+4. (Optional) Set the base URL if Ollama runs on a different machine or port (default: `http://localhost:11434`)
- - `codellama:7b-code` (a good, smaller starting point)
- - `codellama:13b-code` (offers better quality, larger size)
- - `codellama:34b-code` (provides even higher quality, very large)
- - `qwen2.5-coder:32b`
- - `mistralai/Mistral-7B-Instruct-v0.1` (a solid general-purpose model)
- - `deepseek-coder:6.7b-base` (effective for coding)
- - `llama3:8b-instruct-q5_1` (suitable for general tasks)
+## Recommended Models
- To download a model, open your terminal and execute:
+- `qwen2.5-coder:32b` - Excellent for coding
+- `codellama:34b-code` - High quality, large size
+- `deepseek-coder:6.7b-base` - Effective for coding
+- `llama3:8b-instruct-q5_1` - General tasks
- ```bash
- ollama pull
- ```
+See the [Ollama model library](https://ollama.com/library) for the full list.
- For instance:
+## Notes
- ```bash
- ollama pull qwen2.5-coder:32b
- ```
-
-3. **Configure the Model's Context Window:**
- By default, Ollama models often use a context window of 2048 tokens, which can be insufficient for many CodinIT requests. A minimum of 12,000 tokens is advisable for decent results, with 32,000 tokens being ideal. To adjust this, you'll modify the model's parameters and save it as a new version.
-
- First, load the model (using `qwen2.5-coder:32b` as an example):
-
- ```bash
- ollama run qwen2.5-coder:32b
- ```
-
- Once the model is loaded within the Ollama interactive session, set the context size parameter:
-
- ```
- /set parameter num_ctx 32768
- ```
-
- Then, save this configured model with a new name:
-
- ```
- /save your_custom_model_name
- ```
-
- (Replace `your_custom_model_name` with a name of your choice.)
-
-4. **Configure CodinIT:**
- - Open the CodinIT sidebar (usually indicated by the CodinIT icon).
- - Click the settings gear icon (⚙️).
- - Select "ollama" as the API Provider.
- - Enter the Model name you saved in the previous step (e.g., `your_custom_model_name`).
- - (Optional) Adjust the base URL if Ollama is running on a different machine or port. The default is `http://localhost:11434`.
- - (Optional) Configure the Model context size in CodinIT's Advanced settings. This helps CodinIT manage its context window effectively with your customized Ollama model.
-
-### Tips and Notes
-
-- **Resource Demands:** Running large language models locally can be demanding on system resources. Ensure your computer meets the requirements for your chosen model.
-- **Model Choice:** Experiment with various models to discover which best fits your specific tasks and preferences.
-- **Offline Capability:** After downloading a model, you can use CodinIT with that model even without an internet connection.
-- **Token Usage Tracking:** CodinIT tracks token usage for models accessed via Ollama, allowing you to monitor consumption.
-- **Ollama's Own Documentation:** For more detailed information, consult the official [Ollama documentation](https://ollama.com/docs).
\ No newline at end of file
+- **Context window:** Minimum 12,000 tokens recommended, 32,000 ideal
+- **Resource demands:** Large models require significant system resources
+- **Offline capability:** Works without internet after model download
+- **Performance:** May be slow on average hardware
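To reach the 32,000-token context noted above, you can raise `num_ctx` inside an interactive Ollama session and save the result as a new model; the model and saved name below are examples:

```
ollama run qwen2.5-coder:32b     # load the model, opens a >>> prompt
/set parameter num_ctx 32768     # raise context from the 2048 default
/save qwen2.5-coder-32k          # save under a new model name
```

Enter the saved name (here `qwen2.5-coder-32k`) as the model name in CodinIT.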
diff --git a/providers/openai-like.mdx b/providers/openai-like.mdx
index 8085cad..3a95395 100644
--- a/providers/openai-like.mdx
+++ b/providers/openai-like.mdx
@@ -1,279 +1,55 @@
---
title: OpenAI Compatible
-description: Connect to any OpenAI-compatible API endpoint including custom deployments, self-hosted models, and alternative AI services.
+description: Connect to any OpenAI-compatible API endpoint including custom deployments and self-hosted models.
---
-The OpenAI Compatible provider allows you to connect to any service that implements the OpenAI API specification, including custom deployments, alternative providers, and self-hosted models that maintain API compatibility.
+Connect to any service that implements the OpenAI API specification.
-## Overview
+## Configuration
-This flexible provider enables integration with any OpenAI-compatible API, making it easy to use custom AI deployments, alternative hosting services, or self-hosted models that follow the OpenAI API standard.
+Set these environment variables:
+- `OPENAI_LIKE_API_BASE_URL` - Your API endpoint URL
+- `OPENAI_LIKE_API_KEY` - Authentication token
+- `OPENAI_LIKE_API_MODELS` (optional) - Manual model list in format: `model1:limit;model2:limit`
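To illustrate the `model1:limit;model2:limit` format, here is a small parser sketch; the environment variable name comes from the docs above, while the parsing logic is an assumption about how such a list could be consumed:

```python
import os

def parse_model_list(spec: str) -> dict[str, int]:
    """Parse 'model1:limit;model2:limit' into {model_name: token_limit}."""
    models = {}
    for entry in filter(None, spec.split(";")):
        # rpartition keeps colons inside model names (e.g. 'codellama:13b-code')
        name, _, limit = entry.rpartition(":")
        models[name] = int(limit)
    return models

# Example using the documented format as a fallback:
spec = os.environ.get("OPENAI_LIKE_API_MODELS", "gpt-4:8000;claude-3:4000;llama-2:2000")
print(parse_model_list(spec))
```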
-
-
- Works with any OpenAI-compatible API
-
-
- Connect to self-hosted or custom AI services
-
-
- Highly customizable connection settings
-
-
+## Setup
-## How It Works
-
-
-
- ### OpenAI Standard
- Connects to services that implement the OpenAI API specification.
-
- - **Standard Endpoints**: Uses familiar `/chat/completions` and `/models` endpoints
- - **Compatible Formats**: Supports standard OpenAI request/response formats
- - **Authentication**: Uses Bearer token authentication like OpenAI
- - **Streaming Support**: Compatible with streaming responses
-
-
-
-
- ### Setup Flexibility
- Highly configurable to work with different API providers.
-
- - **Custom Base URL**: Specify any API endpoint URL
- - **API Key**: Configure authentication tokens
- - **Model List**: Define available models manually or auto-discover
- - **Environment Variables**: Support for different deployment environments
-
-
-
-
- ### Dynamic Model Loading
- Automatically discovers available models from compatible APIs.
-
- - **API Query**: Fetches model list from `/models` endpoint
- - **Fallback Configuration**: Manual model specification if API discovery fails
- - **Model Parsing**: Intelligent model name and capability detection
- - **Real-time Updates**: Reflects current API capabilities
-
-
-
-
-## Setup Instructions
-
-
- Determine the base URL of your OpenAI-compatible API service
- Get the authentication token or API key for the service
- Set the required environment variables in your deployment
- Verify the API endpoint and authentication work correctly
- Set up model list either through API discovery or manual configuration
-
-
-## Configuration Options
-
-
-
- ### Required Settings
- Environment variables needed for OpenAI-compatible provider setup.
-
- - **OPENAI_LIKE_API_BASE_URL**: The base URL of your API service
- - **OPENAI_LIKE_API_KEY**: Authentication token for API access
- - **OPENAI_LIKE_API_MODELS** (optional): Manual model specification
-
-
-
-
- ### Model Specification
- How to manually specify models when API discovery is not available.
-
- - **Format**: `model1:limit;model2:limit;model3:limit`
- - **Example**: `gpt-4:8000;claude-3:4000;llama-2:2000`
- - **Token Limits**: Specify context window limits per model
- - **Naming**: Use clear, descriptive model names
-
-
-
-
- ### Container Deployment
- Special considerations for Docker and containerized deployments.
-
- - **Network Access**: Ensure API endpoints are accessible from containers
- - **Environment Variables**: Pass configuration through Docker environment
- - **Volume Mounting**: Mount configuration files if needed
- - **Service Discovery**: Use container networking for service communication
-
-
-
-
-## Use Cases
-
-
-
- ### Local AI Deployment
- Connect to locally hosted AI models and services.
-
- - Local LLM deployments (Ollama, LM Studio, etc.)
- - Custom model servers
- - Private AI infrastructure
- - Development environments
-
-
-
-
- ### Third-Party Services
- Integrate with alternative AI providers using OpenAI compatibility.
-
- - Alternative hosting services
- - Specialized AI providers
- - Regional AI services
- - Custom AI platforms
-
-
-
-
- ### Corporate AI
- Connect to enterprise AI deployments and private clouds.
-
- - Corporate AI infrastructure
- - Private cloud deployments
- - On-premises AI services
- - Hybrid cloud setups
-
-
-
-
- ### Development and Testing
- Useful for development, testing, and prototyping scenarios.
-
- - Local development servers
- - Staging environment testing
- - API compatibility testing
- - Mock AI services for development
-
-
-
+1. Identify your OpenAI-compatible API endpoint
+2. Obtain the API key or authentication token
+3. Set environment variables in your deployment
+4. Test the connection
+5. Configure available models
## Compatible Services
-
-
- ### Desktop Applications
- Popular local AI tools that provide OpenAI-compatible APIs.
-
- - **LM Studio**: Local model server with web UI
- - **Ollama**: Command-line tool for running models locally
- - **LocalAI**: Self-hosted OpenAI-compatible API
- - **Text Generation WebUI**: Local web interface for models
-
-
-
-
- ### Alternative Cloud Providers
- Cloud services that offer OpenAI-compatible APIs.
-
- - **Together AI**: Open-source model hosting
- - **Replicate**: Model deployment platform
- - **Modal**: Serverless model inference
- - **Anthropic-compatible services**: Alternative Claude hosting
-
-
-
-
- ### Custom AI Services
- Self-hosted or custom AI service deployments.
-
- - **vLLM**: High-performance LLM serving
- - **TGI (Text Generation Inference)**: Optimized text generation
- - **FastChat**: Open-source chat platform
- - **Custom model servers**: Your own AI service implementations
-
-
-
-
-## Troubleshooting
-
-
-
- ### API Connectivity
- Common connection and authentication problems.
-
- - **Network Access**: Verify API endpoint is reachable
- - **Authentication**: Check API key validity and format
- - **CORS Issues**: Ensure proper cross-origin headers
- - **SSL/TLS**: Verify certificate validity for HTTPS endpoints
+**Local AI Tools:**
+- LM Studio
+- Ollama
+- LocalAI
+- Text Generation WebUI
-
+**Cloud Alternatives:**
+- Together AI
+- Replicate
+- Modal
+- Custom deployments
-
- ### Model Loading Issues
- Problems with model list retrieval and configuration.
+**Self-Hosted:**
+- vLLM
+- TGI (Text Generation Inference)
+- FastChat
+- Custom model servers
- - **API Endpoint**: Verify `/models` endpoint exists and works
- - **Authentication**: Ensure proper API key for model discovery
- - **Manual Configuration**: Use environment variable fallback
- - **Model Format**: Check model ID format and naming conventions
-
-
-
-
- ### Speed and Reliability
- Addressing performance and reliability concerns.
-
- - **Response Times**: Check network latency to API endpoint
- - **Rate Limits**: Monitor API rate limiting and quotas
- - **Model Size**: Consider model size vs. available resources
- - **Caching**: Implement response caching for repeated queries
-
-
-
-
-
- **Compatibility Check**: Always verify that your target service implements the OpenAI API specification correctly,
- including proper request/response formats and authentication.
-
-
-
- **Testing Strategy**: Start with simple requests to verify connectivity, then test model discovery, and finally test
- actual model inference before full deployment.
-
-
-
- **Security Considerations**: Ensure your API keys are properly secured and that the API endpoint uses HTTPS for secure
- communication.
-
-
-## Advanced Configuration
-
-
-
- ### Additional Headers
- Configure custom headers for special API requirements.
-
- - **Authorization Variants**: Different authentication header formats
- - **API Version Headers**: Specify API version requirements
- - **Custom Metadata**: Service-specific header requirements
- - **Rate Limiting**: Custom rate limit headers
-
-
-
-
- ### Network Proxies
- Configure proxy settings for restricted network environments.
-
- - **HTTP Proxies**: Route API calls through proxy servers
- - **Corporate Networks**: Work within enterprise network restrictions
- - **VPN Requirements**: Handle VPN-dependent API access
- - **Load Balancing**: Distribute requests across multiple endpoints
-
-
+## Use Cases
-
- ### Observability
- Integrate with monitoring and logging systems.
+- Self-hosted models and services
+- Alternative AI providers
+- Enterprise private deployments
+- Development and testing environments
- - **Request Logging**: Track API usage and performance
- - **Error Monitoring**: Capture and analyze API errors
- - **Usage Analytics**: Monitor token consumption and costs
- - **Health Checks**: Implement API endpoint health monitoring
+## Notes
-
-
+- Verify API implements OpenAI specification correctly
+- Ensure HTTPS for secure communication
+- Test with simple requests first
+- Use manual model configuration if auto-discovery fails
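A simple first request is to list models. The sketch below builds an authenticated call to the OpenAI-style `/models` endpoint using the environment variables above; the default URL and key are placeholders, and the send step is left commented so you can review it first:

```python
import json
import urllib.request
import os

# Placeholders; in a real deployment these come from the environment
# variables described above.
base_url = os.environ.get("OPENAI_LIKE_API_BASE_URL", "http://localhost:8000/v1")
api_key = os.environ.get("OPENAI_LIKE_API_KEY", "sk-example")

# The OpenAI spec exposes model discovery at GET {base}/models
# with Bearer authentication.
req = urllib.request.Request(
    f"{base_url.rstrip('/')}/models",
    headers={"Authorization": f"Bearer {api_key}"},
)

# Uncomment to actually hit the endpoint:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(json.dumps(json.load(resp), indent=2))
print(req.full_url)
```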
diff --git a/providers/openai.mdx b/providers/openai.mdx
index 4e90342..759228d 100644
--- a/providers/openai.mdx
+++ b/providers/openai.mdx
@@ -1,47 +1,37 @@
---
title: "OpenAI"
-description: "Configure and use official OpenAI models including GPT-5, o3, and o4-mini with CodinIT for advanced reasoning and code generation."
+description: "Configure OpenAI models including GPT-5, o3, and o4-mini with CodinIT."
---
-CodinIT supports accessing models directly through the official OpenAI API.
-
**Website:** [https://openai.com/](https://openai.com/)
-### Getting an API Key
-
-1. **Sign Up/Sign In:** Visit the [OpenAI Platform](https://platform.openai.com/). You'll need to create an account or sign in if you already have one.
-2. **Navigate to API Keys:** Once logged in, go to the [API keys section](https://platform.openai.com/api-keys) of your account.
-3. **Create a Key:** Click on "Create new secret key". It's good practice to give your key a descriptive name (e.g., "CodinIT API Key").
-4. **Copy the Key:** **Crucial:** Copy the generated API key immediately. For security reasons, OpenAI will not show it to you again. Store this key in a safe and secure location.
+## Getting an API Key
-### Supported Models
+1. Visit [OpenAI Platform](https://platform.openai.com/) and sign in
+2. Go to [API keys](https://platform.openai.com/api-keys)
+3. Click "Create new secret key" and name it (e.g., "CodinIT")
+4. Copy the key immediately - you won't see it again
-CodinIT is compatible with a variety of OpenAI models, including but not limited to:
+## Configuration
-- 'o3'
-- `o3-mini` (medium reasoning effort)
-- 'o4-mini'
-- `o3-mini-high` (high reasoning effort)
-- `o3-mini-low` (low reasoning effort)
-- `o1`
-- `o1-preview`
-- `o1-mini`
-- `GPT-5o`
-- `GPT-5o-mini`
-- 'GPT-5.1'
-- 'GPT-5.1-mini'
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "OpenAI" as the API Provider
+3. Paste your API key
+4. Choose your model
-For the most current list of available models and their capabilities, please refer to the official [OpenAI Models documentation](https://platform.openai.com/models).
+## Supported Models
-### Configuration in CodinIT
+- `GPT-5o`
+- `GPT-5.1`
+- `o3`
+- `o3-mini` (medium reasoning)
+- `o4-mini`
+- `o1`
+- `o1-mini`
-1. **Open CodinIT Settings:** Click the settings gear icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "OpenAI" from the "API Provider" dropdown menu.
-3. **Enter API Key:** Paste your OpenAI API key into the "OpenAI API Key" field.
-4. **Select Model:** Choose your desired model from the "Model" dropdown list.
-5. **(Optional) Base URL:** If you need to use a proxy or a custom base URL for the OpenAI API, you can enter it here. Most users will not need to change this from the default.
+See [OpenAI Models documentation](https://platform.openai.com/models) for full details.
-### Tips and Notes
+## Notes
-- **Pricing:** Be sure to review the [OpenAI Pricing page](https://openai.com/pricing) for detailed information on the costs associated with different models.
-- **Azure OpenAI Service:** If you are looking to use the Azure OpenAI service, please note that specific documentation for Azure OpenAI with CodinIT may be found separately, or you might need to configure it as an OpenAI-compatible endpoint if such functionality is supported by CodinIT for custom configurations.
\ No newline at end of file
+- **Pricing:** See [OpenAI Pricing](https://openai.com/pricing)
+- **Azure OpenAI:** Configure it as an OpenAI-compatible endpoint if needed
\ No newline at end of file
diff --git a/providers/openrouter.mdx b/providers/openrouter.mdx
index 4212d2b..01a885e 100644
--- a/providers/openrouter.mdx
+++ b/providers/openrouter.mdx
@@ -1,40 +1,35 @@
---
title: "OpenRouter"
-description: "Learn how to use OpenRouter with CodinIT to access a wide variety of language models through a single API."
+description: "Access multiple AI models through a unified API with OpenRouter."
---
-OpenRouter is an AI platform that provides access to a wide variety of language models from different providers, all through a single API. This can simplify setup and allow you to easily experiment with different models.
+OpenRouter provides access to models from multiple providers through a single API.
**Website:** [https://openrouter.ai/](https://openrouter.ai/)
-### Getting an API Key
+## Getting an API Key
-1. **Sign Up/Sign In:** Go to the [OpenRouter website](https://openrouter.ai/). Sign in with your Google or GitHub account.
-2. **Get an API Key:** Go to the [keys page](https://openrouter.ai/keys). You should see an API key listed. If not, create a new key.
-3. **Copy the Key:** Copy the API key.
+1. Go to [OpenRouter](https://openrouter.ai/) and sign in with Google or GitHub
+2. Navigate to the [keys page](https://openrouter.ai/keys)
+3. Copy your API key (or create a new one)
-### Supported Models
+## Configuration
-OpenRouter supports a large and growing number of models. CodinIT automatically fetches the list of available models. Refer to the [OpenRouter Models page](https://openrouter.ai/models) for the complete and up-to-date list.
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "OpenRouter" as the API Provider
+3. Paste your API key
+4. Choose your model
-### Configuration in CodinIT
+## Supported Models
-1. **Open CodinIT Settings:** Click the settings icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "OpenRouter" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your OpenRouter API key into the "OpenRouter API Key" field.
-4. **Select Model:** Choose your desired model from the "Model" dropdown.
-5. **(Optional) Custom Base URL:** If you need to use a custom base URL for the OpenRouter API, check "Use custom base URL" and enter the URL. Leave this blank for most users.
+CodinIT automatically fetches available models. See [OpenRouter Models](https://openrouter.ai/models) for the complete list.
-### Supported Transforms
+## Features
-OpenRouter provides an [optional "middle-out" message transform](https://openrouter.ai/features/message-transforms) to help with prompts that exceed the maximum context size of a model. You can enable it by checking the "Compress prompts and message chains to the context size" box.
+- **Message transforms:** Enable "Compress prompts and message chains to the context size" to apply OpenRouter's optional middle-out transform when prompts exceed a model's context window
+- **Prompt caching:** Caching requests are passed through automatically to models that support it
+- **Gemini caching:** For Gemini models you must manually check "Enable Prompt Caching" in the provider settings; other models do not need this
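The message-transform checkbox maps to a field in OpenRouter's request body. A sketch, shown only to illustrate what the setting corresponds to (CodinIT sets this for you):

```python
# Minimal OpenRouter chat request body with the middle-out transform
# enabled; field names follow OpenRouter's API documentation.
payload = {
    "model": "anthropic/claude-sonnet-4.5",
    "messages": [{"role": "user", "content": "Summarize this file."}],
    "transforms": ["middle-out"],  # compress oversized prompts to fit context
}
print(sorted(payload))
```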
-### Tips and Notes
+## Notes
-- **Model Selection:** OpenRouter offers a wide range of models. Experiment to find the best one for your needs.
-- **Pricing:** OpenRouter charges based on the underlying model's pricing. See the [OpenRouter Models page](https://openrouter.ai/models) for details.
-- **Prompt Caching:**
- - OpenRouter passes caching requests to underlying models that support it. Check the [OpenRouter Models page](https://openrouter.ai/models) to see which models offer caching.
- - For most models, caching should activate automatically if supported by the model itself (similar to how Requesty works).
- - **Exception for Gemini Models via OpenRouter:** Due to potential response delays sometimes observed with Google's caching mechanism when accessed via OpenRouter, a manual activation step is required _specifically for Gemini models_.
- - If using a **Gemini model** via OpenRouter, you **must manually check** the "Enable Prompt Caching" box in the provider settings to activate caching for that model. This checkbox serves as a temporary workaround. For non-Gemini models on OpenRouter, this checkbox is not necessary for caching.
\ No newline at end of file
+- **Pricing:** Based on underlying model pricing. See [OpenRouter Models](https://openrouter.ai/models)
\ No newline at end of file
diff --git a/providers/perplexity.mdx b/providers/perplexity.mdx
index 040ce2f..b0f5ffb 100644
--- a/providers/perplexity.mdx
+++ b/providers/perplexity.mdx
@@ -1,212 +1,39 @@
---
title: Perplexity
-description: Access Perplexity's Sonar AI models with built-in web search, real-time data access, and source citations for research-focused tasks.
+description: Configure Perplexity's Sonar models with integrated web search for research tasks.
---
-Perplexity provides AI models with integrated search capabilities, allowing models to access real-time information and provide more accurate, up-to-date responses based on current web data.
+**Website:** [https://www.perplexity.ai/](https://www.perplexity.ai/)
-## Overview
+## Getting an API Key
-Perplexity combines advanced language models with web search functionality, enabling AI to provide responses based on the latest available information. Their Sonar models are specifically designed for research, analysis, and knowledge-intensive tasks.
+1. Go to [Perplexity AI](https://www.perplexity.ai/) and sign in
+2. Navigate to Settings > API
+3. Create a new API key
+4. Copy the key immediately
-
-
- Access current web information and data
-
-
- Specialized for analysis and research tasks
-
-
- Sources and references for information
-
-
+## Configuration
-## Available Models
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Perplexity" as the API Provider
+3. Paste your API key
+4. Choose your model
-
-
- ### Sonar Base Models
- Standard models with integrated search capabilities.
+## Supported Models
- - **Sonar**: Basic model with web search integration
- - **Sonar Pro**: Enhanced model with advanced search features
- - **Best for**: General research, current events, fact-checking
+- `sonar` - Basic model with web search
+- `sonar-pro` - Enhanced search features
+- `sonar-reasoning` - Advanced reasoning with search
+- `sonar-reasoning-pro` - Professional-grade reasoning
-
+## Features
-
- ### Sonar Reasoning Models
- Advanced models with enhanced reasoning and analysis capabilities.
+- **Web search:** Integrated real-time web search
+- **Source citations:** References for all information
+- **Real-time data:** Access to current information
+- **Research focused:** Optimized for research tasks
- - **Sonar Reasoning**: Advanced reasoning with search integration
- - **Sonar Reasoning Pro**: Professional-grade reasoning and research
- - **Best for**: Complex analysis, academic research, technical problems
+## Notes
-
-
-
-## Setup Instructions
-
-
- Visit [Perplexity AI](https://www.perplexity.ai/) and create an account
- Navigate to Settings > API in your Perplexity account
- Create a new API key for model access
- Add your Perplexity API key to the provider settings
- Try queries that require current information to test search capabilities
-
-
-## Key Features
-
-
- Web Search
- Real-time Data
- Source Citations
- Research Focused
- Fact Checking
-
-
-### Platform Advantages
-
-- **Integrated Search**: Models can access current web information automatically
-- **Source Transparency**: Citations and references for all information provided
-- **Real-time Updates**: Access to the latest news, data, and developments
-- **Research Enhancement**: Better performance on research and analytical tasks
-- **Fact Verification**: Cross-referencing information for accuracy
-
-## Use Cases
-
-
-
- ### Academic Research
- Perfect for research tasks requiring current and accurate information.
-
- - Academic research and literature review
- - Current events analysis
- - Market research and trends
- - Scientific paper analysis
- - Competitive intelligence
-
-
-
-
- ### Business Applications
- Excellent for business research and decision-making.
-
- - Market analysis and reporting
- - Industry trend monitoring
- - Competitive analysis
- - Business intelligence gathering
- - Strategic planning support
-
-
-
-
- ### News & Information
- Ideal for staying updated with current events and information.
-
- - News analysis and summarization
- - Breaking news monitoring
- - Event impact assessment
- - Real-time information queries
- - Fact-checking and verification
-
-
-
-
- ### Technical Analysis
- Suitable for technical research and problem-solving.
-
- - Technical documentation research
- - API and tool research
- - Technology trend analysis
- - Development resource discovery
- - Technical problem-solving
-
-
-
-
-## Pricing Information
-
-Perplexity offers straightforward pricing based on usage:
-
-
-
-
-
-
-
-
-
- **Search Costs**: Web search functionality is included in the token pricing. No additional search fees.
-
-
-
- **Model Selection**: Use Sonar Pro for most applications. Choose Sonar Reasoning models for complex analytical tasks.
-
-
-
- **Rate Limits**: Perplexity implements rate limits based on your account tier. Monitor usage to avoid interruptions.
-
-
-## Search Integration
-
-
-
- ### Search Mechanism
- Understanding how Perplexity integrates search with AI responses.
-
- - **Automatic Search**: Models search the web when needed for current information
- - **Source Selection**: Chooses reliable sources and recent information
- - **Citation System**: Provides links and references for all information
- - **Fact Verification**: Cross-references information for accuracy
-
-
-
-
- ### Search Features
- Advanced search capabilities built into the models.
-
- - **Real-time Data**: Access to current news, prices, and statistics
- - **Comprehensive Coverage**: Searches across multiple reliable sources
- - **Quality Filtering**: Prioritizes high-quality, authoritative sources
- - **Freshness Priority**: Emphasizes recent and up-to-date information
-
-
-
-
- ### Source Transparency
- How Perplexity provides transparency in its responses.
-
- - **Source Links**: Direct links to original sources
- - **Citation Markers**: Clear indicators of cited information
- - **Source Quality**: Indicators of source reliability and recency
- - **Verification**: Cross-referenced information for accuracy
-
-
-
-
-## Best Practices
-
-
-
- ### Query Optimization
- Tips for getting the best results from Perplexity models.
-
- - **Be Specific**: Clear, specific questions get better results
- - **Include Context**: Provide background information when relevant
- - **Specify Timeframes**: Mention if you need current vs. historical information
- - **Request Citations**: Ask for sources when accuracy is critical
-
-
-
-
- ### Research Strategies
- Effective ways to use Perplexity for research tasks.
-
- - **Iterative Refinement**: Start broad, then narrow down queries
- - **Cross-Verification**: Use multiple queries to verify information
- - **Source Evaluation**: Check the quality and recency of sources
- - **Follow-up Questions**: Ask for clarification or additional details
-
-
-
+- **Pricing:** Search functionality included in token pricing
+- **Use cases:** Research, fact-checking, current events
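Perplexity pairs OpenAI-style response fields with source citations. The sketch below shows one way to read both out of a decoded response; the exact response shape is an assumption based on the features above, so verify it against Perplexity's API reference:

```python
# Stand-in for a decoded JSON response from the Perplexity API.
response = {
    "choices": [{"message": {"content": "Example answer."}}],
    "citations": ["https://example.com/source"],
}

answer = response["choices"][0]["message"]["content"]
sources = response.get("citations", [])  # citations may be absent
for i, url in enumerate(sources, 1):
    print(f"[{i}] {url}")
```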
diff --git a/providers/togetherai.mdx b/providers/togetherai.mdx
index b6b1ab2..775005b 100644
--- a/providers/togetherai.mdx
+++ b/providers/togetherai.mdx
@@ -1,251 +1,41 @@
---
title: Together AI
-description: Access hundreds of open-source AI models including Llama, Mistral, and Qwen through Together's optimized inference platform.
+description: Access hundreds of open-source AI models through Together's optimized platform.
---
+**Website:** [https://api.together.xyz/](https://api.together.xyz/)
-Together AI provides access to a comprehensive collection of open-source AI models from leading research organizations, offering high-performance inference with competitive pricing and extensive model variety.
+## Getting an API Key
-## Overview
+1. Go to [Together AI](https://api.together.xyz/) and sign in
+2. Navigate to Settings > API Keys
+3. Create a new API key
+4. Copy the key immediately
-Together AI serves as a marketplace for open-source AI models, providing access to cutting-edge models from Meta, Mistral, Google, and other leading AI research labs. Their platform offers both static model access and dynamic model discovery.
+## Configuration
-
-
- Access to hundreds of open-source models
-
-
- Optimized infrastructure for fast inference
-
-
- Latest models from top AI research labs
-
-
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "Together AI" as the API Provider
+3. Paste your API key
+4. Choose your model
## Popular Models
-
-
- ### Llama Models
- Industry-leading open-source models from Meta.
+- **Llama series:** Llama 3.1 70B, Llama 3.2 90B Vision
+- **Mistral series:** Mixtral 8x7B, Devstral Small, Magistral Small
+- **Google series:** Gemma 3 (27B, 12B, 4B, 1B)
+- **Coding models:** Qwen3-Coder 480B, Arcee AI Coder
+- **Reasoning models:** Kimi K2 Thinking, DeepSeek-V3.2-Exp, Cogito V1
- - **Llama 3.1 70B**: Powerful general-purpose model
- - **Llama 3.2 90B Vision**: Multimodal model with vision capabilities
- - **Best for**: Advanced reasoning, creative tasks, multimodal applications
+## Features
-
+- **Model variety:** Hundreds of open-source models
+- **Research access:** Latest models from top labs
+- **High performance:** Optimized inference infrastructure
+- **Dynamic discovery:** Automatic model catalog updates
+- **Cost effective:** Flexible usage-based pricing
-
- ### Mistral AI Series
- Efficient and capable models from Mistral AI.
+## Notes
- - **Mixtral 8x7B**: High-performance mixture-of-experts model
- - **Devstral Small**: Specialized coding model
- - **Magistral Small**: Advanced reasoning model
- - **Best for**: Efficient inference, specialized tasks
-
-
-
-
- ### Google/Gemma Series
- Lightweight and capable models from Google.
-
- - **Gemma 3 27B**: Advanced general-purpose model
- - **Gemma 3 12B/4B/1B**: Range of model sizes for different needs
- - **Best for**: Balanced performance, research applications
-
-
-
-
- ### Specialized Coding
- Models optimized for programming and technical tasks.
-
- - **Qwen3-Coder 480B**: Massive coding model
- - **Arcee AI Coder**: Specialized coding assistant
- - **Best for**: Code generation, debugging, technical writing
-
-
-
-
- ### Advanced Reasoning
- Models with enhanced reasoning and thinking capabilities.
-
- - **Kimi K2 Thinking**: Advanced thinking model
- - **DeepSeek-V3.2-Exp**: Experimental reasoning model
- - **Cogito V1 Preview**: Best-in-class reasoning
- - **Best for**: Complex problem-solving, analysis
-
-
-
-
-## Setup Instructions
-
-
- Visit [Together AI](https://api.together.xyz/) and create an account
- Navigate to Settings > API Keys in your dashboard
- Create a new API key for model access
- Add your Together API key to the provider settings
- Browse available models and test different options
-
-
-## Key Features
-
-
- Open Source
- Model Variety
- High Performance
- Dynamic Discovery
- Cost Effective
-
-
-### Platform Advantages
-
-- **Extensive Catalog**: Access to hundreds of open-source models
-- **Research Access**: Latest models from top AI research institutions
-- **Performance Optimized**: Fast inference with optimized infrastructure
-- **Flexible Pricing**: Pay only for what you use
-- **Regular Updates**: New models added frequently
-
-## Use Cases
-
-
-
- ### AI Research
- Perfect for researchers and developers exploring different models.
-
- - Model comparison and evaluation
- - Research prototyping
- - Algorithm testing
- - Performance benchmarking
-
-
-
-
- ### Software Development
- Excellent for coding and technical development work.
-
- - Code generation and completion
- - Technical documentation
- - API development
- - Debugging assistance
-
-
-
-
- ### Creative Applications
- Suitable for content creation and creative tasks.
-
- - Creative writing and ideation
- - Content generation
- - Marketing copy
- - Educational materials
-
-
-
-
- ### Enterprise Use
- Reliable for business and productivity applications.
-
- - Business analysis
- - Customer service automation
- - Process optimization
- - Data analysis
-
-
-
-
-## Pricing Information
-
-Together AI offers flexible pricing based on model size and usage:
-
-
-
-
-
-
-
-
-**Dynamic Pricing**: Actual prices may vary based on current model popularity and demand.
-
-
- **Model Selection**: Start with smaller models for testing, then scale up to larger models for production use.
-
-
-
- **Model Availability**: Some models may have limited availability or higher costs during peak usage.
-
-
-## Model Management
-
-
-
- ### Model Catalog
- How Together AI provides access to available models.
-
- - **API Integration**: Automatic model discovery through API
- - **Real-time Updates**: New models added as they become available
- - **Pricing Information**: Cost details included in model listings
- - **Performance Metrics**: Context window and capability information
-
-
-
-
- ### Model Organization
- Understanding the different types of models available.
-
- - **Text Generation**: General-purpose language models
- - **Code Models**: Specialized for programming tasks
- - **Vision Models**: Multimodal models with image understanding
- - **Reasoning Models**: Enhanced logical reasoning capabilities
- - **Experimental Models**: Cutting-edge research models
-
-
-
-
- ### Optimization Features
- Features to improve model performance and cost efficiency.
-
- - **Context Management**: Efficient handling of large context windows
- - **Caching**: Request caching for repeated queries
- - **Load Balancing**: Automatic distribution across available resources
- - **Cost Optimization**: Suggestions for cost-effective model selection
-
-
-
-
-## Best Practices
-
-
-
- ### Choosing the Right Model
- Guidelines for selecting appropriate models for your use case.
-
- - **Task Matching**: Choose models specialized for your specific task
- - **Cost Consideration**: Balance performance needs with budget constraints
- - **Context Requirements**: Ensure model context window meets your needs
- - **Testing Phase**: Test multiple models before committing to one
-
-
-
-
- ### Optimizing Performance
- Tips for getting the best performance from Together AI models.
-
- - **Prompt Engineering**: Craft clear, specific prompts
- - **Context Management**: Keep context focused and relevant
- - **Batch Processing**: Group similar requests when possible
- - **Caching Strategy**: Cache frequent queries to reduce costs
-
-
-
-
- ### Managing Costs
- Strategies for controlling and optimizing usage costs.
-
- - **Usage Monitoring**: Track usage patterns and costs
- - **Model Switching**: Use smaller models for simpler tasks
- - **Caching**: Implement caching to reduce API calls
- - **Bulk Operations**: Combine multiple operations when feasible
-
-
-
+- **Pricing:** Based on model size and usage
+- **Model updates:** New models added frequently
diff --git a/providers/xai-grok.mdx b/providers/xai-grok.mdx
index c35cf86..553d7cb 100644
--- a/providers/xai-grok.mdx
+++ b/providers/xai-grok.mdx
@@ -1,85 +1,51 @@
---
title: "xAI (Grok)"
-description: "Learn how to configure and use xAI's Grok models with CodinIT, including API key setup, supported models, and reasoning capabilities."
+description: "Configure xAI's Grok models with large context windows and reasoning capabilities."
---
-xAI is the company behind Grok, a large language model known for its conversational abilities and large context window. Grok models are designed to provide helpful, informative, and contextually relevant responses.
-
**Website:** [https://x.ai/](https://x.ai/)
-### Getting an API Key
-
-1. **Sign Up/Sign In:** Go to the [xAI Console](https://console.x.ai/). Create an account or sign in.
-2. **Navigate to API Keys:** Go to the API keys section in your dashboard.
-3. **Create a Key:** Click to create a new API key. Give your key a descriptive name (e.g., "CodinIT").
-4. **Copy the Key:** **Important:** Copy the API key _immediately_. You will not be able to see it again. Store it securely.
-
-### Supported Models
-
-CodinIT supports the following xAI Grok models:
-
-#### Grok-3 Models
-
-- `grok-3-beta` (Default) - xAI's Grok-3 beta model with 131K context window
-- `grok-3-fast-beta` - xAI's Grok-3 fast beta model with 131K context window
-- `grok-3-mini-beta` - xAI's Grok-3 mini beta model with 131K context window
-- `grok-3-mini-fast-beta` - xAI's Grok-3 mini fast beta model with 131K context window
-
-#### Grok-2 Models
-
-- `grok-2-latest` - xAI's Grok-2 model - latest version with 131K context window
-- `grok-2` - xAI's Grok-2 model with 131K context window
-- `grok-2-1212` - xAI's Grok-2 model (version 1212) with 131K context window
-
-#### Grok Vision Models
-
-- `grok-2-vision-latest` - xAI's Grok-2 Vision model - latest version with image support and 32K context window
-- `grok-2-vision` - xAI's Grok-2 Vision model with image support and 32K context window
-- `grok-2-vision-1212` - xAI's Grok-2 Vision model (version 1212) with image support and 32K context window
-- `grok-vision-beta` - xAI's Grok Vision Beta model with image support and 8K context window
-
-#### Legacy Models
-
-- `grok-beta` - xAI's Grok Beta model (legacy) with 131K context window
-
-### Configuration in CodinIT
-
-1. **Open CodinIT Settings:** Click the settings icon (⚙️) in the CodinIT panel.
-2. **Select Provider:** Choose "xAI" from the "API Provider" dropdown.
-3. **Enter API Key:** Paste your xAI API key into the "xAI API Key" field.
-4. **Select Model:** Choose your desired Grok model from the "Model" dropdown.
-
-### Reasoning Capabilities
-
-Grok 3 Mini models feature specialized reasoning capabilities, allowing them to "think before responding" - particularly useful for complex problem-solving tasks.
-
-#### Reasoning-Enabled Models
+## Getting an API Key
-Reasoning is only supported by:
+1. Go to [xAI Console](https://console.x.ai/) and sign in
+2. Navigate to API Keys section
+3. Create a new API key and name it (e.g., "CodinIT")
+4. Copy the key immediately - you won't see it again
-- `grok-3-mini-beta`
-- `grok-3-mini-fast-beta`
+## Configuration
-The Grok 3 models `grok-3-beta` and `grok-3-fast-beta` do not support reasoning.
+1. Click the settings icon (⚙️) in CodinIT
+2. Select "xAI" as the API Provider
+3. Paste your API key
+4. Choose your model
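+The configuration steps above correspond to an ordinary chat-completion request. The sketch below builds one such request; the endpoint URL, header names, and payload shape are assumptions based on xAI's OpenAI-compatible API, not something CodinIT requires you to write yourself, so verify them against the current xAI documentation.

```python
import json
import os

# Minimal sketch of a single-turn chat request to xAI's OpenAI-compatible
# endpoint. The base URL and payload shape are assumptions; check the
# current xAI API docs before relying on them.
XAI_CHAT_URL = "https://api.x.ai/v1/chat/completions"

def build_grok_request(prompt: str, model: str = "grok-3-beta"):
    """Return (headers, payload) for a one-shot chat completion."""
    # Read the key from the environment rather than hardcoding it.
    api_key = os.environ.get("XAI_API_KEY", "sk-placeholder")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_grok_request("Explain list comprehensions in Python.")
print(json.dumps(payload, indent=2))
```

+Swapping the `model` field is all that "Choose your model" amounts to at the API level; the rest of the request is identical across the Grok family.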
-#### Controlling Reasoning Effort
+## Supported Models
-When using reasoning-enabled models, you can control how hard the model thinks with the `reasoning_effort` parameter:
+**Grok-3 Series (131K context):**
+- `grok-3-beta` (Default)
+- `grok-3-fast-beta`
+- `grok-3-mini-beta` (Reasoning enabled)
+- `grok-3-mini-fast-beta` (Reasoning enabled)
-- `low`: Minimal thinking time, using fewer tokens for quick responses
-- `high`: Maximum thinking time, leveraging more tokens for complex problems
+**Grok-2 Series (131K context):**
+- `grok-2-latest`
+- `grok-2`
+- `grok-2-1212`
-Choose `low` for simple queries that should complete quickly, and `high` for harder problems where response latency is less important.
+**Vision Models:**
+- `grok-2-vision-latest` (32K context)
+- `grok-2-vision` (32K context)
+- `grok-2-vision-1212` (32K context)
+- `grok-vision-beta` (8K context)
+
+**Legacy Models:**
+- `grok-beta` (131K context)
-#### Key Features
+## Reasoning Capabilities
-- **Step-by-Step Problem Solving**: The model thinks through problems methodically before delivering an answer
-- **Math & Quantitative Strength**: Excels at numerical challenges and logic puzzles
-- **Reasoning Trace Access**: The model's thinking process is available via the `reasoning_content` field in the response completion object
+Available only on `grok-3-mini-beta` and `grok-3-mini-fast-beta` (`grok-3-beta` and `grok-3-fast-beta` do not support reasoning):
+- **Step-by-step problem solving:** Methodical thinking process
+- **Reasoning effort control:** Set the `reasoning_effort` parameter to `low` for quick responses or `high` for complex problems
+- **Reasoning trace access:** Read the model's thinking via the `reasoning_content` field in the response
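+The three bullets above can be sketched as a request/response pair. The `reasoning_effort` parameter and `reasoning_content` field names come from the xAI API as described here; the sample response dict is an illustration of the shape only, not real model output.

```python
# Sketch of a reasoning request and trace extraction for grok-3-mini-beta.
# The sample response below is fabricated to show the shape, not real output.

def build_reasoning_request(prompt: str, effort: str = "high") -> dict:
    """Build a chat payload; effort must be 'low' or 'high'."""
    if effort not in ("low", "high"):
        raise ValueError("reasoning_effort must be 'low' or 'high'")
    return {
        "model": "grok-3-mini-beta",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

def extract_reasoning(response: dict):
    """Split a completion into (reasoning trace, final answer)."""
    message = response["choices"][0]["message"]
    # reasoning_content is absent on non-reasoning models, so default to "".
    return message.get("reasoning_content", ""), message["content"]

# Illustrative response shape only:
sample = {
    "choices": [{"message": {
        "reasoning_content": "Compare 3/4 = 0.75 against 2/3 = 0.66...",
        "content": "3/4 is larger than 2/3.",
    }}]
}
trace, answer = extract_reasoning(sample)
print(answer)
```

+Choosing `low` trims thinking tokens for simple queries; `high` spends more tokens when latency matters less than answer quality.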
-### Tips and Notes
+## Notes
-- **Context Window:** Most Grok models feature large context windows (up to 131K tokens), allowing you to include substantial amounts of code and context in your prompts.
-- **Vision Capabilities:** Select vision-enabled models (`grok-2-vision-latest`, `grok-2-vision`, etc.) when you need to process or analyze images.
-- **Pricing:** Pricing varies by model, with input costs ranging from $0.3 to $5.0 per million tokens and output costs from $0.5 to $25.0 per million tokens. Refer to the xAI documentation for the most current pricing information.
-- **Performance Tradeoffs:** "Fast" variants typically offer quicker response times but may have higher costs, while "mini" variants are more economical but may have reduced capabilities.
\ No newline at end of file
+- **Context window:** Up to 131K tokens for most models
+- **Vision support:** Available on select models
+- **Pricing:** Input from $0.3 to $5.0 and output from $0.5 to $25.0 per million tokens, depending on model; see the xAI documentation for current rates