
💰 Cost Optimization

Maximize value from your Vigthoria subscription with smart usage strategies.


Understand Your Usage

Monitor your usage patterns to identify optimization opportunities:

  1. Go to Dashboard → Usage Analytics
  2. Review your token consumption by:
    • Model (which models use the most tokens)
    • Time (peak usage hours/days)
    • Application (which apps/integrations)
  3. Set usage alerts at 50%, 80%, and 100% thresholds

Quick Win

Export your usage data monthly to identify trends. Many users find 20-30% of their usage is redundant or could use lighter models.
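One way to act on a monthly export is to total tokens per model and look at each model's share. The sketch below assumes the export is a CSV with `model` and `tokens` columns; those column names are hypothetical, so adjust them to match whatever your export actually contains.

```python
import csv
import io
from collections import defaultdict


def tokens_by_model(csv_text):
    """Return {model: total_tokens} from CSV text.

    Assumes hypothetical 'model' and 'tokens' columns.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["model"]] += int(row["tokens"])
    return dict(totals)


def usage_shares(totals):
    """Convert token totals to percentage shares, largest first."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    shares = [(m, round(100 * t / grand_total, 1)) for m, t in totals.items()]
    return sorted(shares, key=lambda pair: pair[1], reverse=True)


# Illustrative data only, not real usage figures
sample = """model,tokens
vigthoria-reasoning-v2,1200
vigthoria-code-v2,300
vigthoria-reasoning-v2,500
"""
totals = tokens_by_model(sample)
shares = usage_shares(totals)
```

A model that dominates the shares list is the first place to look for redundant calls or tasks that a lighter model could handle.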

Smart Model Selection

Not every task needs the most powerful model:

Task Type                    Recommended Model        Cost Level
Simple Q&A, classification   vigthoria-reasoning-v2   Standard
Code generation              vigthoria-code-v2        Standard
Creative content             vigthoria-creative-v2    Standard
Image analysis               vigthoria-vision-v2      Premium

Potential Savings: 15-40%

By matching models to tasks instead of using one model for everything.
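A minimal sketch of routing requests by task type, using the model names from the table above. The task-type keys and the fallback choice are assumptions made for illustration, not part of any Vigthoria API.

```python
# Model names come from the table above; the task-type keys are
# hypothetical labels chosen for this example.
MODEL_BY_TASK = {
    "simple_qa": "vigthoria-reasoning-v2",
    "classification": "vigthoria-reasoning-v2",
    "code_generation": "vigthoria-code-v2",
    "creative_content": "vigthoria-creative-v2",
    "image_analysis": "vigthoria-vision-v2",
}

# Assumption: fall back to a Standard-cost model for unknown task types
DEFAULT_MODEL = "vigthoria-reasoning-v2"


def pick_model(task_type):
    """Return the recommended model for a task type, or the default."""
    return MODEL_BY_TASK.get(task_type, DEFAULT_MODEL)
```

Keeping the mapping in one place makes it easy to audit which tasks are sent to the Premium-cost model.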

Reduce Token Usage

1. Optimize Prompts

Shorter, clearer prompts use fewer input tokens:

// Before: 45 tokens
"I would really appreciate it if you could please help me by writing 
a function that takes a number as input and returns whether that 
number is a prime number or not."

// After: 18 tokens
"Write a function isPrime(n) that returns true if n is prime."

2. Limit Response Length

Set appropriate max_tokens for each use case:

// Short answers
{ "max_tokens": 200 }

// Explanations
{ "max_tokens": 500 }

// Articles
{ "max_tokens": 1500 }
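As a sketch, the limits above can live in one lookup so each request picks its cap by use case. The use-case names and the `build_request` helper are hypothetical; the request shape (`model`, `messages`, `max_tokens`) mirrors the other examples in this guide.

```python
# Limits taken from the values above; use-case names are hypothetical
MAX_TOKENS_BY_USE_CASE = {
    "short_answer": 200,
    "explanation": 500,
    "article": 1500,
}


def build_request(prompt, model, use_case):
    """Build request parameters with a max_tokens cap for the use case."""
    if use_case not in MAX_TOKENS_BY_USE_CASE:
        raise ValueError(f"unknown use case: {use_case!r}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": MAX_TOKENS_BY_USE_CASE[use_case],
    }


params = build_request("Define recursion.", "vigthoria-reasoning-v2", "short_answer")
```

Raising on an unknown use case, rather than silently applying a large default, keeps an uncapped request from slipping through.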

3. Use Stop Sequences

End generation early when you have what you need:

{
  "stop": ["---", "END", "\n\n\n"]
}

Potential Savings: 20-50%

By reducing average tokens per request from 2000 to 800.
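To see how a per-request reduction relates to overall savings, here is a rough calculation. It assumes cost scales linearly with tokens, and `share_of_traffic` is a hypothetical parameter for the fraction of your requests the optimization applies to.

```python
def token_reduction_pct(before, after):
    """Percentage drop in tokens per request."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100 * (before - after) / before


def blended_savings_pct(reduction_pct, share_of_traffic):
    """Overall savings when only part of the traffic is optimized.

    Assumes cost is proportional to tokens (a simplification).
    """
    if not 0 <= share_of_traffic <= 1:
        raise ValueError("share_of_traffic must be between 0 and 1")
    return reduction_pct * share_of_traffic


reduction = token_reduction_pct(2000, 800)      # 60.0
overall = blended_savings_pct(reduction, 0.5)   # 30.0
```

Going from 2000 to 800 tokens is a 60% cut on the affected requests. If only half of your traffic can be trimmed that far, the overall figure is closer to 30%.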

Implement Caching

Don't pay for the same generation twice:

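import hashlib
import json

import redis

cache = redis.Redis()

# `vigthoria` is assumed to be an already-initialized API client.

def cached_generation(prompt, model, **kwargs):
    # Build a deterministic cache key: sorting the keys means the
    # order of keyword arguments does not change the hash
    key_material = json.dumps(
        {"model": model, "prompt": prompt, "params": kwargs},
        sort_keys=True,
        default=str,
    )
    cache_key = hashlib.sha256(key_material.encode()).hexdigest()

    # Check cache
    cached = cache.get(cache_key)
    if cached is not None:
        return json.loads(cached)

    # Generate and cache
    response = vigthoria.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        **kwargs
    )

    # Cache for 24 hours. This assumes the response is JSON-serializable
    # (for example a dict); convert it first if the SDK returns an object.
    cache.setex(cache_key, 86400, json.dumps(response))
    return response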

Good candidates for caching:

Potential Savings: 30-60%

Depending on how many repeated requests you have.

Batch Processing

For non-real-time tasks, combine related items into a single request and run the batch during off-peak hours:

// Instead of 10 separate requests:
const items = ['item1', 'item2', 'item3' /* , ... */];

// Combine into one:
const response = await vigthoria.chat.completions.create({
  model: 'vigthoria-reasoning-v2',
  messages: [{
    role: 'user',
    content: `Analyze these ${items.length} items and provide a summary for each:\n${items.join('\n')}`
  }]
});
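When there are more items than fit comfortably in one prompt, they can be split into fixed-size groups first. The following is a sketch in Python; `chunk` and `build_batch_prompt` are hypothetical helpers, and the right batch size depends on your item length and the model's context limit.

```python
def chunk(items, size):
    """Split items into consecutive groups of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_batch_prompt(items):
    """Combine several items into one numbered prompt."""
    numbered = "\n".join(
        f"{i}. {item}" for i, item in enumerate(items, start=1)
    )
    return (
        f"Analyze these {len(items)} items and provide a summary for each:\n"
        f"{numbered}"
    )


# 25 items in groups of 10 become 3 requests instead of 25
batches = chunk([f"item{n}" for n in range(1, 26)], 10)
prompts = [build_batch_prompt(batch) for batch in batches]
```

Numbering the items gives the model an explicit structure to follow, which makes the per-item summaries easier to split apart afterwards.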

Set Up Alerts

Prevent surprise overages with proactive monitoring:

  1. Go to Dashboard → Settings → Alerts
  2. Configure alerts:
    • 50%: Review and optimize if needed
    • 80%: Implement stricter controls
    • 90%: Pause non-essential usage
  3. Set up webhook notifications for real-time alerts
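If you also want to enforce these thresholds in your own code, for example inside a webhook handler, the mapping can be sketched as below. The function name and inputs are hypothetical; the thresholds and actions are the ones listed above.

```python
# Thresholds and actions from the list above, highest first
ALERT_ACTIONS = [
    (90, "Pause non-essential usage"),
    (80, "Implement stricter controls"),
    (50, "Review and optimize if needed"),
]


def alert_action(tokens_used, token_quota):
    """Return the action for the highest threshold reached, or None."""
    if token_quota <= 0:
        raise ValueError("token_quota must be positive")
    percent_used = 100 * tokens_used / token_quota
    for threshold, action in ALERT_ACTIONS:
        if percent_used >= threshold:
            return action
    return None
```

Checking from the highest threshold down means a single call returns the most urgent applicable action.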

Right-Size Your Plan

Review your plan quarterly:

Annual Savings

Annual plans typically offer 15-20% savings over monthly billing. If you're committed to Vigthoria, consider switching.

Cost Optimization Checklist