
⚡ Optimizing Generations

Get better AI outputs faster with these optimization techniques.


Model Selection

Choose the right model for your task to get optimal results:

| Task | Best Model | Why |
|------|------------|-----|
| Complex analysis | vigthoria-reasoning-v2 | Deep logical thinking, step-by-step reasoning |
| Code generation | vigthoria-code-v2 | Optimized for syntax, patterns, best practices |
| Creative writing | vigthoria-creative-v2 | Imaginative, varied, engaging output |
| Image understanding | vigthoria-vision-v2 | Multimodal image + text processing |
| Quick responses | vigthoria-reasoning-v2 | Fast, balanced for general tasks |
Pro Tip

For mixed tasks (e.g., creative code), start with the primary need. Writing creative stories with code snippets? Use Creative. Building an app with creative copy? Use Code.
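
The "start with the primary need" rule can be expressed as a simple lookup. This is a minimal sketch: the model names come from the table above, while the task labels and the `pickModel` helper are hypothetical names chosen for illustration.

```javascript
// Model names are from the table above; the task labels are hypothetical.
const MODEL_BY_TASK = new Map([
  ['analysis', 'vigthoria-reasoning-v2'],
  ['code', 'vigthoria-code-v2'],
  ['creative', 'vigthoria-creative-v2'],
  ['vision', 'vigthoria-vision-v2'],
]);

// Pick a model from the primary need. Unknown tasks fall back to the
// model the table lists for quick, general-purpose responses.
function pickModel(primaryTask) {
  return MODEL_BY_TASK.get(primaryTask) ?? 'vigthoria-reasoning-v2';
}

// A creative story that happens to contain code snippets is primarily creative.
console.log(pickModel('creative'));
// An app that needs some creative copy is primarily a coding task.
console.log(pickModel('code'));
```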

Parameter Tuning

Temperature

Controls randomness and creativity. Lower values make output more focused and repeatable; higher values make it more varied.
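
One way to keep temperature choices consistent is to name a few presets and apply them when building a request. The sketch below is illustrative only: the preset values, the `temperature` field name, and the `withTemperature` helper are all assumptions, so check the API reference for the supported range.

```javascript
// Illustrative presets only. The values and the `temperature` field name
// are assumptions, not documented defaults.
const TEMPERATURE_PRESETS = new Map([
  ['precise', 0.2],  // code, extraction, factual answers
  ['balanced', 0.7], // general-purpose chat
  ['creative', 1.0], // brainstorming, fiction
]);

// Return a copy of `request` with the preset's temperature applied.
function withTemperature(request, preset) {
  if (!TEMPERATURE_PRESETS.has(preset)) {
    throw new Error(`Unknown temperature preset: ${preset}`);
  }
  return { ...request, temperature: TEMPERATURE_PRESETS.get(preset) };
}

const request = withTemperature(
  { model: 'vigthoria-code-v2', messages: [{ role: 'user', content: 'Sort an array' }] },
  'precise'
);
console.log(request.temperature); // 0.2
```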

Max Tokens

Set the limit high enough that responses are not truncated, but no higher than the task needs, so tokens are not wasted.
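
A rough budget can be derived from the length you ask for in the prompt. The sketch below assumes about 4 tokens for every 3 English words, which is a rule of thumb that varies by tokenizer, and adds headroom. The `maxTokensForWords` helper and the `max_tokens` field name are assumptions.

```javascript
// Rough budget for English prose: about 4 tokens for every 3 words,
// plus headroom so the response is not cut off mid-sentence.
// The ratio is a rule of thumb and varies by tokenizer.
function maxTokensForWords(targetWords, headroom = 1.25) {
  return Math.ceil((targetWords * 4 * headroom) / 3);
}

// The `max_tokens` field name is an assumption about the request format.
const request = {
  messages: [
    { role: 'user', content: "Write a 300-word article about AI's impact on healthcare" },
  ],
  max_tokens: maxTokensForWords(300),
};
console.log(request.max_tokens); // 500
```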

Stop Sequences

Use stop sequences to end generation at the right point:

{
  "stop": ["```", "\n\n---", "END"]
}
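
To see what a stop sequence does, the sketch below applies the same rule on the client side: the text ends at the earliest occurrence of any stop sequence. `truncateAtStop` is a hypothetical helper, and it assumes the stop sequence itself is excluded from the output, which the guide does not state.

```javascript
// Cut `text` at the earliest occurrence of any stop sequence.
// The stop sequence itself is not included in the result (an assumption).
function truncateAtStop(text, stops) {
  let cut = text.length;
  for (const stop of stops) {
    const index = text.indexOf(stop);
    if (index !== -1 && index < cut) {
      cut = index;
    }
  }
  return text.slice(0, cut);
}

const stops = ['\n\n---', 'END'];
console.log(truncateAtStop('Summary line\n\n---\nFooter text', stops)); // Summary line
```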

Prompt Optimization

Be Specific

❌ Bad "Write about technology"
✓ Good "Write a 300-word article about AI's impact on healthcare, focusing on diagnostic tools, for a non-technical audience"

Provide Context

❌ Bad "Fix this bug"
✓ Good "Fix this React useEffect bug. Expected: fetch data once on mount. Actual: infinite loop. Using React 18."

Use Examples (Few-Shot)

Convert these sentences to formal English:

"gonna grab lunch" → "I will be taking lunch now."
"u free tmrw?" → "Are you available tomorrow?"
"thx for the help" → ?

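A few-shot prompt like the one above can be assembled from data, which makes it easy to swap examples in and out. This is a minimal sketch; `buildFewShotPrompt` is a hypothetical helper, not part of any SDK.

```javascript
// Build a few-shot prompt: instruction, worked examples, then the open query.
function buildFewShotPrompt(instruction, examples, query) {
  const lines = examples.map(({ input, output }) => `"${input}" → "${output}"`);
  lines.push(`"${query}" → ?`);
  return `${instruction}\n\n${lines.join('\n')}`;
}

const prompt = buildFewShotPrompt(
  'Convert these sentences to formal English:',
  [
    { input: 'gonna grab lunch', output: 'I will be taking lunch now.' },
    { input: 'u free tmrw?', output: 'Are you available tomorrow?' },
  ],
  'thx for the help'
);
console.log(prompt);
```
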
Structure Your Requests

Task: Summarize this article
Format: 3 bullet points, max 20 words each
Tone: Professional
Audience: Executives

Article:
[content here]
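
The same structure can be generated from named fields so every request follows one template. This is a sketch under the assumption that the field labels above are what you want to send; `buildStructuredPrompt` is a hypothetical helper.

```javascript
// Assemble a structured request from named fields, one labeled line each,
// followed by the article text.
function buildStructuredPrompt({ task, format, tone, audience, content }) {
  return [
    `Task: ${task}`,
    `Format: ${format}`,
    `Tone: ${tone}`,
    `Audience: ${audience}`,
    '',
    'Article:',
    content,
  ].join('\n');
}

const prompt = buildStructuredPrompt({
  task: 'Summarize this article',
  format: '3 bullet points, max 20 words each',
  tone: 'Professional',
  audience: 'Executives',
  content: '[content here]',
});
console.log(prompt);
```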

Speed Optimization

Reduce Token Count
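
One common way to reduce token count in a multi-turn conversation is to drop the oldest turns once the history exceeds a budget. The sketch below is an illustration under assumptions: `trimHistory` and `roughTokens` are hypothetical helpers, and the 4-characters-per-token estimate is a rough heuristic rather than the API's tokenizer.

```javascript
// Rough size estimate: about 4 characters per token for English text.
// This is a heuristic, not the tokenizer the API actually uses.
const roughTokens = (text) => Math.ceil(text.length / 4);

// Keep every system message plus the most recent messages that fit the budget.
function trimHistory(messages, budget, countTokens = roughTokens) {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  let used = system.reduce((sum, m) => sum + countTokens(m.content), 0);
  const kept = [];
  // Walk backwards so the newest messages are kept first.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = countTokens(rest[i].content);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}
```

Passing a real token counter as the third argument replaces the heuristic without changing the trimming logic.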

Use Caching

Cache identical requests to avoid redundant API calls:

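const crypto = require('crypto');

// In-memory cache keyed by a hash of the prompt and options.
// Entries live for the lifetime of the process.
const cache = new Map();

async function cachedGeneration(prompt, options) {
  // Hash the full request so only identical requests share an entry.
  // MD5 is acceptable here because the hash is a lookup key, not a security measure.
  const cacheKey = crypto
    .createHash('md5')
    .update(JSON.stringify({ prompt, options }))
    .digest('hex');

  if (cache.has(cacheKey)) {
    return cache.get(cacheKey);
  }

  // `vigthoria` is an already-initialized API client.
  const result = await vigthoria.chat.completions.create({
    messages: [{ role: 'user', content: prompt }],
    ...options
  });

  cache.set(cacheKey, result);
  return result;
}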
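
`JSON.stringify` serializes object keys in insertion order, so two option objects with the same settings in a different order produce different cache keys. A possible refinement, sketched here with a hypothetical `stableStringify` helper, is to sort keys before hashing. The output is meant only as a cache key.

```javascript
// Serialize a value with object keys sorted, so that equal requests
// produce equal strings regardless of key order.
function stableStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const body = Object.keys(value)
      .filter((key) => value[key] !== undefined) // mirror JSON.stringify
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`)
      .join(',');
    return `{${body}}`;
  }
  // Primitives. `undefined` has no JSON form, so fall back to "null".
  return JSON.stringify(value) ?? 'null';
}

const a = stableStringify({ model: 'vigthoria-code-v2', temperature: 0.2 });
const b = stableStringify({ temperature: 0.2, model: 'vigthoria-code-v2' });
console.log(a === b); // true
```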

Parallel Requests

Process multiple independent requests simultaneously:

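const prompts = ['Task 1', 'Task 2', 'Task 3'];

// Wrapped in an async function because `await` is not allowed at the
// top level of a CommonJS script.
async function runAll() {
  // Promise.all starts every request immediately and rejects as soon as
  // any single request fails.
  return Promise.all(
    prompts.map(prompt =>
      vigthoria.chat.completions.create({
        model: 'vigthoria-reasoning-v2',
        messages: [{ role: 'user', content: prompt }]
      })
    )
  );
}

runAll().then(results => console.log(results.length));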
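
With a long list of prompts, sending everything at once may run into rate limits; whether and where such limits apply is an assumption here, not something this guide states. A minimal sketch of a bounded alternative follows. `mapWithConcurrency` is a hypothetical helper that keeps at most `limit` requests in flight and preserves input order.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage (assumes an initialized `vigthoria` client):
// const results = await mapWithConcurrency(prompts, 2, (prompt) =>
//   vigthoria.chat.completions.create({
//     model: 'vigthoria-reasoning-v2',
//     messages: [{ role: 'user', content: prompt }],
//   })
// );
```

As with `Promise.all`, the first failing request rejects the whole call.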

Quality Improvement

System Messages

Set clear expectations in the system message:

{
  "messages": [
    {
      "role": "system",
      "content": "You are an expert technical writer. Write clear, concise documentation. Use bullet points for lists. Include code examples where relevant. Avoid jargon."
    },
    {
      "role": "user",
      "content": "Document the authentication flow"
    }
  ]
}

Iterative Refinement

  1. Get initial output
  2. Ask for specific improvements
  3. Request alternative approaches
  4. Combine best elements

Self-Critique Pattern

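// Send the model's first reply back to it and ask for a critique.
const messages = [
  { role: "user", content: "Write a product description for X" },
  // Replace the placeholder with the text returned by the first request.
  { role: "assistant", content: "[initial output]" },
  { role: "user", content: "Now critique this description and provide an improved version that addresses the weaknesses" }
];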
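
The pattern can be run as two calls, with the first reply fed back as the assistant turn. In this sketch `generate` is a hypothetical function that takes a messages array and resolves to the reply text; how you extract that text from the API response is left to your client code.

```javascript
// `generate` is a hypothetical function: it takes a messages array and
// resolves to the assistant's reply text. Wire it to your API client.
async function selfCritique(generate, task) {
  const messages = [{ role: 'user', content: task }];

  // First pass: the initial draft.
  const draft = await generate(messages);

  // Second pass: feed the draft back and ask for a critique plus a revision.
  messages.push(
    { role: 'assistant', content: draft },
    {
      role: 'user',
      content:
        'Now critique this description and provide an improved version that addresses the weaknesses',
    }
  );
  const improved = await generate(messages);

  return { draft, improved };
}
```

Returning both versions makes it possible to compare them and fall back to the draft if the revision is worse.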

Quick Reference