
Multi-Language

This example shows how to set up AI AutoEvals for evaluating content in multiple languages.

  1. Language Detection: Detect the language of content

  2. Language-Specific Evaluation Sets: Create evaluation sets for each language

  3. Custom Prompts: Use language-specific evaluation prompts

  4. Tag-Based Routing: Route each request to the appropriate evaluation set

Step 1: Create Language-Specific Evaluation Sets

  1. Navigate to /admin/content/ai-autoevals/sets

  2. Click “Add Evaluation Set”

  3. Configure:

    • Label: “English Evaluation”
    • Description: “Evaluation for English content”
    • Fact Extraction Method: “AI Generated”
    • Custom Prompt Template:
    You are evaluating the factual accuracy of an AI response in English.
    User Question: {{ input }}
    Evaluation Criteria:
    {{ facts }}
    AI Response:
    {{ output }}
    Please evaluate the response and choose:
    A) The response fully meets all evaluation criteria (Exact Match)
    B) The response includes all expected information plus additional relevant details (Superset)
    C) The response includes some expected information but is missing some details (Subset)
    D) The response contradicts evaluation criteria (Disagreement)
    Provide your analysis and final choice.
    • Tags Filter: {"language": "en"}
  4. Save
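
The prompt template above uses three placeholders: {{ input }}, {{ facts }} and {{ output }}. How the module substitutes them is not shown here, so the following is only a minimal sketch of the idea. `renderPromptTemplate()` is a hypothetical helper, not part of AI AutoEvals.

```php
<?php
/**
 * Fills {{ name }} placeholders in a prompt template.
 *
 * Hypothetical helper for illustration; not part of AI AutoEvals.
 */
function renderPromptTemplate(string $template, array $values): string {
  $rendered = preg_replace_callback(
    '/\{\{\s*(\w+)\s*\}\}/',
    // Unknown placeholders are left untouched so typos stay visible.
    static fn(array $m): string => array_key_exists($m[1], $values) ? (string) $values[$m[1]] : $m[0],
    $template
  );
  return $rendered ?? $template;
}

$template = "User Question: {{ input }}\nEvaluation Criteria:\n{{ facts }}\nAI Response:\n{{ output }}";

$prompt = renderPromptTemplate($template, [
  'input' => 'What is the capital of France?',
  'facts' => '- The capital of France is Paris',
  'output' => 'Paris is the capital of France.',
]);

print $prompt . "\n";
```

Rendering a template with sample values like this is a quick way to proofread a translated prompt before saving it in an evaluation set.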

Then add a Spanish set:

  1. Click “Add Evaluation Set”

  2. Configure:

    • Label: “Spanish Evaluation”
    • Description: “Evaluation for Spanish content”
    • Fact Extraction Method: “AI Generated”
    • Custom Prompt Template:
    Estás evaluando la precisión factual de una respuesta de IA en español.
    Pregunta del Usuario: {{ input }}
    Criterios de Evaluación:
    {{ facts }}
    Respuesta de IA:
    {{ output }}
    Por favor, evalúa la respuesta y elige:
    A) La respuesta cumple completamente con todos los criterios de evaluación (Coincidencia Exacta)
    B) La respuesta incluye toda la información esperada más detalles adicionales relevantes (Superconjunto)
    C) La respuesta incluye parte de la información esperada pero le faltan algunos detalles (Subconjunto)
    D) La respuesta contradice los criterios de evaluación (Desacuerdo)
    Proporciona tu análisis y elección final.
    • Tags Filter: {"language": "es"}
  3. Save

Then add a French set:

  1. Click “Add Evaluation Set”

  2. Configure:

    • Label: “French Evaluation”
    • Description: “Evaluation for French content”
    • Fact Extraction Method: “AI Generated”
    • Custom Prompt Template:
    Vous évaluez l'exactitude factuelle d'une réponse IA en français.
    Question de l'utilisateur: {{ input }}
    Critères d'évaluation:
    {{ facts }}
    Réponse IA:
    {{ output }}
    Veuillez évaluer la réponse et choisir:
    A) La réponse répond entièrement à tous les critères d'évaluation (Correspondance Exacte)
    B) La réponse inclut toutes les informations attendues plus des détails supplémentaires pertinents (Surensemble)
    C) La réponse inclut certaines informations attendues mais manque certains détails (Sous-ensemble)
    D) La réponse contredit les critères d'évaluation (Désaccord)
    Fournissez votre analyse et votre choix final.
    • Tags Filter: {"language": "fr"}
  3. Save
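
Each set above carries a Tags Filter such as {"language": "fr"}. The module's actual matching rules are not documented on this page, so the sketch below only illustrates one plausible reading: a set matches when every key/value pair in its filter appears in the request's tags. `selectEvaluationSet()` and the label-to-filter map are hypothetical.

```php
<?php
/**
 * Returns the label of the first evaluation set whose tag filter matches.
 *
 * Hypothetical illustration of tag-based routing; the module's real matching
 * rules may differ.
 *
 * @param array $requestTags
 *   Tags sent with the AI request.
 * @param array $evaluationSets
 *   Map of set label => decoded Tags Filter.
 */
function selectEvaluationSet(array $requestTags, array $evaluationSets): ?string {
  foreach ($evaluationSets as $label => $filter) {
    foreach ($filter as $key => $value) {
      // Every filter entry must be present in the request tags.
      if (!array_key_exists($key, $requestTags) || $requestTags[$key] !== $value) {
        continue 2;
      }
    }
    return $label;
  }
  // No set matched; the caller decides which default to use.
  return NULL;
}

$sets = [
  'English Evaluation' => json_decode('{"language": "en"}', TRUE),
  'Spanish Evaluation' => json_decode('{"language": "es"}', TRUE),
  'French Evaluation' => json_decode('{"language": "fr"}', TRUE),
];

$spanish = selectEvaluationSet(['ai_autoevals:track' => TRUE, 'language' => 'es'], $sets);
$german = selectEvaluationSet(['ai_autoevals:track' => TRUE, 'language' => 'de'], $sets);

var_dump($spanish, $german);
```

With the three sets configured above, a request tagged `es` resolves to the Spanish set, while a request tagged `de` matches nothing and needs a default.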

Create a helper function to detect language:

<?php
/**
 * Detects the language of content with a simple keyword heuristic.
 */
function detectContentLanguage(string $text): string {
  // Characteristic characters and common words for each language.
  $languagePatterns = [
    'es' => ['/[¿¡]/u', '/\b(el|la|los|las|un|una|es|son|tengo|tiene|estoy|está)\b/iu'],
    'fr' => ['/\b(le|la|les|un|une|est|sont|je|tu|il|elle|nous|vous)\b/iu'],
    'de' => ['/ß/u', '/\b(der|die|das|ein|eine|ist|sind|ich|du|er|sie|wir|ihr)\b/iu'],
    'it' => ['/\b(il|la|lo|le|gli|un|una|è|sono|ho|hai)\b/iu'],
  ];

  // Count hits per language instead of returning on the first match:
  // words such as "la" or "un" occur in several of these languages.
  $scores = [];
  foreach ($languagePatterns as $lang => $patterns) {
    $scores[$lang] = 0;
    foreach ($patterns as $pattern) {
      $scores[$lang] += (int) preg_match_all($pattern, $text);
    }
  }
  arsort($scores);
  $best = array_key_first($scores);

  // Default to English when nothing matched.
  return $scores[$best] > 0 ? $best : 'en';
}

Or use a language detection library:

<?php
use LanguageDetection\Language;

/**
 * Detects the language of content using a library.
 *
 * Requires: composer require patrickschur/language-detection
 */
function detectContentLanguageLib(string $text): string {
  // Restrict detection to the languages of interest. The codes must match
  // the language profiles bundled with the library.
  $ld = new Language([
    'en', 'es', 'fr', 'de', 'it', 'pt', 'nl', 'pl', 'ru', 'zh', 'ja',
  ]);
  // Casting the result to a string yields the best-matching language code.
  return (string) $ld->detect($text);
}

Add a language tag to AI requests:

<?php
use Drupal\ai\OperationType\Chat\ChatInput;
use Drupal\ai\OperationType\Chat\ChatMessage;

/**
 * Makes an AI request with a language tag.
 *
 * Pass 'auto' as $language to detect the language from the question.
 */
function makeAiRequest(string $question, string $language = 'en'): string {
  $input = new ChatInput([
    new ChatMessage('user', $question),
  ]);
  $provider = \Drupal::service('ai.provider')->createInstance('amazeeio');
  $model = 'gpt-4';

  // Detect the language if the caller asked for it.
  if ($language === 'auto') {
    $language = detectContentLanguage($question);
  }

  // Make the request with the language tag.
  $response = $provider->chat($input, $model, [
    'ai_autoevals:track' => TRUE,
    'language' => $language,
  ]);
  return $response->getNormalized()->getText();
}

Example usage:

<?php
// Evaluate English content.
$englishResponse = makeAiRequest('What is the capital of France?', 'en');
// Will use the "English Evaluation" set.

// Evaluate Spanish content.
$spanishResponse = makeAiRequest('¿Cuál es la capital de Francia?', 'es');
// Will use the "Spanish Evaluation" set.

// Evaluate French content.
$frenchResponse = makeAiRequest('Quelle est la capitale de la France?', 'fr');
// Will use the "French Evaluation" set.

// Auto-detect the language.
$autoResponse = makeAiRequest('Was ist die Hauptstadt von Frankreich?', 'auto');
// Detected as German; no German set exists, so the default set is used.
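
Track average scores by language:

<?php
/**
 * Gets the average score and evaluation count per language.
 */
function getAverageScoreByLanguage(): array {
  $query = \Drupal::database()->select('ai_autoevals_evaluation_result', 'e');
  $query->addField('e', 'tags');
  $query->addExpression('AVG(e.score)', 'avg_score');
  $query->addExpression('COUNT(e.id)', 'count');
  // Only consider rows that carry a language tag. JSON_EXTRACT requires a
  // database that supports it, such as MySQL or MariaDB.
  $query->where("JSON_EXTRACT(e.tags, '$.language') IS NOT NULL");
  $query->groupBy('e.tags');

  // Rows are returned as objects. Several distinct tag sets can share one
  // language, so merge them with a count-weighted average.
  $totals = [];
  foreach ($query->execute() as $row) {
    $tagsArray = json_decode($row->tags, TRUE);
    $language = $tagsArray['language'] ?? 'unknown';
    $count = (int) $row->count;
    $totals[$language]['sum'] = ($totals[$language]['sum'] ?? 0.0) + (float) $row->avg_score * $count;
    $totals[$language]['count'] = ($totals[$language]['count'] ?? 0) + $count;
  }

  $scores = [];
  foreach ($totals as $language => $total) {
    $scores[$language] = [
      'average' => $total['sum'] / $total['count'],
      'count' => $total['count'],
    ];
  }
  return $scores;
}

// Display results.
$scores = getAverageScoreByLanguage();
foreach ($scores as $language => $data) {
  print "$language: Average {$data['average']}, Count: {$data['count']}\n";
}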
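
The query above depends on the database supporting JSON_EXTRACT. Where that is not available, the same per-language averages can be computed in PHP once the rows are loaded. The sketch below assumes each row is an array with a `tags` JSON string and a numeric `score`, mirroring the columns used in the query; `averageScoresByLanguage()` is a hypothetical helper, and how the rows are fetched is left out.

```php
<?php
/**
 * Averages scores per language from already-loaded rows.
 *
 * Hypothetical helper. Each row is assumed to look like
 * ['tags' => '{"language": "en"}', 'score' => 0.9].
 */
function averageScoresByLanguage(array $rows): array {
  $totals = [];
  foreach ($rows as $row) {
    $tags = json_decode($row['tags'], TRUE);
    // Skip rows without a language tag, like the IS NOT NULL condition does.
    if (!is_array($tags) || !isset($tags['language'])) {
      continue;
    }
    $language = (string) $tags['language'];
    $totals[$language]['sum'] = ($totals[$language]['sum'] ?? 0.0) + (float) $row['score'];
    $totals[$language]['count'] = ($totals[$language]['count'] ?? 0) + 1;
  }

  $scores = [];
  foreach ($totals as $language => $total) {
    $scores[$language] = [
      'average' => $total['sum'] / $total['count'],
      'count' => $total['count'],
    ];
  }
  ksort($scores);
  return $scores;
}

$rows = [
  ['tags' => '{"language": "en"}', 'score' => 1.0],
  ['tags' => '{"language": "en", "topic": "geography"}', 'score' => 0.5],
  ['tags' => '{"language": "es"}', 'score' => 0.5],
  ['tags' => '{}', 'score' => 0.0],
];

$scores = averageScoresByLanguage($rows);
print_r($scores);
```

Rows that share a language but differ in other tags are counted together, which is usually what a per-language report needs.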

Evaluate content that has been translated:

<?php
/**
 * Evaluates translated content against the original.
 */
function evaluateTranslationQuality(
  string $originalContent,
  string $translatedContent,
  string $sourceLang,
  string $targetLang
): void {
  $evaluationManager = \Drupal::service('ai_autoevals.evaluation_manager');

  // Create an evaluation for the source language. This assumes each
  // evaluation set ID follows the pattern "{langcode}_evaluation"; adjust it
  // to the IDs of the sets created in Step 1.
  $sourceEvaluation = $evaluationManager->createEvaluation([
    'evaluation_set_id' => "{$sourceLang}_evaluation",
    'request_id' => 'translation_source_' . uniqid(),
    'provider_id' => 'amazeeio',
    'model_id' => 'chat',
    'operation_type' => 'chat',
    'input' => "Original text: $originalContent",
    'output' => $originalContent,
    'tags' => [
      'language' => $sourceLang,
      'evaluation_type' => 'translation_quality',
      'translation_pair' => "{$sourceLang}_{$targetLang}",
    ],
  ]);

  // Create an evaluation for the translated content.
  $targetEvaluation = $evaluationManager->createEvaluation([
    'evaluation_set_id' => "{$targetLang}_evaluation",
    'request_id' => 'translation_target_' . uniqid(),
    'provider_id' => 'amazeeio',
    'model_id' => 'chat',
    'operation_type' => 'chat',
    'input' => "Translation: $translatedContent",
    'output' => $translatedContent,
    'tags' => [
      'language' => $targetLang,
      'evaluation_type' => 'translation_quality',
      'translation_pair' => "{$sourceLang}_{$targetLang}",
      'source_language' => $sourceLang,
      'original_id' => $sourceEvaluation->id(),
    ],
  ]);

  // Link the two evaluations to each other.
  $sourceEvaluation->set('metadata', [
    'translation_target_id' => $targetEvaluation->id(),
  ]);
  $sourceEvaluation->save();
  $targetEvaluation->set('metadata', [
    'translation_source_id' => $sourceEvaluation->id(),
  ]);
  $targetEvaluation->save();
}

Create a custom fact extractor with language detection:

<?php
namespace Drupal\multilang_ai\Plugin\FactExtractor;

use Drupal\ai_autoevals\Plugin\FactExtractor\FactExtractorPluginBase;

/**
 * Multi-language fact extractor.
 *
 * @FactExtractor(
 *   id = "multilang_extractor",
 *   label = @Translation("Multi-Language Extractor"),
 *   description = @Translation("Extracts facts in the detected language."),
 *   weight = 50
 * )
 */
class MultiLanguageFactExtractor extends FactExtractorPluginBase {

  /**
   * Language-specific fact templates.
   */
  protected const LANGUAGE_FACTS = [
    'en' => 'The answer should address the question in English',
    'es' => 'La respuesta debe responder a la pregunta en español',
    'fr' => 'La réponse doit répondre à la question en français',
    'de' => 'Die Antwort muss die Frage auf Deutsch beantworten',
  ];

  /**
   * {@inheritdoc}
   */
  public function extract(string $input, array $context = []): array {
    $facts = [];

    // Detect the language and add the matching fact, falling back to English.
    $language = $this->detectLanguage($input);
    $facts[] = self::LANGUAGE_FACTS[$language] ?? self::LANGUAGE_FACTS['en'];

    // Extract numerical values (language-agnostic).
    if (preg_match_all('/\b\d+(?:\.\d+)?\b/', $input, $matches)) {
      foreach ($matches[0] as $value) {
        $facts[] = "The answer should accurately reference the value: $value";
      }
    }
    return $facts;
  }

  /**
   * Detects the language of the input with a keyword heuristic.
   */
  protected function detectLanguage(string $input): string {
    $languagePatterns = [
      'es' => ['/[¿¡]/u', '/\b(el|la|los|las|es|son|tengo|tiene|está)\b/iu'],
      'fr' => ['/\b(le|la|les|est|sont|je|tu|il|elle|nous|vous)\b/iu'],
      'de' => ['/ß/u', '/\b(der|die|das|ist|sind|ich|du|er|sie|wir|ihr)\b/iu'],
    ];

    // Count hits per language; shared words such as "la" would otherwise
    // make the first language in the list win.
    $scores = [];
    foreach ($languagePatterns as $lang => $patterns) {
      $scores[$lang] = 0;
      foreach ($patterns as $pattern) {
        $scores[$lang] += (int) preg_match_all($pattern, $input);
      }
    }
    arsort($scores);
    $best = array_key_first($scores);
    return $scores[$best] > 0 ? $best : 'en';
  }

  /**
   * {@inheritdoc}
   */
  public function isAvailable(): bool {
    return TRUE;
  }

}
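
One caveat with the numeric pattern in the extractor: it only treats "." as a decimal separator, while Spanish, French and German text commonly writes decimals with a comma. The sketch below shows the difference and one possible adjustment. `extractNumericValues()` is a hypothetical helper, and it does not handle thousands separators ("1.000,50") or comma-separated lists without spaces ("1,2,3").

```php
<?php
/**
 * Extracts numeric values, accepting "." or "," as the decimal separator.
 *
 * Hypothetical helper for illustration only.
 */
function extractNumericValues(string $input): array {
  preg_match_all('/\b\d+(?:[.,]\d+)?\b/u', $input, $matches);
  return $matches[0];
}

$text = 'El precio es 3,14 euros';

// The dot-only pattern splits the decimal number into two values.
preg_match_all('/\b\d+(?:\.\d+)?\b/', $text, $dotOnly);

print_r($dotOnly[0]);
print_r(extractNumericValues($text));
```

Here the dot-only pattern yields "3" and "14" as separate values, while the adjusted pattern keeps "3,14" together, so the generated fact refers to the number as it appears in the text.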

Best practices:

  1. Use Standard Language Codes

    Use ISO 639-1 language codes (en, es, fr, de, etc.):

    'language' => 'en', // English
    'language' => 'es', // Spanish
    'language' => 'fr', // French
  2. Provide Native Prompts

    Write each evaluation prompt in the language of the content it evaluates, as in the Spanish and French sets above.

  3. Test Each Language

    Test evaluation sets for each language to ensure accuracy:

    foreach (['en', 'es', 'fr'] as $lang) {
      $testResponse = makeAiRequest($testQuestions[$lang], $lang);
      print "$lang: $testResponse\n";
    }
  4. Track Language Metrics

    Monitor performance by language to identify issues:

    $scores = getAverageScoreByLanguage();
    if (isset($scores['de']) && $scores['de']['average'] < 0.7) {
      print "Warning: German evaluation scores are low\n";
    }
  5. Handle Unsupported Languages

    Provide a fallback for unsupported languages:

    $supportedLanguages = ['en', 'es', 'fr', 'de'];
    // Use the English evaluation set as the default for anything else.
    $language = in_array($detectedLang, $supportedLanguages, TRUE)
      ? $detectedLang
      : 'en';
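
Detection libraries and browsers often report locale-style codes such as "es-MX" or "fr_CA" rather than bare ISO 639-1 codes. A minimal sketch of normalizing such codes before applying the fallback follows; `resolveEvaluationLanguage()` is a hypothetical helper and the supported list is only an example.

```php
<?php
/**
 * Maps a detected language or locale code to a supported evaluation language.
 *
 * Hypothetical helper for illustration only.
 */
function resolveEvaluationLanguage(string $detected, array $supported, string $fallback = 'en'): string {
  // Reduce "es-MX", "fr_CA" or "DE" to the lowercase primary subtag.
  $primary = strtolower(preg_split('/[-_]/', trim($detected))[0]);
  return in_array($primary, $supported, TRUE) ? $primary : $fallback;
}

$supported = ['en', 'es', 'fr', 'de'];

foreach (['es-MX', 'fr_CA', 'DE', 'ja'] as $code) {
  print $code . ' => ' . resolveEvaluationLanguage($code, $supported) . "\n";
}
```

The resolved code can then be passed as the `language` tag so that regional variants still reach the matching evaluation set.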