Multi-Language
This example shows how to set up AI AutoEvals for evaluating content in multiple languages.
Overview
-
Language Detection: Detect the language of content
-
Language-Specific Evaluation Sets: Create evaluation sets for each language
-
Custom Prompts: Use language-specific evaluation prompts
-
Tag-Based Routing: Route each request to the appropriate evaluation set
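The tag-based routing described above can be pictured as a lookup from request tags to an evaluation set. The following is a minimal sketch, not the module's actual matching code: `selectEvaluationSet()`, the set IDs, and the `default` fallback are assumptions made for illustration.

```php
<?php

/**
 * Returns the ID of the first evaluation set whose tags filter matches.
 *
 * Hypothetical helper: a set matches when every key/value pair in its filter
 * also appears in the request tags. The module's real rules may differ.
 */
function selectEvaluationSet(array $evaluationSets, array $requestTags, string $default = 'default'): string {
  foreach ($evaluationSets as $id => $tagsFilter) {
    // array_diff_assoc() is empty when all filter pairs are present in the tags.
    if ($tagsFilter !== [] && array_diff_assoc($tagsFilter, $requestTags) === []) {
      return $id;
    }
  }
  return $default;
}

// Assumed set IDs, each paired with the Tags Filter used in this example.
$evaluationSets = [
  'en_evaluation' => ['language' => 'en'],
  'es_evaluation' => ['language' => 'es'],
  'fr_evaluation' => ['language' => 'fr'],
];

echo selectEvaluationSet($evaluationSets, ['language' => 'es']), "\n";
echo selectEvaluationSet($evaluationSets, ['language' => 'de']), "\n";
```

With this sample data, a request tagged `es` resolves to `es_evaluation`, while a request tagged `de` matches no filter and falls through to `default`.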
Step 1: Create Language-Specific Evaluation Sets
English Evaluation Set
-
Navigate to /admin/content/ai-autoevals/sets
-
Click “Add Evaluation Set”
-
Configure:
- Label: “English Evaluation”
- Description: “Evaluation for English content”
- Fact Extraction Method: “AI Generated”
- Custom Prompt Template:
You are evaluating the factual accuracy of an AI response in English.

User Question: {{ input }}

Evaluation Criteria:
{{ facts }}

AI Response:
{{ output }}

Please evaluate the response and choose:
A) The response fully meets all evaluation criteria (Exact Match)
B) The response includes all expected information plus additional relevant details (Superset)
C) The response includes some expected information but is missing some details (Subset)
D) The response contradicts evaluation criteria (Disagreement)

Provide your analysis and final choice.
- Tags Filter:
{"language": "en"}
-
Save
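To make the role of the `{{ input }}`, `{{ facts }}`, and `{{ output }}` placeholders concrete, here is a rough sketch of how such a template could be filled in. `renderPromptTemplate()` is a hypothetical helper written for this example; the module performs its own rendering, which may behave differently.

```php
<?php

/**
 * Fills {{ name }} placeholders in a prompt template.
 *
 * Hypothetical helper for illustration only.
 */
function renderPromptTemplate(string $template, array $values): string {
  return preg_replace_callback(
    '/\{\{\s*(\w+)\s*\}\}/',
    // Leave unknown placeholders untouched so that typos stay visible.
    static fn (array $m): string => array_key_exists($m[1], $values) ? (string) $values[$m[1]] : $m[0],
    $template
  );
}

// Shortened version of the English template above.
$template = "User Question: {{ input }}\nEvaluation Criteria:\n{{ facts }}\nAI Response:\n{{ output }}";

$prompt = renderPromptTemplate($template, [
  'input' => 'What is the capital of France?',
  'facts' => '- The capital of France is Paris',
  'output' => 'Paris is the capital of France.',
]);

echo $prompt, "\n";
```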
Spanish Evaluation Set
-
Click “Add Evaluation Set”
-
Configure:
- Label: “Spanish Evaluation”
- Description: “Evaluation for Spanish content”
- Fact Extraction Method: “AI Generated”
- Custom Prompt Template:
Estás evaluando la precisión factual de una respuesta de IA en español.

Pregunta del Usuario: {{ input }}

Criterios de Evaluación:
{{ facts }}

Respuesta de IA:
{{ output }}

Por favor, evalúa la respuesta y elige:
A) La respuesta cumple completamente con todos los criterios de evaluación (Coincidencia Exacta)
B) La respuesta incluye toda la información esperada más detalles adicionales relevantes (Superconjunto)
C) La respuesta incluye parte de la información esperada pero le faltan algunos detalles (Subconjunto)
D) La respuesta contradice los criterios de evaluación (Desacuerdo)

Proporciona tu análisis y elección final.
- Tags Filter:
{"language": "es"}
-
Save
French Evaluation Set
-
Click “Add Evaluation Set”
-
Configure:
- Label: “French Evaluation”
- Description: “Evaluation for French content”
- Fact Extraction Method: “AI Generated”
- Custom Prompt Template:
Vous évaluez l'exactitude factuelle d'une réponse IA en français.

Question de l'utilisateur: {{ input }}

Critères d'évaluation:
{{ facts }}

Réponse IA:
{{ output }}

Veuillez évaluer la réponse et choisir:
A) La réponse répond entièrement à tous les critères d'évaluation (Correspondance Exacte)
B) La réponse inclut toutes les informations attendues plus des détails supplémentaires pertinents (Surensemble)
C) La réponse inclut certaines informations attendues mais il manque certains détails (Sous-ensemble)
D) La réponse contredit les critères d'évaluation (Désaccord)

Fournissez votre analyse et votre choix final.
- Tags Filter:
{"language": "fr"}
-
Save
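Because each language gets its own hand-written template, it is easy to drop a placeholder while translating. The sketch below shows one way to check for that before saving a set. `findMissingPlaceholders()` is a hypothetical helper, and the templates are shortened stand-ins for the ones configured above.

```php
<?php

/**
 * Lists required placeholders that a prompt template does not contain.
 *
 * Hypothetical check; the placeholder names are the three used in this example.
 */
function findMissingPlaceholders(string $template, array $required = ['input', 'facts', 'output']): array {
  preg_match_all('/\{\{\s*(\w+)\s*\}\}/', $template, $matches);
  return array_values(array_diff($required, $matches[1]));
}

$templates = [
  'en' => 'User Question: {{ input }} Evaluation Criteria: {{ facts }} AI Response: {{ output }}',
  'es' => 'Pregunta del Usuario: {{ input }} Criterios de Evaluación: {{ facts }} Respuesta de IA: {{ output }}',
  // Deliberately broken for the demo: the {{ output }} placeholder is missing.
  'fr' => "Question de l'utilisateur: {{ input }} Critères d'évaluation: {{ facts }}",
];

foreach ($templates as $language => $template) {
  $missing = findMissingPlaceholders($template);
  echo $language, ': ', $missing === [] ? 'ok' : 'missing ' . implode(', ', $missing), "\n";
}
```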
Step 2: Detect Content Language
Create a helper function to detect the language:
<?php
/**
 * Detects the language of content with simple pattern matching.
 *
 * Heuristic only: the first language with a matching pattern wins, and short
 * words such as "la" or "un" occur in several languages.
 */
function detectContentLanguage(string $text): string {
  // Distinctive characters first, then common function words.
  // "¿" and "¡" are Spanish punctuation, so only 'es' checks for them.
  $languagePatterns = [
    'es' => ['/[¿¡]/u', '/\b(el|la|los|las|un|una|es|son|tengo|tiene|estoy|está)\b/iu'],
    'fr' => ['/\b(le|la|les|un|une|est|sont|je|tu|il|elle|nous|vous)\b/iu'],
    'de' => ['/ß/u', '/\b(der|die|das|ein|eine|ist|sind|ich|du|er|sie|wir|ihr)\b/iu'],
    'it' => ['/\b(il|la|lo|le|gli|un|una|è|sono|ho|hai)\b/iu'],
  ];

  foreach ($languagePatterns as $lang => $patterns) {
    foreach ($patterns as $pattern) {
      if (preg_match($pattern, $text)) {
        return $lang;
      }
    }
  }

  // Default to English.
  return 'en';
}
Or use a language detection library:
<?php
use LanguageDetection\Language;

/**
 * Detects the language of content using a library.
 */
function detectContentLanguageLib(string $text): string {
  // Requires: composer require patrickschur/language-detection
  $ld = new Language([
    'en', 'es', 'fr', 'de', 'it', 'pt', 'nl', 'pl', 'ru', 'zh', 'ja',
  ]);

  // close() returns an array of language codes mapped to scores, best match
  // first; fall back to English if nothing was scored.
  $scores = $ld->detect($text)->close();
  return (string) (array_key_first($scores) ?? 'en');
}
Step 3: Route Requests by Language
Add a language tag to AI requests:
<?php
use Drupal\ai\OperationType\Chat\ChatInput;
use Drupal\ai\OperationType\Chat\ChatMessage;

/**
 * Makes an AI request with a language tag.
 *
 * Pass 'auto' as $language to detect the language from the question.
 */
function makeAiRequest(string $question, string $language = 'en'): string {
  $input = new ChatInput([
    new ChatMessage('user', $question),
  ]);

  $provider = \Drupal::service('ai.provider')->createInstance('amazeeio');
  $model = 'gpt-4';

  // Detect the language when the caller asks for it.
  if ($language === 'auto') {
    $language = detectContentLanguage($question);
  }

  // Make the request with tracking enabled and the language tag attached.
  $response = $provider->chat($input, $model, [
    'ai_autoevals:track' => TRUE,
    'language' => $language,
  ]);

  return $response->getNormalized()->getText();
}
Step 4: Evaluate Content
<?php
// Evaluate English content.
$englishResponse = makeAiRequest('What is the capital of France?', 'en');
// Will use the "English Evaluation" set.

// Evaluate Spanish content.
$spanishResponse = makeAiRequest('¿Cuál es la capital de Francia?', 'es');
// Will use the "Spanish Evaluation" set.

// Evaluate French content.
$frenchResponse = makeAiRequest('Quelle est la capitale de la France?', 'fr');
// Will use the "French Evaluation" set.

// Auto-detect the language.
$autoResponse = makeAiRequest('Was ist die Hauptstadt von Frankreich?', 'auto');
// Will detect German and use the default set, because no German set exists.
Step 5: Compare Language Performance
<?php
/**
 * Gets the average score by language.
 */
function getAverageScoreByLanguage(): array {
  $query = \Drupal::database()->select('ai_autoevals_evaluation_result', 'e');
  $query->addField('e', 'tags');
  $query->addExpression('AVG(e.score)', 'avg_score');
  $query->addExpression('COUNT(e.id)', 'count');

  // Only include rows whose tags JSON has a language key.
  // JSON_EXTRACT is MySQL/MariaDB syntax.
  $query->where("JSON_EXTRACT(e.tags, '$.language') IS NOT NULL");
  $query->groupBy('e.tags');

  // Rows are grouped by the full tags JSON, so several groups can share one
  // language. Merge them with a count-weighted sum. Rows are objects.
  $totals = [];
  foreach ($query->execute() as $row) {
    $tagsArray = json_decode($row->tags, TRUE);
    $language = $tagsArray['language'] ?? 'unknown';
    $count = (int) $row->count;
    $totals[$language]['sum'] = ($totals[$language]['sum'] ?? 0.0) + (float) $row->avg_score * $count;
    $totals[$language]['count'] = ($totals[$language]['count'] ?? 0) + $count;
  }

  $scores = [];
  foreach ($totals as $language => $total) {
    $scores[$language] = [
      'average' => $total['sum'] / $total['count'],
      'count' => $total['count'],
    ];
  }

  return $scores;
}

// Display results.
$scores = getAverageScoreByLanguage();
foreach ($scores as $language => $data) {
  print "$language: Average {$data['average']}, Count: {$data['count']}\n";
}
Advanced: Content Translation Evaluation
Evaluate content that has been translated:
<?php
/**
 * Evaluates translated content against the original.
 */
function evaluateTranslationQuality(
  string $originalContent,
  string $translatedContent,
  string $sourceLang,
  string $targetLang
): void {
  $evaluationManager = \Drupal::service('ai_autoevals.evaluation_manager');

  // Create an evaluation for the source language.
  $sourceEvaluation = $evaluationManager->createEvaluation([
    'evaluation_set_id' => "{$sourceLang}_evaluation",
    'request_id' => 'translation_source_' . uniqid(),
    'provider_id' => 'amazeeio',
    'model_id' => 'chat',
    'operation_type' => 'chat',
    'input' => "Original text: $originalContent",
    'output' => $originalContent,
    'tags' => [
      'language' => $sourceLang,
      'evaluation_type' => 'translation_quality',
      'translation_pair' => "{$sourceLang}_{$targetLang}",
    ],
  ]);

  // Create an evaluation for the translated content.
  $targetEvaluation = $evaluationManager->createEvaluation([
    'evaluation_set_id' => "{$targetLang}_evaluation",
    'request_id' => 'translation_target_' . uniqid(),
    'provider_id' => 'amazeeio',
    'model_id' => 'chat',
    'operation_type' => 'chat',
    'input' => "Translation: $translatedContent",
    'output' => $translatedContent,
    'tags' => [
      'language' => $targetLang,
      'evaluation_type' => 'translation_quality',
      'translation_pair' => "{$sourceLang}_{$targetLang}",
      'source_language' => $sourceLang,
      'original_id' => $sourceEvaluation->id(),
    ],
  ]);

  // Link the two evaluations to each other.
  $sourceEvaluation->set('metadata', [
    'translation_target_id' => $targetEvaluation->id(),
  ]);
  $sourceEvaluation->save();

  $targetEvaluation->set('metadata', [
    'translation_source_id' => $sourceEvaluation->id(),
  ]);
  $targetEvaluation->save();
}
Advanced: Custom Language Detector
Create a custom fact extractor with language detection:
<?php
namespace Drupal\multilang_ai\Plugin\FactExtractor;
use Drupal\ai_autoevals\Plugin\FactExtractor\FactExtractorPluginBase;
/**
 * Multi-language fact extractor.
 *
 * @FactExtractor(
 *   id = "multilang_extractor",
 *   label = @Translation("Multi-Language Extractor"),
 *   description = @Translation("Extracts facts in the detected language."),
 *   weight = 50
 * )
 */
class MultiLanguageFactExtractor extends FactExtractorPluginBase {

  /**
   * Language-specific fact templates.
   */
  protected const LANGUAGE_FACTS = [
    'en' => 'The answer should address the question in English',
    'es' => 'La respuesta debe responder a la pregunta en español',
    'fr' => 'La réponse doit répondre à la question en français',
    'de' => 'Die Antwort muss die Frage auf Deutsch beantworten',
  ];

  /**
   * {@inheritdoc}
   */
  public function extract(string $input, array $context = []): array {
    $facts = [];

    // Detect the language of the input.
    $language = $this->detectLanguage($input);

    // Add the language-specific fact, falling back to English.
    $facts[] = self::LANGUAGE_FACTS[$language] ?? self::LANGUAGE_FACTS['en'];

    // Extract numerical values (language-agnostic).
    if (preg_match_all('/\b\d+(?:\.\d+)?\b/', $input, $matches)) {
      foreach ($matches[0] as $value) {
        $facts[] = "The answer should accurately reference the value: $value";
      }
    }

    return $facts;
  }

  /**
   * Detects the language of the input.
   *
   * Heuristic only: the first language with a matching pattern wins.
   */
  protected function detectLanguage(string $input): string {
    // "¿" is Spanish punctuation, so only 'es' checks for it.
    $languagePatterns = [
      'es' => ['/¿/u', '/\b(el|la|los|las|es|son|tengo|tiene|está)\b/iu'],
      'fr' => ['/\b(le|la|les|est|sont|je|tu|il|elle|nous|vous)\b/iu'],
      'de' => ['/ß/u', '/\b(der|die|das|ist|sind|ich|du|er|sie|wir|ihr)\b/iu'],
    ];

    foreach ($languagePatterns as $lang => $patterns) {
      foreach ($patterns as $pattern) {
        if (preg_match($pattern, $input)) {
          return $lang;
        }
      }
    }

    return 'en';
  }

  /**
   * {@inheritdoc}
   */
  public function isAvailable(): bool {
    return TRUE;
  }

}
Best Practices
-
Use Standard Language Codes
Use ISO 639-1 language codes (en, es, fr, de, etc.):
'language' => 'en', // English
'language' => 'es', // Spanish
'language' => 'fr', // French
-
Provide Native Prompts
Write each evaluation prompt in the language of the content it evaluates for better accuracy.
-
Test Each Language
Test evaluation sets for each language to ensure accuracy:
foreach (['en', 'es', 'fr'] as $lang) {
  $testResponse = makeAiRequest($testQuestions[$lang], $lang);
  print "$lang: $testResponse\n";
}
-
Track Language Metrics
Monitor performance by language to identify issues:
$scores = getAverageScoreByLanguage();
if (isset($scores['de']) && $scores['de']['average'] < 0.7) {
  print "Warning: German evaluation scores are low\n";
}
-
Handle Unsupported Languages
Provide a fallback for unsupported languages:
$supportedLanguages = ['en', 'es', 'fr', 'de'];
// Use the English evaluation set as the default for unsupported languages.
$language = in_array($detectedLang, $supportedLanguages, TRUE) ? $detectedLang : 'en';
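Detected or user-supplied codes often arrive as full locales such as `es-MX` rather than bare ISO 639-1 codes. The following sketch combines the standard-code and fallback practices above into a single step. `resolveEvaluationLanguage()` is a hypothetical helper, and the supported list simply mirrors the one used in this example.

```php
<?php

/**
 * Maps a locale or detected code to a supported ISO 639-1 code.
 *
 * Hypothetical helper: "es-MX" and "ES_mx" become "es"; anything outside the
 * supported list falls back to $fallback.
 */
function resolveEvaluationLanguage(string $code, array $supported = ['en', 'es', 'fr', 'de'], string $fallback = 'en'): string {
  // Keep only the primary subtag before "-" or "_" and lowercase it.
  $primary = strtolower(trim(preg_split('/[-_]/', $code, 2)[0]));
  return in_array($primary, $supported, TRUE) ? $primary : $fallback;
}

foreach (['es-MX', 'FR', 'pt_BR', ''] as $code) {
  echo var_export($code, TRUE), ' => ', resolveEvaluationLanguage($code), "\n";
}
```

The resolved code can then be passed as the `language` tag, so that unsupported or malformed input is routed to the English evaluation set instead of matching no set at all.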
Next Steps
- Custom Fact Extractors - Domain-specific evaluation
- Content Moderation - Content moderation workflow
- API Reference - Complete service documentation