# API Reference
This document provides a comprehensive API reference for AI AutoEvals services and entities.
## Services

### AiAutoevalsConfig

**Service ID:** `ai_autoevals.config`

Centralized configuration service for accessing module settings.

```php
$config = \Drupal::service('ai_autoevals.config');
```

#### Methods
##### getProviderId(): string

Gets the default AI provider ID with fallback to system defaults.

```php
$providerId = $config->getProviderId();
```

**Returns:** `string` - The provider ID
##### getModelId(): string

Gets the default AI model ID with fallback to system defaults.

```php
$modelId = $config->getModelId();
```

**Returns:** `string` - The model ID
##### getProvider(): ProviderProxy

Gets the configured AI provider instance.

```php
$provider = $config->getProvider();
```

**Returns:** `\Drupal\ai\Plugin\ProviderProxy` - The AI provider instance

**Throws:** `\RuntimeException` - If no AI provider is configured
##### isConfigured(): bool

Checks whether the AI provider is properly configured.

```php
if ($config->isConfigured()) {
  // Provider and model are configured.
}
```

**Returns:** `bool` - TRUE if both a provider and a model are configured
##### getGlobalExcludeQueryKeywords(): array

Gets the global query exclusion keywords from module settings.

```php
$exclusions = $config->getGlobalExcludeQueryKeywords();
```

**Returns:** `array` - The global query exclusion keywords
##### getGlobalExcludeResponseKeywords(): array

Gets the global response exclusion keywords from module settings.

```php
$exclusions = $config->getGlobalExcludeResponseKeywords();
```

**Returns:** `array` - The global response exclusion keywords
##### getOperationTypes(): array

Gets the configured operation types to evaluate.

```php
$types = $config->getOperationTypes();
// Returns: ['chat', 'chat_completion']
```

**Returns:** `array` - The operation types
##### isAutoTrackEnabled(): bool

Checks if auto-tracking is enabled.

```php
if ($config->isAutoTrackEnabled()) {
  // Auto-track is enabled.
}
```

**Returns:** `bool` - TRUE if auto-track is enabled
##### isDebugMode(): bool

Checks if debug mode is enabled.

```php
if ($config->isDebugMode()) {
  // Debug mode is enabled.
}
```

**Returns:** `bool` - TRUE if debug mode is enabled
##### getFactExtractionMethod(): string

Gets the default fact extraction method.

```php
$method = $config->getFactExtractionMethod();
```

**Returns:** `string` - The fact extraction method
##### getContextDepth(): int

Gets the default context depth.

```php
$depth = $config->getContextDepth();
```

**Returns:** `int` - The context depth
##### getRetentionPeriod(): int

Gets the retention period in days.

```php
$days = $config->getRetentionPeriod();
```

**Returns:** `int` - The retention period in days
### KeywordMatcher

**Service ID:** `ai_autoevals.keyword_matcher`

Service for keyword matching used throughout the module.

```php
$matcher = \Drupal::service('ai_autoevals.keyword_matcher');
```

#### Methods
##### matchesAny(string $text, array $keywords): bool

Checks if text matches any of the given keywords.

```php
if ($matcher->matchesAny('the weather is nice', ['weather', 'forecast'])) {
  // At least one keyword matched.
}
```

**Parameters:**

- `$text` (string): The text to search in
- `$keywords` (array): The keywords to match

**Returns:** `bool` - TRUE if any keyword is found
##### matchesAll(string $text, array $keywords): bool

Checks if text matches all of the given keywords.

```php
if ($matcher->matchesAll('the weather is nice and sunny', ['weather', 'sunny'])) {
  // All keywords matched.
}
```

**Parameters:**

- `$text` (string): The text to search in
- `$keywords` (array): The keywords to match

**Returns:** `bool` - TRUE if all keywords are found
##### matches(string $text, array $keywords, string $mode = 'any'): bool

Checks if text matches keywords based on the specified mode.

```php
$anyMatch = $matcher->matches($text, $keywords, 'any');
$allMatch = $matcher->matches($text, $keywords, 'all');
```

**Parameters:**

- `$text` (string): The text to search in
- `$keywords` (array): The keywords to match
- `$mode` (string): The match mode: 'any' or 'all'

**Returns:** `bool` - TRUE if the text matches the keywords according to the mode
### EvaluationManager

**Service ID:** `ai_autoevals.evaluation_manager`

The main service for managing evaluations.

```php
$evaluationManager = \Drupal::service('ai_autoevals.evaluation_manager');
```

#### Methods
##### createEvaluation(array $data): EvaluationResultInterface

Creates a new evaluation result entity.

```php
$evaluation = $evaluationManager->createEvaluation([
  'evaluation_set_id' => 'default',
  'request_id' => 'unique-request-id',
  'request_parent_id' => NULL,
  'provider_id' => 'openai',
  'model_id' => 'gpt-4',
  'operation_type' => 'chat',
  'input' => 'User input text',
  'output' => 'AI response text',
  'tags' => ['key' => 'value'],
  'metadata' => ['additional' => 'data'],
]);
```

**Parameters:**

- `$data` (array): Array of evaluation data:
  - `evaluation_set_id` (string): ID of the evaluation set to use
  - `request_id` (string): Unique identifier for the request
  - `request_parent_id` (string|null): Parent request ID for conversation tracking
  - `provider_id` (string): AI provider ID
  - `model_id` (string): AI model ID
  - `operation_type` (string): Operation type (chat, chat_completion)
  - `input` (string): The user's input
  - `output` (string): The AI's response
  - `tags` (array): Optional tags
  - `metadata` (array): Optional metadata

**Returns:** `EvaluationResultInterface`
##### queueEvaluation(int $evaluation_id): void

Queues an evaluation for async processing.

```php
$evaluationManager->queueEvaluation($evaluation->id());
```

**Parameters:**

- `$evaluation_id` (int): The evaluation entity ID
##### getEvaluationScore(int $evaluation_id): ?float

Gets the score for a completed evaluation.

```php
$score = $evaluationManager->getEvaluationScore(123);
```

**Parameters:**

- `$evaluation_id` (int): The evaluation entity ID

**Returns:** `float|null` - The score, or NULL if not completed
##### getEvaluationHistory(array $filters, int $limit, int $offset): array

Gets evaluation history with optional filters.

```php
$results = $evaluationManager->getEvaluationHistory([
  'status' => 'completed',
  'evaluation_set_id' => 'default',
  'score_min' => 0.5,
], 50, 0);
```

**Parameters:**

- `$filters` (array): Filter criteria:
  - `status` (string|null): Filter by status
  - `evaluation_set_id` (string|null): Filter by evaluation set
  - `score_min` (float|null): Minimum score
  - `score_max` (float|null): Maximum score
  - `provider_id` (string|null): Filter by provider
  - `model_id` (string|null): Filter by model
- `$limit` (int): Number of results to return
- `$offset` (int): Number of results to skip

**Returns:** `array` - Array of evaluation entities
##### getActiveEvaluationSets(): array

Gets all enabled evaluation set configurations.

```php
$sets = $evaluationManager->getActiveEvaluationSets();
```

**Returns:** `array` - Array of EvaluationSet entities
##### getMatchingEvaluationSet(array $tags, string $operationType): ?EvaluationSetInterface

Finds the first matching evaluation set for the given tags and operation type.

```php
$set = $evaluationManager->getMatchingEvaluationSet(['category' => 'support'], 'chat');
```

**Parameters:**

- `$tags` (array): Tags from the request
- `$operationType` (string): Operation type (chat, chat_completion)

**Returns:** `EvaluationSetInterface|null` - Matching evaluation set, or NULL
##### getMatchingEvaluationSetWithHook(array $tags, string $operationType, ?string $inputText = NULL, ?string $outputText = NULL): ?EvaluationSetInterface

Gets matching evaluation sets and allows modules to filter them via `hook_ai_autoevals_evaluation_sets_alter()`.

This method retrieves all matching evaluation sets and invokes the hook to allow modules to filter the sets based on custom criteria such as language, user roles, or content type.

```php
$set = $evaluationManager->getMatchingEvaluationSetWithHook(
  ['category' => 'support'],
  'chat',
  'What is the weather like today?',
  NULL
);
```

**Parameters:**

- `$tags` (array): Tags from the request
- `$operationType` (string): Operation type (chat, chat_completion)
- `$inputText` (string|null): Optional input text for additional filtering
- `$outputText` (string|null): Optional output text for additional filtering

**Returns:** `EvaluationSetInterface|null` - The first matching evaluation set after hook filtering, or NULL

**See also:** `hook_ai_autoevals_evaluation_sets_alter()`
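As a rough sketch, a module could implement the alter hook along the following lines. The exact hook signature is not shown in this reference, so the parameter names and the structure of `$sets` and `$context` below are assumptions; consult the module's `*.api.php` file for the authoritative definition.

```php
/**
 * Implements hook_ai_autoevals_evaluation_sets_alter().
 *
 * Sketch only: the $sets/$context parameters and context keys are assumed.
 */
function mymodule_ai_autoevals_evaluation_sets_alter(array &$sets, array $context) {
  // Hypothetical example: restrict anonymous users to the 'strict' set.
  if (\Drupal::currentUser()->isAnonymous()) {
    $sets = array_filter($sets, fn ($set) => $set->id() === 'strict');
  }
}
```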
##### shouldEvaluateOperation(string $operationType): bool

Checks if an operation type should be evaluated, based on configuration.

```php
if ($evaluationManager->shouldEvaluateOperation('chat')) {
  // Proceed with evaluation.
}
```

**Parameters:**

- `$operationType` (string): Operation type to check

**Returns:** `bool` - TRUE if the operation should be evaluated
##### getStatistics(): array

Gets dashboard statistics.

```php
$stats = $evaluationManager->getStatistics();
// Returns:
// [
//   'total' => 1000,
//   'by_status' => ['pending' => 10, 'completed' => 980, 'failed' => 10],
//   'average_score' => 0.85,
//   'by_evaluation_set' => ['default' => 500, 'strict' => 500],
// ]
```

**Returns:** `array` - Statistics array with `total`, `by_status`, `average_score`, and `by_evaluation_set` keys
##### requeueEvaluation(int $evaluation_id): bool

Requeues a failed or pending evaluation.

```php
$success = $evaluationManager->requeueEvaluation(123);
```

**Parameters:**

- `$evaluation_id` (int): The evaluation entity ID

**Returns:** `bool` - TRUE on success
##### getRecentEvaluations(int $limit = 10): array

Gets recent evaluations.

```php
$recent = $evaluationManager->getRecentEvaluations(20);
```

**Parameters:**

- `$limit` (int): Number of evaluations to return

**Returns:** `array` - Array of evaluation entities
### FactExtractor

**Service ID:** `ai_autoevals.fact_extractor`

Service for extracting evaluation criteria from user input.

```php
$factExtractor = \Drupal::service('ai_autoevals.fact_extractor');
```

#### Methods

##### extractFacts(string $input, array $context = [], ?EvaluationSetInterface $evaluationSet = NULL): array

Extracts evaluation criteria from user input.

```php
$facts = $factExtractor->extractFacts(
  'What is the capital of France? Answer in one word.',
  ['previous_turns' => []],
  $evaluationSet
);
// Returns:
// [
//   'The answer should mention Paris as the capital of France',
//   'The answer should contain no more than one word',
// ]
```

**Parameters:**

- `$input` (string): The user's input/question
- `$context` (array): Additional context (previous turns, etc.)
- `$evaluationSet` (EvaluationSetInterface|null): Optional evaluation set with custom knowledge

**Returns:** `array` - Array of extracted facts
#### Using Custom Knowledge

When the evaluation set has custom knowledge defined, the fact extractor incorporates it:

```php
// Evaluation set with custom knowledge about a product.
$evaluationSet = EvaluationSet::load('product_support');
// Custom knowledge: "SuperWidget battery life: 8 hours, Weight: 250g"

$facts = $factExtractor->extractFacts(
  'What is the battery life of SuperWidget?',
  [],
  $evaluationSet
);
// Returns:
// [
//   'The answer should state that SuperWidget has 8 hours of battery life',
// ]
```

### Evaluator
**Service ID:** `ai_autoevals.evaluator`

Service for evaluating AI responses against extracted facts.

```php
$evaluator = \Drupal::service('ai_autoevals.evaluator');
```

#### Methods

##### evaluate(EvaluationSetInterface $evaluationSet, string $input, array $facts, string $output): array

Evaluates a response.

```php
$result = $evaluator->evaluate(
  $evaluationSet,
  'User question',
  ['Fact 1', 'Fact 2'],
  'AI response'
);
// Returns: ['choice' => 'C', 'score' => 1.0, 'analysis' => '...']
```

**Parameters:**

- `$evaluationSet` (EvaluationSetInterface): Evaluation set configuration
- `$input` (string): The user's original question
- `$facts` (array): Extracted evaluation criteria
- `$output` (string): The AI's response

**Returns:** `array` - Array with `choice`, `score`, and `analysis`
##### loadPromptTemplate(EvaluationSetInterface $evaluationSet): string

Loads the prompt template from the evaluation set.

```php
$template = $evaluator->loadPromptTemplate($evaluationSet);
```

**Parameters:**

- `$evaluationSet` (EvaluationSetInterface): Evaluation set configuration

**Returns:** `string` - The prompt template
##### parseResponse(string $response): array

Parses the LLM response to extract the choice and analysis.

```php
$parsed = $evaluator->parseResponse($responseText);
// Returns: ['choice' => 'C', 'analysis' => '...']
```

**Parameters:**

- `$response` (string): Raw LLM response

**Returns:** `array` - Parsed choice and analysis
##### calculateScore(string $choice, array $choiceScores): float

Calculates the score based on the choice.

```php
$score = $evaluator->calculateScore('C', ['A' => 0.4, 'B' => 0.6, 'C' => 1.0, 'D' => 0.0]);
```

**Parameters:**

- `$choice` (string): The evaluation choice (A, B, C, D)
- `$choiceScores` (array): Scoring configuration

**Returns:** `float` - The calculated score
### ConversationTracker

**Service ID:** `ai_autoevals.conversation_tracker`

Service for tracking conversation context.

```php
$tracker = \Drupal::service('ai_autoevals.conversation_tracker');
```

#### Methods

##### trackConversation(string $requestId, ?string $parentId, array $data): void

Tracks a conversation turn.

```php
$tracker->trackConversation('request-123', 'parent-456', [
  'input' => 'User message',
  'output' => 'AI response',
]);
```

**Parameters:**

- `$requestId` (string): Current request ID
- `$parentId` (string|null): Parent request ID
- `$data` (array): Conversation data
##### getConversationContext(string $requestId, int $depth): array

Gets conversation context.

```php
$context = $tracker->getConversationContext('request-123', 3);
```

**Parameters:**

- `$requestId` (string): Request ID
- `$depth` (int): Number of turns to include

**Returns:** `array` - Conversation context
##### isFollowUp(string $requestId): bool

Checks if a request is a follow-up.

```php
if ($tracker->isFollowUp('request-123')) {
  // This is a follow-up request.
}
```

**Parameters:**

- `$requestId` (string): Request ID to check

**Returns:** `bool` - TRUE if it is a follow-up
##### getThreadRoot(string $requestId): string

Gets the root request ID of a conversation thread.

```php
$rootId = $tracker->getThreadRoot('request-123');
```

**Parameters:**

- `$requestId` (string): Request ID in the thread

**Returns:** `string` - Root request ID
##### clearConversation(string $requestId): void

Clears conversation data for a request.

```php
$tracker->clearConversation('request-123');
```

**Parameters:**

- `$requestId` (string): Request ID to clear
### EvaluationBatchProcessor

**Service ID:** `ai_autoevals.batch_processor`

Service for batch operations.

```php
$batchProcessor = \Drupal::service('ai_autoevals.batch_processor');
```

#### Methods

##### reEvaluateBatch(array $evaluationIds, string $newEvaluationSetId): array

Re-evaluates multiple evaluations with a different configuration.

```php
$newIds = $batchProcessor->reEvaluateBatch([1, 2, 3], 'strict');
```

**Parameters:**

- `$evaluationIds` (array): Array of evaluation IDs
- `$newEvaluationSetId` (string): New evaluation set ID to use

**Returns:** `array` - Array of new evaluation IDs
##### compareConfigurations(array $configIds, array $sampleIds): array

Compares multiple evaluation configurations.

```php
$results = $batchProcessor->compareConfigurations(
  ['default', 'strict'],
  [1, 2, 3, 4, 5]
);
```

**Parameters:**

- `$configIds` (array): Evaluation set IDs to compare
- `$sampleIds` (array): Evaluation IDs to sample

**Returns:** `array` - Comparison results
##### requeueAllFailed(): int

Requeues all failed evaluations.

```php
$count = $batchProcessor->requeueAllFailed();
```

**Returns:** `int` - Number of evaluations requeued
##### scheduleBatchReEvaluation(array $filters, string $newConfigId, int $limit = 100): int

Schedules batch re-evaluation based on filters.

```php
$count = $batchProcessor->scheduleBatchReEvaluation(
  ['status' => 'completed', 'evaluation_set_id' => 'default'],
  'strict',
  50
);
```

**Parameters:**

- `$filters` (array): Filter criteria
- `$newConfigId` (string): New evaluation set ID
- `$limit` (int): Maximum evaluations to process

**Returns:** `int` - Number of evaluations scheduled
##### getComparisonResults(int $originalId): array

Gets comparison data for completed re-evaluations.

```php
$comparison = $batchProcessor->getComparisonResults(123);
// Returns: ['original' => [...], 're_evaluations' => [...]]
```

**Parameters:**

- `$originalId` (int): Original evaluation ID

**Returns:** `array` - Comparison data
## Entities

### EvaluationResult

Content entity storing evaluation results.

**Entity Type ID:** `ai_autoevals_evaluation_result`

**Interface:** `EvaluationResultInterface`

#### Key Methods

- `getEvaluationSetId(): string` - Get evaluation set ID
- `getRequestId(): string` - Get request ID
- `getRequestParentId(): ?string` - Get parent request ID
- `getProviderId(): string` - Get provider ID
- `getModelId(): string` - Get model ID
- `getOperationType(): string` - Get operation type
- `getInput(): string` - Get user input
- `getOutput(): string` - Get AI response
- `getFacts(): array` - Get extracted facts
- `getStatus(): string` - Get status (pending, processing, completed, failed)
- `getScore(): ?float` - Get score
- `getChoice(): ?string` - Get evaluation choice
- `getAnalysis(): ?string` - Get analysis text
- `getTags(): array` - Get tags
- `getMetadata(): array` - Get metadata
- `isPending(): bool` - Check if pending
- `isProcessing(): bool` - Check if processing
- `isCompleted(): bool` - Check if completed
- `isFailed(): bool` - Check if failed
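As a usage sketch, evaluation results can be loaded through standard Drupal entity storage and read with the accessors above. The entity type ID comes from this reference; the rest is generic core entity API:

```php
// Load a result by entity ID via the entity type manager.
$storage = \Drupal::entityTypeManager()
  ->getStorage('ai_autoevals_evaluation_result');
$result = $storage->load(123);

if ($result && $result->isCompleted()) {
  // Read the evaluation outcome.
  $score = $result->getScore();
  $analysis = $result->getAnalysis();
}
```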
### EvaluationSet

Config entity storing evaluation configurations.

**Entity Type ID:** `ai_autoevals_evaluation_set`

**Interface:** `EvaluationSetInterface`

#### Key Methods

- `getDescription(): string` - Get description
- `getOperationTypes(): array` - Get operation types
- `getFactExtractionMethod(): string` - Get fact extraction method
- `getCustomKnowledge(): string` - Get custom knowledge
- `setCustomKnowledge(string $knowledge): static` - Set custom knowledge
- `hasCustomKnowledge(): bool` - Check if custom knowledge exists
- `getPromptTemplateId(): string` - Get prompt template ID
- `getCustomPromptTemplate(): string` - Get custom prompt template
- `getChoiceScores(): array` - Get choice scores
- `getScoreForChoice(string $choice): float` - Get the score for a specific choice
- `getTags(): array` - Get tag filters
- `getContextDepth(): int` - Get context depth
- `isEnabled(): bool` - Check if enabled
- `getWeight(): int` - Get weight
- `matchesOperationType(string $operationType): bool` - Check if it matches an operation type
- `matchesTags(array $requestTags): bool` - Check if it matches tags
#### Keyword Matching Methods

- `hasKeywords(): bool` - Check if query or response keywords are defined
- `matchesQuery(string $query): bool` - Check if a query matches the keywords
- `matchesResponse(string $response): bool` - Check if a response matches the keywords
- `getQueryKeywords(): array` - Get query keywords
- `getResponseKeywords(): array` - Get response keywords
- `getExcludeQueryKeywords(): array` - Get query exclusion keywords
- `getExcludeResponseKeywords(): array` - Get response exclusion keywords
- `getKeywordMatchMode(): string` - Get match mode ('any' or 'all')
#### Programmatic Creation with Builder

Use the fluent builder pattern to create evaluation sets:

```php
use Drupal\ai_autoevals\Entity\EvaluationSet;

$set = EvaluationSet::builder('weather_eval', 'Weather Evaluation')
  ->withDescription('Evaluates weather-related AI responses')
  ->forOperations(['chat'])
  ->triggerOnKeywords(['weather', 'forecast'], [])
  ->excludeOnKeywords(['test', 'debug'], [])
  ->withFactExtractionMethod('ai_generated')
  ->withContextDepth(3)
  ->build();
```

**Builder Methods:**

- `EvaluationSet::builder(string $id, string $label)`: Create a new builder
- `withDescription(string $description)`: Set description
- `forOperations(array $types)`: Set operation types
- `withTags(array $tags)`: Set required tags
- `triggerOnKeywords(array $queryKeywords, array $responseKeywords = [])`: Set trigger keywords
- `excludeOnKeywords(array $queryKeywords, array $responseKeywords = [])`: Set exclusion keywords
- `withKeywordMatchMode(string $mode)`: Set 'any' or 'all' mode
- `withFactExtractionMethod(string $method)`: Set extraction method
- `withContextDepth(int $depth)`: Set context depth
- `withCustomKnowledge(string $knowledge)`: Set domain knowledge
- `withCustomPromptTemplate(string $template)`: Set custom prompt
- `withChoiceScores(array $scores)`: Set scoring mapping
- `withWeight(int $weight)`: Set priority weight
- `enabled(bool $enabled = TRUE)`: Set enabled status
- `disabled()`: Disable the set
- `build()`: Create and save the set
- `buildWithoutSaving()`: Create without saving
#### Custom Knowledge

The `custom_knowledge` field allows you to provide domain-specific context that guides the fact extractor.

```php
// Set custom knowledge on an evaluation set.
$evaluationSet->setCustomKnowledge('Product: SuperWidget Pro
- Battery life: 8 hours
- Weight: 250g
- Colors: Red, Blue, Green');
$evaluationSet->save();

// Check if custom knowledge exists.
if ($evaluationSet->hasCustomKnowledge()) {
  $knowledge = $evaluationSet->getCustomKnowledge();
}
```

## Queue

**Queue ID:** `ai_autoevals_evaluation_worker`
The queue processes evaluations asynchronously.

Process it manually:

```shell
drush queue:run ai_autoevals_evaluation_worker
```

**Worker Class:** `Drupal\ai_autoevals\Plugin\QueueWorker\EvaluationQueueWorker`

**Time Limit:** 60 seconds per cron run
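The queue can also be inspected programmatically through Drupal core's queue factory. This is a sketch using only the standard core API, with the queue name taken from the Queue ID above:

```php
// Get the queue instance by its ID via the core queue factory.
$queue = \Drupal::queue('ai_autoevals_evaluation_worker');

// Number of evaluations currently waiting to be processed.
$pending = $queue->numberOfItems();
```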
## Next Steps

- Event System - Learn about events
- Plugin Development - Create custom plugins
- Examples - See real-world implementations