Prompt Leak Detection

A safety mechanism that prevents the AI from echoing prompt instructions back in the generated content.

Problem

Prompt leak: a fragment of the prompt reappears in the AI response (e.g. "Feladat:" [Task:], "Követelmények:" [Requirements:], Blade directives).

Example false positive (Queue #1595):

Topic: "Laravel Blade direktívák használata" (Using Laravel Blade directives)
The AI generates a tutorial → it mentions @if, @foreach, {!! !!}
The detector wrongly flags it as a prompt leak → retry → another tutorial → another false positive
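
To make the false positive concrete, here is a minimal sketch, assuming the detector is a plain regex scan with no context check. The function name `naiveBladeLeakCheck` and both sample strings are hypothetical; only the Blade pattern itself is taken from the leak pattern list in this document.

```php
<?php

// Hypothetical naive check: flags any Blade-looking token, with no context awareness.
function naiveBladeLeakCheck(string $response): bool
{
    return preg_match('/@foreach|@if|@php|\{\!\!/i', $response) === 1;
}

// A legitimate tutorial sentence that merely mentions a Blade directive.
$tutorial = '<h3>Blade directives</h3><p>Use @foreach to loop over a collection.</p>';

// An ordinary article with no template syntax at all.
$article = '<p>Our new summer collection has arrived.</p>';

var_dump(naiveBladeLeakCheck($tutorial)); // flagged, although nothing leaked
var_dump(naiveBladeLeakCheck($article));  // not flagged
```

Because every regenerated tutorial on the same topic mentions the same directives, a retry cannot resolve this kind of flag, which is the loop described above.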

Solution: 4-layer defense

Layer 1: Preventive (Prompt Instruction Enhancement)

Goal: instruct the AI to use CDATA sections for programming topics.

Implementation: resources/views/prompts/blog-posts/bp_articles/primary-instructions.blade.php

@if(isset($product['theme']) && preg_match('/(laravel|php|blade|...|coding)/iu', $product['theme']))

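{{-- English gloss: "IMPORTANT: for programming topics, using a CDATA section is mandatory!" --}}
FONTOS: Programozási témák esetén CDATA szekció használata kötelező!

{{-- English gloss: "Example of the correct format:" --}}
Példa helyes formátum:
<p>A Blade direktívák használata: <![CDATA[@foreach ($users as $user)]]></p>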

@endif

Effect: the AI is instructed up front → fewer false positives.
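
The topic check in the `@if` condition can be tried outside Blade. This is a minimal sketch: `isProgrammingTheme` is a hypothetical helper, and its keyword list is limited to the four keywords visible in the condition above, since the rest of the alternation is elided in the source.

```php
<?php

// Hypothetical helper mirroring the @if condition of the Blade template.
// Only the keywords visible in the source are listed; the real list is longer.
function isProgrammingTheme(array $product): bool
{
    if (!isset($product['theme'])) {
        return false;
    }

    // i = case-insensitive, u = treat pattern and subject as UTF-8.
    return preg_match('/(laravel|php|blade|coding)/iu', $product['theme']) === 1;
}

var_dump(isProgrammingTheme(['theme' => 'Laravel Blade direktívák használata']));
var_dump(isProgrammingTheme(['theme' => 'Nyári kollekció']));
var_dump(isProgrammingTheme([]));
```

The match is an unanchored substring match, so a theme containing "encoding" also matches via "coding". For a purely preventive layer that only adds an instruction to the prompt, that over-matching is probably harmless.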

Layer 2: Reactive (Context-Aware Detection) 🎯 CORE FIX

Goal: recognize tutorial/educational context with a heuristic.

Implementation: app/Services/PromptLeakDetectionService.php

CDATA Whitelisting

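public function removeCDATASectionsForLeakDetection(string $response): string
{
    // Replace the content of every CDATA section with a placeholder.
    // preg_replace() returns null on a regex error, so fall back to the input.
    return preg_replace('/<!\[CDATA\[(.*?)\]\]>/s', '<![CDATA[__CDATA_CONTENT__]]>', $response) ?? $response;
}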

How it works: Blade directives inside CDATA → do not trigger the detector.
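
The effect of the masking step can be checked in isolation. The sketch below reuses the same regex and placeholder as a free function; the name `maskCdata` and the sample response are hypothetical.

```php
<?php

// Same replacement as removeCDATASectionsForLeakDetection(), as a free function.
function maskCdata(string $response): string
{
    return preg_replace(
        '/<!\[CDATA\[(.*?)\]\]>/s',
        '<![CDATA[__CDATA_CONTENT__]]>',
        $response
    ) ?? $response;
}

$bladePattern = '/@foreach|@if|@php|\{\!\!/i';
$response = '<p>Loop: <![CDATA[@foreach ($users as $user)]]></p>';
$masked = maskCdata($response);

echo $masked, PHP_EOL;                              // <p>Loop: <![CDATA[__CDATA_CONTENT__]]></p>
var_dump(preg_match($bladePattern, $response) === 1); // original text matches the Blade pattern
var_dump(preg_match($bladePattern, $masked) === 1);   // masked text does not
```

Masking changes the length of the string, so a match offset found in the masked text does not point to the same place in the original response. That is presumably the reason the service also exposes findPositionInOriginalResponse().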

Context-Aware Heuristics

public function isLikelyTutorialContext($response, $position)
{
    // Window of ±200 characters around the match. The start is clamped to 0
    // because a negative start makes mb_substr() count from the end of the string.
    $context = mb_substr($response, max(0, $position - 200), 400);

    $tutorialKeywords = [
        'direktíva', 'szintaxis', 'használat', 'például', 'példa',
        'kód', 'code', '<h2>', '<h3>', 'tutorial', 'útmutató', ...
    ];

    // Count how many distinct keywords occur in the window (case-insensitive).
    $keywordCount = 0;
    foreach ($tutorialKeywords as $keyword) {
        if (mb_stripos($context, $keyword) !== false) {
            $keywordCount++;
        }
    }

    $isEarlyPosition = $position < 500;

    // Decision logic:
    if ($keywordCount >= 3 && !$isEarlyPosition) {
        return true; // Tutorial context → skip detection
    }

    return false; // Real leak → flag it
}

Heuristic:

3+ tutorial keywords + NOT an early position (≥ 500 chars) = tutorial context
Early position (< 500 chars) = real leak (the leaked prompt text is at the start)

Example:

Position: 1250 chars (middle of the article)
Keywords: "direktíva", "használat", "példa", <h3> = 4 hits
→ Tutorial context → skip detection ✅
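
The example above can be reproduced with a standalone copy of the heuristic. This is a sketch under stated assumptions: `looksLikeTutorial` is a hypothetical free-function version, the keyword list is shortened to the four keywords from the example, the sample text is invented, and the mbstring extension is assumed to use UTF-8.

```php
<?php

// Standalone copy of the heuristic with a shortened, hypothetical keyword list.
function looksLikeTutorial(string $response, int $position): bool
{
    // Clamp the window start so it never goes negative.
    $context = mb_substr($response, max(0, $position - 200), 400);

    $keywords = ['direktíva', 'használat', 'példa', '<h3>'];
    $hits = 0;
    foreach ($keywords as $keyword) {
        if (mb_stripos($context, $keyword) !== false) {
            $hits++;
        }
    }

    // Tutorial context needs 3+ keywords AND a position of at least 500.
    return $hits >= 3 && $position >= 500;
}

$snippet = '<h3>Blade direktíva használat</h3><p>Egy példa: @foreach</p>';

// Same snippet, once deep inside an article and once at the very start.
$late = str_repeat('x', 1200) . $snippet;
$early = $snippet;

var_dump(looksLikeTutorial($late, mb_strpos($late, '@foreach')));   // tutorial context
var_dump(looksLikeTutorial($early, mb_strpos($early, '@foreach'))); // treated as a leak
```

Note that mb_stripos() ignores case but not accents, so the keyword 'példa' does not match the word "Például". That is likely why both forms appear in the keyword list.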

Layer 3: Gradient (Severity-Based Flagging)

Goal: position-based severity (HIGH/MEDIUM) → different handling.

Implementation: PromptLeakDetectionService::calculateLeakSeverity()

public function calculateLeakSeverity($pattern, $position, $response)
{
    // Blade syntax only
    if (strpos($pattern, '@foreach|@if') !== false) {
        if ($position < 500) {
            return 'HIGH';    // Early → real leak
        }
        if ($position > 1000) {
            return 'MEDIUM';  // Late → tutorial context is likely
        }
        return 'HIGH';        // 500-1000 → ambiguous, treated as HIGH
    }

    return 'HIGH'; // Default: every other pattern is HIGH
}
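
The thresholds are easiest to read off a few sample positions. The sketch below is a standalone copy of the same branching; `leakSeverity` is a hypothetical free-function name and the unused `$response` parameter is dropped.

```php
<?php

// Standalone copy of the severity rule, for experimenting with boundary values.
function leakSeverity(string $pattern, int $position): string
{
    if (strpos($pattern, '@foreach|@if') !== false) {
        if ($position < 500) {
            return 'HIGH';   // early match
        }
        if ($position > 1000) {
            return 'MEDIUM'; // late match
        }
        return 'HIGH';       // 500..1000 inclusive
    }

    return 'HIGH';           // non-Blade patterns
}

$blade = '/@foreach|@if|@php|\{\!\!/i';

foreach ([120, 500, 1000, 1001, 1250] as $position) {
    printf("%5d => %s\n", $position, leakSeverity($blade, $position));
}

// A non-Blade pattern stays HIGH even late in the text.
echo leakSeverity('/^Feladat:/mi', 1250), PHP_EOL;
```

Positions 120, 500 and 1000 come out as HIGH, while 1001 and 1250 come out as MEDIUM: the comparison is a strict `> 1000`, so position 1000 itself still belongs to the ambiguous band.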

Severity handling (Job layer):

HIGH: Retry (max 3 attempts)
MEDIUM: Accept the response, NO retry (tutorial context)

Implementation: QueueBlogPostJob.php, QueueProductDescriptionJob.php

} elseif (isset($results[0]['status']) && $results[0]['status'] === 'prompt_leak') {
    $severity = $results[0]['leak_details']['severity'] ?? 'HIGH';

    // MEDIUM severity: Accept (tutorial context)
    if ($severity === 'MEDIUM') {
        Log::channel('tracing')->warning('Prompt leak detected but accepted (MEDIUM severity - tutorial context)', [...]);
        $repository->updateQueuedBlogPostWithSuccess($requestNode, $queuedItem, $results, $isAutomated);
        return;
    }

    // HIGH severity: Retry logic
    // ...
}
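
The branching in the job can be summarized as a small pure function. This is only a sketch: `decideLeakAction`, its parameters and the 'fail' outcome once the attempts are used up are assumptions, since the source elides the HIGH-severity retry logic and only states the limit of 3 attempts.

```php
<?php

// Hypothetical decision helper; the job classes do not necessarily look like this.
// $attemptsSoFar counts the attempts already made, including the current one.
function decideLeakAction(string $severity, int $attemptsSoFar, int $maxAttempts = 3): string
{
    if ($severity === 'MEDIUM') {
        return 'accept'; // tutorial context: keep the response, no retry
    }

    // HIGH severity: retry until the attempt budget is used up.
    // What happens afterwards is an assumption; 'fail' is a placeholder outcome.
    return $attemptsSoFar < $maxAttempts ? 'retry' : 'fail';
}

var_dump(decideLeakAction('MEDIUM', 1));
var_dump(decideLeakAction('HIGH', 1));
var_dump(decideLeakAction('HIGH', 3));
```

Keeping the decision free of side effects makes it straightforward to unit-test separately from the queue, the repository and the logger.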

Layer 4: Adaptive Retry (FUTURE - Optional)

Goal: ML-based pattern fine-tuning from production data.

Status: ❌ NOT implemented (optional, future work)

Architecture

Service Layer (Clean Architecture)

File: app/Services/PromptLeakDetectionService.php

Public methods:

detectPromptLeak($response) - Main detection
calculateLeakSeverity($pattern, $position, $response) - Severity calculation
isLikelyTutorialContext($response, $position) - Tutorial heuristic
removeCDATASectionsForLeakDetection($response) - CDATA whitelist
findPositionInOriginalResponse(...) - Position mapping

Dependency Injection:

// AiRepository.php
public function __construct(
    protected PromptLeakDetectionService $promptLeakDetectionService
) { }

private function cleanupResponse($response, ...)
{
    $leakDetection = $this->promptLeakDetectionService->detectPromptLeak($response);
    // ...
}

Laravel auto-binding via constructor type-hinting.

Leak Patterns

Detected patterns (regex):

$leakPatterns = [
    // Format instructions
    '/<format_instructions>/i',
    '/Structure your responses according to/i',

    // Task instructions (Hungarian) - CONTEXT-AWARE
    '/^Feladat:/mi',
    '/^Irányelvek:/mi',
    '/^Követelmények:/mi',  // Line start only (avoids product description false positives)

    // Blade/template syntax - CONTEXT-AWARE
    '/@foreach|@if|@php|\{\!\!/i' => [
        'name' => 'blade_syntax',
        'context_check' => true,  // Enable heuristics
    ],

    // Meta-instruction patterns
    '/^(Készíts|Írj egy|Válaszolj|A cikk legyen|Tapasztalt szövegíróként)/mi',

    // Token budget
    '/tokenek arányában|tokenek.*%-a/i',
];
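
The list mixes plain regex entries with one keyed entry that carries a config array, so whatever iterates it has to handle both shapes. The sketch below shows one way to do that; `normalizePatterns` and `findFirstLeak` are hypothetical names, not the service's actual methods, and the source file is assumed to be UTF-8.

```php
<?php

// Bring both entry shapes into the form regex => config.
function normalizePatterns(array $patterns): array
{
    $normalized = [];
    foreach ($patterns as $key => $value) {
        if (is_int($key)) {
            // Plain entry: the value is the regex itself.
            $normalized[$value] = ['name' => $value, 'context_check' => false];
        } else {
            // Keyed entry: the key is the regex, the value is its config.
            $normalized[$key] = $value + ['name' => $key, 'context_check' => false];
        }
    }

    return $normalized;
}

// Return the first matching pattern with a character-based position, or null.
function findFirstLeak(string $response, array $patterns): ?array
{
    foreach (normalizePatterns($patterns) as $regex => $config) {
        if (preg_match($regex, $response, $match, PREG_OFFSET_CAPTURE) === 1) {
            // preg_match() reports byte offsets; convert to characters so the
            // position is usable with mb_substr() on accented text.
            $position = mb_strlen(substr($response, 0, $match[0][1]));

            return ['name' => $config['name'], 'position' => $position,
                    'context_check' => $config['context_check']];
        }
    }

    return null;
}

$patterns = [
    '/^Feladat:/mi',
    '/@foreach|@if|@php|\{\!\!/i' => ['name' => 'blade_syntax', 'context_check' => true],
];

$hit = findFirstLeak('Árvíztűrő szöveg @foreach', $patterns);
var_dump($hit); // name blade_syntax, position 17, context_check true
```

The byte-to-character conversion matters for Hungarian text: in the sample, `@foreach` starts at byte 22 but at character 17, and the position thresholds (500 and 1000) as well as the ±200 window are expressed in characters.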

Monitoring

Logs

Channel: api-response

Log types:

// Context-aware skip
Log::channel('api-response')->info('Prompt leak pattern detected but skipped (tutorial context)', [
    'pattern' => 'blade_syntax',
    'position' => 1250,
    'context' => '...preceding 200 characters...',
]);

// Real leak
Log::channel('api-response')->error('Prompt leak detected in AI response', [
    'pattern_matched' => 'blade_syntax',
    'leak_position' => 120,
    'severity' => 'HIGH',
]);

Query (MySQL):

SELECT created_at, level, message, context
FROM logs
WHERE channel = 'api-response'
  AND message LIKE '%Prompt leak%'
ORDER BY created_at DESC
LIMIT 50;

Production Checklist

After deployment, verify:

False positive rate reduction:
- Query: records with prompt_leak_attempts > 0 (queue tables)
- Expected: < 1% (for tutorial topics)

MEDIUM severity usage:

SELECT COUNT(*) FROM logs
WHERE channel = 'api-response'
  AND message LIKE '%MEDIUM severity%'
  AND created_at > '2026-02-05';

Expected: > 0 (tutorial context detection is working)

Tutorial context skip rate:

SELECT COUNT(*) FROM logs
WHERE channel = 'api-response'
  AND message LIKE '%tutorial context%'
  AND created_at > '2026-02-05';

Commit History

Commit	Description
`2d3bb9a0`	Phase 1: Prompt instruction enhancement
`6ac8c747`	Phase 2: Context-aware detection (heuristics)
`21075e52`	Phase 3: Severity-based flagging (HIGH/MEDIUM)
`b13f65f6`	Refactor: Extract PromptLeakDetectionService

CI Results:

Staging: ✅ Pipeline #205 (276 sec)
Main: ✅ Pipeline #206

Related Documentation

Implemented: 2026-02-05 Version: 1.0
