Debugging bugs in production takes time. I built a system that traces the full call chain from the controller down to the service layer, then analyzes the collected error logs with an LLM for automatic root cause detection. The Spring Boot AOP + Ollama combination takes observability to the next level.

Problem: The Production Bug Debugging Process

A typical production error scenario unfolds like this:

  1. A user reports an error
  2. You scan the log files manually
  3. You spot the exception in the stack trace
  4. But you can't tell which service, calling which method, with which parameters, produced the error
  5. You switch the log level to DEBUG and test again
  6. You burn hours until you find the problem

This process is both time-consuming and frustrating. Especially in a microservice architecture, when the call chain spans multiple services, pinpointing the source of an error becomes much harder.

Solution: AOP-Based Dynamic Tracing + AI Analysis

System Architecture:

Implementation: Step by Step

1. Trace Entity ve Model

java ApiTrace.java
@Entity
@Table(name = "api_traces")
@Data
public class ApiTrace {
    
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
    
    @Column(name = "trace_id", nullable = false, unique = true)
    private String traceId; // UUID for correlation
    
    @Column(name = "request_path")
    private String requestPath; // /api/declaration/save
    
    @Column(name = "http_method")
    private String httpMethod; // POST, GET, etc.
    
    @Column(name = "class_name")
    private String className; // com.example.DeclarationController
    
    @Column(name = "method_name")
    private String methodName; // saveDeclaration
    
    @Column(columnDefinition = "TEXT")
    private String parameters; // JSON serialized params
    
    @Column(columnDefinition = "TEXT")
    private String result; // Return value (if exists)
    
    @Column(columnDefinition = "TEXT")
    private String exception; // Exception message + stack trace
    
    @Column(name = "execution_time_ms")
    private Long executionTimeMs;
    
    @Column(name = "created_at")
    private LocalDateTime createdAt;
    
    @Column(name = "user_id")
    private String userId;
    
    @Column(name = "session_id")
    private String sessionId;
}

2. AOP Aspect Implementation

java ApiTraceAspect.java
@Aspect
@Component
@Slf4j
public class ApiTraceAspect {

    @Autowired
    private ApiTraceRepository traceRepository;
    
    @Autowired
    private ObjectMapper objectMapper;
    
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Intercept all Controller methods
    @Around("execution(* com.experilabs..controller..*(..))")
    public Object traceController(ProceedingJoinPoint joinPoint) throws Throwable {
        return traceMethod(joinPoint, "CONTROLLER");
    }

    // Intercept all Service methods
    @Around("execution(* com.experilabs..service..*(..))")
    public Object traceService(ProceedingJoinPoint joinPoint) throws Throwable {
        return traceMethod(joinPoint, "SERVICE");
    }

    // Intercept all Repository methods
    @Around("execution(* com.experilabs..repository..*(..))")
    public Object traceRepository(ProceedingJoinPoint joinPoint) throws Throwable {
        return traceMethod(joinPoint, "REPOSITORY");
    }

    private Object traceMethod(ProceedingJoinPoint joinPoint, String layer) throws Throwable {
        // Check if tracing is enabled via header; bail out when there is no
        // request context (e.g. async or scheduled execution)
        ServletRequestAttributes attributes =
            (ServletRequestAttributes) RequestContextHolder.getRequestAttributes();
        if (attributes == null) {
            return joinPoint.proceed();
        }
        HttpServletRequest request = attributes.getRequest();

        String traceHeader = request.getHeader("x-trace-enable");
        if (!"true".equalsIgnoreCase(traceHeader)) {
            // Tracing disabled, proceed normally (no performance impact)
            return joinPoint.proceed();
        }

        // Generate or reuse trace ID
        String traceId = TRACE_ID.get();
        if (traceId == null) {
            traceId = UUID.randomUUID().toString();
            TRACE_ID.set(traceId);
        }

        // Capture method info
        String className = joinPoint.getSignature().getDeclaringTypeName();
        String methodName = joinPoint.getSignature().getName();
        Object[] args = joinPoint.getArgs();

        ApiTrace trace = new ApiTrace();
        trace.setTraceId(traceId);
        trace.setClassName(className);
        trace.setMethodName(methodName);
        trace.setRequestPath(request.getRequestURI());
        trace.setHttpMethod(request.getMethod());
        trace.setUserId(getCurrentUserId());
        trace.setSessionId(request.getSession().getId());

        try {
            // Serialize parameters (sanitize sensitive data)
            trace.setParameters(sanitizeAndSerialize(args));
        } catch (Exception e) {
            trace.setParameters("Serialization failed: " + e.getMessage());
        }

        long startTime = System.currentTimeMillis();
        Object result = null;
        Exception exception = null;

        try {
            // Execute the actual method
            result = joinPoint.proceed();
            
            // Capture result (limit size)
            try {
                String resultJson = objectMapper.writeValueAsString(result);
                if (resultJson.length() > 5000) {
                    resultJson = resultJson.substring(0, 5000) + "... (truncated)";
                }
                trace.setResult(resultJson);
            } catch (Exception e) {
                trace.setResult("Result serialization failed");
            }
            
            return result;
            
        } catch (Exception e) {
            exception = e;
            
            // Capture full stack trace
            StringWriter sw = new StringWriter();
            e.printStackTrace(new PrintWriter(sw));
            trace.setException(sw.toString());
            
            throw e;
            
        } finally {
            long endTime = System.currentTimeMillis();
            trace.setExecutionTimeMs(endTime - startTime);
            trace.setCreatedAt(LocalDateTime.now());
            
            // Save trace asynchronously (don't block main flow)
            CompletableFuture.runAsync(() -> {
                try {
                    traceRepository.save(trace);
                    log.debug("Trace saved: {} - {}.{} ({}ms)",
                        traceId, className, methodName, trace.getExecutionTimeMs());
                } catch (Exception e) {
                    log.error("Failed to save trace", e);
                }
            });
            
            // Clean up ThreadLocal if this is the last method in chain
            if (layer.equals("CONTROLLER")) {
                TRACE_ID.remove();
            }
        }
    }

    private String sanitizeAndSerialize(Object[] args) throws JsonProcessingException {
        // Remove sensitive data (password, token, etc.)
        List<Object> sanitized = new ArrayList<>();
        for (Object arg : args) {
            if (arg == null) {
                sanitized.add(null);
            } else if (arg instanceof String) {
                String str = (String) arg;
                // Mask potential sensitive data
                if (str.length() > 100) {
                    sanitized.add(str.substring(0, 100) + "...");
                } else {
                    sanitized.add(str);
                }
            } else {
                sanitized.add(arg.getClass().getSimpleName() + "@" + 
                    Integer.toHexString(arg.hashCode()));
            }
        }
        return objectMapper.writeValueAsString(sanitized);
    }

    private String getCurrentUserId() {
        try {
            Authentication auth = SecurityContextHolder.getContext().getAuthentication();
            return auth != null ? auth.getName() : "anonymous";
        } catch (Exception e) {
            return "unknown";
        }
    }
}
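The ThreadLocal handoff above is easy to get wrong, so here is a minimal, framework-free sketch of the correlation behavior the aspect relies on: nested layers reuse one trace ID on the same thread, and the outermost layer clears it when the request ends. Class and method names here are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Standalone sketch of the trace-ID correlation in ApiTraceAspect (no Spring).
public class TraceIdDemo {

    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();
    static final List<String> recordedIds = new ArrayList<>();

    // Mirrors the "generate or reuse" branch in traceMethod()
    static String currentTraceId() {
        String id = TRACE_ID.get();
        if (id == null) {
            id = UUID.randomUUID().toString();
            TRACE_ID.set(id);
        }
        return id;
    }

    // Outermost layer: creates the ID, then cleans up (the CONTROLLER branch)
    static void controller() {
        recordedIds.add(currentTraceId());
        service();
        TRACE_ID.remove();
    }

    // Inner layer: sees the same ID because it runs on the same thread
    static void service() {
        recordedIds.add(currentTraceId());
    }

    public static void main(String[] args) {
        controller();
        System.out.println(recordedIds.get(0).equals(recordedIds.get(1))); // true
        System.out.println(TRACE_ID.get() == null); // true: no leak into the next request
    }
}
```

The cleanup in the CONTROLLER branch matters because servlet containers pool threads: a pooled thread holding a stale ID would stitch unrelated requests into one trace.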
3. LLM Error Analysis Service

java LlmErrorAnalysisService.java
@Service
@Slf4j
public class LlmErrorAnalysisService {

    private static final String OLLAMA_API_URL = "http://localhost:11434/api/generate";
    private static final String MODEL = "mistral-small:24b";
    
    @Autowired
    private RestTemplate restTemplate;
    
    @Autowired
    private ObjectMapper objectMapper;
    
    @Autowired
    private ApiTraceRepository traceRepository;

    public ErrorAnalysisResult analyzeTrace(String traceId) {
        // Fetch all traces with this ID (full call chain)
        List<ApiTrace> traces = traceRepository.findByTraceIdOrderByCreatedAtAsc(traceId);
        
        if (traces.isEmpty()) {
            return ErrorAnalysisResult.notFound();
        }

        // Build context for LLM
        String traceContext = buildTraceContext(traces);
        
        // Prompt for LLM
        String prompt = String.format(
            "Analyze this API trace and identify the root cause of the error.\n\n" +
            "TRACE DATA:\n%s\n\n" +
            "Provide:\n" +
            "1. Root Cause (1-2 sentences)\n" +
            "2. Exact location (class + method)\n" +
            "3. Suggested fix (code snippet if possible)\n" +
            "4. Prevention tips\n\n" +
            "Format your response as JSON:\n" +
            "{\n" +
            "  \"rootCause\": \"...\",\n" +
            "  \"location\": \"ClassName.methodName\",\n" +
            "  \"suggestedFix\": \"...\",\n" +
            "  \"preventionTips\": [\"tip1\", \"tip2\"]\n" +
            "}",
            traceContext
        );

        try {
            // Call Ollama API
            Map<String, Object> request = new HashMap<>();
            request.put("model", MODEL);
            request.put("prompt", prompt);
            request.put("stream", false);
            request.put("format", "json");
            request.put("options", Map.of("temperature", 0.1)); // Low temp for factual analysis

            log.info("Sending trace {} to LLM for analysis", traceId);
            long startTime = System.currentTimeMillis();
            
            ResponseEntity<String> response = restTemplate.postForEntity(
                OLLAMA_API_URL,
                request,
                String.class
            );
            
            long duration = System.currentTimeMillis() - startTime;
            log.info("LLM analysis completed in {}ms", duration);

            // Parse LLM response
            JsonNode root = objectMapper.readTree(response.getBody());
            String llmOutput = root.get("response").asText();
            
            // Clean JSON (remove markdown fences if present)
            llmOutput = llmOutput.replaceAll("```json\\n?", "").replaceAll("```\\n?", "").trim();
            
            ErrorAnalysisResult result = objectMapper.readValue(llmOutput, ErrorAnalysisResult.class);
            result.setTraceId(traceId);
            result.setAnalysisTimeMs(duration);
            
            return result;
            
        } catch (Exception e) {
            log.error("LLM analysis failed for trace " + traceId, e);
            return ErrorAnalysisResult.error("Analysis failed: " + e.getMessage());
        }
    }

    private String buildTraceContext(List<ApiTrace> traces) {
        StringBuilder context = new StringBuilder();
        
        context.append("=== API CALL CHAIN ===\n\n");
        
        for (int i = 0; i < traces.size(); i++) {
            ApiTrace trace = traces.get(i);
            
            context.append(String.format("%d. %s.%s\n",
                i + 1,
                trace.getClassName(),
                trace.getMethodName()
            ));
            
            context.append(String.format("   Parameters: %s\n", trace.getParameters()));
            context.append(String.format("   Execution Time: %dms\n", trace.getExecutionTimeMs()));
            
            if (trace.getException() != null) {
                context.append(String.format("   ❌ EXCEPTION:\n%s\n", 
                    limitStackTrace(trace.getException(), 30))); // First 30 lines
            } else if (trace.getResult() != null) {
                context.append(String.format("   ✅ Result: %s\n", 
                    trace.getResult().substring(0, Math.min(200, trace.getResult().length()))));
            }
            
            context.append("\n");
        }
        
        return context.toString();
    }

    private String limitStackTrace(String stackTrace, int maxLines) {
        String[] lines = stackTrace.split("\n");
        if (lines.length <= maxLines) {
            return stackTrace;
        }
        return String.join("\n", Arrays.copyOf(lines, maxLines)) + 
            "\n... (" + (lines.length - maxLines) + " more lines)";
    }
}

@Data
class ErrorAnalysisResult {
    private String traceId;
    private String rootCause;
    private String location;
    private String suggestedFix;
    private List<String> preventionTips;
    private Long analysisTimeMs;
    private String error;
    
    public static ErrorAnalysisResult notFound() {
        ErrorAnalysisResult result = new ErrorAnalysisResult();
        result.setError("Trace not found");
        return result;
    }
    
    public static ErrorAnalysisResult error(String message) {
        ErrorAnalysisResult result = new ErrorAnalysisResult();
        result.setError(message);
        return result;
    }
}
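One fragile spot in analyzeTrace() is the markdown-fence cleanup: even with format "json", some models wrap their output in ``` fences. The same regex cleanup as a standalone helper, easy to unit-test in isolation:

```java
// Mirrors the fence-stripping step in LlmErrorAnalysisService.analyzeTrace()
public class LlmJsonCleaner {

    // Removes optional ```json ... ``` fences around the model's JSON output
    public static String stripFences(String llmOutput) {
        return llmOutput
            .replaceAll("```json\\n?", "")
            .replaceAll("```\\n?", "")
            .trim();
    }
}
```

If the cleaned string still fails to parse, treating it as an analysis failure (as the catch block above does) is safer than retrying blindly.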

4. REST Controller

java TraceController.java
@RestController
@RequestMapping("/api/trace")
@Slf4j
public class TraceController {

    @Autowired
    private ApiTraceRepository traceRepository;
    
    @Autowired
    private LlmErrorAnalysisService analysisService;

    @GetMapping("/{traceId}")
    public ResponseEntity<List<ApiTrace>> getTrace(@PathVariable String traceId) {
        List<ApiTrace> traces = traceRepository.findByTraceIdOrderByCreatedAtAsc(traceId);
        return ResponseEntity.ok(traces);
    }

    @GetMapping("/{traceId}/analyze")
    public ResponseEntity<ErrorAnalysisResult> analyzeTrace(@PathVariable String traceId) {
        ErrorAnalysisResult result = analysisService.analyzeTrace(traceId);
        
        if (result.getError() != null) {
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).body(result);
        }
        
        return ResponseEntity.ok(result);
    }

    @GetMapping("/recent")
    public ResponseEntity<List<ApiTrace>> getRecentErrors(
            @RequestParam(defaultValue = "20") int limit) {
        
        Pageable pageable = PageRequest.of(0, limit, Sort.by("createdAt").descending());
        List<ApiTrace> traces = traceRepository.findByExceptionIsNotNull(pageable);
        
        return ResponseEntity.ok(traces);
    }
}

Performance Optimization: On-Demand Tracing

AOP intercepts every method call, but tracing only activates when the header is present, so production performance is preserved:

  • Overhead (trace disabled): 0 ms
  • Overhead (trace enabled): ~5 ms
  • LLM analysis time: ~2 s

Usage Examples:

bash cURL Examples
# Normal request (no tracing)
curl -X POST http://localhost:8080/api/declaration/save \
  -H "Content-Type: application/json" \
  -d '{"data": "..."}'

# Request with tracing enabled
curl -X POST http://localhost:8080/api/declaration/save \
  -H "Content-Type: application/json" \
  -H "x-trace-enable: true" \
  -d '{"data": "..."}'

# Get trace details
curl http://localhost:8080/api/trace/{traceId}

# Analyze trace with LLM
curl http://localhost:8080/api/trace/{traceId}/analyze
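For completeness, the same traced call from a Java client, using the JDK 11+ java.net.http API. The URL mirrors the cURL examples above; the request is only constructed here, not sent, so the sketch stays self-contained.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Builds a traced POST request; sending it via
// HttpClient.newHttpClient().send(...) is left out on purpose.
public class TraceClientDemo {

    public static HttpRequest tracedPost(String url, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .header("x-trace-enable", "true") // switches on AOP tracing server-side
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest req = tracedPost(
            "http://localhost:8080/api/declaration/save", "{\"data\": \"...\"}");
        System.out.println(req.method());                                     // POST
        System.out.println(req.headers().firstValue("x-trace-enable").get()); // true
    }
}
```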

Real-World Example: NullPointerException Root Cause

Scenario:

A user gets an error while saving a declaration. Trace ID: a7f3c2d1-4b5e-9876-abcd-1234567890ef

Trace Chain:

plaintext Call Chain
1. DeclarationController.saveDeclaration
   Parameters: [DeclarationDTO@a1b2c3]
   Execution Time: 45ms
   ✅ Result: Success

2. DeclarationService.save
   Parameters: [DeclarationDTO@a1b2c3]
   Execution Time: 42ms
   ❌ EXCEPTION: NullPointerException

3. CompanyService.validateCompany
   Parameters: [null]
   Execution Time: 2ms
   ❌ EXCEPTION: NullPointerException at line 85

LLM Analysis Result:

json Error Analysis
{
  "rootCause": "CompanyService.validateCompany received null parameter because DeclarationDTO.companyId was null. No null check before service call.",
  "location": "DeclarationService.save (line ~120)",
  "suggestedFix": "Add null check:\n\nif (dto.getCompanyId() == null) {\n    throw new ValidationException(\"Company ID is required\");\n}\ncompanyService.validateCompany(dto.getCompanyId());",
  "preventionTips": [
    "Add @NotNull validation on DeclarationDTO.companyId field",
    "Use Optional for nullable IDs",
    "Add unit test for null company scenario",
    "Enable JSR-303 validation in controller"
  ],
  "analysisTimeMs": 1850
}

The LLM didn't just find the error; it also provided a fix suggestion and prevention tips. This lets even junior developers resolve the problem quickly.

UI Dashboard (React Frontend)

I built a simple React dashboard to visualize the traces and LLM analyses:

jsx TraceAnalyzer.jsx
import React, { useState } from 'react';
import axios from 'axios';

export default function TraceAnalyzer() {
    const [traceId, setTraceId] = useState('');
    const [traces, setTraces] = useState([]);
    const [analysis, setAnalysis] = useState(null);
    const [loading, setLoading] = useState(false);

    const fetchTrace = async () => {
        setLoading(true);
        try {
            const response = await axios.get(`/api/trace/${traceId}`);
            setTraces(response.data);
        } catch (error) {
            console.error('Failed to fetch trace', error);
        }
        setLoading(false);
    };

    const analyzeTrace = async () => {
        setLoading(true);
        try {
            const response = await axios.get(`/api/trace/${traceId}/analyze`);
            setAnalysis(response.data);
        } catch (error) {
            console.error('Failed to analyze trace', error);
        }
        setLoading(false);
    };

    return (
        <div>
            <h2>🔍 API Trace Analyzer</h2>

            <input
                value={traceId}
                onChange={(e) => setTraceId(e.target.value)}
                placeholder="Enter Trace ID..."
            />
            <button onClick={fetchTrace} disabled={loading}>Fetch</button>
            <button onClick={analyzeTrace} disabled={loading}>Analyze</button>

            {traces.length > 0 && (
                <div>
                    <h3>📞 Call Chain</h3>
                    {traces.map((trace, idx) => (
                        <div key={idx}>
                            <strong>{idx + 1}</strong> {trace.className}.{trace.methodName} ({trace.executionTimeMs}ms)
                            <div>Parameters: {trace.parameters}</div>
                            {trace.exception && (
                                <pre>❌ Exception: {trace.exception}</pre>
                            )}
                        </div>
                    ))}
                </div>
            )}

            {analysis && (
                <div>
                    <h3>🤖 AI Analysis</h3>
                    <h4>🔍 Root Cause</h4>
                    <p>{analysis.rootCause}</p>
                    <h4>📍 Location</h4>
                    <p>{analysis.location}</p>
                    <h4>💡 Suggested Fix</h4>
                    <pre>{analysis.suggestedFix}</pre>
                    <h4>🛡️ Prevention Tips</h4>
                    <ul>
                        {analysis.preventionTips.map((tip, idx) => (
                            <li key={idx}>• {tip}</li>
                        ))}
                    </ul>
                    <p>Analysis completed in {analysis.analysisTimeMs}ms</p>
                </div>
            )}
        </div>
    );
}

Production Deployment Considerations

⚠️ Things to Watch Out For:

  • Database size: Traces grow fast; define a retention policy (e.g. 7 days)
  • Sensitive data: Always sanitize passwords, tokens, and similar fields
  • Performance: Use async saves; don't block the main flow
  • GDPR compliance: Anonymize traces that contain user data
  • LLM cost: Use local Ollama; don't depend on a cloud LLM

✅ Best Practices:

  • Trace retention: 7 days (production), 30 days (staging)
  • Auto-cleanup job: Delete old traces with a daily cron
  • Index optimization: Indexes on trace_id, created_at, and exception
  • Circuit breaker: A fallback mechanism for when LLM analysis fails
  • Rate limiting: Throttle trace saves to prevent abuse
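The circuit-breaker bullet deserves a concrete shape. A production setup would use something like Resilience4j; the sketch below only shows the fallback idea in plain Java: if LLM analysis throws, return a degraded result instead of propagating the failure to the caller.

```java
import java.util.function.Supplier;

// Hedged sketch of the fallback behavior: the Resilience4j-style machinery
// (failure counting, open/half-open states) is deliberately omitted.
public class AnalysisFallback {

    public static <T> T withFallback(Supplier<T> primary, Supplier<T> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            // e.g. Ollama unreachable or response unparseable
            return fallback.get();
        }
    }

    public static void main(String[] args) {
        String result = withFallback(
            () -> { throw new RuntimeException("Ollama unreachable"); },
            () -> "Analysis unavailable, raw trace returned");
        System.out.println(result); // Analysis unavailable, raw trace returned
    }
}
```

In the service above, the primary supplier would wrap the Ollama call and the fallback would return ErrorAnalysisResult.error(...), so the /analyze endpoint degrades gracefully.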

Future Improvements

1. Distributed Tracing (OpenTelemetry)

I will correlate traces across microservices with OpenTelemetry, so we can also see cross-service call chains.

2. Automatic Fix Application

I will wire the LLM's suggested fix into the CI/CD pipeline via GitHub Actions so that it automatically opens a pull request.

3. Anomaly Detection

I will analyze trace patterns with ML for anomaly detection, e.g. a sudden spike in execution time triggers an alert.

Conclusion

The Spring Boot AOP + local LLM combination transformed our production debugging process:

  • Full call chain visibility: The entire flow from controller to repository is visible
  • On-demand activation: Production performance is unaffected
  • AI-powered root cause detection: Automatic error analysis
  • Actionable suggestions: Code fixes + prevention tips
  • $0 cost: Local Ollama, no cloud API

Debugging time dropped from hours to minutes. Even junior developers can now resolve complex bugs with the LLM's help.

Questions?

Feel free to share your questions and experiences with Spring AOP, LLM integration, or observability: email | LinkedIn

Remzi Şahbaz

Software Engineer @ ExperiLabs

I build enterprise solutions with Spring Boot, microservices, and AI/LLM technologies.