ERNIE & PaddlePaddle Hackathon 2025

Multi-Agent Document Intelligence

Powered by ERNIE LLM and PaddleOCR-VL. Multiple AI agents work together to understand, analyze, and answer questions about your documents.

Neural Agent Network

Watch data flow through the cognitive pipeline

[Pipeline diagram nodes: IN (Document), OCR (PaddleOCR-VL), CORE (Coordinator, CAMEL-AI), ANA (Analysis), SUM (Summary), Q&A (Dialog), OUT (Intelligence)]

Coordinator Agent

The neural core that orchestrates all agents. Decomposes tasks, manages data flow, and implements CAMEL-AI RolePlaying for deep analysis.

Task Planning, RolePlaying, Aggregation
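
The coordinator's plan, dispatch, and aggregate cycle can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Coordinator` class, the `Agent` signature, and both stand-in agents are hypothetical, and a real system would call ERNIE where the stubs return canned values.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical agent signature: takes the document text, returns a result string.
Agent = Callable[[str], str]


@dataclass
class Coordinator:
    """Toy coordinator: plans subtasks, dispatches them, aggregates results."""
    agents: Dict[str, Agent]
    log: List[str] = field(default_factory=list)

    def plan(self, request: str) -> List[str]:
        # Task planning: pick agents whose name appears in the request,
        # falling back to every registered agent.
        wanted = [name for name in self.agents if name in request.lower()]
        return wanted or list(self.agents)

    def run(self, request: str, document: str) -> Dict[str, str]:
        results = {}
        for name in self.plan(request):
            self.log.append(f"dispatch:{name}")
            results[name] = self.agents[name](document)
        return results  # aggregation: one entry per completed subtask


# Stand-in agents (assumptions for illustration only).
def analysis_agent(doc: str) -> str:
    return f"{len(doc.split())} words"


def summary_agent(doc: str) -> str:
    return doc.split(".")[0].strip() + "."


coordinator = Coordinator({"analysis": analysis_agent, "summary": summary_agent})
report = coordinator.run("summary please", "Invoices are due Friday. Pay on time.")
```

Keeping the agents behind a plain callable signature means the coordinator does not need to know which model backs each one.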

OCR Agent

Vision-language perception layer. Extracts text from documents using PaddleOCR-VL with layout understanding and structure preservation.

PaddleOCR-VL, Layout Analysis, Markdown
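
To illustrate what structure preservation into Markdown might look like, here is a small sketch that renders layout blocks in reading order. The block schema (`type`, `text`, `level`, `rows`) is an assumption made for this example and is not PaddleOCR-VL's actual output format.

```python
from typing import Dict, List


def blocks_to_markdown(blocks: List[Dict]) -> str:
    """Render hypothetical layout blocks as Markdown, keeping reading order."""
    parts = []
    for block in blocks:
        kind = block["type"]
        if kind == "title":
            # Heading depth comes from the (assumed) "level" field.
            parts.append("#" * block.get("level", 1) + " " + block["text"])
        elif kind == "table":
            header, *rows = block["rows"]
            lines = [
                "| " + " | ".join(header) + " |",
                "| " + " | ".join("---" for _ in header) + " |",
            ]
            lines += ["| " + " | ".join(row) + " |" for row in rows]
            parts.append("\n".join(lines))
        else:
            # Anything else is treated as a plain paragraph.
            parts.append(block["text"])
    return "\n\n".join(parts)


example_blocks = [
    {"type": "title", "level": 2, "text": "Invoice"},
    {"type": "paragraph", "text": "Thank you for your order."},
    {"type": "table", "rows": [["Item", "Qty"], ["Pen", "2"]]},
]
markdown = blocks_to_markdown(example_blocks)
```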

Analysis Agent

Deep understanding layer. Extracts entities, classifies documents, and identifies key information using ERNIE's comprehension capabilities.

ERNIE 4.5, NER, Classification

Summary Agent

Synthesis layer. Distills complex documents into key points, generates structured outlines, and creates concise summaries.

ERNIE 4.5, Abstractive, Key Points

QA Agent

Interactive layer. Enables natural-language dialog with documents and supports multi-turn conversations with citations.

ERNIE 4.5, Multi-turn, Citations
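
As a rough picture of multi-turn dialog with citations, the sketch below keeps a conversation history and tags each answer with the number of the document chunk it drew on. Word-overlap scoring stands in for ERNIE here, and the `DocumentChat` class and its method names are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class DocumentChat:
    """Toy multi-turn Q&A that cites the chunk each answer relied on."""

    def __init__(self, chunks: List[str]):
        self.chunks = chunks
        self.history: List[Tuple[str, str]] = []  # (question, answer) pairs

    def ask(self, question: str) -> str:
        # Fold the previous question into the query so follow-ups keep context.
        context = self.history[-1][0] if self.history else ""
        query = tokenize(question + " " + context)
        # Score chunks by word overlap (a stand-in for a real model call).
        scores = [len(query & tokenize(chunk)) for chunk in self.chunks]
        best = max(range(len(self.chunks)), key=scores.__getitem__)
        answer = f"{self.chunks[best]} [{best + 1}]"  # 1-based citation marker
        self.history.append((question, answer))
        return answer


chat = DocumentChat([
    "The contract starts on 1 March.",
    "Payment is due within 30 days.",
    "Either party may terminate with notice.",
])
first = chat.ask("When is payment due?")
follow_up = chat.ask("And how many days?")
```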

Live Neural Processing

Feed a document into the neural network and watch it think

Technical Deep Dive

Baidu ERNIE LLM

Powered by Baidu's ERNIE large language model:

  • Document analysis and entity extraction
  • Abstractive summarization
  • Context-aware question answering
  • Multi-turn dialog management

PaddleOCR-VL

PaddlePaddle's vision-language OCR model:

  • High-accuracy text extraction
  • Layout and structure analysis
  • Table recognition
  • Multi-language support

Multi-Agent Framework

Agent collaboration using CAMEL-AI patterns:

  • Task decomposition and planning
  • RolePlaying for deep analysis
  • Coordinator orchestration
  • Result aggregation
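
The role-playing pattern listed above can be pictured as two roles taking turns: one issues instructions, the other carries them out, until the instructing role signals completion. The loop below is a hand-rolled sketch of that idea and does not use the CAMEL-AI library's API; `role_play`, `make_user`, and the `"DONE"` stop token are assumptions for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical role signature: receives the last message, returns the next one.
Responder = Callable[[str], str]


def role_play(
    task: str,
    user: Responder,
    assistant: Responder,
    max_turns: int = 3,
) -> List[Tuple[str, str]]:
    """Alternate between an instructing role and an executing role."""
    transcript = []
    message = task
    for _ in range(max_turns):
        instruction = user(message)
        if instruction == "DONE":  # assumed stop token
            break
        message = assistant(instruction)
        transcript.append((instruction, message))
    return transcript


def make_user(steps: Iterable[str]) -> Responder:
    """Scripted instructing role that emits each step, then the stop token."""
    pending = iter(steps)
    return lambda _reply: next(pending, "DONE")


def assistant(instruction: str) -> str:
    # Stand-in for a model call that would carry out the instruction.
    return f"done: {instruction}"


transcript = role_play(
    "Analyze the contract",
    make_user(["list parties", "list dates"]),
    assistant,
)
```

The `max_turns` cap bounds the exchange even if the instructing role never emits the stop token.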