ERNIE & PaddlePaddle Hackathon 2025

Multi-Agent Document Intelligence

Powered by ERNIE LLM and PaddleOCR-VL. Multiple AI agents work together to understand, analyze, and answer questions about your documents.

Neural Agent Network

Watch data flow through the cognitive pipeline


The neural core that orchestrates all agents. Decomposes tasks, manages data flow, and implements CAMEL-AI RolePlaying for deep analysis.

Task Planning, RolePlaying, Aggregation
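
The orchestration described above (decompose a task, route data between agents, aggregate the results) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `Orchestrator` class, the agent names (`ocr`, `analysis`, `summary`, `qa`), and the keyword-based planner are all assumptions, and CAMEL-AI RolePlaying is not modelled here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical agent signature: reads the shared context, returns its output.
Agent = Callable[[Dict[str, str]], str]


@dataclass
class Orchestrator:
    agents: Dict[str, Agent]
    trace: List[str] = field(default_factory=list)  # order in which agents ran

    def plan(self, task: str) -> List[str]:
        """Toy task decomposition: pick agents from keywords in the task."""
        steps = ["ocr", "analysis"]          # assumed to run for every document
        if "summar" in task.lower():
            steps.append("summary")
        if "?" in task:
            steps.append("qa")
        return steps

    def run(self, task: str, document: str) -> Dict[str, str]:
        """Run the planned agents in order; each one sees earlier outputs."""
        context = {"task": task, "document": document}
        for name in self.plan(task):
            self.trace.append(name)
            context[name] = self.agents[name](context)
        return context                        # aggregated outputs of all agents


def stub(label: str) -> Agent:
    """Placeholder agent standing in for a real model call."""
    return lambda ctx: f"{label} done"


orchestrator = Orchestrator(
    agents={name: stub(name) for name in ("ocr", "analysis", "summary", "qa")}
)
result = orchestrator.run("Summarize this. What is the total?", "raw document")
```

Passing every agent the same context dictionary keeps the data flow explicit: a later agent can read whatever an earlier agent produced.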

Vision-language perception layer. Extracts text from documents using PaddleOCR-VL with layout understanding and structure preservation.

PaddleOCR-VL, Layout Analysis, Markdown
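
To illustrate what "structure preservation" can mean in practice, here is a small sketch that turns layout blocks into Markdown. The `Block` type and its `kind` labels are hypothetical stand-ins for whatever the OCR step returns; no PaddleOCR-VL API is called or implied.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    kind: str                                # assumed labels: "title", "paragraph", "table"
    text: str = ""
    rows: Optional[List[List[str]]] = None   # table cells; first row is the header


def blocks_to_markdown(blocks: List[Block]) -> str:
    """Render layout blocks in reading order, keeping titles and tables intact."""
    parts = []
    for block in blocks:
        if block.kind == "title":
            parts.append(f"# {block.text}")
        elif block.kind == "table" and block.rows:
            header, *body = block.rows
            lines = [
                "| " + " | ".join(header) + " |",
                "| " + " | ".join("---" for _ in header) + " |",
            ]
            lines += ["| " + " | ".join(row) + " |" for row in body]
            parts.append("\n".join(lines))
        else:
            parts.append(block.text)
    return "\n\n".join(parts)


markdown = blocks_to_markdown([
    Block("title", "Invoice"),
    Block("paragraph", "Due in 30 days."),
    Block("table", rows=[["Item", "Qty"], ["Pen", "2"]]),
])
```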

Deep understanding layer. Extracts entities, classifies documents, and identifies key information using ERNIE's comprehension capabilities.

ERNIE 4.5, NER, Classification
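
One way to make entity extraction and classification usable by other agents is to have the model reply in JSON and parse that into typed records. The sketch below assumes such a reply format; the field names (`doc_type`, `entities`, `text`, `label`) are illustrative, not something the project or ERNIE defines.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    text: str    # surface form found in the document
    label: str   # assumed entity type, e.g. "ORG" or "DATE"


@dataclass
class AnalysisResult:
    doc_type: str
    entities: List[Entity]


def parse_analysis(raw_reply: str) -> AnalysisResult:
    """Parse a JSON reply (assumed format) into a typed analysis result."""
    data = json.loads(raw_reply)
    entities = [Entity(e["text"], e["label"]) for e in data.get("entities", [])]
    return AnalysisResult(doc_type=data.get("doc_type", "unknown"), entities=entities)


# Example reply in the assumed format; a real model reply may need validation.
reply = '{"doc_type": "invoice", "entities": [{"text": "Acme Corp", "label": "ORG"}]}'
analysis = parse_analysis(reply)
```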

Synthesis layer. Distills complex documents into key points, generates structured outlines, and creates concise summaries.

ERNIE 4.5, Abstractive, Key Points

Interactive layer. Enables natural-language dialog with documents and supports multi-turn conversations with citations.

ERNIE 4.5, Multi-turn, Citations
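
A multi-turn dialog with citations can be sketched as a session that keeps the conversation history, retrieves the passages most relevant to each question, and appends numbered citation markers to the answer. Everything here is an assumption made for illustration: the `DocumentChat` class, the keyword-overlap retrieval, and the `answer_fn` callback that stands in for the language model call.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple

STOPWORDS = {"a", "the", "is", "in", "of", "what", "when"}


def tokenize(text: str) -> Set[str]:
    """Lowercase word set with a few common stopwords removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


@dataclass
class DocumentChat:
    chunks: List[str]
    # Stand-in for the model: (question, passages, history) -> answer text.
    answer_fn: Callable[[str, List[str], List[Tuple[str, str]]], str]
    history: List[Tuple[str, str]] = field(default_factory=list)

    def retrieve(self, question: str, k: int = 2) -> List[int]:
        """Return indices of up to k chunks sharing words with the question."""
        words = tokenize(question)
        scored = [(len(words & tokenize(c)), i) for i, c in enumerate(self.chunks)]
        ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
        return [i for score, i in ranked[:k] if score > 0]

    def ask(self, question: str) -> str:
        cited = self.retrieve(question)
        passages = [self.chunks[i] for i in cited]
        answer = self.answer_fn(question, passages, self.history)
        self.history.append((question, answer))   # kept for follow-up turns
        markers = "".join(f"[{i + 1}]" for i in cited)
        return f"{answer} {markers}".strip()


chat = DocumentChat(
    chunks=["The invoice total is 420 USD.", "Payment is due in 30 days."],
    answer_fn=lambda q, passages, hist: passages[0] if passages else "Not found.",
)
first = chat.ask("What is the invoice total?")
second = chat.ask("When is payment due?")
```

Citation markers are 1-based chunk numbers, so a reader can trace each answer back to the passage it came from.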

Feed a document into the neural network and watch it think

Input Signal


The neural network will show real-time agent activity here

ERNIE

PaddleOCR

PaddlePaddle's vision-language OCR model:

CAMEL-AI

Agent collaboration using CAMEL-AI patterns: