mirror of https://github.com/icereed/paperless-gpt.git synced 2025-03-13 21:28:02 -05:00

Dominik Schröter c6521ef312 feat: add configuration files for GitHub workflows and documentation

2025-02-03 09:16:43 +01:00

paperless-gpt Architecture

This document provides a comprehensive overview of the paperless-gpt architecture, explaining how different components interact to provide AI-powered document processing capabilities.

System Overview

paperless-gpt is designed as a companion service to paperless-ngx, adding AI capabilities for document processing. The system consists of several key components:

graph TB
    UI[Web UI] --> API[Backend API]
    API --> LLM[LLM Service]
    API --> OCR[OCR Service]
    API --> DB[Local DB]
    API --> PaperlessNGX[paperless-ngx API]
    LLM --> OpenAI[OpenAI]
    LLM --> Ollama[Ollama]
    OCR --> VisionLLM[Vision LLM]

Core Components

1. Backend API (Go)

Handles all business logic
Manages document processing workflow
Coordinates between services
Provides REST API endpoints
Manages state and caching

2. Web UI (React + TypeScript)

User interface for document management
Real-time processing status
Document preview and editing
Configuration interface
Responsive design

3. LLM Service

Manages LLM provider connections
Handles prompt engineering
Processes document content
Generates metadata suggestions
Supports multiple providers:
- OpenAI (gpt-4, gpt-3.5-turbo)
- Ollama (llama2, etc.)
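One way to picture the multi-provider support is a small Go interface with one implementation per backend. The sketch below is an illustration only: `Provider`, `NewProvider`, and `echoProvider` are hypothetical names, not taken from the paperless-gpt code base, and the stand-in provider makes no network calls.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Provider is a hypothetical abstraction over an LLM backend.
type Provider interface {
	Name() string
	Complete(ctx context.Context, prompt string) (string, error)
}

// echoProvider is a stand-in for a real client; it only echoes the prompt.
type echoProvider struct{ name string }

func (p echoProvider) Name() string { return p.name }

func (p echoProvider) Complete(_ context.Context, prompt string) (string, error) {
	return "[" + p.name + "] " + prompt, nil
}

// NewProvider selects a backend by name (case-insensitive).
// A real implementation would return an OpenAI or Ollama client here.
func NewProvider(name string) (Provider, error) {
	switch n := strings.ToLower(name); n {
	case "openai", "ollama":
		return echoProvider{name: n}, nil
	default:
		return nil, fmt.Errorf("unsupported LLM provider %q", name)
	}
}

func main() {
	p, err := NewProvider("ollama")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	out, err := p.Complete(context.Background(), "Suggest a title")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

With this shape, callers depend only on the interface, so adding a provider means adding one more case and one more type.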

4. OCR Service

Vision LLM integration
Image preprocessing
Text extraction
Layout analysis
Quality enhancement

5. Local Database

Caches processing results
Stores configuration
Manages queues
Tracks document state

Data Flow

Document Processing Flow

sequenceDiagram
    participant U as User
    participant UI as Web UI
    participant API as Backend API
    participant LLM as LLM Service
    participant OCR as OCR Service
    participant PNX as paperless-ngx

    U->>UI: Upload Document
    UI->>API: Process Request
    API->>OCR: Extract Text
    OCR-->>API: Text Content
    API->>LLM: Generate Metadata
    LLM-->>API: Suggestions
    API->>UI: Preview Results
    U->>UI: Approve Changes
    UI->>API: Apply Changes
    API->>PNX: Update Document
    PNX-->>API: Confirmation
    API-->>UI: Success
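The sequence above can be sketched in Go as two steps: a preview that runs OCR and the LLM without writing anything, and an apply step that runs only after the user approves. All names here (`Pipeline`, `Suggestion`, `demoPipeline`) are hypothetical and the collaborators are in-memory stubs, so this shows the control flow rather than the project's actual handlers.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Suggestion holds metadata proposed by the LLM step (fields are illustrative).
type Suggestion struct {
	Title string
	Tags  []string
}

// Pipeline wires together the collaborators from the sequence diagram.
type Pipeline struct {
	ExtractText func(docID int) (string, error)       // OCR Service
	Suggest     func(text string) (Suggestion, error) // LLM Service
	Update      func(docID int, s Suggestion) error   // paperless-ngx
}

// Preview runs OCR, then the LLM, and returns suggestions without persisting them.
func (p Pipeline) Preview(docID int) (Suggestion, error) {
	text, err := p.ExtractText(docID)
	if err != nil {
		return Suggestion{}, fmt.Errorf("extract text: %w", err)
	}
	if text == "" {
		return Suggestion{}, errors.New("extract text: empty result")
	}
	s, err := p.Suggest(text)
	if err != nil {
		return Suggestion{}, fmt.Errorf("generate metadata: %w", err)
	}
	return s, nil
}

// Apply writes suggestions the user has approved.
func (p Pipeline) Apply(docID int, s Suggestion) error {
	if err := p.Update(docID, s); err != nil {
		return fmt.Errorf("update document: %w", err)
	}
	return nil
}

// demoPipeline returns a pipeline backed by in-memory stubs.
func demoPipeline(store map[int]Suggestion) Pipeline {
	return Pipeline{
		ExtractText: func(docID int) (string, error) {
			return "Invoice 2025-001\nTotal: 10 EUR", nil
		},
		Suggest: func(text string) (Suggestion, error) {
			title := strings.SplitN(text, "\n", 2)[0]
			return Suggestion{Title: title, Tags: []string{"invoice"}}, nil
		},
		Update: func(docID int, s Suggestion) error {
			store[docID] = s
			return nil
		},
	}
}

func main() {
	store := map[int]Suggestion{}
	p := demoPipeline(store)
	s, err := p.Preview(42)
	if err != nil {
		fmt.Println("preview failed:", err)
		return
	}
	fmt.Printf("suggested: %+v\n", s)
	if err := p.Apply(42, s); err != nil { // runs only after user approval
		fmt.Println("apply failed:", err)
		return
	}
	fmt.Println("stored title:", store[42].Title)
}
```

Keeping preview and apply separate mirrors the diagram: nothing reaches paperless-ngx until the user approves the suggestions.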

Key Design Decisions

1. Modular Architecture

Separation of concerns
Pluggable components
Easy to extend
Maintainable code

2. Stateless Design

Scalable architecture
No shared state
Resilient operation
Easy deployment

3. Security First

API authentication
Data encryption
Input validation
Error handling

4. Performance Optimization

Local caching
Batch processing
Async operations
Resource management
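As an illustration of the local caching idea, the following Go sketch memoizes a per-document result so that repeated requests do not trigger the expensive step again. `ResultCache` and its methods are hypothetical names, and the cache lives in memory only, which is an assumption made for brevity rather than a description of how paperless-gpt stores results.

```go
package main

import (
	"fmt"
	"sync"
)

// ResultCache memoizes per-document results in memory.
type ResultCache struct {
	mu      sync.Mutex
	entries map[int]string
}

func NewResultCache() *ResultCache {
	return &ResultCache{entries: make(map[int]string)}
}

// GetOrCompute returns the cached value for docID and calls compute only on a miss.
// Errors are not cached, so a failed computation can be retried later.
// The lock is held while compute runs, which keeps the sketch simple but
// serializes all computations.
func (c *ResultCache) GetOrCompute(docID int, compute func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.entries[docID]; ok {
		return v, nil
	}
	v, err := compute()
	if err != nil {
		return "", err
	}
	c.entries[docID] = v
	return v, nil
}

func main() {
	cache := NewResultCache()
	calls := 0
	ocr := func() (string, error) {
		calls++
		return "extracted text", nil
	}
	for i := 0; i < 3; i++ {
		text, _ := cache.GetOrCompute(1, ocr)
		fmt.Println(text)
	}
	fmt.Println("compute calls:", calls)
}
```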

Directory Structure

paperless-gpt/
├── main.go                 # Application entry point
├── app_llm.go              # LLM service implementation
├── app_http_handlers.go    # HTTP handlers
├── paperless.go            # paperless-ngx integration
├── ocr.go                  # OCR service
├── types.go                # Type definitions
├── web-app/                # Frontend application
│   ├── src/
│   │   ├── components/     # React components
│   │   ├── App.tsx         # Main application
│   │   └── ...
│   └── ...
└── ...

Configuration Management

The system uses environment variables for configuration, allowing easy deployment and configuration changes:

PAPERLESS_BASE_URL        # paperless-ngx connection
LLM_PROVIDER             # AI backend selection
VISION_LLM_PROVIDER      # OCR provider selection
...
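A minimal Go sketch of reading these variables at startup might look like the following. The `Config` struct and `LoadConfig` function are hypothetical, and treating `PAPERLESS_BASE_URL` and `LLM_PROVIDER` as required while leaving `VISION_LLM_PROVIDER` optional is an assumption made for the example, not documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Config mirrors the three variables listed above (struct name is illustrative).
type Config struct {
	PaperlessBaseURL  string
	LLMProvider       string
	VisionLLMProvider string // assumed optional in this sketch
}

// LoadConfig reads configuration through getenv so it can be tested without
// touching the real environment. Pass os.Getenv in production code.
func LoadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		PaperlessBaseURL:  strings.TrimRight(getenv("PAPERLESS_BASE_URL"), "/"),
		LLMProvider:       getenv("LLM_PROVIDER"),
		VisionLLMProvider: getenv("VISION_LLM_PROVIDER"),
	}
	var missing []string
	if cfg.PaperlessBaseURL == "" {
		missing = append(missing, "PAPERLESS_BASE_URL")
	}
	if cfg.LLMProvider == "" {
		missing = append(missing, "LLM_PROVIDER")
	}
	if len(missing) > 0 {
		return Config{}, fmt.Errorf("missing required environment variables: %s",
			strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Failing fast on missing variables surfaces deployment mistakes at startup instead of on the first request.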

Error Handling

The system implements comprehensive error handling:

User Errors
- Input validation
- Clear error messages
- Guided resolution
System Errors
- Graceful degradation
- Automatic retry
- Error logging
- Monitoring alerts
External Service Errors
- Fallback options
- Circuit breaking
- Rate limiting
- Error reporting
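The automatic retry item could be implemented in several ways. The Go sketch below shows one common approach, retrying with exponential backoff. `Retry` and `errTransient` are hypothetical names, and the attempt count and delays are arbitrary example values.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient simulates a temporary failure of an external service.
var errTransient = errors.New("transient failure")

// Retry calls fn up to attempts times, doubling the delay after each failure.
// It returns nil on the first success, or the last error wrapped with context.
func Retry(attempts int, delay time.Duration, fn func() error) error {
	if attempts < 1 {
		attempts = 1
	}
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil {
			return nil
		}
		if i < attempts-1 { // no need to wait after the final attempt
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := Retry(4, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

A production version would typically also cap the delay, add jitter, and retry only errors known to be temporary.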

Scaling Considerations

The architecture supports scaling through:

Horizontal Scaling
- Stateless design
- Load balancing
- Distributed processing
Resource Management
- Connection pooling
- Cache management
- Queue processing
- Rate limiting
Performance Optimization
- Batch processing
- Async operations
- Efficient algorithms
- Resource caching
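Queue processing with bounded concurrency can be sketched in Go as a small worker pool. `ProcessAll` is a hypothetical helper and the documents are plain integer IDs, so this only illustrates how a fixed number of workers can drain a queue.

```go
package main

import (
	"fmt"
	"sync"
)

// ProcessAll runs process on every document ID using at most `workers`
// goroutines and returns the results keyed by ID.
func ProcessAll(ids []int, workers int, process func(id int) string) map[int]string {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan int)
	results := make(map[int]string, len(ids))
	var mu sync.Mutex
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				out := process(id)
				mu.Lock() // the results map is shared between workers
				results[id] = out
				mu.Unlock()
			}
		}()
	}
	for _, id := range ids {
		jobs <- id
	}
	close(jobs) // lets the workers exit their range loops
	wg.Wait()
	return results
}

func main() {
	results := ProcessAll([]int{1, 2, 3, 4, 5}, 2, func(id int) string {
		return fmt.Sprintf("processed document %d", id)
	})
	fmt.Println(len(results), "documents processed")
}
```

Limiting the worker count keeps pressure on the LLM and OCR backends predictable, which also helps with rate limits.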

Future Considerations

The architecture is designed to support future enhancements:

Plugin System
- Custom processors
- Integration points
- Event hooks
Advanced Features
- Multi-language support
- Custom ML models
- Advanced analytics
Integration Options
- API extensions
- Service hooks
- Custom providers

Development Guidelines

When making changes to the architecture:

Documentation
- Update this document
- Add inline comments
- Update API docs
Testing
- Unit tests
- Integration tests
- Performance tests
Review Process
- Architecture review
- Security review
- Performance review

This architecture documentation is maintained by the core team and updated as the system evolves.
