mirror of https://github.com/icereed/paperless-gpt.git synced 2025-03-12 12:58:02 -05:00

Icereed 897f0fbbd7

feat: restructure Dockerfile to build Vite frontend and embed assets in Go application (#178)

2025-02-03 10:51:42 +01:00

Technical Context

Technology Stack

Backend (Go)

Runtime: Go
Key Libraries:
- langchaingo: LLM integration
- logrus: Structured logging
- net/http: API server

Frontend (React/TypeScript)

Framework: React with TypeScript
Build Tool: Vite
Testing: Playwright
Styling: Tailwind CSS
Package Manager: npm

Infrastructure

Containerization: Docker
Deployment: Docker Compose
CI/CD: GitHub Actions

Development Setup

Prerequisites

Docker and Docker Compose
Go development environment
Node.js and npm
Access to LLM provider (OpenAI or Ollama)

Local Development Steps

Clone repository
Configure environment variables
Start paperless-ngx instance
Build and run paperless-gpt
Access web interface

Testing Steps (Required Before Commits)

Unit Tests:
```
go test .
```

E2E Tests:
```
docker build . -t icereed/paperless-gpt:e2e
cd web-app && npm run test:e2e
```

These tests MUST be run and pass before considering any task complete.

Configuration

Environment Variables

Required Variables

PAPERLESS_BASE_URL=http://paperless-ngx:8000
PAPERLESS_API_TOKEN=your_paperless_api_token
LLM_PROVIDER=openai|ollama
LLM_MODEL=model_name
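
A startup check for the four required variables could look roughly like the sketch below. It is not taken from the paperless-gpt source: the `missingVars` helper and the injected `getenv` function are hypothetical names chosen for illustration, and only the variable names come from the list above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredVars mirrors the "Required Variables" list in this document.
var requiredVars = []string{
	"PAPERLESS_BASE_URL",
	"PAPERLESS_API_TOKEN",
	"LLM_PROVIDER",
	"LLM_MODEL",
}

// missingVars (hypothetical helper) returns the required variables that are
// unset or blank. getenv is injected so the check can run against a fake
// environment as well as os.Getenv.
func missingVars(getenv func(string) string) []string {
	var missing []string
	for _, name := range requiredVars {
		if strings.TrimSpace(getenv(name)) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	if missing := missingVars(os.Getenv); len(missing) > 0 {
		fmt.Println("missing required variables:", strings.Join(missing, ", "))
		return
	}
	fmt.Println("all required variables are set")
}
```

Reporting every missing name at once, instead of stopping at the first, saves a round trip when a new deployment lacks several values.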

Optional Variables

PAPERLESS_PUBLIC_URL=public_url
MANUAL_TAG=paperless-gpt
AUTO_TAG=paperless-gpt-auto
OPENAI_API_KEY=key (if using OpenAI)
OPENAI_BASE_URL=custom_url
LLM_LANGUAGE=English
OLLAMA_HOST=host_url
VISION_LLM_PROVIDER=provider
VISION_LLM_MODEL=model
AUTO_OCR_TAG=tag
OCR_LIMIT_PAGES=5
LOG_LEVEL=info
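
Optional variables are usually read with a fallback. The sketch below assumes that the example values shown above (`paperless-gpt`, `info`, `5`) act as fallbacks, which this document does not confirm; `envOr` and `ocrLimitPages` are hypothetical helper names, not functions from the project.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envOr (hypothetical helper) returns the value of name, or fallback when
// the variable is unset or empty.
func envOr(getenv func(string) string, name, fallback string) string {
	if v := getenv(name); v != "" {
		return v
	}
	return fallback
}

// ocrLimitPages (hypothetical helper) parses OCR_LIMIT_PAGES as an integer.
// An unset, non-numeric, or negative value yields the fallback.
func ocrLimitPages(getenv func(string) string, fallback int) int {
	n, err := strconv.Atoi(getenv("OCR_LIMIT_PAGES"))
	if err != nil || n < 0 {
		return fallback
	}
	return n
}

func main() {
	// The fallback values here are assumptions taken from the examples above.
	fmt.Println("manual tag:", envOr(os.Getenv, "MANUAL_TAG", "paperless-gpt"))
	fmt.Println("log level:", envOr(os.Getenv, "LOG_LEVEL", "info"))
	fmt.Println("OCR page limit:", ocrLimitPages(os.Getenv, 5))
}
```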

Docker Configuration

Network configuration for service communication
Volume mounts for prompts and persistence
Resource limits and scaling options
Port mappings for web interface

LLM Provider Setup

OpenAI Configuration

API key management
Model selection
Base URL configuration (for custom endpoints)
Vision API access for OCR

Ollama Configuration

Server setup and hosting
Model installation and management
Network access configuration
Resource allocation
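
The provider-specific settings can be validated together at startup. The following is a minimal sketch, assuming that `LLM_PROVIDER` accepts exactly `openai` or `ollama` and that `OPENAI_API_KEY` is only needed for OpenAI, as the variable lists suggest. `validateProvider` is a hypothetical name, and the real application may check more than this.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// validateProvider (hypothetical helper) checks that the selected provider
// has the settings this document associates with it.
func validateProvider(getenv func(string) string) error {
	switch p := getenv("LLM_PROVIDER"); p {
	case "openai":
		if getenv("OPENAI_API_KEY") == "" {
			return errors.New("LLM_PROVIDER=openai requires OPENAI_API_KEY")
		}
		// OPENAI_BASE_URL is listed as optional, so it is not enforced.
		return nil
	case "ollama":
		// OLLAMA_HOST is listed as optional, so it is not enforced.
		return nil
	default:
		return fmt.Errorf("unsupported LLM_PROVIDER %q (expected openai or ollama)", p)
	}
}

func main() {
	if err := validateProvider(os.Getenv); err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("provider configuration looks valid")
}
```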

Custom Prompts

Template Files

title_prompt.tmpl
tag_prompt.tmpl
ocr_prompt.tmpl
correspondent_prompt.tmpl

Template Variables

Language
Content
AvailableTags
OriginalTags
Title
AvailableCorrespondents
BlackList
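
To illustrate how the template variables are consumed, here is a minimal sketch that assumes the `.tmpl` files are Go `text/template` templates. The template text is invented for this example and is not the shipped `title_prompt.tmpl`; `renderPrompt` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// titlePrompt is a made-up template for illustration only. It uses three of
// the variables listed above: Language, Title, and Content.
const titlePrompt = `Suggest a title in {{.Language}} for the document below.
Current title: {{.Title}}
Content:
{{.Content}}
`

// renderPrompt (hypothetical helper) parses and executes a template.
// missingkey=error makes a missing variable fail instead of printing
// "<no value>" into the prompt.
func renderPrompt(tmplText string, data map[string]any) (string, error) {
	tmpl, err := template.New("prompt").Option("missingkey=error").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderPrompt(titlePrompt, map[string]any{
		"Language": "English",
		"Title":    "scan_0001.pdf",
		"Content":  "Invoice no. 42 from Example Corp.",
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out)
}
```

Failing on a missing key is a design choice for this sketch: a prompt with a silently empty placeholder tends to produce poor LLM output that is hard to trace back to the template.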

Technical Constraints

Performance Considerations

Token limits for LLM requests
OCR page limits
Concurrent processing limits
Network bandwidth requirements
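
Two of these limits, OCR page limits and concurrent processing limits, can be sketched with plain Go primitives. This is an illustration under assumptions, not the project's implementation: `limitPages` and `processAll` are hypothetical helpers, and treating a limit of `0` as "no limit" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// limitPages (hypothetical helper) keeps at most limit pages.
// A limit of 0 or less is treated as "no limit" in this sketch.
func limitPages(pages []string, limit int) []string {
	if limit <= 0 || len(pages) <= limit {
		return pages
	}
	return pages[:limit]
}

// processAll (hypothetical helper) applies work to every item while running
// at most maxParallel calls at a time. Results keep the input order.
func processAll(items []string, maxParallel int, work func(string) string) []string {
	if maxParallel < 1 {
		maxParallel = 1 // a zero-capacity semaphore would block forever
	}
	results := make([]string, len(items))
	sem := make(chan struct{}, maxParallel) // buffered channel as a semaphore
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxParallel workers are busy
		go func(i int, item string) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i] = work(item)
		}(i, item)
	}
	wg.Wait()
	return results
}

func main() {
	pages := []string{"p1", "p2", "p3", "p4", "p5", "p6"}
	selected := limitPages(pages, 5)
	out := processAll(selected, 2, func(p string) string { return "ocr:" + p })
	fmt.Println(out)
}
```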

Security Requirements

API token security
Environment variable management
Network isolation
Data privacy considerations

Integration Requirements

paperless-ngx compatibility
LLM provider API compatibility
Docker environment compatibility
Web browser compatibility
