mirror of
https://github.com/icereed/paperless-gpt.git
synced 2025-03-12 12:58:02 -05:00
Technical Context
Technology Stack
Backend (Go)
- Runtime: Go
- Key Libraries:
  - langchaingo: LLM integration
  - logrus: Structured logging
  - net/http: API server
Frontend (React/TypeScript)
- Framework: React with TypeScript
- Build Tool: Vite
- Testing: Playwright
- Styling: Tailwind CSS
- Package Manager: npm
Infrastructure
- Containerization: Docker
- Deployment: Docker Compose
- CI/CD: GitHub Actions
Development Setup
Prerequisites
- Docker and Docker Compose
- Go development environment
- Node.js and npm
- Access to an LLM provider (OpenAI or Ollama)
Local Development Steps
- Clone repository
- Configure environment variables
- Start paperless-ngx instance
- Build and run paperless-gpt
- Access web interface
Testing Steps (Required Before Commits)
- Unit Tests:
    go test .
- E2E Tests:
    docker build . -t icereed/paperless-gpt:e2e
    cd web-app && npm run test:e2e
These tests MUST be run, and MUST pass, before any task is considered complete.
Configuration
Environment Variables
Required Variables
PAPERLESS_BASE_URL=http://paperless-ngx:8000
PAPERLESS_API_TOKEN=your_paperless_api_token
LLM_PROVIDER=openai|ollama
LLM_MODEL=model_name
Optional Variables
PAPERLESS_PUBLIC_URL=public_url
MANUAL_TAG=paperless-gpt
AUTO_TAG=paperless-gpt-auto
OPENAI_API_KEY=key (if using OpenAI)
OPENAI_BASE_URL=custom_url
LLM_LANGUAGE=English
OLLAMA_HOST=host_url
VISION_LLM_PROVIDER=provider
VISION_LLM_MODEL=model
AUTO_OCR_TAG=tag
OCR_LIMIT_PAGES=5
LOG_LEVEL=info
Docker Configuration
- Network configuration for service communication
- Volume mounts for prompts and persistence
- Resource limits and scaling options
- Port mappings for web interface
LLM Provider Setup
OpenAI Configuration
- API key management
- Model selection
- Base URL configuration (for custom endpoints)
- Vision API access for OCR
Ollama Configuration
- Server setup and hosting
- Model installation and management
- Network access configuration
- Resource allocation
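To illustrate how the provider settings interact, the sketch below resolves the endpoint a request would be sent to. It is an illustration under stated assumptions, not the project's code: `resolveEndpoint` is a hypothetical helper, the fallback URLs are the commonly documented defaults for the public OpenAI API and a local Ollama server, and requiring `OPENAI_API_KEY` for the `openai` provider follows the "(if using OpenAI)" note in the variable list.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Fallbacks assumed for illustration; override them with OPENAI_BASE_URL / OLLAMA_HOST.
const (
	defaultOpenAIBaseURL = "https://api.openai.com/v1"
	defaultOllamaHost    = "http://localhost:11434"
)

// resolveEndpoint returns the base URL for the given provider.
// getenv is injected so the function can be exercised without touching the real environment.
func resolveEndpoint(provider string, getenv func(string) string) (string, error) {
	switch provider {
	case "openai":
		if getenv("OPENAI_API_KEY") == "" {
			return "", errors.New("OPENAI_API_KEY is required when LLM_PROVIDER=openai")
		}
		if u := getenv("OPENAI_BASE_URL"); u != "" {
			return u, nil // custom endpoint
		}
		return defaultOpenAIBaseURL, nil
	case "ollama":
		if h := getenv("OLLAMA_HOST"); h != "" {
			return h, nil
		}
		return defaultOllamaHost, nil
	default:
		return "", fmt.Errorf("unsupported LLM_PROVIDER %q", provider)
	}
}

func main() {
	endpoint, err := resolveEndpoint(os.Getenv("LLM_PROVIDER"), os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "provider error:", err)
		os.Exit(1)
	}
	fmt.Println("LLM endpoint:", endpoint)
}
```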
Custom Prompts
Template Files
- title_prompt.tmpl
- tag_prompt.tmpl
- ocr_prompt.tmpl
- correspondent_prompt.tmpl
Template Variables
- Language
- Content
- AvailableTags
- OriginalTags
- Title
- AvailableCorrespondents
- BlackList
Technical Constraints
Performance Considerations
- Token limits for LLM requests
- OCR page limits
- Concurrent processing limits
- Network bandwidth requirements
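Two of the limits above, OCR page limits and concurrent processing limits, can be sketched with the standard library alone. This is one possible approach, not a description of the project's implementation: `limitPages` and `processAll` are hypothetical helpers, and treating a limit of 0 as "no limit" is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// limitPages returns at most limit pages.
// A limit of 0 or less is treated as "no limit" (assumption).
func limitPages(pages []string, limit int) []string {
	if limit <= 0 || len(pages) <= limit {
		return pages
	}
	return pages[:limit]
}

// processAll applies fn to every item with at most maxParallel calls in flight.
// Results are returned in the same order as the input.
func processAll(items []string, maxParallel int, fn func(string) string) []string {
	if maxParallel < 1 {
		maxParallel = 1 // guard: a zero-capacity semaphore would block forever
	}
	results := make([]string, len(items))
	sem := make(chan struct{}, maxParallel) // buffered channel used as a counting semaphore
	var wg sync.WaitGroup

	for i, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while maxParallel workers are busy
		go func(i int, item string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			results[i] = fn(item)    // each goroutine writes its own index
		}(i, item)
	}
	wg.Wait()
	return results
}

func main() {
	pages := []string{"page-1", "page-2", "page-3", "page-4", "page-5", "page-6", "page-7"}
	pages = limitPages(pages, 5) // mirrors OCR_LIMIT_PAGES=5

	// The closure stands in for an OCR or LLM request.
	out := processAll(pages, 2, func(p string) string { return "processed " + p })
	for _, line := range out {
		fmt.Println(line)
	}
}
```

Capping the pages before dispatch keeps token usage bounded, and the semaphore keeps the number of simultaneous provider requests predictable.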
Security Requirements
- API token security
- Environment variable management
- Network isolation
- Data privacy considerations
Integration Requirements
- paperless-ngx compatibility
- LLM provider API compatibility
- Docker environment compatibility
- Web browser compatibility