# Technical Context

## Technology Stack

### Backend (Go)

- **Runtime**: Go
- **Key Libraries** (see the sketch after this list):
  - langchaingo: LLM integration
  - logrus: Structured logging
  - net/http: API server
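
As an illustration of how these libraries fit together, here is a minimal, hypothetical sketch (not the project's actual wiring) that logs through logrus and sends a single prompt via langchaingo's OpenAI client; the model name and prompt text are placeholders.

```go
package main

import (
	"context"

	"github.com/sirupsen/logrus"
	"github.com/tmc/langchaingo/llms"
	"github.com/tmc/langchaingo/llms/openai"
)

func main() {
	log := logrus.New()

	// Reads OPENAI_API_KEY from the environment; the model name is a placeholder.
	llm, err := openai.New(openai.WithModel("gpt-4o"))
	if err != nil {
		log.WithError(err).Fatal("failed to create LLM client")
	}

	// Single-prompt completion helper provided by langchaingo.
	title, err := llms.GenerateFromSinglePrompt(context.Background(), llm,
		"Suggest a concise title for this document: ...")
	if err != nil {
		log.WithError(err).Fatal("LLM call failed")
	}

	// Structured log entry via logrus fields.
	log.WithFields(logrus.Fields{"title": title}).Info("generated title")
}
```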

### Frontend (React/TypeScript)

- **Framework**: React with TypeScript
- **Build Tool**: Vite (typical commands are sketched below)
- **Testing**: Playwright
- **Styling**: Tailwind CSS
- **Package Manager**: npm
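
The day-to-day npm workflow is standard; the commands below are a sketch based on Vite and Playwright conventions plus the web-app directory referenced by the E2E tests later in this document. The dev script name is an assumption, not a documented project script.

```bash
cd web-app
npm install        # install dependencies
npm run dev        # start the Vite dev server (conventional script name, assumed)
npm run test:e2e   # run the Playwright E2E suite (same script as the testing steps below)
```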

### Infrastructure

- **Containerization**: Docker
- **Deployment**: Docker Compose (see the compose sketch under Docker Configuration)
- **CI/CD**: GitHub Actions (a minimal workflow is sketched below)
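
A minimal, hypothetical workflow tying CI to the testing steps described later in this document; the file name, action versions, and job layout are assumptions, not the project's actual pipeline.

```yaml
# .github/workflows/ci.yml (hypothetical sketch)
name: ci
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: stable
      - name: Unit tests
        run: go test .
      - name: Build E2E image
        run: docker build . -t icereed/paperless-gpt:e2e
```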

## Development Setup

### Prerequisites

1. Docker and Docker Compose
2. Go development environment
3. Node.js and npm
4. Access to an LLM provider (OpenAI or Ollama)

### Local Development Steps

1. Clone the repository
2. Configure environment variables
3. Start a paperless-ngx instance
4. Build and run paperless-gpt
5. Access the web interface (a shell walkthrough of these steps is sketched below)
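
A hypothetical end-to-end walkthrough of the five steps; the compose service names, the choice of environment variables, and the port are assumptions rather than documented values.

```bash
# Step 1: clone the repository
git clone https://github.com/icereed/paperless-gpt.git
cd paperless-gpt

# Step 2: set the required variables from the Configuration section below
export PAPERLESS_BASE_URL=http://paperless-ngx:8000
export PAPERLESS_API_TOKEN=your_paperless_api_token
export LLM_PROVIDER=openai
export LLM_MODEL=gpt-4o   # model name is a placeholder

# Steps 3 and 4: assumes compose services named paperless-ngx and paperless-gpt
docker compose up -d paperless-ngx
docker compose up -d --build paperless-gpt

# Step 5: open the web interface (the port is an assumption)
# http://localhost:8080
```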

### Testing Steps (Required Before Commits)

1. **Unit Tests**:

   ```bash
   go test .
   ```

2. **E2E Tests**:

   ```bash
   docker build . -t icereed/paperless-gpt:e2e
   cd web-app && npm run test:e2e
   ```

These tests MUST be run and pass before considering any task complete.

## Configuration

### Environment Variables

#### Required Variables

```
PAPERLESS_BASE_URL=http://paperless-ngx:8000
PAPERLESS_API_TOKEN=your_paperless_api_token
LLM_PROVIDER=openai|ollama
LLM_MODEL=model_name
```

#### Optional Variables

```
PAPERLESS_PUBLIC_URL=public_url
MANUAL_TAG=paperless-gpt
AUTO_TAG=paperless-gpt-auto
OPENAI_API_KEY=key (if using OpenAI)
OPENAI_BASE_URL=custom_url
LLM_LANGUAGE=English
OLLAMA_HOST=host_url
VISION_LLM_PROVIDER=provider
VISION_LLM_MODEL=model
AUTO_OCR_TAG=tag
OCR_LIMIT_PAGES=5
LOG_LEVEL=info
```

### Docker Configuration

- Network configuration for service communication
- Volume mounts for prompts and persistence
- Resource limits and scaling options
- Port mappings for the web interface (all four concerns are illustrated in the sketch below)
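
A hypothetical docker-compose excerpt illustrating the four concerns above; the image tag, volume path, port, and resource limits are assumptions, not the project's shipped configuration.

```yaml
# docker-compose.yml excerpt (hypothetical sketch)
services:
  paperless-gpt:
    image: icereed/paperless-gpt:latest
    environment:
      PAPERLESS_BASE_URL: http://paperless-ngx:8000   # service-to-service networking
      PAPERLESS_API_TOKEN: ${PAPERLESS_API_TOKEN}
      LLM_PROVIDER: openai
      LLM_MODEL: gpt-4o                               # placeholder model name
    volumes:
      - ./prompts:/app/prompts   # assumed mount point for the prompt templates
    ports:
      - "8080:8080"              # the port is an assumption
    deploy:
      resources:
        limits:
          memory: 512M           # illustrative resource limit
```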

### LLM Provider Setup

#### OpenAI Configuration

- API key management
- Model selection
- Base URL configuration (for custom endpoints)
- Vision API access for OCR (an example variable set follows this list)
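
For example, a complete OpenAI setup might combine the variables from the Environment Variables section like this; the model names are placeholders, not recommendations.

```
LLM_PROVIDER=openai
LLM_MODEL=gpt-4o
OPENAI_API_KEY=sk-your-key-here
# OPENAI_BASE_URL is only needed for custom endpoints
VISION_LLM_PROVIDER=openai
VISION_LLM_MODEL=gpt-4o
```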

#### Ollama Configuration

- Server setup and hosting (example commands below)
- Model installation and management
- Network access configuration
- Resource allocation
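
A sketch of a typical Ollama setup using standard Ollama CLI commands; the model name is a placeholder, and the host address shown is Ollama's default.

```bash
# Start the Ollama server (listens on port 11434 by default)
ollama serve &

# Pull a model; the name is a placeholder
ollama pull llama3

# Point paperless-gpt at the server
export LLM_PROVIDER=ollama
export LLM_MODEL=llama3
export OLLAMA_HOST=http://localhost:11434
```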

### Custom Prompts

#### Template Files

- title_prompt.tmpl
- tag_prompt.tmpl
- ocr_prompt.tmpl
- correspondent_prompt.tmpl

#### Template Variables

- Language
- Content (see the title-prompt sketch after this list)
- AvailableTags
- OriginalTags
- Title
- AvailableCorrespondents
- BlackList
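
As an illustration, here is a hypothetical title_prompt.tmpl written in Go text/template syntax using a few of the variables above; the prompts actually shipped with the project may differ.

```
I will provide the content of a document. Respond only with a concise
document title in {{.Language}}.

Current title: {{.Title}}

Content:
{{.Content}}
```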

## Technical Constraints

### Performance Considerations

- Token limits for LLM requests
- OCR page limits
- Concurrent processing limits (a common Go pattern for this is sketched below)
- Network bandwidth requirements
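
Concurrency limits like these are commonly enforced in Go with a buffered-channel semaphore; the sketch below shows that general pattern, not the project's actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// processDocument stands in for an expensive LLM or OCR call.
func processDocument(id int) {
	fmt.Println("processing document", id)
}

func main() {
	const maxConcurrent = 3 // hypothetical concurrency limit
	sem := make(chan struct{}, maxConcurrent)

	var wg sync.WaitGroup
	for id := 1; id <= 10; id++ {
		wg.Add(1)
		sem <- struct{}{} // blocks once maxConcurrent workers are active
		go func(id int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			processDocument(id)
		}(id)
	}
	wg.Wait()
}
```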

### Security Requirements

- API token security
- Environment variable management
- Network isolation
- Data privacy considerations

### Integration Requirements

- paperless-ngx compatibility
- LLM provider API compatibility
- Docker environment compatibility
- Web browser compatibility