paperless-gpt/cline_docs/techContext.md

# Technical Context
## Technology Stack
### Backend (Go)
- **Runtime**: Go
- **Key Libraries**:
  - langchaingo: LLM integration
  - logrus: Structured logging
  - net/http: API server
### Frontend (React/TypeScript)
- **Framework**: React with TypeScript
- **Build Tool**: Vite
- **Testing**: Playwright
- **Styling**: Tailwind CSS
- **Package Manager**: npm
### Infrastructure
- **Containerization**: Docker
- **Deployment**: Docker Compose
- **CI/CD**: GitHub Actions
## Development Setup
### Prerequisites
1. Docker and Docker Compose
2. Go development environment
3. Node.js and npm
4. Access to an LLM provider (OpenAI or Ollama)
### Local Development Steps
1. Clone the repository
2. Configure the environment variables
3. Start a paperless-ngx instance
4. Build and run paperless-gpt
5. Access the web interface
### Testing Steps (Required Before Commits)
1. **Unit Tests**:
```bash
go test .
```
2. **E2E Tests**:
```bash
docker build . -t icereed/paperless-gpt:e2e
cd web-app && npm run test:e2e
```
These tests MUST be run, and MUST pass, before any task is considered complete.
## Configuration
### Environment Variables
#### Required Variables
```
PAPERLESS_BASE_URL=http://paperless-ngx:8000
PAPERLESS_API_TOKEN=your_paperless_api_token
LLM_PROVIDER=openai|ollama
LLM_MODEL=model_name
```
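A startup check for the required variables above might look like the following. This is a minimal sketch, not the project's actual configuration code: the helper names `missingVars` and `validProvider` are hypothetical, and the provider check assumes only the two values listed in this document are accepted.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredVars lists the variables this document marks as required.
var requiredVars = []string{
	"PAPERLESS_BASE_URL",
	"PAPERLESS_API_TOKEN",
	"LLM_PROVIDER",
	"LLM_MODEL",
}

// missingVars returns the required variables that lookup reports as unset
// or blank. lookup is injected so the check can run without touching the
// real process environment (hypothetical helper).
func missingVars(lookup func(string) string) []string {
	var missing []string
	for _, name := range requiredVars {
		if strings.TrimSpace(lookup(name)) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

// validProvider reports whether p is one of the two providers named in
// this document. Assumption: no other values are accepted.
func validProvider(p string) bool {
	return p == "openai" || p == "ollama"
}

func main() {
	if missing := missingVars(os.Getenv); len(missing) > 0 {
		fmt.Fprintf(os.Stderr, "missing required variables: %s\n", strings.Join(missing, ", "))
		os.Exit(1)
	}
	if p := os.Getenv("LLM_PROVIDER"); !validProvider(p) {
		fmt.Fprintf(os.Stderr, "unsupported LLM_PROVIDER %q\n", p)
		os.Exit(1)
	}
	fmt.Println("configuration looks complete")
}
```

Failing fast at startup keeps a missing token from surfacing later as an opaque HTTP error from paperless-ngx or the LLM provider.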
#### Optional Variables
```
PAPERLESS_PUBLIC_URL=public_url
MANUAL_TAG=paperless-gpt
AUTO_TAG=paperless-gpt-auto
OPENAI_API_KEY=key  # if using OpenAI
OPENAI_BASE_URL=custom_url
LLM_LANGUAGE=English
OLLAMA_HOST=host_url
VISION_LLM_PROVIDER=provider
VISION_LLM_MODEL=model
AUTO_OCR_TAG=tag
OCR_LIMIT_PAGES=5
LOG_LEVEL=info
```
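Optional variables are typically read with a fallback. The sketch below assumes the example values shown above (`paperless-gpt`, `paperless-gpt-auto`, `info`, `5`) double as defaults, which this document does not state explicitly. `envOr` and `parsePageLimit` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// envOr returns the value of key, or fallback when the variable is unset
// or empty (hypothetical helper).
func envOr(key, fallback string) string {
	if v, ok := os.LookupEnv(key); ok && v != "" {
		return v
	}
	return fallback
}

// parsePageLimit converts an OCR_LIMIT_PAGES value to an int.
// Assumption: non-numeric or negative input falls back to the default.
func parsePageLimit(raw string, fallback int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 {
		return fallback
	}
	return n
}

func main() {
	manualTag := envOr("MANUAL_TAG", "paperless-gpt")
	autoTag := envOr("AUTO_TAG", "paperless-gpt-auto")
	logLevel := envOr("LOG_LEVEL", "info")
	pages := parsePageLimit(os.Getenv("OCR_LIMIT_PAGES"), 5)
	fmt.Println(manualTag, autoTag, logLevel, pages)
}
```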
### Docker Configuration
- Network configuration for service communication
- Volume mounts for prompts and persistence
- Resource limits and scaling options
- Port mappings for web interface
### LLM Provider Setup
#### OpenAI Configuration
- API key management
- Model selection
- Base URL configuration (for custom endpoints)
- Vision API access for OCR
#### Ollama Configuration
- Server setup and hosting
- Model installation and management
- Network access configuration
- Resource allocation
### Custom Prompts
#### Template Files
- title_prompt.tmpl
- tag_prompt.tmpl
- ocr_prompt.tmpl
- correspondent_prompt.tmpl
#### Template Variables
- Language
- Content
- AvailableTags
- OriginalTags
- Title
- AvailableCorrespondents
- BlackList
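To illustrate how the variables above could be consumed, here is a minimal sketch that assumes the `.tmpl` files are Go `text/template` templates (the extension and the Go backend suggest this, but the document does not confirm it). The template body, the `promptData` struct, and the `join` helper are made up for illustration and are not the project's shipped prompts.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// promptData carries a subset of the template variables listed above.
type promptData struct {
	Language      string
	Content       string
	AvailableTags []string
}

// examplePrompt is an illustrative template body, not the real tag_prompt.tmpl.
const examplePrompt = `Answer in {{.Language}}.
Choose tags from: {{join .AvailableTags ", "}}.
Document:
{{.Content}}
`

// renderPrompt parses tmplText and executes it against data.
func renderPrompt(tmplText string, data promptData) (string, error) {
	tmpl, err := template.New("prompt").
		Funcs(template.FuncMap{"join": strings.Join}). // hypothetical helper function
		Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderPrompt(examplePrompt, promptData{
		Language:      "English",
		Content:       "Invoice no. 42 from ACME Corp.",
		AvailableTags: []string{"invoice", "tax", "insurance"},
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```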
## Technical Constraints
### Performance Considerations
- Token limits for LLM requests
- OCR page limits
- Concurrent processing limits
- Network bandwidth requirements
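Two of these limits can be sketched in a few lines of Go. The example below is an illustration under stated assumptions, not the project's implementation: `limitPages` and `processAll` are hypothetical names, a page limit of zero or less is assumed to mean "no limit", and the concurrency cap uses a buffered channel as a counting semaphore.

```go
package main

import (
	"fmt"
	"sync"
)

// limitPages keeps at most max pages.
// Assumption: max <= 0 means "no limit".
func limitPages(pages []string, max int) []string {
	if max <= 0 || len(pages) <= max {
		return pages
	}
	return pages[:max]
}

// processAll runs work on every item with at most limit goroutines active
// at once and returns the results in input order.
func processAll(items []string, limit int, work func(string) string) []string {
	if limit < 1 {
		limit = 1
	}
	results := make([]string, len(items))
	sem := make(chan struct{}, limit) // buffered channel used as a semaphore
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit workers are already running
		go func(i int, item string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			results[i] = work(item)  // each goroutine writes its own index
		}(i, item)
	}
	wg.Wait()
	return results
}

func main() {
	pages := limitPages([]string{"p1", "p2", "p3", "p4"}, 3)
	out := processAll(pages, 2, func(p string) string { return "ocr(" + p + ")" })
	fmt.Println(out)
}
```

Capping concurrency this way bounds both the number of in-flight LLM requests and the memory held by pending page images.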
### Security Requirements
- API token security
- Environment variable management
- Network isolation
- Data privacy considerations
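One small, concrete aspect of API token security is keeping secrets out of log output. The following is a hedged sketch only: `maskToken` is a hypothetical helper, and it assumes tokens are ASCII strings so that slicing by byte is safe.

```go
package main

import (
	"fmt"
	"strings"
)

// maskToken hides all but the last four characters of a secret so it can
// be logged for debugging without exposing the full value.
// Assumption: the token is ASCII, so byte slicing does not split a rune.
func maskToken(token string) string {
	const visible = 4
	if len(token) <= visible {
		return strings.Repeat("*", len(token))
	}
	return strings.Repeat("*", len(token)-visible) + token[len(token)-visible:]
}

func main() {
	// The token value here is a placeholder, not a real credential.
	fmt.Println("using paperless token", maskToken("your_paperless_api_token"))
}
```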
### Integration Requirements
- paperless-ngx compatibility
- LLM provider API compatibility
- Docker environment compatibility
- Web browser compatibility