feat: initialize memory bank documentation for paperless-gpt project

Dominik Schröter 2025-02-03 09:30:53 +01:00
parent 1bd25a297c
commit 78ecc12335
5 changed files with 412 additions and 0 deletions

@ -0,0 +1,49 @@
# Active Context
## Current Task
Creating and initializing the memory bank documentation system for the paperless-gpt project.
## Recent Changes
1. Created memory bank directory structure
2. Initialized core documentation files:
   - productContext.md: Project purpose and functionality
   - systemPatterns.md: Architecture and design patterns
   - techContext.md: Technical stack and setup
   - activeContext.md: This file (current state)
   - progress.md: To be created next
## Current State
- Initial documentation setup phase
- Core project understanding established
- Key systems and patterns documented
- Technical requirements captured
## Next Steps
### Immediate Tasks
1. Create progress.md to track project status
2. Verify all memory bank files are complete
3. Review documentation for any gaps
4. Ensure all critical information is captured
### Future Considerations
1. Keep documentation updated with:
   - New feature implementations
   - Architecture changes
   - Configuration updates
   - Bug fixes and improvements
2. Documentation maintenance:
   - Regular reviews for accuracy
   - Updates for new developments
   - Removal of obsolete information
   - Addition of new patterns/technologies
## Active Questions/Issues
None currently - initial setup phase
## Recent Decisions
1. Created comprehensive documentation structure
2. Organized information into logical sections
3. Prioritized key system components
4. Established documentation patterns

@ -0,0 +1,57 @@
# Product Context
## Project Purpose
paperless-gpt is an AI-powered companion application for paperless-ngx. It automates document organization tasks (title generation, tagging, correspondent identification, and OCR) using Large Language Models.
## Problems Solved
1. Manual Document Organization
   - Eliminates time-consuming manual tagging and title creation
   - Reduces human error in document categorization
   - Streamlines document processing workflow
2. OCR Quality
   - Improves text extraction from poor quality scans
   - Provides context-aware OCR capabilities
   - Handles complex document layouts better than traditional OCR
3. Document Categorization
   - Automates correspondent identification
   - Provides intelligent tag suggestions
   - Generates meaningful document titles
## Core Functionality
### 1. LLM-Enhanced OCR
- Uses Large Language Models for better text extraction
- Handles messy or low-quality scans effectively
- Provides context-aware text interpretation
### 2. Automatic Document Processing
- Title Generation: Creates descriptive titles based on content
- Tag Generation: Suggests relevant tags from existing tag set
- Correspondent Identification: Automatically detects document senders/recipients
### 3. Integration Features
- Seamless paperless-ngx integration
- Docker-based deployment
- Customizable prompt templates
- Support for multiple LLM providers (OpenAI, Ollama)
### 4. User Interface
- Web-based management interface
- Manual review capabilities
- Batch processing support
- Auto-processing workflow option
## Usage Flow
1. Documents are tagged with specific markers (e.g., 'paperless-gpt')
2. System processes documents using AI/LLM capabilities
3. Results can be automatically applied or manually reviewed
4. Processed documents are updated in paperless-ngx
## Configuration Options
- Manual vs. automatic processing
- LLM provider selection
- Language preferences
- Processing limits and constraints
- Custom prompt templates

cline_docs/progress.md

@ -0,0 +1,96 @@
# Progress Tracking
## Implemented Features
### Core Functionality
✅ LLM Integration
- OpenAI support
- Ollama support
- Vision model integration for OCR
- Template-based prompts
✅ Document Processing
- Title generation
- Tag suggestion
- Correspondent identification
- LLM-enhanced OCR
✅ Frontend Interface
- Document review UI
- Suggestion management
- Batch processing
- Success feedback
✅ System Integration
- paperless-ngx API integration
- Docker deployment
- Environment configuration
- Custom prompt templates
## Working Components
### Backend Systems
- Go API server
- LLM provider abstraction
- Template engine
- Concurrent document processing
- Error handling
- Logging system
### Frontend Features
- React/TypeScript application
- Document processing interface
- Review system
- Component architecture
- Tailwind styling
### Infrastructure
- Docker containerization
- Docker Compose setup
- Documentation
- Testing framework
## Remaining Tasks
### Features to Implement
None identified - core functionality complete
### Known Issues
- None currently documented
### Potential Improvements
1. Performance Optimizations
   - Token usage optimization
   - Processing speed improvements
   - Caching strategies
2. Feature Enhancements
   - Additional LLM providers
   - Extended template capabilities
   - Enhanced error recovery
   - Advanced OCR options
3. User Experience
   - Advanced configuration UI
   - Batch processing improvements
   - Progress indicators
   - Enhanced error messaging
4. Documentation
   - Additional usage examples
   - Troubleshooting guides
   - Performance tuning guide
   - Development guidelines
## Project Status
- 🟢 Core Features: Complete
- 🟢 Documentation: Initialized
- 🟢 Testing: Implemented
- 🟢 Deployment: Ready
- 🟡 Optimization: Ongoing
## Next Development Priorities
1. Monitor for user feedback
2. Address any discovered issues
3. Implement performance improvements
4. Enhance documentation based on user needs

@ -0,0 +1,88 @@
# System Patterns
## Architecture Overview
### 1. Microservices Architecture
- **paperless-gpt**: AI processing service (Go)
- **paperless-ngx**: Document management system (external)
- Communication via REST API
- Docker-based deployment
### 2. Backend Architecture (Go)
#### Core Components
- **API Server**: HTTP handlers for document processing
- **LLM Integration**: Abstraction for multiple AI providers
- **Template Engine**: Dynamic prompt generation
- **Document Processor**: Handles OCR and metadata generation
#### Key Patterns
- **Template-Based Prompts**: Customizable templates for different AI tasks
- **Content Truncation**: Limits the document content sent to the LLM based on token counts
- **Concurrent Processing**: Goroutines for parallel document processing
- **Mutex-Protected Resources**: Thread-safe template access
- **Error Propagation**: Structured error handling across layers
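
The template, concurrency, and mutex patterns above can be sketched together as follows. This is a minimal illustration, not the project's actual code: `promptStore`, `Set`, `Render`, and `renderAll` are hypothetical names, and the template text is invented for the example.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
	"text/template"
)

// promptStore is a hypothetical holder for a prompt template that can be
// replaced at runtime while other goroutines are rendering it.
type promptStore struct {
	mu   sync.RWMutex
	tmpl *template.Template
}

// Set parses the template text and installs it under the write lock.
func (s *promptStore) Set(text string) error {
	t, err := template.New("title").Parse(text)
	if err != nil {
		return fmt.Errorf("parse template: %w", err)
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.tmpl = t
	return nil
}

// Render executes the current template under the read lock.
func (s *promptStore) Render(data any) (string, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	if s.tmpl == nil {
		return "", fmt.Errorf("no template set")
	}
	var buf bytes.Buffer
	if err := s.tmpl.Execute(&buf, data); err != nil {
		return "", fmt.Errorf("execute template: %w", err)
	}
	return buf.String(), nil
}

// renderAll renders one prompt per document in parallel. Each goroutine
// writes only to its own slice index, so the result keeps the input order.
func renderAll(s *promptStore, contents []string) ([]string, error) {
	prompts := make([]string, len(contents))
	errs := make([]error, len(contents))
	var wg sync.WaitGroup
	for i, c := range contents {
		wg.Add(1)
		go func(i int, c string) {
			defer wg.Done()
			prompts[i], errs[i] = s.Render(map[string]string{"Content": c})
		}(i, c)
	}
	wg.Wait()
	// Propagate the first error with its context instead of swallowing it.
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return prompts, nil
}

func main() {
	store := &promptStore{}
	if err := store.Set("Suggest a title for: {{.Content}}"); err != nil {
		panic(err)
	}
	prompts, err := renderAll(store, []string{"invoice 2024", "rental contract"})
	if err != nil {
		panic(err)
	}
	for _, p := range prompts {
		fmt.Println(p)
	}
}
```

A read/write mutex is used here on the assumption that renders are far more frequent than template updates; a plain `sync.Mutex` would work as well.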
### 3. Frontend Architecture (React/TypeScript)
#### Components
- Document Processor
- Suggestion Review
- Document Cards
- Sidebar Navigation
- Success Modal
#### State Management
- Local component state
- Props for component communication
- API integration for data fetching
### 4. Integration Patterns
#### API Communication
- RESTful endpoints
- JSON payload structure
- Token-based authentication
- Error response handling
#### LLM Provider Integration
- Provider abstraction layer
- Support for multiple providers (OpenAI, Ollama)
- Configurable models and parameters
- Vision model support for OCR
### 5. Data Flow
#### Document Processing Flow
1. Document tagged in paperless-ngx
2. paperless-gpt detects tagged documents
3. AI processing (title/tags/correspondent generation)
4. Manual review or auto-apply
5. Update back to paperless-ngx
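
The flow above could be sketched as a dispatch on document tags. Everything here is illustrative: the types, `classify`, and `process` are hypothetical, the tag names reuse the `MANUAL_TAG` and `AUTO_TAG` example values from techContext.md, and giving the auto tag precedence over the manual tag is an assumption.

```go
package main

import "fmt"

// Document and Suggestion are simplified, hypothetical types.
type Document struct {
	ID   int
	Tags []string
}

type Suggestion struct {
	DocumentID int
	Title      string
}

const (
	manualTag = "paperless-gpt"      // example value of MANUAL_TAG
	autoTag   = "paperless-gpt-auto" // example value of AUTO_TAG
)

type mode int

const (
	modeSkip   mode = iota // document carries neither tag
	modeManual             // suggestions wait for manual review
	modeAuto               // suggestions are applied directly
)

// classify decides how a document is handled based on its tags.
// The auto tag wins if both tags are present (an assumption).
func classify(doc Document) mode {
	m := modeSkip
	for _, t := range doc.Tags {
		switch t {
		case autoTag:
			return modeAuto
		case manualTag:
			m = modeManual
		}
	}
	return m
}

// process runs generate (a stand-in for the LLM call) on tagged documents
// and splits the results into auto-applied and pending-review suggestions.
func process(docs []Document, generate func(Document) Suggestion) (applied, pending []Suggestion) {
	for _, d := range docs {
		switch classify(d) {
		case modeAuto:
			applied = append(applied, generate(d))
		case modeManual:
			pending = append(pending, generate(d))
		}
	}
	return applied, pending
}

func main() {
	docs := []Document{
		{ID: 1, Tags: []string{"paperless-gpt"}},
		{ID: 2, Tags: []string{"paperless-gpt-auto"}},
		{ID: 3, Tags: []string{"inbox"}},
	}
	generate := func(d Document) Suggestion {
		return Suggestion{DocumentID: d.ID, Title: fmt.Sprintf("Suggested title %d", d.ID)}
	}
	applied, pending := process(docs, generate)
	fmt.Println("auto-applied:", len(applied), "pending review:", len(pending))
}
```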
#### OCR Processing Flow
1. Image/PDF input
2. Vision model processing
3. Text extraction and cleanup
4. Integration with document processing
### 6. Security Patterns
- API token authentication
- Environment-based configuration
- Docker container isolation
- Rate limiting and token management
### 7. Development Patterns
- Clear separation of concerns
- Dependency injection
- Interface-based design
- Concurrent processing with safety
- Comprehensive error handling
- Template-based customization
### 8. Testing Patterns
- Unit tests for core logic
- Integration tests for API
- E2E tests for web interface
- Test fixtures and mocks
- Playwright for frontend testing

cline_docs/techContext.md

@ -0,0 +1,122 @@
# Technical Context
## Technology Stack
### Backend (Go)
- **Runtime**: Go
- **Key Libraries**:
  - langchaingo: LLM integration
  - logrus: Structured logging
  - net/http: API server
### Frontend (React/TypeScript)
- **Framework**: React with TypeScript
- **Build Tool**: Vite
- **Testing**: Playwright
- **Styling**: Tailwind CSS
- **Package Manager**: npm
### Infrastructure
- **Containerization**: Docker
- **Deployment**: Docker Compose
- **CI/CD**: GitHub Actions
## Development Setup
### Prerequisites
1. Docker and Docker Compose
2. Go development environment
3. Node.js and npm
4. Access to LLM provider (OpenAI or Ollama)
### Local Development Steps
1. Clone repository
2. Configure environment variables
3. Start paperless-ngx instance
4. Build and run paperless-gpt
5. Access web interface
## Configuration
### Environment Variables
#### Required Variables
```
PAPERLESS_BASE_URL=http://paperless-ngx:8000
PAPERLESS_API_TOKEN=your_paperless_api_token
# LLM_PROVIDER is one of: openai, ollama
LLM_PROVIDER=openai
LLM_MODEL=model_name
```
#### Optional Variables
```
PAPERLESS_PUBLIC_URL=public_url
MANUAL_TAG=paperless-gpt
AUTO_TAG=paperless-gpt-auto
# Needed only if using OpenAI
OPENAI_API_KEY=key
OPENAI_BASE_URL=custom_url
LLM_LANGUAGE=English
OLLAMA_HOST=host_url
VISION_LLM_PROVIDER=provider
VISION_LLM_MODEL=model
AUTO_OCR_TAG=tag
OCR_LIMIT_PAGES=5
LOG_LEVEL=info
```
### Docker Configuration
- Network configuration for service communication
- Volume mounts for prompts and persistence
- Resource limits and scaling options
- Port mappings for web interface
### LLM Provider Setup
#### OpenAI Configuration
- API key management
- Model selection
- Base URL configuration (for custom endpoints)
- Vision API access for OCR
#### Ollama Configuration
- Server setup and hosting
- Model installation and management
- Network access configuration
- Resource allocation
### Custom Prompts
#### Template Files
- title_prompt.tmpl
- tag_prompt.tmpl
- ocr_prompt.tmpl
- correspondent_prompt.tmpl
#### Template Variables
- Language
- Content
- AvailableTags
- OriginalTags
- Title
- AvailableCorrespondents
- BlackList
## Technical Constraints
### Performance Considerations
- Token limits for LLM requests
- OCR page limits
- Concurrent processing limits
- Network bandwidth requirements
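
The token and page limits could be enforced with helpers along these lines. This is only a sketch under stated assumptions: counting whitespace-separated words is a crude stand-in for a real tokenizer, and `truncateToBudget` and `limitPages` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// truncateToBudget keeps at most maxTokens whitespace-separated words.
// Word count is only a rough proxy for LLM tokens; a real tokenizer will
// give different counts. Content within the budget is returned unchanged.
func truncateToBudget(content string, maxTokens int) string {
	if maxTokens <= 0 {
		return ""
	}
	words := strings.Fields(content)
	if len(words) <= maxTokens {
		return content
	}
	return strings.Join(words[:maxTokens], " ")
}

// limitPages returns at most limit pages, as OCR_LIMIT_PAGES might be
// applied. A limit of zero or less is treated as "no limit" (an assumption).
func limitPages(pages []string, limit int) []string {
	if limit <= 0 || len(pages) <= limit {
		return pages
	}
	return pages[:limit]
}

func main() {
	content := "This invoice covers consulting services delivered in January"
	fmt.Println(truncateToBudget(content, 4))

	pages := []string{"page1", "page2", "page3", "page4", "page5", "page6"}
	fmt.Println(len(limitPages(pages, 5)))
}
```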
### Security Requirements
- API token security
- Environment variable management
- Network isolation
- Data privacy considerations
### Integration Requirements
- paperless-ngx compatibility
- LLM provider API compatibility
- Docker environment compatibility
- Web browser compatibility