From 78ecc123350769236810b963ad1b899fb0213e76 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dominik=20Schr=C3=B6ter?=
Date: Mon, 3 Feb 2025 09:30:53 +0100
Subject: [PATCH] feat: initialize memory bank documentation for paperless-gpt project

---
 cline_docs/activeContext.md  |  49 ++++++++++++++
 cline_docs/productContext.md |  57 ++++++++++++++++
 cline_docs/progress.md       |  96 +++++++++++++++++++++++++++
 cline_docs/systemPatterns.md |  88 +++++++++++++++++++++++++
 cline_docs/techContext.md    | 122 +++++++++++++++++++++++++++++++++++
 5 files changed, 412 insertions(+)
 create mode 100644 cline_docs/activeContext.md
 create mode 100644 cline_docs/productContext.md
 create mode 100644 cline_docs/progress.md
 create mode 100644 cline_docs/systemPatterns.md
 create mode 100644 cline_docs/techContext.md

diff --git a/cline_docs/activeContext.md b/cline_docs/activeContext.md
new file mode 100644
index 0000000..33429b4
--- /dev/null
+++ b/cline_docs/activeContext.md
@@ -0,0 +1,49 @@
+# Active Context
+
+## Current Task
+Creating and initializing the memory bank documentation system for the paperless-gpt project.
+
+## Recent Changes
+1. Created memory bank directory structure
+2. Initialized core documentation files:
+   - productContext.md: Project purpose and functionality
+   - systemPatterns.md: Architecture and design patterns
+   - techContext.md: Technical stack and setup
+   - activeContext.md: This file (current state)
+   - progress.md: To be created next
+
+## Current State
+- Initial documentation setup phase
+- Core project understanding established
+- Key systems and patterns documented
+- Technical requirements captured
+
+## Next Steps
+
+### Immediate Tasks
+1. Create progress.md to track project status
+2. Verify all memory bank files are complete
+3. Review documentation for any gaps
+4. Ensure all critical information is captured
+
+### Future Considerations
+1. Keep documentation updated with:
+   - New feature implementations
+   - Architecture changes
+   - Configuration updates
+   - Bug fixes and improvements
+
+2. Documentation maintenance:
+   - Regular reviews for accuracy
+   - Updates for new developments
+   - Removal of obsolete information
+   - Addition of new patterns/technologies
+
+## Active Questions/Issues
+None currently - initial setup phase
+
+## Recent Decisions
+1. Created comprehensive documentation structure
+2. Organized information into logical sections
+3. Prioritized key system components
+4. Established documentation patterns
diff --git a/cline_docs/productContext.md b/cline_docs/productContext.md
new file mode 100644
index 0000000..77454b8
--- /dev/null
+++ b/cline_docs/productContext.md
@@ -0,0 +1,57 @@
+# Product Context
+
+## Project Purpose
+paperless-gpt is an AI-powered companion application designed to enhance the document management capabilities of paperless-ngx by automating document organization tasks through advanced AI technologies.
+
+## Problems Solved
+1. Manual Document Organization
+   - Eliminates time-consuming manual tagging and title creation
+   - Reduces human error in document categorization
+   - Streamlines document processing workflow
+
+2. OCR Quality
+   - Improves text extraction from poor quality scans
+   - Provides context-aware OCR capabilities
+   - Handles complex document layouts better than traditional OCR
+
+3. Document Categorization
+   - Automates correspondent identification
+   - Provides intelligent tag suggestions
+   - Generates meaningful document titles
+
+## Core Functionality
+
+### 1. LLM-Enhanced OCR
+- Uses Large Language Models for better text extraction
+- Handles messy or low-quality scans effectively
+- Provides context-aware text interpretation
+
+### 2. Automatic Document Processing
+- Title Generation: Creates descriptive titles based on content
+- Tag Generation: Suggests relevant tags from existing tag set
+- Correspondent Identification: Automatically detects document senders/recipients
+
+### 3. Integration Features
+- Seamless paperless-ngx integration
+- Docker-based deployment
+- Customizable prompt templates
+- Support for multiple LLM providers (OpenAI, Ollama)
+
+### 4. User Interface
+- Web-based management interface
+- Manual review capabilities
+- Batch processing support
+- Auto-processing workflow option
+
+## Usage Flow
+1. Documents are tagged with specific markers (e.g., 'paperless-gpt')
+2. System processes documents using AI/LLM capabilities
+3. Results can be automatically applied or manually reviewed
+4. Processed documents are updated in paperless-ngx
+
+## Configuration Options
+- Manual vs. automatic processing
+- LLM provider selection
+- Language preferences
+- Processing limits and constraints
+- Custom prompt templates
diff --git a/cline_docs/progress.md b/cline_docs/progress.md
new file mode 100644
index 0000000..b1e293b
--- /dev/null
+++ b/cline_docs/progress.md
@@ -0,0 +1,96 @@
+# Progress Tracking
+
+## Implemented Features
+
+### Core Functionality
+✅ LLM Integration
+- OpenAI support
+- Ollama support
+- Vision model integration for OCR
+- Template-based prompts
+
+✅ Document Processing
+- Title generation
+- Tag suggestion
+- Correspondent identification
+- LLM-enhanced OCR
+
+✅ Frontend Interface
+- Document review UI
+- Suggestion management
+- Batch processing
+- Success feedback
+
+✅ System Integration
+- paperless-ngx API integration
+- Docker deployment
+- Environment configuration
+- Custom prompt templates
+
+## Working Components
+
+### Backend Systems
+- Go API server
+- LLM provider abstraction
+- Template engine
+- Concurrent document processing
+- Error handling
+- Logging system
+
+### Frontend Features
+- React/TypeScript application
+- Document processing interface
+- Review system
+- Component architecture
+- Tailwind styling
+
+### Infrastructure
+- Docker containerization
+- Docker Compose setup
+- Documentation
+- Testing framework
+
+## Remaining Tasks
+
+### Features to Implement
+None identified - core functionality complete
+
+### Known Issues
+- None currently documented
+
+### Potential Improvements
+1. Performance Optimizations
+   - Token usage optimization
+   - Processing speed improvements
+   - Caching strategies
+
+2. Feature Enhancements
+   - Additional LLM providers
+   - Extended template capabilities
+   - Enhanced error recovery
+   - Advanced OCR options
+
+3. User Experience
+   - Advanced configuration UI
+   - Batch processing improvements
+   - Progress indicators
+   - Enhanced error messaging
+
+4. Documentation
+   - Additional usage examples
+   - Troubleshooting guides
+   - Performance tuning guide
+   - Development guidelines
+
+## Project Status
+- 🟢 Core Features: Complete
+- 🟢 Documentation: Initialized
+- 🟢 Testing: Implemented
+- 🟢 Deployment: Ready
+- 🟡 Optimization: Ongoing
+
+## Next Development Priorities
+1. Monitor for user feedback
+2. Address any discovered issues
+3. Implement performance improvements
+4. Enhance documentation based on user needs
diff --git a/cline_docs/systemPatterns.md b/cline_docs/systemPatterns.md
new file mode 100644
index 0000000..9f60734
--- /dev/null
+++ b/cline_docs/systemPatterns.md
@@ -0,0 +1,88 @@
+# System Patterns
+
+## Architecture Overview
+
+### 1. Microservices Architecture
+- **paperless-gpt**: AI processing service (Go)
+- **paperless-ngx**: Document management system (external)
+- Communication via REST API
+- Docker-based deployment
+
+### 2. Backend Architecture (Go)
+
+#### Core Components
+- **API Server**: HTTP handlers for document processing
+- **LLM Integration**: Abstraction for multiple AI providers
+- **Template Engine**: Dynamic prompt generation
+- **Document Processor**: Handles OCR and metadata generation
+
+#### Key Patterns
+- **Template-Based Prompts**: Customizable templates for different AI tasks
+- **Content Truncation**: Smart content limiting based on token counts
+- **Concurrent Processing**: Goroutines for parallel document processing
+- **Mutex-Protected Resources**: Thread-safe template access
+- **Error Propagation**: Structured error handling across layers
+
+### 3. Frontend Architecture (React/TypeScript)
+
+#### Components
+- Document Processor
+- Suggestion Review
+- Document Cards
+- Sidebar Navigation
+- Success Modal
+
+#### State Management
+- Local component state
+- Props for component communication
+- API integration for data fetching
+
+### 4. Integration Patterns
+
+#### API Communication
+- RESTful endpoints
+- JSON payload structure
+- Token-based authentication
+- Error response handling
+
+#### LLM Provider Integration
+- Provider abstraction layer
+- Support for multiple providers (OpenAI, Ollama)
+- Configurable models and parameters
+- Vision model support for OCR
+
+### 5. Data Flow
+
+#### Document Processing Flow
+1. Document tagged in paperless-ngx
+2. paperless-gpt detects tagged documents
+3. AI processing (title/tags/correspondent generation)
+4. Manual review or auto-apply
+5. Update back to paperless-ngx
+
+#### OCR Processing Flow
+1. Image/PDF input
+2. Vision model processing
+3. Text extraction and cleanup
+4. Integration with document processing
+
+### 6. Security Patterns
+- API token authentication
+- Environment-based configuration
+- Docker container isolation
+- Rate limiting and token management
+
+### 7. Development Patterns
+- Clear separation of concerns
+- Dependency injection
+- Interface-based design
+- Concurrent processing with safety
+- Comprehensive error handling
+- Template-based customization
+
+### 8. Testing Patterns
+- Unit tests for core logic
+- Integration tests for API
+- E2E tests for web interface
+- Test fixtures and mocks
+- Playwright for frontend testing
diff --git a/cline_docs/techContext.md b/cline_docs/techContext.md
new file mode 100644
index 0000000..a2f2631
--- /dev/null
+++ b/cline_docs/techContext.md
@@ -0,0 +1,122 @@
+# Technical Context
+
+## Technology Stack
+
+### Backend (Go)
+- **Runtime**: Go
+- **Key Libraries**:
+  - langchaingo: LLM integration
+  - logrus: Structured logging
+  - net/http: API server
+
+### Frontend (React/TypeScript)
+- **Framework**: React with TypeScript
+- **Build Tool**: Vite
+- **Testing**: Playwright
+- **Styling**: Tailwind CSS
+- **Package Manager**: npm
+
+### Infrastructure
+- **Containerization**: Docker
+- **Deployment**: Docker Compose
+- **CI/CD**: GitHub Actions
+
+## Development Setup
+
+### Prerequisites
+1. Docker and Docker Compose
+2. Go development environment
+3. Node.js and npm
+4. Access to LLM provider (OpenAI or Ollama)
+
+### Local Development Steps
+1. Clone repository
+2. Configure environment variables
+3. Start paperless-ngx instance
+4. Build and run paperless-gpt
+5. Access web interface
+
+## Configuration
+
+### Environment Variables
+
+#### Required Variables
+```
+PAPERLESS_BASE_URL=http://paperless-ngx:8000
+PAPERLESS_API_TOKEN=your_paperless_api_token
+LLM_PROVIDER=openai|ollama
+LLM_MODEL=model_name
+```
+
+#### Optional Variables
+```
+PAPERLESS_PUBLIC_URL=public_url
+MANUAL_TAG=paperless-gpt
+AUTO_TAG=paperless-gpt-auto
+OPENAI_API_KEY=key (if using OpenAI)
+OPENAI_BASE_URL=custom_url
+LLM_LANGUAGE=English
+OLLAMA_HOST=host_url
+VISION_LLM_PROVIDER=provider
+VISION_LLM_MODEL=model
+AUTO_OCR_TAG=tag
+OCR_LIMIT_PAGES=5
+LOG_LEVEL=info
+```
+
+### Docker Configuration
+- Network configuration for service communication
+- Volume mounts for prompts and persistence
+- Resource limits and scaling options
+- Port mappings for web interface
+
+### LLM Provider Setup
+
+#### OpenAI Configuration
+- API key management
+- Model selection
+- Base URL configuration (for custom endpoints)
+- Vision API access for OCR
+
+#### Ollama Configuration
+- Server setup and hosting
+- Model installation and management
+- Network access configuration
+- Resource allocation
+
+### Custom Prompts
+
+#### Template Files
+- title_prompt.tmpl
+- tag_prompt.tmpl
+- ocr_prompt.tmpl
+- correspondent_prompt.tmpl
+
+#### Template Variables
+- Language
+- Content
+- AvailableTags
+- OriginalTags
+- Title
+- AvailableCorrespondents
+- BlackList
+
+## Technical Constraints
+
+### Performance Considerations
+- Token limits for LLM requests
+- OCR page limits
+- Concurrent processing limits
+- Network bandwidth requirements
+
+### Security Requirements
+- API token security
+- Environment variable management
+- Network isolation
+- Data privacy considerations
+
+### Integration Requirements
+- paperless-ngx compatibility
+- LLM provider API compatibility
+- Docker environment compatibility
+- Web browser compatibility
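
Reviewer note on the Required Variables section of techContext.md: the four required variables could be checked at startup roughly as sketched below. This is a minimal Go sketch under stated assumptions, not the project's actual code. The `Config` struct, the `loadConfig` function, and the injected `getenv` lookup are hypothetical names introduced for illustration; only the four variable names and the `openai`/`ollama` provider values come from the documentation in this patch.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Config holds the four required settings named in techContext.md.
// The struct and its field names are illustrative, not the project's own types.
type Config struct {
	PaperlessBaseURL  string
	PaperlessAPIToken string
	LLMProvider       string
	LLMModel          string
}

// loadConfig reads the required variables through getenv (for example os.Getenv)
// and reports all missing variables in a single error.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		PaperlessBaseURL:  getenv("PAPERLESS_BASE_URL"),
		PaperlessAPIToken: getenv("PAPERLESS_API_TOKEN"),
		LLMProvider:       getenv("LLM_PROVIDER"),
		LLMModel:          getenv("LLM_MODEL"),
	}

	required := []struct{ name, value string }{
		{"PAPERLESS_BASE_URL", cfg.PaperlessBaseURL},
		{"PAPERLESS_API_TOKEN", cfg.PaperlessAPIToken},
		{"LLM_PROVIDER", cfg.LLMProvider},
		{"LLM_MODEL", cfg.LLMModel},
	}
	var missing []string
	for _, r := range required {
		if r.value == "" {
			missing = append(missing, r.name)
		}
	}
	if len(missing) > 0 {
		return Config{}, fmt.Errorf("missing required variables: %s", strings.Join(missing, ", "))
	}

	// The documentation lists exactly two provider values.
	switch cfg.LLMProvider {
	case "openai", "ollama":
	default:
		return Config{}, fmt.Errorf("unsupported LLM_PROVIDER %q (expected openai or ollama)", cfg.LLMProvider)
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Printf("using %s model %s against %s\n", cfg.LLMProvider, cfg.LLMModel, cfg.PaperlessBaseURL)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly inside `loadConfig`, keeps the check testable without touching the process environment.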