Merge c6521ef312 into 1bd25a297c

2025-03-12 21:08:00 -05:00 · 2025-02-03 09:28:32 +01:00 · 2025-02-03 09:28:32 +01:00 · 8dc489e035
commit 8dc489e035
parent 1bd25a297c c6521ef312
13 changed files with 1724 additions and 0 deletions
--- a/.github/ISSUE_TEMPLATE/bug_report.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,116 @@
+name: Bug Report
+description: Create a report to help us improve
+title: "[BUG] "
+labels: ["bug", "triage"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for taking the time to fill out this bug report!
+        Before submitting, please check if a similar issue already exists.
+
+  - type: input
+    id: version
+    attributes:
+      label: Version
+      description: What version of paperless-gpt are you running?
+      placeholder: "e.g., 1.0.0"
+    validations:
+      required: true
+
+  - type: dropdown
+    id: deployment
+    attributes:
+      label: Deployment Method
+      description: How are you running paperless-gpt?
+      options:
+        - Docker (official image)
+        - Docker Compose
+        - Manual Installation
+        - Other
+    validations:
+      required: true
+
+  - type: input
+    id: llm-provider
+    attributes:
+      label: LLM Provider
+      description: Which LLM provider are you using?
+      placeholder: "e.g., OpenAI, Ollama"
+    validations:
+      required: true
+
+  - type: input
+    id: llm-model
+    attributes:
+      label: LLM Model
+      description: Which model are you using?
+      placeholder: "e.g., gpt-4, llama2"
+    validations:
+      required: true
+
+  - type: dropdown
+    id: os
+    attributes:
+      label: Operating System
+      description: What operating system are you using?
+      options:
+        - Linux
+        - macOS
+        - Windows
+        - Other
+    validations:
+      required: true
+
+  - type: textarea
+    id: what-happened
+    attributes:
+      label: What happened?
+      description: A clear and concise description of the bug.
+      placeholder: "Tell us what you see!"
+    validations:
+      required: true
+
+  - type: textarea
+    id: expected
+    attributes:
+      label: Expected behavior
+      description: What did you expect to happen?
+      placeholder: "Tell us what you expected"
+    validations:
+      required: true
+
+  - type: textarea
+    id: reproduction
+    attributes:
+      label: Steps to reproduce
+      description: How can we reproduce this issue?
+      placeholder: |
+        1. Go to '...'
+        2. Click on '...'
+        3. Scroll down to '...'
+        4. See error
+    validations:
+      required: true
+
+  - type: textarea
+    id: logs
+    attributes:
+      label: Relevant log output
+      description: Please copy and paste any relevant log output. This will be automatically formatted into code.
+      render: shell
+
+  - type: textarea
+    id: config
+    attributes:
+      label: Configuration
+      description: |
+        Please provide your configuration (with sensitive information redacted).
+        This could be your docker-compose.yml or environment variables.
+      render: yaml
+
+  - type: textarea
+    id: additional
+    attributes:
+      label: Additional context
+      description: Add any other context about the problem here
--- a/.github/ISSUE_TEMPLATE/feature_request.yml
+++ b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,118 @@
+name: Feature Request
+description: Suggest an idea for this project
+title: "[FEATURE] "
+labels: ["enhancement"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for taking the time to suggest a new feature!
+        Please fill out this form as completely as possible.
+
+  - type: textarea
+    id: problem
+    attributes:
+      label: Problem Statement
+      description: Is your feature request related to a problem? Please describe.
+      placeholder: "I'm always frustrated when [...]"
+    validations:
+      required: true
+
+  - type: textarea
+    id: solution
+    attributes:
+      label: Proposed Solution
+      description: Describe the solution you'd like to see
+      placeholder: "It would be great if [...]"
+    validations:
+      required: true
+
+  - type: textarea
+    id: alternatives
+    attributes:
+      label: Alternatives Considered
+      description: Describe any alternative solutions or features you've considered
+      placeholder: "I've thought about [...]"
+
+  - type: dropdown
+    id: importance
+    attributes:
+      label: Importance Level
+      description: How important is this feature to your use case?
+      options:
+        - Critical (Blocking my use of the project)
+        - High (Would significantly improve my workflow)
+        - Medium (Would be nice to have)
+        - Low (Just an idea)
+    validations:
+      required: true
+
+  - type: dropdown
+    id: component
+    attributes:
+      label: Component
+      description: Which part of paperless-gpt would this feature primarily affect?
+      options:
+        - OCR Processing
+        - LLM Integration
+        - Document Management
+        - UI/UX
+        - API
+        - Configuration
+        - Documentation
+        - Performance
+        - Security
+        - Other
+    validations:
+      required: true
+
+  - type: dropdown
+    id: scope
+    attributes:
+      label: Implementation Scope
+      description: How extensive would the changes be?
+      options:
+        - Minor (Simple change, few files)
+        - Moderate (Multiple files, some complexity)
+        - Major (Significant changes, new features)
+        - Breaking (Requires breaking changes)
+    validations:
+      required: true
+
+  - type: textarea
+    id: context
+    attributes:
+      label: Additional Context
+      description: Add any other context about the feature request here
+      placeholder: "Include use cases, benefits, or screenshots"
+
+  - type: textarea
+    id: implementation
+    attributes:
+      label: Implementation Ideas
+      description: If you have specific ideas about how to implement this feature, please share them
+      placeholder: "We could implement this by..."
+
+  - type: checkboxes
+    id: terms
+    attributes:
+      label: Contribution
+      description: Would you be interested in helping implement this feature?
+      options:
+        - label: I'm interested in contributing to this feature's implementation
+          required: false
+        - label: I have read the contribution guidelines
+          required: true
+
+  - type: textarea
+    id: success_criteria
+    attributes:
+      label: Success Criteria
+      description: What would make this feature implementation successful?
+      placeholder: |
+        Example criteria:
+        - Feature works with both OpenAI and Ollama
+        - Performance impact is minimal
+        - No breaking changes to existing functionality
+    validations:
+      required: true
--- a/.github/config.yml
+++ b/.github/config.yml
@@ -0,0 +1,163 @@
+# GitHub App Configuration
+
+# Label Configuration
+labels:
+  # Type labels
+  - name: bug
+    color: d73a4a
+    description: Something isn't working
+  - name: enhancement
+    color: a2eeef
+    description: New feature or request
+  - name: documentation
+    color: 0075ca
+    description: Documentation improvements
+  - name: security
+    color: ee0701
+    description: Security-related issues
+  
+  # Priority labels
+  - name: critical
+    color: b60205
+    description: Needs immediate attention
+  - name: high
+    color: d93f0b
+    description: High priority
+  - name: medium
+    color: fbca04
+    description: Medium priority
+  - name: low
+    color: 0e8a16
+    description: Low priority
+
+  # Status labels
+  - name: triage
+    color: d4c5f9
+    description: Needs triage
+  - name: in-progress
+    color: 9ee12f
+    description: Work in progress
+  - name: blocked
+    color: b60205
+    description: Blocked or needs clarification
+  
+  # Component labels
+  - name: frontend
+    color: 1d76db
+    description: Frontend related
+  - name: backend
+    color: 0052cc
+    description: Backend related
+  - name: ocr
+    color: 5319e7
+    description: OCR functionality
+  - name: llm
+    color: 006b75
+    description: LLM integration
+  
+  # Size labels
+  - name: size/xs
+    color: d4c5f9
+    description: Extra small change
+  - name: size/s
+    color: 84b6eb
+    description: Small change
+  - name: size/m
+    color: fbca04
+    description: Medium change
+  - name: size/l
+    color: d93f0b
+    description: Large change
+  - name: size/xl
+    color: b60205
+    description: Extra large change
+
+# Stale issue configuration
+stale:
+  daysUntilStale: 60
+  daysUntilClose: 7
+  exemptLabels:
+    - security
+    - critical
+    - pinned
+  staleLabel: stale
+  markComment: >
+    This issue has been automatically marked as stale because it has not had
+    recent activity. It will be closed if no further activity occurs. Thank you
+    for your contributions.
+  closeComment: >
+    This issue has been automatically closed due to inactivity. Please feel free
+    to reopen it if you still experience this problem.
+
+# Welcome message for new contributors
+newContributorWelcomeComment: >
+  Thanks for making your first contribution to paperless-gpt! 🎉
+  
+  Please make sure you've read our [Contributing Guidelines](CONTRIBUTING.md)
+  and [Code of Conduct](CODE_OF_CONDUCT.md).
+  
+  If you need any help, feel free to mention @icereed or ask in our Discord.
+
+# PR size labeling
+prSize:
+  xs:
+    lines: 10
+  s:
+    lines: 50
+  m:
+    lines: 250
+  l:
+    lines: 500
+  xl:
+    lines: 1000
+
+# Code review settings
+reviews:
+  request_count: 1
+  notify_on_changes: true
+  auto_assign: true
+  auto_merge: false
+
+# Branch protection settings
+branchProtection:
+  main:
+    required_status_checks:
+      - "build"
+      - "test"
+      - "lint"
+    enforce_admins: true
+    required_pull_request_reviews:
+      required_approving_review_count: 1
+      dismiss_stale_reviews: true
+      require_code_owner_reviews: true
+    allow_force_pushes: false
+    allow_deletions: false
+
+# Issue template settings
+issueTemplate:
+  checkNew: true
+  useConfigure: true
+  configureMessage: >
+    Please use our issue templates to report bugs or request features.
+    This helps us track and resolve issues more effectively.
+
+# Pull request template settings
+pullRequestTemplate:
+  checkNew: true
+  useConfigure: true
+  configureMessage: >
+    Please make sure your PR follows our guidelines and includes all necessary information.
+    Don't forget to link any related issues.
+
+# Repository settings
+repository:
+  private: false
+  has_issues: true
+  has_projects: true
+  has_wiki: true
+  has_downloads: true
+  default_branch: main
+  allow_squash_merge: true
+  allow_merge_commit: false
+  allow_rebase_merge: true
+  delete_branch_on_merge: true
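The `prSize` thresholds in `.github/config.yml` can be read as a lookup from changed-line count to a `size/*` label. The following is a minimal Go sketch of that mapping, not the labeling app's actual logic: treating each value as an inclusive upper bound, and falling back to `size/xl` above 1000 lines, are both assumptions, and `sizeLabel` is a hypothetical helper name.

```go
package main

import "fmt"

// sizeThreshold pairs a label with the largest changed-line count it covers.
type sizeThreshold struct {
	label    string
	maxLines int
}

// thresholds mirrors the prSize values in .github/config.yml.
// Treating each value as an inclusive upper bound is an assumption.
var thresholds = []sizeThreshold{
	{"size/xs", 10},
	{"size/s", 50},
	{"size/m", 250},
	{"size/l", 500},
	{"size/xl", 1000},
}

// sizeLabel returns the first label whose bound is >= changedLines.
// Changes above the largest bound fall back to the largest label (assumption).
func sizeLabel(changedLines int) string {
	for _, t := range thresholds {
		if changedLines <= t.maxLines {
			return t.label
		}
	}
	return thresholds[len(thresholds)-1].label
}

func main() {
	for _, n := range []int{3, 50, 120, 1724} {
		fmt.Printf("%d changed lines -> %s\n", n, sizeLabel(n))
	}
}
```

Under these assumptions, a change the size of this commit (1724 additions) would land in the `size/xl` bucket.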
--- a/.github/pull_request_template.md
+++ b/.github/pull_request_template.md
@@ -0,0 +1,79 @@
+# Description
+
+Please include a summary of the changes and the related issue. Please also include relevant motivation and context. List any dependencies that are required for this change.
+
+Fixes # (issue)
+
+## Type of change
+
+Please delete options that are not relevant.
+
+- [ ] Bug fix (non-breaking change which fixes an issue)
+- [ ] New feature (non-breaking change which adds functionality)
+- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
+- [ ] Documentation update
+- [ ] This change requires a documentation update
+
+## Checklist
+
+Before submitting your PR, please review the following checklist:
+
+### General
+- [ ] I have performed a self-review of my code
+- [ ] I have commented my code, particularly in hard-to-understand areas
+- [ ] I have made corresponding changes to the documentation
+- [ ] My changes generate no new warnings
+- [ ] Any dependent changes have been merged and published
+- [ ] I have checked my code and corrected any misspellings
+
+### Testing
+- [ ] I have added tests that prove my fix is effective or that my feature works
+- [ ] New and existing unit tests pass locally with my changes
+- [ ] I have tested this code in a development environment
+- [ ] I have tested edge cases and error conditions
+
+### Security
+- [ ] My code follows the project's security guidelines
+- [ ] I have conducted a security impact assessment of my changes
+- [ ] I have verified no sensitive information is exposed
+
+### Performance
+- [ ] I have verified my changes don't introduce performance regressions
+- [ ] I have optimized any resource-intensive operations
+- [ ] I have considered the impact on system resources
+
+### Documentation
+- [ ] I have updated the README.md (if applicable)
+- [ ] I have updated the API documentation (if applicable)
+- [ ] I have updated architecture docs (if applicable)
+- [ ] I have added JSDoc/comments for all new code
+
+### Dependencies
+- [ ] I have updated the dependency list (if applicable)
+- [ ] I have checked for and resolved any dependency conflicts
+- [ ] I have verified compatibility with existing dependencies
+
+### Compatibility
+- [ ] My changes are backward compatible
+- [ ] I have tested with different LLM providers
+- [ ] I have tested with different configurations
+- [ ] I have verified Docker compatibility
+
+### Code Quality
+- [ ] My code follows the project's style guidelines
+- [ ] I have run linting tools and fixed any issues
+- [ ] I have maintained or improved code coverage
+- [ ] I have followed SOLID principles
+
+## Screenshots/Videos
+
+If applicable, add screenshots or videos to help explain your changes.
+
+## Additional Notes
+
+Add any other context about the PR here.
+
+## Linked Issues
+
+- Resolves #(issue number)
+- Related to #(issue number)
--- a/.github/workflows/code-quality.yml
+++ b/.github/workflows/code-quality.yml
@@ -0,0 +1,217 @@
+name: Code Quality
+
+on:
+  push:
+    branches: [ main ]
+  pull_request:
+    branches: [ main ]
+
+permissions:
+  contents: read
+  pull-requests: write
+
+jobs:
+  lint:
+    name: Lint
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+
+      - name: Install golangci-lint
+        run: |
+          curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.55.2
+
+      - name: Go Lint
+        uses: golangci/golangci-lint-action@v4
+        with:
+          version: latest
+          args: --timeout=5m
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          cache: 'npm'
+          cache-dependency-path: './web-app/package-lock.json'
+
+      - name: Install frontend dependencies
+        run: npm ci
+        working-directory: ./web-app
+
+      - name: Frontend Lint
+        run: npm run lint
+        working-directory: ./web-app
+
+  type-check:
+    name: Type Check
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+
+      - name: Go Type Check
+        run: go vet ./...
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          cache: 'npm'
+          cache-dependency-path: './web-app/package-lock.json'
+
+      - name: Install frontend dependencies
+        run: npm ci
+        working-directory: ./web-app
+
+      - name: TypeScript Check
+        run: npm run type-check
+        working-directory: ./web-app
+
+  security:
+    name: Security Scan
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Run Gosec Security Scanner
+        uses: securego/gosec@master
+        with:
+          args: './...'
+
+      - name: Run npm audit
+        run: npm audit
+        working-directory: ./web-app
+
+      - name: Run Snyk to check for vulnerabilities
+        uses: snyk/actions/node@master
+        env:
+          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
+        with:
+          args: --severity-threshold=high --all-projects
+
+  coverage:
+    name: Code Coverage
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+
+      - name: Install mupdf
+        run: sudo apt-get install -y mupdf
+
+      - name: Set library path
+        run: echo "/usr/lib" | sudo tee -a /etc/ld.so.conf.d/mupdf.conf && sudo ldconfig
+
+      - name: Run Go Coverage
+        run: |
+          go test -race -coverprofile=coverage.txt -covermode=atomic ./...
+          go tool cover -func=coverage.txt
+
+      - name: Upload Go coverage to Codecov
+        uses: codecov/codecov-action@v4
+        with:
+          file: ./coverage.txt
+          flags: backend
+          fail_ci_if_error: true
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          cache: 'npm'
+          cache-dependency-path: './web-app/package-lock.json'
+
+      - name: Install frontend dependencies
+        run: npm ci
+        working-directory: ./web-app
+
+      - name: Run Frontend Coverage
+        run: npm run test:coverage
+        working-directory: ./web-app
+
+      - name: Upload Frontend coverage to Codecov
+        uses: codecov/codecov-action@v4
+        with:
+          file: ./web-app/coverage/coverage-final.json
+          flags: frontend
+          fail_ci_if_error: true
+
+  format:
+    name: Code Formatting
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+
+      - name: Check Go Formatting
+        run: |
+          if [ -n "$(gofmt -l .)" ]; then
+            echo "Go files need formatting:"
+            gofmt -d .
+            exit 1
+          fi
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          cache: 'npm'
+          cache-dependency-path: './web-app/package-lock.json'
+
+      - name: Install frontend dependencies
+        run: npm ci
+        working-directory: ./web-app
+
+      - name: Check Frontend Formatting
+        run: npm run format:check
+        working-directory: ./web-app
+
+  complexity:
+    name: Code Complexity
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+
+      - name: Install gocyclo
+        run: go install github.com/fzipp/gocyclo/cmd/gocyclo@latest
+
+      - name: Check Go Code Complexity
+        run: |
+          gocyclo -over 15 .
+
+      - name: Set up Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          cache: 'npm'
+          cache-dependency-path: './web-app/package-lock.json'
+
+      - name: Install frontend dependencies
+        run: npm ci
+        working-directory: ./web-app
+
+      - name: Check Frontend Complexity
+        run: npx ts-complexity ./src --max-complexity 15
+        working-directory: ./web-app
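The `format` job in `code-quality.yml` fails when `gofmt -l .` lists any file. The same check can be reproduced in-process with the standard `go/format` package, which applies gofmt's formatting rules. This is a minimal sketch for a single source buffer; `isFormatted` is a hypothetical helper, and walking the repository tree is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
)

// isFormatted reports whether src is already in canonical gofmt form.
// It returns an error if src is not syntactically valid Go.
func isFormatted(src []byte) (bool, error) {
	formatted, err := format.Source(src)
	if err != nil {
		return false, err
	}
	return bytes.Equal(src, formatted), nil
}

func main() {
	clean := []byte("package demo\n\nfunc Add(a, b int) int {\n\treturn a + b\n}\n")
	messy := []byte("package demo\n\nfunc Add(a,b int) int {\nreturn a+b\n}\n")

	for name, src := range map[string][]byte{"clean": clean, "messy": messy} {
		ok, err := isFormatted(src)
		if err != nil {
			fmt.Println(name, "failed to parse:", err)
			continue
		}
		fmt.Printf("%s formatted: %v\n", name, ok)
	}
}
```

A CI step built on this idea would exit non-zero as soon as any file reports `false`, matching the `exit 1` in the workflow.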
--- a/.github/workflows/documentation.yml
+++ b/.github/workflows/documentation.yml
@@ -0,0 +1,193 @@
+name: Documentation
+
+on:
+  push:
+    branches: [ main ]
+    paths:
+      - '**/*.md'
+      - 'docs/**'
+      - '.github/workflows/documentation.yml'
+  pull_request:
+    branches: [ main ]
+    paths:
+      - '**/*.md'
+      - 'docs/**'
+      - '.github/workflows/documentation.yml'
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+jobs:
+  markdown-lint:
+    name: Markdown Lint
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Install markdownlint
+        run: npm install -g markdownlint-cli
+
+      - name: Check Markdown files
+        run: markdownlint '**/*.md' --ignore node_modules
+
+      - name: Check for broken links
+        uses: gaurav-nelson/github-action-markdown-link-check@v1
+        with:
+          use-quiet-mode: 'yes'
+          use-verbose-mode: 'yes'
+          config-file: '.github/workflows/mlc_config.json'
+          folder-path: '.'
+          max-depth: -1
+
+  api-documentation:
+    name: API Documentation
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+
+      - name: Install swag
+        run: go install github.com/swaggo/swag/cmd/swag@latest
+
+      - name: Generate Swagger Documentation
+        run: swag init
+
+      - name: Check if documentation changed
+        run: |
+          if [[ `git status --porcelain` ]]; then
+            echo "API documentation is out of date. Please run 'swag init' locally and commit the changes."
+            exit 1
+          fi
+
+  typescript-documentation:
+    name: TypeScript Documentation
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Install TypeDoc
+        run: npm install -g typedoc
+
+      - name: Generate TypeScript Documentation
+        working-directory: ./web-app
+        run: typedoc --out docs/typescript src/
+
+      - name: Check documentation style
+        working-directory: ./web-app
+        run: |
+          if find src -name "*.tsx" -o -name "*.ts" | xargs grep -l "@todo\|FIXME"; then
+            echo "Found @todo or FIXME comments in the code. Please resolve them before merging."
+            exit 1
+          fi
+
+  spelling:
+    name: Documentation Spelling
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Check Spelling
+        uses: streetsidesoftware/cspell-action@v5
+        with:
+          files: |
+            **/*.md
+            docs/**/*
+            
+  validate-examples:
+    name: Validate Code Examples
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Install dependencies
+        run: npm install markdown-code-block-runner
+
+      - name: Validate code examples in documentation
+        run: npx markdown-code-block-runner "**/*.md"
+
+  build-wiki:
+    name: Build Wiki
+    needs: [markdown-lint, spelling]
+    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup mdBook
+        uses: peaceiris/actions-mdbook@v1
+        with:
+          mdbook-version: 'latest'
+
+      - name: Build documentation
+        run: |
+          mdbook build docs/
+
+      - name: Setup Pages
+        uses: actions/configure-pages@v4
+
+      - name: Upload artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: 'docs/book'
+
+      - name: Deploy to GitHub Pages
+        id: deployment
+        uses: actions/deploy-pages@v4
+
+  check-docs-coverage:
+    name: Documentation Coverage
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.22'
+
+      - name: Install misspell (spell checker)
+        run: go install github.com/client9/misspell/cmd/misspell@latest
+
+      - name: Check public API documentation coverage
+        run: |
+          COVERAGE=$(go list ./... | xargs -n 1 go doc -all | wc -l)
+          if [ "$COVERAGE" -lt 100 ]; then
+            echo "Documentation coverage is below threshold"
+            exit 1
+          fi
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+
+      - name: Check TypeScript documentation coverage
+        working-directory: ./web-app
+        run: |
+          npm install -g typescript
+          COVERAGE=$(find src -name "*.ts" -o -name "*.tsx" | xargs grep -l "@doc" | wc -l)
+          if [ "$COVERAGE" -lt 50 ]; then
+            echo "TypeScript documentation coverage is below threshold"
+            exit 1
+          fi
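The `check-docs-coverage` job in `documentation.yml` uses the line count of `go doc` output as a proxy for documentation coverage. A more direct measure is to parse the source and count how many exported declarations carry a doc comment. The sketch below does this for the exported functions and methods of one file, using only the standard `go/parser` and `go/ast` packages; `docCoverage` is a hypothetical helper, and types, constants, and variables are deliberately left out to keep it short.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// docCoverage parses one Go source file and returns how many exported
// top-level functions and methods it declares, and how many of those
// have a doc comment attached.
func docCoverage(src string) (documented, exported int, err error) {
	fset := token.NewFileSet()
	// ParseComments is required; without it FuncDecl.Doc is always nil.
	file, err := parser.ParseFile(fset, "example.go", src, parser.ParseComments)
	if err != nil {
		return 0, 0, err
	}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || !fn.Name.IsExported() {
			continue
		}
		exported++
		if fn.Doc != nil {
			documented++
		}
	}
	return documented, exported, nil
}

func main() {
	src := `package demo

// Add returns the sum of a and b.
func Add(a, b int) int { return a + b }

func Sub(a, b int) int { return a - b }

func helper() {}
`
	documented, exported, err := docCoverage(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%d of %d exported functions documented\n", documented, exported)
}
```

A ratio derived from these two counts gives a threshold that scales with the size of the code base, unlike a fixed line count.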
--- a/.github/workflows/mlc_config.json
+++ b/.github/workflows/mlc_config.json
@@ -0,0 +1,29 @@
+{
+  "replacementPatterns": [
+    {
+      "pattern": "^/",
+      "replacement": "{{BASEURL}}/"
+    }
+  ],
+  "ignorePatterns": [
+    {
+      "pattern": "^http://localhost"
+    },
+    {
+      "pattern": "^#"
+    }
+  ],
+  "timeout": "20s",
+  "retryOn429": true,
+  "retryCount": 5,
+  "fallbackRetryDelay": "30s",
+  "aliveStatusCodes": [200, 206],
+  "httpHeaders": [
+    {
+      "urls": ["https://github.com/"],
+      "headers": {
+        "Accept": "application/vnd.github.v3+json"
+      }
+    }
+  ]
+}
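The effect of `mlc_config.json` on a single link can be sketched in Go: links matching an ignore pattern are skipped, root-relative links get the base URL prepended, and only the listed status codes count as alive. This illustrates the configuration values above rather than the link checker's real implementation. The helper names, the example base URL, and the choice to apply ignore patterns before the replacement are assumptions.

```go
package main

import (
	"fmt"
	"regexp"
)

// ignorePatterns mirrors the "ignorePatterns" entries in mlc_config.json.
var ignorePatterns = []*regexp.Regexp{
	regexp.MustCompile(`^http://localhost`),
	regexp.MustCompile(`^#`),
}

// rootRelative mirrors the single "replacementPatterns" entry ("^/").
var rootRelative = regexp.MustCompile(`^/`)

// aliveStatusCodes mirrors "aliveStatusCodes" in mlc_config.json.
var aliveStatusCodes = map[int]bool{200: true, 206: true}

// resolve returns the URL that would be checked, or false if the link is
// ignored. baseURL stands in for {{BASEURL}}; its value is hypothetical.
func resolve(link, baseURL string) (string, bool) {
	for _, p := range ignorePatterns {
		if p.MatchString(link) {
			return "", false
		}
	}
	if rootRelative.MatchString(link) {
		// "^/" is replaced by "{{BASEURL}}/", i.e. the base URL is prepended.
		return baseURL + link, true
	}
	return link, true
}

// isAlive reports whether an HTTP status code counts as a working link.
func isAlive(status int) bool {
	return aliveStatusCodes[status]
}

func main() {
	base := "https://example.test/repo" // hypothetical base URL
	for _, link := range []string{"#usage", "http://localhost:8080", "/docs/setup.md", "https://semver.org"} {
		if target, ok := resolve(link, base); ok {
			fmt.Printf("check  %s\n", target)
		} else {
			fmt.Printf("ignore %s\n", link)
		}
	}
	fmt.Println("206 alive:", isAlive(206), "301 alive:", isAlive(301))
}
```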
--- a/.markdownlint.json
+++ b/.markdownlint.json
@@ -0,0 +1,125 @@
+{
+  "default": true,
+  "MD001": true,
+  "MD002": {
+    "level": 1
+  },
+  "MD003": {
+    "style": "atx"
+  },
+  "MD004": {
+    "style": "dash"
+  },
+  "MD005": true,
+  "MD006": true,
+  "MD007": {
+    "indent": 2
+  },
+  "MD009": {
+    "br_spaces": 2,
+    "list_item_empty_lines": false,
+    "strict": false
+  },
+  "MD010": {
+    "code_blocks": false,
+    "spaces_per_tab": 2
+  },
+  "MD011": true,
+  "MD012": {
+    "maximum": 1
+  },
+  "MD013": {
+    "line_length": 120,
+    "code_blocks": false,
+    "tables": false,
+    "headings": false
+  },
+  "MD014": false,
+  "MD018": true,
+  "MD019": true,
+  "MD020": true,
+  "MD021": true,
+  "MD022": true,
+  "MD023": true,
+  "MD024": {
+    "allow_different_nesting": true
+  },
+  "MD025": {
+    "level": 1,
+    "front_matter_title": ""
+  },
+  "MD026": {
+    "punctuation": ".,;:!。，；：！"
+  },
+  "MD027": true,
+  "MD028": true,
+  "MD029": {
+    "style": "ordered"
+  },
+  "MD030": {
+    "ul_single": 1,
+    "ol_single": 1,
+    "ul_multi": 1,
+    "ol_multi": 1
+  },
+  "MD031": true,
+  "MD032": true,
+  "MD033": {
+    "allowed_elements": [
+      "br",
+      "details",
+      "summary",
+      "kbd",
+      "div",
+      "img",
+      "pre"
+    ]
+  },
+  "MD034": true,
+  "MD035": {
+    "style": "---"
+  },
+  "MD036": false,
+  "MD037": true,
+  "MD038": true,
+  "MD039": true,
+  "MD040": true,
+  "MD041": {
+    "level": 1,
+    "front_matter_title": ""
+  },
+  "MD042": true,
+  "MD043": false,
+  "MD044": {
+    "names": [
+      "JavaScript",
+      "TypeScript",
+      "React",
+      "Docker",
+      "Node.js",
+      "npm",
+      "Go",
+      "OpenAI",
+      "Ollama",
+      "paperless-gpt"
+    ],
+    "code_blocks": false
+  },
+  "MD045": true,
+  "MD046": {
+    "style": "fenced"
+  },
+  "MD047": true,
+  "MD048": {
+    "style": "backtick"
+  },
+  "MD049": {
+    "style": "underscore"
+  },
+  "MD050": {
+    "style": "asterisk"
+  },
+  "MD051": true,
+  "MD052": true,
+  "MD053": true
+}
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -0,0 +1,34 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [Unreleased]
+### Added
+- Enhanced project documentation and organization
+- Project governance guidelines
+- Security policy and guidelines
+- Architecture documentation
+
+## [1.0.0] - Initial Release
+### Added
+- LLM-Enhanced OCR capabilities
+- Automatic title & tag generation
+- Automatic correspondent generation
+- Custom prompt templates
+- Docker deployment support
+- Web UI for document management
+- Support for multiple LLM providers (OpenAI, Ollama)
+- Configurable environment variables
+- Integration with paperless-ngx
+- Manual and automatic processing modes
+- Basic documentation and setup guides
+
+### Security
+- API token authentication
+- Environment-based configuration
+- Docker container isolation
+
+For earlier history, please see the git commit log.
--- a/GOVERNANCE.md
+++ b/GOVERNANCE.md
@@ -0,0 +1,226 @@
+# Project Governance
+
+This document outlines the governance model for the paperless-gpt project. It describes how decisions are made and how community members can participate in project development.
+
+## Project Roles
+
+### Users
+- People who use paperless-gpt
+- Can submit bug reports and feature requests
+- Can contribute to discussions
+- Can help other users
+
+### Contributors
+- Users who contribute to the project
+- Submit pull requests
+- Improve documentation
+- Help with testing
+- Participate in issue discussions
+
+### Maintainers
+- Review and merge pull requests
+- Manage issues and project boards
+- Guide technical direction
+- Ensure code quality
+- Help onboard new contributors
+- Responsibilities:
+  - Respond to issues and PRs
+  - Review code changes
+  - Maintain documentation
+  - Ensure tests pass
+  - Release new versions
+  - Uphold code of conduct
+
+### Project Lead
+- Final decision maker for project direction
+- Sets technical standards
+- Manages maintainer team
+- Oversees releases
+- Current lead: [@icereed](https://github.com/icereed)
+
+## Decision Making
+
+### Technical Decisions
+1. **Discussion Phase**
+   - Open an issue for discussion
+   - Gather community feedback
+   - Consider alternatives
+   - Document trade-offs
+
+2. **Implementation Phase**
+   - Create detailed proposal
+   - Submit pull request
+   - Address review feedback
+   - Update documentation
+
+3. **Review Process**
+   - At least one maintainer review
+   - Automated tests must pass
+   - Documentation must be updated
+   - Breaking changes require extra scrutiny
+
+### Project Direction
+1. **Long-term Planning**
+   - Quarterly roadmap updates
+   - Community feedback periods
+   - Clear communication of goals
+   - Published milestones
+
+2. **Feature Acceptance**
+   - Must align with project goals
+   - Consider maintenance burden
+   - Evaluate user benefit
+   - Check implementation feasibility
+
+### Release Process
+1. **Version Planning**
+   - Follow semantic versioning
+   - Document all changes
+   - Update dependencies
+   - Security review
+
+2. **Release Preparation**
+   - Create release branch
+   - Run test suite
+   - Update changelog
+   - Draft release notes
+
+3. **Release Publication**
+   - Tag version in repository
+   - Publish to registries
+   - Announce to community
+   - Monitor for issues
+
+## Communication
+
+### Channels
+- GitHub Issues: Bug reports, feature requests
+- GitHub Discussions: General discussion
+- Pull Requests: Code changes
+- Discord: Community chat
+- Email: Security issues
+
+### Guidelines
+- Be respectful and professional
+- Stay on topic
+- English is the working language
+- Document decisions and rationale
+- Keep security issues private
+
+## Contributing
+
+### Process
+1. **Getting Started**
+   - Read contribution guidelines
+   - Set up development environment
+   - Understand code structure
+   - Pick starter issues
+
+2. **Making Changes**
+   - Create feature branch
+   - Follow code style
+   - Write tests
+   - Update docs
+
+3. **Submitting Changes**
+   - Create pull request
+   - Fill out template
+   - Respond to reviews
+   - Keep changes focused
+
+### Standards
+- Follow code style guide
+- Include tests
+- Update documentation
+- Sign commits
+- One feature per PR
+
+## Code Review
+
+### Requirements
+- At least one maintainer approval
+- All tests passing
+- Documentation updated
+- Code style compliance
+- No security issues
+
+### Process
+1. **Automated Checks**
+   - Linting
+   - Tests
+   - Coverage
+   - Dependencies
+
+2. **Manual Review**
+   - Code quality
+   - Architecture
+   - Security
+   - Performance
+
+3. **Final Checks**
+   - Merge conflicts
+   - Documentation
+   - Breaking changes
+   - Version updates
+
+## Issue Management
+
+### Categories
+- Bug: Software defects
+- Feature: New functionality
+- Enhancement: Improvements
+- Documentation: Doc changes
+- Question: User queries
+
+### Priority Levels
+1. **Critical**
+   - Security issues
+   - Major bugs
+   - Blocking issues
+
+2. **High**
+   - Important features
+   - User experience issues
+   - Performance problems
+
+3. **Normal**
+   - Regular enhancements
+   - Minor bugs
+   - Documentation updates
+
+4. **Low**
+   - Nice-to-have features
+   - Style improvements
+   - Non-critical fixes
+
+## Project Changes
+
+### Governance Changes
+- Open for community discussion
+- Two-week comment period
+- Maintainer consensus required
+- Project lead approval needed
+
+### Role Changes
+- Based on consistent contributions
+- Maintainer nomination
+- Community feedback
+- Project lead approval
+
+## Success Metrics
+
+### Project Health
+- Issue resolution time
+- PR merge time
+- Test coverage
+- Documentation quality
+- Community engagement
+
+### Code Quality
+- Automated metrics
+- Review thoroughness
+- Test coverage
+- Documentation completeness
+- Security standards
+
+This governance model is a living document and may be updated as the project evolves. Changes will be proposed and discussed with the community before implementation.
--- a/SECURITY.md
+++ b/SECURITY.md
@ -0,0 +1,125 @@
+# Security Policy
+
+## Reporting a Vulnerability
+
+At paperless-gpt, we take security seriously. If you discover a security vulnerability, please follow these steps:
+
+1. **DO NOT** disclose the vulnerability publicly.
+2. Send a detailed report to security@icereed.net including:
+   - A description of the vulnerability
+   - Steps to reproduce the issue
+   - Potential impact
+   - Any suggested fixes (if available)
+3. Allow up to 48 hours for an initial response.
+4. Please do not disclose the issue publicly until we've had a chance to address it.
+
+## Security Considerations
+
+### API Keys and Tokens
+- Never commit API keys, tokens, or sensitive credentials to the repository
+- Use environment variables for all sensitive configuration
+- Rotate API keys and tokens regularly
+- Use the minimum required permissions for API tokens
+
+### Data Privacy
+- All document processing is done locally or via your configured LLM provider
+- No document data is stored permanently outside your system
+- Temporary files are cleaned up after processing
+- Documents are transmitted securely using HTTPS
+
+### Docker Security
+- Containers run with minimal privileges
+- Images are regularly updated with security patches
+- Dependencies are scanned for vulnerabilities
+- Official base images are used
+
+### LLM Provider Security
+- API calls to LLM providers use encrypted connections
+- Rate limiting is implemented to prevent abuse
+- Input validation is performed on all user inputs
+- Error messages are sanitized to prevent information leakage
+
+### Access Control
+- Use strong passwords for all services
+- Implement the principle of least privilege
+- Audit access controls regularly
+- Monitor for unauthorized access attempts
+
+## Version Support
+
+We provide security updates for:
+- The current major version
+- The previous major version for 6 months after a new major release
+
+## Best Practices for Deployment
+
+1. **Network Security**
+   - Use HTTPS for all connections
+   - Implement proper firewall rules
+   - Use secure DNS configurations
+   - Regular security audits
+
+2. **System Updates**
+   - Keep all system packages updated
+   - Subscribe to security advisories
+   - Regular vulnerability scanning
+   - Automated update notifications
+
+3. **Monitoring**
+   - Monitor system logs for suspicious activity
+   - Track resource usage patterns
+   - Alert on anomalous behavior
+   - Regular security assessments
+
+4. **Backup and Recovery**
+   - Regular backups of critical data
+   - Secure backup storage
+   - Tested recovery procedures
+   - Documented incident response plan
+
+## Dependencies
+
+We regularly monitor dependencies for security vulnerabilities and keep them up to date:
+- Automated dependency updates via Renovate
+- Regular security audits of dependencies
+- Minimal use of third-party packages
+- Verification of package signatures
+
+## Contributing Security Fixes
+
+If you want to contribute security fixes:
+1. Follow the standard pull request process
+2. Mark security-related PRs as "security fix"
+3. Provide a detailed description of the security impact
+4. Include tests that verify the fix
+
+## Security Release Process
+
+When a security issue is identified:
+1. Issue is assessed and prioritized
+2. Fix is developed and tested
+3. Security advisory is prepared
+4. Fix is deployed and announced
+5. Users are notified through appropriate channels
+
+## Incident Response
+
+In case of a security incident:
+1. Issue is immediately assessed
+2. Affected systems are isolated
+3. Root cause is identified
+4. Fix is developed and tested
+5. Systems are restored
+6. Incident report is prepared
+7. Preventive measures are implemented
+
+## Contact
+
+For security-related matters, contact:
+- Email: security@icereed.net
+- Response time: Within 48 hours
+- Language: English
+
+## Acknowledgments
+
+We'd like to thank all security researchers who have helped improve the security of paperless-gpt. A list of acknowledged researchers can be found in our [Hall of Fame](CONTRIBUTORS.md#security-researchers).
--- a/cline_docs/productContext.md
+++ b/cline_docs/productContext.md
@ -0,0 +1,78 @@
+# Product Context
+
+## Project Purpose
+paperless-gpt is designed to enhance document management by integrating AI capabilities with paperless-ngx. Its primary purpose is to automate and improve the accuracy of document processing tasks that traditionally require manual intervention.
+
+## Problems Solved
+1. Manual Document Organization
+   - Eliminates tedious manual tagging and titling
+   - Reduces time spent on document categorization
+   - Minimizes human error in classification
+
+2. OCR Quality Issues
+   - Improves text extraction from poor quality scans
+   - Enhances accuracy through LLM-based OCR
+   - Provides context-aware text interpretation
+
+3. Document Processing Automation
+   - Automates correspondent identification
+   - Streamlines document categorization
+   - Enables bulk processing capabilities
+
+## Core Functionality
+1. AI-Powered Document Processing
+   - Title generation using LLMs
+   - Intelligent tag suggestions
+   - Automated correspondent detection
+   - Enhanced OCR capabilities
+
+2. Integration Features
+   - Seamless paperless-ngx integration
+   - Support for multiple LLM providers
+   - Docker-based deployment
+   - Customizable prompt templates
+
+3. User Experience
+   - Web-based interface
+   - Manual review capabilities
+   - Automatic processing options
+   - Flexible configuration options
+
+## Success Criteria
+1. Accuracy Metrics
+   - High-quality OCR results
+   - Accurate document classification
+   - Relevant tag suggestions
+   - Correct correspondent identification
+
+2. Performance Goals
+   - Fast processing times
+   - Reliable system operation
+   - Scalable document handling
+   - Efficient resource usage
+
+3. User Satisfaction
+   - Intuitive interface
+   - Clear feedback mechanisms
+   - Minimal manual intervention
+   - Consistent results
+
+## Future Vision
+1. Enhanced Capabilities
+   - Support for more AI providers
+   - Statistics and analytics features
+   - Advanced document analysis
+   - Improved processing algorithms
+   - Extended automation options
+
+2. Community Growth
+   - Active contributor base
+   - Regular feature additions
+   - Strong documentation
+   - Responsive maintenance
+
+3. Technical Evolution
+   - Improved architecture
+   - Enhanced performance
+   - Extended integrations
+   - Robust testing
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@ -0,0 +1,221 @@
+# paperless-gpt Architecture
+
+This document gives an overview of the paperless-gpt architecture and explains how its components interact to provide AI-powered document processing.
+
+## System Overview
+
+paperless-gpt is designed as a companion service to paperless-ngx, adding AI capabilities for document processing. The system consists of several key components:
+
+```mermaid
+graph TB
+    UI[Web UI] --> API[Backend API]
+    API --> LLM[LLM Service]
+    API --> OCR[OCR Service]
+    API --> DB[Local DB]
+    API --> PaperlessNGX[paperless-ngx API]
+    LLM --> OpenAI[OpenAI]
+    LLM --> Ollama[Ollama]
+    OCR --> VisionLLM[Vision LLM]
+```
+
+## Core Components
+
+### 1. Backend API (Go)
+- Handles all business logic
+- Manages document processing workflow
+- Coordinates between services
+- Provides REST API endpoints
+- Manages state and caching
+
+### 2. Web UI (React + TypeScript)
+- User interface for document management
+- Real-time processing status
+- Document preview and editing
+- Configuration interface
+- Responsive design
+
+### 3. LLM Service
+- Manages LLM provider connections
+- Handles prompt engineering
+- Processes document content
+- Generates metadata suggestions
+- Supports multiple providers:
+  - OpenAI (gpt-4, gpt-3.5-turbo)
+  - Ollama (llama2, etc.)
+
+### 4. OCR Service
+- Vision LLM integration
+- Image preprocessing
+- Text extraction
+- Layout analysis
+- Quality enhancement
+
+### 5. Local Database
+- Caches processing results
+- Stores configuration
+- Manages queues
+- Tracks document state
+
+## Data Flow
+
+### Document Processing Flow
+```mermaid
+sequenceDiagram
+    participant U as User
+    participant UI as Web UI
+    participant API as Backend API
+    participant LLM as LLM Service
+    participant OCR as OCR Service
+    participant PNX as paperless-ngx
+
+    U->>UI: Upload Document
+    UI->>API: Process Request
+    API->>OCR: Extract Text
+    OCR-->>API: Text Content
+    API->>LLM: Generate Metadata
+    LLM-->>API: Suggestions
+    API-->>UI: Preview Results
+    U->>UI: Approve Changes
+    UI->>API: Apply Changes
+    API->>PNX: Update Document
+    PNX-->>API: Confirmation
+    API-->>UI: Success
+```
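+
+The OCR and metadata steps of this flow can be sketched in Go as two small interfaces chained by one function. This is only an illustration: `Suggestion`, `TextExtractor`, `MetadataGenerator`, `suggestMetadata`, and the stub types are hypothetical names, not identifiers taken from the paperless-gpt code base.
+
+```go
+package main
+
+import (
+	"context"
+	"fmt"
+)
+
+// Suggestion is the metadata proposed for one document (hypothetical type).
+type Suggestion struct {
+	Title string
+	Tags  []string
+}
+
+// TextExtractor stands in for the OCR service (hypothetical interface).
+type TextExtractor interface {
+	ExtractText(ctx context.Context, image []byte) (string, error)
+}
+
+// MetadataGenerator stands in for the LLM service (hypothetical interface).
+type MetadataGenerator interface {
+	Suggest(ctx context.Context, text string) (Suggestion, error)
+}
+
+// suggestMetadata runs OCR first, then asks the LLM for suggestions.
+// It does not write to paperless-ngx; in the flow above that happens
+// only after the user approves the preview.
+func suggestMetadata(ctx context.Context, ocr TextExtractor, llm MetadataGenerator, image []byte) (Suggestion, error) {
+	text, err := ocr.ExtractText(ctx, image)
+	if err != nil {
+		return Suggestion{}, fmt.Errorf("extract text: %w", err)
+	}
+	s, err := llm.Suggest(ctx, text)
+	if err != nil {
+		return Suggestion{}, fmt.Errorf("generate metadata: %w", err)
+	}
+	return s, nil
+}
+
+// fakeOCR and fakeLLM are stubs so the sketch runs without external services.
+type fakeOCR struct{}
+
+func (fakeOCR) ExtractText(context.Context, []byte) (string, error) {
+	return "Invoice 2024-001 from ACME", nil
+}
+
+type fakeLLM struct{}
+
+func (fakeLLM) Suggest(_ context.Context, text string) (Suggestion, error) {
+	return Suggestion{Title: text, Tags: []string{"invoice"}}, nil
+}
+
+func main() {
+	s, err := suggestMetadata(context.Background(), fakeOCR{}, fakeLLM{}, []byte("scan"))
+	if err != nil {
+		fmt.Println("error:", err)
+		return
+	}
+	fmt.Printf("title=%q tags=%v\n", s.Title, s.Tags)
+}
+```
+
+Keeping each service behind an interface makes it possible to exercise the flow with stubs, as `main` does here, before wiring in a real provider.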
+
+## Key Design Decisions
+
+### 1. Modular Architecture
+- Separation of concerns
+- Pluggable components
+- Easy to extend
+- Maintainable code
+
+### 2. Stateless Design
+- Scalable architecture
+- No shared state
+- Resilient operation
+- Easy deployment
+
+### 3. Security First
+- API authentication
+- Data encryption
+- Input validation
+- Error handling
+
+### 4. Performance Optimization
+- Local caching
+- Batch processing
+- Async operations
+- Resource management
+
+## Directory Structure
+
+```
+paperless-gpt/
+├── main.go                 # Application entry point
+├── app_llm.go              # LLM service implementation
+├── app_http_handlers.go    # HTTP handlers
+├── paperless.go            # paperless-ngx integration
+├── ocr.go                  # OCR service
+├── types.go                # Type definitions
+├── web-app/                # Frontend application
+│   ├── src/
+│   │   ├── components/     # React components
+│   │   ├── App.tsx         # Main application
+│   │   └── ...
+│   └── ...
+└── ...
+```
+
+## Configuration Management
+
+The system is configured through environment variables, so settings can be changed per deployment without rebuilding:
+
+```
+PAPERLESS_BASE_URL        # paperless-ngx connection
+LLM_PROVIDER             # AI backend selection
+VISION_LLM_PROVIDER      # OCR provider selection
+...
+```
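+
+A minimal sketch of how these variables might be read at startup. The `Config` struct and `loadConfig` helper are hypothetical, and treating `PAPERLESS_BASE_URL` and `LLM_PROVIDER` as required while leaving `VISION_LLM_PROVIDER` optional is an assumption made for this example, not a statement about the actual implementation.
+
+```go
+package main
+
+import (
+	"fmt"
+	"os"
+	"strings"
+)
+
+// Config holds the three settings listed above (hypothetical struct).
+type Config struct {
+	PaperlessBaseURL  string
+	LLMProvider       string
+	VisionLLMProvider string // assumed optional in this sketch
+}
+
+// loadConfig reads the settings through lookup, which has the same
+// signature as os.LookupEnv, and reports every missing required
+// variable in a single error.
+func loadConfig(lookup func(string) (string, bool)) (Config, error) {
+	var missing []string
+	required := func(key string) string {
+		v, ok := lookup(key)
+		if !ok || v == "" {
+			missing = append(missing, key)
+		}
+		return v
+	}
+
+	cfg := Config{
+		PaperlessBaseURL: required("PAPERLESS_BASE_URL"),
+		LLMProvider:      required("LLM_PROVIDER"),
+	}
+	cfg.VisionLLMProvider, _ = lookup("VISION_LLM_PROVIDER")
+
+	if len(missing) > 0 {
+		return Config{}, fmt.Errorf("missing required environment variables: %s",
+			strings.Join(missing, ", "))
+	}
+	return cfg, nil
+}
+
+func main() {
+	cfg, err := loadConfig(os.LookupEnv)
+	if err != nil {
+		fmt.Fprintln(os.Stderr, err)
+		os.Exit(1)
+	}
+	fmt.Printf("LLM provider %q, paperless-ngx at %s\n", cfg.LLMProvider, cfg.PaperlessBaseURL)
+}
+```
+
+Passing the lookup function in, rather than calling `os.LookupEnv` directly inside `loadConfig`, keeps the helper testable without touching the process environment.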
+
+## Error Handling
+
+The system implements comprehensive error handling:
+
+1. **User Errors**
+   - Input validation
+   - Clear error messages
+   - Guided resolution
+
+2. **System Errors**
+   - Graceful degradation
+   - Automatic retry
+   - Error logging
+   - Monitoring alerts
+
+3. **External Service Errors**
+   - Fallback options
+   - Circuit breaking
+   - Rate limiting
+   - Error reporting
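+
+As one illustration of the "automatic retry" item above, a retry helper with exponential backoff might look like the following. The `retry` function, its parameters, and the doubling delay are assumptions chosen for this sketch; the document does not specify how retries are actually implemented.
+
+```go
+package main
+
+import (
+	"context"
+	"errors"
+	"fmt"
+	"time"
+)
+
+// retry calls fn until it succeeds or attempts calls have been made,
+// doubling the delay after each failure. It stops early if ctx is
+// cancelled while waiting. (Hypothetical helper.)
+func retry(ctx context.Context, attempts int, delay time.Duration, fn func() error) error {
+	if attempts < 1 {
+		attempts = 1
+	}
+	var err error
+	for i := 0; i < attempts; i++ {
+		if err = fn(); err == nil {
+			return nil
+		}
+		if i == attempts-1 {
+			break // no point in sleeping after the final attempt
+		}
+		select {
+		case <-ctx.Done():
+			return ctx.Err()
+		case <-time.After(delay):
+		}
+		delay *= 2
+	}
+	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
+}
+
+func main() {
+	calls := 0
+	err := retry(context.Background(), 3, 10*time.Millisecond, func() error {
+		calls++
+		if calls < 3 {
+			return errors.New("temporary failure")
+		}
+		return nil
+	})
+	fmt.Println(calls, err) // prints "3 <nil>"
+}
+```
+
+Wrapping the last error with `%w` lets callers still inspect the underlying cause with `errors.Is` or `errors.As` after the retries are exhausted.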
+
+## Scaling Considerations
+
+The architecture supports scaling through:
+
+1. **Horizontal Scaling**
+   - Stateless design
+   - Load balancing
+   - Distributed processing
+
+2. **Resource Management**
+   - Connection pooling
+   - Cache management
+   - Queue processing
+   - Rate limiting
+
+3. **Performance Optimization**
+   - Batch processing
+   - Async operations
+   - Efficient algorithms
+   - Resource caching
+
+## Future Considerations
+
+The architecture is designed to support future enhancements:
+
+1. **Plugin System**
+   - Custom processors
+   - Integration points
+   - Event hooks
+
+2. **Advanced Features**
+   - Multi-language support
+   - Custom ML models
+   - Advanced analytics
+
+3. **Integration Options**
+   - API extensions
+   - Service hooks
+   - Custom providers
+
+## Development Guidelines
+
+When making changes to the architecture:
+
+1. **Documentation**
+   - Update this document
+   - Add inline comments
+   - Update API docs
+
+2. **Testing**
+   - Unit tests
+   - Integration tests
+   - Performance tests
+
+3. **Review Process**
+   - Architecture review
+   - Security review
+   - Performance review
+
+This architecture documentation is maintained by the core team and updated as the system evolves.