Mirror of https://github.com/icereed/paperless-gpt.git, synced 2025-03-12 21:08:00 -05:00
Merge c6521ef312 into 1bd25a297c
Commit 8dc489e035
13 changed files with 1724 additions and 0 deletions
116
.github/ISSUE_TEMPLATE/bug_report.yml
vendored
Normal file
@@ -0,0 +1,116 @@
name: Bug Report
description: Create a report to help us improve
title: "[BUG] "
labels: ["bug", "triage"]
body:
  - type: markdown
    attributes:
      value: |
        Thanks for taking the time to fill out this bug report!
        Before submitting, please check if a similar issue already exists.

  - type: input
    id: version
    attributes:
      label: Version
      description: What version of paperless-gpt are you running?
      placeholder: "e.g., 1.0.0"
    validations:
      required: true

  - type: dropdown
    id: deployment
    attributes:
      label: Deployment Method
      description: How are you running paperless-gpt?
      options:
        - Docker (official image)
        - Docker Compose
        - Manual Installation
        - Other
    validations:
      required: true

  - type: input
    id: llm-provider
    attributes:
      label: LLM Provider
      description: Which LLM provider are you using?
      placeholder: "e.g., OpenAI, Ollama"
    validations:
      required: true

  - type: input
    id: llm-model
    attributes:
      label: LLM Model
      description: Which model are you using?
      placeholder: "e.g., gpt-4, llama2"
    validations:
      required: true

  - type: dropdown
    id: os
    attributes:
      label: Operating System
      description: What operating system are you using?
      options:
        - Linux
        - macOS
        - Windows
        - Other
    validations:
      required: true

  - type: textarea
    id: what-happened
    attributes:
      label: What happened?
      description: A clear and concise description of the bug.
      placeholder: "Tell us what you see!"
    validations:
      required: true

  - type: textarea
    id: expected
    attributes:
      label: Expected behavior
      description: What did you expect to happen?
      placeholder: "Tell us what you expected"
    validations:
      required: true

  - type: textarea
    id: reproduction
    attributes:
      label: Steps to reproduce
      description: How can we reproduce this issue?
      placeholder: |
        1. Go to '...'
        2. Click on '...'
        3. Scroll down to '...'
        4. See error
    validations:
      required: true

  - type: textarea
    id: logs
    attributes:
      label: Relevant log output
      description: Please copy and paste any relevant log output. This will be automatically formatted into code.
      render: shell

  - type: textarea
    id: config
    attributes:
      label: Configuration
      description: |
        Please provide your configuration (with sensitive information redacted).
        This could be your docker-compose.yml or environment variables.
      render: yaml

  - type: textarea
    id: additional
    attributes:
      label: Additional context
      description: Add any other context about the problem here
118
.github/ISSUE_TEMPLATE/feature_request.yml
vendored
Normal file
@@ -0,0 +1,118 @@
name: Feature Request
description: Suggest an idea for this project
title: "[FEATURE] "
labels: ["enhancement"]
body:
  - type: markdown
    attributes:
      value: |
        Thanks for taking the time to suggest a new feature!
        Please fill out this form as completely as possible.

  - type: textarea
    id: problem
    attributes:
      label: Problem Statement
      description: Is your feature request related to a problem? Please describe.
      placeholder: "I'm always frustrated when [...]"
    validations:
      required: true

  - type: textarea
    id: solution
    attributes:
      label: Proposed Solution
      description: Describe the solution you'd like to see
      placeholder: "It would be great if [...]"
    validations:
      required: true

  - type: textarea
    id: alternatives
    attributes:
      label: Alternatives Considered
      description: Describe any alternative solutions or features you've considered
      placeholder: "I've thought about [...]"

  - type: dropdown
    id: importance
    attributes:
      label: Importance Level
      description: How important is this feature to your use case?
      options:
        - Critical (Blocking my use of the project)
        - High (Would significantly improve my workflow)
        - Medium (Would be nice to have)
        - Low (Just an idea)
    validations:
      required: true

  - type: dropdown
    id: component
    attributes:
      label: Component
      description: Which part of paperless-gpt would this feature primarily affect?
      options:
        - OCR Processing
        - LLM Integration
        - Document Management
        - UI/UX
        - API
        - Configuration
        - Documentation
        - Performance
        - Security
        - Other
    validations:
      required: true

  - type: dropdown
    id: scope
    attributes:
      label: Implementation Scope
      description: How extensive would the changes be?
      options:
        - Minor (Simple change, few files)
        - Moderate (Multiple files, some complexity)
        - Major (Significant changes, new features)
        - Breaking (Requires breaking changes)
    validations:
      required: true

  - type: textarea
    id: context
    attributes:
      label: Additional Context
      description: Add any other context about the feature request here
      placeholder: "Include use cases, benefits, or screenshots"

  - type: textarea
    id: implementation
    attributes:
      label: Implementation Ideas
      description: If you have specific ideas about how to implement this feature, please share them
      placeholder: "We could implement this by..."

  - type: checkboxes
    id: terms
    attributes:
      label: Contribution
      description: Would you be interested in helping implement this feature?
      options:
        - label: I'm interested in contributing to this feature's implementation
          required: false
        - label: I have read the contribution guidelines
          required: true

  - type: textarea
    id: success_criteria
    attributes:
      label: Success Criteria
      description: What would make this feature implementation successful?
      placeholder: |
        Example criteria:
        - Feature works with both OpenAI and Ollama
        - Performance impact is minimal
        - No breaking changes to existing functionality
    validations:
      required: true
163
.github/config.yml
vendored
Normal file
@@ -0,0 +1,163 @@
# GitHub App Configuration

# Label Configuration
labels:
  # Type labels
  - name: bug
    color: d73a4a
    description: Something isn't working
  - name: enhancement
    color: a2eeef
    description: New feature or request
  - name: documentation
    color: 0075ca
    description: Documentation improvements
  - name: security
    color: ee0701
    description: Security-related issues

  # Priority labels
  - name: critical
    color: b60205
    description: Needs immediate attention
  - name: high
    color: d93f0b
    description: High priority
  - name: medium
    color: fbca04
    description: Medium priority
  - name: low
    color: 0e8a16
    description: Low priority

  # Status labels
  - name: triage
    color: d4c5f9
    description: Needs triage
  - name: in-progress
    color: 9ee12f
    description: Work in progress
  - name: blocked
    color: b60205
    description: Blocked or needs clarification

  # Component labels
  - name: frontend
    color: 1d76db
    description: Frontend related
  - name: backend
    color: 0052cc
    description: Backend related
  - name: ocr
    color: "5319e7"  # quoted: unquoted 5319e7 reads as a float under the YAML 1.2 core schema
    description: OCR functionality
  - name: llm
    color: 006b75
    description: LLM integration

  # Size labels
  - name: size/xs
    color: d4c5f9
    description: Extra small change
  - name: size/s
    color: 84b6eb
    description: Small change
  - name: size/m
    color: fbca04
    description: Medium change
  - name: size/l
    color: d93f0b
    description: Large change
  - name: size/xl
    color: b60205
    description: Extra large change

# Stale issue configuration
stale:
  daysUntilStale: 60
  daysUntilClose: 7
  exemptLabels:
    - security
    - critical
    - pinned
  staleLabel: stale
  markComment: >
    This issue has been automatically marked as stale because it has not had
    recent activity. It will be closed if no further activity occurs. Thank you
    for your contributions.
  closeComment: >
    This issue has been automatically closed due to inactivity. Please feel free
    to reopen it if you still experience this problem.

# Welcome message for new contributors
newContributorWelcomeComment: >
  Thanks for making your first contribution to paperless-gpt! 🎉

  Please make sure you've read our [Contributing Guidelines](CONTRIBUTING.md)
  and [Code of Conduct](CODE_OF_CONDUCT.md).

  If you need any help, feel free to mention @icereed or ask in our Discord.

# PR size labeling
prSize:
  xs:
    lines: 10
  s:
    lines: 50
  m:
    lines: 250
  l:
    lines: 500
  xl:
    lines: 1000

# Code review settings
reviews:
  request_count: 1
  notify_on_changes: true
  auto_assign: true
  auto_merge: false

# Branch protection settings
branchProtection:
  main:
    required_status_checks:
      - "build"
      - "test"
      - "lint"
    enforce_admins: true
    required_pull_request_reviews:
      required_approving_review_count: 1
      dismiss_stale_reviews: true
      require_code_owner_reviews: true
    allow_force_pushes: false
    allow_deletions: false

# Issue template settings
issueTemplate:
  checkNew: true
  useConfigure: true
  configureMessage: >
    Please use our issue templates to report bugs or request features.
    This helps us track and resolve issues more effectively.

# Pull request template settings
pullRequestTemplate:
  checkNew: true
  useConfigure: true
  configureMessage: >
    Please make sure your PR follows our guidelines and includes all necessary information.
    Don't forget to link any related issues.

# Repository settings
repository:
  private: false
  has_issues: true
  has_projects: true
  has_wiki: true
  has_downloads: true
  default_branch: main
  allow_squash_merge: true
  allow_merge_commit: false
  allow_rebase_merge: true
  delete_branch_on_merge: true
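The `prSize` block in `.github/config.yml` lists one line threshold per size label but does not say how the thresholds are applied. A minimal Python sketch of one plausible reading follows. It assumes each value is an inclusive upper bound on changed lines and that anything above the largest threshold still gets `size/xl`; the function name and both assumptions are hypothetical, not taken from the commit.

```python
# Hypothetical helper: maps a changed-line count to one of the size/* labels
# defined in .github/config.yml. Treating each threshold as an inclusive upper
# bound is an assumption; the config itself does not specify this.
PR_SIZE_THRESHOLDS = [
    ("size/xs", 10),
    ("size/s", 50),
    ("size/m", 250),
    ("size/l", 500),
    ("size/xl", 1000),
]


def size_label(changed_lines: int) -> str:
    """Return the first label whose threshold is >= changed_lines."""
    if changed_lines < 0:
        raise ValueError("changed_lines must be non-negative")
    for label, max_lines in PR_SIZE_THRESHOLDS:
        if changed_lines <= max_lines:
            return label
    # Assumption: anything above the largest threshold is still size/xl.
    return PR_SIZE_THRESHOLDS[-1][0]
```

Under these assumptions, a change of 1724 added lines (the size of this commit) would be labeled `size/xl`.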
79
.github/pull_request_template.md
vendored
Normal file
@@ -0,0 +1,79 @@
# Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

## Type of change

Please delete options that are not relevant.

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation update
- [ ] This change requires a documentation update

## Checklist

Before submitting your PR, please review the following checklist:

### General
- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] Any dependent changes have been merged and published
- [ ] I have checked my code and corrected any misspellings

### Testing
- [ ] I have added tests that prove my fix is effective or that my feature works
- [ ] New and existing unit tests pass locally with my changes
- [ ] I have tested this code in a development environment
- [ ] I have tested edge cases and error conditions

### Security
- [ ] My code follows the project's security guidelines
- [ ] I have conducted a security impact assessment of my changes
- [ ] I have verified no sensitive information is exposed

### Performance
- [ ] I have verified my changes don't introduce performance regressions
- [ ] I have optimized any resource-intensive operations
- [ ] I have considered the impact on system resources

### Documentation
- [ ] I have updated the README.md (if applicable)
- [ ] I have updated the API documentation (if applicable)
- [ ] I have updated architecture docs (if applicable)
- [ ] I have added JSDoc/comments for all new code

### Dependencies
- [ ] I have updated the dependency list (if applicable)
- [ ] I have checked for and resolved any dependency conflicts
- [ ] I have verified compatibility with existing dependencies

### Compatibility
- [ ] My changes are backward compatible
- [ ] I have tested with different LLM providers
- [ ] I have tested with different configurations
- [ ] I have verified Docker compatibility

### Code Quality
- [ ] My code follows the project's style guidelines
- [ ] I have run linting tools and fixed any issues
- [ ] I have maintained or improved code coverage
- [ ] I have followed SOLID principles

## Screenshots/Videos

If applicable, add screenshots or videos to help explain your changes.

## Additional Notes

Add any other context about the PR here.

## Linked Issues

- Resolves #(issue number)
- Related to #(issue number)
217
.github/workflows/code-quality.yml
vendored
Normal file
@@ -0,0 +1,217 @@
name: Code Quality

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

permissions:
  contents: read
  pull-requests: write

jobs:
  lint:
    name: Lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Install golangci-lint
        run: |
          curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.55.2

      - name: Go Lint
        uses: golangci/golangci-lint-action@v4
        with:
          version: latest
          args: --timeout=5m

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: './web-app/package-lock.json'

      - name: Install frontend dependencies
        run: npm ci
        working-directory: ./web-app

      - name: Frontend Lint
        run: npm run lint
        working-directory: ./web-app

  type-check:
    name: Type Check
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Go Type Check
        run: go vet ./...

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: './web-app/package-lock.json'

      - name: Install frontend dependencies
        run: npm ci
        working-directory: ./web-app

      - name: TypeScript Check
        run: npm run type-check
        working-directory: ./web-app

  security:
    name: Security Scan
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Run Gosec Security Scanner
        uses: securego/gosec@master
        with:
          args: './...'

      - name: Run npm audit
        run: npm audit
        working-directory: ./web-app

      - name: Run Snyk to check for vulnerabilities
        uses: snyk/actions/node@master
        env:
          SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
        with:
          args: --severity-threshold=high --all-projects

  coverage:
    name: Code Coverage
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Install mupdf
        run: sudo apt-get install -y mupdf

      - name: Set library path
        run: echo "/usr/lib" | sudo tee -a /etc/ld.so.conf.d/mupdf.conf && sudo ldconfig

      - name: Run Go Coverage
        run: |
          go test -race -coverprofile=coverage.txt -covermode=atomic ./...
          go tool cover -func=coverage.txt

      - name: Upload Go coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          file: ./coverage.txt
          flags: backend
          fail_ci_if_error: true

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: './web-app/package-lock.json'

      - name: Install frontend dependencies
        run: npm ci
        working-directory: ./web-app

      - name: Run Frontend Coverage
        run: npm run test:coverage
        working-directory: ./web-app

      - name: Upload Frontend coverage to Codecov
        uses: codecov/codecov-action@v4
        with:
          file: ./web-app/coverage/coverage-final.json
          flags: frontend
          fail_ci_if_error: true

  format:
    name: Code Formatting
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Check Go Formatting
        run: |
          if [ -n "$(gofmt -l .)" ]; then
            echo "Go files need formatting:"
            gofmt -d .
            exit 1
          fi

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: './web-app/package-lock.json'

      - name: Install frontend dependencies
        run: npm ci
        working-directory: ./web-app

      - name: Check Frontend Formatting
        run: npm run format:check
        working-directory: ./web-app

  complexity:
    name: Code Complexity
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Install gocyclo
        run: go install github.com/fzipp/gocyclo/cmd/gocyclo@latest

      - name: Check Go Code Complexity
        run: |
          gocyclo -over 15 .

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
          cache-dependency-path: './web-app/package-lock.json'

      - name: Install frontend dependencies
        run: npm ci
        working-directory: ./web-app

      - name: Check Frontend Complexity
        run: npx ts-complexity ./src --max-complexity 15
        working-directory: ./web-app
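The `Check Go Formatting` step in `code-quality.yml` fails whenever `gofmt -l .` prints anything, because that command lists the files whose formatting differs from gofmt's. The same decision can be sketched in Python, working on already captured output rather than invoking `gofmt`; the function names are hypothetical and only illustrate the pass/fail rule.

```python
# Illustrative sketch: mirrors the shell test `[ -n "$(gofmt -l .)" ]`.
# It works on captured output and does not run gofmt itself.
from typing import List


def unformatted_files(gofmt_list_output: str) -> List[str]:
    """Return the file names listed by `gofmt -l`, ignoring blank lines."""
    return [line.strip() for line in gofmt_list_output.splitlines() if line.strip()]


def format_check_exit_code(gofmt_list_output: str) -> int:
    """Return 1 if any file needs formatting, 0 otherwise (like the CI step)."""
    return 1 if unformatted_files(gofmt_list_output) else 0
```

An empty listing means every file is already formatted, so the step passes without printing a diff.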
193
.github/workflows/documentation.yml
vendored
Normal file
@@ -0,0 +1,193 @@
name: Documentation

on:
  push:
    branches: [ main ]
    paths:
      - '**/*.md'
      - 'docs/**'
      - '.github/workflows/documentation.yml'
  pull_request:
    branches: [ main ]
    paths:
      - '**/*.md'
      - 'docs/**'
      - '.github/workflows/documentation.yml'

permissions:
  contents: read
  pages: write
  id-token: write

jobs:
  markdown-lint:
    name: Markdown Lint
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install markdownlint
        run: npm install -g markdownlint-cli

      - name: Check Markdown files
        run: markdownlint '**/*.md' --ignore node_modules

      - name: Check for broken links
        uses: gaurav-nelson/github-action-markdown-link-check@v1
        with:
          use-quiet-mode: 'yes'
          use-verbose-mode: 'yes'
          config-file: '.github/workflows/mlc_config.json'
          folder-path: '.'
          max-depth: -1

  api-documentation:
    name: API Documentation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Install swag
        run: go install github.com/swaggo/swag/cmd/swag@latest

      - name: Generate Swagger Documentation
        run: swag init

      - name: Check if documentation changed
        run: |
          if [[ `git status --porcelain` ]]; then
            echo "API documentation is out of date. Please run 'swag init' locally and commit the changes."
            exit 1
          fi

  typescript-documentation:
    name: TypeScript Documentation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install TypeDoc
        run: npm install -g typedoc

      - name: Generate TypeScript Documentation
        working-directory: ./web-app
        run: typedoc --out docs/typescript src/

      - name: Check documentation style
        working-directory: ./web-app
        run: |
          if find src -name "*.tsx" -o -name "*.ts" | xargs grep -l "@todo\|FIXME"; then
            echo "Found TODO or FIXME comments in the code. Please resolve them before merging."
            exit 1
          fi

  spelling:
    name: Documentation Spelling
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Check Spelling
        uses: streetsidesoftware/cspell-action@v5
        with:
          files: |
            **/*.md
            docs/**/*

  validate-examples:
    name: Validate Code Examples
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm install markdown-code-block-runner

      - name: Validate code examples in documentation
        run: npx markdown-code-block-runner "**/*.md"

  build-wiki:
    name: Build Wiki
    needs: [markdown-lint, spelling]
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup mdBook
        uses: peaceiris/actions-mdbook@v1
        with:
          mdbook-version: 'latest'

      - name: Build documentation
        run: |
          mdbook build docs/

      - name: Setup Pages
        uses: actions/configure-pages@v4

      - name: Upload artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: 'docs/book'

      - name: Deploy to GitHub Pages
        id: deployment
        uses: actions/deploy-pages@v4

  check-docs-coverage:
    name: Documentation Coverage
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.22'

      - name: Install doc coverage tool
        run: go install github.com/client9/misspell/cmd/misspell@latest

      - name: Check public API documentation coverage
        run: |
          COVERAGE=$(go list ./... | xargs -n 1 go doc -all | wc -l)  # go doc takes one package at a time, so expand ./... with go list
          if [ "$COVERAGE" -lt 100 ]; then
            echo "Documentation coverage is below threshold"
            exit 1
          fi

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Check TypeScript documentation coverage
        working-directory: ./web-app
        run: |
          npm install -g typescript
          COVERAGE=$(find src -name "*.ts" -o -name "*.tsx" | xargs grep -l "@doc" | wc -l)
          if [ "$COVERAGE" -lt 50 ]; then
            echo "TypeScript documentation coverage is below threshold"
            exit 1
          fi
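The `Check documentation style` step in `documentation.yml` greps `.ts` and `.tsx` files for the literal strings `@todo` and `FIXME`. The match is case-sensitive, so a plain `// TODO` comment would not trip it even though the error message mentions TODO. A small Python sketch of the same filter, working on an in-memory mapping of file name to contents (the function name and the mapping are illustrative assumptions, not part of the workflow):

```python
# Illustrative sketch of the grep-based marker check. The markers and file
# suffixes come from the workflow; everything else is a hypothetical stand-in.
from typing import Dict, List, Tuple


def files_with_markers(
    sources: Dict[str, str],
    markers: Tuple[str, ...] = ("@todo", "FIXME"),
) -> List[str]:
    """Return sorted names of .ts/.tsx files containing any marker (case-sensitive)."""
    return sorted(
        name
        for name, text in sources.items()
        if name.endswith((".ts", ".tsx")) and any(marker in text for marker in markers)
    )
```

If the intent is to catch ordinary TODO comments as well, the marker list would need a `TODO` entry; the workflow as written does not include one.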
29
.github/workflows/mlc_config.json
vendored
Normal file
@@ -0,0 +1,29 @@
{
  "replacementPatterns": [
    {
      "pattern": "^/",
      "replacement": "{{BASEURL}}/"
    }
  ],
  "ignorePatterns": [
    {
      "pattern": "^http://localhost"
    },
    {
      "pattern": "^#"
    }
  ],
  "timeout": "20s",
  "retryOn429": true,
  "retryCount": 5,
  "fallbackRetryDelay": "30s",
  "aliveStatusCodes": [200, 206],
  "httpHeaders": [
    {
      "urls": ["https://github.com/"],
      "headers": {
        "Accept": "application/vnd.github.v3+json"
      }
    }
  ]
}
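The `ignorePatterns` and `replacementPatterns` entries in `mlc_config.json` are regular expressions applied to each link before it is checked. The following Python sketch shows how these particular patterns would treat a few links. It is a simplified model, not the link checker's implementation: checking the ignore list before applying replacements, and the `prepare_link` name, are assumptions made for illustration.

```python
# Simplified model of the patterns in mlc_config.json. The order of the two
# stages (ignore first, then replace) is an assumption for this sketch.
import re
from typing import Optional

IGNORE_PATTERNS = [r"^http://localhost", r"^#"]
REPLACEMENT_PATTERNS = [(r"^/", "{{BASEURL}}/")]


def prepare_link(link: str, base_url: str) -> Optional[str]:
    """Return the URL to check, or None if the link is ignored."""
    if any(re.search(pattern, link) for pattern in IGNORE_PATTERNS):
        return None
    for pattern, replacement in REPLACEMENT_PATTERNS:
        expanded = replacement.replace("{{BASEURL}}", base_url)
        # A callable replacement avoids backslash escapes being interpreted.
        link = re.sub(pattern, lambda _match, text=expanded: text, link)
    return link
```

With these patterns, in-page anchors and local development URLs are skipped, and root-relative links are resolved against the project base before being checked.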
125
.markdownlint.json
Normal file
@@ -0,0 +1,125 @@
{
  "default": true,
  "MD001": true,
  "MD002": {
    "level": 1
  },
  "MD003": {
    "style": "atx"
  },
  "MD004": {
    "style": "dash"
  },
  "MD005": true,
  "MD006": true,
  "MD007": {
    "indent": 2
  },
  "MD009": {
    "br_spaces": 2,
    "list_item_empty_lines": false,
    "strict": false
  },
  "MD010": {
    "code_blocks": false,
    "spaces_per_tab": 2
  },
  "MD011": true,
  "MD012": {
    "maximum": 1
  },
  "MD013": {
    "line_length": 120,
    "code_blocks": false,
    "tables": false,
    "headings": false
  },
  "MD014": false,
  "MD018": true,
  "MD019": true,
  "MD020": true,
  "MD021": true,
  "MD022": true,
  "MD023": true,
  "MD024": {
    "allow_different_nesting": true
  },
  "MD025": {
    "level": 1,
    "front_matter_title": ""
  },
  "MD026": {
    "punctuation": ".,;:!。,;:!"
  },
  "MD027": true,
  "MD028": true,
  "MD029": {
    "style": "ordered"
  },
  "MD030": {
    "ul_single": 1,
    "ol_single": 1,
    "ul_multi": 1,
    "ol_multi": 1
  },
  "MD031": true,
  "MD032": true,
  "MD033": {
    "allowed_elements": [
      "br",
      "details",
      "summary",
      "kbd",
      "div",
      "img",
      "pre"
    ]
  },
  "MD034": true,
  "MD035": {
    "style": "---"
  },
  "MD036": false,
  "MD037": true,
  "MD038": true,
  "MD039": true,
  "MD040": true,
  "MD041": {
    "level": 1,
    "front_matter_title": ""
  },
  "MD042": true,
  "MD043": false,
  "MD044": {
    "names": [
      "JavaScript",
      "TypeScript",
      "React",
      "Docker",
      "Node.js",
      "npm",
      "Go",
      "OpenAI",
      "Ollama",
      "paperless-gpt"
    ],
    "code_blocks": false
  },
  "MD045": true,
  "MD046": {
    "style": "fenced"
  },
  "MD047": true,
  "MD048": {
    "style": "backtick"
  },
  "MD049": {
    "style": "underscore"
  },
  "MD050": {
    "style": "asterisk"
  },
  "MD051": true,
  "MD052": true,
  "MD053": true
}
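The `MD013` entry in `.markdownlint.json` caps lines at 120 characters while exempting code blocks, tables, and headings. A rough Python approximation of that rule is sketched below. It is deliberately simpler than markdownlint itself: it only recognizes backtick fences, ATX headings, and lines starting with a pipe, and it does not reproduce the linter's leniency for long unbroken tokens. The function name is hypothetical.

```python
# Rough approximation of MD013 as configured (line_length 120; code blocks,
# tables and headings exempt). Not markdownlint's actual algorithm.
from typing import List

FENCE = "`" * 3  # backtick fence marker, built this way to keep the sketch readable


def long_lines(markdown: str, limit: int = 120) -> List[int]:
    """Return 1-based numbers of prose lines longer than `limit` characters."""
    offenders = []
    in_fence = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.lstrip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a fenced block
            continue
        if in_fence or stripped.startswith("#") or stripped.startswith("|"):
            continue  # exempt: code block content, headings, table rows
        if len(line) > limit:
            offenders.append(number)
    return offenders
```

In this model, a 150-character paragraph line is reported, while an equally long line inside a fenced block or a heading is not.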
34
CHANGELOG.md
Normal file
@@ -0,0 +1,34 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
### Added
- Enhanced project documentation and organization
- Project governance guidelines
- Security policy and guidelines
- Architecture documentation

## [1.0.0] - Initial Release
### Added
- LLM-Enhanced OCR capabilities
- Automatic title & tag generation
- Automatic correspondent generation
- Custom prompt templates
- Docker deployment support
- Web UI for document management
- Support for multiple LLM providers (OpenAI, Ollama)
- Configurable environment variables
- Integration with paperless-ngx
- Manual and automatic processing modes
- Basic documentation and setup guides

### Security
- API token authentication
- Environment-based configuration
- Docker container isolation

For earlier history, please see the git commit log.
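The changelog states that the project adheres to Semantic Versioning. As a reminder of what that implies for the next release number, here is a minimal Python sketch of the MAJOR.MINOR.PATCH bump rule. The change-kind names (`breaking`, `feature`, `fix`) are hypothetical labels chosen for this example, and pre-release or build suffixes are not handled.

```python
# Minimal sketch of the Semantic Versioning bump rule for plain X.Y.Z versions.
# The change-kind strings are illustrative labels, not project conventions.
def bump(version: str, change: str) -> str:
    """Return the next version for a 'breaking', 'feature', or 'fix' change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        return f"{major + 1}.0.0"
    if change == "feature":
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change kind: {change!r}")
```

For instance, starting from the 1.0.0 release listed above, a backward-compatible feature would lead to 1.1.0 and a breaking change to 2.0.0.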
226
GOVERNANCE.md
Normal file
@@ -0,0 +1,226 @@
# Project Governance

This document outlines the governance model for the paperless-gpt project. It describes how decisions are made and how community members can participate in project development.

## Project Roles

### Users
- People who use paperless-gpt
- Can submit bug reports and feature requests
- Can contribute to discussions
- Can help other users

### Contributors
- Users who contribute to the project
- Submit pull requests
- Improve documentation
- Help with testing
- Participate in issue discussions

### Maintainers
- Review and merge pull requests
- Manage issues and project boards
- Guide technical direction
- Ensure code quality
- Help onboard new contributors
- Responsibilities:
  - Respond to issues and PRs
  - Review code changes
  - Maintain documentation
  - Ensure tests pass
  - Release new versions
  - Uphold code of conduct

### Project Lead
- Final decision maker for project direction
- Sets technical standards
- Manages maintainer team
- Oversees releases
- Current lead: [@icereed](https://github.com/icereed)

## Decision Making

### Technical Decisions
1. **Discussion Phase**
   - Open an issue for discussion
   - Gather community feedback
   - Consider alternatives
   - Document trade-offs

2. **Implementation Phase**
   - Create detailed proposal
   - Submit pull request
   - Address review feedback
   - Update documentation

3. **Review Process**
   - At least one maintainer review
   - Automated tests must pass
   - Documentation must be updated
   - Breaking changes require extra scrutiny

### Project Direction
1. **Long-term Planning**
   - Quarterly roadmap updates
   - Community feedback periods
   - Clear communication of goals
   - Published milestones

2. **Feature Acceptance**
   - Must align with project goals
   - Consider maintenance burden
   - Evaluate user benefit
   - Check implementation feasibility

### Release Process
1. **Version Planning**
   - Follow semantic versioning
   - Document all changes
   - Update dependencies
   - Security review

2. **Release Preparation**
   - Create release branch
   - Run test suite
   - Update changelog
   - Draft release notes

3. **Release Publication**
   - Tag version in repository
   - Publish to registries
   - Announce to community
   - Monitor for issues

## Communication

### Channels
- GitHub Issues: Bug reports, feature requests
- GitHub Discussions: General discussion
- Pull Requests: Code changes
- Discord: Community chat
|
||||
- Email: Security issues
|
||||
|
||||
### Guidelines
|
||||
- Be respectful and professional
|
||||
- Stay on topic
|
||||
- English is the working language
|
||||
- Document decisions and rationale
|
||||
- Keep security issues private
|
||||
|
||||
## Contributing
|
||||
|
||||
### Process
|
||||
1. **Getting Started**
|
||||
- Read contribution guidelines
|
||||
- Set up development environment
|
||||
- Understand code structure
|
||||
- Pick starter issues
|
||||
|
||||
2. **Making Changes**
|
||||
- Create feature branch
|
||||
- Follow code style
|
||||
- Write tests
|
||||
- Update docs
|
||||
|
||||
3. **Submitting Changes**
|
||||
- Create pull request
|
||||
- Fill out template
|
||||
- Respond to reviews
|
||||
- Keep changes focused
|
||||
|
||||
### Standards
|
||||
- Follow code style guide
|
||||
- Include tests
|
||||
- Update documentation
|
||||
- Sign commits
|
||||
- One feature per PR
|
||||
|
||||
## Code Review
|
||||
|
||||
### Requirements
|
||||
- At least one maintainer approval
|
||||
- All tests passing
|
||||
- Documentation updated
|
||||
- Code style compliance
|
||||
- No security issues
|
||||
|
||||
### Process
|
||||
1. **Automated Checks**
|
||||
- Linting
|
||||
- Tests
|
||||
- Coverage
|
||||
- Dependencies
|
||||
|
||||
2. **Manual Review**
|
||||
- Code quality
|
||||
- Architecture
|
||||
- Security
|
||||
- Performance
|
||||
|
||||
3. **Final Checks**
|
||||
- Merge conflicts
|
||||
- Documentation
|
||||
- Breaking changes
|
||||
- Version updates
|
||||
|
||||
## Issue Management
|
||||
|
||||
### Categories
|
||||
- Bug: Software defects
|
||||
- Feature: New functionality
|
||||
- Enhancement: Improvements
|
||||
- Documentation: Doc changes
|
||||
- Question: User queries
|
||||
|
||||
### Priority Levels
|
||||
1. **Critical**
|
||||
- Security issues
|
||||
- Major bugs
|
||||
- Blocking issues
|
||||
|
||||
2. **High**
|
||||
- Important features
|
||||
- User experience issues
|
||||
- Performance problems
|
||||
|
||||
3. **Normal**
|
||||
- Regular enhancements
|
||||
- Minor bugs
|
||||
- Documentation updates
|
||||
|
||||
4. **Low**
|
||||
- Nice-to-have features
|
||||
- Style improvements
|
||||
- Non-critical fixes
|
||||
|
||||
## Project Changes
|
||||
|
||||
### Governance Changes
|
||||
- Open for community discussion
|
||||
- Two-week comment period
|
||||
- Maintainer consensus required
|
||||
- Project lead approval needed
|
||||
|
||||
### Role Changes
|
||||
- Based on consistent contributions
|
||||
- Maintainer nomination
|
||||
- Community feedback
|
||||
- Project lead approval
|
||||
|
||||
## Success Metrics
|
||||
|
||||
### Project Health
|
||||
- Issue resolution time
|
||||
- PR merge time
|
||||
- Test coverage
|
||||
- Documentation quality
|
||||
- Community engagement
|
||||
|
||||
### Code Quality
|
||||
- Automated metrics
|
||||
- Review thoroughness
|
||||
- Test coverage
|
||||
- Documentation completeness
|
||||
- Security standards
|
||||
|
||||
This governance model is a living document and may be updated as the project evolves. Changes will be proposed and discussed with the community before implementation.
|
125
SECURITY.md
Normal file
|
@ -0,0 +1,125 @@
|
|||
# Security Policy
|
||||
|
||||
## Reporting a Vulnerability
|
||||
|
||||
At paperless-gpt, we take security seriously. If you discover a security vulnerability, please follow these steps:
|
||||
|
||||
1. **DO NOT** disclose the vulnerability publicly.
|
||||
2. Send a detailed report to security@icereed.net including:
|
||||
- A description of the vulnerability
|
||||
- Steps to reproduce the issue
|
||||
- Potential impact
|
||||
- Any suggested fixes (if available)
|
||||
3. Allow up to 48 hours for an initial response.
|
||||
4. Please do not disclose the issue publicly until we've had a chance to address it.
|
||||
|
||||
## Security Considerations
|
||||
|
||||
### API Keys and Tokens
|
||||
- Never commit API keys, tokens, or sensitive credentials to the repository
|
||||
- Use environment variables for all sensitive configuration
|
||||
- Rotate API keys and tokens regularly
|
||||
- Use the minimum required permissions for API tokens
|
||||
|
||||
### Data Privacy
|
||||
- All document processing is done locally or via your configured LLM provider
|
||||
- No document data is stored permanently outside your system
|
||||
- Temporary files are cleaned up after processing
|
||||
- Documents are transmitted securely using HTTPS
|
||||
|
||||
### Docker Security
|
||||
- Containers run with minimal privileges
|
||||
- Images are regularly updated with security patches
|
||||
- Dependencies are scanned for vulnerabilities
|
||||
- Official base images are used
|
||||
|
||||
### LLM Provider Security
|
||||
- API calls to LLM providers use encrypted connections
|
||||
- Rate limiting is implemented to prevent abuse
|
||||
- Input validation is performed on all user inputs
|
||||
- Error messages are sanitized to prevent information leakage
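As one illustration of what sanitizing an error message can mean in practice, the sketch below strips known secrets from a message before it is logged or returned. The `redact` helper, the sample token, and the sample URL are hypothetical and are not taken from the paperless-gpt code base.

```go
package main

import (
	"fmt"
	"strings"
)

// redact replaces every occurrence of each non-empty secret in msg with a
// fixed marker. It is a hypothetical helper for illustration only.
func redact(msg string, secrets ...string) string {
	for _, s := range secrets {
		if s == "" {
			// An empty pattern would match between every character.
			continue
		}
		msg = strings.ReplaceAll(msg, s, "[REDACTED]")
	}
	return msg
}

func main() {
	token := "abc123" // placeholder value, not a real credential
	raw := "GET http://paperless.local/api/documents/?token=abc123 failed: 401"
	fmt.Println(redact(raw, token))
}
```

Exact-match replacement like this only covers secrets the process already knows about; it does not detect credentials that appear in other forms.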
|
||||
|
||||
### Access Control
|
||||
- Use strong passwords for all services
|
||||
- Implement the principle of least privilege
|
||||
- Regular security audits of access controls
|
||||
- Monitor for unauthorized access attempts
|
||||
|
||||
## Version Support
|
||||
|
||||
We provide security updates for:
|
||||
- The current major version
|
||||
- The previous major version for 6 months after a new major release
|
||||
|
||||
## Best Practices for Deployment
|
||||
|
||||
1. **Network Security**
|
||||
- Use HTTPS for all connections
|
||||
- Implement proper firewall rules
|
||||
- Use secure DNS configurations
|
||||
- Regular security audits
|
||||
|
||||
2. **System Updates**
|
||||
- Keep all system packages updated
|
||||
- Subscribe to security advisories
|
||||
- Regular vulnerability scanning
|
||||
- Automated update notifications
|
||||
|
||||
3. **Monitoring**
|
||||
- Monitor system logs for suspicious activity
|
||||
- Track resource usage patterns
|
||||
- Alert on anomalous behavior
|
||||
- Regular security assessments
|
||||
|
||||
4. **Backup and Recovery**
|
||||
- Regular backups of critical data
|
||||
- Secure backup storage
|
||||
- Tested recovery procedures
|
||||
- Documented incident response plan
|
||||
|
||||
## Dependencies
|
||||
|
||||
We regularly monitor and update dependencies for security vulnerabilities:
|
||||
- Automated dependency updates via Renovate
|
||||
- Regular security audits of dependencies
|
||||
- Minimal use of third-party packages
|
||||
- Verification of package signatures
|
||||
|
||||
## Contributing Security Fixes
|
||||
|
||||
If you want to contribute security fixes:
|
||||
1. Follow the standard pull request process
|
||||
2. Mark security-related PRs as "security fix"
|
||||
3. Provide a detailed description of the security impact
|
||||
4. Include tests that verify the fix
|
||||
|
||||
## Security Release Process
|
||||
|
||||
When a security issue is identified:
|
||||
1. Issue is assessed and prioritized
|
||||
2. Fix is developed and tested
|
||||
3. Security advisory is prepared
|
||||
4. Fix is deployed and announced
|
||||
5. Users are notified through appropriate channels
|
||||
|
||||
## Incident Response
|
||||
|
||||
In case of a security incident:
|
||||
1. Issue is immediately assessed
|
||||
2. Affected systems are isolated
|
||||
3. Root cause is identified
|
||||
4. Fix is developed and tested
|
||||
5. Systems are restored
|
||||
6. Incident report is prepared
|
||||
7. Preventive measures are implemented
|
||||
|
||||
## Contact
|
||||
|
||||
For security-related matters, contact:
|
||||
- Email: security@icereed.net
|
||||
- Response time: Within 48 hours
|
||||
- Language: English
|
||||
|
||||
## Acknowledgments
|
||||
|
||||
We'd like to thank all security researchers who have helped improve the security of paperless-gpt. A list of acknowledged researchers can be found in our [Hall of Fame](CONTRIBUTORS.md#security-researchers).
|
78
cline_docs/productContext.md
Normal file
|
@ -0,0 +1,78 @@
|
|||
# Product Context
|
||||
|
||||
## Project Purpose
|
||||
paperless-gpt is designed to enhance document management by integrating AI capabilities with paperless-ngx. Its primary purpose is to automate and improve the accuracy of document processing tasks that traditionally require manual intervention.
|
||||
|
||||
## Problems Solved
|
||||
1. Manual Document Organization
|
||||
- Eliminates tedious manual tagging and titling
|
||||
- Reduces time spent on document categorization
|
||||
- Minimizes human error in classification
|
||||
|
||||
2. OCR Quality Issues
|
||||
- Improves text extraction from poor-quality scans
|
||||
- Enhances accuracy through LLM-based OCR
|
||||
- Provides context-aware text interpretation
|
||||
|
||||
3. Document Processing Automation
|
||||
- Automates correspondent identification
|
||||
- Streamlines document categorization
|
||||
- Enables bulk processing capabilities
|
||||
|
||||
## Core Functionality
|
||||
1. AI-Powered Document Processing
|
||||
- Title generation using LLMs
|
||||
- Intelligent tag suggestions
|
||||
- Automated correspondent detection
|
||||
- Enhanced OCR capabilities
|
||||
|
||||
2. Integration Features
|
||||
- Seamless paperless-ngx integration
|
||||
- Support for multiple LLM providers
|
||||
- Docker-based deployment
|
||||
- Customizable prompt templates
|
||||
|
||||
3. User Experience
|
||||
- Web-based interface
|
||||
- Manual review capabilities
|
||||
- Automatic processing options
|
||||
- Flexible configuration options
|
||||
|
||||
## Success Criteria
|
||||
1. Accuracy Metrics
|
||||
- High-quality OCR results
|
||||
- Accurate document classification
|
||||
- Relevant tag suggestions
|
||||
- Correct correspondent identification
|
||||
|
||||
2. Performance Goals
|
||||
- Fast processing times
|
||||
- Reliable system operation
|
||||
- Scalable document handling
|
||||
- Efficient resource usage
|
||||
|
||||
3. User Satisfaction
|
||||
- Intuitive interface
|
||||
- Clear feedback mechanisms
|
||||
- Minimal manual intervention
|
||||
- Consistent results
|
||||
|
||||
## Future Vision
|
||||
1. Enhanced Capabilities
|
||||
- Support for more AI providers
|
||||
- Statistics and analytics features
|
||||
- Advanced document analysis
|
||||
- Improved processing algorithms
|
||||
- Extended automation options
|
||||
|
||||
2. Community Growth
|
||||
- Active contributor base
|
||||
- Regular feature additions
|
||||
- Strong documentation
|
||||
- Responsive maintenance
|
||||
|
||||
3. Technical Evolution
|
||||
- Improved architecture
|
||||
- Enhanced performance
|
||||
- Extended integrations
|
||||
- Robust testing
|
221
docs/ARCHITECTURE.md
Normal file
|
@ -0,0 +1,221 @@
|
|||
# paperless-gpt Architecture
|
||||
|
||||
This document provides a comprehensive overview of the paperless-gpt architecture, explaining how different components interact to provide AI-powered document processing capabilities.
|
||||
|
||||
## System Overview
|
||||
|
||||
paperless-gpt is designed as a companion service to paperless-ngx, adding AI capabilities for document processing. The system consists of several key components:
|
||||
|
||||
```mermaid
|
||||
graph TB
|
||||
UI[Web UI] --> API[Backend API]
|
||||
API --> LLM[LLM Service]
|
||||
API --> OCR[OCR Service]
|
||||
API --> DB[Local DB]
|
||||
API --> PaperlessNGX[paperless-ngx API]
|
||||
LLM --> OpenAI[OpenAI]
|
||||
LLM --> Ollama[Ollama]
|
||||
OCR --> VisionLLM[Vision LLM]
|
||||
```
|
||||
|
||||
## Core Components
|
||||
|
||||
### 1. Backend API (Go)
|
||||
- Handles all business logic
|
||||
- Manages document processing workflow
|
||||
- Coordinates between services
|
||||
- Provides REST API endpoints
|
||||
- Manages state and caching
|
||||
|
||||
### 2. Web UI (React + TypeScript)
|
||||
- User interface for document management
|
||||
- Real-time processing status
|
||||
- Document preview and editing
|
||||
- Configuration interface
|
||||
- Responsive design
|
||||
|
||||
### 3. LLM Service
|
||||
- Manages LLM provider connections
|
||||
- Handles prompt engineering
|
||||
- Processes document content
|
||||
- Generates metadata suggestions
|
||||
- Supports multiple providers:
|
||||
- OpenAI (gpt-4, gpt-3.5-turbo)
|
||||
- Ollama (llama2, etc.)
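Support for multiple providers is commonly expressed in Go as a small interface with one implementation per backend. The sketch below assumes that shape; the `Provider` interface, `newProvider`, `suggest`, and the `echoProvider` stand-in are hypothetical names for illustration and make no network calls.

```go
package main

import (
	"context"
	"fmt"
)

// Provider is a hypothetical abstraction over an LLM backend.
type Provider interface {
	Generate(ctx context.Context, prompt string) (string, error)
}

// echoProvider stands in for a real client so the sketch runs offline.
type echoProvider struct{ name string }

func (p echoProvider) Generate(_ context.Context, prompt string) (string, error) {
	return p.name + ": " + prompt, nil
}

// newProvider selects a backend by name. The accepted names mirror the
// providers listed above; a real implementation would return distinct
// client types instead of the same stand-in.
func newProvider(name string) (Provider, error) {
	switch name {
	case "openai", "ollama":
		return echoProvider{name: name}, nil
	default:
		return nil, fmt.Errorf("unsupported LLM provider %q", name)
	}
}

// suggest is a convenience wrapper: pick a provider, then run one prompt.
func suggest(providerName, prompt string) (string, error) {
	p, err := newProvider(providerName)
	if err != nil {
		return "", err
	}
	return p.Generate(context.Background(), prompt)
}

func main() {
	out, err := suggest("ollama", "Suggest a title")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Keeping the interface this narrow means callers depend only on `Generate`, so adding a backend would touch `newProvider` and nothing else.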
|
||||
|
||||
### 4. OCR Service
|
||||
- Vision LLM integration
|
||||
- Image preprocessing
|
||||
- Text extraction
|
||||
- Layout analysis
|
||||
- Quality enhancement
|
||||
|
||||
### 5. Local Database
|
||||
- Caches processing results
|
||||
- Stores configuration
|
||||
- Manages queues
|
||||
- Tracks document state
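The caching role can be illustrated with a get-or-compute pattern. The sketch below uses an in-memory map rather than a database, which is a deliberate simplification; `resultCache` and its methods are hypothetical names and say nothing about how paperless-gpt actually stores results.

```go
package main

import (
	"fmt"
	"sync"
)

// resultCache is a hypothetical in-memory cache of processing results,
// keyed by document ID.
type resultCache struct {
	mu      sync.Mutex
	results map[int]string
}

func newResultCache() *resultCache {
	return &resultCache{results: make(map[int]string)}
}

// getOrCompute returns the cached result for docID and calls compute only
// on a miss. The lock is held while compute runs, which keeps the sketch
// simple but serializes all callers.
func (c *resultCache) getOrCompute(docID int, compute func() (string, error)) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.results[docID]; ok {
		return v, nil
	}
	v, err := compute()
	if err != nil {
		return "", err // failures are not cached, so a later call can retry
	}
	c.results[docID] = v
	return v, nil
}

func main() {
	cache := newResultCache()
	calls := 0
	compute := func() (string, error) {
		calls++
		return "extracted text", nil
	}
	cache.getOrCompute(7, compute)
	cache.getOrCompute(7, compute)
	fmt.Println("compute calls:", calls)
}
```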
|
||||
|
||||
## Data Flow
|
||||
|
||||
### Document Processing Flow
|
||||
```mermaid
|
||||
sequenceDiagram
|
||||
participant U as User
|
||||
participant UI as Web UI
|
||||
participant API as Backend API
|
||||
participant LLM as LLM Service
|
||||
participant OCR as OCR Service
|
||||
participant PNX as paperless-ngx
|
||||
|
||||
U->>UI: Upload Document
|
||||
UI->>API: Process Request
|
||||
API->>OCR: Extract Text
|
||||
OCR-->>API: Text Content
|
||||
API->>LLM: Generate Metadata
|
||||
LLM-->>API: Suggestions
|
||||
API->>UI: Preview Results
|
||||
U->>UI: Approve Changes
|
||||
UI->>API: Apply Changes
|
||||
API->>PNX: Update Document
|
||||
PNX-->>API: Confirmation
|
||||
API-->>UI: Success
|
||||
```
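The sequence above can be sketched in Go as two steps: a preview that only reads, and an apply step that writes back after approval. Everything named here (`Pipeline`, `Suggestion`, `Preview`, `Apply`, `newStubPipeline`) is hypothetical and chosen for illustration; the stubs replace the OCR service, the LLM service, and paperless-ngx so the example runs without any of them.

```go
package main

import (
	"errors"
	"fmt"
)

// Suggestion is a hypothetical container for generated metadata.
type Suggestion struct {
	Title string
	Tags  []string
}

// Pipeline wires the collaborators from the diagram as plain functions so
// each one can be replaced by a stub.
type Pipeline struct {
	ExtractText      func(docID int) (string, error)       // OCR Service
	GenerateMetadata func(text string) (Suggestion, error) // LLM Service
	UpdateDocument   func(docID int, s Suggestion) error   // paperless-ngx
}

// Preview runs text extraction and metadata generation but changes nothing.
func (p Pipeline) Preview(docID int) (Suggestion, error) {
	text, err := p.ExtractText(docID)
	if err != nil {
		return Suggestion{}, fmt.Errorf("extract text: %w", err)
	}
	s, err := p.GenerateMetadata(text)
	if err != nil {
		return Suggestion{}, fmt.Errorf("generate metadata: %w", err)
	}
	return s, nil
}

// Apply writes a suggestion back only after the user has approved it.
func (p Pipeline) Apply(docID int, s Suggestion, approved bool) error {
	if !approved {
		return errors.New("suggestion not approved")
	}
	return p.UpdateDocument(docID, s)
}

// newStubPipeline returns a pipeline whose steps are in-process stubs.
// Updated document IDs are appended to *updated.
func newStubPipeline(updated *[]int) Pipeline {
	return Pipeline{
		ExtractText: func(docID int) (string, error) {
			return fmt.Sprintf("text of document %d", docID), nil
		},
		GenerateMetadata: func(text string) (Suggestion, error) {
			return Suggestion{Title: "Title for " + text, Tags: []string{"stub"}}, nil
		},
		UpdateDocument: func(docID int, _ Suggestion) error {
			*updated = append(*updated, docID)
			return nil
		},
	}
}

func main() {
	var updated []int
	p := newStubPipeline(&updated)
	s, err := p.Preview(42)
	if err != nil {
		fmt.Println("preview failed:", err)
		return
	}
	fmt.Println("preview:", s.Title)
	if err := p.Apply(42, s, true); err != nil {
		fmt.Println("apply failed:", err)
	}
}
```

Separating `Preview` from `Apply` mirrors the approval step in the diagram: nothing reaches paperless-ngx until the user confirms.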
|
||||
|
||||
## Key Design Decisions
|
||||
|
||||
### 1. Modular Architecture
|
||||
- Separation of concerns
|
||||
- Pluggable components
|
||||
- Easy to extend
|
||||
- Maintainable code
|
||||
|
||||
### 2. Stateless Design
|
||||
- Scalable architecture
|
||||
- No shared state
|
||||
- Resilient operation
|
||||
- Easy deployment
|
||||
|
||||
### 3. Security First
|
||||
- API authentication
|
||||
- Data encryption
|
||||
- Input validation
|
||||
- Error handling
|
||||
|
||||
### 4. Performance Optimization
|
||||
- Local caching
|
||||
- Batch processing
|
||||
- Async operations
|
||||
- Resource management
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
paperless-gpt/
|
||||
├── main.go # Application entry point
|
||||
├── app_llm.go # LLM service implementation
|
||||
├── app_http_handlers.go # HTTP handlers
|
||||
├── paperless.go # paperless-ngx integration
|
||||
├── ocr.go # OCR service
|
||||
├── types.go # Type definitions
|
||||
├── web-app/ # Frontend application
|
||||
│ ├── src/
|
||||
│ │ ├── components/ # React components
|
||||
│ │ ├── App.tsx # Main application
|
||||
│ │ └── ...
|
||||
│ └── ...
|
||||
└── ...
|
||||
```
|
||||
|
||||
## Configuration Management
|
||||
|
||||
The system uses environment variables for configuration, allowing easy deployment and configuration changes:
|
||||
|
||||
```
|
||||
PAPERLESS_BASE_URL # paperless-ngx connection
|
||||
LLM_PROVIDER # AI backend selection
|
||||
VISION_LLM_PROVIDER # OCR provider selection
|
||||
...
|
||||
```
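A minimal sketch of how such variables might be read at startup in Go. Only the three variable names come from the list above; the `Config` struct, the `loadConfig` function, and the choice to treat `PAPERLESS_BASE_URL` and `LLM_PROVIDER` as required are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config holds the three settings named above. The struct and its field
// names are hypothetical.
type Config struct {
	PaperlessBaseURL  string
	LLMProvider       string
	VisionLLMProvider string
}

// loadConfig reads settings through getenv so it can be exercised without
// touching the real environment. Which variables are required is an
// assumption of this sketch.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		PaperlessBaseURL:  getenv("PAPERLESS_BASE_URL"),
		LLMProvider:       getenv("LLM_PROVIDER"),
		VisionLLMProvider: getenv("VISION_LLM_PROVIDER"),
	}
	if cfg.PaperlessBaseURL == "" {
		return Config{}, errors.New("PAPERLESS_BASE_URL is not set")
	}
	if cfg.LLMProvider == "" {
		return Config{}, errors.New("LLM_PROVIDER is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "configuration error:", err)
		os.Exit(1)
	}
	fmt.Printf("using LLM provider %q against %s\n", cfg.LLMProvider, cfg.PaperlessBaseURL)
}
```

Failing fast on missing values at startup tends to surface misconfiguration earlier than discovering it on the first request.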
|
||||
|
||||
## Error Handling
|
||||
|
||||
The system implements comprehensive error handling:
|
||||
|
||||
1. **User Errors**
|
||||
- Input validation
|
||||
- Clear error messages
|
||||
- Guided resolution
|
||||
|
||||
2. **System Errors**
|
||||
- Graceful degradation
|
||||
- Automatic retry
|
||||
- Error logging
|
||||
- Monitoring alerts
|
||||
|
||||
3. **External Service Errors**
|
||||
- Fallback options
|
||||
- Circuit breaking
|
||||
- Rate limiting
|
||||
- Error reporting
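Automatic retry for external calls is often implemented with exponential backoff. The sketch below shows that technique in isolation; the `retry` function, its parameters, and `errTransient` are hypothetical and do not describe the project's actual retry policy. It does not implement circuit breaking or rate limiting.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a placeholder error used by the demo below.
var errTransient = errors.New("transient failure")

// retry calls op up to attempts times and doubles the delay after each
// failed attempt. It returns nil on the first success, or the last error
// wrapped with the attempt count.
func retry(attempts int, initialDelay time.Duration, op func() error) error {
	if attempts < 1 {
		return errors.New("attempts must be at least 1")
	}
	var err error
	delay := initialDelay
	for i := 0; i < attempts; i++ {
		if err = op(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := retry(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient // simulate two failures before success
		}
		return nil
	})
	fmt.Println("calls:", calls, "error:", err)
}
```

A production version would usually cap the delay, add jitter, and retry only errors known to be transient.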
|
||||
|
||||
## Scaling Considerations
|
||||
|
||||
The architecture supports scaling through:
|
||||
|
||||
1. **Horizontal Scaling**
|
||||
- Stateless design
|
||||
- Load balancing
|
||||
- Distributed processing
|
||||
|
||||
2. **Resource Management**
|
||||
- Connection pooling
|
||||
- Cache management
|
||||
- Queue processing
|
||||
- Rate limiting
|
||||
|
||||
3. **Performance Optimization**
|
||||
- Batch processing
|
||||
- Async operations
|
||||
- Efficient algorithms
|
||||
- Resource caching
|
||||
|
||||
## Future Considerations
|
||||
|
||||
The architecture is designed to support future enhancements:
|
||||
|
||||
1. **Plugin System**
|
||||
- Custom processors
|
||||
- Integration points
|
||||
- Event hooks
|
||||
|
||||
2. **Advanced Features**
|
||||
- Multi-language support
|
||||
- Custom ML models
|
||||
- Advanced analytics
|
||||
|
||||
3. **Integration Options**
|
||||
- API extensions
|
||||
- Service hooks
|
||||
- Custom providers
|
||||
|
||||
## Development Guidelines
|
||||
|
||||
When making changes to the architecture:
|
||||
|
||||
1. **Documentation**
|
||||
- Update this document
|
||||
- Add inline comments
|
||||
- Update API docs
|
||||
|
||||
2. **Testing**
|
||||
- Unit tests
|
||||
- Integration tests
|
||||
- Performance tests
|
||||
|
||||
3. **Review Process**
|
||||
- Architecture review
|
||||
- Security review
|
||||
- Performance review
|
||||
|
||||
This architecture documentation is maintained by the core team and updated as the system evolves.
|