# paperless-gpt

[License](LICENSE)
[Docker Hub](https://hub.docker.com/r/icereed/paperless-gpt)
[Code of Conduct](CODE_OF_CONDUCT.md)

**paperless-gpt** seamlessly pairs with [paperless-ngx][paperless-ngx] to generate **AI-powered document titles** and **tags**, saving you hours of manual sorting. While other tools may offer AI chat features, **paperless-gpt** stands out by **supercharging OCR with LLMs**, ensuring high accuracy even with tricky scans. If you’re craving next-level text extraction and effortless document organization, this is your solution.

[Demo](./demo.gif)

---

## Key Highlights

1. **LLM-Enhanced OCR**
   Harness Large Language Models (OpenAI or Ollama) for **better-than-traditional** OCR: turn messy or low-quality scans into context-aware, high-fidelity text.

2. **Automatic Title & Tag Generation**
   No more guesswork. Let the AI do the naming and categorizing. You can easily review suggestions and refine them if needed.

3. **Extensive Customization**
   - **Prompt Templates**: Tweak your AI prompts to reflect your domain, style, or preference.
   - **Tagging**: Decide how documents get tagged: manually, automatically, or via OCR-based flows.

4. **Simple Docker Deployment**
   A few environment variables, and you’re off! Compose it alongside paperless-ngx with minimal fuss.

5. **Unified Web UI**
   - **Manual Review**: Approve or tweak the AI’s suggestions.
   - **Auto Processing**: Focus only on edge cases while the rest is sorted for you.

6. **Opt-In LLM-based OCR**
   If you opt in, your images are read by a vision LLM, pushing boundaries beyond standard OCR tools.

---

## Table of Contents

- [Key Highlights](#key-highlights)
- [Getting Started](#getting-started)
  - [Prerequisites](#prerequisites)
  - [Installation](#installation)
    - [Docker Compose](#docker-compose)
    - [Manual Setup](#manual-setup)
- [Configuration](#configuration)
  - [Environment Variables](#environment-variables)
  - [Custom Prompt Templates](#custom-prompt-templates)
    - [Prompt Templates Directory](#prompt-templates-directory)
    - [Mounting the Prompts Directory](#mounting-the-prompts-directory)
    - [Editing the Prompt Templates](#editing-the-prompt-templates)
    - [Template Syntax and Variables](#template-syntax-and-variables)
- [Usage](#usage)
- [Contributing](#contributing)
- [License](#license)
- [Star History](#star-history)
- [Disclaimer](#disclaimer)

---

## Getting Started

### Prerequisites

- [Docker][docker-install] installed.
- A running instance of [paperless-ngx][paperless-ngx].
- Access to an LLM provider:
  - **OpenAI**: An API key with access to models like `gpt-4o` or `gpt-3.5-turbo`.
  - **Ollama**: A running Ollama server with models like `llama2` installed.
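
If you go the Ollama route, it can help to pull your models before starting paperless-gpt. A minimal sketch, assuming the standard Ollama CLI and its default port:

```bash
# Pull the models referenced later in this README (adjust names to taste)
ollama pull llama2
ollama pull minicpm-v

# Optional sanity check: list the models the Ollama server currently serves
curl http://localhost:11434/api/tags
```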

### Installation

#### Docker Compose

Here’s an example `docker-compose.yml` to spin up **paperless-gpt** alongside paperless-ngx:

```yaml
version: '3.7'
services:
  paperless-ngx:
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    # ... (your existing paperless-ngx config)

  paperless-gpt:
    image: icereed/paperless-gpt:latest
    environment:
      PAPERLESS_BASE_URL: 'http://paperless-ngx:8000'
      PAPERLESS_API_TOKEN: 'your_paperless_api_token'
      PAPERLESS_PUBLIC_URL: 'http://paperless.mydomain.com' # Optional
      MANUAL_TAG: 'paperless-gpt' # Optional, default: paperless-gpt
      AUTO_TAG: 'paperless-gpt-auto' # Optional, default: paperless-gpt-auto
      LLM_PROVIDER: 'openai' # or 'ollama'
      LLM_MODEL: 'gpt-4o' # or 'llama2'
      OPENAI_API_KEY: 'your_openai_api_key' # Required if using OpenAI
      LLM_LANGUAGE: 'English' # Optional, default: English
      OLLAMA_HOST: 'http://host.docker.internal:11434' # If using Ollama
      VISION_LLM_PROVIDER: 'ollama' # Optional (for OCR): openai or ollama
      VISION_LLM_MODEL: 'minicpm-v' # Optional (for OCR): e.g. minicpm-v for ollama, gpt-4o for openai
      AUTO_OCR_TAG: 'paperless-gpt-ocr-auto' # Optional, default: paperless-gpt-ocr-auto
      LOG_LEVEL: 'info' # Optional: debug, warn, error
    volumes:
      - ./prompts:/app/prompts # Mount the prompts directory
    ports:
      - '8080:8080'
    depends_on:
      - paperless-ngx
```
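
With the file in place, the usual Docker Compose workflow applies; for example (assuming Docker Compose v2):

```bash
# Start paperless-ngx and paperless-gpt in the background
docker compose up -d
```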

**Pro Tip**: Replace placeholders with real values and read the logs if something looks off.
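
For instance, assuming the service name from the compose file above, you can follow the logs with:

```bash
# Tail paperless-gpt's logs to spot configuration problems early
docker compose logs -f paperless-gpt
```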

#### Manual Setup

If you prefer to run the application manually:

1. **Clone the Repository**

   ```bash
   git clone https://github.com/icereed/paperless-gpt.git
   cd paperless-gpt
   ```

2. **Create a `prompts` Directory**

   ```bash
   mkdir prompts
   ```

3. **Build the Docker Image**

   ```bash
   docker build -t paperless-gpt .
   ```

4. **Run the Container**

   ```bash
   docker run -d \
     -e PAPERLESS_BASE_URL='http://your_paperless_ngx_url' \
     # ... (other environment variables from the table below) ...
     -e VISION_LLM_PROVIDER='ollama' \
     -e VISION_LLM_MODEL='minicpm-v' \
     -e LOG_LEVEL='info' \
     -v $(pwd)/prompts:/app/prompts \
     -p 8080:8080 \
     paperless-gpt
   ```
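
Once the container is up, the web UI should answer on the mapped port; a quick check (assuming the `-p 8080:8080` mapping above):

```bash
# Expect an HTTP response from the paperless-gpt web UI
curl -I http://localhost:8080
```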

---

## Configuration

### Environment Variables

| Variable               | Description                                                                                      | Required |
|------------------------|---------------------------------------------------------------------------------------------------|----------|
| `PAPERLESS_BASE_URL`   | URL of your paperless-ngx instance (e.g. `http://paperless-ngx:8000`).                            | Yes      |
| `PAPERLESS_API_TOKEN`  | API token for paperless-ngx. Generate one in the paperless-ngx admin interface.                   | Yes      |
| `PAPERLESS_PUBLIC_URL` | Public URL for Paperless (if different from `PAPERLESS_BASE_URL`).                                | No       |
| `MANUAL_TAG`           | Tag for manual processing. Default: `paperless-gpt`.                                              | No       |
| `AUTO_TAG`             | Tag for automatic processing. Default: `paperless-gpt-auto`.                                      | No       |
| `LLM_PROVIDER`         | AI backend (`openai` or `ollama`).                                                                | Yes      |
| `LLM_MODEL`            | AI model name, e.g. `gpt-4o`, `gpt-3.5-turbo`, `llama2`.                                          | Yes      |
| `OPENAI_API_KEY`       | OpenAI API key (required if using OpenAI).                                                        | Cond.    |
| `LLM_LANGUAGE`         | Likely language of your documents (e.g. `English`). Default: `English`.                           | No       |
| `OLLAMA_HOST`          | Ollama server URL (e.g. `http://host.docker.internal:11434`). Default: `http://127.0.0.1:11434`.  | No       |
| `VISION_LLM_PROVIDER`  | AI backend for OCR (`openai` or `ollama`).                                                        | No       |
| `VISION_LLM_MODEL`     | Model name for OCR (e.g. `minicpm-v`).                                                            | No       |
| `AUTO_OCR_TAG`         | Tag for automatically processing documents with OCR. Default: `paperless-gpt-ocr-auto`.           | No       |
| `LOG_LEVEL`            | Application log level (`info`, `debug`, `warn`, `error`). Default: `info`.                        | No       |
| `LISTEN_INTERFACE`     | Network interface to listen on. Default: `:8080`.                                                 | No       |
| `WEBUI_PATH`           | Path for static web UI content. Default: `./web-app/dist`.                                        | No       |
| `AUTO_GENERATE_TITLE`  | Generate titles automatically when `paperless-gpt-auto` is used. Default: `true`.                 | No       |
| `AUTO_GENERATE_TAGS`   | Generate tags automatically when `paperless-gpt-auto` is used. Default: `true`.                   | No       |

**Note:** When using Ollama, ensure the Ollama server is running and reachable from the paperless-gpt container.
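
As an illustration, an Ollama-only setup might use an environment block like the following (a sketch with placeholder values; adjust the host and model names to your installation):

```yaml
# Sketch of an Ollama-only configuration (illustrative values)
environment:
  PAPERLESS_BASE_URL: 'http://paperless-ngx:8000'
  PAPERLESS_API_TOKEN: 'your_paperless_api_token'
  LLM_PROVIDER: 'ollama'
  LLM_MODEL: 'llama2'
  OLLAMA_HOST: 'http://host.docker.internal:11434'
  VISION_LLM_PROVIDER: 'ollama' # optional, enables LLM-based OCR
  VISION_LLM_MODEL: 'minicpm-v'
```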

### Custom Prompt Templates

paperless-gpt’s flexible **prompt templates** let you shape how the AI responds. By default, the application uses built-in templates, but you can override them by editing the template files.

#### Prompt Templates Directory

The prompt templates live in the `prompts` directory inside the application. The two main template files are:

1. **`title_prompt.tmpl`**: Used for generating document titles.
2. **`tag_prompt.tmpl`**: Used for generating document tags.

#### Mounting the Prompts Directory

To modify the prompt templates, mount a local `prompts` directory into the container.

**Docker Compose example:**

```yaml
services:
  paperless-gpt:
    image: icereed/paperless-gpt:latest
    # ... (other configuration)
    volumes:
      - ./prompts:/app/prompts # Mount the prompts directory
```

**Docker run example:**

```bash
docker run -d \
  # ... (other configuration)
  -v $(pwd)/prompts:/app/prompts \
  paperless-gpt
```

#### Editing the Prompt Templates

1. **Start the container.** When the container first starts with the `prompts` directory mounted, it automatically creates the default template files in your local `prompts` directory if they do not already exist.

2. **Edit the template files.** Open `prompts/title_prompt.tmpl` and `prompts/tag_prompt.tmpl` in your favorite text editor, modify them using Go’s `text/template` syntax, and save your changes.

3. **Restart the container (if necessary).** The templates are loaded when the application starts, so if the container is already running, restart it to apply your changes.
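
With Docker Compose, that restart is a one-liner (assuming the service name used earlier):

```bash
# Reload the edited templates by restarting the container
docker compose restart paperless-gpt
```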

#### Template Syntax and Variables

The templates use Go’s `text/template` syntax and have access to the following variables:

- **For `title_prompt.tmpl`:**
  - `{{.Language}}`: The language specified in `LLM_LANGUAGE` (default: `English`).
  - `{{.Content}}`: The content of the document.

- **For `tag_prompt.tmpl`:**
  - `{{.Language}}`: The language specified in `LLM_LANGUAGE`.
  - `{{.AvailableTags}}`: A list (array) of available tags from paperless-ngx.
  - `{{.Title}}`: The suggested title for the document.
  - `{{.Content}}`: The content of the document.

**Example `title_prompt.tmpl`:**

```text
I will provide you with the content of a document that has been partially read by OCR (so it may contain errors).
Your task is to find a suitable document title that I can use as the title in the paperless-ngx program.
Respond only with the title, without any additional information. The content is likely in {{.Language}}.

Be sure to add one fitting emoji at the beginning of the title to make it more visually appealing.

Content:
{{.Content}}
```

**Example `tag_prompt.tmpl`:**

```text
I will provide you with the content and the title of a document. Your task is to select appropriate tags for the document from the list of available tags I will provide. Only select tags from the provided list. Respond only with the selected tags as a comma-separated list, without any additional information. The content is likely in {{.Language}}.

Available Tags:
{{.AvailableTags | join ","}}

Title:
{{.Title}}

Content:
{{.Content}}

Please concisely select the {{.Language}} tags from the list above that best describe the document.
Be very selective and only choose the most relevant tags since too many tags will make the document less discoverable.
```

**Note:** Advanced users can use additional functions from the [Sprig](http://masterminds.github.io/sprig/) template library, which is bundled with the application.
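
As a small illustration, a customized `title_prompt.tmpl` could use Sprig’s `trunc` to cap how much document text is sent to the LLM (a sketch; the 4000-character budget is an arbitrary choice):

```text
I will provide you with the content of a document that has been partially read by OCR.
Respond only with a short, descriptive title in {{.Language}}, without any additional information.

Content (truncated to 4000 characters):
{{.Content | trunc 4000}}
```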

## Usage

1. **Tag Documents**
   - Add the `paperless-gpt` tag (or your custom `MANUAL_TAG`) to the documents you want the AI to process.

2. **Visit the Web UI**
   - Open `http://localhost:8080` (or your host) in your browser.

3. **Generate & Apply Suggestions**
   - Click “Generate Suggestions” to see AI-proposed titles and tags.
   - Approve, edit, or discard them. Hit “Apply” to update the documents in paperless-ngx.

4. **Try LLM-Based OCR (Experimental)**
   - If you have set `VISION_LLM_PROVIDER` and `VISION_LLM_MODEL`, let AI-based OCR read your scanned PDFs.
   - Tag those documents with `paperless-gpt-ocr-auto` (or your custom `AUTO_OCR_TAG`).
   - Example configuration to enable OCR with Ollama:

     ```env
     VISION_LLM_PROVIDER=ollama
     VISION_LLM_MODEL=minicpm-v
     ```

**Tip**: The entire pipeline can be **fully automated** if you prefer minimal manual intervention: tag documents with `paperless-gpt-auto` and the generated suggestions are applied without review.
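
As an illustration of that automated flow, a document can be tagged for auto processing through the paperless-ngx REST API. The snippet below is hypothetical (document ID, tag ID, and host are placeholders, and note that a PATCH replaces the document’s whole tag list):

```bash
# Hypothetical example: attach the auto-processing tag (ID 42) to document 123 in paperless-ngx
curl -X PATCH "http://paperless-ngx:8000/api/documents/123/" \
  -H "Authorization: Token your_paperless_api_token" \
  -H "Content-Type: application/json" \
  -d '{"tags": [42]}'
```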

---

## Contributing

**Pull requests** and **issues** are welcome!

1. Fork the repo
2. Create a feature branch (`git checkout -b feature/my-awesome-update`)
3. Commit your changes (`git commit -am "Improve X"`)
4. Push the branch (`git push origin feature/my-awesome-update`)
5. Open a pull request

Check out our [contributing guidelines](CONTRIBUTING.md) for details.

---

## License

paperless-gpt is licensed under the [MIT License](LICENSE). Feel free to adapt and share!

---

## Star History

[Star History Chart](https://star-history.com/#icereed/paperless-gpt&Date)

---

## Disclaimer

This project is **not** officially affiliated with [paperless-ngx][paperless-ngx]. Use at your own risk.

---

**paperless-gpt**: The **LLM-based** companion your document management has been waiting for. Enjoy effortless, intelligent document titles, tags, and next-level OCR.

[paperless-ngx]: https://github.com/paperless-ngx/paperless-ngx
[docker-install]: https://docs.docker.com/get-docker/