AI Watch Folders

AI watch folders use artificial intelligence to automatically determine the best folder location and file name for each incoming document. Unlike standard watch folders that dump everything into a single destination, AI watch folders analyze each file and sort it intelligently.

Key difference from standard watch folders: A standard watch folder imports into a single folder. An AI watch folder is assigned to a top-level file cabinet and uses AI to decide which folder within that cabinet each document belongs in and what the document should be named.

How It Works

You point an AI watch folder at a source directory on your server.
You assign it to a top-level file cabinet in DocMan (not a specific folder).
When a new file arrives, DocMan sends it to the configured AI connection (self-hosted or cloud).
The AI analyzes the document content — either by reading the image directly (vision mode) or by processing extracted OCR text.
Based on your folder mode and template, the AI determines the best folder location and suggests a file name.
The document is placed in the Review Queue. If auto-approve is enabled and the confidence meets your threshold, it's filed automatically.

The Review Queue — AI-suggested filings are reviewed here before final placement.

Setting Up AI Connections

Before creating AI watch folders, you need at least one AI connection. Navigate to Administration → AI Connections in the left sidebar.

AI Connections page — configure your self-hosted Ollama models or cloud AI providers here.

Each connection defines:

Name: A friendly label (e.g., "Gemma", "llama").
API URL: The endpoint (e.g., http://localhost:11434 for Ollama).
Model: The model name (e.g., gemma3:12b, llama3.2).
Vision: Whether the model supports image input (direct page analysis).
Re-process: Allow re-running AI on previously processed documents.
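
Before saving a connection, it can help to confirm that the API URL answers and that the model name is actually installed. The following is a minimal sketch, not DocMan's own check: it assumes an Ollama server, whose `GET /api/tags` endpoint lists installed models, and the helper names `list_models`, `model_names`, and `has_model` are hypothetical.

```python
import json
import urllib.request


def list_models(api_url: str, timeout: float = 5.0) -> list[str]:
    """Ask an Ollama server which models it has installed (GET /api/tags)."""
    url = f"{api_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        payload = json.load(resp)
    return model_names(payload)


def model_names(tags_payload: dict) -> list[str]:
    """Extract the model names from an /api/tags response body."""
    return [m["name"] for m in tags_payload.get("models", [])]


def has_model(names: list[str], wanted: str) -> bool:
    """True if `wanted` is installed; a bare name also matches its ':latest' tag."""
    candidates = {wanted} if ":" in wanted else {wanted, f"{wanted}:latest"}
    return any(name in candidates for name in names)
```

For example, `has_model(list_models("http://localhost:11434"), "gemma3:12b")` would report whether the model from the example connection is available on a local Ollama instance.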

Folder Naming Modes

AI watch folders support two folder modes, configured per cabinet in Cabinet Settings → AI Settings → Filing.

Mode 1: Existing Folders (Pre-Defined)

The AI receives a list of all existing folders in the cabinet and picks the best match. No new folders are created. This is ideal when you have a known, fixed set of categories — like a medical chart with standard sections.

Example: Medical chart cabinet

Patient: John Smith
  ├── Demographics
  ├── Insurance
  ├── Lab Results
  ├── Radiology
  ├── Progress Notes
  ├── Prescriptions
  ├── Referrals
  └── Correspondence

When a lab report is scanned, the AI reads the document, identifies it as a lab result, and files it into Lab Results. A referral letter goes into Referrals. The folders are never created or modified — the AI just picks the best match from what already exists.

Example: Accounting cabinet

Accounting
  ├── Invoices
  ├── Receipts
  ├── Bank Statements
  ├── Tax Documents
  ├── Payroll
  └── Contracts
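
One way to picture Existing Folders mode is as a lookup that only ever returns a folder already in the cabinet. The sketch below is illustrative and assumes the AI answers with a folder name as plain text; `resolve_existing_folder` is a hypothetical helper, not part of DocMan.

```python
from typing import Optional


def resolve_existing_folder(suggestion: str, folders: list[str]) -> Optional[str]:
    """Map an AI suggestion onto one of the cabinet's existing folders.

    Matching ignores case and surrounding whitespace. Returns None when the
    suggestion is not in the list, so the caller never creates a new folder.
    """
    by_key = {name.strip().casefold(): name for name in folders}
    return by_key.get(suggestion.strip().casefold())
```

A `None` result would be a natural signal to leave the document in the Review Queue rather than guess.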

Mode 2: Template (Auto Folder Creation)

You define a folder structure using a template with tokens. The AI extracts values from each document (company name, date, document type, etc.) and builds the folder path automatically. Folders are created as needed.

The folder template is configured in Cabinet Settings → AI Settings → Filing → Folder Template.

Folder Template Tokens

Tokens are placeholders wrapped in angle brackets that DocMan replaces with values, most of them extracted by the AI. Combine them with / to build folder paths.

AI-Extracted Field Tokens

Token	Description	Example Output
`<Company>`	Full company or vendor name extracted from the document (sanitized for the file system).	Acme Corp
`<CompanyFirstLetter>`	First letter of the company name, uppercase.	A
`<DocType>`	Document type (invoice, receipt, contract, letter, etc.).	Invoice
`<Amount>`	Dollar amount extracted from the document.	1,250.00
`<DocumentName>`	Original filename of the imported file.	scan_001.pdf

Date Tokens

Date tokens are smart — they use the document date extracted by the AI first. If no document date is found, they fall back to the import date.

Token	Description	Example Output
`<YYYY>` or `<Year>`	4-digit year.	2026
`<YY>`	2-digit year.	26
`<Month>`	Full month name.	April
`<MM>`	2-digit month number.	04
`<M>`	Month number, no leading zero.	4
`<DD>`	2-digit day of month.	20
`<D>`	Day of month, no leading zero.	20
`<Day>`	Day of week name.	Monday
`<Date>`	Full date in YYYY-MM-DD format.	2026-04-20
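
The fallback rule for date tokens can be sketched in a few lines. This is an illustration under assumptions rather than DocMan's implementation: `date_tokens` is a hypothetical function, and month and weekday names come from `strftime`, which returns English names in Python's default locale.

```python
from datetime import date
from typing import Optional


def date_tokens(doc_date: Optional[date], import_date: date) -> dict[str, str]:
    """Render the date tokens, preferring the document date over the import date."""
    d = doc_date or import_date  # fall back when the AI found no document date
    return {
        "YYYY": f"{d.year:04d}",
        "Year": f"{d.year:04d}",
        "YY": f"{d.year % 100:02d}",
        "Month": d.strftime("%B"),   # full month name
        "MM": f"{d.month:02d}",
        "M": str(d.month),           # no leading zero
        "DD": f"{d.day:02d}",
        "D": str(d.day),             # no leading zero
        "Day": d.strftime("%A"),     # day of week name
        "Date": d.isoformat(),       # YYYY-MM-DD
    }
```

Calling `date_tokens(None, date(2026, 4, 20))` uses the import date for every token, while passing a document date as the first argument makes that date win.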

Pre-Built Date Format Tokens

Token	Example Output
`<YYYY-MM-DD>`	2026-04-20
`<MM/DD/YYYY>`	04/20/2026
`<MM-DD-YYYY>`	04-20-2026
`<YYYYMMDD>`	20260420

Explicit Date Source Tokens

If you need to force a specific date source instead of the smart fallback:

Token	Description
`<DocDate>`, `<DocYear>`, `<DocMonth>`, `<DocDay>`	Date extracted from the document content only. No fallback.
`<CreatedDate>`, `<CreatedYear>`, `<CreatedMonth>`, `<CreatedDay>`	Import timestamp only. Always available.

Folder Structure Examples

Below are common folder templates and the structures they produce. Each example shows how DocMan would organize an "Invoices" cabinet receiving documents from multiple vendors.

`<CompanyFirstLetter>/<Company>/<YYYY>`

Alpha prefix / Company name / Year. Best for large volumes — the alphabetical prefix keeps the top level manageable when you have hundreds of vendors.

Invoices                          (Cabinet)
  ├── A                           (<CompanyFirstLetter>)
  │   ├── Acme Corp               (<Company>)
  │   │   ├── 2025                (<YYYY>)
  │   │   └── 2026
  │   └── Alberts Machinery
  │       └── 2026
  ├── B
  │   └── Big Signs Plus
  │       └── 2023
  ├── E
  │   └── East Repair Inc.
  │       └── 2019
  └── T
      └── Townplace Roadhouse
          └── 2026

A real DocMan cabinet using AI-generated folder structures. Note the alpha prefix "A" with "Alberts Machinery" nested underneath, and company/year folders throughout.

`<Company>/<YYYY>`

Company name / Year. Simpler layout that works well when you have a manageable number of entities (under ~50).

Invoices                          (Cabinet)
  ├── Acme Corp                   (<Company>)
  │   ├── 2025                    (<YYYY>)
  │   └── 2026
  ├── Big Signs Plus
  │   └── 2023
  ├── East Repair Inc.
  │   └── 2019
  └── Townplace Roadhouse
      └── 2026

`<YYYY>/<Company>`

Year / Company name. Year-first makes it easy to archive or close out an entire year. Good when time is the primary way you think about your documents.

Invoices                          (Cabinet)
  ├── 2019                        (<YYYY>)
  │   └── East Repair Inc.        (<Company>)
  ├── 2023
  │   └── Big Signs Plus
  ├── 2025
  │   └── Acme Corp
  └── 2026
      ├── Acme Corp
      ├── Alberts Machinery
      └── Townplace Roadhouse

`<Company>/<YYYY>/<MM>-<Month>`

Company / Year / Month. Most granular. Useful for high-volume operations where you need to locate documents by vendor and month — like accounts payable departments processing hundreds of invoices per month.

Invoices                          (Cabinet)
  ├── Acme Corp                   (<Company>)
  │   └── 2026                    (<YYYY>)
  │       ├── 01-January          (<MM>-<Month>)
  │       ├── 02-February
  │       ├── 03-March
  │       └── 04-April
  └── Big Signs Plus
      └── 2023
          ├── 09-September
          └── 11-November

Where to configure: Right-click a cabinet in the sidebar → Cabinet Settings → AI Settings → Filing section. Set the Folder Structure dropdown to "Use a template to build folder path" and enter your template in the Folder Template field.
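
To see how a folder template turns into a path, the substitution can be sketched as a single regular-expression pass. This is a minimal sketch with assumptions: `expand_template` and `sanitize` are hypothetical names, the set of characters treated as unsafe is a guess at what "sanitized for the file system" means, and unknown tokens are simply left in place.

```python
import re

# Characters replaced before a value is used as a folder name.
# Assumption: DocMan's real sanitization rules may differ.
_UNSAFE = re.compile(r'[<>:"/\\|?*]')
_TOKEN = re.compile(r"<([^<>]+)>")


def sanitize(value: str) -> str:
    """Replace unsafe characters so a value can serve as a single folder name."""
    return _UNSAFE.sub("_", value).strip()


def expand_template(template: str, values: dict[str, str]) -> str:
    """Replace each <Token> with its sanitized value; unknown tokens stay as-is."""
    fields = dict(values)
    company = fields.get("Company", "").strip()
    if company:
        # Derive the alpha prefix from the company name.
        fields.setdefault("CompanyFirstLetter", company[0].upper())

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return sanitize(fields[name]) if name in fields else match.group(0)

    return _TOKEN.sub(substitute, template)
```

With the values `{"Company": "Acme Corp", "YYYY": "2026"}`, the template `<CompanyFirstLetter>/<Company>/<YYYY>` expands to `A/Acme Corp/2026`.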

AI Processing Modes

Each AI connection can operate in one of two processing modes, configured in the AI connection's Vision setting.

Mode	How It Works	Best For
Vision (Image)	The document page is sent as an image directly to the AI model. The model reads the content visually, including layout, logos, letterheads, and handwriting.	Scanned documents, faxes, forms with layouts that matter, documents with logos or letterheads that help identify the source.
OCR Text Only	DocMan first extracts text via Tesseract OCR, then sends only the extracted text to the AI model. Faster and uses less GPU memory.	Clean, text-heavy documents like invoices, letters, and reports. Works well with smaller AI models or lower-powered hardware.

Recommendation: Start with OCR Text mode if you're running a smaller GPU or want faster processing. Switch to Vision mode for documents where visual layout matters (forms, mixed-format documents, or when OCR quality is poor). The gemma3:12b model supports both modes; llama3.2 supports OCR text only.
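
The difference between the two modes shows up in what is sent to the model. The sketch below builds a request body for Ollama's `POST /api/generate` endpoint, which accepts `model`, `prompt`, `stream`, `format`, and a list of base64-encoded `images`. The function name and the prompt layout are hypothetical; the request DocMan actually sends is not documented here.

```python
import base64
import json
from typing import Optional


def build_generate_request(model: str, prompt: str,
                           page_image: Optional[bytes] = None,
                           ocr_text: Optional[str] = None) -> bytes:
    """Build a JSON body for Ollama's POST /api/generate endpoint.

    Pass page_image for vision mode or ocr_text for OCR text mode.
    """
    if (page_image is None) == (ocr_text is None):
        raise ValueError("provide exactly one of page_image or ocr_text")
    body = {"model": model, "stream": False, "format": "json"}
    if page_image is not None:
        # Vision mode: the page travels as a base64-encoded image.
        body["prompt"] = prompt
        body["images"] = [base64.b64encode(page_image).decode("ascii")]
    else:
        # OCR text mode: only the extracted text is sent (hypothetical layout).
        body["prompt"] = f"{prompt}\n\nDocument text:\n{ocr_text}"
    return json.dumps(body).encode("utf-8")
```

The OCR text request carries no image data at all, which is consistent with it being lighter on GPU memory than vision mode.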

Self-Hosted AI with Ollama

DocMan integrates with Ollama for fully self-hosted, on-premise AI processing. No data leaves your network. Ollama runs AI models locally on your server's GPU (or CPU, though GPU is strongly recommended).

Recommended Models

Model	VRAM Required	Best For
gemma3:12b	~8 GB	Best overall accuracy. Supports both vision and OCR text modes. Recommended for dedicated servers with a modern GPU (RTX 3060 12GB or better).
llama3.2	~4 GB	Good accuracy with lower hardware requirements. OCR text mode only. A solid choice for workstations or servers with a lower-end GPU (GTX 1660, RTX 3050, etc.).

Installing Ollama on Windows

Download the installer
Go to https://ollama.com/download/windows and download the Windows installer.
Run the installer
Run OllamaSetup.exe and follow the prompts. Ollama installs as a background service and starts automatically.
Verify installation
Open a terminal (Command Prompt or PowerShell) and run:
```
ollama --version
```
You should see the installed version number.
Pull a model
Download the recommended model:
```
ollama pull gemma3:12b
```
Or for lower-powered hardware:
```
ollama pull llama3.2
```
The download may take several minutes depending on your internet connection.
Test the model
Verify the model is working:
```
ollama run gemma3:12b "Hello, are you working?"
```
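You should get a response, after which the command exits on its own. If you start an interactive session instead (ollama run gemma3:12b with no prompt), type /bye or press Ctrl+D to exit.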
Configure DocMan
In DocMan, go to Administration → AI Connections and click New Connection. Set:
- Name: Gemma (or any label)
- API URL: http://localhost:11434
- Model: gemma3:12b (or llama3.2)
- Vision: Enable for gemma3:12b, leave off for llama3.2

Installing Ollama on Linux Mint

Install Ollama
Open a terminal and run:
```
curl -fsSL https://ollama.com/install.sh | sh
```
This installs Ollama and registers it as a systemd service.
Verify installation
```
ollama --version
```

Check that the service is running
```
sudo systemctl status ollama
```
You should see active (running). If not:
```
sudo systemctl start ollama
sudo systemctl enable ollama
```
Install NVIDIA drivers (if using an NVIDIA GPU)
Linux Mint includes a Driver Manager. Open it from the application menu and install the recommended NVIDIA driver. Reboot after installation, then run:
```
nvidia-smi
```
Verify the GPU is detected and shows available memory.
Pull a model
```
ollama pull gemma3:12b
```
Or for lower-powered hardware:
```
ollama pull llama3.2
```
Test the model
```
ollama run gemma3:12b "Hello, are you working?"
```

Configure DocMan
In DocMan, go to Administration → AI Connections and create a new connection. Set the API URL to http://localhost:11434 (or use the server's IP address if DocMan runs on a different machine) and enter your model name.

Remote Ollama: If you run Ollama on a separate machine from DocMan, set the OLLAMA_HOST environment variable to 0.0.0.0 on the Ollama server so it accepts connections from other machines, then restart Ollama so the change takes effect. Use that machine's IP address in the API URL of DocMan's AI connection (e.g., http://192.168.1.50:11434).

Cloud AI Processing

If you prefer not to run AI locally, DocMan also supports cloud AI providers. Cloud processing can offer higher accuracy on complex documents and requires no local GPU hardware. To add a cloud provider, create an AI connection in Administration → AI Connections and supply your API key.

Cloud processing sends document content (image or OCR text) to the provider's API. If data privacy is a concern, use the self-hosted Ollama option instead.

Configuration Reference

AI Watch Folder Settings

Configured in Cabinet Settings → AI Settings → Watch Folders.

Setting	Description
Source Path	The server directory to monitor for incoming files.
Target Cabinet	The top-level file cabinet. The AI determines which folder within this cabinet to use.
AI Connection	Which AI connection to use for processing documents in this watch folder.
File Pattern	Glob pattern for file types to import (default: `*` for all files).
Poll Interval	How often to check for new files, in seconds. Default is 60.
Delete After Import	Remove the source file after successfully importing into DocMan.
Active	Toggle to enable or disable the AI watch folder.
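
The Source Path, File Pattern, and Poll Interval settings together describe a simple polling loop. The sketch below shows one pass of such a loop under stated assumptions: `find_new_files` is a hypothetical helper, files are tracked by name in an in-memory set, and the pattern is matched with Python's `fnmatch`, whose glob rules may differ in detail from DocMan's.

```python
import fnmatch
from pathlib import Path


def find_new_files(source_dir: Path, pattern: str, seen: set[str]) -> list[Path]:
    """Return files in source_dir that match the glob pattern and are not yet seen.

    Names of returned files are added to `seen`, so the next poll skips them.
    """
    fresh = []
    for entry in sorted(source_dir.iterdir()):
        if (entry.is_file()
                and fnmatch.fnmatch(entry.name, pattern)
                and entry.name not in seen):
            seen.add(entry.name)
            fresh.append(entry)
    return fresh
```

A watcher would call this once per poll interval (every 60 seconds by default) and hand each returned file to the AI connection. With Delete After Import enabled, the `seen` set matters less because imported files disappear from the source directory.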

Cabinet AI Filing Settings

Configured in Cabinet Settings → AI Settings → Filing.

Setting	Description
Folder Structure	Existing Folders (AI picks from what's there) or Template (AI creates folders from your pattern).
Folder Template	Template mode only. The folder path pattern using tokens (e.g., `<Company>/<YYYY>`).
Filename Template	Optional template for renaming files (e.g., `<Company> - <DocType> - <Date>`).
Auto-Approve Threshold	Confidence score (0–100) at or above which documents are filed automatically without review. Default is 75. Set to 0 to disable auto-approve and review everything.
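
The auto-approve rule can be written as a small decision function. This is a sketch, not DocMan's code: `should_auto_file` is a hypothetical name, and treating a confidence equal to the threshold as passing is an assumption.

```python
def should_auto_file(confidence: float, threshold: int = 75) -> bool:
    """Decide whether a suggested filing can skip the Review Queue.

    A threshold of 0 disables auto-approve, so everything is reviewed.
    """
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    if threshold == 0:
        return False
    # Assumption: a confidence equal to the threshold counts as meeting it.
    return confidence >= threshold
```

With the default threshold, a document scored 90 would be filed automatically and one scored 60 would wait in the Review Queue.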
