# Sortarr Project Info
Purpose: self-hosted Jellyfin ecosystem organizer and dashboard, fully editable and Docker Compose runnable. It watches downloads, plans/moves media into Jellyfin-friendly folders across four media drives, displays storage/library/download/release status, and exposes configurable tools such as subtitle audit and ffmpeg transcoding.
## Runtime
- Root: `/home/drop/jellyfin/scripts/sortarr`
- Web UI: `http://localhost:8088` or host LAN IP on port `8088`
- Backend API: port `8099`
- Compose files: `compose.yaml`, `compose.override.yaml`, `compose.prod.yaml`
- Env file: `.env`
- Default dry-run: enabled via `SORTARR_DRY_RUN=true`
- Active containers: `sortarr-web`, `sortarr-backend`
- Known unrelated/orphan container: `sortarr` may still appear in a restart loop, left over from an older Compose layout.
## Host Paths
Configured in `.env`:
- Downloads: `/home/drop/jellyfin/downloads` mounted as `/downloads`
- Media drive 1: `/home/drop/jellyfin/mediashare1` mounted as `/media/drive1`
- Media drive 2: `/home/drop/jellyfin/mediashare2` mounted as `/media/drive2`
- Media drive 3: `/home/drop/jellyfin/mediashare3` mounted as `/media/drive3`
- Media drive 4: `/home/drop/jellyfin/mediashare4` mounted as `/media/drive4`
- Config: `/home/drop/jellyfin/scripts/sortarr/config`
- Logs: `/home/drop/jellyfin/scripts/sortarr/logs`
- Data/state: `/home/drop/jellyfin/scripts/sortarr/data`
## Architecture
- `web`: nginx serves static HTML/CSS/JS from `web/src` and proxies `/api/*` to backend.
- `backend`: Python 3.12 stdlib HTTP API plus background scanner thread. Backend image installs `ffmpeg`.
- Optional profiles:
  - `redis` profile `cache`
  - `postgres` profile `database`
  - `media-tools` profile `tools`
No frontend framework and no backend web framework are used. This is intentional for editability.
## Important Files
- `.env.example`: sample deployment variables.
- `.env`: real local deployment paths and runtime values. Ignored by git.
- `compose.yaml`: main stack.
- `compose.override.yaml`: dev bind mounts and debug defaults.
- `compose.prod.yaml`: prod restart/dry-run defaults.
- `backend/default-config/app.toml`: full default config.
- `config/app.toml`: host-editable override config.
- `config/custom-theme.css`: host-editable CSS token overrides.
- `backend/sortarr/app.py`: API server and route handlers.
- `backend/sortarr/config.py`: TOML/env config loading and merging.
- `backend/sortarr/scanner.py`: 24/7 downloads scanner thread.
- `backend/sortarr/parser.py`: filename media parser.
- `backend/sortarr/organizer.py`: destination planning, collision handling, move execution, NFO writing.
- `backend/sortarr/storage.py`: drive stats and drive selection.
- `backend/sortarr/library.py`: explicit library scan/indexing and Movies/TV collection grouping.
- `backend/sortarr/metadata.py`: optional TMDb metadata lookup for covers, summaries, and TV episode lists.
- `backend/sortarr/media_probe.py`: safe ffprobe wrapper for audio/subtitle/video stream details.
- `backend/sortarr/tools.py`: subtitle audit and transcoder tools.
- `backend/sortarr/downloads.py`: current `/downloads` listing and recent moved/planned download history.
- `backend/sortarr/releases.py`: free RSS/JSON upcoming release providers.
- `backend/sortarr/store.py`: JSON state store in `data/state.json`.
- `web/src/index.html`: app shell and page markup.
- `web/src/app.js`: hash router, API calls, rendering, settings/tools behavior.
- `web/src/styles.css`: layout/design system.
- `web/src/themes.css`: 10 editable theme presets.
- `docs/*.md`: API/config/operations docs.
## Configuration Model
Config precedence, lowest to highest (later sources override earlier ones):
1. `backend/default-config/app.toml`
2. `config/app.toml`
3. `.env` variables passed into Compose
4. Runtime settings saved in `data/state.json` under `settings`
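
The layering above can be sketched as a recursive dictionary merge. This is a minimal illustration, assuming the list runs from lowest to highest priority and that nested tables merge key by key; `deep_merge`, `resolve_config`, and the sample values are hypothetical and not taken from `backend/sortarr/config.py`.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `override` win over `base`.

    Nested dicts are merged key by key; any other value (lists included)
    is replaced wholesale.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(*layers: dict[str, Any]) -> dict[str, Any]:
    """Merge layers in order; later layers take precedence."""
    result: dict[str, Any] = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


# Hypothetical stand-ins for the four sources listed above.
defaults = {"app": {"dry_run": True, "scan_interval_seconds": 60, "log_level": "info"}}
host_toml = {"app": {"scan_interval_seconds": 30}}
env_layer = {"app": {"dry_run": True}}
runtime_settings = {"app": {"log_level": "debug"}}

config = resolve_config(defaults, host_toml, env_layer, runtime_settings)
```
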
Key config areas:
- `[app]`: dry-run, scan interval, settle time, log level, extensions, incomplete suffixes, library scan limits, cache size cap.
- `[paths]`: downloads/data/logs/cache container paths.
- `[[drives]]`: four media drives with id/name/path/min-free-space.
- `[library]`: folder and filename templates, collision policy, permissions mode.
- `[metadata]`: NFO behavior and optional TMDb credentials/settings.
- `[[release_providers]]`: free RSS/JSON providers.
- `[theme]`: default theme and custom CSS.
Runtime Settings page can update:
- `dry_run`
- `scan_interval_seconds`
- `settle_seconds`
- `library_scan_max_files`
- `library_scan_timeout_seconds`
- `log_level`
## Media Organizer Behavior
Background scanner watches `/downloads` continuously.
Safety:
- Ignores incomplete suffixes such as `.part`, `.!qB`, `.tmp`, `.crdownload`.
- Requires files to be stable for `settle_seconds`.
- Dry-run plans moves without moving.
- Actual moves go through a temporary `.sorting` path before final rename.
- Collision policies: `keep-both`, `skip`, `replace`.
- Events and plans are stored in `data/state.json`.
Parsing:
- Detects movies, episodes, seasons, and multi-episode releases.
- Recognizes `S01E02`, `S01E02E03`, and `1x02` style episode patterns.
- Extracts year and quality tokens where present.
Drive choice:
1. Checks whether the title already has a home under `Movies` or `Shows`.
2. If no home exists, picks eligible drive with most free space.
3. Enforces `min_free_gb`.
Naming:
- Movies: `Movies/{title} ({year})/{title} ({year}){quality}{ext}`
- Episodes: `Shows/{title}/Season {season:02d}/{title} - SxxExx - Episode{quality}{ext}`
- Templates are editable in TOML.
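
The templates read like Python format strings, so rendering could look roughly like this. The movie template is copied from above; the episode template is an assumed expansion of the `SxxExx` placeholder without the episode-title part, and the ` [1080p]` quality formatting is a guess at how `{quality}` is filled.

```python
MOVIE_TEMPLATE = "Movies/{title} ({year})/{title} ({year}){quality}{ext}"
# Assumed expansion of "SxxExx"; the configured template may differ.
EPISODE_TEMPLATE = (
    "Shows/{title}/Season {season:02d}/"
    "{title} - S{season:02d}E{episode:02d}{quality}{ext}"
)


def render(template: str, **fields) -> str:
    """Fill a naming template; a missing quality renders as an empty string."""
    quality = fields.get("quality")
    fields["quality"] = f" [{quality}]" if quality else ""  # assumed formatting
    return template.format(**fields)


movie_path = render(MOVIE_TEMPLATE, title="The Matrix", year=1999, quality="1080p", ext=".mkv")
episode_path = render(EPISODE_TEMPLATE, title="Show Name", season=1, episode=2, quality=None, ext=".mkv")
```
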
## Library Indexing
Regular dashboard refresh does not walk the media filesystem.
Library indexing is explicit:
- UI button: Library page -> `Scan library`
- API: `POST /api/library/scan`
- Scans only direct child folders of each media drive named:
  - `Movies`
  - `Shows`
  - `TV`
  - `TV Shows`
The library scanner skips system/recycle folders and has timeout/file-count limits. Results are cached in `data/state.json` and used by dashboard/tools.
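
A bounded scan of this kind might be structured as below. The skip list, the extension set, and the return shape are assumptions for illustration; only the root folder names and the idea of a file-count limit, a timeout, and a truncation flag come from the description above.

```python
import os
import time
from pathlib import Path

LIBRARY_ROOTS = ("Movies", "Shows", "TV", "TV Shows")
# Assumed examples of system/recycle folders.
SKIP_DIRS = {"$RECYCLE.BIN", "System Volume Information", "lost+found"}


def scan_library(drive_paths, max_files, timeout_seconds, extensions=(".mkv", ".mp4", ".avi")):
    """Walk the known library roots on each drive, stopping at either limit."""
    deadline = time.monotonic() + timeout_seconds
    files = []
    for drive in drive_paths:
        for root_name in LIBRARY_ROOTS:
            root = Path(drive) / root_name
            if not root.is_dir():
                continue
            for dirpath, dirnames, filenames in os.walk(root):
                # Prune skipped folders in place so os.walk never descends into them.
                dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
                for name in filenames:
                    if not name.lower().endswith(extensions):
                        continue
                    if len(files) >= max_files or time.monotonic() > deadline:
                        return {"files": files, "truncated": True}
                    files.append(os.path.join(dirpath, name))
    return {"files": files, "truncated": False}
```
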
Current cache fields include:
- drive stats
- indexed media items split by movie (`Movies`) and TV (`Shows`, `TV`, `TV Shows`) roots
- collection groups for movies and TV series
- optional TMDb posters, overviews, and TV season episode metadata
- extension breakdown
- scanned file count
- truncation flag
- per-media `has_subtitles` when available from scan
## Frontend Pages
The UI uses hash routing in `web/src/app.js`.
Routes:
- `#/overview`: storage, file type breakdown, recent events.
- `#/library`: poster grid with All/Movies/TV Shows tabs, series/episode drilldown, missing/upcoming episode state, and media stream inspection.
- `#/downloads`: current `/downloads` media bundles with matching subtitles/sidecars plus recent Sortarr plans/moves from `/downloads`.
- `#/releases`: missing/upcoming library episodes plus configured public providers.
- `#/tools`: transcoder, subtitle audit, duplicate finder.
- `#/settings`: appearance controls, descriptive runtime controls, raw config details.
Theme system:
- Theme choices live on the Settings page and persist in `localStorage`.
- Compact density toggle persists in `localStorage`.
- Presets: `slate`, `midnight`, `graphite`, `nord`, `dracula`, `solar`, `forest`, `marine`, `ember`, `paper`.
- Tokens live in `web/src/themes.css`; host overrides in `config/custom-theme.css`.
## Backend API
- `GET /api/health`: healthcheck.
- `GET /api/config`: public config with secrets removed.
- `GET /api/dashboard`: state + cached library + drive stats; no filesystem library scan.
- `POST /api/scan`: run one downloads scan now.
- `POST /api/library/scan`: refresh cached library index.
- `GET /api/downloads`: current `/downloads` files plus recent planned/moved download history.
- `GET /api/releases`: upcoming releases.
- `GET /api/media/probe`: ffprobe stream details for a selected file.
- `POST /api/media/tracks`: dry-run or execute ffmpeg remux track default/removal changes.
- `GET /api/theme/custom.css`: custom CSS.
- `POST /api/settings`: update runtime settings.
- `GET /api/tools/subtitles`: subtitle audit from cached library data.
- `GET /api/tools/transcoder`: build ffmpeg transcode queue from cached library.
- `POST /api/tools/transcoder/run-next`: run next ffmpeg transcode if dry-run is disabled.
## Tools
Subtitle audit:
- Uses cached library index, not live filesystem probes.
- Requires a fresh library scan for accurate `has_subtitles`.
- Reports checked count, with-subtitles count, missing count, unknown count, and missing examples.
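
The audit amounts to a tally over cached items. In this sketch each item is a dict with a `path` and an optional `has_subtitles` flag, where a missing or `None` flag counts as unknown; the item shape and report keys are assumptions rather than the actual cache format.

```python
def audit_subtitles(items: list[dict], max_examples: int = 5) -> dict:
    """Summarise subtitle coverage from cached library items."""
    report = {
        "checked": 0,
        "with_subtitles": 0,
        "missing": 0,
        "unknown": 0,
        "missing_examples": [],
    }
    for item in items:
        report["checked"] += 1
        flag = item.get("has_subtitles")
        if flag is True:
            report["with_subtitles"] += 1
        elif flag is False:
            report["missing"] += 1
            if len(report["missing_examples"]) < max_examples:
                report["missing_examples"].append(item["path"])
        else:
            # No flag recorded by the last scan, so the state is not known.
            report["unknown"] += 1
    return report
```
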
Transcoder:
- Backend image installs `ffmpeg`.
- Queue includes cached indexed media not already `.mp4`.
- Output path is source path with `.mp4` suffix.
- Command uses `libx264`, `aac`, and `mov_text`.
- In dry-run mode, `run-next` reports without executing.
- With dry-run disabled, `run-next` runs one job synchronously with a one-hour timeout.
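
The queue and `run-next` behavior can be approximated as follows. This sketch assumes the `.mp4` suffix replaces the source extension and adds `-n` so ffmpeg refuses to overwrite an existing output; the function names and result dictionaries are hypothetical and do not mirror `backend/sortarr/tools.py`.

```python
import subprocess
from pathlib import Path


def build_queue(cached_paths: list[str]) -> list[str]:
    """Keep only indexed files that are not already .mp4."""
    return [p for p in cached_paths if Path(p).suffix.lower() != ".mp4"]


def build_transcode_command(source: Path) -> tuple[list[str], Path]:
    """Build an ffmpeg command using the codecs named above."""
    output = source.with_suffix(".mp4")  # assumption: extension is replaced
    command = [
        "ffmpeg", "-n",        # -n: never overwrite an existing output file
        "-i", str(source),
        "-c:v", "libx264",
        "-c:a", "aac",
        "-c:s", "mov_text",    # MP4-compatible text subtitles
        str(output),
    ]
    return command, output


def run_next(queue: list[str], dry_run: bool, timeout_seconds: int = 3600) -> dict:
    """Report the next job in dry-run mode, otherwise run it synchronously."""
    if not queue:
        return {"status": "empty"}
    command, output = build_transcode_command(Path(queue[0]))
    if dry_run:
        return {"status": "dry-run", "command": command, "output": str(output)}
    # Raises subprocess.TimeoutExpired if ffmpeg exceeds the timeout.
    completed = subprocess.run(command, capture_output=True, text=True, timeout=timeout_seconds)
    status = "ok" if completed.returncode == 0 else "failed"
    return {"status": status, "command": command, "output": str(output)}
```
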
Duplicate finder:
- Reports duplicate title groups from the cached library index.
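
Grouping by title could be as simple as the sketch below. It assumes each cached item carries `title`, `path`, and an optional `year`, and that titles are compared case-insensitively; the real grouping key is not documented here.

```python
from collections import defaultdict


def find_duplicate_titles(items: list[dict]) -> dict[tuple, list[str]]:
    """Group cached items by (normalized title, year) and keep groups of two or more."""
    groups: dict[tuple, list[str]] = defaultdict(list)
    for item in items:
        key = (item["title"].strip().casefold(), item.get("year"))
        groups[key].append(item["path"])
    return {key: paths for key, paths in groups.items() if len(paths) > 1}
```
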
## Release Providers
No paid API dependency.
Bundled providers, disabled by default so the Releases page stays centered on the local library:
- TMDb RSS upcoming movies.
- TVMaze public schedule JSON.
Provider logic is in `backend/sortarr/releases.py`; add new RSS/JSON adapters there and configure in TOML.
## Verification Commands
Common checks:
```bash
python -m compileall backend/sortarr
node --check web/src/app.js
docker compose config
docker compose up -d --build
docker exec sortarr-backend python -m sortarr.healthcheck
docker exec sortarr-backend ffmpeg -version
```
Endpoint checks from inside backend:
```bash
docker exec sortarr-backend python -c "from urllib.request import urlopen; print(urlopen('http://127.0.0.1:8099/api/health').status)"
docker exec sortarr-backend python -c "from urllib.request import urlopen; import json; print(json.load(urlopen('http://127.0.0.1:8099/api/tools/transcoder'))['transcoder']['ffmpeg_available'])"
```
## Current Caveats / Next Good Tasks
- Settings are runtime/persisted in JSON state but not written back into `config/app.toml`.
- Transcoding runs synchronously; future improvement should add a job queue with progress/cancel/history.
- Duplicate finder reports duplicate title groups from the cached library index.
- Subtitle audit only becomes exact after a fresh manual library scan because it relies on cached `has_subtitles`.
- Library scan only checks direct child folders named `Movies`, `Shows`, `TV`, or `TV Shows` under each media drive.
- The backend is a stdlib HTTP server; that is fine for self-hosting on a LAN or behind a reverse proxy, but add authentication before exposing it publicly.