> [!abstract] Summary
> **Deadline Log Analyser** — a Python tool by Thomas Spony that inspects Deadline render jobs and flags risky ones (timeouts, RAM issues, oversized USD scenes, un-mipmapped textures, wrong server output). Read-only — diagnoses, never modifies. Use cases: **daily triage** and **single-job check**. Output: HTML / CSV / Markdown reports grouped by Group → User → Job with severity badges.
> [!info] Author & contact
> **Thomas Spony.** Read-only tool — looks at render reports and the Deadline job database; never changes a job, render, or anything on the servers.
---
# Build the .exe (one-time)
```bash
# Install build tools (pip, not npm)
pip install usd-core pyinstaller
# Build the .exe
pyinstaller --onefile --windowed --name DeadlineCleaner --collect-all pxr deadline_cleaner.py
```
> [!video] Explanation video
> [Open in Google Drive](https://drive.google.com/file/d/1CZJn9puQDk0ywYm9UgEDb7l4ql433b6e/view?usp=sharing)
> [!example] Source .py
> [Open in Google Drive](https://drive.google.com/file/d/1yST-Ezu1ctN5EEA77ogf-7y-7tNXSoMe/view?usp=sharing)
---
# What it flags
- [x] MIPMAP flagging.
- [x] AOV excess.
- [x] Memory pressure.
- [ ] USD dependencies.
- [ ] Non-indexed attribute `shop_materialpath` string.
- [ ] Invalid mesh.
- [ ] Bad primvar sample size.
- [ ] Displacement shader doesn't modify P.
- [ ] Singular matrix detected.
*(Find online answers for the unchecked items.)*
## Display improvements
- [ ] Organise errors per category.
---
# 1. What it's for
Student renders sometimes bring the farm, and the `DATA_PFE` server with it, to its knees. Deadline Cleaner reads the logs that Deadline has already produced and flags the jobs most likely responsible, so you can fix them before they crash the farm again.
Two typical uses:
- **Daily triage** — *"show me everything that failed on the farm this week."*
- **Single-job check** — *"why did *my* job fail / why was it so slow?"*
# 2. Before you start
- **Windows** with the **Deadline Client** installed (you already have it if you submit renders). The tool auto-discovers `deadlinecommand.exe`.
- For **USD inspection** (texture + scene dependency analysis), the packaged `.exe` already bundles the USD libraries. If you run the Python source instead, install them first with `pip install usd-core`.
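
The auto-discovery of `deadlinecommand.exe` mentioned above could work roughly like this. It is a minimal sketch, not the tool's actual code: it assumes the `DEADLINE_PATH` environment variable that the Deadline Client installer normally sets, then falls back to the system `PATH`. The function name `find_deadlinecommand` and its injectable parameters are hypothetical.

```python
import os
import shutil
from pathlib import Path


def find_deadlinecommand(env=None, which=shutil.which):
    """Return the path to deadlinecommand as a string, or None if not found.

    Assumption: DEADLINE_PATH points at the Deadline Client 'bin' folder.
    """
    env = os.environ if env is None else env
    exe_name = "deadlinecommand.exe" if os.name == "nt" else "deadlinecommand"

    # 1. Folder advertised by the Deadline Client installer, if any.
    deadline_dir = env.get("DEADLINE_PATH", "").strip()
    if deadline_dir:
        candidate = Path(deadline_dir) / exe_name
        if candidate.is_file():
            return str(candidate)

    # 2. Fall back to whatever is on the system PATH.
    return which("deadlinecommand")


if __name__ == "__main__":
    found = find_deadlinecommand()
    print(found or "deadlinecommand not found")
```

A `None` result corresponds to the red status line described in §3.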
# 3. Launching the tool
- **Double-click `DeadlineCleaner.exe`** — the window opens immediately.
- If using the Python source: open a terminal in its folder, run `python deadline_cleaner.py`.
A **green** "*Deadline Client found*" line at the top means the tool can talk to the farm. **Red** = Deadline Client missing — ask a pipeline TD.
# 4. The window — control by control
| Control | What it does |
|---|---|
| **Look back** | How far back to search: `24h`, `7d`, `30d`, `all`. Ignored if Job IDs are filled. |
| **Max jobs to scan** | Slider — `5, 10, 20, 50, 100, …, All`. Newest first. Start small (20–50). |
| **Job IDs** | Optional. Paste 24-character hexadecimal job IDs (one per line) to analyse exactly those jobs, skipping the time search. See §7. |
| **Groups** | Tick/untick the student groups (`HYT, JRN, VRT, CNB, PMF, COR, GTE, OTHER`). |
| **Deadline job status** | Which statuses to pull: `Failed`, `Active`, `Completed`, `Suspended`, `Pending`. Defaults to **Failed + Active**. |
| **Inspect USD scenes** | Optional, slower. Opens each USD scene to count dependencies and check textures. |
| **Output folder** | Default `report`. |
| **Analyze** | Runs the analysis. Progress in the box below; the report opens in your browser when done. |
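
To make the **Look back** values concrete, here is a small sketch of how strings such as `24h`, `7d`, `30d`, and `all` could be turned into a cut-off date. The helper name `lookback_cutoff` is hypothetical and the tool's real parsing may differ.

```python
import re
from datetime import datetime, timedelta

# Map the suffix letter to the matching timedelta keyword.
_UNITS = {"h": "hours", "d": "days"}


def lookback_cutoff(value, now=None):
    """Return the oldest datetime to include, or None for 'all'."""
    now = now or datetime.now()
    text = value.strip().lower()
    if text == "all":
        return None
    match = re.fullmatch(r"(\d+)([hd])", text)
    if match is None:
        raise ValueError(f"unsupported look-back value: {value!r}")
    amount, unit = int(match.group(1)), match.group(2)
    return now - timedelta(**{_UNITS[unit]: amount})


reference = datetime(2024, 1, 8, 12, 0)
print(lookback_cutoff("7d", now=reference))   # 2024-01-01 12:00:00
print(lookback_cutoff("all", now=reference))  # None
```

Jobs submitted before the returned date would be left out of the scan; `None` means no time limit.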
# 5. Running your first analysis
1. Leave **Look back** at `7d`.
2. Set **Max jobs** to `20` (quick).
3. Leave **Groups** all ticked, **status** at Failed + Active.
4. Click **Analyze**.
5. After a few seconds the HTML report opens in your browser.
# 6. Reading the report
Organised **Group → User → Job**.
### Severity
| Badge | Meaning |
|---|---|
| 🔴 **critical** | Almost certainly hurting the farm — fix first. |
| 🟠 **high** | A real problem; fix soon. |
| 🟡 **medium** | Worth correcting; not an emergency. |
| 🔵 **low** | Minor / informational. |
| ⚪ **info** | Nothing wrong detected. |
Groups and users with **critical** or **high** items expand automatically; quieter ones stay collapsed. Top chips jump you to a group.
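
The grouping and auto-expand behaviour can be pictured with the sketch below. It is an illustration only: the record keys (`group`, `user`, `severity`), the user names, and the helper functions are hypothetical, not the tool's internals.

```python
from collections import defaultdict

# Severity levels from the table above, most serious first.
SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"]


def build_tree(jobs):
    """Nest job records as group -> user -> list of jobs."""
    tree = defaultdict(lambda: defaultdict(list))
    for job in jobs:
        tree[job["group"]][job["user"]].append(job)
    return tree


def worst_severity(jobs):
    """Return the most serious severity found among the jobs."""
    return min((job["severity"] for job in jobs), key=SEVERITY_ORDER.index)


def should_expand(jobs):
    """Expand a section when it holds a critical or high item."""
    return worst_severity(jobs) in ("critical", "high")


jobs = [
    {"group": "HYT", "user": "alice", "severity": "critical"},
    {"group": "HYT", "user": "bob", "severity": "low"},
    {"group": "JRN", "user": "carol", "severity": "medium"},
]
tree = build_tree(jobs)
for group, users in tree.items():
    group_jobs = [job for user_jobs in users.values() for job in user_jobs]
    print(group, "expanded" if should_expand(group_jobs) else "collapsed")
```

With this sample data, `HYT` opens because of its critical job, while `JRN` stays collapsed.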
### A job card
Shows job name, frame, elapsed time, peak RAM, AOV count, texture summary, USD dependency size, output path, and a list of **flags**.
Two states are shown on purpose:
- **task: …** — how one frame ended (success / timeout / failed / …).
- **job: …** — overall Deadline job status.
A job marked **Failed** can still have many frames that rendered fine — only the bad frame failed.
# 7. Checking one specific job
Spotted a red job in **Deadline Monitor**?
1. Right-click → **Copy Job ID**.
2. Paste into the tool's **Job IDs** box (multiple supported, one per line).
3. Click **Analyze**.
Fast — fetches only those jobs and ignores Look-back / status filters.
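
If you paste many IDs at once, a quick validity check saves a wasted run. The sketch below assumes job IDs are 24 hexadecimal characters, as the troubleshooting table states; the helper name `parse_job_ids` is hypothetical.

```python
import re

JOB_ID_RE = re.compile(r"[0-9a-fA-F]{24}")


def parse_job_ids(text):
    """Split pasted text into (valid_ids, rejected_entries)."""
    valid, rejected = [], []
    # Accept one ID per line, and commas as used on the command line.
    for entry in re.split(r"[\s,]+", text.strip()):
        if not entry:
            continue
        if JOB_ID_RE.fullmatch(entry):
            valid.append(entry.lower())
        else:
            rejected.append(entry)
    return valid, rejected


pasted = """6a00cd5b756fadd627ead018
69f9fceda9224a4ba9985021
not-an-id"""
valid, rejected = parse_job_ids(pasted)
print(len(valid), "valid,", len(rejected), "rejected")  # 2 valid, 1 rejected
```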
# 8. What the flags mean
| Flag | Severity | What it means |
|---|---|---|
| `timeout` | critical | Deadline killed the task — ran too long. |
| `stuck_progress` | critical | <5 % rendered after 30 min — render not converging. |
| `output_to_data_pfe` | critical | Render is writing EXRs to `V:` (DATA_PFE). Must use `W:` (DATA_FRM). Directly loads the server that crashes. |
| `ram_saturation` | critical | Peak RAM came within 1 GB of the worker's total — machine was about to swap/crash. |
| `long_frame` | high | A single frame took over 1 h 30 min, the per-frame ceiling. |
| `memory_grower` | high | Karma's memory kept climbing during the render (leak / runaway scene). |
| `usd_load_failed` | high | Husk could not open the USD scene — broken or missing reference. |
| `non_zero_exit` | high | Renderer exited with an error, no more specific cause found. |
| `mipmap_missing` | high / medium | Textures not mipmapped (`UNTILED`). Convert to `.rat` (Karma) / `.tex` (RenderMan). |
| `aov_excess` | medium | More than 20 AOVs — usually more than the shot needs. |
| `wont_fit_32gb` | medium | Peak RAM ≥ 31 GB — can't run on a 32 GB machine. |
| `env_config_error` | medium | Worker had no `HUSK_PATH` — render never started. |
| `usd_dep_heavy` | medium | 200+ USD layers — very heavy scene graph. |
| `wont_fit_16gb` | low | Peak RAM ≥ 15 GB — can't run on a 16 GB machine. |
| `mipmap_offspec` | low | Tiled, but not renderer-native `.rat` / `.tex` format. |
| `render_warnings` | low | Non-fatal warnings (missing primvars, light/displacement notes, etc.). |
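
Several of these flags are plain threshold checks. The sketch below restates the RAM, AOV, and USD-layer rules from the table as code. It makes assumptions the table does not settle: "within 1 GB" is read as a gap of 1 GB or less, and a job can carry both `wont_fit_32gb` and `wont_fit_16gb`. The function name and parameters are hypothetical.

```python
def ram_and_scene_flags(peak_ram_gb, worker_ram_gb, aov_count, usd_layers):
    """Return (flag, severity) pairs using the thresholds in the table above."""
    flags = []
    # Assumption: "within 1 GB" means a gap of 1 GB or less.
    if worker_ram_gb - peak_ram_gb <= 1:
        flags.append(("ram_saturation", "critical"))
    if aov_count > 20:
        flags.append(("aov_excess", "medium"))
    if peak_ram_gb >= 31:
        flags.append(("wont_fit_32gb", "medium"))
    if usd_layers >= 200:
        flags.append(("usd_dep_heavy", "medium"))
    if peak_ram_gb >= 15:
        flags.append(("wont_fit_16gb", "low"))
    return flags


# Hypothetical job: 31.5 GB peak on a 64 GB worker, 24 AOVs, 120 USD layers.
print(ram_and_scene_flags(31.5, 64, 24, 120))
# [('aov_excess', 'medium'), ('wont_fit_32gb', 'medium'), ('wont_fit_16gb', 'low')]
```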
### Quick fixes
- **`output_to_data_pfe`** → repoint render output to `W:/`.
- **`mipmap_missing` / `mipmap_offspec`** → convert textures to `.rat` (Karma) or `.tex` (RenderMan).
- **`aov_excess`** → delete render vars the shot doesn't use.
- **`long_frame` / `timeout` / `memory_grower`** → lighten the scene: lower samples, reduce volume detail, check for runaway geometry/instances.
- **`wont_fit_16gb` / `wont_fit_32gb`** → reduce memory or pin to bigger workers.
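
Before resubmitting, you can check an output path against the `output_to_data_pfe` rule yourself. This sketch assumes the servers are only reached through the mapped drive letters named above (`V:` for DATA_PFE, `W:` for DATA_FRM); UNC paths are not handled, and the helper name is hypothetical.

```python
from pathlib import PureWindowsPath

FORBIDDEN_DRIVE = "V:"  # DATA_PFE


def writes_to_data_pfe(output_path):
    """True when the output path sits on the V: drive."""
    return PureWindowsPath(output_path).drive.upper() == FORBIDDEN_DRIVE


# Hypothetical output paths.
for path in ("V:/renders/shot010/beauty.exr", "W:/renders/shot010/beauty.exr"):
    print(path, "-> fix needed" if writes_to_data_pfe(path) else "-> ok")
```

`PureWindowsPath` parses Windows-style paths on any operating system, so the check also runs on a non-Windows machine.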
# 9. The output files
| File | Use |
|---|---|
| `report.html` | The main report. |
| `report.csv` | Same data, sortable spreadsheet. |
| `per_student.md` | Plain-text rollup grouped by group → student. |
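
Because `report.csv` is plain CSV, it can also be summarised with a few lines of Python. The column names below (`user`, `job_name`, `severity`) and the sample rows are assumptions for illustration; check the header row of your own file before relying on them.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count rows per distinct value of one column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Stand-in for open("report/report.csv", newline="", encoding="utf-8").
sample = io.StringIO(
    "user,job_name,severity\n"
    "alice,shot010_beauty,critical\n"
    "alice,shot020_beauty,low\n"
    "bob,shot030_fx,critical\n"
)
print(count_by_column(sample, "severity"))
```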
# 10. Command line (optional)
```bash
# Failed + Active jobs from the last 7 days
python deadline_cleaner.py --deadline --since 7d -o report
# Only failed jobs, last 24h, with USD inspection
python deadline_cleaner.py --deadline --job-status Failed --since 24h --inspect-usd -o report
# Specific jobs by ID
python deadline_cleaner.py --deadline --job-ids 6a00cd5b756fadd627ead018,69f9fceda9224a4ba9985021 -o report
```
Run with no arguments to open the GUI.
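
For scheduled or scripted runs, the command line can be assembled from Python. The sketch below uses only the flags shown in the examples above; the wrapper function `build_command` is hypothetical, and it assumes that `--since` and `--job-status` are unnecessary when job IDs are given, mirroring the GUI behaviour.

```python
import sys


def build_command(since=None, job_status=None, job_ids=None,
                  inspect_usd=False, output="report",
                  script="deadline_cleaner.py"):
    """Assemble the argument list for one command-line run."""
    cmd = [sys.executable, script, "--deadline"]
    if job_ids:
        # Job IDs are passed as one comma-separated value.
        cmd += ["--job-ids", ",".join(job_ids)]
    else:
        if job_status:
            cmd += ["--job-status", job_status]
        if since:
            cmd += ["--since", since]
    if inspect_usd:
        cmd.append("--inspect-usd")
    cmd += ["-o", output]
    return cmd


# Failed jobs from the last 24 hours, with USD inspection.
print(build_command(since="24h", job_status="Failed", inspect_usd=True)[1:])
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the folder that contains `deadline_cleaner.py`.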
# 11. Troubleshooting
| Symptom | Fix |
|---|---|
| Red *"deadlinecommand not found"* line | Deadline Client missing — use a workstation that has it. |
| *"Nothing to analyze"* | Widen Look back, tick more job statuses, or check that the Job IDs are valid 24-character hexadecimal strings. |
| USD row says `no_pxr` | USD libraries not available — use the packaged `.exe` (bundles them). |
| Textures show *"no data"* for a RenderMan job | RenderMan doesn't log texture stats by default. Tick **Inspect USD scenes**. |
| Lots of `unreachable` textures | Texture paths point to drives not mapped on this machine — the job itself is fine, the tool just can't measure those files from here. |
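
When running from the Python source, you can confirm whether the `no_pxr` case applies before launching an analysis. This sketch only checks that the `pxr` package (installed by `usd-core`) is importable; the function name is hypothetical.

```python
import importlib.util


def usd_available():
    """True when the USD Python bindings (the 'pxr' package) can be imported."""
    return importlib.util.find_spec("pxr") is not None


if usd_available():
    print("USD libraries found: scene inspection can run.")
else:
    print("no_pxr: install usd-core or use the packaged .exe.")
```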
# 12. What it can't tell you
- It reads **reports of renders that already ran** — it can't predict an unrun job.
- It diagnoses; it does **not** fix jobs or scenes. Use it to produce a report, then correct the scene in Houdini / Prism yourself.
- Nuke comp jobs and Prism `cleanup` jobs are intentionally skipped — they aren't the renders that crash the farm.
---
# 🔗 Related
- [[../DEADLINE MOC|DEADLINE MOC]].
- [[Deadline Error List|Deadline Error List]] — manual reference.
- [[Deadline For Alternatives Productions|Deadline for Alternatives Productions]].
- [[Reconfigure Deadline|Reconfigure Deadline]].
- [[../../TOOLS/Notes/Deadline Renderman Denoiser - Loris Eck|Deadline RenderMan Denoiser]].
- [[../../PROTOCOL/OPTIMISATION/OPTIMISATION MOC|OPTIMISATION MOC]] — root causes the analyser detects.