> [!abstract] Summary
> Deadline is the render farm manager used in PROTOCOL. Artists submit jobs via **Prism**, and render wranglers monitor the farm. Always test locally first (FML + 1/4 rez), always use **network paths**, never exceed **priority 50**, and check the Monitor regularly. Logs are your most important debugging tool.
---
# What is Deadline?
Deadline is a **render farm management and job scheduling system**: the central nervous system of the studio's computing infrastructure. Instead of rendering a whole sequence on a single workstation (which could take days), Deadline distributes the frames across multiple machines so they render in parallel.
## Is Deadline Running on Your Machine?
- Press **Ctrl + Shift + Esc** to open Task Manager and look for a process called **"Deadline Worker"**. If it is using a lot of CPU while nothing else is open, your machine is rendering for the farm.
- ![[image-129.png]]
- You can also look for the Deadline icon in the system tray — if visible, Deadline is active.
- ![[image-130.png]]
---
# Core Concepts
| Concept | Definition | Key Point |
| ---------- | ------------------------------- | --------------------------------- |
| **Job** | Complete submission to Deadline | 1 job = 1 entire submission |
| **Task** | Chunk of frames within a job | 1 task = N frames (configurable) |
| **Frame** | Single image in sequence | Smallest unit, 1 frame = 1 image |
| **Worker** | Render machine on farm | Executes tasks, renders frames |
| **Pool** | Priority queue for jobs | Controls scheduling order |
| **Group** | Worker categorization | Controls which workers can render |
| **Pulse** | Farm maintenance service | Keeps farm healthy and optimized |
### Jobs, Tasks, and Frames in Depth
A **Job** is created each time you submit work from Houdini. Its frame range is split into **Tasks** (chunks) distributed to individual workers.
**Frames Per Task** controls chunk size:
| Frames Per Task | Tasks for 100 Frames | Parallelization |
| --------------- | -------------------- | --------------- |
| 1 | 100 tasks | Maximum |
| 5 | 20 tasks | Balanced |
| 10 | 10 tasks | Lower overhead |
| 100 | 1 task | None |
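
The table above follows from a simple split of the frame range. Here is a minimal sketch of that arithmetic; the function name and the 1001–1100 frame range are illustrative assumptions, not something Deadline or Prism exposes.

```python
def split_into_tasks(first_frame: int, last_frame: int, frames_per_task: int) -> list[tuple[int, int]]:
    """Return inclusive (start, end) frame ranges, one per task."""
    if frames_per_task < 1:
        raise ValueError("frames_per_task must be at least 1")
    tasks = []
    start = first_frame
    while start <= last_frame:
        # The last task may be shorter if the range does not divide evenly.
        end = min(start + frames_per_task - 1, last_frame)
        tasks.append((start, end))
        start = end + 1
    return tasks


# Hypothetical 100-frame shot, 5 frames per task.
tasks = split_into_tasks(1001, 1100, 5)
print(len(tasks))            # 20
print(tasks[0], tasks[-1])   # (1001, 1005) (1096, 1100)
```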
> [!tip] Choosing Frames Per Task
> - **Fast renders** → 5 frames per task
> - **Heavy/memory-intensive renders** → 1–2 frames per task
>
> If a task is cancelled, **all frames in that task must re-render**, even completed ones. Higher chunk sizes also risk running out of memory since the scene stays loaded between frames.
---
# Submitting a Render
## With Prism (Standard Workflow)
1. Replace the USD Render ROP with a **PRISM LOP RENDER** node — give it a meaningful name for identifier creation.
2. Check **Submit Job**.
3. In Save to Disk, check **USD files from disk generated from node input**.
   - Check **Write Stage to disk before render (-submission)**
   - Uncheck **Flush data after each frame**
   - ![[image-134.png]]
4. Click **Execute**.
![[image-135.png]]
5. Configure submission settings:
| Setting | Guidance |
| ---------------- | ------------------------------------------------------------------------------------------------- |
| **Priority** | See priority table below — never exceed 50 |
| **Frames/Task** | 1–5 for heavy renders, 5–10 for light ones |
| **Task Timeout** | ~100 min per frame as a baseline; for complex shots, set it slightly above the per-frame time measured in your FML test. The timeout applies to the whole task, so scale it by Frames/Task |
| **Pool** | Target your assigned pool |
| **Machine Limit**| Divide your group's allocated machines by the number of active jobs |
| **Submit Suspended** | **Uncheck** — submit active unless your render wrangler says otherwise |
6. Double-check in the **Deadline Monitor** that your render has been spooled.
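
The Machine Limit and Task Timeout guidance in the table can be worked out before opening the submitter. This is a rough sketch only: the function names are made up for illustration, the 25% safety margin is an assumed reading of "slightly above", and it assumes the timeout covers a whole task rather than a single frame.

```python
import math


def machine_limit(group_machines: int, active_jobs: int) -> int:
    """Split the group's allocated machines evenly across its active jobs."""
    if active_jobs < 1:
        raise ValueError("active_jobs must be at least 1")
    # Never return 0, otherwise the job could not render at all.
    return max(1, group_machines // active_jobs)


def task_timeout_minutes(slowest_fml_frame_min: float, frames_per_task: int,
                         margin: float = 1.25) -> int:
    """Timeout for one task: slowest FML frame time x frames per task, plus a margin."""
    return math.ceil(slowest_fml_frame_min * frames_per_task * margin)


# Hypothetical numbers: 12 machines for the group, 3 active jobs,
# slowest FML frame took 40 minutes, 2 frames per task.
print(machine_limit(12, 3))          # 4
print(task_timeout_minutes(40, 2))   # 100
```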
---
# Pre-Render Checklist
> [!danger] Never skip this before submitting to the farm
1. **Test locally first** — never submit a scene that hasn't rendered on your machine.
2. **Do an FML render** (First, Middle, Last frame — full resolution). This also tells you how much RAM your shot needs.
   - ==If the shot needs more than 32 GB of RAM, don't send it to a 32 GB machine; it will fail or render very slowly.==
3. **Do a 1/4 resolution full frame range** render.
4. Only then submit the **final full frame range**.
5. **Use network paths only** — assets on your local machine won't exist on render nodes.
   - Always use: `\\server\projects\...` (mapped paths should already be configured)
6. **Don't monopolize the farm** — priority max is 50, use machine limits.
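
Two items on this checklist are easy to script: picking the FML frames and spotting local paths. The sketch below is illustrative only. The helper names are hypothetical, the example paths are invented, and treating any path that starts with `\\` or `//` as a network path is an assumption about how the studio's shares are addressed.

```python
def fml_frames(first_frame: int, last_frame: int) -> list[int]:
    """First, middle and last frame of a range (duplicates removed for very short ranges)."""
    middle = (first_frame + last_frame) // 2
    return sorted({first_frame, middle, last_frame})


def is_network_path(path: str) -> bool:
    """Assumption: UNC-style paths (\\\\server\\... or //server/...) are reachable from render nodes."""
    return path.startswith("\\\\") or path.startswith("//")


def local_paths(paths: list[str]) -> list[str]:
    """Return the paths that would be missing on a render node."""
    return [p for p in paths if not is_network_path(p)]


print(fml_frames(1001, 1100))  # [1001, 1050, 1100]

# Invented example paths, for illustration only.
scene_paths = [r"\\server\projects\demo\asset.usd", r"C:\Users\artist\Desktop\tex.exr"]
print(local_paths(scene_paths))
```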
---
# Priority Reference
| Priority | Use Case | Example |
| -------- | ------------------------ | ------------------------------------ |
| 10 | Debugging | Testing a broken render |
| 20 | Low priority experiments | Look-dev tests, previs, turntable |
| 40 | Nuke render | Compositing jobs |
| **50** | **Normal work** | **Regular render passes** |
> [!warning] Priority cap
> ==Never submit above priority 50.== Higher priority locks the farm for everyone else.
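
If you script your submissions, the cap can be enforced before anything reaches the farm. A minimal sketch, assuming the values from the priority table; the dictionary keys and function name are hypothetical and not part of Prism or Deadline.

```python
MAX_PRIORITY = 50  # studio cap from this guide

# Keys are illustrative labels for the use cases in the priority table.
PRIORITY_BY_USE = {
    "debugging": 10,
    "experiment": 20,
    "nuke": 40,
    "normal": 50,
}


def checked_priority(priority: int) -> int:
    """Return the priority unchanged, or raise if it breaks the studio cap."""
    if not 0 <= priority <= MAX_PRIORITY:
        raise ValueError(f"priority {priority} is outside 0-{MAX_PRIORITY}")
    return priority


print(checked_priority(PRIORITY_BY_USE["nuke"]))  # 40
```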
---
# The Deadline Monitor
## Interface Panels
1. **Jobs Panel** — all submitted jobs, their status and progress
2. **Tasks Panel** — frame breakdown for the selected job
3. **Workers Panel** — status of all render nodes
   - ![[image-131.png]]
4. **Reporting** — logs, error reports, render times
   - ![[image-132.png]]
## Job Color Indicators
| Color | Meaning |
| --------------- | ------------------------- |
| Green | Rendering, progressing |
| Yellow | Waiting in queue |
| Orange / Brown | Job has errors |
| Red | Job failed |
| Blue | Job completed |
## Stopping or Pausing a Job
To stop rendering on a specific machine (yours or another artist's):
1. Open the Deadline Monitor.
2. Search for the machine name in the Workers panel.
3. Right-click the machine → **Disable Worker** → **Kill Worker** and **Kill Worker If Necessary**. The worker is now disabled.

To suspend a job instead, **right-click the job → Pause Job**.

---
# Reading Logs
> [!tip] Logs are your most important debugging tool — always check them first when something goes wrong.
## Task Log
Select a job, then double-click one of its tasks to open its **LOG**. Logs contain render progression, input/output paths, and all errors.
![[image-136.png]]
To diagnose errors, copy-paste the log into an AI assistant (Claude, ChatGPT, Gemini, etc.) for a quick explanation.
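
Long logs are easier to read once the suspicious lines are pulled out. The sketch below is a generic text filter, not a Deadline feature: the keyword list is a guess at what usually matters, and the sample log lines are invented for illustration and do not reproduce real Deadline output.

```python
import re

# Assumed keywords; adjust to whatever your renderer actually prints.
ERROR_PATTERN = re.compile(r"error|exception|traceback|failed", re.IGNORECASE)


def error_lines(log_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) for every line that looks like an error."""
    return [
        (number, line)
        for number, line in enumerate(log_text.splitlines(), start=1)
        if ERROR_PATTERN.search(line)
    ]


# Invented sample text, only to show the filter working.
sample_log = "Loading scene\nRendering frame 1001\nERROR: could not open texture\nDone"
for number, line in error_lines(sample_log):
    print(f"{number}: {line}")
```

Paste the log text saved from the Monitor into `sample_log` (or read it from a file) and start your diagnosis from the first reported line.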
## Worker Log
Double-click a worker to see its logs — useful for spotting machine-level issues, repeated failures, or hardware problems.
![[image-137.png]]

---
# Worker Management
## Worker Status Reference
| Status | Meaning | Action |
| ----------- | ---------------------------- | ------------------------------------------------------------- |
| **Offline** | Not currently available | Pulse-start or reboot the machine manually |
| **Stalled** | Hasn't updated in 5+ minutes | System may auto-restart — otherwise Pulse-start or reboot |
| **Disabled**| Manually disabled | Right-click the worker → **Enable Worker** |
| **Idle** | Online, waiting for tasks | No action needed |
| **Rendering**| Processing tasks | Let it work |
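
The status table boils down to two checks: has a worker been silent for more than five minutes, and which workers need someone to look at them. A small sketch of that logic follows; the worker names are hypothetical, and the status strings are simply the labels from the table, not values read from Deadline.

```python
from datetime import datetime, timedelta

STALL_AFTER = timedelta(minutes=5)  # "Hasn't updated in 5+ minutes"
NEEDS_ATTENTION = {"Offline", "Stalled", "Disabled"}


def is_stalled(last_update: datetime, now: datetime) -> bool:
    """True when the worker has not reported for at least the stall threshold."""
    return now - last_update >= STALL_AFTER


def workers_needing_attention(statuses: dict[str, str]) -> list[str]:
    """Names of workers whose status calls for an action, sorted for readability."""
    return sorted(name for name, status in statuses.items() if status in NEEDS_ATTENTION)


# Hypothetical farm snapshot.
farm = {"node01": "Idle", "node02": "Stalled", "node03": "Rendering", "node04": "Offline"}
print(workers_needing_attention(farm))  # ['node02', 'node04']
```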
At a scheduled time, all machines are switched over to Deadline; verify this in the Monitor. If a machine isn't responding, use one of the actions below.
### Enable a Disabled Worker
Right-click on worker → **Enable Worker**
![[Screenshot 2025-12-09 134515.png]]
### Restart a Worker Remotely
Right-click on worker → **Remote Control** → **Worker Command** → **Restart Worker**
![[Screenshot 2025-12-09 134609.png]]

---
# Pool & Group Management
## Pools — Job Scheduling
**Pools** are priority queues that control which jobs render first. Workers monitor assigned pools and prefer jobs from pools listed earlier in their assignment order.
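
To make the pool-order rule concrete, here is a deliberately simplified model of how a worker might choose its next job: earliest pool in its list first, then highest priority. It is a teaching sketch, not Deadline's actual scheduler (whose ordering is configurable), and the job and pool names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    name: str
    pool: str
    priority: int


def pick_next_job(worker_pools: list[str], queued: list[Job]) -> Optional[Job]:
    """Prefer jobs from pools listed earlier for this worker, then higher priority."""
    eligible = [job for job in queued if job.pool in worker_pools]
    if not eligible:
        return None  # the worker is not assigned to any pool with queued jobs
    return min(eligible, key=lambda job: (worker_pools.index(job.pool), -job.priority))


queue = [Job("a", "projB", 50), Job("b", "projA", 20), Job("c", "projA", 40)]
print(pick_next_job(["projA", "projB"], queue).name)  # c
```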
**When submitting:**
- **Primary Pool** = your project's assigned pool
- **Secondary Pool** = RAM tier (64 GB, 32 GB, or 16 GB) — match this to your shot's RAM requirements
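
Picking the RAM tier is a matter of taking the smallest tier that still fits the peak memory you measured during the FML test. A minimal sketch, assuming the three tiers listed above; the function name is hypothetical and it leaves no headroom for the operating system, so round your measurement up before using it.

```python
RAM_TIERS_GB = (16, 32, 64)  # tiers listed in this guide, smallest first


def ram_tier_for(peak_ram_gb: float) -> int:
    """Return the smallest RAM tier (in GB) that can hold the shot."""
    for tier in RAM_TIERS_GB:
        if peak_ram_gb <= tier:
            return tier
    raise ValueError(f"{peak_ram_gb} GB does not fit any farm tier; talk to your render wrangler")


print(ram_tier_for(24))  # 32
```

Choosing the smallest tier that fits keeps the larger machines free for shots that actually need them.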
### Managing Pools
In the Monitor (requires Super User): **Tools → Manage Pools**
1. Click **New** → enter pool name
2. Select pool → select workers → click **Add**
3. Use **Promote/Demote** to adjust priority order
> [!note] Pool Configuration
> **WIP** — pool structure for Alternatives Productions to be defined.
## Groups — Worker Categorization
**Groups** categorize workers by hardware/software capability. Unlike pools, group order does **not** affect scheduling.
| Group | Purpose | Workers |
| -------------- | ------------------ | ------------------------------- |
| gpu_available | Has GPU cards | RTX-equipped machines |
| houdini_farm | Houdini installed | All render nodes with Houdini |
| high_memory | 64 GB+ RAM | Specialized simulation machines |
| fast_ssd | NVMe storage | Fast I/O machines |
| interactive | For real-time work | Studio playback machines |
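
A group works as a filter: a job sent to a group can only land on workers that belong to it. The sketch below models that with plain sets. The worker names and their memberships are invented, and the function is illustrative rather than anything Deadline provides.

```python
def eligible_workers(job_group: str, worker_groups: dict[str, set[str]]) -> list[str]:
    """Workers that belong to the job's group, sorted by name."""
    return sorted(name for name, groups in worker_groups.items() if job_group in groups)


# Hypothetical membership, reusing the group names from the table.
membership = {
    "node01": {"houdini_farm", "gpu_available"},
    "node02": {"houdini_farm"},
    "node03": {"interactive"},
}
print(eligible_workers("houdini_farm", membership))   # ['node01', 'node02']
print(eligible_workers("gpu_available", membership))  # ['node01']
```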
---
# Common Mistakes to Avoid
> [!bug] Don't do these
- **Forgetting dependencies** — submit upstream jobs first. Let Deadline handle execution order. ==Never manually re-render dependent passes, and never delete the cleanup job that removes USD cache files.==
- **Infinite retries on broken jobs** — disable auto-retry for debugging. Set limited retries (2–3) for production. Fix and resubmit manually.
- **Not monitoring render progress** — check the Monitor regularly. ==Don't submit and forget.== Address errors immediately by contacting your render wrangler.
- **Local file paths** — assets on your local machine don't exist on render nodes. Always use network paths.
- **Monopolizing the farm** — if you see unused machines in another group's pool, communicate in the deadline channel before taking them.
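
The dependency rule is a topological ordering: a job may only start once everything it depends on has finished. The sketch below uses Python's standard `graphlib` to show the order for a typical chain. The job names are hypothetical stand-ins for the USD export, render, and cleanup jobs; on the farm, Deadline resolves this order itself.

```python
from graphlib import TopologicalSorter


def submission_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return job names so that every job appears after the jobs it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


# Each key maps to the set of jobs that must finish before it starts.
dependencies = {
    "usd_export": set(),
    "render": {"usd_export"},
    "cleanup": {"render"},
}
print(submission_order(dependencies))  # ['usd_export', 'render', 'cleanup']
```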
---
# For Render Wranglers Only
> [!note] This section is for render wranglers managing the farm.
**Responsibilities:**
- Launch renders on Deadline (when artists have been asked to submit suspended, you activate their jobs)
- Manage and resolve Deadline errors
- Communicate farm usage and how workers are distributed across groups
- Educate artists about render mistakes and optimization
**Key Deadline Components:**
| Component | Role |
| -------------------- | ----------------------------------------------------------------------- |
| **Deadline Client** | Software on artist workstations and render nodes, talks to Repository |
| **Deadline Launcher**| Background service running on each machine |
| **Deadline Worker** | Application on a physical or virtual render node that executes render tasks |
| **Deadline Monitor** | UI for tracking all farm activity |
| **Pulse** | Maintenance service — auto-restarts stalled workers |
**Farm communication — escalate to the deadline channel when:**
- No one is using the farm and you want to claim extra machines beyond your quota
- You see a group taking too many machines
- You encounter unknown errors
- You see too many stalled workers and need to coordinate reboots