ParallelOS documentation
ParallelOS is a distributed GPU operating system. It pools idle GPUs from independent operators into a single network and runs containerized AI workloads across them, with payments settled in $PLOS on Solana.
There are two ways to use the network, and this documentation is organized around both:
Use compute
Deploy a container or template, run it on network GPUs, and collect your results.
Quickstart →
Provide GPU
Install the Agent on your machine, pair it, and earn $PLOS for completed work.
Run a node →
Quickstart
Run your first job in under a minute. You can deploy from the web console, the CLI, or the SDK — all three produce the same deployment on the network.
CLI:

    # 1 · install & sign in
    curl -fsSL https://get.parallelos.network | sh
    parallelos login                      # opens Phantom to authorize

    # 2 · deploy a template
    parallelos deploy \
      --template comfyui-sdxl \
      --gpu "RTX 4090" --count 2 --region us-east \
      --input ipfs://bafy…/refs.zip

    # 3 · follow logs, then pull results
    parallelos logs dep_7a3f9c --follow
    parallelos pull dep_7a3f9c -o ./out

Python SDK:

    # pip install parallelos
    from parallelos import Client

    client = Client(api_key="plos_sk_live_…")

    job = client.deployments.create(
        image="docker.io/parallelos/comfyui:2.4",
        command="python main.py --batch /inputs",
        gpu="RTX 4090",
        count=2,
        region="us-east",
        inputs="ipfs://bafy…/refs.zip",
        max_budget=40,  # PLOS
    )

    result = job.wait()       # blocks until completed
    print(result.artifacts)   # → [renders_batch.zip, …]

REST API (curl):

    curl https://api.parallelos.network/v1/jobs \
      -H "Authorization: Bearer plos_sk_live_…" \
      -H "Content-Type: application/json" \
      -d '{
        "image": "docker.io/parallelos/comfyui:2.4",
        "command": "python main.py --batch /inputs",
        "resources": { "gpu": "RTX 4090", "count": 2, "region": "us-east" },
        "inputs": "ipfs://bafy…/refs.zip",
        "limits": { "maxBudget": 40 }
      }'
Architecture
ParallelOS has two layers. The Core System is the orchestration plane: it accepts jobs, splits them into work units, schedules them onto suitable nodes, routes data, and assembles results. The Node Layer consists of every machine running the Agent; these nodes execute work units and stream telemetry back. The Core System never runs your code; it only coordinates.
A job is described by a manifest. The Core System validates it, finds nodes that match the requested GPU, region and capacity, and opens a lease on each. Inputs are mounted from object storage, the container runs, and outputs are uploaded back as downloadable artifacts.
| Layer | Responsibility | Runs your code? |
|---|---|---|
| Core System | Validation, decomposition, scheduling, routing, fault tolerance, result assembly, billing | No |
| Node Layer | Pulls the image, executes work units, returns artifacts, streams telemetry | Yes |
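The node-matching step (requested GPU, region and capacity) can be pictured as a simple filter. This is a minimal sketch, not the scheduler's actual logic: the `Node` fields and the `match_nodes` helper are hypothetical names chosen for illustration, and the region is treated as a hard filter for simplicity even though the manifest describes it as a preference.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node record; these field names are assumptions,
    # not the real ParallelOS schema.
    node_id: str
    gpu: str
    region: str
    free_gpus: int


def match_nodes(nodes: List[Node], gpu: str, count: int,
                region: Optional[str] = None) -> List[Node]:
    """Keep nodes with the requested GPU model and enough free GPUs.

    When a region is given, only nodes in that region are kept;
    otherwise any region is accepted.
    """
    return [
        n for n in nodes
        if n.gpu == gpu
        and n.free_gpus >= count
        and (region is None or n.region == region)
    ]


nodes = [
    Node("node-a", "RTX 4090", "us-east", 2),
    Node("node-b", "RTX 4090", "eu-west", 4),
    Node("node-c", "A100 80GB", "us-east", 8),
]
matched = match_nodes(nodes, gpu="RTX 4090", count=2, region="us-east")
print([n.node_id for n in matched])
```

Omitting `region` widens the match to every node with the right GPU and capacity, which mirrors letting the scheduler choose.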
Nodes & the Agent
A node is a machine that has joined the network by running the ParallelOS Agent — a lightweight daemon (or Docker container) that benchmarks the GPU, maintains a secure outbound tunnel to the Core System, pulls assigned work, and reports health. Operators never expose inbound ports; all coordination is over an authenticated reverse tunnel.
What the Agent does
- Registers the device and runs a one-time hardware benchmark (Proof-of-Work / capacity test).
- Reports live telemetry: GPU utilization, temperature, power draw, VRAM, uptime.
- Receives work units, runs them in isolated containers, and returns artifacts.
- Earns $PLOS for verified, completed work — credited to the paired wallet.
Deployments
A deployment is one job submitted to the network. Internally it is split into work units — independent slices that nodes process in parallel. You pay per second of actual GPU time, capped by the budget you set, and only for units that complete.
Execution modes
| Mode | How it works | Best for |
|---|---|---|
| Data-parallel | Independent units run concurrently across many nodes | Batch inference, embeddings, rendering |
| Pipeline | Each node owns one stage; intermediate data flows between stages | Multi-stage & distributed training |
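To make the data-parallel mode concrete, the sketch below chunks a list of input files into independent units. How ParallelOS actually decomposes a job is not specified here, so `split_into_units` and its `unit_size` parameter are purely illustrative assumptions.

```python
from typing import List, Sequence


def split_into_units(items: Sequence[str], unit_size: int) -> List[List[str]]:
    """Chunk inputs into independent slices of at most `unit_size` items."""
    if unit_size < 1:
        raise ValueError("unit_size must be at least 1")
    return [list(items[i:i + unit_size])
            for i in range(0, len(items), unit_size)]


# Ten hypothetical input files split into units of four.
files = [f"img_{i:03d}.png" for i in range(10)]
units = split_into_units(files, unit_size=4)
for index, unit in enumerate(units):
    print(index, unit)
```

Because no unit depends on another, each slice could be handed to a different node and the results assembled afterwards.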
Job lifecycle
Every deployment moves through a deterministic set of states. Funds are escrowed when the lease opens and settled per completed unit; unfinished units are refunded.
| State | Meaning |
|---|---|
| queued | Manifest accepted and validated, awaiting a matching node. |
| provisioning | Lease opened; node is pulling the image and mounting inputs. |
| running | Work units executing; logs and progress stream live. |
| completed | All units done, artifacts uploaded, fees settled (exit code 0). |
| failed | Non-zero exit or unrecoverable error; unfinished units refunded. |
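The lifecycle can be modelled client-side as a small state machine. The state names come from the table; the allowed transitions are an assumption (a linear happy path, with `failed` reachable from any non-terminal state), since the exact transition rules are not documented here.

```python
from enum import Enum


class State(str, Enum):
    QUEUED = "queued"
    PROVISIONING = "provisioning"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transitions: not taken from the ParallelOS specification.
TRANSITIONS = {
    State.QUEUED: {State.PROVISIONING, State.FAILED},
    State.PROVISIONING: {State.RUNNING, State.FAILED},
    State.RUNNING: {State.COMPLETED, State.FAILED},
    State.COMPLETED: set(),
    State.FAILED: set(),
}


def can_transition(current: State, target: State) -> bool:
    """True if `target` is an allowed next state after `current`."""
    return target in TRANSITIONS[current]


def is_terminal(state: State) -> bool:
    """Terminal states have no outgoing transitions."""
    return not TRANSITIONS[state]


print(can_transition(State.QUEUED, State.PROVISIONING))
print(is_terminal(State("completed")))
```

A client can use `is_terminal` to decide when to stop polling and start downloading artifacts.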
Manifest reference
A deployment manifest is the full description of a job — the payload sent to the network. The web builder, CLI and SDK all compile down to this object.
    apiVersion: parallelos/v1
    kind: Deployment
    name: sdxl-product-shots
    image: docker.io/parallelos/comfyui:2.4
    command: "python main.py --batch /inputs"
    resources:
      gpu: RTX 4090
      count: 2
      region: us-east
    limits:
      maxRuntime: 2h
      maxBudget: 40                  # PLOS, hard cap
    inputs: ipfs://bafy…/refs.zip    # mounted at /inputs
    env:
      STEPS: "30"
      CFG: "7.5"
    outputs: /outputs                # uploaded as artifacts
| Field | Required | Description |
|---|---|---|
| name | required | Human-readable deployment name. |
| image | required | Container image reference (Docker Hub, GHCR, or a template image). |
| command | optional | Entrypoint override. Defaults to the image command. |
| resources.gpu | required | GPU model; see GPU types. |
| resources.count | optional | GPUs per node. Default 1. |
| resources.region | optional | Preferred region; omit to let the scheduler choose. |
| limits.maxRuntime | optional | Wall-clock cap, e.g. 2h, 30m. |
| limits.maxBudget | required | Hard spend cap in $PLOS. The job stops before exceeding it. |
| inputs | optional | IPFS URI or HTTPS URL mounted read-only at /inputs. |
| env | optional | Environment variables passed to the container. |
| outputs | optional | Directory collected and uploaded as artifacts. Default /outputs. |
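Before submitting, a manifest can be checked locally against the required fields in the table. The sketch below is a hypothetical pre-flight check, not the Core System's validator; `lookup` and `missing_required` are illustrative helper names, and the manifest is assumed to be already parsed into a Python dict.

```python
from typing import Any, Dict, List

# Fields marked "required" in the manifest reference table.
REQUIRED_FIELDS = ("name", "image", "resources.gpu", "limits.maxBudget")


def lookup(manifest: Dict[str, Any], dotted_path: str) -> Any:
    """Resolve a dotted path such as 'resources.gpu'; None if absent."""
    current: Any = manifest
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def missing_required(manifest: Dict[str, Any]) -> List[str]:
    """List the required fields that the manifest does not set."""
    return [path for path in REQUIRED_FIELDS
            if lookup(manifest, path) is None]


manifest = {
    "name": "sdxl-product-shots",
    "image": "docker.io/parallelos/comfyui:2.4",
    "resources": {"gpu": "RTX 4090", "count": 2, "region": "us-east"},
    "limits": {"maxRuntime": "2h", "maxBudget": 40},
    "inputs": "ipfs://bafy…/refs.zip",
}
print(missing_required(manifest))
print(missing_required({"name": "incomplete"}))
```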
CLI reference
The CLI is the fastest way to script deployments and manage nodes. Install once, then authorize with your wallet.
    parallelos login                          # authorize via Phantom
    parallelos deploy -f deployment.yaml      # deploy from a manifest
    parallelos ps                             # list your deployments
    parallelos logs <id> --follow             # stream logs
    parallelos get <id>                       # status & metrics
    parallelos pull <id> -o ./out             # download artifacts
    parallelos stop <id>                      # cancel & settle

    # provider
    parallelos agent start --token <token>    # bring a device online
    parallelos agent status                   # local agent health
API & SDK
The REST API and the official Python SDK expose the same surface. To authenticate, create an API key under Settings → API keys and send it as a bearer token in the Authorization header.
Create a deployment
| Body field | Type | Description |
|---|---|---|
| image | string | Container image reference. |
| command | string | Entrypoint override. |
| resources | object | { gpu, count, region } |
| inputs | string | IPFS / HTTPS source for /inputs. |
| limits | object | { maxRuntime, maxBudget } |
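For callers not using the SDK, the same request can be assembled with the Python standard library. This sketch reuses the URL and headers from the Quickstart curl example and assumes creation is a POST, as that example's `-d` flag implies; `build_create_request` is a hypothetical helper, and the request is only built here, not sent.

```python
import json
import urllib.request

# Endpoint taken from the Quickstart curl example.
API_URL = "https://api.parallelos.network/v1/jobs"


def build_create_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a create-deployment request."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {
    "image": "docker.io/parallelos/comfyui:2.4",
    "command": "python main.py --batch /inputs",
    "resources": {"gpu": "RTX 4090", "count": 2, "region": "us-east"},
    "inputs": "ipfs://bafy…/refs.zip",
    "limits": {"maxBudget": 40},
}
req = build_create_request("plos_sk_live_…", body)  # placeholder key
print(req.get_method(), req.full_url)
# To submit for real, pass `req` to urllib.request.urlopen with a valid key.
```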
Retrieve a deployment
{
"id": "dep_7a3f9c",
"status": "running",
"progress": 0.58,
"node": "node-870e · us-east",
"cost": 7.12,
"artifacts": []
}
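A client that polls this response typically loops until `status` reaches `completed` or `failed`. The sketch below shows that loop with the fetch step injected as a callable, so it runs without network access; `wait_until_done` is a hypothetical helper (the SDK's own `job.wait()` may work differently), and the canned responses are invented for the demo.

```python
import time
from typing import Callable, Dict

TERMINAL_STATUSES = {"completed", "failed"}


def wait_until_done(fetch: Callable[[], Dict],
                    interval: float = 2.0,
                    sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Call `fetch` repeatedly until the deployment reaches a terminal status."""
    while True:
        deployment = fetch()
        if deployment["status"] in TERMINAL_STATUSES:
            return deployment
        sleep(interval)


# Demo with canned responses instead of real API calls.
responses = iter([
    {"id": "dep_7a3f9c", "status": "running", "progress": 0.58},
    {"id": "dep_7a3f9c", "status": "completed", "progress": 1.0},
])
final = wait_until_done(lambda: next(responses), sleep=lambda _seconds: None)
print(final["status"])
```

In real use, `fetch` would wrap an authenticated GET for the deployment, and a timeout or retry limit would be worth adding.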
Other endpoints
| Method | Path | Description |
|---|---|---|
| GET | /v1/jobs | List deployments. |
| DELETE | /v1/jobs/:id | Cancel a running deployment. |
| GET | /v1/nodes | List your paired provider devices. |
Requirements
To provide GPU you need supported hardware, a recent driver stack and Docker. More VRAM and stable bandwidth mean larger units routed your way — and higher earnings.
| Component | Minimum | Recommended |
|---|---|---|
| GPU | NVIDIA, 8 GB VRAM, CUDA 11+ | RTX 4090 / A100 / H100 |
| Driver | NVIDIA driver ≥ 535 | latest stable |
| Runtime | Docker + NVIDIA Container Toolkit | — |
| Network | 50 Mbps up/down | wired, low jitter |
| OS | Linux, Windows, or macOS | Ubuntu 22.04 |
Install the Agent
Run one command on the machine you want to contribute, then pair it with the token shown in Connect Device.
Shell:

    curl -fsSL https://get.parallelos.network | sh
    parallelos agent start --token plos_xxxxxxxx_xxxxxxxx

Windows:

    # PowerShell (admin)
    irm https://get.parallelos.network/win | iex
    parallelos agent start --token plos_xxxxxxxx_xxxxxxxx

Docker:

    docker run --gpus all -d --restart unless-stopped \
      parallelos/agent:latest \
      --token plos_xxxxxxxx_xxxxxxxx
After it connects, the Agent runs a one-time benchmark. The device first appears as pending, becomes online once verified, and shows as hired while it is running a job.
Rewards & reputation
Operators earn $PLOS for verified, completed work. Payouts accrue per device and can be claimed to your wallet anytime from Earnings.
| Factor | Weight | Effect |
|---|---|---|
| Completed work | High | Primary reward driver — paid per unit. |
| Compute performance | High | Faster, larger GPUs earn more and get bigger units. |
| Uptime & reliability | Medium | Consistent availability raises your score. |
| Reputation | Compounding | Higher reputation lifts scheduling priority over time. |
GPU types & pricing
Compute is metered per second at the rate of the GPU you request. Indicative network rates:
| GPU | VRAM | Rate (PLOS / GPU-hr) | Typical use |
|---|---|---|---|
| RTX 3090 | 24 GB | 0.22 | Inference, speech, vision |
| RTX 4080 | 16 GB | 0.28 | Image gen, light inference |
| RTX 4090 | 24 GB | 0.34 | Image gen, embeddings, rendering |
| L40S | 48 GB | 0.78 | Mid-size serving |
| A100 80GB | 80 GB | 1.10 | LLM serving, fine-tuning |
| H100 80GB | 80 GB | 2.40 | Distributed training |
A 5% network fee applies on top of compute. The rates above are illustrative; actual rates are set by the live market.
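As a worked example of per-second metering, the sketch below estimates a job's cost from the indicative table rates, adds the 5% network fee, and applies the budget cap. It assumes the cap covers compute plus fee, which the documentation implies but does not state outright; `estimate_cost` is a hypothetical helper, not part of the SDK.

```python
from decimal import Decimal

# Indicative rates in PLOS per GPU-hour, copied from the table above.
RATES_PER_GPU_HOUR = {
    "RTX 3090": Decimal("0.22"),
    "RTX 4080": Decimal("0.28"),
    "RTX 4090": Decimal("0.34"),
    "L40S": Decimal("0.78"),
    "A100 80GB": Decimal("1.10"),
    "H100 80GB": Decimal("2.40"),
}
NETWORK_FEE = Decimal("0.05")  # 5% on top of compute


def estimate_cost(gpu: str, gpu_count: int, seconds: int,
                  max_budget: int) -> Decimal:
    """Estimate total PLOS for a job, capped at `max_budget`.

    Assumption: the cap applies to compute plus the network fee.
    """
    compute = RATES_PER_GPU_HOUR[gpu] * gpu_count * seconds / Decimal(3600)
    total = compute * (Decimal(1) + NETWORK_FEE)
    return min(total, Decimal(max_budget))


# Two RTX 4090s for 30 minutes with a 40 PLOS cap.
print(estimate_cost("RTX 4090", gpu_count=2, seconds=1800, max_budget=40))
```

Two RTX 4090s for 30 minutes come to 0.34 PLOS of compute, or 0.357 PLOS with the fee, well under the cap.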
Billing & $PLOS
All compute is paid in $PLOS, the network's SPL token on Solana. Your wallet balance funds deployments; when a lease opens, the maximum budget is escrowed and the actual per-second cost is settled on completion. Unused escrow is returned.
- Per-second billing: you pay only for GPU time consumed, capped by maxBudget.
- Auto-reload: optionally top up automatically when your balance drops below a threshold.
- Invoices: every settlement is recorded and exportable from Billing.
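The escrow flow reduces to simple arithmetic: the maximum budget is locked up front, the actual cost is charged, and the remainder is returned. The sketch below illustrates this with a hypothetical `settle` helper, using the 40 PLOS budget from the manifest example and the 7.12 cost from the sample API response; on-chain mechanics are out of scope.

```python
from decimal import Decimal
from typing import Tuple


def settle(escrowed: Decimal, actual_cost: Decimal) -> Tuple[Decimal, Decimal]:
    """Return (charged, refunded) for a finished deployment.

    The charge never exceeds the escrowed amount.
    """
    charged = min(actual_cost, escrowed)
    refunded = escrowed - charged
    return charged, refunded


charged, refunded = settle(Decimal("40"), Decimal("7.12"))
print(charged, refunded)
```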
$PLOS is also used for node incentives, governance and premium scheduling. See the full tokenomics.
Security
Workloads run in isolated containers on operator hardware. The Core System validates every manifest, monitors node behavior, enforces execution policies and keeps an audit log of leases and settlements.
- Isolation — each job runs in its own container; operators get no access to your data beyond the mounted inputs.
- Authenticated tunnels — nodes connect outbound only; no inbound ports are exposed.
- Verification — proof-of-work and uptime checks gate reward eligibility and catch misbehaving nodes.
- Encrypt sensitive inputs client-side when handling private data.
Limitations
ParallelOS parallelizes work into independent units; it does not merge VRAM across machines into one address space. Pick workloads accordingly.
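Since VRAM is not pooled, each work unit has to fit on a single GPU. One way to pick a GPU type is to take the cheapest one whose VRAM covers the unit's memory need. The sketch below uses the indicative VRAM sizes and rates from the pricing table; `cheapest_gpu_that_fits` is a hypothetical helper, and real sizing should leave headroom beyond the raw figure.

```python
from typing import Optional

# (GPU model, VRAM in GB, indicative PLOS per GPU-hour) from the pricing table.
GPUS = [
    ("RTX 3090", 24, 0.22),
    ("RTX 4080", 16, 0.28),
    ("RTX 4090", 24, 0.34),
    ("L40S", 48, 0.78),
    ("A100 80GB", 80, 1.10),
    ("H100 80GB", 80, 2.40),
]


def cheapest_gpu_that_fits(required_vram_gb: float) -> Optional[str]:
    """Return the lowest-rate GPU with enough VRAM, or None if none fits."""
    candidates = [gpu for gpu in GPUS if gpu[1] >= required_vram_gb]
    if not candidates:
        return None
    return min(candidates, key=lambda gpu: gpu[2])[0]


print(cheapest_gpu_that_fits(20))
print(cheapest_gpu_that_fits(40))
print(cheapest_gpu_that_fits(100))
```

A `None` result means the unit is too large for any single listed GPU and the workload would need to be restructured into smaller units.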
FAQ
Does ParallelOS combine GPU memory across nodes?
No. It splits work into independent units rather than pooling VRAM. Ultra-low-latency shared-memory workloads aren't a fit.
What exactly do I pay for?
Per second of GPU time at the requested GPU's rate, plus a 5% network fee, capped by maxBudget. Units that fail are refunded.
Which wallet is supported?
Phantom on Solana. Connect it in the console to deploy, pay and receive provider earnings in $PLOS.
Can one machine both provide and consume?
Yes — the same wallet can run the Agent on devices and deploy jobs; they're tracked independently.
Glossary
| Term | Definition |
|---|---|
| Core System | The orchestration plane that schedules and assembles; it never runs your code. |
| Node | A machine running the Agent that executes work units. |
| Agent | The daemon/container that connects a device to the network. |
| Deployment | One submitted job, described by a manifest. |
| Work unit | An independent slice of a deployment run by a single node. |
| Manifest | The full job spec: image, command, resources, limits, inputs. |
| Artifact | An output file produced by a completed deployment. |
| $PLOS | The network's SPL token used for fees, rewards and governance. |