
ParallelOS documentation

ParallelOS is a distributed GPU operating system. It pools idle GPUs from independent operators into a single network and runs containerized AI workloads across them, with payment settled in $PLOS on Solana.

There are two ways to use the network, and this documentation is organized around both:

Use compute

Deploy a container or template, run it on network GPUs, and collect your results.

Quickstart →

Provide GPU

Install the Agent on your machine, pair it, and earn $PLOS for completed work.

Run a node →

The web console is the control plane — you connect a Solana wallet to deploy, monitor and manage. Actual compute runs in containers on operator hardware via the ParallelOS Agent; nothing trains or infers in your browser.

Quickstart

Run your first job in under a minute. You can deploy from the web console, the CLI, or the SDK — all three produce the same deployment on the network.

deploy a job

# 1 · install & sign in
curl -fsSL https://get.parallelos.network | sh
parallelos login            # opens Phantom to authorize

# 2 · deploy a template
parallelos deploy \
  --template comfyui-sdxl \
  --gpu "RTX 4090" --count 2 --region us-east \
  --input ipfs://bafy…/refs.zip

# 3 · follow logs, then pull results
parallelos logs dep_7a3f9c --follow
parallelos pull dep_7a3f9c -o ./out

# pip install parallelos
from parallelos import Client

client = Client(api_key="plos_sk_live_…")

job = client.deployments.create(
    image="docker.io/parallelos/comfyui:2.4",
    command="python main.py --batch /inputs",
    gpu="RTX 4090", count=2, region="us-east",
    inputs="ipfs://bafy…/refs.zip",
    max_budget=40,            # PLOS
)
result = job.wait()              # blocks until completed
print(result.artifacts)        # → [renders_batch.zip, …]

curl https://api.parallelos.network/v1/jobs \
  -H "Authorization: Bearer plos_sk_live_…" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "docker.io/parallelos/comfyui:2.4",
    "command": "python main.py --batch /inputs",
    "resources": { "gpu": "RTX 4090", "count": 2, "region": "us-east" },
    "inputs": "ipfs://bafy…/refs.zip",
    "limits": { "maxBudget": 40 }
  }'

No CLI? Do the same thing visually in the web console — pick a template in the Marketplace and hit Deploy.

Architecture

ParallelOS has two layers. The Core System is the orchestration plane: it accepts jobs, splits them into work units, schedules them onto suitable nodes, routes data, and assembles results. The Node Layer consists of every machine running the Agent; these machines execute work units and stream telemetry back. The Core System never runs your code; it only coordinates.

A job is described by a manifest. The Core System validates it, finds nodes that match the requested GPU, region and capacity, and opens a lease on each. Inputs are mounted from object storage, the container runs, and outputs are uploaded back as downloadable artifacts.

Layer	Responsibility	Runs your code?
Core System	Validation, decomposition, scheduling, routing, fault tolerance, result assembly, billing	No
Node Layer	Pulls the image, executes work units, returns artifacts, streams telemetry	Yes
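
The matching step described above can be sketched roughly as a filter over candidate nodes. This is only an illustration: the `Node` record, its field names and the `match_nodes` helper are hypothetical, and the sketch treats region as a hard filter even though the manifest describes it as a preference. How the real scheduler ranks or leases nodes is not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node record; field names are assumptions for illustration.
    node_id: str
    gpu: str
    region: str
    free_gpus: int


def match_nodes(nodes: List[Node], gpu: str, count: int,
                region: Optional[str] = None) -> List[Node]:
    """Return nodes that offer the requested GPU model, have at least
    `count` free GPUs, and (if a region is given) sit in that region."""
    return [
        n for n in nodes
        if n.gpu == gpu
        and n.free_gpus >= count
        and (region is None or n.region == region)
    ]


nodes = [
    Node("node-a", "RTX 4090", "us-east", 2),
    Node("node-b", "RTX 4090", "eu-west", 4),
    Node("node-c", "A100 80GB", "us-east", 8),
]
matched = match_nodes(nodes, gpu="RTX 4090", count=2, region="us-east")
print([n.node_id for n in matched])
```

Leaving `region` unset widens the candidate set to every region, which mirrors omitting the field to let the scheduler choose.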

Nodes & the Agent

A node is a machine that has joined the network by running the ParallelOS Agent — a lightweight daemon (or Docker container) that benchmarks the GPU, maintains a secure outbound tunnel to the Core System, pulls assigned work, and reports health. Operators never expose inbound ports; all coordination is over an authenticated reverse tunnel.

What the Agent does

Registers the device and runs a one-time hardware benchmark (Proof-of-Work / capacity test).
Reports live telemetry: GPU utilization, temperature, power draw, VRAM, uptime.
Receives work units, runs them in isolated containers, and returns artifacts.
Earns $PLOS for verified, completed work — credited to the paired wallet.

One wallet can pair many devices. Each appears separately in Devices with its own telemetry, reputation and earnings.

Deployments

A deployment is one job submitted to the network. Internally it is split into work units — independent slices that nodes process in parallel. You pay per second of actual GPU time, capped by the budget you set, and only for units that complete.

Execution modes

Mode	How it works	Best for
Data-parallel	Independent units run concurrently across many nodes	Batch inference, embeddings, rendering
Pipeline	Each node owns one stage; intermediate data flows between stages	Multi-stage & distributed training
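
To make the data-parallel mode concrete, here is a minimal sketch of slicing a batch into independent units. The `split_into_units` helper and the fixed `unit_size` are assumptions for illustration; the network's actual decomposition strategy is not documented here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_units(items: Sequence[T], unit_size: int) -> List[List[T]]:
    """Slice `items` into consecutive, independent chunks of at most
    `unit_size` elements each."""
    if unit_size < 1:
        raise ValueError("unit_size must be at least 1")
    return [list(items[i:i + unit_size])
            for i in range(0, len(items), unit_size)]


prompts = [f"prompt-{i}" for i in range(7)]
units = split_into_units(prompts, unit_size=3)
print(len(units), units[-1])
```

Because no unit depends on another, each chunk could be handed to a different node and the results concatenated afterwards.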

Job lifecycle

Every deployment moves through a deterministic set of states. Funds are escrowed when the lease opens and settled per completed unit; unfinished units are refunded.

queued → provisioning → running → completed or failed

State	Meaning
`queued`	Manifest accepted and validated, awaiting a matching node.
`provisioning`	Lease opened; node is pulling the image and mounting inputs.
`running`	Work units executing; logs and progress stream live.
`completed`	All units done, artifacts uploaded, fees settled (exit code 0).
`failed`	Non-zero exit or unrecoverable error; unfinished units refunded.
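
Client code that tracks a deployment may find it convenient to model these states explicitly. The sketch below is one way to do that. The transition map is an assumption inferred from the state descriptions above (for example, that any non-terminal state can move to `failed`), not a published specification.

```python
from enum import Enum


class State(str, Enum):
    QUEUED = "queued"
    PROVISIONING = "provisioning"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition map, inferred from the state table.
TRANSITIONS = {
    State.QUEUED: {State.PROVISIONING, State.FAILED},
    State.PROVISIONING: {State.RUNNING, State.FAILED},
    State.RUNNING: {State.COMPLETED, State.FAILED},
    State.COMPLETED: set(),
    State.FAILED: set(),
}


def can_transition(current: State, nxt: State) -> bool:
    """True if moving from `current` to `nxt` is allowed by the map."""
    return nxt in TRANSITIONS[current]


def is_terminal(state: State) -> bool:
    """A state is terminal when it has no outgoing transitions."""
    return not TRANSITIONS[state]


print(can_transition(State.QUEUED, State.PROVISIONING))
print(is_terminal(State("completed")))
```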

Manifest reference

A deployment manifest is the full description of a job — the payload sent to the network. The web builder, CLI and SDK all compile down to this object.

deployment.yaml

apiVersion: parallelos/v1
kind: Deployment
name: sdxl-product-shots
image: docker.io/parallelos/comfyui:2.4
command: "python main.py --batch /inputs"
resources:
  gpu: RTX 4090
  count: 2
  region: us-east
limits:
  maxRuntime: 2h
  maxBudget: 40          # PLOS, hard cap
inputs: ipfs://bafy…/refs.zip   # mounted at /inputs
env:
  STEPS: "30"
  CFG: "7.5"
outputs: /outputs            # uploaded as artifacts

Field	Required	Description
`name`	required	Human-readable deployment name.
`image`	required	Container image reference (Docker Hub, GHCR, or a template image).
`command`	optional	Entrypoint override. Defaults to the image command.
`resources.gpu`	required	GPU model — see GPU types.
`resources.count`	optional	GPUs per node. Default `1`.
`resources.region`	optional	Preferred region; omit to let the scheduler choose.
`limits.maxRuntime`	optional	Wall-clock cap, e.g. `2h`, `30m`.
`limits.maxBudget`	required	Hard spend cap in $PLOS. The job stops before exceeding it.
`inputs`	optional	IPFS URI or HTTPS URL mounted read-only at `/inputs`.
`env`	optional	Environment variables passed to the container.
`outputs`	optional	Directory collected and uploaded as artifacts. Default `/outputs`.
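
Before submitting, a manifest can be checked locally against the required fields in the table. The following is a minimal sketch, assuming the manifest has already been loaded into a Python dict. The helper names (`get_path`, `parse_duration_seconds`, `validate`) are hypothetical, and only the `h` and `m` duration suffixes shown above are handled; the Core System's own validation rules may differ.

```python
import re
from typing import Any, Dict, List

# Required fields per the table above, written as dotted paths.
REQUIRED = ["name", "image", "resources.gpu", "limits.maxBudget"]


def get_path(manifest: Dict[str, Any], dotted: str) -> Any:
    """Walk a dotted path such as 'limits.maxBudget'; None if absent."""
    current: Any = manifest
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def parse_duration_seconds(text: str) -> int:
    """Parse values such as '2h' or '30m' into seconds."""
    match = re.fullmatch(r"(\d+)([hm])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * (3600 if unit == "h" else 60)


def validate(manifest: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = [f"missing required field: {p}" for p in REQUIRED
                if get_path(manifest, p) is None]
    max_runtime = get_path(manifest, "limits.maxRuntime")
    if max_runtime is not None:
        try:
            parse_duration_seconds(str(max_runtime))
        except ValueError as exc:
            problems.append(str(exc))
    return problems


manifest = {
    "name": "sdxl-product-shots",
    "image": "docker.io/parallelos/comfyui:2.4",
    "resources": {"gpu": "RTX 4090", "count": 2, "region": "us-east"},
    "limits": {"maxRuntime": "2h", "maxBudget": 40},
}
print(validate(manifest))
```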

CLI reference

The CLI is the fastest way to script deployments and manage nodes. Install once, then authorize with your wallet.

common commands

parallelos login                       # authorize via Phantom
parallelos deploy -f deployment.yaml     # deploy from a manifest
parallelos ps                          # list your deployments
parallelos logs <id> --follow          # stream logs
parallelos get <id>                    # status & metrics
parallelos pull <id> -o ./out          # download artifacts
parallelos stop <id>                   # cancel & settle

# provider
parallelos agent start --token <token>  # bring a device online
parallelos agent status                 # local agent health

API & SDK

The REST API and the official Python SDK expose the same surface. Create an API key in Settings → API keys and send it as a bearer token in the Authorization header.

Create a deployment

POST /v1/jobs

Body field	Type	Description
`image`	string	Container image reference.
`command`	string	Entrypoint override.
`resources`	object	`{ gpu, count, region }`
`inputs`	string	IPFS / HTTPS source for `/inputs`.
`limits`	object	`{ maxRuntime, maxBudget }`
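
For readers calling the API without the SDK, the request can be assembled with the Python standard library. This sketch builds the request but does not send it. The URL and headers follow the curl example in the Quickstart; `build_create_request` is a hypothetical helper name and `YOUR_API_KEY` is a placeholder.

```python
import json
import urllib.request

API_URL = "https://api.parallelos.network/v1/jobs"


def build_create_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for creating a deployment."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {
    "image": "docker.io/parallelos/comfyui:2.4",
    "command": "python main.py --batch /inputs",
    "resources": {"gpu": "RTX 4090", "count": 2, "region": "us-east"},
    "limits": {"maxBudget": 40},
}
request = build_create_request("YOUR_API_KEY", body)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```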

Retrieve a deployment

GET /v1/jobs/:id

response

{
  "id": "dep_7a3f9c",
  "status": "running",
  "progress": 0.58,
  "node": "node-870e · us-east",
  "cost": 7.12,
  "artifacts": []
}
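
A client that polls this endpoint usually loops until `status` reaches a terminal value. The sketch below shows that loop with the HTTP call abstracted behind a `fetch` callable, so it runs offline against canned responses. `wait_for_deployment`, the polling interval and the sample response values are assumptions for illustration, not SDK behavior.

```python
import time
from typing import Callable, Dict

TERMINAL = {"completed", "failed"}


def wait_for_deployment(fetch: Callable[[], Dict], interval: float = 2.0,
                        max_polls: int = 100) -> Dict:
    """Call `fetch` until the returned status is terminal, then return
    that final response. `fetch` is any zero-argument callable that
    returns the parsed JSON body of GET /v1/jobs/:id."""
    for _ in range(max_polls):
        response = fetch()
        if response["status"] in TERMINAL:
            return response
        time.sleep(interval)
    raise TimeoutError("deployment did not reach a terminal state")


# Canned responses standing in for real HTTP calls.
fake_responses = iter([
    {"id": "dep_7a3f9c", "status": "running", "progress": 0.58},
    {"id": "dep_7a3f9c", "status": "completed", "progress": 1.0},
])
final = wait_for_deployment(lambda: next(fake_responses), interval=0)
print(final["status"])
```

Capping the number of polls keeps a stuck deployment from blocking the caller indefinitely.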

Other endpoints

Method	Path	Description
GET	`/v1/jobs`	List deployments.
DELETE	`/v1/jobs/:id`	Cancel a running deployment.
GET	`/v1/nodes`	List your paired provider devices.

Requirements

To provide GPU you need supported hardware, a recent driver stack and Docker. More VRAM and stable bandwidth mean larger units routed your way — and higher earnings.

Component	Minimum	Recommended
GPU	NVIDIA, 8 GB VRAM, CUDA 11+	RTX 4090 / A100 / H100
Driver	NVIDIA driver ≥ 535	latest stable
Runtime	Docker + NVIDIA Container Toolkit	—
Network	50 Mbps up/down	wired, low jitter
OS	Linux, Windows, or macOS	Ubuntu 22.04

Install the Agent

Run one command on the machine you want to contribute, then pair it with the token shown in Connect Device.

install & start the agent

curl -fsSL https://get.parallelos.network | sh
parallelos agent start --token plos_xxxxxxxx_xxxxxxxx

# PowerShell (admin)
irm https://get.parallelos.network/win | iex
parallelos agent start --token plos_xxxxxxxx_xxxxxxxx

# Docker
docker run --gpus all -d --restart unless-stopped \
  parallelos/agent:latest \
  --token plos_xxxxxxxx_xxxxxxxx

After it connects, the Agent runs a one-time benchmark. The device first appears as pending, becomes online once verified, and shows as hired while it is running a job.

Keep the GPU dedicated while online. The network periodically verifies utilization, and unauthorized use of a hired GPU can get the device blocked.

Rewards & reputation

Operators earn $PLOS for verified, completed work. Payouts accrue per device and can be claimed to your wallet anytime from Earnings.

Factor	Weight	Effect
Completed work	High	Primary reward driver — paid per unit.
Compute performance	High	Faster, larger GPUs earn more and get bigger units.
Uptime & reliability	Medium	Consistent availability raises your score.
Reputation	Compounding	Higher reputation lifts scheduling priority over time.

GPU types & pricing

Compute is metered per second at the rate of the GPU you request. Indicative network rates:

GPU	VRAM	Rate (PLOS / GPU-hr)	Typical use
RTX 3090	24 GB	0.22	Inference, speech, vision
RTX 4080	16 GB	0.28	Image gen, light inference
RTX 4090	24 GB	0.34	Image gen, embeddings, rendering
L40S	48 GB	0.78	Mid-size serving
A100 80GB	80 GB	1.10	LLM serving, fine-tuning
H100 80GB	80 GB	2.40	Distributed training

A 5% network fee applies on top of compute. Rates are illustrative and set by the live market.
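
As a worked example of per-second metering, the sketch below estimates a job's cost from the indicative rates in the table. It assumes the 5% fee is applied to the compute total and that every requested GPU is billed for the full duration; `estimate_cost` is a hypothetical helper, and live market rates will differ.

```python
# Indicative rates from the table above, in PLOS per GPU-hour.
RATES = {
    "RTX 3090": 0.22,
    "RTX 4080": 0.28,
    "RTX 4090": 0.34,
    "L40S": 0.78,
    "A100 80GB": 1.10,
    "H100 80GB": 2.40,
}
NETWORK_FEE = 0.05  # 5% on top of compute


def estimate_cost(gpu: str, count: int, seconds: int) -> float:
    """Hourly rate / 3600 * seconds * GPU count, plus the network fee."""
    compute = RATES[gpu] / 3600 * seconds * count
    return round(compute * (1 + NETWORK_FEE), 6)


# Two RTX 4090s for 30 minutes.
print(estimate_cost("RTX 4090", count=2, seconds=1800))
```

Under these assumptions, two RTX 4090s for 30 minutes come to 0.34 PLOS of compute, or about 0.357 PLOS with the fee.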

Billing & $PLOS

All compute is paid in $PLOS, the network's SPL token on Solana. Your wallet balance funds deployments; when a lease opens, the maximum budget is escrowed and the actual per-second cost is settled on completion. Unused escrow is returned.

Per-second billing — you pay only for GPU time consumed, capped by maxBudget.
Auto-reload — optionally top up automatically when your balance drops below a threshold.
Invoices — every settlement is recorded and exportable from Billing.
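
The escrow flow described above reduces to simple arithmetic. This is a minimal sketch under the stated model (full budget escrowed, metered cost charged up to the cap, remainder returned). The `settle` function and its return shape are hypothetical, and on-chain settlement details are out of scope.

```python
def settle(max_budget: float, actual_cost: float) -> dict:
    """Escrow the full budget, charge the metered cost up to the cap,
    and return whatever is left."""
    if max_budget < 0 or actual_cost < 0:
        raise ValueError("amounts must be non-negative")
    charged = min(actual_cost, max_budget)
    return {
        "escrowed": max_budget,
        "charged": charged,
        "refunded": max_budget - charged,
    }


print(settle(max_budget=40, actual_cost=7.12))
```

If the metered cost would exceed the cap, the charge stops at `max_budget` and nothing is refunded.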

$PLOS is also used for node incentives, governance and premium scheduling. See the full tokenomics.

Security

Workloads run in isolated containers on operator hardware. The Core System validates every manifest, monitors node behavior, enforces execution policies and keeps an audit log of leases and settlements.

Isolation — each job runs in its own container; operators get no access to your data beyond the mounted inputs.
Authenticated tunnels — nodes connect outbound only; no inbound ports are exposed.
Verification — proof-of-work and uptime checks gate reward eligibility and catch misbehaving nodes.
Encrypt sensitive inputs client-side when handling private data.

Limitations

ParallelOS parallelizes work into independent units; it does not merge VRAM across machines into one address space. Pick workloads accordingly.

Not a fit: single models that need a unified VRAM pool larger than one node, or tasks requiring continuous high-frequency synchronization.
Great fit: anything embarrassingly parallel or cleanly pipelined, such as batch inference, rendering, embeddings and data-parallel training.
Performance also depends on inter-node latency and bandwidth.

FAQ

Does ParallelOS combine GPU memory across nodes?

No. It splits work into independent units rather than pooling VRAM. Ultra-low-latency shared-memory workloads aren't a fit.

What exactly do I pay for?

Per second of GPU time at the requested GPU's rate, plus a 5% network fee, capped by maxBudget. Units that fail are refunded.

Which wallet is supported?

Phantom on Solana. Connect it in the console to deploy, pay and receive provider earnings in $PLOS.

Can one machine both provide and consume?

Yes. The same wallet can pair devices that run the Agent and also deploy jobs; the two roles are tracked independently.

Glossary

Core System	The orchestration plane that schedules and assembles — it never runs your code.
Node	A machine running the Agent that executes work units.
Agent	The daemon/container that connects a device to the network.
Deployment	One submitted job, described by a manifest.
Work unit	An independent slice of a deployment run by a single node.
Manifest	The full job spec — image, command, resources, limits, inputs.
Artifact	An output file produced by a completed deployment.
$PLOS	The network's SPL token used for fees, rewards and governance.
