ParallelOS documentation

ParallelOS is a distributed GPU operating system. It pools idle GPUs from independent operators into a single network, and runs containerized AI workloads across them — settled in $PLOS on Solana.

There are two ways to use the network: deploying workloads onto it and providing GPUs to it. This documentation is organized around both.

The web console is the control plane: you connect a Solana wallet to deploy, monitor and manage workloads. Actual compute runs in containers on operator hardware via the ParallelOS Agent; nothing trains or runs inference in your browser.

Quickstart

Run your first job in under a minute. You can deploy from the web console, the CLI, or the SDK — all three produce the same deployment on the network.

deploy a job
# 1 · install & sign in
curl -fsSL https://get.parallelos.network | sh
parallelos login            # opens Phantom to authorize

# 2 · deploy a template
parallelos deploy \
  --template comfyui-sdxl \
  --gpu "RTX 4090" --count 2 --region us-east \
  --input ipfs://bafy…/refs.zip

# 3 · follow logs, then pull results
parallelos logs dep_7a3f9c --follow
parallelos pull dep_7a3f9c -o ./out

Python SDK
# pip install parallelos
from parallelos import Client

client = Client(api_key="plos_sk_live_…")

job = client.deployments.create(
    image="docker.io/parallelos/comfyui:2.4",
    command="python main.py --batch /inputs",
    gpu="RTX 4090", count=2, region="us-east",
    inputs="ipfs://bafy…/refs.zip",
    max_budget=40,            # PLOS
)
result = job.wait()              # blocks until completed
print(result.artifacts)        # → [renders_batch.zip, …]

REST API
curl https://api.parallelos.network/v1/jobs \
  -H "Authorization: Bearer plos_sk_live_…" \
  -H "Content-Type: application/json" \
  -d '{
    "image": "docker.io/parallelos/comfyui:2.4",
    "command": "python main.py --batch /inputs",
    "resources": { "gpu": "RTX 4090", "count": 2, "region": "us-east" },
    "inputs": "ipfs://bafy…/refs.zip",
    "limits": { "maxBudget": 40 }
  }'

No CLI? Do the same thing visually in the web console: pick a template in the Marketplace and hit Deploy.

Architecture

ParallelOS has two layers. The Core System is the orchestration plane: it accepts jobs, splits them into work units, schedules them onto suitable nodes, routes data, and assembles results. The Node Layer consists of every machine running the Agent; these machines execute work units and stream telemetry back. The Core System never runs your code; it only coordinates.

[Diagram: clients (Web, CLI, SDK, API); Core System (scheduler, router, fault tolerance, assembler); GPU nodes (us-east, eu, ap); object storage (inputs and artifacts, IPFS)]

A job is described by a manifest. The Core System validates it, finds nodes that match the requested GPU, region and capacity, and opens a lease on each. Inputs are mounted from object storage, the container runs, and outputs are uploaded back as downloadable artifacts.

| Layer | Responsibility | Runs your code? |
|---|---|---|
| Core System | Validation, decomposition, scheduling, routing, fault tolerance, result assembly, billing | No |
| Node Layer | Pulls the image, executes work units, returns artifacts, streams telemetry | Yes |
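
The matching step described above (GPU, region and capacity) can be pictured as a simple filter. This is only a sketch under stated assumptions: the `Node` record, its field names and the `match_nodes` helper are hypothetical, and region is treated as a hard filter here even though the manifest describes it as a preference. It is not the scheduler's actual logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node record; these field names are illustrative only.
    node_id: str
    gpu: str
    region: str
    free_gpus: int


def match_nodes(nodes: List[Node], gpu: str, count: int,
                region: Optional[str] = None) -> List[Node]:
    """Return nodes with the requested GPU model, enough free GPUs,
    and (if a region is given) the requested region."""
    return [
        n for n in nodes
        if n.gpu == gpu
        and n.free_gpus >= count
        and (region is None or n.region == region)
    ]


nodes = [
    Node("node-a", "RTX 4090", "us-east", 2),
    Node("node-b", "RTX 4090", "eu", 4),
    Node("node-c", "A100 80GB", "us-east", 8),
]
matches = match_nodes(nodes, gpu="RTX 4090", count=2, region="us-east")
print([n.node_id for n in matches])
```

Omitting `region` widens the candidate set, which mirrors the manifest option of letting the scheduler choose.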

Nodes & the Agent

A node is a machine that has joined the network by running the ParallelOS Agent — a lightweight daemon (or Docker container) that benchmarks the GPU, maintains a secure outbound tunnel to the Core System, pulls assigned work, and reports health. Operators never expose inbound ports; all coordination is over an authenticated reverse tunnel.

What the Agent does

  • Registers the device and runs a one-time hardware benchmark (Proof-of-Work / capacity test).
  • Reports live telemetry: GPU utilization, temperature, power draw, VRAM, uptime.
  • Receives work units, runs them in isolated containers, and returns artifacts.
  • Earns $PLOS for verified, completed work — credited to the paired wallet.

One wallet can pair many devices. Each appears separately in Devices with its own telemetry, reputation and earnings.

Deployments

A deployment is one job submitted to the network. Internally it is split into work units — independent slices that nodes process in parallel. You pay per second of actual GPU time, capped by the budget you set, and only for units that complete.

Execution modes

| Mode | How it works | Best for |
|---|---|---|
| Data-parallel | Independent units run concurrently across many nodes | Batch inference, embeddings, rendering |
| Pipeline | Each node owns one stage; intermediate data flows between stages | Multi-stage & distributed training |
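
To make the data-parallel mode concrete, the sketch below splits a batch of inputs into independent slices that could each be handled by a different node. The `split_into_units` helper and the fixed `unit_size` are assumptions for illustration; how the Core System actually sizes work units is not specified here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_units(items: Sequence[T], unit_size: int) -> List[List[T]]:
    """Split a batch into consecutive, non-overlapping slices.
    Each slice is independent, so slices can run concurrently."""
    if unit_size < 1:
        raise ValueError("unit_size must be at least 1")
    return [list(items[i:i + unit_size])
            for i in range(0, len(items), unit_size)]


prompts = [f"prompt-{i}" for i in range(7)]
units = split_into_units(prompts, unit_size=3)
for index, unit in enumerate(units):
    print(index, unit)
```

Because no unit depends on another, a failed unit can be retried or refunded without affecting the rest of the batch.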

Job lifecycle

Every deployment moves through a deterministic set of states. Funds are escrowed when the lease opens and settled per completed unit; unfinished units are refunded.

queued → provisioning → running → completed / failed

| State | Meaning |
|---|---|
| queued | Manifest accepted and validated, awaiting a matching node. |
| provisioning | Lease opened; node is pulling the image and mounting inputs. |
| running | Work units executing; logs and progress stream live. |
| completed | All units done, artifacts uploaded, fees settled (exit code 0). |
| failed | Non-zero exit or unrecoverable error; unfinished units refunded. |
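
The lifecycle can be modelled client-side as a small state machine, which is handy for validating status values or deciding when to stop polling. The transition map below is an assumption inferred from the state descriptions (in particular, that any non-terminal state may move to failed); it is not an official specification.

```python
from enum import Enum


class State(str, Enum):
    QUEUED = "queued"
    PROVISIONING = "provisioning"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transitions; the documentation lists the states but not the edges.
TRANSITIONS = {
    State.QUEUED: {State.PROVISIONING, State.FAILED},
    State.PROVISIONING: {State.RUNNING, State.FAILED},
    State.RUNNING: {State.COMPLETED, State.FAILED},
    State.COMPLETED: set(),
    State.FAILED: set(),
}


def can_transition(current: State, nxt: State) -> bool:
    """True if moving from `current` to `nxt` is allowed by the assumed map."""
    return nxt in TRANSITIONS[current]


def is_terminal(state: State) -> bool:
    """Terminal states have no outgoing transitions."""
    return not TRANSITIONS[state]


print(can_transition(State.QUEUED, State.PROVISIONING))
print(is_terminal(State("failed")))
```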

Manifest reference

A deployment manifest is the full description of a job — the payload sent to the network. The web builder, CLI and SDK all compile down to this object.

deployment.yaml
apiVersion: parallelos/v1
kind: Deployment
name: sdxl-product-shots
image: docker.io/parallelos/comfyui:2.4
command: "python main.py --batch /inputs"
resources:
  gpu: RTX 4090
  count: 2
  region: us-east
limits:
  maxRuntime: 2h
  maxBudget: 40          # PLOS, hard cap
inputs: ipfs://bafy…/refs.zip   # mounted at /inputs
env:
  STEPS: "30"
  CFG: "7.5"
outputs: /outputs            # uploaded as artifacts

| Field | Required | Description |
|---|---|---|
| name | required | Human-readable deployment name. |
| image | required | Container image reference (Docker Hub, GHCR, or a template image). |
| command | optional | Entrypoint override. Defaults to the image command. |
| resources.gpu | required | GPU model; see GPU types. |
| resources.count | optional | GPUs per node. Default 1. |
| resources.region | optional | Preferred region; omit to let the scheduler choose. |
| limits.maxRuntime | optional | Wall-clock cap, e.g. 2h, 30m. |
| limits.maxBudget | required | Hard spend cap in $PLOS. The job stops before exceeding it. |
| inputs | optional | IPFS URI or HTTPS URL mounted read-only at /inputs. |
| env | optional | Environment variables passed to the container. |
| outputs | optional | Directory collected and uploaded as artifacts. Default /outputs. |
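
A local pre-flight check can catch a missing required field before a manifest is submitted. The sketch below works on a manifest already loaded into a dictionary; the helper names (`get_path`, `parse_duration`, `validate`) are hypothetical, the duration parser only accepts the two forms shown in the table (`2h`, `30m`), and the Core System's own validation may be stricter.

```python
import re
from typing import Any, Dict, List, Optional

# Required fields as listed in the field table (dotted paths).
REQUIRED_FIELDS = ["name", "image", "resources.gpu", "limits.maxBudget"]


def get_path(manifest: Dict[str, Any], dotted: str) -> Optional[Any]:
    """Look up a dotted path such as 'resources.gpu'; None if absent."""
    node: Any = manifest
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def parse_duration(text: str) -> int:
    """Convert '2h' or '30m' to seconds (assumed grammar: digits + h or m)."""
    match = re.fullmatch(r"(\d+)([hm])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * (3600 if unit == "h" else 60)


def validate(manifest: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    errors = [f"missing required field: {path}"
              for path in REQUIRED_FIELDS if get_path(manifest, path) is None]
    runtime = get_path(manifest, "limits.maxRuntime")
    if runtime is not None:
        try:
            parse_duration(str(runtime))
        except ValueError as exc:
            errors.append(str(exc))
    return errors


manifest = {
    "name": "sdxl-product-shots",
    "image": "docker.io/parallelos/comfyui:2.4",
    "resources": {"gpu": "RTX 4090", "count": 2, "region": "us-east"},
    "limits": {"maxRuntime": "2h", "maxBudget": 40},
}
print(validate(manifest))
```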

CLI reference

The CLI is the fastest way to script deployments and manage nodes. Install once, then authorize with your wallet.

common commands
parallelos login                       # authorize via Phantom
parallelos deploy -f deployment.yaml     # deploy from a manifest
parallelos ps                          # list your deployments
parallelos logs <id> --follow          # stream logs
parallelos get <id>                    # status & metrics
parallelos pull <id> -o ./out          # download artifacts
parallelos stop <id>                   # cancel & settle

# provider
parallelos agent start --token <token>  # bring a device online
parallelos agent status                 # local agent health

API & SDK

The REST API and the official Python SDK expose the same surface. Authenticate by sending an API key, created under Settings → API keys, as a bearer token.

Create a deployment

POST /v1/jobs

| Body field | Type | Description |
|---|---|---|
| image | string | Container image reference. |
| command | string | Entrypoint override. |
| resources | object | { gpu, count, region } |
| inputs | string | IPFS / HTTPS source for /inputs. |
| limits | object | { maxRuntime, maxBudget } |

Retrieve a deployment

GET /v1/jobs/:id
response
{
  "id": "dep_7a3f9c",
  "status": "running",
  "progress": 0.58,
  "node": "node-870e · us-east",
  "cost": 7.12,
  "artifacts": []
}
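
A client that uses the REST API directly typically polls this endpoint until the deployment reaches a terminal state. The sketch below keeps the HTTP call out of the loop by taking a `fetch` callable, so it can be exercised with canned responses. The `wait_for_deployment` name, the polling interval and the shape of the final response (including `progress` of 1.0 and artifacts as file names) are assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict

# Terminal states taken from the job lifecycle table.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_deployment(fetch: Callable[[], Dict[str, Any]],
                        interval: float = 5.0,
                        timeout: float = 3600.0) -> Dict[str, Any]:
    """Call `fetch` repeatedly until the status is terminal or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch()
        if job["status"] in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("deployment did not finish before the timeout")
        time.sleep(interval)


# Canned responses standing in for successive GET /v1/jobs/:id calls.
responses = iter([
    {"id": "dep_7a3f9c", "status": "running", "progress": 0.58, "artifacts": []},
    {"id": "dep_7a3f9c", "status": "completed", "progress": 1.0,
     "artifacts": ["renders_batch.zip"]},
])
final = wait_for_deployment(lambda: next(responses), interval=0.0)
print(final["status"], final["artifacts"])
```

In real use, `fetch` would wrap an authenticated request to the endpoint above; the SDK's `job.wait()` covers the same need without hand-written polling.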

Other endpoints

| Method | Path | Description |
|---|---|---|
| GET | /v1/jobs | List deployments. |
| DELETE | /v1/jobs/:id | Cancel a running deployment. |
| GET | /v1/nodes | List your paired provider devices. |
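
These endpoints can be called with nothing more than the Python standard library. The sketch below only builds the requests and does not send them. The base URL and bearer header follow the curl example in the Quickstart; the `build_request` helper and the `YOUR_API_KEY` placeholder are illustrative, and cancellation is assumed to use the HTTP DELETE method.

```python
import urllib.request

BASE_URL = "https://api.parallelos.network/v1"


def build_request(method: str, path: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated request for the given path."""
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {api_key}"},
    )


list_jobs = build_request("GET", "/jobs", "YOUR_API_KEY")
cancel_job = build_request("DELETE", "/jobs/dep_7a3f9c", "YOUR_API_KEY")
list_nodes = build_request("GET", "/nodes", "YOUR_API_KEY")

# Passing one of these to urllib.request.urlopen(...) would send it.
for req in (list_jobs, cancel_job, list_nodes):
    print(req.get_method(), req.full_url)
```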

Requirements

To provide GPU capacity, you need supported hardware, a recent driver stack and Docker. More VRAM and stable bandwidth mean larger units are routed your way, which means higher earnings.

| Component | Minimum | Recommended |
|---|---|---|
| GPU | NVIDIA, 8 GB VRAM, CUDA 11+ | RTX 4090 / A100 / H100 |
| Driver | NVIDIA driver ≥ 535 | latest stable |
| Runtime | Docker + NVIDIA Container Toolkit | same |
| Network | 50 Mbps up/down | wired, low jitter |
| OS | Linux, Windows, or macOS | Ubuntu 22.04 |

Install the Agent

Run one command on the machine you want to contribute, then pair it with the token shown in Connect Device.

install & start the agent
# Shell installer
curl -fsSL https://get.parallelos.network | sh
parallelos agent start --token plos_xxxxxxxx_xxxxxxxx

# PowerShell (admin)
irm https://get.parallelos.network/win | iex
parallelos agent start --token plos_xxxxxxxx_xxxxxxxx

# Docker
docker run --gpus all -d --restart unless-stopped \
  parallelos/agent:latest \
  --token plos_xxxxxxxx_xxxxxxxx

After it connects, the Agent runs a one-time benchmark. The device first appears as pending, becomes online once verified, and shows as hired while it is running a job.

Keep the GPU dedicated while online. The network periodically verifies utilization; unauthorized use of a hired GPU can block the device.

Rewards & reputation

Operators earn $PLOS for verified, completed work. Payouts accrue per device and can be claimed to your wallet anytime from Earnings.

| Factor | Weight | Effect |
|---|---|---|
| Completed work | High | Primary reward driver; paid per unit. |
| Compute performance | High | Faster, larger GPUs earn more and get bigger units. |
| Uptime & reliability | Medium | Consistent availability raises your score. |
| Reputation | Compounding | Higher reputation lifts scheduling priority over time. |

GPU types & pricing

Compute is metered per second at the rate of the GPU you request. Indicative network rates:

| GPU | VRAM | Rate (PLOS / GPU-hr) | Typical use |
|---|---|---|---|
| RTX 3090 | 24 GB | 0.22 | Inference, speech, vision |
| RTX 4080 | 16 GB | 0.28 | Image gen, light inference |
| RTX 4090 | 24 GB | 0.34 | Image gen, embeddings, rendering |
| L40S | 48 GB | 0.78 | Mid-size serving |
| A100 80GB | 80 GB | 1.10 | LLM serving, fine-tuning |
| H100 80GB | 80 GB | 2.40 | Distributed training |

A 5% network fee applies on top of compute. The rates above are illustrative; actual rates are set by the live market.
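
A rough cost estimate follows directly from the table: hourly rate times GPU count times runtime, plus the 5% fee, capped by the budget. The sketch below uses the indicative rates listed above. It assumes `gpu_count` is the total number of GPUs billed and that the budget cap applies to the total including the fee, so treat the result as an approximation rather than the settlement formula.

```python
from typing import Optional

# Indicative rates from the table, in PLOS per GPU-hour.
RATES = {
    "RTX 3090": 0.22,
    "RTX 4080": 0.28,
    "RTX 4090": 0.34,
    "L40S": 0.78,
    "A100 80GB": 1.10,
    "H100 80GB": 2.40,
}
NETWORK_FEE = 0.05  # 5% on top of compute


def estimate_cost(gpu: str, gpu_count: int, seconds: float,
                  max_budget: Optional[float] = None) -> float:
    """Estimate total PLOS for a run, metered per second and capped by budget."""
    compute = RATES[gpu] * gpu_count * seconds / 3600
    total = compute * (1 + NETWORK_FEE)
    if max_budget is not None:
        total = min(total, max_budget)
    return round(total, 4)


# Two RTX 4090s for two hours, with a 40 PLOS cap.
print(estimate_cost("RTX 4090", gpu_count=2, seconds=2 * 3600, max_budget=40))
```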

Billing & $PLOS

All compute is paid in $PLOS, the network's SPL token on Solana. Your wallet balance funds deployments; when a lease opens, the maximum budget is escrowed and the actual per-second cost is settled on completion. Unused escrow is returned.

  • Per-second billing — you pay only for GPU time consumed, capped by maxBudget.
  • Auto-reload — optionally top up automatically when your balance drops below a threshold.
  • Invoices — every settlement is recorded and exportable from Billing.
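
The escrow flow reduces to simple arithmetic: the maximum budget is held when the lease opens, the actual cost is charged on settlement, and the remainder is returned. The `Settlement` record and `settle` function below are hypothetical names used to illustrate that arithmetic; on-chain settlement details are not covered here.

```python
from dataclasses import dataclass


@dataclass
class Settlement:
    charged: float   # PLOS kept to pay for consumed GPU time
    refunded: float  # unused escrow returned to the wallet


def settle(escrowed: float, actual_cost: float) -> Settlement:
    """Charge the actual cost (never more than the escrow) and refund the rest."""
    if escrowed < 0 or actual_cost < 0:
        raise ValueError("amounts must be non-negative")
    charged = min(actual_cost, escrowed)
    return Settlement(charged=charged, refunded=escrowed - charged)


# A 40 PLOS budget with 7.12 PLOS of actual usage.
result = settle(escrowed=40.0, actual_cost=7.12)
print(result)
```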

$PLOS is also used for node incentives, governance and premium scheduling. See the full tokenomics.

Security

Workloads run in isolated containers on operator hardware. The Core System validates every manifest, monitors node behavior, enforces execution policies and keeps an audit log of leases and settlements.

  • Isolation — each job runs in its own container; operators get no access to your data beyond the mounted inputs.
  • Authenticated tunnels — nodes connect outbound only; no inbound ports are exposed.
  • Verification — proof-of-work and uptime checks gate reward eligibility and catch misbehaving nodes.
  • Encrypt sensitive inputs client-side when handling private data.

Limitations

ParallelOS parallelizes work into independent units; it does not merge VRAM across machines into one address space. Pick workloads accordingly.

Not a fit: single models that need a unified VRAM pool larger than one node, or tasks requiring continuous high-frequency synchronization.

Great fit: anything embarrassingly parallel or cleanly pipelined, such as batch inference, rendering, embeddings and data-parallel training.

Performance also depends on inter-node latency and bandwidth.

FAQ

Does ParallelOS combine GPU memory across nodes?

No. It splits work into independent units rather than pooling VRAM. Ultra-low-latency shared-memory workloads aren't a fit.

What exactly do I pay for?

Per second of GPU time at the requested GPU's rate, plus a 5% network fee, capped by maxBudget. Units that fail are refunded.

Which wallet is supported?

Phantom on Solana. Connect it in the console to deploy, pay and receive provider earnings in $PLOS.

Can one machine both provide and consume?

Yes — the same wallet can run the Agent on devices and deploy jobs; they're tracked independently.

Glossary

Core System: The orchestration plane that schedules and assembles; it never runs your code.
Node: A machine running the Agent that executes work units.
Agent: The daemon/container that connects a device to the network.
Deployment: One submitted job, described by a manifest.
Work unit: An independent slice of a deployment run by a single node.
Manifest: The full job spec (image, command, resources, limits, inputs).
Artifact: An output file produced by a completed deployment.
$PLOS: The network's SPL token used for fees, rewards and governance.