AI models without loading.

Run AI models faster with the fastest short-term memory layer designed for AI models and tensors.

Swap models, manage the KV cache, reduce cold starts, and update automatically, all on your own infrastructure.

Contact us at: info@outerport.com

Backed by Y Combinator

No more cold starts.

Serve customers faster with parallel loading and page locking for optimized model loading.

outerport.load("outerport/llama3.1-8b")
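To make "parallel loading" concrete, here is a minimal sketch of reading weight shards concurrently into host memory. It is not Outerport's implementation: the shard layout and the names `read_shard`, `load_shards_parallel`, and `make_demo_shards` are hypothetical, and page locking is not shown (in PyTorch, for example, `Tensor.pin_memory()` would place a loaded tensor in page-locked host memory).

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def read_shard(path: str) -> bytes:
    """Read one weight shard from disk into host memory."""
    with open(path, "rb") as f:
        return f.read()


def load_shards_parallel(paths: list[str], max_workers: int = 4) -> list[bytes]:
    """Read all shards concurrently; results keep the order of `paths`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_shard, paths))


def make_demo_shards(directory: str, count: int = 4, size: int = 1024) -> list[str]:
    """Write small dummy shard files so the sketch is runnable (hypothetical data)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"shard-{i}.bin")
        with open(path, "wb") as f:
            f.write(bytes([i]) * size)
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        shards = load_shards_parallel(make_demo_shards(tmp))
        print([len(s) for s in shards])
```

Threads suit this step because file reads release Python's global interpreter lock, so several shards can be in flight at once.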

a diagram to show various models being loaded into GPUs

Hot-swap models from the cache.

A Rust-based daemon process, built for resilience, runs on your compute nodes to keep models up to date and ready to go. See a live demo.
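The idea of swapping models out of a cache can be illustrated with a small least-recently-used cache held in host memory. This is only a sketch under assumptions: the `ModelCache` class, its `loader` callback, and the eviction policy are hypothetical and say nothing about how the Outerport daemon actually manages memory.

```python
from collections import OrderedDict
from typing import Callable


class ModelCache:
    """Keep up to `capacity` loaded models, evicting the least recently used."""

    def __init__(self, capacity: int, loader: Callable[[str], object]):
        self.capacity = capacity
        self.loader = loader  # slow path: load from disk or remote storage
        self._models = OrderedDict()

    def get(self, name: str) -> object:
        if name in self._models:
            self._models.move_to_end(name)  # cache hit: mark as most recently used
            return self._models[name]
        model = self.loader(name)  # cache miss: pay the load cost once
        self._models[name] = model
        if len(self._models) > self.capacity:
            self._models.popitem(last=False)  # evict the least recently used model
        return model

    def cached(self) -> list[str]:
        """Names of cached models, least recently used first."""
        return list(self._models)
```

A repeated `get` for a cached model skips the loader entirely, which is where the swap-time savings would come from.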

Accelerate workflows.

Accelerate multi-model workflows, whether on agent builders or ComfyUI, through fast model swapping.

a diagram to show how workflows can be accelerated
a diagram to show how model saving and checkpointing can be done asynchronously

Asynchronous checkpointing.

Enable true asynchronous model and tensor movement tasks through the daemon.
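As a rough illustration of asynchronous checkpointing, the sketch below snapshots the state and writes it on a background thread so the caller can keep working. It assumes a JSON-serializable state and a hypothetical `checkpoint_async` helper; it does not reflect the daemon's actual mechanism.

```python
import copy
import json
import os
import tempfile
import threading


def checkpoint_async(state: dict, path: str) -> threading.Thread:
    """Snapshot `state`, then write it to `path` on a background thread."""
    snapshot = copy.deepcopy(state)  # copy first so later updates don't leak into the file

    def write() -> None:
        tmp_path = path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(snapshot, f)
        os.replace(tmp_path, path)  # atomic rename: readers never see a partial file

    thread = threading.Thread(target=write)
    thread.start()
    return thread


if __name__ == "__main__":
    state = {"step": 100, "weights": [0.1, 0.2, 0.3]}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ckpt.json")
        writer = checkpoint_async(state, path)
        state["step"] = 101  # work continues while the write is in flight
        writer.join()
        with open(path) as f:
            print(json.load(f)["step"])
```

Taking the snapshot before starting the thread is the key design choice: the saved checkpoint reflects the state at the moment of the call, regardless of what happens afterwards.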

Deploy updates without downtime.

Roll out new model weights without container updates or downtime.

a diagram to show deployment of AI models from a remote repository
a graph to show autoscaling behavior

Save 40% on GPU costs.

Deploy multi-model services to simplify load balancing and amortize auto-scaling costs, maximizing GPU utilization and reducing spend.

Bring your own cloud & infra.

Self-hosted software that can be deployed anywhere, on any cloud or on-premises.

an image to show the registry interacting with various cloud providers
sample image of the Outerport model catalog

Upload models in one place.

Backed by your S3 buckets, with clear-cut access policies, audit logs, and the ability to test quantization and compression policies, all in one place.

Get access immediately.

Built by a team that built AI infrastructure and GPU systems at NVIDIA, Meta, and LinkedIn.

Contact us at: info@outerport.com

© 2024 Genban, Inc.
