Oso Fallback runs a read-only copy of your authorization service in your infrastructure. Your applications continue authorizing requests even when Oso Cloud becomes unreachable.

Why Use Oso Fallback?

Authorization sits on your application’s critical path. Oso Cloud delivers 99.99% uptime across multiple regions and availability zones. But network partitions happen. Oso Fallback ensures your applications keep running when they can’t reach Oso Cloud.

How Oso Fallback Works

Oso Fallback nodes are read-only instances that sync your policies and facts from Oso Cloud every 30 minutes, so they may lag behind Oso Cloud within the sync interval. Use fallback nodes only as a last-resort backup: continue making authorization calls to Oso Cloud during normal operations.
For horizontal scaling, deploy multiple fallback nodes behind a load balancer. Each node operates independently and syncs on its own schedule, so configure the load balancer with session affinity; different nodes' facts may be up to one minute out of sync with each other.

System Requirements

Minimum specs per node:
  • CPU: 1 core
  • Memory: 256MB RAM
  • Disk: 3x your environment size

Check Your Environment Size

Find your environment size on the Fallback status page. Allocate 3x this size to leave headroom for growth and for atomic updates, which temporarily keep two copies of your facts on disk.

Storage Types

  • Ephemeral storage: Node downloads facts on startup. Fails to start if Oso Cloud is unreachable.
  • Persistent storage: Node starts from existing facts on disk. Survives Oso Cloud outages and restarts.
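
For example, a persistent-storage variant of the Quick Start command below might look like the following sketch, assuming OSO_DIRECTORY points at the mounted volume (the host path and mount point are illustrative):

# Persistent storage: keep downloaded facts on a host volume so the node
# can restart from disk even if Oso Cloud is unreachable.
docker run \
  --env-file=.env \
  -e OSO_DIRECTORY=/data/oso \
  -p 8080:8080 \
  --volume "${PWD}/oso-data:/data/oso" \
  public.ecr.aws/osohq/fallback:latest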

Quick Start

Deploy Oso Fallback in three steps:
  1. Pull the Docker image
    docker pull public.ecr.aws/osohq/fallback:latest
    
  2. Set required environment variables
    export OSO_API_TOKEN="your_read_only_api_key"
    export OSO_ENVIRONMENT="your_environment_id"  
    export OSO_TENANT="your_tenant_id"
    
  3. Run the container
    docker run -p 8080:8080 --env-file=.env public.ecr.aws/osohq/fallback:latest
    
Your fallback node runs on http://localhost:8080.
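
To confirm the node is ready, you can hit its health check endpoint (described under Monitoring below); it returns 200 OK once valid policies and facts are loaded:

# Returns 200 OK once the node has loaded valid policies and facts
curl -i http://localhost:8080/healthcheck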

Update Your Client SDK

Configure your Oso Cloud client to use the fallback automatically:
import { Oso } from "oso-cloud";

const oso = new Oso("https://api.osohq.com", apiKey, {
  fallbackUrl: "http://localhost:8080"
});
The client tries Oso Cloud first. On specific connection failures or HTTP 5xx errors, it automatically switches to your fallback node.
Fallback triggers (JavaScript SDK error codes):
  • ECONNREFUSED - Connection refused (server unreachable)
  • ETIMEDOUT - Connection timed out
  • AbortError - Request aborted
  • HTTP 5xx errors
Other language SDKs handle equivalent connection failures and timeouts.
DNS failures do not trigger fallback. This is intentional: client-side DNS misconfigurations (such as a typo in the Oso Cloud URL) produce NXDOMAIN errors that do not represent an actual Oso Cloud outage. Falling back on DNS errors would mask these configuration mistakes, silently sending requests to the fallback node when the primary URL is simply incorrect.
In the JavaScript SDK, a DNS failure appears as ENOTFOUND. Other SDKs have equivalent DNS resolution errors that are likewise excluded from fallback triggers. If you need to handle DNS failures in your application, catch these errors and implement custom fallback logic, as in the sketch below.
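
A minimal sketch of that custom logic, assuming the fallback node accepts the same client calls and that the DNS failure surfaces as an ENOTFOUND code on the error (or its cause); apiKey, actor, action, and resource are placeholders:

import { Oso } from "oso-cloud";

const primary = new Oso("https://api.osohq.com", apiKey, {
  fallbackUrl: "http://localhost:8080",
});
// A second client that talks only to the fallback node, used when DNS resolution fails.
const fallbackOnly = new Oso("http://localhost:8080", apiKey);

async function authorizeWithDnsFallback(actor, action, resource) {
  try {
    return await primary.authorize(actor, action, resource);
  } catch (err) {
    // ENOTFOUND indicates a DNS resolution failure, which the SDK deliberately
    // excludes from its automatic fallback triggers.
    const code = err?.code ?? err?.cause?.code;
    if (code === "ENOTFOUND") {
      return await fallbackOnly.authorize(actor, action, resource);
    }
    throw err;
  }
}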
Using fetchBuilder with fallbackUrl: if you configure a custom fetchBuilder for connection pooling or other HTTP customizations, it applies only to requests to Oso Cloud, not to fallback nodes. Fallback requests use the default HTTP client, which still enables connection pooling and keep-alive.

Offline Snapshots for Guaranteed Startup

Export snapshots to ensure new nodes start even when Oso Cloud is down:
# Export snapshot
docker run \
  --env-file=.env \
  --volume "${PWD}:/export" \
  public.ecr.aws/osohq/fallback:latest \
  --export /export/oso-data.snapshot

# Import snapshot on new node
docker run \
  --env-file=.env \
  -p 127.0.0.1:8080:8080 \
  --volume "${PWD}:/import" \
  public.ecr.aws/osohq/fallback:latest \
  --import /import/oso-data.snapshot
Nodes always try to download the latest facts first; the --import option provides a fallback when Oso Cloud is unreachable.
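
To have a snapshot ready before an outage, you could export one on a schedule. A hypothetical crontab entry (paths are placeholders):

# Refresh the offline snapshot nightly at 02:00 so new nodes can bootstrap
# even if Oso Cloud is unreachable at startup.
0 2 * * * docker run --rm --env-file=/opt/oso/.env --volume /opt/oso:/export \
  public.ecr.aws/osohq/fallback:latest --export /export/oso-data.snapshot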

Configuration Reference

Required Environment Variables

Variable          Description                              Where to Find
OSO_API_TOKEN     Read-only API key for your environment   Create new API key
OSO_ENVIRONMENT   Your environment ID                      Environments page
OSO_TENANT        Your tenant ID                           Settings page

Optional Environment Variables

Variable                    Default       Options            Description
OSO_CLIENT_ID               Random UUID   a-zA-Z0-9.-        Unique identifier per node; must be unique per restart
OSO_PORT                    8080          0-65535            Port for the fallback node
OSO_DIRECTORY               ./.oso        Valid path         Facts storage directory
OSO_LOG_LEVEL               error         info, warn, error  Log verbosity
OSO_DISABLE_TELEMETRY       false         true, false        Disable usage data collection
OSO_DISABLE_ERROR_LOGGING   false         true, false        Disable error log collection
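
For example, a complete .env file passed via --env-file might look like this (values are placeholders; the optional settings are arbitrary choices):

# Required
OSO_API_TOKEN=your_read_only_api_key
OSO_ENVIRONMENT=your_environment_id
OSO_TENANT=your_tenant_id

# Optional
OSO_PORT=8080
OSO_LOG_LEVEL=warn
OSO_DISABLE_TELEMETRY=true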

Load Balancer Configuration

Health check endpoint: /healthcheck. It returns 200 OK only after the node has loaded valid policies and facts, so configure a health check grace period to cover initial startup.
Session affinity is recommended; it prevents inconsistent responses from nodes in different sync states.
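
For example, a startup script might wait for the health check to pass before registering the node with your load balancer (the registration step itself is deployment-specific):

# Wait until the node has loaded valid policies and facts before it serves traffic.
until curl -fsS http://localhost:8080/healthcheck > /dev/null; do
  echo "waiting for fallback node to become healthy..."
  sleep 5
done
# ...register the node with your load balancer here...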

Scaling Guidelines

CPU Scaling

Scale horizontally by adding nodes or vertically by adding cores per node.
  • Monitor: Average CPU across all nodes
  • Scale trigger: >80% CPU utilization
  • Consider: Startup time for downloading facts when autoscaling

Memory Sizing

Fallback nodes use minimal baseline memory plus in-memory caches for query performance.
  • Monitor: Maximum memory utilization across all nodes
  • Scale trigger: >80% memory utilization
  • Cache size depends on: Query diversity in your application

Disk Sizing

Nodes download new facts while keeping current facts active. This temporarily doubles disk usage.
  • Monitor: Disk utilization across all nodes
  • Scale trigger: >65% disk utilization
  • Recommendation: Use low-latency, high-throughput volumes

Monitoring

Health Checks

  • Endpoint: /healthcheck
  • Healthy response: 200 OK
  • Ready when: Valid policies and facts loaded

Prometheus Metrics

  • Endpoint: /metrics

Key Metrics

oso_fallback_snapshot_age_ms - Age of your policies and facts in milliseconds
oso_fallback_snapshot_age_ms 624611
oso_fallback_http_requests - HTTP response latency histogram
oso_fallback_http_requests_bucket{path="/api/authorize",status_code="200",le="0.005"} 119
oso_fallback_http_requests_bucket{path="/api/authorize",status_code="200",le="0.01"} 119
# ... additional buckets
Target sync frequency: Every 30 minutes
Monitor for: increasing snapshot age, which indicates sync issues
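
To spot-check freshness on a node directly, you can scrape its metrics endpoint; the age is reported in milliseconds:

# A value well above 1,800,000 ms (30 minutes) suggests the node is failing to sync.
curl -s http://localhost:8080/metrics | grep oso_fallback_snapshot_age_ms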

Testing Your Setup

Verify fallback activation with chaos testing:
  1. Block Oso Cloud access using one of these methods:
    • Firewall rules - Block outbound connections to cloud.osohq.com
    • Proxy configuration - Route Oso Cloud traffic to a non-responsive endpoint
    • iptables/network policies - Drop packets to Oso Cloud IP addresses
    DNS-based blocking does not trigger fallback. Methods that return NXDOMAIN for cloud.osohq.com cause DNS resolution errors (e.g., ENOTFOUND in JavaScript), which the SDK does not treat as fallback-eligible failures; only connection-level failures (e.g., ECONNREFUSED or ETIMEDOUT in JavaScript) and request aborts trigger automatic fallback. This is intentional: DNS errors often indicate client-side misconfigurations (like a typo in the URL) rather than an actual Oso Cloud outage. A firewall-based sketch appears at the end of this section.
  2. Make authorization requests through your application
  3. Check the oso_fallback_http_requests metric to confirm fallback usage
Test regularly to ensure fallback readiness when you need it.
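
As a concrete sketch of the firewall approach from step 1, assuming a Linux host with iptables and dig available (adapt to your own network setup):

# Resolve the current Oso Cloud IPs and drop outbound packets to them (forces ETIMEDOUT).
OSO_IPS=$(dig +short cloud.osohq.com)
for ip in $OSO_IPS; do sudo iptables -A OUTPUT -d "$ip" -j DROP; done

# ...make authorization requests through your application and watch the
# oso_fallback_http_requests metric on your fallback nodes...

# Remove the rules when the test is done.
for ip in $OSO_IPS; do sudo iptables -D OUTPUT -d "$ip" -j DROP; done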