Introducing the Inferable Architecture

Inferable provides a robust, distributed architecture for building reliable AI-powered applications. Traditional frameworks require model calls and function execution to run in the same place; Inferable takes a fundamentally different approach, separating these concerns while maintaining persistent state throughout the application lifecycle.

Architectural Overview

At a high level, Inferable’s architecture consists of three primary components:

  1. Control Plane: The central nervous system of Inferable, orchestrating all component interactions
  2. Workflows: Stateful orchestration units that coordinate complex, multi-step processes
  3. Client SDKs: Language-specific libraries that connect your functions to the Inferable ecosystem

These three components underpin all of Inferable’s LLM-native abstractions, including structured outputs, human-in-the-loop, agents, and tool use.
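
To make the split concrete, here is a sketch of the SDK side in TypeScript. It follows the general shape of Inferable’s Node.js SDK, but the exact surface (`client.default.register`, `client.default.start`) should be treated as an assumption and checked against the SDK docs.

```typescript
// Sketch of registering a local function with the client SDK.
// Method names are approximate; verify against the SDK documentation.
import { Inferable } from "inferable";
import { z } from "zod";

const client = new Inferable({ apiSecret: process.env.INFERABLE_API_SECRET });

// The function body runs here, inside your own environment; the Control
// Plane only ever sees its name, schema, and results.
client.default.register({
  name: "getOrderStatus",
  func: async ({ orderId }: { orderId: string }) => ({ orderId, status: "shipped" }),
  schema: { input: z.object({ orderId: z.string() }) },
});

// Start polling the Control Plane for work (more on this below).
client.default.start();
```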

The Control Plane: State Persistence at Its Core

The Control Plane is the foundation of Inferable’s architecture, providing critical services including:

Persistent State Management

Unlike many AI orchestration frameworks, Inferable’s Control Plane includes a built-in persistence layer that automatically maintains the complete state of all running workflows.
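
Conceptually, the persistence layer behaves like the sketch below: the workflow’s state is written to durable storage after every step, and a restarted process resumes from the last saved step. This is an illustrative model only; the `Store` interface and step runner are hypothetical, not Inferable’s internal code.

```typescript
// Illustrative model of automatic checkpointing (hypothetical types).
interface WorkflowState {
  runId: string;
  step: number;
  data: Record<string, unknown>;
}

interface Store {
  load(runId: string): Promise<WorkflowState | undefined>;
  save(state: WorkflowState): Promise<void>; // assumed to be an ACID write
}

type Step = (data: Record<string, unknown>) => Promise<Record<string, unknown>>;

async function runWorkflow(store: Store, runId: string, steps: Step[]): Promise<void> {
  // Resume from the last checkpoint if one exists, otherwise start fresh.
  const state = (await store.load(runId)) ?? { runId, step: 0, data: {} };

  while (state.step < steps.length) {
    state.data = await steps[state.step](state.data);
    state.step += 1;
    await store.save(state); // checkpoint: a crash here loses no completed work
  }
}
```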

Key benefits of this design:

  • Automatic Checkpointing: Every state change is persisted without developer intervention
  • Transactional Integrity: State updates use ACID transactions to prevent corruption
  • Zero-Loss Recovery: Workflows resume from exactly where they left off, even after system failures
  • Temporal Durability: State can be maintained for any duration, whether minutes, hours, or months

Distributed Job Queue

The Control Plane incorporates a distributed job queue that routes function calls to the appropriate execution environments.
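
As a rough sketch of the routing logic (the types and method names here are hypothetical): each job carries a priority and the capabilities it requires, and a worker only claims jobs it can actually run.

```typescript
// Hypothetical sketch of priority-ordered, capability-aware dispatch.
interface Job {
  id: string;
  fn: string;          // function the job should invoke
  priority: number;    // higher runs sooner
  requires: string[];  // capabilities the executing worker must have
}

interface Worker {
  id: string;
  capabilities: Set<string>;
}

class JobQueue {
  private jobs: Job[] = [];

  enqueue(job: Job): void {
    this.jobs.push(job);
    this.jobs.sort((a, b) => b.priority - a.priority); // prioritization
  }

  // A worker claims the highest-priority job it is capable of running.
  claim(worker: Worker): Job | undefined {
    const i = this.jobs.findIndex((j) =>
      j.requires.every((c) => worker.capabilities.has(c))
    );
    return i === -1 ? undefined : this.jobs.splice(i, 1)[0];
  }
}
```

Backpressure falls out of this model naturally: a job that no worker claims simply waits in the queue until capacity frees up.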

This job queue provides:

  • Fair Distribution: Balances workloads across available worker pools
  • Prioritization: Handles urgent tasks ahead of lower-priority ones
  • Backpressure Management: Gracefully handles capacity constraints
  • Routing Intelligence: Directs jobs to workers with appropriate capabilities

Long Polling: No Network Configuration Required

One of Inferable’s most distinctive architectural features is its use of long polling for communication between the Control Plane and function execution environments.
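
The mechanics look roughly like the loop below: the worker makes an outbound HTTPS request that the Control Plane holds open until work arrives, then posts the result back. The endpoint paths and parameters here are hypothetical; the point is that every connection originates from inside your network.

```typescript
// Hypothetical long-polling worker loop. Every request is outbound HTTPS,
// so no inbound ports, DNS records, or tunnels are required.
const CONTROL_PLANE = "https://control-plane.example.com";

async function pollForJobs(workerId: string): Promise<void> {
  while (true) {
    // The server holds this request open (here, up to 30 seconds) until a
    // job is available, then responds immediately.
    const res = await fetch(
      `${CONTROL_PLANE}/jobs/poll?workerId=${workerId}&waitSeconds=30`
    );

    if (res.status === 204) continue; // poll timed out with no work; retry

    const job: { id: string; input: unknown } = await res.json();
    const result = await executeLocally(job.input); // runs inside your network

    await fetch(`${CONTROL_PLANE}/jobs/${job.id}/result`, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ result }),
    });
  }
}

// Stand-in for whatever function this worker actually hosts.
async function executeLocally(input: unknown): Promise<unknown> {
  return input;
}
```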

This approach offers significant advantages:

  • Zero Inbound Ports: Your execution environments need no open inbound ports
  • Firewall-Friendly: Works with standard outbound HTTPS (port 443)
  • NAT Traversal: Functions behind NAT can still participate
  • No VPN Required: Connect from any network with internet access
  • No DNS Configuration: No need to set up DNS records or TLS certificates

For organizations with strict security requirements, this means:

  • Functions can run in private subnets with no internet-accessible endpoints
  • No need to expose internal services to the public internet
  • No complex network configuration or tunneling required
  • Works seamlessly with existing security boundaries

Workflows: Orchestrating Complex Processes

Workflows are the coordination layer in Inferable, responsible for:

  • Maintaining execution context across multiple steps
  • Managing transitions between different processing stages
  • Handling retries, timeouts, and error conditions (see the sketch after this list)
  • Coordinating parallel execution paths when appropriate
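
Retries and timeouts are worth a closer look. The wrapper below is an illustrative, hand-rolled version of what the workflow layer does for you; in Inferable these policies are handled by the platform rather than written per step.

```typescript
// Illustrative retry-with-timeout wrapper for a single workflow step.
async function withRetries<T>(
  step: () => Promise<T>,
  attempts = 3,
  timeoutMs = 10_000
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      // Race the step against a timer so a hung call cannot stall the run.
      // (The timer is not cleared here, for brevity.)
      return await Promise.race([
        step(),
        new Promise<never>((_, reject) =>
          setTimeout(() => reject(new Error("step timed out")), timeoutMs)
        ),
      ]);
    } catch (err) {
      lastError = err; // record the failure and retry
    }
  }
  throw lastError;
}
```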

Each workflow maintains its own state machine, automatically persisted by the Control Plane.
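
One way to picture that state machine is the sketch below. The types are hypothetical; in practice the Control Plane, not application code, persists these transitions.

```typescript
// Sketch of a persisted workflow state machine that can pause indefinitely.
type Status = "running" | "awaiting_input" | "completed";

interface Run {
  id: string;
  status: Status;
  context: Record<string, unknown>; // full execution context, persisted
}

// Pausing records only a status change; the context stays durable for
// minutes or months until input arrives.
function pauseForInput(run: Run): Run {
  return { ...run, status: "awaiting_input" };
}

// Resumption merges the awaited input into the preserved context and
// continues exactly where the run left off.
function resume(run: Run, input: Record<string, unknown>): Run {
  return { ...run, status: "running", context: { ...run.context, ...input } };
}
```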

This persistence enables powerful capabilities:

  • Indefinite Pausing: Workflows can pause for any duration while awaiting input
  • Context Preservation: Complete state is maintained during pauses
  • Seamless Resumption: Execution continues with all context intact
  • Multi-Step Processes: Complex multi-stage workflows maintain coherence