Architecture
Learn how Inferable works through its core components: Tools, Agents, and Workflows, all connected via the Control Plane
Introducing the Inferable Architecture
Inferable provides a robust, distributed architecture for building reliable AI-powered applications. Unlike traditional frameworks, which require model calls and function execution to be co-located, Inferable separates these concerns while maintaining persistent state throughout the entire application lifecycle.
Architectural Overview
At a high level, Inferable’s architecture consists of three primary components:
- Control Plane: The central nervous system of Inferable, orchestrating all component interactions
- Workflows: Stateful orchestration units that coordinate complex, multi-step processes
- Client SDKs: Language-specific libraries that connect your functions to the Inferable ecosystem
These three components underpin all of Inferable's LLM-native abstractions, such as structured outputs, human-in-the-loop, agents, and tool use.
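To make the separation concrete, here is a hedged sketch of the client-SDK side; `InferableClient`, `registerTool`, and `listen` are hypothetical stand-ins for illustration, not the verified SDK surface:

```ts
// Hypothetical SDK usage; the names below are illustrative stand-ins, not
// the verified Inferable API. The shape is what matters: a function is
// registered locally, and the SDK connects it to the Control Plane over
// outbound HTTPS only.
import { z } from "zod";

declare class InferableClient {
  constructor(opts: { apiSecret: string });
  registerTool(tool: {
    name: string;
    input: z.ZodTypeAny;
    handler: (input: unknown) => Promise<unknown>;
  }): void;
  listen(): Promise<void>; // begins long polling for jobs (see below)
}

const client = new InferableClient({ apiSecret: process.env.INFERABLE_API_SECRET! });

client.registerTool({
  name: "lookupCustomer",
  input: z.object({ customerId: z.string() }),
  handler: async (input) => ({ found: true, input }),
});

await client.listen(); // no inbound ports: the worker polls the Control Plane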
The Control Plane: State Persistence at its Core
The Control Plane is the foundation of Inferable’s architecture, providing critical services including:
Persistent State Management
Unlike many AI orchestration frameworks, Inferable’s Control Plane includes a built-in persistence layer that automatically maintains the complete state of all running workflows.
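A minimal sketch of the pattern this enables, assuming a hypothetical `step` helper that memoizes each completed step against persisted state:

```ts
// Conceptual sketch (not the actual SDK): each completed step's result is
// persisted before the next step runs, so a crashed workflow replays from
// the last checkpoint instead of re-executing finished work.
class WorkflowState {
  private completed = new Map<string, unknown>();

  // Run `fn` at most once; on replay, return the persisted result instead.
  async step<T>(name: string, fn: () => Promise<T>): Promise<T> {
    if (this.completed.has(name)) {
      return this.completed.get(name) as T; // replay path: skip re-execution
    }
    const result = await fn();
    // In Inferable, this write happens in the Control Plane inside an
    // ACID transaction; a Map stands in for that persistence layer here.
    this.completed.set(name, result);
    return result;
  }
}
```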
Key benefits of this design:
- Automatic Checkpointing: Every state change is persisted without developer intervention
- Transactional Integrity: State updates use ACID transactions to prevent corruption
- Zero-Loss Recovery: Workflows resume from exactly where they left off, even after system failures
- Temporal Durability: State can be maintained for any duration, from minutes to months
Distributed Job Queue
The Control Plane incorporates a distributed job queue that routes function calls to the appropriate execution environments.
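The sketch below illustrates the routing and prioritization decisions involved; the `Job` shape and field names are hypothetical, not the Control Plane's actual schema:

```ts
// Hypothetical job record and routing pass; field names are illustrative,
// not the Control Plane's actual schema.
interface Job {
  id: string;
  fn: string;       // target function, e.g. "crm.lookupCustomer"
  priority: number; // higher value = more urgent
  payload: unknown;
}

// Pick the most urgent job that this worker is actually able to run.
function nextJob(queue: Job[], capabilities: Set<string>): Job | undefined {
  return queue
    .filter((job) => capabilities.has(job.fn))   // routing: capability match
    .sort((a, b) => b.priority - a.priority)[0]; // prioritization
}
```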
This job queue provides:
- Fair Distribution: Balances workloads across available worker pools
- Prioritization: Handles urgent tasks ahead of lower-priority ones
- Backpressure Management: Gracefully handles capacity constraints
- Routing Intelligence: Directs jobs to workers with appropriate capabilities
Long Polling: No Network Configuration Required
One of Inferable’s most distinctive architectural features is its use of long polling for communication between the Control Plane and function execution environments.
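The sketch below shows the essence of such a worker loop, assuming a hypothetical `/poll` endpoint; the real routes and payloads will differ. The server holds each request open until a job arrives or a timeout elapses, so the worker only ever makes outbound HTTPS requests:

```ts
// Minimal long-polling worker loop (hypothetical endpoints). A production
// loop would add error handling and backoff; the point here is that no
// inbound port is ever opened.
async function pollLoop(baseUrl: string, token: string): Promise<void> {
  while (true) {
    // Blocks server-side until a job is available or ~20s elapse.
    const res = await fetch(`${baseUrl}/poll?waitSeconds=20`, {
      headers: { Authorization: `Bearer ${token}` },
    });
    if (res.status === 204) continue; // timed out with no job: poll again

    const job = (await res.json()) as { id: string; input: unknown };
    const result = await handleJob(job.input); // runs inside the private network

    await fetch(`${baseUrl}/jobs/${job.id}/result`, {
      method: "POST",
      headers: { Authorization: `Bearer ${token}`, "Content-Type": "application/json" },
      body: JSON.stringify(result),
    });
  }
}

// Stand-in for dispatching to a locally registered function.
async function handleJob(input: unknown): Promise<unknown> {
  return { echoed: input };
}
```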
This approach offers significant advantages:
- Zero Inbound Ports: Your execution environments need no open inbound ports
- Firewall-Friendly: Works with standard outbound HTTPS (port 443)
- NAT Traversal: Functions behind NAT can still participate
- No VPN Required: Connect from any network with internet access
- No DNS Configuration: No need to set up DNS records or certificates
For organizations with strict security requirements, this means:
- Functions can run in private subnets with no internet-accessible endpoints
- No need to expose internal services to the public internet
- No complex network configuration or tunneling required
- Works seamlessly with existing security boundaries
Workflows: Orchestrating Complex Processes
Workflows are the coordination layer in Inferable, responsible for:
- Maintaining execution context across multiple steps
- Managing transitions between different processing stages
- Handling retries, timeouts, and error conditions
- Coordinating parallel execution paths when appropriate
Each workflow maintains its own state machine, automatically persisted by the Control Plane.
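The sketch below illustrates the shape this gives to workflow code; `defineWorkflow`, `ctx.step`, and `ctx.waitForApproval` are hypothetical names, not the SDK's confirmed API:

```ts
// Hypothetical workflow API; the names are illustrative stand-ins. The
// shape is the point: every await is a persisted transition, and the
// approval step can suspend the run indefinitely without holding a
// process or connection open.
interface Ctx {
  step<T>(name: string, fn: () => Promise<T>): Promise<T>;
  waitForApproval(name: string, payload: unknown): Promise<{ approved: boolean }>;
}

declare function defineWorkflow(
  name: string,
  handler: (ctx: Ctx, input: { orderId: string }) => Promise<void>,
): void;

declare function fetchOrder(id: string): Promise<{ id: string; total: number }>;
declare function fulfilOrder(order: { id: string; total: number }): Promise<void>;

defineWorkflow("order-review", async (ctx, input) => {
  const order = await ctx.step("fetch", () => fetchOrder(input.orderId));

  // Suspends here with full context persisted; resumes whenever the human
  // responds, whether that takes seconds or weeks.
  const approval = await ctx.waitForApproval("review-order", { order });

  if (approval.approved) {
    await ctx.step("fulfil", () => fulfilOrder(order));
  }
});
```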
This persistence enables powerful capabilities:
- Indefinite Pausing: Workflows can pause for any duration while awaiting input
- Context Preservation: Complete state is maintained during pauses
- Seamless Resumption: Execution continues with all context intact
- Multi-Step Processes: Complex multi-stage workflows maintain coherence