Idle Process Management

DeployStack Satellite implements idle process management for stdio subprocess MCP servers. The system automatically terminates processes that remain inactive for an extended period and respawns them on demand, reducing memory usage while keeping servers available after a short respawn delay.

Purpose: Idle process management reduces the memory consumed by constantly running MCP servers by terminating inactive processes. When a client needs a dormant server, the system respawns it automatically, typically within 1-2 seconds, balancing resource efficiency against user experience.

Overview

Idle process management works through three coordinated systems:

Idle Process Cleanup Job: Monitors running processes and terminates those exceeding idle timeout
Dormant State Tracking: RuntimeState maintains configurations of terminated processes for quick respawning
Automatic Respawning: ProcessManager respawns dormant processes when clients request them

Idle Detection & Termination

Idle Timeout Configuration

Processes are considered idle based on inactivity duration.

Default Timeout: 180 seconds (3 minutes)

Configuration:

# Set custom idle timeout (in seconds)
export MCP_PROCESS_IDLE_TIMEOUT_SECONDS=300  # 5 minutes

Activity Tracking:

lastActivity timestamp updated on every message sent or received
Idle duration calculated as: now - lastActivity
Only stdio transport processes are subject to idle termination
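The activity tracking above can be sketched as follows. This is a minimal illustration, not the Satellite's actual code: the `TrackedProcess` shape and the `touch` and `idleSeconds` helpers are hypothetical names, and timestamps are assumed to be epoch milliseconds.

```typescript
// Hypothetical shape: only the field needed for idle tracking.
interface TrackedProcess {
  lastActivity: number; // epoch milliseconds (assumption)
}

// Call on every message sent or received.
function touch(proc: TrackedProcess, now: number = Date.now()): void {
  proc.lastActivity = now;
}

// Idle duration in whole seconds: now - lastActivity.
function idleSeconds(proc: TrackedProcess, now: number = Date.now()): number {
  return Math.max(0, Math.floor((now - proc.lastActivity) / 1000));
}
```

Passing `now` explicitly keeps the calculation easy to test without waiting on a real clock.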

Idle Process Cleanup Job

The cleanup job runs automatically every 30 seconds.

Operation Flow:

Retrieve all running stdio processes from ProcessManager
Check each process against idle criteria
Terminate idle processes and mark as dormant
Update RuntimeState with dormant configurations

Idle Criteria Checks:

Process status must be ‘running’ (skips ‘starting’, ‘terminating’, etc.)
Process age must exceed spawn grace period (60 seconds)
No active requests in flight
Idle duration exceeds configured timeout (180 seconds default)
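The four criteria above can be read as a single predicate. The sketch below is an illustration under assumptions: `ProcessSnapshot`, `IdlePolicy`, and `shouldTerminate` are hypothetical names, and only the status values, the 180-second timeout, and the 60-second grace period come from this page.

```typescript
type ProcessStatus = 'starting' | 'running' | 'terminating' | 'failed';

// Hypothetical snapshot of the fields the idle check needs.
interface ProcessSnapshot {
  status: ProcessStatus;
  startedAt: number;      // epoch ms
  lastActivity: number;   // epoch ms
  activeRequests: number; // requests currently in flight
}

interface IdlePolicy {
  idleTimeoutSeconds: number;
  spawnGracePeriodSeconds: number;
}

// Defaults as documented above.
const DEFAULT_POLICY: IdlePolicy = { idleTimeoutSeconds: 180, spawnGracePeriodSeconds: 60 };

function shouldTerminate(
  p: ProcessSnapshot,
  now: number,
  policy: IdlePolicy = DEFAULT_POLICY,
): boolean {
  // 1. Only 'running' processes are candidates.
  if (p.status !== 'running') return false;
  // 2. Process age must exceed the spawn grace period.
  if ((now - p.startedAt) / 1000 <= policy.spawnGracePeriodSeconds) return false;
  // 3. No active requests in flight.
  if (p.activeRequests > 0) return false;
  // 4. Idle duration must exceed the configured timeout.
  return (now - p.lastActivity) / 1000 > policy.idleTimeoutSeconds;
}
```

Ordering the cheap status and grace-period checks first means most processes are skipped before any idle arithmetic runs.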

Spawn Grace Period

Newly spawned processes receive protection from immediate termination.

Grace Period: 60 seconds (configurable)

Configuration:

# Set custom grace period (in seconds)
export MCP_PROCESS_SPAWN_GRACE_PERIOD_SECONDS=90

Protection Rules:

Processes younger than grace period cannot be marked idle
Allows time for MCP handshake completion
Prevents termination during tool discovery
Ensures processes become fully operational before idle monitoring begins

Grace Period Purpose: Without the grace period, processes could be terminated during initialization, causing race conditions where the handshake completes but the process is immediately marked idle and terminated. The 60-second default provides ample time for handshake, tool discovery, and initial activity.
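As an illustration of how the two environment variables could be read, here is a small sketch. The `loadIdleConfig` and `parsePositiveInt` helpers are hypothetical, and falling back to the default on a missing or invalid value is an assumption; the Satellite may validate differently.

```typescript
interface IdleConfig {
  idleTimeoutSeconds: number;
  spawnGracePeriodSeconds: number;
}

// Returns the fallback when the value is missing or not a positive integer (assumed behavior).
function parsePositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Takes the environment as a parameter so it can be tested without touching process.env.
function loadIdleConfig(env: Record<string, string | undefined>): IdleConfig {
  return {
    idleTimeoutSeconds: parsePositiveInt(env['MCP_PROCESS_IDLE_TIMEOUT_SECONDS'], 180),
    spawnGracePeriodSeconds: parsePositiveInt(env['MCP_PROCESS_SPAWN_GRACE_PERIOD_SECONDS'], 60),
  };
}
```

In a Node.js process this would be called as `loadIdleConfig(process.env)`.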

Termination Process

The following steps run when a process exceeds the idle timeout.

Steps:

Log idle duration and last activity timestamp
Store process configuration in dormant map (RuntimeState)
Emit mcp.server.dormant event to Backend
Execute graceful process termination
Keep discovered tools in the cache (not cleared) for fast respawn
Remove from active process tracking maps

Event Emission:

{
  type: 'mcp.server.dormant',
  data: {
    server_id: string,
    server_slug: string,
    team_id: string,
    process_id: number,
    idle_duration_seconds: number,
    last_activity_at: string // ISO 8601 timestamp
  }
}
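The event shape above could be typed and populated as sketched below. The field names come from the event definition; the `McpServerDormantEvent` type name and the `buildDormantEvent` helper are hypothetical and only show how the two computed fields might be derived.

```typescript
interface McpServerDormantEvent {
  type: 'mcp.server.dormant';
  data: {
    server_id: string;
    server_slug: string;
    team_id: string;
    process_id: number;
    idle_duration_seconds: number;
    last_activity_at: string; // ISO 8601
  };
}

// Hypothetical helper: derives the computed fields from epoch-millisecond timestamps.
function buildDormantEvent(
  ids: { server_id: string; server_slug: string; team_id: string; process_id: number },
  lastActivityMs: number,
  nowMs: number,
): McpServerDormantEvent {
  return {
    type: 'mcp.server.dormant',
    data: {
      ...ids,
      idle_duration_seconds: Math.floor((nowMs - lastActivityMs) / 1000),
      last_activity_at: new Date(lastActivityMs).toISOString(),
    },
  };
}
```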

Dormant State Management

Dormant Configuration Storage

RuntimeState maintains a separate map of dormant process configurations.

Stored Information:

Complete MCPServerConfig (command, args, env, installation details)
Allows identical respawn without Backend communication
Remains in memory until process respawns or satellite restarts

Map Structure:

// Installation name → MCPServerConfig
Map<string, MCPServerConfig>

Dormant vs Active Tracking

Active Processes:

Tracked in ProcessManager maps (by ID, by name)
Have active ProcessInfo with status, metrics, handlers
Consume memory for process overhead and buffers

Dormant Processes:

Only configuration stored in RuntimeState
No active process or handlers
Minimal memory footprint (~1-2KB per config)
Tools remain in cache for instant availability

Dormant Process Queries

RuntimeState provides methods for dormant process inspection.

Query Methods:

getDormantConfig(installationName): Retrieve specific config
getDormantCount(): Count total dormant processes
getAllDormantConfigs(): List all dormant configurations
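A minimal sketch of a store exposing these query methods might look like the following. The three query method names come from the list above; the `DormantStore` class name, the trimmed `MCPServerConfig` fields, the return types, and the `markDormant` and `clearDormant` mutators are assumptions made for illustration.

```typescript
// Trimmed, hypothetical version of the real config type.
interface MCPServerConfig {
  command: string;
  args: string[];
  env: Record<string, string>;
}

class DormantStore {
  // Installation name → MCPServerConfig
  private dormant = new Map<string, MCPServerConfig>();

  // Hypothetical mutators used on termination and after respawn.
  markDormant(installationName: string, config: MCPServerConfig): void {
    this.dormant.set(installationName, config);
  }

  clearDormant(installationName: string): boolean {
    return this.dormant.delete(installationName);
  }

  getDormantConfig(installationName: string): MCPServerConfig | undefined {
    return this.dormant.get(installationName);
  }

  getDormantCount(): number {
    return this.dormant.size;
  }

  // Returns a copy so callers cannot mutate internal state.
  getAllDormantConfigs(): Map<string, MCPServerConfig> {
    return new Map(this.dormant);
  }
}
```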

Heartbeat Reporting:

Dormant count included in heartbeat data
Enables Backend visibility into idle process management
Tracks dormant vs active process ratio

Automatic Respawning

Respawn Trigger

Dormant processes respawn automatically when clients request them.

Trigger Points:

MCP client calls tools/list and process is dormant
MCP client calls tools/call for tool on dormant server
Any MCP request targeting dormant installation name

Detection:

// ProcessManager checks for dormant config
const dormantConfig = runtimeState.getDormantConfig(installationName);
if (dormantConfig) {
  // Respawn process
}

Respawn Process Flow

The respawn process follows the same path as initial spawn:

Request → Check Active → Check Dormant → Spawn Process → Handshake → Ready (tools already cached)
    │           │              │              │              │              │
  Client    Not Found    Config Found    child_process   Initialize   Serve

Timing:

Respawn duration: 1-2 seconds (faster than an initial spawn because tool discovery is skipped)
Includes process spawn and handshake only (tools already cached)
Client experiences minimal delay on first request after dormancy

Concurrent Respawn Prevention

Multiple concurrent requests to the same dormant process are handled safely.

Respawn Lock Mechanism:

First request initiates respawn and stores Promise in map
Subsequent requests await the same Promise
All requests resolve when respawn completes
Prevents duplicate spawning of same process

Implementation:

// ProcessManager tracks in-progress respawns
private respawningProcesses = new Map<string, Promise<ProcessInfo>>();
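The lock mechanism can be sketched as a shared in-flight Promise per installation name. This is an illustration, not the actual ProcessManager: the `RespawnCoordinator` class, the trimmed `ProcessInfo` shape, and the injected `spawn` function are hypothetical; only the `respawningProcesses` map mirrors the declaration above.

```typescript
// Trimmed, hypothetical version of the real ProcessInfo.
interface ProcessInfo {
  name: string;
  pid: number;
}

type SpawnFn = (installationName: string) => Promise<ProcessInfo>;

class RespawnCoordinator {
  private respawningProcesses = new Map<string, Promise<ProcessInfo>>();
  private spawn: SpawnFn;

  constructor(spawn: SpawnFn) {
    this.spawn = spawn;
  }

  respawn(installationName: string): Promise<ProcessInfo> {
    // Subsequent requests await the same Promise as the first one.
    const inFlight = this.respawningProcesses.get(installationName);
    if (inFlight) return inFlight;

    // First request starts the respawn; the entry is removed on success or
    // failure so that a later request can trigger a fresh attempt.
    const attempt = this.spawn(installationName).finally(() => {
      this.respawningProcesses.delete(installationName);
    });
    this.respawningProcesses.set(installationName, attempt);
    return attempt;
  }
}
```

Because the method is not `async`, concurrent callers receive the very same Promise object, so a single spawn result (or error) resolves all of them.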

Post-Respawn Cleanup

The following cleanup runs after a successful respawn.

Cleanup Operations:

Remove configuration from dormant map
Add ProcessInfo to active tracking maps
Emit mcp.server.respawned event
Tools already cached (no rediscovery needed)
Remove respawn Promise from tracking

Event Emission:

{
  type: 'mcp.server.respawned',
  data: {
    server_id: string,
    server_slug: string,
    team_id: string,
    process_id: number,
    dormant_duration_seconds: number,
    respawn_duration_ms: number
  }
}

Performance Characteristics

Memory Savings

Idle process management provides significant memory benefits.

Per-Process Savings:

Active process: ~10-20MB (base Node.js + application)
Dormant config: ~1-2KB (configuration only)
Reduction: more than 99% of memory per idle process

Example Scenario:

100 MCP servers installed
10 actively used (~100-200MB at 10-20MB each)
90 dormant (~90-180KB at 1-2KB each)
Total: ~100-200MB vs ~1-2GB if all active

Timing Impact

User Experience:

Active process: Instant response (~10-50ms latency)
Dormant process first request: 1-2 second delay (faster than an initial spawn because tool discovery is skipped)
Subsequent requests: Instant (process remains active)

Respawn Timing Breakdown:

Process spawn: 500-1000ms
MCP handshake: 500-1000ms
Total: 1000-2000ms (tools already cached, no discovery needed)

Cleanup Efficiency

Job Performance:

Runs every 30 seconds
Checks all processes: <1ms per process
Termination overhead: ~100ms per process
Minimal CPU impact during normal operation

Configuration Best Practices

Idle Timeout Selection

Choose a timeout based on usage patterns.

Short Timeout (60-120 seconds):

High memory constraints
Predictable usage patterns
Acceptable respawn delay
Many infrequently used servers

Medium Timeout (180-300 seconds) (Default):

Balanced memory vs experience
Mixed usage patterns
Occasional bursty activity
Recommended for most deployments

Long Timeout (600+ seconds):

Ample memory available
Continuous or frequent usage
Minimal respawn tolerance
Mission-critical low latency

Grace Period Tuning

Adjust the grace period based on environment.

Shorter Grace Period (30-45 seconds):

Fast network connections
Simple MCP servers (quick handshake)
Aggressive memory optimization

Standard Grace Period (60 seconds) (Default):

Recommended for production
Accounts for npx package downloads
Handles slow network conditions
Prevents initialization race conditions

Longer Grace Period (90-120 seconds):

Slow network environments
Complex MCP servers (large dependencies)
Extra safety margin

Monitoring Recommendation: Track mcp.server.dormant and mcp.server.respawned events to tune the idle timeout. If processes frequently go dormant but respawn shortly afterward, increase the timeout. If processes stay dormant for long periods, the timeout is well tuned.
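One way to put this recommendation into numbers is to measure how many respawns happen shortly after a process went dormant. The sketch below is an illustration only: the `quickRespawnRatio` helper and the 60-second "quick" threshold are assumptions, and the input is the `dormant_duration_seconds` field from mcp.server.respawned events.

```typescript
interface RespawnSample {
  dormant_duration_seconds: number;
}

// Fraction of respawns that occurred within quickThresholdSeconds of going dormant.
// A high ratio suggests the idle timeout is too short for the workload.
function quickRespawnRatio(samples: RespawnSample[], quickThresholdSeconds: number = 60): number {
  if (samples.length === 0) return 0;
  const quick = samples.filter((s) => s.dormant_duration_seconds < quickThresholdSeconds).length;
  return quick / samples.length;
}
```

For example, if half of all respawns follow less than a minute of dormancy, those processes were probably terminated just before they were needed again.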

Monitoring & Observability

Event-Based Monitoring

Track idle process management through Backend events.

Key Metrics:

mcp.server.dormant: Count of processes entering dormant state
mcp.server.respawned: Count of successful respawns
idle_duration_seconds: Time process was inactive before termination
dormant_duration_seconds: Time process spent dormant before respawn

Log Analysis

Important log operations to monitor:

Idle Detection:

[DEBUG] Skipping process in grace period: server-name (age: 25s)
[DEBUG] Skipping process with active requests: server-name
[INFO] Idle check: terminated 2 idle process(es)

Dormant Marking:

[INFO] Marking process as dormant due to inactivity: server-name
[INFO] Process marked as dormant and terminated: server-name

Respawning:

[INFO] Respawning dormant process: server-name
[INFO] Dormant process respawned successfully: server-name

Heartbeat Data

Satellite heartbeat includes dormant process information.

Reported Metrics:

Total active process count
Total dormant process count
Processes by status (running, starting, terminating, failed)

Error Handling

Respawn Failures

If respawn fails, standard error handling applies.

Failure Scenarios:

Process spawn error
Handshake timeout
Invalid configuration

Error Flow:

Respawn attempt logs error
Dormant config remains in map
Next request triggers another respawn attempt
Auto-restart logic applies (3 attempts max)

Termination Failures

Idle termination handles edge cases.

Process Not Found:

Logs warning (non-critical)
Process already terminated externally
Cleanup continues normally

Graceful Shutdown Timeout:

SIGTERM sent first (10-second wait)
SIGKILL sent if timeout exceeded
Force termination ensures cleanup
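The SIGTERM-then-SIGKILL escalation can be sketched as below. This is an illustration under assumptions: `terminateGracefully` and the `KillableProcess` interface are hypothetical names, with the interface modeled on the `kill` and `once('exit')` methods of a Node.js `ChildProcess`. The 10-second default mirrors the wait described above.

```typescript
// Minimal surface needed from a child process (modeled on Node's ChildProcess).
interface KillableProcess {
  kill(signal: 'SIGTERM' | 'SIGKILL'): boolean;
  once(event: 'exit', listener: () => void): unknown;
}

// Sends SIGTERM, then escalates to SIGKILL if the process has not exited in time.
function terminateGracefully(
  proc: KillableProcess,
  timeoutMs: number = 10000,
): Promise<'graceful' | 'forced'> {
  return new Promise((resolve) => {
    let forced = false;

    // Escalate only if the exit event has not fired before the timeout.
    const timer = setTimeout(() => {
      forced = true;
      proc.kill('SIGKILL');
    }, timeoutMs);

    proc.once('exit', () => {
      clearTimeout(timer);
      resolve(forced ? 'forced' : 'graceful');
    });

    proc.kill('SIGTERM');
  });
}
```

Registering the exit listener before sending SIGTERM avoids missing an exit that happens immediately after the signal.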

Development Testing

Manual Idle Testing

Test idle process management locally.

Force Idle Termination:

# Set short idle timeout for testing
export MCP_PROCESS_IDLE_TIMEOUT_SECONDS=30

# Start satellite
npm run dev

# Spawn process via Backend command
# Wait 30 seconds without activity
# Process should terminate and become dormant

Test Respawning:

# After process is dormant, make MCP request
curl -X POST http://localhost:3001/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":"1","method":"tools/list","params":{}}'

# Process should respawn automatically
# Check logs for respawn confirmation

Grace Period Testing

Verify grace period protection:

# Set very short idle timeout and standard grace period
export MCP_PROCESS_IDLE_TIMEOUT_SECONDS=10
export MCP_PROCESS_SPAWN_GRACE_PERIOD_SECONDS=60

# Spawn process
# Process should NOT be terminated within first 60 seconds
# Even if idle timeout is only 10 seconds

Production Considerations

Memory Planning

Calculate expected memory usage.

Formula:

Total Memory = (Active Processes × 15MB) + (Dormant Processes × 2KB)

Example:

200 total MCP servers
20 actively used (20 × 15MB = 300MB)
180 dormant (180 × 2KB = 360KB)
Total: ~300MB vs ~3GB without idle management
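The formula can be turned into a small planning helper. The sketch below is illustrative: `estimateMemoryMB` is a hypothetical name, and the 15MB and 2KB constants are the planning figures from the formula above, not measured values.

```typescript
const ACTIVE_PROCESS_MB = 15; // planning figure from the formula above
const DORMANT_CONFIG_KB = 2;  // planning figure from the formula above

// Total Memory = (Active Processes × 15MB) + (Dormant Processes × 2KB), expressed in MB.
function estimateMemoryMB(activeProcesses: number, dormantProcesses: number): number {
  return activeProcesses * ACTIVE_PROCESS_MB + (dormantProcesses * DORMANT_CONFIG_KB) / 1024;
}

// The example above: 20 active and 180 dormant vs all 200 active.
const withIdleManagement = estimateMemoryMB(20, 180);
const withoutIdleManagement = estimateMemoryMB(200, 0);
console.log(withIdleManagement.toFixed(1), 'MB vs', withoutIdleManagement, 'MB');
```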

Monitoring Alerts

Configure alerts for idle process health.

Alert Conditions:

High respawn rate (>10 per minute): Timeout too aggressive
High dormant count (>80% of total): Consider longer timeout
Respawn failures: Configuration or resource issues
Grace period violations: System overload or timing bugs

Resource Limits

Coordinate with nsjail resource limits.

Production Limits:

Per-process memory: 50MB (nsjail limit)
System memory: Plan for peak active processes
Dormant processes: Negligible memory impact

Related Documentation

Process Management - Core stdio subprocess management
Background Jobs - Job system architecture
Event System - Event emission and tracking
Tool Discovery - How tool caching interacts with dormancy
