Extended Services

Extended Services Architecture #

Purpose #

This document defines the extended management services layer within the Deevnet management plane.

Extended management services provide observability, automation, and access tooling. They are additive to Core Services, the minimal set of services required for the substrate to function on its own.

Extended services can be rebuilt entirely from the builder and core services. If the extended services tier is lost, core provisioning and network services remain operational.

For core management plane architecture (DNS authority, naming, provisioner role, OOB services), see Core Services.


1. Service Overview #

Services that run in the extended management tier:

Service        Description
Observability  Metrics collection, log aggregation, alerting
Automation     Build automation runners, image factory helpers
Access         Jump hosts, out-of-band tooling

2. Isolation Model #

Extended management services are isolated from tenant workloads.

Attribute       Value
Workload type   Infrastructure-critical
Change cadence  Slow and deliberate
Blast radius    Isolated from tenants

This separation ensures:

  • Tenant rebuilds cannot disrupt core services
  • Observability and access remain available during failures
  • Platform recovery paths are always reachable

3. Design Principles #

Extended management services follow a strict set of principles:

  • Stability over velocity
  • Explicit configuration over convenience
  • Recoverability over optimization
  • Isolation from tenant experimentation

The extended services tier is intentionally boring. That is a feature.


4. Service Characteristics #

Attribute           Requirement
Availability        High (relative to lab scale)
Identity            Stable and deterministic
Network addressing  Static via DHCP reservations
Backup              Mandatory
Rebuild support     Must assist rebuilds, not depend on them
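
The requirements above can be expressed as a simple pre-deployment check. The following is a minimal sketch, not part of the Deevnet tooling: the `ServiceSpec` record, its field names, and the `violations` helper are all hypothetical and exist only to illustrate how the table could be enforced mechanically.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServiceSpec:
    """Hypothetical description of one extended management service."""
    name: str
    dhcp_reservation: bool      # address is pinned by a DHCP reservation
    backup_configured: bool     # a backup job exists for this service
    depends_on_rebuild: bool    # service needs a rebuild pipeline to run


def violations(spec: ServiceSpec) -> List[str]:
    """Return the requirements from the table that this spec fails."""
    problems = []
    if not spec.dhcp_reservation:
        problems.append("network addressing must be static via a DHCP reservation")
    if not spec.backup_configured:
        problems.append("backup is mandatory")
    if spec.depends_on_rebuild:
        problems.append("service must assist rebuilds, not depend on them")
    return problems
```

A spec that returns an empty list satisfies the checkable rows of the table. Availability and identity stability are not modeled here because they cannot be verified from a static record alone.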

5. Provisioning Model #

Extended services are provisioned from the builder:

  • Management-plane VMs are created using build automation tooling
  • Post-install configuration is applied via build automation
  • Simplicity and traceability are prioritized

Terraform is intentionally not used for management-plane workloads.

For the current platform and tooling used to host extended services, see Implementation & Tooling.


6. Network Identity #

All management-plane hosts:

  • Have stable, predictable network identities
  • Receive static address assignments via DHCP reservations
  • Have deterministic DNS records
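
One way to keep these three properties aligned is to derive DHCP reservations and DNS records from a single inventory. The sketch below assumes such an inventory exists. The `Host` type, the `build_records` function, the domain `mgmt.example.test`, and the `192.0.2.0/24` documentation subnet are illustrative assumptions, not Deevnet names or addresses.

```python
import ipaddress
from typing import Dict, List, NamedTuple


class Host(NamedTuple):
    """Hypothetical inventory entry for one management-plane host."""
    name: str
    mac: str
    ip: str


def build_records(hosts: List[Host], domain: str, subnet: str) -> List[Dict[str, str]]:
    """Derive one reservation plus forward and reverse DNS names per host.

    Output is sorted by host name, so the same inventory always
    yields the same records in the same order.
    """
    network = ipaddress.ip_network(subnet)
    seen = set()
    records = []
    for host in sorted(hosts, key=lambda h: h.name):
        addr = ipaddress.ip_address(host.ip)
        if addr not in network:
            raise ValueError(f"{host.name}: {addr} is outside {network}")
        mac = host.mac.lower()
        # Reject collisions so every identity stays unique.
        for key in (("name", host.name), ("mac", mac), ("ip", str(addr))):
            if key in seen:
                raise ValueError(f"duplicate {key[0]}: {key[1]}")
            seen.add(key)
        records.append({
            "fqdn": f"{host.name}.{domain}",
            "mac": mac,
            "ip": str(addr),
            "ptr": addr.reverse_pointer,
        })
    return records
```

Because the function is pure and sorts its input, rerunning it against an unchanged inventory produces identical output, which is the sense of "deterministic" used in this sketch.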

7. Failure Philosophy #

If something breaks:

  • Tenant workloads may be destroyed and rebuilt
  • Management-plane services must still be reachable
  • Observability must continue to function
  • Recovery tooling must remain online

If a service is required to recover the platform, it belongs in the extended services tier.
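
The reachability expectations above can be spot-checked with a plain TCP probe. This is a minimal sketch under stated assumptions: the endpoint names, hosts, and ports are hypothetical, and a successful TCP connection only shows that a port accepts connections, not that the service behind it is healthy.

```python
import socket
from typing import Callable, Dict, List, Tuple


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable(
    endpoints: Dict[str, Tuple[str, int]],
    probe: Callable[[str, int], bool] = is_reachable,
) -> List[str]:
    """Return the names of endpoints that the probe could not reach."""
    return [
        name
        for name, (host, port) in sorted(endpoints.items())
        if not probe(host, port)
    ]
```

During a tenant rebuild, an empty result from `unreachable` for the management-plane endpoints would be consistent with the philosophy above. The `probe` parameter is injectable so the check can be exercised without network access.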

Page last modified: March 8, 2026