Kubernetes Says Healthy.
Your System Is Broken.
CI/CD pipeline succeeded. Kubernetes reports Ready. But the deployment just broke PubSub permissions, the image tag doesn't exist, and a container is OOMKilled every 30 seconds.
DeployGuard detects deployment failures that infrastructure reports as healthy.
One command. Immediate detection of deployment correctness failures.
The Question No One Answers
"Did this deployment break production even though Kubernetes says it's healthy?"
You merged the PR. CI passed. kubectl rollout status completed. Kubernetes reports all pods Ready. But the system is broken — and nobody knows yet.
12 minutes later, Slack lights up:
"Is production down?"
It is. The pods passed their readiness probes. The CI/CD pipeline returned exit code 0.
The container has been crash-looping since deployment. Nobody was told.
What DeployGuard Detects (v0.1)
Infrastructure and dependency correctness failures — immediately after deploy
CrashLoopBackOff
Container starts, crashes, restarts. Kubernetes keeps retrying. Pipeline says "success."
ImagePullBackOff
Wrong tag, expired registry credentials, or image deleted. Pods stuck in Pending.
OOMKilled
Container exceeded memory limit. Killed silently. Users see 502.
ProgressDeadlineExceeded
Rollout timed out. Old pods still serving. New version never came up.
CreateContainerConfigError
Missing ConfigMap, Secret, or invalid volume mount. Container cannot start.
ExternalDependency
Broken permissions, missing queues, unreachable topics, failed scheduling.
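Most of the failure types above surface in fields the Kubernetes API already exposes: container waiting and terminated reasons on a Pod, and the Progressing condition on a Deployment. The sketch below shows one way such a classifier could look, written in Python over plain dictionaries shaped like `kubectl get -o json` output. The function names and the rule that OOMKilled takes priority over CrashLoopBackOff are assumptions made for illustration, not DeployGuard's actual implementation. ExternalDependency is left out because the page does not say how it is detected.

```python
from typing import Optional

# Waiting reasons that this sketch treats as deployment failures.
WAITING_FAILURES = {
    "CrashLoopBackOff",
    "ImagePullBackOff",
    "ErrImagePull",
    "CreateContainerConfigError",
}


def classify_container(status: dict) -> Optional[str]:
    """Classify one entry of pod.status.containerStatuses, or return None."""
    # Assumption: an OOM kill is reported first because it is the more
    # specific cause, even when the container is also in CrashLoopBackOff.
    for key in ("state", "lastState"):
        terminated = (status.get(key) or {}).get("terminated") or {}
        if terminated.get("reason") == "OOMKilled":
            return "OOMKilled"

    waiting = (status.get("state") or {}).get("waiting") or {}
    reason = waiting.get("reason")
    if reason in WAITING_FAILURES:
        # ErrImagePull is the initial failure; ImagePullBackOff is the retry state.
        return "ImagePullBackOff" if reason == "ErrImagePull" else reason
    return None


def classify_deployment(deployment: dict) -> Optional[str]:
    """Return "ProgressDeadlineExceeded" if the rollout has timed out."""
    conditions = (deployment.get("status") or {}).get("conditions") or []
    for cond in conditions:
        if (
            cond.get("type") == "Progressing"
            and cond.get("status") == "False"
            and cond.get("reason") == "ProgressDeadlineExceeded"
        ):
            return "ProgressDeadlineExceeded"
    return None
```

Feeding each element of `status.containerStatuses` from a Pod into `classify_container` yields either a failure type from the list above or `None` for a healthy container.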
What DeployGuard Does NOT Detect
DeployGuard guards deployment correctness — not application correctness.
The gap between "deployed" and "actually working" is where incidents live.
DeployGuard closes it.
What DeployGuard Is — and Is Not
DeployGuard answers one specific question:
"Did this deployment break production even though Kubernetes says it's healthy?"
DeployGuard is NOT
DeployGuard IS
DeployGuard doesn't collect numbers about your system. It produces structured incidents when a deployment introduces an infrastructure or dependency correctness failure. It detects the failure, tracks its lifecycle, and confirms when it resolves.
Why Not Prometheus / Grafana?
Prometheus collects numbers. DeployGuard produces incidents.
Prometheus says:
Something is wrong. You need to investigate which pod, what error, which deployment caused it, and whether it's still happening.
DeployGuard says:
Container: payment | Since: 14:32
Commit: abc123 by @dev
This deployment broke this specific workload. Here's the failure type, the commit that caused it, and it's still active.
| Tool | What It Produces | DeployGuard Produces |
|---|---|---|
| Prometheus / Grafana | Shows that container_restarts_total increased | Tells you "payment-service is in CrashLoopBackOff since 14:32, caused by deploy abc123" |
| Kubernetes Probes | Reports "Pod is Ready" | Reports "Pod passed readiness but is crash-looping every 30s" |
| CI/CD Pipeline | Reports "Deployment succeeded" (exit code 0) | Reports "Rollout timed out — ProgressDeadlineExceeded in prod/checkout" |
| APM / Sentry | Catches application exceptions after a request is served | Catches infrastructure failures before requests can be served |
Prometheus answers "is something wrong?"
DeployGuard answers "this deployment broke PubSub permissions in prod/checkout at 14:32"
How DeployGuard Works
Six steps. No sidecars. No code changes. No instrumentation. No config files.
┌─────────────────────────────────────────────────────────────────┐
│  Your Kubernetes Cluster                                        │
│                                                                 │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │  DeployGuard Agent                                      │   │
│   │                                                         │   │
│   │  ✓ Watches: Pods / Deployments / Events / Nodes / PVCs  │   │
│   │  ✓ Detects: CrashLoopBackOff, OOMKilled, etc.           │   │
│   │  ✓ Produces: Typed incidents with full context          │   │
│   │  ✓ Resolves: Auto-detects recovery                      │   │
│   │                                                         │   │
│   │  🔒 Read-only RBAC: cannot modify your cluster          │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                │                                │
└────────────────────────────────│────────────────────────────────┘
                                 │ HTTPS (outbound only)
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│  DeployGuard Control Plane (SaaS)                               │
│                                                                 │
│  ├── Incident Ingestion & Deduplication                         │
│  ├── Failure Lifecycle (active → resolved)                      │
│  └── Notification Delivery (Slack, Webhook)                     │
└─────────────────────────────────────────────────────────────────┘
Create Account
Sign up at app.deployguard.net. Get your API key.
Register Cluster
Name your cluster. Select environment (dev / staging / prod).
kubectl apply Agent
One command installs the agent. Creates namespace, RBAC, and deployment. Read-only.
Agent Detects Failures
The agent watches Pods, Deployments, Events, Nodes, and PVCs, and detects correctness failures in real time.
Structured Incident Appears
Not a log line. A typed incident: CrashLoopBackOff, namespace, workload, severity, timestamp. Immediately.
Explicit Resolution
When the workload recovers, the agent detects it and resolves the incident. Full lifecycle tracked.
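The detect, track, and resolve lifecycle in the last two steps can be pictured as a small state machine keyed by workload and failure type. What follows is a minimal sketch in Python, assuming an in-memory store and a hypothetical `IncidentTracker` class. It is not DeployGuard's implementation. Repeated observations of the same failure update a single incident instead of creating duplicates, and recovery moves that incident from active to resolved.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (namespace, workload, container, failure_type) identifies one incident.
Key = Tuple[str, str, str, str]


@dataclass
class Incident:
    key: Key
    first_seen: str
    last_seen: str
    status: str = "active"
    resolved_at: Optional[str] = None


class IncidentTracker:
    """Hypothetical in-memory tracker for the active -> resolved lifecycle."""

    def __init__(self) -> None:
        self._active: Dict[Key, Incident] = {}

    def observe(self, key: Key, timestamp: str) -> Incident:
        """Record a failure observation, deduplicating by key."""
        incident = self._active.get(key)
        if incident is None:
            incident = Incident(key, first_seen=timestamp, last_seen=timestamp)
            self._active[key] = incident
        else:
            incident.last_seen = timestamp
        return incident

    def resolve(self, key: Key, timestamp: str) -> Optional[Incident]:
        """Mark an active incident resolved; return None if none is active."""
        incident = self._active.pop(key, None)
        if incident is not None:
            incident.status = "resolved"
            incident.resolved_at = timestamp
        return incident

    def active_count(self) -> int:
        return len(self._active)
```

Keying on workload plus failure type means a container that crash-loops fifty times still produces one incident with an updated `last_seen`, which is the behavior the deduplication step implies.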
What a DeployGuard Incident Looks Like
{
  "failureType": "CrashLoopBackOff",
  "severity": "error",
  "namespace": "prod",
  "workloadType": "Deployment",
  "workloadName": "payment-service",
  "container": "payment",
  "message": "Back-off restarting failed container payment",
  "firstSeen": "2026-02-16T14:32:01Z",
  "status": "active"
}
Not a metric. Not a log line. A semantic incident with type, severity, and context.
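Because the incident is structured, a consumer such as a webhook receiver can parse and validate it instead of pattern-matching log text. Here is a minimal sketch in Python, assuming the payload carries exactly the fields shown above and that `status` is either `active` or `resolved`. The `Incident` class and `parse_incident` function are hypothetical names, not part of any DeployGuard SDK.

```python
import json
from dataclasses import dataclass
from datetime import datetime

REQUIRED_FIELDS = (
    "failureType", "severity", "namespace", "workloadType",
    "workloadName", "container", "message", "firstSeen", "status",
)
# Assumption: the lifecycle has only these two states.
VALID_STATUSES = ("active", "resolved")


@dataclass(frozen=True)
class Incident:
    failure_type: str
    severity: str
    namespace: str
    workload_type: str
    workload_name: str
    container: str
    message: str
    first_seen: datetime
    status: str


def parse_incident(raw: str) -> Incident:
    """Parse an incident payload, raising ValueError on malformed input."""
    data = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if data["status"] not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {data['status']!r}")
    # Replace the trailing "Z" so fromisoformat also works before Python 3.11.
    first_seen = datetime.fromisoformat(data["firstSeen"].replace("Z", "+00:00"))
    return Incident(
        failure_type=data["failureType"],
        severity=data["severity"],
        namespace=data["namespace"],
        workload_type=data["workloadType"],
        workload_name=data["workloadName"],
        container=data["container"],
        message=data["message"],
        first_seen=first_seen,
        status=data["status"],
    )
```

A receiver built this way can route on `failure_type` and `namespace` directly, for example paging only for active incidents in `prod`.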
Install in 10 Minutes
From zero to first incident detection. No Helm charts. No values.yaml. No operator to manage.
Step 1: Create Account (2 min)
Sign up at app.deployguard.net. Get your API key.
Step 2: Register Cluster (30 sec)
Name your cluster and select environment (dev / staging / prod).
Step 3: Get Install Command (10 sec)
Generate a one-time install URL with your API key baked in.
Step 4: kubectl apply (2 min)
One command. Creates namespace, ServiceAccount, ClusterRole, and agent Deployment. Read-only RBAC.
Step 5: Agent Detects Failures (< 1 sec)
Agent watches Pods, Deployments, Events, Nodes, and PVCs. Deployment correctness failures produce structured incidents.
Step 6: Incident + Resolution (real-time)
Typed incident appears immediately. When the workload recovers, the agent resolves it automatically.
# One command to install
$ kubectl apply -f https://agent.deployguard.net/install/YOUR_TOKEN
# Creates: namespace, serviceaccount, clusterrole, deployment
What You Get
Deployment correctness verification. Not another tool to configure.
Detection in Seconds
Know about deployment failures immediately. Before users. Before on-call escalation. Before anyone checks Slack.
Semantic Incidents
Not numbers. Not log lines. Typed incidents: CrashLoopBackOff, OOMKilled, ProgressDeadlineExceeded — with namespace, workload, and severity.
Zero Code Changes
Works with any language, framework, or runtime. The agent watches Kubernetes resource state, not your application code.
Read-Only Agent
Agent has zero write permissions. Cannot modify deployments, pods, or any cluster resource. RBAC enforced.
Commit-Level Context
Each failure is tied to a specific deployment, namespace, and git commit. No guessing which change broke things.
Full Lifecycle Tracking
Every incident has a first_seen, last_seen, and resolved_at. Active → Resolved. Full audit trail.
Security & Trust
We know you're protective of your infrastructure. So are we.
Read-Only RBAC
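The agent has only get, list, and watch permissions on cluster resources. It cannot create, update, delete, or patch any of them. The single create verb in the rules below applies to SelfSubjectAccessReview, a request that asks the API server what the agent is allowed to do and persists nothing in the cluster.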
rules:
  - apiGroups: ["", "apps"]
    resources: ["pods", "deployments", "events",
                "replicasets", "nodes", "persistentvolumeclaims"]
    verbs: ["get", "list", "watch"]  # No write access
  - apiGroups: ["authorization.k8s.io"]
    resources: ["selfsubjectaccessreviews"]
    verbs: ["create"]  # RBAC self-check only
No Cluster Credentials
DeployGuard never receives your kubeconfig, API server URL, or cloud provider credentials. The agent runs inside your cluster using a ServiceAccount.
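Since the agent's permissions come entirely from the role bound to that ServiceAccount, the read-only claim can be audited from the RBAC rules themselves before applying the manifest. The sketch below is one possible check in Python, assuming the rules have already been loaded into a list of dictionaries (for example from the install manifest). The function name `write_capable_rules` and the decision to allow `create` only on `selfsubjectaccessreviews` are assumptions made for this illustration.

```python
from typing import Dict, List

READ_ONLY_VERBS = {"get", "list", "watch"}
# "create" on this resource asks the API server a question and stores nothing.
REVIEW_RESOURCES = {"selfsubjectaccessreviews"}


def write_capable_rules(rules: List[Dict]) -> List[Dict]:
    """Return every rule that grants more than read access."""
    offending = []
    for rule in rules:
        verbs = set(rule.get("verbs", []))
        resources = set(rule.get("resources", []))
        extra = verbs - READ_ONLY_VERBS
        if not extra:
            continue  # purely read-only rule
        if extra == {"create"} and resources and resources <= REVIEW_RESOURCES:
            continue  # self-check only
        offending.append(rule)
    return offending
```

An empty result means no rule grants a mutating verb on a real resource. A wildcard verb (`"*"`) is reported as offending because it is not in the read-only set.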
Short-Lived Agent JWTs
Agent JWTs expire in 1 hour and auto-refresh. Bootstrap tokens are single-use and expire in 24 hours. API keys use SHA-256 hashing — plaintext is never stored.
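Storing only a hash of each API key means a leaked database does not reveal usable keys. The following Python sketch shows the general pattern under stated assumptions: keys are long random strings, the server keeps only the SHA-256 hex digest, and verification uses a constant-time comparison. The function names are hypothetical, and nothing here describes DeployGuard's actual storage code.

```python
import hashlib
import hmac
import secrets


def generate_api_key() -> str:
    """Create a high-entropy key; shown to the user once, never stored."""
    return secrets.token_urlsafe(32)


def hash_api_key(api_key: str) -> str:
    """Return the SHA-256 hex digest that would be stored server-side."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Compare digests in constant time to avoid timing side channels."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash)
```

A plain, unsalted SHA-256 is reasonable here only because the key itself is random and long. Human-chosen passwords would need a slow, salted hash instead.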
Trust Architecture
Customer Network            Internet              DeployGuard
       │                        │                      │
       │   HTTPS (outbound)     │                      │
       │───────────────────────>│─────────────────────>│
       │                        │                      │
       │   HTTPS response       │                      │
       │<───────────────────────│<─────────────────────│
       │                        │                      │
       │   No inbound traffic   │                      │
       │           ✗            │                      │
Stop Discovering Broken Deployments From Users
Kubernetes says Ready. CI/CD says success. But the deployment broke something.
DeployGuard tells you in seconds — not when a user files a ticket.
No credit card required. No sales call. Install in 10 minutes.