Getting started

Quickstart

Get AlertINT running as a single self-hosted binary, then connect your AI agent over the Model Context Protocol (MCP) to start analyzing incidents with production context.

1. Download the Binary

Get the latest release for your architecture. AlertINT is distributed as a static Go binary with no external dependencies.

2. Create Configuration

Define your settings in a config.yaml file; secrets are referenced through environment variables. This lets AlertINT authenticate with your alert source and AI backend. See the Configuration section for more details.


3. Start the Server

Launch the core service to start ingestion. This will begin listening for incoming webhooks and indexing incident history.

alertint serve --config config.yaml

4. Start Sending Alerts

Point your alert source at the AlertINT webhook receiver so that AlertINT starts receiving and analyzing alerts. See the Integrations section for more information.
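
Before wiring up a real alert source, it can help to send one hand-built alert. The sketch below builds (but does not send) such a request. The receiver URL and path are assumptions rather than documented AlertINT defaults, the token and label values are made up, and the payload follows the general shape of Alertmanager's webhook JSON.

```python
import json
import urllib.request

# Hypothetical address: adjust to wherever your AlertINT receiver listens.
ALERTINT_URL = "http://localhost:8080/webhook"


def build_test_alert_request(url: str, token: str) -> urllib.request.Request:
    """Build a POST request carrying a minimal Alertmanager-style payload."""
    payload = {
        "version": "4",
        "status": "firing",
        "receiver": "alertint",
        "groupLabels": {"alertname": "HighCPU"},
        "commonLabels": {"alertname": "HighCPU", "severity": "critical"},
        "commonAnnotations": {},
        "externalURL": "http://localhost:9093",
        "alerts": [
            {
                "status": "firing",
                "labels": {
                    "alertname": "HighCPU",
                    "severity": "critical",
                    "service": "checkout",
                },
                "annotations": {"summary": "CPU above 90% for 5 minutes"},
                "startsAt": "2024-01-01T00:00:00Z",
                "endsAt": "0001-01-01T00:00:00Z",
                "generatorURL": "http://localhost:9090/graph",
                "fingerprint": "a1b2c3d4e5f60718",
            }
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_test_alert_request(ALERTINT_URL, "example-token")
# To actually send it, call urllib.request.urlopen(req).
```

The request is only constructed here, so the snippet is safe to run before the server is up.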

5. Connect an MCP Client

Open your preferred tool (Claude Code, Cursor, or Windsurf) and point it at the local MCP endpoint. Verify connectivity by asking a specific operational question:

List recent AlertINT incidents and summarize the most critical one.

Configuration

AlertINT runs from a single config.yaml plus environment-based secrets.

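  • Alertmanager webhook bearer token (alertmanager.webhook_token).
  • Anthropic API key, read from an environment variable (anthropic.api_key).
  • Local state path (state.path).
  • Slack webhook URL, read from an environment variable; optional (slack.webhook_url).
  • Prometheus base URL and an optional bearer token read from an environment variable (prometheus.base_url, prometheus.bearer_token).
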
alertmanager:
  webhook_token: ${ALERTINT_WEBHOOK_TOKEN}
anthropic:
  api_key: ${ANTHROPIC_API_KEY}
state:
  path: ./.alertint
slack:
  webhook_url: ${SLACK_WEBHOOK_URL}
prometheus:
  base_url: http://localhost:9090
  bearer_token: ${PROMETHEUS_BEARER_TOKEN}
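
The `${NAME}` placeholders above stand for environment variables. As a rough illustration of that kind of substitution, here is a minimal sketch. It is not AlertINT's actual loader; in particular, how AlertINT treats an unset optional variable is not specified here, and this sketch simply fails loudly.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=None) -> str:
    """Replace each ${NAME} with env[NAME]; raise KeyError if NAME is unset."""
    env = os.environ if env is None else env

    def _substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_substitute, value)


# A stand-in environment, so the example does not depend on your shell.
example_env = {"ALERTINT_WEBHOOK_TOKEN": "s3cret"}
resolved = expand_env("${ALERTINT_WEBHOOK_TOKEN}", example_env)
```

Values without placeholders, such as the Prometheus base URL, pass through unchanged.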

Concepts

Architecture

One self-hosted binary sits between Alertmanager and your AI agent, and turns raw alerts into context worth investigating.

Phase: Ingest

1. Webhook Transmission

Alertmanager fires a POST to the AlertINT webhook receiver over HTTP(S). The payload is the standard Alertmanager webhook JSON — no custom format required.

Component: Alertmanager → AlertINT webhook receiver
Protocol: HTTP POST · Alertmanager webhook v2
Auth: Bearer token (ALERTINT_WEBHOOK_TOKEN)
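
To make the bearer-token step concrete, here is a minimal sketch of how a receiver could validate the `Authorization` header. The function name is hypothetical and this is not AlertINT's own code; it only illustrates the check, using a constant-time comparison.

```python
import hmac


def is_authorized(authorization_header, expected_token: str) -> bool:
    """Return True only for 'Bearer <token>' where <token> matches expected_token."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how much of the token matched via timing.
    return hmac.compare_digest(
        presented.strip().encode("utf-8"), expected_token.encode("utf-8")
    )
```

A request with a missing header, a different scheme, or a wrong token is rejected before the payload is parsed.
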

2. Data Persistence & Deduplication

Received alerts are written to local state (SQLite by default). Duplicate firings of the same alert fingerprint within the dedup window are collapsed — one record per logical alert.

Component: AlertINT ingest pipeline
Storage: Local SQLite (configurable path)
Dedup key: Alertmanager alert fingerprint
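
The deduplication rule can be sketched in a few lines. This version keeps state in memory and uses a sliding window, where each repeat firing refreshes the timer. Both choices are assumptions: the text above only says that duplicates within the dedup window are collapsed and that state lives in SQLite.

```python
from datetime import datetime, timedelta


class Deduplicator:
    """Collapse repeat firings of one fingerprint inside a time window."""

    def __init__(self, window: timedelta):
        self.window = window
        self._last_seen = {}  # fingerprint -> time of the most recent firing

    def is_new(self, fingerprint: str, received_at: datetime) -> bool:
        """Return True if this firing should create a new record."""
        last = self._last_seen.get(fingerprint)
        self._last_seen[fingerprint] = received_at
        return last is None or received_at - last > self.window
```

Only firings for which `is_new` returns True would be written as a new record; the rest update the existing one.
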

3. Correlation Engine

Alerts that fire within a configurable time window and share common label dimensions (cluster, namespace, service, alertname) are grouped into a single incident record. Grouping is deterministic and re-evaluated as new alerts arrive.

Component: AlertINT correlation engine
Grouping keys: cluster · namespace · service · alertname
Window: Configurable (default 5 min)
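
One way to picture the grouping is the sketch below. It assumes that all four grouping keys must match and that an incident stays open while the gap between consecutive alerts is within the window. The real engine may weigh the keys differently, and the dictionary field names are illustrative.

```python
from datetime import datetime, timedelta

GROUPING_KEYS = ("cluster", "namespace", "service", "alertname")


def correlate(alerts, window=timedelta(minutes=5)):
    """Group alert dicts (with 'labels' and 'starts_at') into incident dicts."""
    incidents = []
    latest_for_key = {}  # grouping key -> most recent incident with that key
    # Sorting first makes the result independent of arrival order.
    ordered = sorted(alerts, key=lambda a: (a["starts_at"], a.get("fingerprint", "")))
    for alert in ordered:
        key = tuple(alert["labels"].get(k, "") for k in GROUPING_KEYS)
        current = latest_for_key.get(key)
        if current is not None and alert["starts_at"] - current["last_seen"] <= window:
            current["alerts"].append(alert)
            current["last_seen"] = alert["starts_at"]
        else:
            current = {"key": key, "alerts": [alert], "last_seen": alert["starts_at"]}
            incidents.append(current)
            latest_for_key[key] = current
    return incidents
```

Re-running `correlate` over the full alert set whenever a new alert arrives gives the same grouping for the same input, which is one simple way to get deterministic re-evaluation.
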

4. AI Synthesis

Once an incident is stable (no new alerts for the settle window), AlertINT sends the grouped alert payloads to the configured LLM (Anthropic Claude). The model returns a structured finding: probable cause, severity assessment, and suggested next checks. The finding is stored locally alongside the incident.

Component: AlertINT AI worker
Model: Anthropic Claude (default: claude-3-5-sonnet)
Auth: ANTHROPIC_API_KEY env var
Output: Structured finding · stored in local state
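
The two ideas in this step, waiting for the incident to settle and storing a structured finding, might look roughly like this. The field names (`probable_cause`, `severity`, `next_checks`) are hypothetical and mirror the prose above rather than a documented schema. The model call itself is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Finding:
    probable_cause: str
    severity: str
    next_checks: list = field(default_factory=list)


def is_settled(last_alert_at: datetime, now: datetime, settle_window: timedelta) -> bool:
    """An incident is stable once no alert has arrived for the settle window."""
    return now - last_alert_at >= settle_window


def parse_finding(raw: dict) -> Finding:
    """Validate a model response that has already been decoded from JSON."""
    missing = {"probable_cause", "severity", "next_checks"} - raw.keys()
    if missing:
        raise ValueError(f"finding is missing fields: {sorted(missing)}")
    return Finding(raw["probable_cause"], raw["severity"], list(raw["next_checks"]))
```

Validating the response before storing it keeps malformed model output out of local state.
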

5. Outbound Notification

The AI finding is posted to the configured Slack webhook. The message links back to the incident and includes the probable-cause summary, so on-call engineers can judge whether the incident is worth opening.

Component: AlertINT notifier
Target: Slack incoming webhook (optional)
Auth: SLACK_WEBHOOK_URL env var
Fallback: stdout when Slack is not configured
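
The Slack-or-stdout behaviour can be sketched as follows. Slack incoming webhooks accept a JSON body with a `text` field. The message wording, the incident URL, and the function names are illustrative assumptions, not AlertINT's actual output.

```python
import io
import json
import sys
import urllib.request


def build_slack_message(incident_url: str, probable_cause: str) -> dict:
    """Build the JSON body for a Slack incoming webhook."""
    return {"text": f"AlertINT finding: {probable_cause}\nIncident: {incident_url}"}


def _http_post(url: str, body: bytes) -> None:
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        response.read()


def notify(message: dict, webhook_url=None, post=_http_post, out=sys.stdout) -> str:
    """Send to Slack when a webhook URL is set, else write to `out`.

    Returns the name of the channel that was used.
    """
    if webhook_url:
        post(webhook_url, json.dumps(message).encode("utf-8"))
        return "slack"
    out.write(message["text"] + "\n")
    return "stdout"


# With no webhook URL configured, the finding goes to the fallback stream.
buffer = io.StringIO()
message = build_slack_message("http://localhost/incidents/42", "disk full on db-1")
channel = notify(message, out=buffer)
```

Passing `post` as a parameter keeps the network call replaceable, which makes the fallback logic easy to exercise without reaching Slack.
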

Phase: Investigate

6. Agent Entry via MCP

An engineer opens their MCP-capable AI client (Claude Code, Cursor, Windsurf, or any MCP-compatible tool). The client is pointed at the AlertINT MCP server endpoint, which runs as part of the same binary. No separate daemon is needed.

Component: AlertINT MCP server
Protocol: Model Context Protocol (MCP)
Clients: Claude Code · Cursor · Windsurf · any MCP client
Transport: stdio or HTTP-SSE (configurable)

7. Evidence Query

The agent calls AlertINT MCP tools to list recent incidents, retrieve alert payloads, and read the stored AI finding. All data is served from local state — no external calls at this stage.

MCP tools: list_incidents · get_incident · get_alerts · get_finding
Data source: Local AlertINT state (read-only)
Writes: None
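
On the wire, an MCP tool invocation is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below shows what a client might send for two of the tools listed above. The argument names (`limit`, `incident_id`) are hypothetical, since the tool schemas are not given here.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call(name: str, arguments=None) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments or {}},
        }
    )


# Argument names below are assumptions made for illustration only.
list_request = json.loads(tool_call("list_incidents", {"limit": 5}))
get_request = json.loads(tool_call("get_incident", {"incident_id": "inc-123"}))
```

In practice the MCP client builds these messages for you; the sketch is only meant to show that each tool call is a plain, inspectable request.
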

8. Telemetry Context

The agent issues read-only PromQL queries through AlertINT MCP tools. AlertINT proxies the query to the configured Prometheus instance and returns the result. The agent can inspect CPU, memory, latency, error rate, or any metric stored in Prometheus — scoped to the incident time window.

MCP tools: query_prometheus · query_prometheus_range
Backend: Prometheus HTTP API (read-only)
Auth: PROMETHEUS_BEARER_TOKEN env var (optional)
Writes: None (PromQL is read-only)
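
A range query scoped to the incident window maps onto the Prometheus HTTP API endpoint `/api/v1/query_range`. The sketch below builds such a URL. The 15-minute padding around the incident, the 30-second step, and the example metric are arbitrary choices, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def range_query_url(base_url, promql, start, end, step_seconds=30):
    """Build a Prometheus /api/v1/query_range URL for the given window."""
    params = urlencode(
        {
            "query": promql,
            "start": start.timestamp(),
            "end": end.timestamp(),
            "step": step_seconds,
        }
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def auth_headers(bearer_token=None):
    """Return an Authorization header only when a token is configured."""
    return {"Authorization": f"Bearer {bearer_token}"} if bearer_token else {}


incident_start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
url = range_query_url(
    "http://localhost:9090/",
    "rate(http_requests_total[5m])",
    incident_start - timedelta(minutes=15),
    incident_start + timedelta(minutes=15),
)
```

Because only GET-style query endpoints are involved, nothing in this path can modify Prometheus data.
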

9. Decision Point

The agent synthesizes alert payloads, the stored AI finding, and live metric context into a response. The engineer decides the next action — re-query, escalate, or begin remediation — with full context already in the conversation.

Actor: Engineer + AI agent
AlertINT role: Context provider only
Next step: Engineer-controlled

Ready to Remediate

The engineer acts with full incident context — alert history, correlated findings, and live metrics — all sourced read-only from within the local runtime.

MCP-first investigation

An MCP server is the primary way you and your agent interact with AlertINT. A typical investigation might use prompts like these:

  • List recent AlertINT incidents.
  • Open the latest critical incident and summarize the evidence.
  • Show the alert labels and annotations for this incident.
  • Query Prometheus for CPU and memory around the incident window.
  • Compare the finding with the metric trend and suggest next checks.

Scope and limits

These are deliberate boundaries, not missing features.

Read-only by design — AlertINT observes and reports, and never touches your infrastructure.
Self-hosted and local — your alert data and incident context stay on your machine.
Open-core — the core runtime is open source; enterprise features come later, on top.

What it does not do

  • No remediation, silences, or routing changes.
  • No Alertmanager, Kubernetes, or infrastructure writes.
  • No ticketing or paging integrations (PagerDuty, Jira, Linear).

AlertINT is read-only so that teams can adopt it without risking changes to their infrastructure. Operator-controlled, approval-gated write workflows are a far-future direction, not something AlertINT does today.

Reference

FAQ

Short answers for the decisions that matter before you install it.

Do I have to host it myself?

Yes — AlertINT is self-hosted. It runs as a single binary with local state, so your alert data stays with you.

Does it replace Alertmanager?

No. Alertmanager still routes and manages alerts. AlertINT receives a webhook copy and builds investigation context.

Does it create silences or change routing?

No. AlertINT is read-only — it never changes Alertmanager, Kubernetes, or your infrastructure.

Why MCP?

Because engineers increasingly investigate through agentic tools. MCP lets those tools inspect AlertINT state and query telemetry without copy-pasting alert text into chat.

Why Prometheus?

Alert payloads alone are shallow. Read-only Prometheus queries let the connected agent inspect metrics around the incident window.

Can it remediate incidents?

No. AlertINT is read-only by design. Operator-controlled, approval-gated write workflows are a far-future direction.