RTBH mitigation
When the engine reports an attack, Kapkan mitigates it by announcing a remotely-triggered
blackhole (RTBH) route for the targeted host: a /32 (IPv4) or /128 (IPv6) prefix carried
over an embedded GoBGP speaker, tagged with your RTBH
community and pointed at a discard next-hop. Your edge routers match the community and drop all
traffic to that host, so the flood is discarded before it reaches your network's core.
Mitigation is destination-based: the blackhole drops all traffic addressed to the target. For an incoming flood this stops the attack. For an outgoing attack from a compromised host it takes that host offline, which usually stops the abuse; the outbound flood itself is dropped only where your edge also drops traffic sourced from blackholed prefixes, for example with uRPF.
Warning: Dry-run by default
Until you explicitly set dry_run: false, every would-be blackhole is computed, logged and
exposed through the API — but never announced to your routers. Validate detection and BGP
peering against production telemetry before any route can be sent. See Going live.
Mitigation methods
blackhole (RTBH) is the default method and the subject of the rest of this page. Two other
methods are available via the mitigation key, each overridable per hostgroup:
- FlowSpec (mitigation: flowspec): drop only the attack vector with BGP FlowSpec rules (RFC 8955/8956) instead of blackholing the whole victim.
- Escalation ladders (escalation:): step the response up the longer an attack persists, from alert to FlowSpec to blackhole.
Both share the same ban lifecycle, TTL, unban hysteresis and max_active_bans cap described below.
BGP configuration
The bgp block defines Kapkan's BGP identity, the blackhole next-hops, the RTBH community, and
the eBGP peers it announces to.
bgp:
  local_asn: 65000            # Kapkan's local ASN
  router_id: "198.51.100.1"   # must be a valid IPv4 address
  next_hop: "192.0.2.1"       # IPv4 discard next-hop
  next_hop6: "100::1"         # IPv6 discard next-hop (optional)
  community: "65000:666"      # RTBH community, ASN:value
  # communities: ["65000:666", "65000:777"]  # full set; overrides `community`
  # local_pref: 100           # optional LOCAL_PREF for iBGP peers; 0 = omit
  neighbors:
    - address: "198.51.100.254"
      remote_asn: 65001
      port: 179               # optional; defaults to 179, override for testing
| Key | Meaning |
|---|---|
| local_asn | Kapkan's local ASN for the eBGP sessions. Required. |
| router_id | The BGP router ID. Must be a valid IPv4 address. |
| next_hop | The IPv4 discard next-hop announced with each /32 blackhole. |
| next_hop6 | The IPv6 discard next-hop announced with each /128 blackhole. Optional; when unset, Kapkan falls back to 100::1 (an address in 100::/64, the RFC 6666 discard-only prefix). |
| community | The RTBH community in ASN:value form (for example 65000:666). Parsed once at load into the wire value sent with every route. |
| communities | Optional list of communities that overrides community when set, for upstreams that expect a full set. |
| local_pref | Optional LOCAL_PREF path attribute attached to blackholes (meaningful to iBGP peers). Default 0 omits it. |
| neighbors[] | The eBGP peers Kapkan announces to. Each entry takes an address, a remote_asn, and an optional port (default 179, intended for testing). |
community, communities, next_hop, next_hop6 and local_pref can all be overridden per
hostgroup — see Per-hostgroup BGP attributes.
The blackhole route Kapkan builds for a target reads as <prefix> next-hop <next_hop> community <community> — exactly the string surfaced in the route field of the ban object.
IPv6 targets use next_hop6 (or the 100::1 fallback) in place of next_hop.
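The route string and the community's wire value can be illustrated with a short Python sketch. This is not Kapkan's code: the function names are hypothetical, and the packing shown is the standard 32-bit community layout of RFC 1997 (16-bit ASN, 16-bit value), which is assumed here to be what "wire value" refers to.

```python
import ipaddress
from typing import Optional

IPV6_FALLBACK = "100::1"  # the fallback named in the table above


def community_wire_value(community: str) -> int:
    """Pack an ASN:value community into the 32-bit RFC 1997 form."""
    asn_text, value_text = community.split(":")
    asn, value = int(asn_text), int(value_text)
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError(f"community out of range: {community}")
    return (asn << 16) | value


def blackhole_route(target: str, next_hop: str, community: str,
                    next_hop6: Optional[str] = None) -> str:
    """Build '<prefix> next-hop <next_hop> community <community>'."""
    addr = ipaddress.ip_address(target)
    if addr.version == 4:
        prefix, hop = f"{addr}/32", next_hop
    else:
        # IPv6 targets use next_hop6, or the fallback when it is unset.
        prefix, hop = f"{addr}/128", next_hop6 or IPV6_FALLBACK
    return f"{prefix} next-hop {hop} community {community}"


print(blackhole_route("203.0.113.66", "192.0.2.1", "65000:666"))
print(blackhole_route("2001:db8::66", "192.0.2.1", "65000:666"))
print(hex(community_wire_value("65000:666")))
```

With the example configuration above, the IPv4 target yields 203.0.113.66/32 next-hop 192.0.2.1 community 65000:666, and the IPv6 target falls back to 100::1 because no next_hop6 is passed.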
Note: Peering happens in dry-run
The BGP speaker peers with your neighbors even while dry_run is true — only route
announcements are gated on the flag. This lets you confirm sessions reach ESTABLISHED (logged
as bgp peer state) and validate connectivity before you announce a single route.
Per-hostgroup BGP attributes
The global bgp block sets the default blackhole next-hops and RTBH community. A
hostgroup can override any of them, so different customers signal their own
upstreams:
bgp:
  next_hop: "192.0.2.1"
  community: "65000:666"       # the default RTBH community
hostgroups:
  - name: customer-a
    networks: ["203.0.113.64/26"]
    bgp:
      communities: ["65000:100", "65001:200"]  # customer-A's own blackhole signal
      next_hop: "192.0.2.50"                   # and discard next-hop
      local_pref: 250
Each field left unset inherits the global bgp value, so a group can override just its community
while sharing the global next-hop. The resolved attributes are frozen on each ban when it is
created: a config reload changes only future bans, never the route a live ban already announced.
The per-ban next_hop, community and local_pref are visible in /api/v1/bans and in the
route field.
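The inheritance and freezing rules can be sketched in Python as follows. This is an illustration under assumptions, not Kapkan's implementation: resolve_bgp and effective_communities are hypothetical helpers, and the precedence between a group-level community and a global communities list is not specified on this page, so the sketch simply lets any resolved communities list win.

```python
import copy
from typing import Any, Dict, List

OVERRIDABLE = ("community", "communities", "next_hop", "next_hop6", "local_pref")


def resolve_bgp(global_bgp: Dict[str, Any], group_bgp: Dict[str, Any]) -> Dict[str, Any]:
    """Per-key inheritance: the hostgroup value wins, otherwise the global one."""
    resolved = {key: group_bgp.get(key, global_bgp.get(key)) for key in OVERRIDABLE}
    return {key: value for key, value in resolved.items() if value is not None}


def effective_communities(resolved: Dict[str, Any]) -> List[str]:
    """`communities` overrides `community` when set (assumed at any level)."""
    if resolved.get("communities"):
        return list(resolved["communities"])
    return [resolved["community"]] if "community" in resolved else []


global_bgp = {"next_hop": "192.0.2.1", "community": "65000:666"}
customer_a = {
    "communities": ["65000:100", "65001:200"],
    "next_hop": "192.0.2.50",
    "local_pref": 250,
}

resolved = resolve_bgp(global_bgp, customer_a)
# Freeze a private copy on the ban so a later reload cannot alter it.
ban_attrs = copy.deepcopy(resolved)
customer_a["communities"].append("65000:999")  # simulated config change

print(effective_communities(ban_attrs))
print(ban_attrs["next_hop"], ban_attrs["local_pref"])
```

The deep copy is what models "frozen on each ban": the simulated config change after the copy does not reach the attributes already attached to the ban.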
The ban lifecycle
A ban is the unit Kapkan tracks for each blackhole decision. Its lifecycle is bounded at every step by the safety model — there is no path to a permanent or runaway ban.
- Announce on detection. When the engine reports a new per-host attack and the host's policy permits banning, Kapkan announces the blackhole route (or, in dry-run, records the would-be route) and the ban enters the active state. A second attack (for example an incoming and an outgoing flood on the same host at once) shares the one RTBH route rather than announcing twice.
- Auto-withdraw on TTL. Every announcement carries a TTL. After ban.ttl_seconds the route is auto-withdrawn even if the attack is still ongoing; a persisting attack refreshes the TTL, but never beyond a fresh TTL from now. There are no permanent bans.
- Unban with hysteresis. A ban driven by an attack is withdrawn only after traffic stays below threshold for ban.unban_hysteresis_seconds. This anti-flap delay prevents a borderline attack from rapidly announcing and withdrawing the same route.
- Hard cap. ban.max_active_bans is an absolute ceiling on simultaneous active bans. Past the cap, new bans are refused (returned as a ban in the rejected state with reason max_active_bans reached) and operators are alerted, so Kapkan can never blackhole half your network in a broad attack.
ban:
  ttl_seconds: 1800              # every announcement auto-withdraws after this
  unban_hysteresis_seconds: 60   # traffic must stay below threshold this long before withdraw
  max_active_bans: 100           # hard cap on simultaneous bans
A ban is also withdrawn early if a config reload removes its target from the protected
networks — Kapkan will not keep a route up for address space it no longer protects.
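The lifecycle rules above can be modeled as a small state table. The Python sketch below is a simplified illustration, not Kapkan's mitigator: the class and method names are hypothetical, the withdrawal reasons "ttl expired" and "attack ended" are placeholders, and it assumes a persisting attack is signalled by repeated on_attack calls.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Ban:
    target: str
    expires_at: float
    state: str = "active"
    reason: str = ""
    below_since: Optional[float] = None  # when traffic first fell below threshold


class BanTable:
    def __init__(self, ttl: float, hysteresis: float, max_active: int):
        self.ttl, self.hysteresis, self.max_active = ttl, hysteresis, max_active
        self.active: Dict[str, Ban] = {}

    def on_attack(self, target: str, now: float) -> Ban:
        ban = self.active.get(target)
        if ban:
            # A further attack on the same host shares the existing route;
            # the TTL is refreshed, but never beyond now + ttl.
            ban.expires_at = now + self.ttl
            ban.below_since = None
            return ban
        if len(self.active) >= self.max_active:
            return Ban(target, now, "rejected", "max_active_bans reached")
        ban = self.active[target] = Ban(target, now + self.ttl)
        return ban

    def on_quiet(self, target: str, now: float) -> None:
        """Record that traffic to the target is below threshold."""
        ban = self.active.get(target)
        if ban and ban.below_since is None:
            ban.below_since = now

    def tick(self, now: float) -> List[Ban]:
        """Withdraw bans whose TTL expired or whose hysteresis window elapsed."""
        withdrawn = []
        for target, ban in list(self.active.items()):
            if now >= ban.expires_at:
                ban.reason = "ttl expired"      # placeholder reason text
            elif ban.below_since is not None and now - ban.below_since >= self.hysteresis:
                ban.reason = "attack ended"     # placeholder reason text
            else:
                continue
            ban.state = "withdrawn"
            withdrawn.append(self.active.pop(target))
        return withdrawn


table = BanTable(ttl=1800, hysteresis=60, max_active=1)
first = table.on_attack("203.0.113.66", now=0)
second = table.on_attack("203.0.113.67", now=0)  # over the cap
print(first.state, second.state, second.reason)
```

Passing the clock in as now keeps the sketch deterministic; every exit from the active state goes through tick, which is why no path leaves a ban up indefinitely.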
Dry-run
In dry-run mode (dry_run: true, the default) Kapkan does everything except send the route.
The target prefix, next-hop, community and the full route string are computed, the decision is
logged as DRY-RUN: would announce blackhole route (not sent), and the ban is tracked in the API
with dry_run: true and the live active/withdrawn states. TTL expiry and hysteresis run
exactly as they would in production, so the entire lifecycle is observable before any route
reaches a router.
Nothing is announced until you set dry_run: false and reload (SIGHUP or
POST /api/v1/config/reload). Because peering happens in dry-run, the only
behavior that changes when you flip the flag is the announce/withdraw itself.
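The gate can be pictured as a single branch around the announce call. The following Python sketch is illustrative only: apply_ban and the announce callback are hypothetical names, and the returned dictionary carries only a few of the ban fields.

```python
import logging
from typing import Callable, Dict

log = logging.getLogger("mitigator")  # hypothetical logger name


def apply_ban(route: str, dry_run: bool,
              announce: Callable[[str], None]) -> Dict[str, object]:
    """Track the ban either way; hand the route to the speaker only when live."""
    if dry_run:
        # Message text taken from this page; the route is attached as a field.
        log.info("DRY-RUN: would announce blackhole route (not sent)",
                 extra={"route": route})
    else:
        announce(route)
    return {"route": route, "state": "active", "dry_run": dry_run}


sent = []  # stands in for the BGP speaker
route = "203.0.113.66/32 next-hop 192.0.2.1 community 65000:666"
print(apply_ban(route, dry_run=True, announce=sent.append), sent)
print(apply_ban(route, dry_run=False, announce=sent.append), sent)
```

Both calls return a tracked ban in the active state; only the second one appends the route to sent, which mirrors the claim that the announce/withdraw step is the only thing the flag changes.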
Note: Validate first
Run against production telemetry in dry-run, confirm detections fire on the right prefixes and
the route fields are correct, then turn off dry-run. The full procedure is in
Going live.
Manual bans
Operators can blackhole or release a host directly through the REST API:
POST /api/v1/ban
Content-Type: application/json

{"ip": "203.0.113.66"}

POST /api/v1/unban
Content-Type: application/json

{"ip": "203.0.113.66"}
A manual ban is created with manual: true and follows the same TTL and hysteresis as an
automatic one. It honors every safety rule with no exceptions: a whitelisted target is refused
(returns 409, reason whitelisted), a target outside the configured networks is refused
(returns 409, reason outside configured networks), and exceeding max_active_bans is refused
(returns 409). Both POST endpoints require the application/json content type. See the
API reference for the full request and response shapes.
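A minimal client for these two endpoints might look like the Python sketch below, using only the standard library. The base URL is a placeholder, authentication is not shown because this page does not describe it, and the response body is returned as raw text since its shape is documented in the API reference rather than here.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple

BASE_URL = "http://127.0.0.1:8080"  # placeholder; use your Kapkan API address


def build_request(action: str, ip: str,
                  base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST to /api/v1/ban or /api/v1/unban with a JSON body."""
    if action not in ("ban", "unban"):
        raise ValueError("action must be 'ban' or 'unban'")
    return urllib.request.Request(
        f"{base_url}/api/v1/{action}",
        data=json.dumps({"ip": ip}).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # required by both endpoints
        method="POST",
    )


def send(request: urllib.request.Request) -> Tuple[int, str]:
    """Return (status, body). A 409 means a safety rule refused the ban."""
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")


request = build_request("ban", "203.0.113.66")
print(request.get_method(), request.full_url, request.data.decode("utf-8"))
# status, body = send(request)  # uncomment against a running instance
```

Treating HTTPError as a normal return value keeps a 409 refusal (whitelisted, outside configured networks, or cap reached) on the same code path as a success, so callers can log the refusal instead of crashing.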
The ban object
The API returns each ban as a JSON object. These are the fields exactly as serialized by the mitigator:
| Field | Type | Meaning |
|---|---|---|
| target | string | The blackholed host address (IPv4 or IPv6). |
| prefix | string | The host prefix announced: /32 for IPv4, /128 for IPv6. |
| metric | string | The metric that triggered the attack-driven ban (omitted for manual bans). |
| rate | number | The observed rate at ban time, in the metric's units (omitted when zero). |
| threshold | number | The threshold the rate crossed (omitted when zero). |
| next_hop | string | The discard next-hop announced with the route. |
| community | string | The RTBH community in ASN:value form. |
| local_pref | number | The LOCAL_PREF attached to the route (omitted when zero). |
| route | string | The full route as <prefix> next-hop <next_hop> community <community>, or a flowspec: ... summary for FlowSpec bans. |
| state | string | One of active, withdrawn, or rejected. |
| dry_run | boolean | Whether the ban was made in dry-run mode (route not actually sent). |
| manual | boolean | true for operator-requested bans, false for automatic ones. |
| started_at | timestamp | When the ban was created. |
| expires_at | timestamp | When the TTL auto-withdraws the route. |
| withdrawn_at | timestamp | When the route was withdrawn (omitted while active). |
| reason | string | Why the ban was withdrawn or rejected (omitted while active). |
| method | string | The mitigation method that produced the ban: blackhole or flowspec. |
| flowspec | array | The generated FlowSpec rules, when the method is flowspec. |
| escalation | array | The configured escalation ladder, when one is set. |
| escalation_step | number | The index of the ladder's current rung. |
A rejected ban is never announced — it records a refusal (whitelisted, outside networks, or
cap reached) so the decision is auditable in the API and notifications.
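Because several fields are omitted when empty, a consumer should read them defensively. The Python sketch below shows one way to do that; the sample payload is invented for illustration (field names follow the table, values are made up), and it assumes the API returns a JSON array of ban objects, which this page does not confirm.

```python
import json
from typing import Any, Dict, List

# Hypothetical payload: field names from the table above, values made up.
SAMPLE = """
[
  {"target": "203.0.113.66", "prefix": "203.0.113.66/32",
   "next_hop": "192.0.2.1", "community": "65000:666",
   "route": "203.0.113.66/32 next-hop 192.0.2.1 community 65000:666",
   "state": "active", "dry_run": true, "manual": true, "method": "blackhole"},
  {"target": "203.0.113.70", "state": "rejected", "reason": "whitelisted",
   "dry_run": false, "manual": true}
]
"""


def describe(ban: Dict[str, Any]) -> str:
    """One-line summary that tolerates omitted fields."""
    if ban.get("state") == "rejected":
        return f"{ban['target']}: rejected ({ban.get('reason', 'no reason given')})"
    mode = "dry-run" if ban.get("dry_run") else "live"
    trigger = "manual" if ban.get("manual") else ban.get("metric", "unknown metric")
    return f"{ban['target']}: {ban.get('state')} [{mode}, {trigger}] {ban.get('route', '')}".rstrip()


def announced_routes(bans: List[Dict[str, Any]]) -> List[str]:
    """Routes actually on the wire: active and not dry-run."""
    return [b["route"] for b in bans
            if b.get("state") == "active" and not b.get("dry_run") and "route" in b]


bans = json.loads(SAMPLE)
for ban in bans:
    print(describe(ban))
print(announced_routes(bans))
```

In the sample, the only active ban is a dry-run ban, so announced_routes returns an empty list: the ban is tracked and visible, but nothing was sent to a router.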
Related
- FlowSpec mitigation — surgical drops that spare the victim's other traffic.
- Escalation ladders — step the response up the longer an attack persists.
- Safety model — the dry-run, TTL, hysteresis, cap and whitelist rules enforced in code.
- Going live — validate detection and peering, then turn off dry-run.
- REST API — the ban endpoints, request shapes and response codes.