Event Enrichment FAQ

  • What is Event Enrichment?

Event Enrichment is the process of classifying and enhancing events with information that accelerates remediation efforts. Enrichments typically include the name / contact details for the person / team associated with the event source (application, server, router, switch, etc.), as well as any known steps for triage and remediation.

  • Why would I use Event Enrichment?

To provide a live framework for your IT Operations Runbook and to decrease remediation time for network events. Most IT Operations Runbooks are static snapshots which are not continually updated. An outdated Runbook, missing critical information, causes complications, frustration, and increased time to repair at the worst possible time.

  • Give me an example of the Event Enrichment process

 An event arrives from a Network Management System (NMS) denoting the failure of a critical interface on a router. Upon receipt, we add escalation and remediation information to the event. The enriched event is then forwarded on to the Network Operations Center / On-call engineer responsible for the event source for remediation.

  • What does an enriched event look like?

[NAGIOS]

Notification Type: PROBLEM
HOST: dbserver1
State: DOWN
Address: 70.86.17.12
Info: CRITICAL – Host Unreachable (70.86.17.12)
Date/Time: Sat Jan 16 11:09:23 JST 2013

ESCALATION:
This is a CRITICAL alert which needs immediate escalation to the site DEVOPS team. Use the Pagerduty DEVOPS service.

REMEDIATION:
1) Attempt to ping the host from the nagios server
2) If ping is successful, attempt to ssh to the host (ops1@70.86.17.12)
3) If ssh is not successful, initiate DB_HOST_DOWN recipe sequence
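
As a rough illustration of how an event like the one above could be assembled, here is a minimal Python sketch. It assumes the raw event arrives as a dictionary and that escalation and remediation notes live in a lookup table keyed by host and state. The names RUNBOOK and enrich are hypothetical and are not part of Nagios or any other product.

```python
# Hypothetical runbook: maps (host, state) to escalation and remediation notes.
RUNBOOK = {
    ("dbserver1", "DOWN"): {
        "escalation": (
            "This is a CRITICAL alert which needs immediate escalation "
            "to the site DEVOPS team. Use the Pagerduty DEVOPS service."
        ),
        "remediation": [
            "Attempt to ping the host from the nagios server",
            "If ping is successful, attempt to ssh to the host",
            "If ssh is not successful, initiate DB_HOST_DOWN recipe sequence",
        ],
    },
}


def enrich(event, runbook=RUNBOOK):
    """Return the event as text, with runbook notes appended when known."""
    lines = [
        f"[{event['source']}]",
        "",
        f"HOST: {event['host']}",
        f"State: {event['state']}",
        f"Info: {event['info']}",
    ]
    entry = runbook.get((event["host"], event["state"]))
    if entry is not None:
        lines += ["", "ESCALATION:", entry["escalation"], "", "REMEDIATION:"]
        # Number the remediation steps 1), 2), 3) as in the sample alert.
        lines += [f"{n}) {step}" for n, step in enumerate(entry["remediation"], 1)]
    return "\n".join(lines)


raw_event = {
    "source": "NAGIOS",
    "host": "dbserver1",
    "state": "DOWN",
    "info": "CRITICAL - Host Unreachable (70.86.17.12)",
}
print(enrich(raw_event))
```

An event with no runbook entry passes through unchanged in this sketch, which makes gaps in the runbook easy to spot.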

  • How do I implement the Event Enrichment process?

Use the Event Enrichment cycle:

Triage your events:

Categorize events! Events are either actionable or noise. If they are actionable, they need enrichment. If they are noise, eliminate them.

Suppress noise:

[Figure: The Event Enrichment Cycle]

Get rid of the noise! If noise is overwhelming your team (Yes! NMS systems absolutely excel at generating noise), then critical events are lost.

Enrich your events:

Add critical information! Now that you know that an event is actionable, do your team a giant favor, and add critical escalation/remediation information to the event before it arrives at the NOC. If groggy engineers / NOC operators get a middle-of-the-night event, why should they have to start looking for the escalation and remediation information? If it is already in the event, they can immediately start working on resolving the problem.
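
The three steps of the cycle can be sketched as a small Python pipeline. This is only an illustration under simple assumptions: events are dictionaries, an event counts as actionable when its state is in a fixed set, and the runbook is keyed by host. The names ACTIONABLE_STATES, triage and run_cycle are hypothetical, and a real triage rule would likely look at more than the state.

```python
# Hypothetical triage rule: only these states are worth acting on.
ACTIONABLE_STATES = {"DOWN", "CRITICAL"}


def triage(event):
    """Classify an event as 'actionable' or 'noise'."""
    return "actionable" if event["state"] in ACTIONABLE_STATES else "noise"


def run_cycle(events, runbook):
    """Triage each event, suppress noise, and enrich what is left."""
    forwarded = []
    suppressed = 0
    for event in events:
        if triage(event) == "noise":
            suppressed += 1  # suppress: never reaches the NOC
            continue
        enriched = dict(event)  # copy so the raw event is left untouched
        enriched.update(runbook.get(event["host"], {}))
        forwarded.append(enriched)
    return forwarded, suppressed


runbook = {
    "dbserver1": {
        "escalation": "Use the Pagerduty DEVOPS service.",
        "remediation": "Initiate DB_HOST_DOWN recipe sequence",
    },
}
events = [
    {"host": "dbserver1", "state": "DOWN"},
    {"host": "printer7", "state": "WARNING"},
]
forwarded, suppressed = run_cycle(events, runbook)
print(len(forwarded), "forwarded,", suppressed, "suppressed")
```

Counting suppressed events, rather than silently dropping them, gives a simple measure of how much noise the NMS is generating.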

What's your opinion?