IT Event Management Overview
Companies use IT monitoring tools such as Datadog, Grafana, New Relic, AWS CloudWatch, or GCP Stackdriver to monitor IT applications and IT infrastructure. Users of these tools can configure monitors to send an email when a metric crosses its acceptable threshold, for example when disk usage exceeds 80% of the allocated disk space.
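The disk-space example can be illustrated with a minimal Python sketch. It is not how any of the tools named above are configured; the function names `disk_usage_percent` and `check_threshold`, the alert dictionary layout, and the choice of "warning" as the severity are assumptions made only for illustration.

```python
import shutil


def disk_usage_percent(path="/"):
    """Return the used share of the disk holding `path`, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def check_threshold(value, threshold=80.0):
    """Return an alert dict when `value` exceeds `threshold`, otherwise None.

    The dict layout and the "warning" severity are hypothetical.
    """
    if value > threshold:
        return {
            "severity": "warning",
            "message": f"disk usage {value:.1f}% exceeds {threshold:.0f}%",
        }
    return None
```

A real monitor would run a check like this on a schedule and hand any returned alert to an email or paging channel.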
These alerts can be classified as “error”, “warning”, “fatal”, or “recovered” messages. On a day-to-day basis, these tools generate thousands of messages, of which 95% are “warning” messages and only 5%, such as “fatal” messages, need attention. Even among the “fatal” messages, many need no action, because most modern applications and IT infrastructure can recover on their own. When an application or infrastructure component recovers automatically, the monitoring tool sends a follow-up recovery message to indicate that service is back to normal. All of these messages create a lot of noise for on-call support personnel, who have to go through each message and decide when to act and when an issue has become an incident.
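The triage that on-call staff perform by hand can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not a description of any product's logic: the `Event` fields, the `actionable_events` function, and the 300-second grace window are all hypothetical, and the sketch treats only “fatal” messages as candidates for action.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float  # seconds since some reference point (assumed)
    source: str       # name of the monitored component (assumed)
    severity: str     # "warning", "error", "fatal", or "recovered"


def actionable_events(events, grace_seconds=300.0):
    """Return "fatal" events that were not followed by a "recovered"
    event from the same source within `grace_seconds`."""
    ordered = sorted(events, key=lambda e: e.timestamp)
    actionable = []
    for i, event in enumerate(ordered):
        if event.severity != "fatal":
            continue  # warnings and other messages are treated as noise here
        recovered = any(
            later.severity == "recovered"
            and later.source == event.source
            and later.timestamp - event.timestamp <= grace_seconds
            for later in ordered[i + 1:]
        )
        if not recovered:
            actionable.append(event)
    return actionable


events = [
    Event(0, "db", "warning"),
    Event(10, "web", "fatal"),
    Event(40, "web", "recovered"),  # web recovered on its own
    Event(50, "disk", "fatal"),     # no recovery message followed
]
print([e.source for e in actionable_events(events)])  # prints ['disk']
```

Of the four messages, only the unrecovered “fatal” event from `disk` is left for a person to look at. In practice the check would run after the grace window has elapsed, so that a late recovery message is not missed.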
Z-Suite IT Event Management provides an “event intelligent solution” that minimizes the noise and activates support teams only when they are needed, so that their time can be used more productively.