Google’s SRE teams have some basic principles and best practices for building successful monitoring and alerting systems. This chapter offers guidelines for what issues should interrupt a human via a page, and how to deal with issues that aren’t serious enough to trigger a page.

There’s no uniformly shared vocabulary for discussing all topics related to monitoring. Even within Google, usage of the following terms varies, but the most common interpretations are listed here.

Monitoring: Collecting, processing, aggregating, and displaying real-time quantitative data about a system, such as query counts and types, error counts and types, processing times, and server lifetimes.

White-box monitoring: Monitoring based on metrics exposed by the internals of the system, including logs, interfaces like the Java Virtual Machine Profiling Interface, or an HTTP handler that emits internal statistics.

Black-box monitoring: Testing externally visible behavior as a user would see it.

Dashboard: An application (usually web-based) that provides a summary view of a service’s core metrics. A dashboard may have filters, selectors, and so on, but is prebuilt to expose the metrics most important to its users. The dashboard might also display team information such as ticket queue length, a list of high-priority bugs, the current on-call engineer for a given area of responsibility, or recent pushes.

Alert: A notification intended to be read by a human and that is pushed to a system such as a bug or ticket queue, an email alias, or a pager. Respectively, these alerts are classified as tickets, email alerts, and pages.

Root cause: A defect in a software or human system that, if repaired, instills confidence that this event won’t happen again in the same way. A given incident might have multiple root causes: for example, perhaps it was caused by a combination of insufficient process automation, software that crashed on bogus input, and insufficient testing of the script used to generate the configuration.