Site Reliability Engineering (SRE) Practices Online Course
Join our live, instructor-led virtual sessions and master Site Reliability Engineering (SRE) practices from anywhere in the world.
Upcoming Virtual Training Schedules
Join from anywhere in the world with our live instructor-led sessions
| Code | Start Date | End Date | Duration | Fee |
|---|---|---|---|---|
| SRE-05 | | | Mon - Fri (5 Days) | USD 850 |
| SRE-05 | | | Mon - Fri (5 Days) | USD 850 |
| SRE-05 | | | Mon - Fri (5 Days) | USD 850 |
| SRE-05 | | | Mon - Fri (5 Days) | USD 850 |
| SRE-05 | | | Mon - Fri (5 Days) | USD 850 |
| SRE-05 | | | Mon - Fri (5 Days) | USD 850 |
Here's What You'll Learn
Each module tackles real challenges you face in your role
SRE foundations and service targets
Observability with Prometheus and Grafana
Incident response and postmortems
Automation and closed-loop remediation
Capacity planning and load resilience
Reliability governance and reporting
SRE roadmap and executive communication
Market-specific guidance for Viet Nam
A country-aware view of the pressures, proof points, and practical tools that shape how this course applies locally.
Tools and platforms relevant to this field
Field-relevant examples that may be featured in training where they support the confirmed scope. Exact coverage depends on participant needs and delivery format.
- Datadog (Datadog): Used to centralize infrastructure and application monitoring, alerts, dashboards, and incident investigation for production systems.
- Grafana (Grafana Labs): Used to build operational dashboards that help teams track SLIs, service health, and incident trends.
- Prometheus (The Prometheus Authors): Used to collect time-series metrics for alerting, SLO tracking, and service-level observability.
- PagerDuty (PagerDuty): Used to route alerts, manage on-call rotations, and coordinate incident response workflows.
- Jira Service Management (Atlassian): Used to manage incidents, postmortem actions, and operational tasks across engineering and support teams.
Where this course runs
Site Reliability Engineering (SRE) Practices Training is delivered in the cities below — pick the one that fits your schedule.























