Expert, 75 to 100 minutes, Instructor visible

M365 outage incident and RCA

A simulated M365 service issue impacts sign-in, Exchange, and Teams collaboration for a global user population.

Business context

Operations needs triage, communications, change control, and RCA discipline.

Technical objective

Diagnose the outage, create an incident report, define monitoring requirements, and produce RCA actions.

Student instructions

  1. Review service health, simulated alerts, and user-impact reports.
  2. Build a timeline with detection, escalation, mitigation, resolution, and prevention.
  3. Define monitoring, support routing, capacity, and change-control improvements.
  4. Validate RCA completeness.
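
The timeline in step 2 can be sanity-checked with a few lines of code. The sketch below is only an illustration, not part of the lab tooling: it assumes the timeline is kept as a mapping from phase name to timestamp, and the function names and sample timestamps are hypothetical.

```python
from datetime import datetime

# Phases named in step 2, in the order an incident normally passes through them.
PHASES = ["detection", "escalation", "mitigation", "resolution", "prevention"]


def check_timeline(events):
    """Return a list of problems found in a {phase: datetime} timeline."""
    problems = [f"missing phase: {p}" for p in PHASES if p not in events]
    present = [p for p in PHASES if p in events]
    # Compare each recorded phase with the next recorded phase.
    for earlier, later in zip(present, present[1:]):
        if events[later] < events[earlier]:
            problems.append(f"{later} is timestamped before {earlier}")
    return problems


def minutes_between(events, start, end):
    """Elapsed minutes between two recorded phases."""
    return (events[end] - events[start]).total_seconds() / 60


# Hypothetical timestamps, for illustration only.
timeline = {
    "detection": datetime(2024, 1, 1, 9, 0),
    "escalation": datetime(2024, 1, 1, 9, 10),
    "mitigation": datetime(2024, 1, 1, 9, 40),
    "resolution": datetime(2024, 1, 1, 10, 30),
}
print(check_timeline(timeline))  # ['missing phase: prevention']
print(minutes_between(timeline, "detection", "resolution"))  # 90.0
```

A check like this catches an RCA that stops at resolution, which is the gap the Troubleshooting section addresses.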

Troubleshooting

  • If the RCA lacks prevention actions, add an owner, due date, metric, and release-management control for each one.
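
One way to apply this rule consistently is to check every prevention action for the four required fields. The following is a minimal sketch, assuming each action is recorded as a dictionary; the field names and sample values are hypothetical and not defined by the lab.

```python
# Fields the troubleshooting rule asks for on each prevention action.
# The key names are an assumption made for this sketch.
REQUIRED_FIELDS = ("owner", "due_date", "metric", "release_control")


def missing_prevention_fields(action):
    """Return the required fields that are absent or blank in one action."""
    return [f for f in REQUIRED_FIELDS if not str(action.get(f) or "").strip()]


# Hypothetical action with two gaps: a blank due date and no release control.
action = {
    "owner": "Messaging operations",
    "due_date": "",
    "metric": "Time to detect under 10 minutes",
}
print(missing_prevention_fields(action))  # ['due_date', 'release_control']
```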

Cleanup

  • Export the RCA and any operational runbook updates.

Launch flow

Provisioning readiness


Click Launch lab to start the provisioning flow and watch each stage complete.

  1. Request accepted
  2. Capacity reserved
  3. Templates queued
  4. Validation running
  5. Workspace ready
m365-governance-policy-complete

Required templates

  • Microsoft 365 simulated tenant layer - defined

Validation checks

  • M365 governance policy complete: Governance deliverable covers RBAC, least privilege, audit, retention, DLP, release management, exception handling, and review cadence.
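
Before submitting, a rough self-check of topic coverage can be done with a keyword scan. This is only a sketch: how the lab's validator actually evaluates the deliverable is not stated here, and a plain substring match is an assumption chosen for simplicity. The sample text is hypothetical.

```python
# Topics listed in the validation check above.
TOPICS = [
    "RBAC", "least privilege", "audit", "retention",
    "DLP", "release management", "exception handling", "review cadence",
]


def uncovered_topics(text):
    """Return the topics that never appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [t for t in TOPICS if t.lower() not in lowered]


# Hypothetical draft deliverable.
draft = (
    "RBAC roles follow least privilege. Audit logs have a one-year "
    "retention period. DLP policies apply to Exchange and Teams."
)
print(uncovered_topics(draft))
# ['release management', 'exception handling', 'review cadence']
```

A keyword match only shows that a topic is mentioned, not that it is covered adequately, so treat the result as a prompt for review rather than a pass.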

Expected result

Incident report and RCA are complete enough for executive and support handoff.

Reset policy: Outage scenarios can be re-injected.
Teardown policy: Simulated tenant state expires with the course lab TTL.