A survey of 505 software developers, IT professionals and IT decision makers (ITDMs) in the U.S. finds that 70% of respondents work for organizations that hold developers responsible for deployments, with only 22% conducting blameless postmortems.
Conducted by Atlassian, the survey also finds only 60% of respondents work for organizations where developers are part of the team managing IT incidents, with 57% reporting developers are on call when needed. A total of 70%, however, report it is easy to bring in the right teammates when needed.
Most respondents (86%) are also running postmortem or post-incident reviews (PIR) after an incident. A full 97% also report having procedures, processes, or runbooks in place for managing incidents, with 78% using wargames or some other form of incident management training.
Overall, the survey finds IT incidents are nearly all managed by an IT operations team (95%). Besides developers, other teams involved include engineering (53%), executives (43%) and site reliability engineers (37%).
However, the survey also finds only 15% of respondents have cross-functional teams that include developers, operations staff and other IT professionals and only 35% said their organization emphasizes DevOps as a cultural shift that merges development and operations teams. Only 30% highlighted the use of automation and continuous integration/continuous delivery (CI/CD) practices to streamline workflows, reduce manual operations, and ensure faster and more reliable software delivery.
Additionally, only a quarter (25%) associate DevOps with improving the speed and efficiency of software development, and even fewer (20%) are focused on the role DevOps plays in improving software quality and reliability through automated testing, continuous monitoring and real-time performance tracking. Only 15% said security and compliance is a top priority within their DevOps practices.
Nevertheless, 96% said they feel development and operations teams have the visibility needed to do their jobs effectively in a way that minimizes disruption. A full 70% said their organization can access incident history, recent deployments or recent changes for context during incident response, but only 55% have access to live service health information.
More than two-thirds of respondents (68%) work for organizations that proactively manage IT incidents, the survey finds. Nearly all (99%) are using monitoring tools, with 86% relying on them to discover incidents. Nearly three-quarters (73%) said they are using those tools to proactively discover incidents before they are reported by end users or customers.
Capacity planning (80%), artificial intelligence (AI) for incident trending (74%) and user transaction monitoring (73%) are the most widely used incident tools, the survey finds.
Kate Clavet, product marketing manager for IT service management (ITSM) at Atlassian, noted that as artificial intelligence (AI) continues to evolve it will only become easier to incorporate simulations into incident response training.
In terms of metrics tracked, most respondents track mean time to resolve (80%), while 71% track mean time to acknowledge and 55% track mean time to respond. A total of 61% use the tickets created by ITSM platforms as the single source of truth for managing incidents, while 39% rely on some type of chat tool. A full 60% said they are not using a configuration management database (CMDB) to help manage their IT environments. A full 97% said they do, however, have change management practices in place to minimize potential disruptions.
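For readers less familiar with the mean-time metrics mentioned above, the following is a minimal sketch of how two of them might be calculated. The incident records, the field order and the choice to measure both metrics from the moment an incident is created are assumptions made for illustration; they do not come from the survey, and organizations define these metrics in different ways.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (created, acknowledged, resolved).
# These timestamps are invented for illustration only.
INCIDENTS = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 5), datetime(2024, 1, 1, 10, 0)),
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 14, 15), datetime(2024, 1, 2, 16, 0)),
]


def mean_minutes(pairs):
    """Average elapsed minutes across (start, end) timestamp pairs."""
    return mean((end - start).total_seconds() / 60 for start, end in pairs)


def mean_time_to_acknowledge(incidents):
    """Average minutes from incident creation to acknowledgement."""
    return mean_minutes((created, acked) for created, acked, _ in incidents)


def mean_time_to_resolve(incidents):
    """Average minutes from incident creation to resolution (assumed definition)."""
    return mean_minutes((created, resolved) for created, _, resolved in incidents)


if __name__ == "__main__":
    print("Mean time to acknowledge (minutes):", mean_time_to_acknowledge(INCIDENTS))
    print("Mean time to resolve (minutes):", mean_time_to_resolve(INCIDENTS))
```

In practice, the timestamps would typically come from the ticketing or chat tools that respondents describe as their source of truth for incidents.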
Despite some recent notorious outages, incident management appears to have reached an advanced level of maturity. In the absence of tighter integration between DevOps and ITSM teams, however, there is still plenty of room for improvement.