
January 22, 2025

7 mins

RescueOps - Ep. 5: Scalability and Flexibility

From hiking gear to SRE playbooks, scaling requires thoughtful preparation at every level. Learn why robust foundations, adaptable tools, and tested protocols are your best defense—whether facing a blizzard or a system outage.

Written by Claire Leverne

In this series, Claire Leverne, outdoor rescue expert and engineer, shares insights that SREs can draw from rescue operations. Check out the previous parts of the series for more context.

Whether there will be Weather

In the first episode of this series, I introduced the SAR Incident Command System (ICS), which at its core provides structure, communication, and, most importantly, modular organization that allows an operation to scale.

There’s a lot to be said about ICS and how it enables an operation to adapt and scale, but in this post I want to step back and look at an everyday scaling scenario that applies to SAR teams, outdoor professionals, and everyday hikers: weather.

Weather forecasting in remote places is notoriously unreliable, which makes planning a trip outdoors harder. Weather stations and remote sensors are farther apart, and weather patterns become unpredictable in mountain environments where local climates and rain pockets can form.

It’s hard to gauge what kind of weather one will encounter, but it’s reasonably easy to predict the likelihood that some kind of bad weather will occur. As the Whether Man (not the Weather Man) in The Phantom Tollbooth liked to say, “after all, it's more important to know whether there will be weather than what the weather will be!”

Preparation at all Levels

So how do we prepare for events of uncertain origin, duration, or intensity? How do we prepare for the unknown? We need a strategy that is highly flexible and allows us to scale our response quickly to match the magnitude of the event.

For the outdoorsman and the engineer, the scaling solution is layers. Ask any outdoor professional what their layering strategy is, and they will be able to tell you in excruciating detail why they chose the clothes they pack, why those clothes are better than other options, and even where each item is stored in their backpack (with rationale). Anyone who spends a lot of time outdoors puts rigorous thought into these decisions because regulating your body temperature in response to the elements is the single most important survival skill, and the elements are unpredictable.

In New Mexico, part of the state requirement for being certified as a SAR member is a gear check that includes base layers, mid-layers, and thermal and shell layers. This is essential equipment that ensures the rescuer is ready for the unexpected, can continue to be a contributing member of the team, and will not become another patient in need of rescue. Most rescue operations are sent after day-hikers, the demographic most likely to be unprepared for a change in weather or temperature, an injury, or anything else that extends their stay away from resources.

Layers, layers, layers

Scaling isn’t so much about making decisions on the fly; it’s a matter of having assembled a toolbelt of resources and protocols (see Episode 3) ready to deploy when the scenario changes, and exercising good judgement when picking the right tool for the job. Scalability and flexibility in both SAR operations and SRE incident response hinge on the thoughtful layering of strategies and tools. Just as an outdoor professional relies on a well-organized layering system to adapt to unpredictable weather, SRE teams depend on a layered approach to manage and respond to incidents of varying complexity and scale.

The Base Layer

Counter-intuitively, the base layer is the first line of defense against the elements, managing moisture and regulating body temperature. Base layers can look different depending on whether you’re in sub-freezing or hot weather, but they’re meant to keep you dry and comfortable. Their work is low-profile and essential: they regulate the micro-fluctuations in temperature to minimize water loss (dehydration) and prevent other layers from getting damp (risk of hypothermia). Without an effective base layer, even the best mid-layers and shells can’t perform optimally.

For the engineer, the base layer can be equated to automated monitoring and alerting systems. These tools continuously observe the health of your services, collecting metrics, logs, and traces to detect anomalies. Just as a high-quality base layer ensures you stay dry, robust monitoring ensures that issues are identified promptly before they escalate into critical incidents.
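To make the base layer a little more concrete, here is a minimal sketch of one way a monitor might flag an anomaly: compare each new sample against a rolling baseline and raise a flag when it lands too many standard deviations away. The class name, window size, and threshold are all hypothetical choices for illustration, not the behavior of any particular monitoring product.

```python
from collections import deque
from statistics import mean, stdev


class LatencyMonitor:
    """Hypothetical rolling-baseline check for a single metric."""

    def __init__(self, window=30, threshold=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # only the most recent samples are kept
        self.threshold = threshold           # allowed distance, in standard deviations
        self.min_samples = max(min_samples, 2)  # stdev needs at least two points

    def observe(self, value):
        """Record a sample and return True if it looks anomalous."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            baseline = mean(self.samples)
            spread = stdev(self.samples)
            # A perfectly flat baseline has zero spread; skip the check in that case.
            if spread > 0 and abs(value - baseline) / spread > self.threshold:
                anomalous = True
        # The sample joins the window either way, so the baseline keeps adapting.
        self.samples.append(value)
        return anomalous


monitor = LatencyMonitor(window=10)
for latency_ms in [100, 102, 98, 101, 99, 100, 250]:
    if monitor.observe(latency_ms):
        print(f"alert: {latency_ms} ms is far outside the recent baseline")
```

Like a good base layer, a check of this kind does quiet, continuous work: it says nothing while readings stay in the normal range and only speaks up when something drifts.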

The Working Layer

Mid-layers are the workhorse of your outdoor wardrobe. They catch everyday scratches, scuffs, and scrapes; they’re flexible and breathable; they cover a wider range of temperature fluctuations.

Just as hiking mid-layers absorb daily wear and tear, engineering mid-layer tools manage the routine operations and minor incidents that occur regularly. These include agile tools and cloud infrastructure that handle resource allocation, load balancing, and automated scaling, ensuring that your services remain performant and reliable under varying loads. This layer is made for maintaining operational stability.
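As one illustration of automated scaling, the sketch below uses a simple proportional rule: scale the replica count by the ratio of observed to target utilization, round up, and clamp the result between a floor and a ceiling. The function name, parameters, and limits are assumptions made up for this example; real autoscalers typically add smoothing and cooldown periods on top of a rule like this.

```python
import math


def desired_replicas(current, observed_util, target_util,
                     min_replicas=1, max_replicas=10):
    """Hypothetical proportional scaling rule.

    Utilization values can be in any unit (percent here), as long as
    observed_util and target_util use the same one.
    """
    if current < 1 or target_util <= 0:
        raise ValueError("need at least one replica and a positive target")
    # Scale in proportion to how far we are from the target, rounding up
    # so that we never under-provision by a fraction of a replica.
    raw = math.ceil(current * observed_util / target_util)
    # Clamp so a traffic spike (or a quiet night) cannot push us out of bounds.
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90, 60))   # running hot: add replicas
print(desired_replicas(4, 30, 60))   # running cool: shed replicas
```

The clamp plays the same role as the limits of a mid-layer: it covers a wide range of conditions, but past a certain point you need a different layer rather than more of the same.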

Thermals and Shells

Shell layers are for extreme weather and temperature swings: wind, snow, rain, or sun. The shell layers tend to get all the glory; they come in cool colors, they look pro, and when weather rolls in they’re flexing at the one job they’re made to do. But let’s not forget that the layers underneath are what really allow a shell to shine.

This layer is analogous to modular teams trained to operate under high-pressure conditions, coordinate across multiple departments, and implement comprehensive recovery strategies. Other examples could be more tool-focused, such as playbooks with detailed instructions on how to bundle and deploy tools, coordinate communication, and execute recovery procedures. Just as an outer shell ensures that all other layers remain functional, these frameworks ensure that incident response efforts are cohesive, well-organized, and effective in mitigating the impact of severe incidents.
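One way to picture the layering as a whole is a playbook that maps incident severity to the response layers it activates, where each higher severity adds to the layers beneath it instead of replacing them. The severity levels and layer descriptions below are hypothetical, chosen only to mirror the base, mid, and shell analogy.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MODERATE = 2
    SEVERE = 3


# Hypothetical response layers, ordered from always-on to heavy-duty.
RESPONSE_LAYERS = [
    (Severity.LOW, "automated alerting and on-call triage"),
    (Severity.MODERATE, "runbook-driven mitigation by the service team"),
    (Severity.SEVERE, "incident commander, cross-team coordination, status updates"),
]


def layers_for(severity):
    """Return every layer at or below the given severity, inner layers first."""
    return [name for level, name in RESPONSE_LAYERS if level <= severity]


for step in layers_for(Severity.SEVERE):
    print("activate:", step)
```

A severe incident here still relies on alerting and runbooks; the shell goes on over the other layers, not instead of them.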

Conclusion

Whether you're an outdoor enthusiast or an SRE professional, the takeaway is clear: layering your strategies and tools is essential for managing complexity and ensuring readiness for whatever challenges may arise. Invest time in building a solid foundation, equip yourself with the right tools for everyday operations, and have robust plans in place for when the unexpected occurs. In doing so, you'll be well-prepared to handle incidents with the same resilience as a seasoned outdoorsman in a blizzard.

Practical advice for adventurers: never underestimate how wet you can get when the weather turns. Bring good Gore-Tex that you’ve recently tested for weatherproofing (rain protection can decay over time, even sitting in a closet). If you come to a rolling creek and think, “My boots are already soaking, maybe I should just wade,” reconsider. It is, in fact, possible for your boots to get even wetter! The best hiking advice my dad (a lifelong forester) ever gave me: always pack extra socks. When you have to ford a creek, take your boots off and keep your socks on to protect your feet. Change your socks on the other side and enjoy those (subjectively) dry boots.

Make good choices, and remember to pack snacks!
