Practical Chaos Engineering: breaking things on purpose to make them more resilient against failure

With the wide adoption of micro-services and large-scale distributed systems, architectures have grown increasingly complex and hard to understand.

Worse, the software systems running on them have become extremely difficult to debug and test, increasing the risk of outages. These new challenges call for new tools, and because failures have become increasingly chaotic in nature, we must turn to chaos engineering to reveal failures before they become outages. In this talk, I will first introduce chaos engineering and show the audience how to start practicing it on the AWS cloud. I will then walk through the tools and methods they can use to inject failures into their architectures in order to make them more resilient.