Project post by the LitmusChaos maintainers
The Chaos Engineering community is growing quickly, and the LitmusChaos community is grateful for the strong participation and engagement it has seen recently, which helps the Chaos Engineering community prosper and feeds back into the project's development.
Starting in June 2022, we will share monthly updates on the latest happenings around the LitmusChaos project.
About LitmusChaos
LitmusChaos is a dynamic open source chaos engineering platform that enables teams to identify weaknesses and potential outages in infrastructures by inducing chaos engineering tests/experiments in a controlled manner. LitmusChaos is driven by the principles of Cloud-Native innovation and gave rise to the principles of Cloud-Native Chaos Engineering. Chaos engineering verifies the resilience of business services and helps DevOps pipelines proactively build code that is more resilient against software and infrastructure faults.
The LitmusChaos project was started in late 2017 to provide simple chaos jobs in Kubernetes. It became a CNCF sandbox project in 2020 and was promoted to a CNCF incubating project in January 2022. Today, it has maintainers from 5 different organizations across cloud-native vendors, solution providers, and end-users.
The project is used in production by more than 30 organizations, including large end-users like Adidas, FIS, iFood, Cyren, Intuit, Lenskart, Orange, and more, as well as technology organizations like Red Hat and VMware.
Website: https://litmuschaos.io
GitHub: https://github.com/litmuschaos/litmus
LitmusChaos Releases 2.10.0
LitmusChaos version 2.10.0 was released on the 15th of June with new updates to the core components, ChaosCenter, and litmusctl.
The community is excited about the addition of HTTP chaos experiments. The first one to be added is the pod-http-latency experiment.
The pod-http-latency experiment disrupts the HTTP requests of Kubernetes pods by injecting random HTTP response delays on the application replica pods.
- Causes flaky access to the application replicas by injecting HTTP response delays using toxiproxy.
- The application pod should be healthy once chaos is stopped. Service requests should be served despite the chaos.
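To make the idea of response-delay injection concrete, here is a minimal Python sketch. It is not toxiproxy or LitmusChaos code: `with_latency`, `echo`, and all parameter names are hypothetical, and a plain function stands in for the application. It only illustrates the effect described above, where every response is held back by a fixed delay plus optional random jitter while the response itself stays unchanged.

```python
import random
import time
from typing import Callable


def with_latency(
    handler: Callable[[str], str],
    delay_ms: int,
    jitter_ms: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> Callable[[str], str]:
    """Wrap a request handler so each response is delayed.

    Hypothetical illustration only: a real proxy delays traffic on the
    network path, whereas this sketch delays a function call.
    """

    def delayed(request: str) -> str:
        # Add a random extra delay in [0, jitter_ms] on top of the base delay.
        extra_ms = random.randint(0, jitter_ms) if jitter_ms else 0
        sleep((delay_ms + extra_ms) / 1000.0)
        # The response is passed through untouched; only its timing changes.
        return handler(request)

    return delayed


def echo(request: str) -> str:
    """Stand-in for an application endpoint (an assumption for the demo)."""
    return f"200 OK: {request}"


if __name__ == "__main__":
    slow_echo = with_latency(echo, delay_ms=200, jitter_ms=50)
    start = time.monotonic()
    print(slow_echo("GET /health"))
    print(f"served after {time.monotonic() - start:.3f}s")
```

The `sleep` parameter is injectable so the wrapper can be exercised without actually waiting. A healthy service under this kind of fault should still answer every request, only later, which is the expectation stated in the second bullet.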
Further, in the upcoming releases, we are looking forward to the addition of the following http chaos experiments:
- pod-http-reset-peer: Simulates a TCP reset (a “connection reset by peer” error) in the pod, which stops outgoing HTTP requests by closing the connection, and then reverts to the original state after the specified duration.
- pod-http-modify-status-code: Modifies the HTTP response code for an HTTP request.
- pod-http-modify-header: Modifies, adds, or removes HTTP response or request headers.
- pod-http-modify-body: Modifies the complete body of an HTTP response or request.
Other notable additions include the first beta release of the LitmusChaos m-agent, which will help curate chaos for non-Kubernetes targets, a refactored GraphQL server, Envoy proxy support, new litmusctl commands, and much more.
Check out the release notes for deeper details on the release:
Release Notes (2.10.0)
Core Component Updates –
- Adds HTTP chaos experiment for Kubernetes applications using toxiproxy. This will allow you to introduce latency in the target application service and check the application availability.
- Introduces the first beta version of m-agent (machine-agent), which enables running chaos on non-Kubernetes targets. It also includes a new CPU-stress experiment that can run CPU chaos on the target VM(s).
- Adds the missing --stress-image parameter for the pod-io-stress experiment, which enables adding a custom stress image for the experiment when using the Pumba lib.
- Enhances the recovery of the node cordon experiment when the app status check fails during the chaos.
- Fixes the chaos result verdict update for GCP disk loss by label experiment during the different stages of chaos.
- Fixes node level e2e check for every build on a pull request for the litmus-go repository.
- Adds docs for the new HTTP chaos experiment, updates the GCP experiment docs and introduces some more examples in docs for pod network latency experiment with jitter.
- Adds chaos charts for the AWS AZ down experiment to the hub. This will help in getting the manifest when preparing the experiment workflow.
- Fixes the GCP and m-agent e2e pipeline to run the automated tests seamlessly.
ChaosCenter Updates –
- Refactored graphql-server for extracting queries, mutations, and subscriptions to the respective schema files
- Added support for Envoy proxy when using frontend Nginx.
- Added UI enhancement for allowing scrolling to the invitations tab after clicking on the Invitations button.
- Fixed issues with httpProbe and promProbe in tune workflow section due to the addition of httpProbe/inputs: {} when adding multiple probes.
- Added check for invalid schedule type when trying to proceed in workflow construction wizard without selecting schedule type.
- Fixed issue in GitOps when updating the git repository configuration.
- Added the CHAOS_CENTER_UI_ENDPOINT env for specifying the control plane's UI endpoint once, so that all external agents are given the same endpoint (available for cluster and namespace scope).
- Added support for automatically adding imagePullSecrets to Engine, Runner, and Experiment pods from the configured image registry.
Litmusctl Updates –
- Added commands like litmusctl get workflows, litmusctl create workflow, litmusctl describe workflow, litmusctl delete workflow and litmusctl get workflow-runs for workflow CRUD operations.
- Renamed litmusctl create agent command to litmusctl connect agent
- Added a new command disconnect agent for disconnecting agents from the Control plane.
- Enhanced logging for better debugging.
Note:
- To use the newly added commands, users will have to download litmusctl v0.11.0.
- litmusctl v0.11.0 only supports Litmus v2.10.0 or higher versions.
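The compatibility note above amounts to a minimum-version check. Here is a minimal Python sketch, assuming plain `MAJOR.MINOR.PATCH` tags with an optional leading `v` and no pre-release suffixes; the helper names are hypothetical and not part of litmusctl. Comparing parsed integer tuples matters because a plain string comparison would wrongly rank 2.9.0 above 2.10.0.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v2.10.0' or '2.10.0' into (2, 10, 0).

    Assumes purely numeric dot-separated parts (no '-beta' style suffixes).
    """
    return tuple(int(part) for part in tag.lstrip("v").split("."))


# Minimum Litmus version stated in the note for litmusctl v0.11.0.
MIN_LITMUS_FOR_CTL_0_11 = parse_version("2.10.0")


def supported_by_litmusctl_0_11(litmus_version: str) -> bool:
    """Hypothetical helper: True if the Litmus version is 2.10.0 or higher."""
    return parse_version(litmus_version) >= MIN_LITMUS_FOR_CTL_0_11


if __name__ == "__main__":
    for version in ("2.9.0", "2.10.0", "v2.11.1"):
        print(version, supported_by_litmusctl_0_11(version))
```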
New Contributors
- @QAInsights made their first contribution in #3612
Latest from the LitmusChaos Community
Community Adopters –
Adoption is the key to any community’s success in the open source ecosystem. The LitmusChaos community is stoked to have added two formal adopters in Adidas and Cyren in the month of June.
Check out the Adidas story here: https://github.com/litmuschaos/litmus/blob/master/adopters/organizations/adidas.md
“Chaos Engineering is an awesome method to train engineers the cloud-native principles and boost their confidence while responding to production failures.”
–Eran Levy (Engineering Leader, Cyren)
Cyren chose LitmusChaos to run chaos experiments and build confidence in their infrastructure while taking care of real production incidents. Eran Levy, Engineering Leader at Cyren, wrote this insightful blog to share Cyren’s Chaos Engineering journey with LitmusChaos: https://www.infoq.com/articles/chaos-engineering-cloud-native/
Community Content –
There is a lot of new community content for LitmusChaos fans. Check out the latest content curated by the community, for the community:
LitmusChaos was at KubeCon EU 2022. Check out the highlights from LitmusChaos at KubeCon EU ’22: https://litmuschaos.medium.com/litmuschaos-at-kubecon-eu-2022-32dc33166feb
LitmusChaos maintainer Karthik S joined AWS Containers from the Couch to present a getting-started guide for new LitmusChaos users.
Check out this latest episode: https://youtu.be/5CI8d-SKBfc
The LitmusChaos community meetings continue as a monthly call to discuss the latest updates, happenings, and questions from the community. They are hosted every 3rd Wednesday of the month. Check out the recording of our last community meeting, held on June 15th: https://youtu.be/wkaG1Tm5czU
LitmusChaos community member NaveenKumar Namachivayam, aka QAInsights, has created a series of blogs and videos on LitmusChaos that cover a wide range of ideas and use cases. Check them out here:
Blogs:
- Chaos Engineering with LitmusChaos on AWS – Cross-Account Implementation
- Chaos Engineering with LitmusChaos on AWS EKS using IRSA
Videos:
- Learn Chaos Engineering Series – E2 – LitmusChaos Demo on AWS EKS: https://youtu.be/_EFnRlme3hQ
- Learn Chaos Engineering Series – E3 – Installing LitmusChaos on AWS EKS: https://youtu.be/TykfTbC_W1E
- Learn Chaos Engineering Series – E4 – LitmusChaos on AWS EKS using IRSA: https://youtu.be/w4ChJCIejvE
- Learn Chaos Engineering Series – E5 – Running LitmusChaos Experiments on two AWS Accounts: https://youtu.be/dNevrEZEimg
- Learn Chaos Engineering Series – E6 – Resiliency Score in LitmusChaos: https://youtu.be/YiJqaF-nh-Y
- Learn Chaos Engineering Series – E7 – GitOps and Event-Triggered Chaos Injection: https://youtu.be/LSJUwCKVg8g
- Learn Chaos Engineering Series – E8 – Probes: https://youtu.be/_aXrhJwM1YA
- Learn Chaos Engineering Series – E9 – Installing LitmusChaos in DigitalOcean: https://youtu.be/zV1jCOU8_zA
In the end…
The LitmusChaos community continues to grow thanks to contributions (issues, suggestions, PRs) from its members, and looks forward to more people joining in and contributing to the growth of the project.
Join the #litmus channel on the Kubernetes Slack to become a part of the community, where you can learn, ask, and contribute.
Check out the Contributing Guide to get started with contributions.
Subscribe to the LitmusChaos YouTube Channel for the latest videos.
Follow @LitmusChaos on Twitter for the latest social updates.
Check out the LitmusChaos blogs to learn more about LitmusChaos. You can write one too by using the tag #litmuschaos on DEV.to.