Guest post originally published on Chronosphere’s blog by Amanda Mitchell, Senior Content Marketing Manager at Chronosphere
Hello sunny LA and happy KubeCon + CloudNativeCon North America 2021-eve! Team Chronosphere has set up shop on the KubeCon show floor and we are thrilled to be here – live and in-person!! – after nearly two years of 100 percent virtual gatherings.
With an anticipated 15 thousand folks attending KubeCon online, and a few thousand physically here, most attendees won’t get a chance to experience the event vibe for themselves. Fear not: We’ll be live-blogging from the show floor. Follow along as we recap some of the daily highlights and bring some of the in-person feel to your home office.
We’ll listen in and report back on highlights from the back-to-back keynotes and sessions, and we’ll be on the lookout for key trends.
What’s happened so far
PromCon North America 2021 – Monday
Here are highlights from two PromCon sessions. We have some live shots to entertain you for now, and we’ll also share the video replays when those go live.
In the first session, Chronosphere co-founder and CTO, Rob Skillington, gave a keynote on Aggregating, Alerting and Graphing on Millions of Prometheus Timeseries.
At the top of the talk, Rob explains to the audience that the title of his opening slide, “Millions of timeseries? Why?”, seems far-fetched, but it’s actually not. He goes on to paint a picture of how you easily get to millions of timeseries while monitoring just a few hundred pods in a cloud-native world.
He lays out an example of how quickly timeseries escalate:
- 50 microservices
- An average of 200 pods per service
- Each service has, on average, 20 HTTP endpoints and gRPC methods
- 5 common status codes
- 30 histogram buckets
- Boom: You quickly get to 30 million unique time series
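The arithmetic behind that figure is a plain product of the dimensions listed above. A minimal sketch, assuming every combination of the dimensions actually occurs (the dictionary keys are labels chosen here for illustration, not names from the talk):

```python
from math import prod

# Dimensions taken from the example in the talk recap above.
# Assumption: every combination of these dimensions produces a series.
dimensions = {
    "microservices": 50,
    "pods_per_service": 200,
    "endpoints_per_service": 20,
    "status_codes": 5,
    "histogram_buckets": 30,
}

# Each dimension multiplies the number of unique label combinations.
total_series = prod(dimensions.values())
print(f"{total_series:,} unique time series")  # 30,000,000 unique time series
```

Because the dimensions multiply, doubling any single one (for example, the number of histogram buckets) doubles the total.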
The result is that dashboards and queries can be very slow to load; he spends the rest of the session discussing the tactics that can help mitigate this.
Where do we go from here:
- Rob walks through several approaches to solving this using both Recording Rules and M3 Aggregation, and outlines the challenges with each.
- In closing, Rob tees up his next PromCon session with Gibbs Cullen – a demo explaining how you can use streaming aggregation with M3 to query metrics, at almost any scale, without increasing the load of heavy recording rules.
In another PromCon session on Monday, Chronosphere Developer Advocate Gibbs Cullen, with a special appearance from Rob Skillington, led Streaming Recording Rules for Prometheus, Thanos, and Cortex Using the M3 Coordinator (slides are available here).
In their talk, Rob and Gibbs cover:
- Aggregation using Prometheus recording rules.
- Streaming aggregation using Prometheus in the M3 coordinator.
- A demo by Rob showing how M3 does the streaming aggregation.
- Questions and answers
KubeCon Day One – Wednesday
Many virtual attendees beamed in from their home offices for the first day of KubeCon, BUT plenty of us watched Wednesday morning’s keynotes live and in-person. It was so nice to be safely gathering and mingling again.
Here are a few highlights from the KubeCon Wednesday keynotes:
- In the past six months, the CNCF has seen incredible growth, says CNCF General Manager, Priyanka Sharma. The types of companies and individuals joining the foundation are changing as well.
- CNCF Ambassador and tech comic creator, Kaslin Fields, laid out why you need multiple Kubernetes clusters:
- Run in multiple geographies for HA/DR
- Easier cost management
- Isolate multi-tenant workloads
OpenTelemetry updates:
- ✅ Tracing is GA
- ✅ Metrics in beta (Prometheus remote-write compliant)
- ✅ Logs in beta
On inclusivity:
We need to keep pushing forward with our efforts to be more diverse, equal, and inclusive because, as keynote speaker Tim Pepper shared: Diversity is being invited to the party; inclusion is being asked to dance. Read Tim’s blog connecting KubeCon and Indigenous People’s Day, A Native Welcome to KubeCon, for a full understanding of his talk.
Tim also introduced two guests to the stage to share the phrase Huutokre, used by the Tongva (original people of Los Angeles) to say “I see you.”
theCube is back in-person
Chronosphere co-founder and CEO, Martin Mao, and key investor, Jerry Chen from Greylock Partners, sat down with theCube on Wednesday to break down what separates Chronosphere from the rest of the observability market and to discuss our new $200 million Series C funding and new distributed tracing capabilities. Some highlights from the conversation:
- What makes Chronosphere a true observability platform.
- The significance of being built for cloud-native environments.
- Martin and co-founder Rob Skillington (creators of Uber’s open source observability platform, M3) already solved the observability problem at Uber so Chronosphere is already ahead of the cloud-native monitoring trend.
- Why Chronosphere’s ability to scale is unique.
- How the cloud-native adoption wave is just beginning.
Highlights are never enough – watch the quick 10-minute video here.
Panels at KubeCon
Cloud Native and Kubernetes Observability Panel: The State of Union
This panel was moderated by Rags Srinivas from Datastax/InfoQ and included some top thinkers in observability discussing topics such as:
- What does forensics mean in the context of observability? How do you prepare for the unknown-unknowns?
- Is Kubernetes adoption helping the OpenTelemetry efforts? Or making it more complex?
- How does observability technology fit into the culture?
- How can OpenTelemetry address edge cases and different development platforms?
- What do you say to folks who think that more observability data is better? Is there a point where there is too much data?
- What’s the most difficult unsolved problem in observability that needs to be solved in the next 5 years?
The panelists had insightful answers such as:
- Actionable observability: Observability insights need to lead to YOU doing something that makes something better. Ask yourself whether you really need advanced forensics. Do you need to know everything, or can you define a set of patterns/methodologies that every component follows (e.g., RED metrics)?
- Instrument first, ask questions later: Data doesn’t come from nowhere; someone instrumented it. But once it’s there, you can pull up data that you didn’t anticipate being useful in that situation.
- Kubernetes is both a help and hindrance: Kubernetes makes complex systems easier to create, which puts more pressure on observability. The adoption of both service mesh and sidecars alongside Kubernetes has also accelerated distributed tracing adoption.
- Barriers to observability adoption: The biggest challenge that observability needs to address in the next five years is cultural. Many companies don’t do ANY monitoring yet (although they may think that they do).
Panel: Women in Tech – how to gain influence
At Wednesday’s EmpowerUs panel, there was a lively discussion about some of the many challenges women face in the tech workplace, where they represent but a small percentage. According to The New Stack, the percentage of women in tech jobs hovers in the mid-20s to low-30s, and women make up less than 5% of professional developers.
The New Stack Features Editor, Heather Joslyn, moderated the session, throwing out some great conversation-provoking questions to the *panelists, with a goal of figuring out how women can gain influence in tech.
- What have you found to be the biggest obstacles to exerting influence in the organizations you’ve worked in?
- Have you ever experienced “imposter syndrome” and what advice would you give about avoiding that feeling?
- How do you find support amongst other women, and among allies in an organization?
- How many of you feel you’ve encountered unconscious bias in the workplace? If so, how do you decide when to call that out, and what do you do in such situations?
- How do you balance saying how you feel with remaining professional?
Some of the advice from panelists included:
- Foster allies among your colleagues.
- Know what strengths you bring to a conversation (remove self-doubt).
- Reprioritize the conversation when tasked with actions outside your purview (e.g., a senior manager being asked to order lunch for their boss).
- Attend meetups and other venues where you can share strategies and expand your universe of allies.
- If you’re new to a space, especially tech, embrace it rather than feel “less than” – we’re all learning all the time.
- Know you can speak up when made to feel uncomfortable, but also know it’s not your responsibility to educate people who have made “othering” remarks.
(*Panelists included: Chronosphere Engineering Manager, Elenore Bastian; Google Research Analyst, Sophia Vargas; The New Stack Digital Marketing Manager, Colleen Coll; and Arize AI Product Marketing Manager, Krystal Kirkland.)
We will share the replay link when it becomes available.
KubeCon Day Two – Thursday
Highlights from the Thursday keynote
It was nice to gather again … in person, even if it was a smaller group compared with pre-Covid times. Who knew such a once-unnoticeable concept could be so satisfying?
On keeping the CNCF fresh and relevant, KubeCon 2020 co-chair and keynote speaker, Constance Caramanolis, spoke some tough, but constructive, truths about CNCF today:
- There are a lot of projects (over 130!) in the CNCF today.
- Projects are often hard to use, put a lot of burden on the user, and require you to be a deep expert.
- When you adopt one project, you don’t get a lot of freebies when adopting another.
- It’s time to reframe how we speak about adopting new CNCF projects: Ask the more informative question, “What is the biggest issue I’m facing now?” rather than simply, “Should I adopt project-X?”
Another guest speaker, Kasten Co-founder and CTO, Vaibhav Kamra, continued in a similar vein, saying that the very fact that the KubeCon event is seeing such growth (as also noted during Wednesday’s keynote) means that the majority of people in the cloud-native community are new to it.
The solution, he says, is to get involved!
SIG updates!
- Security: hardening guide, pod security admission
- Release: new release managers, software supply chain security (SLSA, that’s a new acronym for me!)
- Storage: CSI Windows now GA w/ 1.22
- API expression: Server-side Apply now GA w/ 1.22
- Naming: Dissolved! Replaced by the Inclusive Naming Initiative (side note: inclusive naming is a topic near and dear to our hearts. If this is of interest to you too, check out Chronosphere’s Chris Ward’s talk at All Things Open)
Sessions – Day Two
Session: Stream vs. Batch Aggregation for Metrics
Chronosphere Developer Advocate, Gibbs Cullen, held a talk on “Stream vs. Batch: Leveraging M3 and Thanos for Real-Time Aggregation” during which she laid out the pros and cons of each. Around 45 people attended – sizable by 2021 standards.
The session was so good, one attendee stopped by the booth to say it was “the best talk on observability I’ve seen.” Great job, Gibbs! Definitely worth checking out. In the meantime, here are some highlights:
Gibbs sums up the “stream vs. batch” debate through the lens of two well-known Prometheus remote storage solutions, M3 and Thanos. Along with an overview of stream and batch aggregation, she shared how:
- M3 performs in-memory aggregation via roll-up rules using the M3 Coordinator and M3 Aggregator.
- Thanos performs batch aggregation via Prometheus recording rules using the Thanos Query and Thanos Ruler.
- Both solutions use aggregation to improve performance of querying high cardinality metrics at scale.
- Pros and cons of each approach.
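To make the stream-versus-batch distinction concrete, here is a small conceptual sketch. It does not use M3 or Thanos APIs; the sample data, `batch_aggregate`, and `StreamingAggregator` are all hypothetical names invented for illustration. Both paths sum per-pod values into one per-service series: the batch path re-reads stored raw samples, while the streaming path updates an in-memory sum as each sample arrives.

```python
from collections import defaultdict

# Hypothetical samples: (timestamp_seconds, service, pod, value).
samples = [
    (0, "checkout", "pod-a", 3.0),
    (0, "checkout", "pod-b", 5.0),
    (0, "cart", "pod-c", 2.0),
    (10, "checkout", "pod-a", 4.0),
    (10, "checkout", "pod-b", 1.0),
    (10, "cart", "pod-c", 7.0),
]


def batch_aggregate(stored_samples):
    """Batch style: raw samples are stored first, then a periodic rule
    re-reads them and computes the per-service sums."""
    rolled_up = defaultdict(float)
    for ts, service, _pod, value in stored_samples:
        rolled_up[(ts, service)] += value
    return dict(rolled_up)


class StreamingAggregator:
    """Streaming style: each sample updates an in-memory running sum on
    arrival, so the rolled-up series is ready without re-reading raw data."""

    def __init__(self):
        self.rolled_up = defaultdict(float)

    def ingest(self, ts, service, _pod, value):
        # The pod label is dropped; only (timestamp, service) is kept.
        self.rolled_up[(ts, service)] += value


raw_store = list(samples)  # the batch path keeps every raw sample
batch_result = batch_aggregate(raw_store)

streaming = StreamingAggregator()
for sample in samples:  # the streaming path aggregates on ingest
    streaming.ingest(*sample)
stream_result = dict(streaming.rolled_up)

print(batch_result == stream_result)  # True
print(len(samples), "raw samples ->", len(stream_result), "aggregated points")
```

Both paths arrive at the same rolled-up values; the difference lies in when the work happens and whether the raw samples have to be queried again to produce it.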
KubeCon Day Three – Friday
Session: Kubernetes Operations with Temporal and M3
Last but not least, on the final day of KubeCon, in one of the final time slots (3:25pm PT), Chronosphere’s Matt Schallert and Temporal’s Dominik Tornow held a talk on Declarative and Imperative Kubernetes Operations with Temporal and M3. During the talk, the pair walked through everything from the Kubernetes basics to showing how to craft scalable and reliable operation automation for your Kubernetes applications and clusters.
You can watch a replay of this session in your MeetingPlay app now, but here are some highlights in the meantime.
Dominik’s part of the talk:
- Answered the question, “What is Kubernetes?” Conclusion: An operation automation platform.
- Defined controllers and resources.
- Explained how Temporal – an open source platform for reliable workflow execution – guarantees your workflow execution cannot fail.
- Handed off to Matt to show Temporal in action.
Matt’s part of the session covered how Chronosphere uses Temporal, explaining:
- How Chronosphere’s Kubernetes Operator makes it easier to manage M3DB, the open source time series database that Chronosphere relies upon.
- On detection and mitigation: Rather than limiting yourself to remaining in a declarative model, by combining operators with Temporal workflows you get the best of both worlds:
- Your detection loops are still declarative, and the actual mitigation steps can be performed imperatively as a workflow.
- Said Matt, “This is what we’ve done with our Operator, and we’ve been super happy with it.”
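The split between declarative detection and imperative mitigation can be sketched in a few lines. This is a conceptual illustration only: it does not use the Temporal SDK or Chronosphere’s Operator, and `Cluster`, `detect`, `mitigation_workflow`, and the step names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for a database cluster managed by an operator."""
    desired_replicas: int
    healthy_replicas: int
    log: list = field(default_factory=list)


def detect(cluster):
    """Declarative part: compare desired state with observed state."""
    return cluster.desired_replicas - cluster.healthy_replicas


def mitigation_workflow(cluster, missing):
    """Imperative part: ordered steps, run one after another.
    A workflow engine would persist progress between steps; this sketch
    only records the steps in a list."""
    for _ in range(missing):
        cluster.log.append("provision replacement node")
        cluster.log.append("wait for data to bootstrap")
        cluster.log.append("mark node healthy")
        cluster.healthy_replicas += 1


def reconcile(cluster):
    """One pass of the loop: detect drift, then hand off to the workflow."""
    missing = detect(cluster)
    if missing > 0:
        mitigation_workflow(cluster, missing)
    return detect(cluster) == 0


cluster = Cluster(desired_replicas=3, healthy_replicas=1)
converged = reconcile(cluster)
print(converged, cluster.healthy_replicas, len(cluster.log))  # True 3 6
```

The detection function only states what should be true; the ordered, multi-step repair lives in the workflow, where sequencing and retries are easier to express.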
Check out our final post from KubeCon 2021, which sums up (with photos) six key takeaways from the in-person event.