Originally published on Medium by Khyati Soneji
My internship experience with Thanos, a project aimed at unlocking highly available and long-term Prometheus storage capabilities.
This article covers my whole journey as an intern working on the Thanos project with help from my mentors Bartłomiej Płotka and Giedrius Statkevičius.
I am Khyati Soneji, currently pursuing my bachelor’s degree in Computer Engineering from Pandit Deendayal Petroleum University, Gandhinagar.
As part of the CommunityBridge platform, CNCF is encouraging Open source contributors to get paid while working on various cloud-native projects like Kubernetes, Prometheus, Thanos, Fluentd, etc. The application process is explained in detail here.
In this blog, I would like to share the following:
a. How I got to know about CNCF and got selected as an Intern
b. How I got started with learning more about the Thanos project
c. Sync up with my mentors
d. Major Learnings from the project
Getting into the program
I got to know about CNCF from my brother, who is also a software developer. He has a Prometheus sticker on his laptop (the only sticker on it). I asked him about Prometheus, and he introduced me to the world of cloud-native technologies and observability.
My initial plan was to apply for a CNCF organisation during the Google Summer of Code. CNCF maintains all their project ideas in a git repository. Any new ideas for upcoming internship programs are added here.
I had subscribed to changes for this repository and that’s how I got to know that they were offering an internship program from December to March.
The projects are listed on the CommunityBridge platform. To apply, you answer questions about your profile, where you can list your experience working on software projects. I applied for the Thanos, Prometheus, Cortex and Fluentd projects. My motivation was to work on monitoring projects. I was not inclined towards a specific one and would have been happy to work on any of them.
After applying, I got an email from maintainers of the Thanos project asking the following questions:
a. Why did I apply for Thanos project specifically?
b. Any similar projects that I have worked on in the past.
c. The Thanos project I would like to work on from the available ideas.
d. How many hours I would be able to dedicate to the project
I would strongly encourage every applicant to answer these as honestly as possible. Your answers set the mentors' expectations about your knowledge, the time you can dedicate, and the progress you are likely to make.
In my answers, I mentioned that I was excited to work on cloud-native monitoring solutions like Prometheus and Thanos. I wanted to learn more about distributed systems, and the project would be an excellent opportunity for me to learn as well as contribute to open source.
The project that I wanted to work on was Improve Read Write Coordination for Object Storage. It is about defining a consistent way for readers and writers to access object storage, since object storage can be eventually consistent.
More information regarding the project can be found here.
Shortly after this, I received a confirmation email from the project mentors that I was selected as an intern and they added detailed information on how to get started with the project.
Learning more about the Project
Since I had close to zero prior experience with the Go programming language, I first decided to learn more about Go and then focus on the different components in the Thanos project.
Here is my list of online tutorials I found to be really helpful:
https://www.youtube.com/watch?v=YS4e4q9oBaU
https://play.golang.org/
https://gobyexample.com/
I decided to document my understanding of the project so that I could refer to it later. This helped me a lot whenever I was unsure where to put a particular function, what conventions to follow, and so on.
I will publish a document about my notes on Thanos soon.
To get more hands-on with the codebase, I was assigned one of the beginner level issues which would give me a chance to understand some basic implementation details. While submitting my first PR, I was really happy that I was able to understand and contribute to the project.
Sync up with my mentors
It was a really great learning experience, getting to know more about the project details from my mentors Bartłomiej Płotka and Giedrius Statkevičius.
We decided to have a sync up call every Thursday to check whether I was facing any blockers and to catch up in general. I was also encouraged to organise extra sync up sessions whenever I was facing an issue and needed their help.
We prepared an agenda before each meeting, and every point was discussed at length, including the different approaches available and why a particular approach would be the better one to take.
Apart from the immense technical learnings, I am truly grateful for their feedback on how to improve as an open-source contributor in general.
I would recommend every open-source developer to go through Bartek’s blog about becoming a better OSS maintainer.
Both of my mentors were very approachable and they made sure to dedicate some time in helping me with the project. They were very prompt in providing detailed feedback points for me to improve on and open to suggestions from my side. Their support was the driving factor for me to gain more confidence in working on the project. I greatly appreciate their contributions and help with the project.
Major Learnings from the Project
During this project, I had a lot of important learnings and takeaways that will certainly help me become a better open-source contributor.
First of all, I realised the importance of documentation, since keeping notes helped me revisit concepts when required. We kept notes on my understanding of Thanos components, the meeting agendas and the discussion points. My mentors also made sure that all technical conversations and doubts related to the project were raised on the #thanos-dev channel on CNCF Slack, so that other community members could contribute to the discussions and we had one place to find all decisions made about the project.
I learned a lot of Go mechanisms like channels, goroutines, error handling, slices, interfaces, unit testing, table tests, benchmarking, etc.
I also learned a lot about the practical challenges around distributed systems. I had a course on distributed systems this semester, and having the chance to experience issues like eventual consistency, fault tolerance and partitioning in practice was really helpful. I have also seen the challenges of working on a huge project like Thanos.
It’s true that people learn best from mistakes. We had our mistakes as well. I had taken up a task to schedule blocks for deletion rather than deleting them straight away. This PR started small, with a few modifications and some changes requested.
The problem was that the changes kept growing slowly, and in the end we were staring at a PR with 27 changed files, 383 discussions on GitHub, over 30 days of work and over 400 messages about it on the #thanos-dev channel. This PR was also in active development around the time I had my mid-semester exams, and I couldn’t work on it for an entire week.
I found it really challenging to address the feedback points on the PR. I was also feeling bad that my mentors would have a very difficult time reviewing the PR as well.
There even came a time when my mentor Bartek was not able to provide feedback points on the PR.
The PR taught me several good lessons:
- Keep changes in one PR as specific as possible. If the changes are big, break the PR into multiple smaller PRs.
- Add unit tests for various scenarios. This helps in catching possible bugs before going for a review.
- Address the concerns in each review before asking for a fresh PR review. Even if the point made is already addressed or has to be handled in a different way, mention that in the review point before resolving the discussion.
- Check the conventions followed in the codebase. If you are unsure of what conventions to follow or which approach to take, it is better to first look at how the codebase already does it and check whether the same can be applied in your case.
Final Thoughts
It was a great learning and fun experience working on the Thanos project. This project gave me a lot of knowledge of distributed systems, Go, and cloud-native monitoring solutions.
I would like to thank my mentors Bartłomiej Płotka and Giedrius Statkevičius for being very helpful, patient and open to feedback.
The project would not have been completed without their input, and I wholeheartedly appreciate the effort they put into helping me.