October 17th, 2019
A Full-Time ML Role, 1 Million Blog Views, 10k Podcast Downloads: A Community Taught ML Engineer
By: Sanyam Bhutani
Content originally posted in HackerNoon and Towards Data Science
15th of October, 2019 marks a special milestone, actually quite a few milestones. So I considered sharing it in the form of a blog post, on a publication that has been home to all of my posts 🙂
The online community has been too kind to me, and these blog posts have been my way of celebrating achievements, big or small, with the amazing people that I got to know via Slack groups (KaggleNoobs, ODS.AI, TWiMLAI, Data Science Network), the fast.ai forums or even Twitter. They have also been where I share my failures, unfiltered, although with a hand scratching the back of my head. The community was nevertheless still too kind to me, even when I had failed one of my biggest dream interviews.
So, it’d be rude if I didn’t share that I finally got an amazing full-time opportunity at one of the companies that are the leaders in the “AI” space. I’m sure the company needs no introduction, so I’ll skip the praise here.
As I write this post, I’m about to start a full-time role as a Machine Learning Engineer and AI Content creator at H2O.ai. Even though a full-time role is something that I always envisioned working towards, it’s still hard for me to wrap my head around the fact. Also, given a few other milestones that my blog posts, as well as the Chai Time Podcast, have achieved, I’m really excited and extremely grateful to the ML Community online, which has always been very warm and welcoming to me.
I think this post will also allow me to do a quick recap of my journey. As you might know, I’ve always documented my failures and successes openly via this blog, with the aim of leaving a record so that anyone in a similar situation can learn from these posts and not repeat my mistakes.
The journey comes before the milestones 🙂
In my previous post marking a few milestones, I had shared my rejection from the Google AI Residency, but even before that my journey really started when I went to the internet to find out more about interesting ideas and “latest” topics to study in college. I figured either I could complain about the University syllabus being too dated and not the most in sync with industry, or I could go out and figure out what the Industry actually is, what is “Machine Learning”, what are “models”. And you can imagine the list.
I took to online courses and did quite a wild number of them while still enrolled in college. Then, on 29th September, I got this email from halfway around the globe, and I remember reading it at 4 AM. (Yes, I wake up stupid early and then the first thing I do is read email sometimes.)
This has been my dearest “Certificate”. Fun fact from the “Chai Time Data Science” studio:
This is the only framed printout that exists, although I’m not sure if it’d be visible in the RGB fun there.
This led me to another discovery cycle because Jeremy is the best professor that I had ever come across. I ended up spending a huge number of hours just “watching” the course. In retrospect, I should have spent those hours on coding and “training lots of models” (Jeremy’s top advice on the amazing AI Podcast by Lex Fridman).
At the same time, I got the courage to visit kaggle dot com (I wanted to emphasize the dot, sounds better, doesn’t it?) once or twice, and I was convinced that I should invest in a “DL Laptop” (since I was in college, a box wasn’t possible). This in itself led to a long thread of discussions on the forums, and I finally got started with Kaggle in late 2018. Also, I needed quite a sum to get the laptop, so I took again to the internet, since I still had another year of college to go and I couldn’t do a “job-job” while being in college. That’s how my “Freelance”/Contractual work started.
Then during the early new year, I got an email from Google that absolutely blew my mind. I made it to the final rounds of the AI Residency interviews and then got rejected. By the time the letter had come through, I had ticked off a few of my “2019 resolutions”.
- Becoming a Kaggle Expert
- Creating a good Kaggle Kernel
- Writing 1 blog post per week 🙂
- Implementing a few papers.
So I decided to spend some time in the community, with no end goals in mind. I really wanted to give back to the community and since I had graduated from college, I had the time to talk to many amazing people and offer help via slack, by hosting study groups and even workshops.
First “Remote Job”, Community Activities
By the end of April, I had an amazing job interview experience.
One of the amazing people that I knew via their community work, Aakash N S, had posted about a role in their company. So I pinged them and had the coolest interview experience: a 45-minute video call that was really a friendly discussion about the responsibilities and my long-term goals. The next call was the onboarding one.
Wait, let that sink in for a second.
That’s what I mean when I mention that the fast.ai community is really amazing. A company founded by two fast.ai fellows (Aakash N S and Siddhant Ujjain) meets another fast.ai fellow in another city, and after a small discussion, they invite him to work on the same team. HOW AWESOME IS THAT!
The company is called Jovian.ml and builds on top of a few fast.ai philosophies, so it was really a natural fit, but beyond that the founding team’s goals around community efforts and Kaggle really resonated with me and the complete board. So the team let me work remotely and part-time so that I could also split time between Kaggle and community work.
One of my proudest moments has been contributing to one of India’s largest communities, Data Science Network, and organising one of India’s largest KaggleDaysMeetup events to date.
One of the most amazing aspects of fast.ai is the emphasis on blogging as a portfolio or project building exercise. The complete community and even Jeremy Howard himself really emphasize blogging and that’s how I started blogging.
At one point, I asked a few friends of mine, Dominic Monn and Tuatini GODARD, for an interview about their journeys, since both of them had been really helpful in answering my stupid questions and their journeys were very inspiring to me. These interviews were very well received by the community, so next I landed at an intersection:
Either I could continue experimenting with technical articles, explaining concepts such as a CNN, an LSTM, Transformer, etc. or really focus on sharing these interviews because I felt this was really missing in the community. So I chose to leave creating tutorials and posts such as “PyTorch basics in 4 minutes” to people smarter and wiser than me and reached out to a huge number of “Machine Learning Heroes” that I really admired.
It was scary, since I was moving slightly away from sharing engineering and code walkthroughs, something that’d be expected of me for the Data Science and ML roles that I was aspiring to, but I still stuck to the interview series.
Now that I had graduated and found more bandwidth, I decided to re-start the series but this time in a podcast: Video, Audio, and Blog format.
I’d like to clarify one point: I’ve never monetized these posts, nor do I ever intend to, and I never expected the Chai Time Data Science Podcast to be a “business”. It was really a new medium for me to act as a medium (Too meta, isn’t it?), sharing these stories across 3 different formats to give the community every option to consume these amazing stories of “My machine learning heroes”.
Chai Time Data Science Podcast
Again with the podcast, I was at an interesting crossroads, where I chose to release both the video and audio of the interviews and to handle the interviews, the editing (the most effort-demanding task) and the releases.
Since I was really enjoying these interviews, I chose to make time by terminating a few freelance contracts/agreements and decided for a few months to give up 90% of my income and focus on making these amazing journeys more reachable. I’m really grateful to the Jovian team that let me continue the podcast while also working on “Jovian”. In fact, Aakash was also kind enough to set up a Zoom account via which the interviews happened. Without that, quite literally, there wouldn’t have been the first set of interview calls or releases.
I really enjoyed the process and being able to talk to my ML Heroes for an hour or sometimes two. The podcast calls were really one of the best activities that I got to do, even though this made me trade off time for Kaggle competitions and practising more coding, which I still somehow managed to squeeze in for a few hours every day. I think I really stuck to Tim Dettmers’ advice for fresh graduates: “You have time. Take it easy, enjoy the process, Enjoy learning. There will be a stage in time when your tasks become repetitive so really take it slow and understand your passion”. I’m really happy that I did that 🙂
I really want to share three milestones, a few points that I never really expected to happen:
Hitting 1 Million Views across my blogposts
Yes, 1 Million! Imagine, starting an activity that an online course suggested, an online community taught you and then being able to share it with that NUMBER of people!
I’m really grateful to Hackernoon where all of these interviews and all of my posts were warmly accepted and to David (CEO of Hackernoon) who helped me a lot with feedback during the initial days of my experimental posts.
This has been my guideline for every single post.
I’m also really thankful to everyone that read my posts and especially to the “Machine Learning Heroes” who kindly shared their journey with a Noob just starting his journey.
10k Podcast downloads
Chai Time Data Science hit 10k downloads after 1.5 months of going live! I’m neither a podcaster nor a good stats person, but that is a huge number for me, and I’m really happy that the podcast as a medium reached a good-sized audience and shared these great stories and advice. And I hope to continue maintaining the crazy release schedule that I had come up with.
Another crazy fact, Chai Time Data Science, at the time of writing, has been streamed across 80 countries. WHAT!?
I really love the internet (or just the ML Community on it 🙂 )
I saved the best part of my journey for the last section. 🙂
Of all the stats that I mentioned above, this is probably the one that I’m proudest of, maybe even more than my Kaggle Ranks (Maybe). (Although I’d like to point out that all of my Kaggle “medals” or “ranks” have been thanks to being in a team where I got the chance to meet and work with amazing kagglers, so it wouldn’t have been possible at all for me to achieve them without the amazing people that I got the chance to team up with. I’m not a smart kaggler and I’ll not pretend to be, but I definitely believe that Kaggle is a great learning platform and the true home of data science, and I don’t think I can emphasize this enough.)
Dear fast.ai Team: Jeremy Howard, Rachel Thomas and Sylvain Gugger, and also the fast.ai family (it’s an online community that I like to call a family): one of the aspects of being self-taught or community taught is that my education would have been only as good as the community’s resources and advice.
In my previous post, I unfortunately wasn’t in a position to really thank the fast.ai team. But having finally got a dream job that I couldn’t even dream of (Too meta, again?), I think I’m finally in a position to give all the credit for my achievements to the fast.ai course, the library, and the community.
I learned many things: competing on Kaggle, blogging, software engineering and deep learning, even little things such as how to read a paper, or how to sift through a math equation and not get lost in Greek but figure out the code. Anything of these that I’m able to do, to an extent or maybe more, is all thanks to the exposure I got via the fast.ai experience. I’d confess that I may not have been the best student: I’m yet to complete the Swift for TF lessons, even though I’m already working on a few blog posts on Swift, and I’m yet to watch the V2 walkthroughs as well. So maybe I’m not the sincerest student in the fast.ai global classroom. Yet, I’m really, really grateful to Jeremy Howard, Rachel Thomas, Sylvain Gugger and all of the amazing people I met via the forums.
I’d also point out that things such as Kaggle are softly highlighted rather than heavily emphasized in the course, so it’s not a go-to for all of the above points I mentioned; you might have to do your homework as well. But as they say, nuclear energy is produced by Uranium and not the by-products that happen. I think fast.ai is the uranium of this chain reaction: if you follow it correctly, you might get amazing results! 🙂
However, it wouldn’t really have been possible for me if I hadn’t joined the KaggleNoobs community, and the DSNet, ODS.AI and TWiMLAI ones as well. All of these have been an amazing learning point for me, even though I haven’t contributed to as many conversations or latest paper discussions ever since I committed to the podcast release schedule. Yet, I’m really grateful that these communities and everyone from them were welcoming to me. (Again, I found out about these via the fast.ai forums 🙂 )
Finally, I’d mention that when I say “online or community or self-taught” (I prefer community taught out of the three), I did follow and pursue a traditional CS Undergrad degree, where all of my online learning started. But the real credit for my (very little) knowledge of the Machine Learning world goes to online resources. I only engaged in my university courses where I truly enjoyed the syllabus and really skimmed the others. I’m not sure why, but I did get recognition with the “IET Prize”, which as they mentioned “is provided by the universities to their most outstanding students”. My only advice to anyone in University would be: really follow your passion and don’t be afraid to ask for help, online or offline.
My next role and next part in my Machine Learning Journey
From the page https://www.h2o.ai/team
I’m incredibly lucky and really excited to be starting a full-time role as a “Machine Learning Engineer and AI Content Creator” at H2O.ai.
This by no means classifies me as an expert. The company itself is home to many amazing kagglers and the world’s top data scientists, and I’m really excited about the next part of my journey and ready to learn more.
Thanks to the internet, fast.ai and online ML communities for defining my path, and thanks especially to my parents, who agreed to this completely crazy plan and supported me throughout: when the podcast’s initial growth was slow, when I needed help getting started with my first GPU, and even earlier than that, when I chose to quit the university campus placements since most of them were around Full Stack roles and I really wanted to Kaggle instead (having ZERO experience with any sort of competitive data science) and spend time following online courses. As crazy as the plan sounded, my parents have always been the most supportive (they paid for my university fees, and even for the online courses that I pursued). And even more so my mother, who would always put up with a son who’d quite often spend a few days in a room, consuming endless cups of chai next to a GPU generating a lot of electrical noise.
I’ll continue sharing my journey, via these blogposts at Hackernoon (And Medium), unfiltered and regularly. And thanks to everyone who has been a part of this amazing journey with me.