Vol #20 | Nov '22 | Data Engineering Certifications - Databricks
My recent experiences of passing the Databricks Certifications
Hope you all had a great Diwali & spent some good time with your friends & family!
In this month’s post, I’ll write about my experiences of attempting the Databricks certifications.
I’ve been working on & off in Databricks for the last couple of years, but the certifications were not really at the top of my mind. (I also wrote a blog post on Medium about my other certification journey.) However, one fine day, I saw a LinkedIn post about Databricks’ free training (along with exam vouchers!) & enrolled right away. At that time, I was also exploring Lakehouse and found it very intriguing. As I got more interested, I took these 3 certification exams & cleared all of them.
In this post, I’ve tried to capture my experiences. I’ve added some reference links, but you can also explore the Databricks website/documentation/blogs & YouTube channels.
Data Engineering Associate
This exam is a good starting point for Data Engineers. It will test you on all the basic concepts around Databricks, Lakehouse & Spark.
I don’t have a programming background and was a bit worried about the PySpark programming questions. But the exam’s main intention is to validate the candidate’s understanding of the core concepts & fundamentals, not just programming skills. So if you are also a DE from an ETL (INFA, DS, Talend) background - don’t worry about the programming stuff. You should be fine as long as you understand Lakehouse & Spark concepts.
Topic Breakup - https://www.databricks.com/learn/certification/data-engineer-associate
I found the exam to be relatively easy - considering I have been working in Databricks for a good amount of time and had explored most of the basic features.
Important Note - Do a lot of hands-on practice. Create an account on the Community Edition & complete the notebooks provided by Databricks.
Practice Notebooks provided by Databricks Academy
https://github.com/databricks-academy/data-engineering-with-databricks-english
Data Analyst Associate
This exam is mainly for Data Analysts, but in the modern data world, I think a Data Engineer is expected to perform all the roles - from analysis and testing to engineering and design. So this is another good cert to add to your profile.
Since I have worked in SQL for a long time, I did not have to prepare much for this exam. Like the Data Engineering Associate exam, I found this one fairly simple and enjoyed taking it.
Topic Breakup - https://www.databricks.com/learn/certification/data-analyst-associate
As of now, there is NO practice exam for Data Analyst Associate Certification, so I was not sure what to expect in the exam. One section I struggled with was around visualizations and the last-mile ETL.
Important Note - If you are using the Community Edition, then Databricks SQL is not available for practice. You can create a paid account using Azure Databricks. The other option is to execute queries using spark.sql in the notebooks.
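To illustrate the spark.sql option, here is a minimal sketch of how practice queries could be run from a notebook cell. It assumes the `spark` session that Databricks notebooks provide by default; the helper `run_sql`, the table `practice_sales` and the sample rows are made-up names for illustration, not something from the exam or the Academy notebooks.

```python
# Hypothetical practice statements; the table and values are illustrative only.
PRACTICE_QUERIES = [
    "CREATE TABLE IF NOT EXISTS practice_sales (region STRING, amount DOUBLE)",
    "INSERT INTO practice_sales VALUES ('north', 120.0), ('south', 80.5)",
    "SELECT region, SUM(amount) AS total FROM practice_sales GROUP BY region",
]


def run_sql(spark, queries):
    """Run each SQL statement through spark.sql, in order.

    `spark` is expected to be the SparkSession that a Databricks
    notebook exposes as a predefined variable. Returns one result
    (a DataFrame in a real notebook) per statement.
    """
    return [spark.sql(query) for query in queries]


# In a notebook cell you would then call, for example:
#   results = run_sql(spark, PRACTICE_QUERIES)
#   results[-1].show()   # display the output of the final SELECT
```

This only covers the SQL syntax side of the syllabus; dashboards, visualizations and alerts still need a workspace where Databricks SQL is available.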
Practice Notebooks provided by Databricks Academy
https://github.com/databricks-academy/data-analysis-with-databricks-sql
Data Engineering Professional
This is a tough one!
Professional exams are difficult - be it AWS or Databricks. I found this exam very challenging. The 2 hours of the actual examination needed a lot of concentration, and the questions tested my understanding of both basic & advanced concepts from an application perspective.
A few years back, I cleared the AWS Data Analytics Specialty exam. If we compare the complexity of the questions, the Databricks DE Pro was very similar.
Topic Breakup - https://www.databricks.com/learn/certification/data-engineer-professional
The best part of this exam is the preparation journey. You get to learn a lot of advanced features that you can apply to your projects. E.g. DLT SCD features, CDF for identifying delta changes, CLI & REST API for job creation, and Repos for integration. All these are very handy features that you can leverage while working on projects.
Important Note - You will need a lot of patience when taking this exam. Attempt it with a relaxed mindset and a fighting spirit. Do not give up midway; mark questions to revisit & review them at the end if time permits. Plan your time well, as 120 minutes might not be enough. Don’t spend a lot of time on one question; if you can’t answer it in a couple of minutes, just move ahead.
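As one example of these features, here is a minimal sketch of how CDF (Change Data Feed) can be queried to pick up only the rows that changed in a Delta table. It assumes CDF has been enabled on the table and uses the `table_changes` SQL function with the `_change_type` metadata column; the helper names and the table name `sales.orders` are hypothetical, and the generated SQL would be passed to spark.sql in a notebook.

```python
def enable_cdf_sql(table):
    """Build the statement that turns on Change Data Feed for a Delta table."""
    return (
        f"ALTER TABLE {table} "
        "SET TBLPROPERTIES (delta.enableChangeDataFeed = true)"
    )


def changes_since_sql(table, start_version):
    """Build a query for rows changed since a given table version.

    Each returned row carries a _change_type of insert, delete,
    update_preimage or update_postimage. The pre-update images are
    filtered out here so that only the latest state of each changed
    row is kept.
    """
    return (
        f"SELECT * FROM table_changes('{table}', {int(start_version)}) "
        "WHERE _change_type != 'update_preimage'"
    )


# In a notebook cell (table name is illustrative):
#   spark.sql(enable_cdf_sql("sales.orders"))
#   spark.sql(changes_since_sql("sales.orders", 3)).show()
```

Dropping `update_preimage` rows is just one design choice for incremental loads; keep them if you need the before and after values of each update.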
Practice Notebooks provided by Databricks Academy
https://github.com/databricks-academy/advanced-data-engineering-with-databricks
I found the Databricks DE Professional exam extremely difficult. I might be lucky to clear this one, but more than passing the certification, it’s the preparation that I really value. It has taught me how to leverage advanced features and apply them in real-world projects.
And that’s what I’d like to suggest to you. Don’t worry about whether you pass or fail any of the certification exams. Certs just validate the skills that you already possess. The most important part of the journey is the preparation, training & learning. Enjoy your learning phase: watch great talks on YouTube, attend the Data & AI Summit, and listen to how others have implemented their data ecosystems using Databricks.
This is what really matters & will help you to apply the concepts in real-world projects!
Reference Links which I referred to while studying
You can also refer to the LinkedIn posts shared by various Databricks team members. They have been doing a great job by sharing information about the latest updates, tech blogs, free training and key events. Thanks to them, I was able to get my hands on some great articles & some very informative trainings/webinars.