Is Software Engineering a Good Shortcut to Machine Learning Engineering?
Yes, a software engineering role can be a good shortcut to a machine learning engineer (MLE) position, but with important caveats: how much of a shortcut it is depends on the skills you already have.
The Importance of SQL and Python
An estimated 90% of real-world machine learning involves classification and regression on structured data housed in data warehouses and relational databases, so proficiency in SQL is crucial. These data stores are all queried with SQL. If you are a software engineer primarily experienced with languages like C or C++, transitioning directly to an MLE role can be challenging without SQL skills.
Additionally, Python is essential. Applied machine learning relies heavily on Python and its extensive libraries. Although Python's reference implementation (CPython) is written in C, the vast ecosystem of libraries used daily for tasks from data cleansing to modeling is built for Python.
Assessing Your Skills
- SQL and Python Proficiency: If you already possess strong SQL skills (complex joins, data sourcing) and Python knowledge, a software engineering background is indeed a valuable shortcut. You can then focus on learning ML libraries, models (e.g., XGBoost), and the end-to-end machine learning pipeline.
- Lack of SQL and Python: If you lack SQL and Python, transitioning will require acquiring these skills. Knowing C or other languages is not a direct substitute, as the core tools and libraries are Python-based.
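As a rough self-check on the "complex joins, data sourcing" side, the sketch below uses Python's built-in sqlite3 module with an in-memory database to build a small per-customer feature table. The tables (customers, orders), their columns, and the function names are hypothetical, invented only for illustration. A real data warehouse would use its own schema and driver.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL
        );
        INSERT INTO customers VALUES (1, 'north'), (2, 'south'), (3, 'north');
        INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 30.0), (3, 2, 15.0);
        """
    )
    return conn


def customer_features(conn: sqlite3.Connection) -> list:
    """Return one row per customer: id, region, order count, total spend.

    LEFT JOIN keeps customers that have no orders; COALESCE turns their
    missing total into 0.0 instead of NULL.
    """
    query = """
        SELECT c.id,
               c.region,
               COUNT(o.id) AS n_orders,
               COALESCE(SUM(o.amount), 0.0) AS total_spent
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.region
        ORDER BY c.id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in customer_features(build_demo_db()):
        print(row)
```

If writing and reading a query like this feels comfortable, the SQL half of the requirement is probably in reasonable shape. If the LEFT JOIN and GROUP BY interplay is unfamiliar, that is likely where to start.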
The Machine Learning Workflow
Understanding the typical workflow in machine learning is essential for software engineers looking to transition:
- Data Sourcing: Extracting data from relational databases or data warehouses. This is often the most challenging part.
- Data Cleansing: Cleaning and preparing the data. This can account for a significant portion (estimated at 95%) of an MLE's daily work.
- Modeling: Applying machine learning models. While important, the top models are well-established, making this phase relatively straightforward.
- Production: Deploying the model. Many companies may restrict access to this step, requiring you to hand off the model and instructions for its use to other teams.
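The four steps above can be sketched end to end in a few functions. This is a minimal illustration under stated assumptions, not a production pipeline: the listings table and its columns are hypothetical, and a one-feature least-squares regression stands in for an established library model such as XGBoost so that the example needs only the Python standard library. The "production" step is shown as a hand-off: the fitted parameters are serialized to JSON for another team to load.

```python
import json
import sqlite3


def source(conn):
    """Data sourcing: pull raw rows out of a relational table."""
    return conn.execute("SELECT sqft, price FROM listings").fetchall()


def cleanse(rows):
    """Data cleansing: drop rows with missing or non-positive values."""
    return [
        (float(sqft), float(price))
        for sqft, price in rows
        if sqft is not None and price is not None and sqft > 0 and price > 0
    ]


def fit(rows):
    """Modeling: one-feature ordinary least squares in closed form."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def export(model):
    """Production hand-off: serialize the fitted parameters as JSON."""
    return json.dumps(model)


def predict(model_json, sqft):
    """What the receiving team might run against the exported model."""
    model = json.loads(model_json)
    return model["slope"] * sqft + model["intercept"]


def demo_connection():
    """Build a hypothetical in-memory table with some dirty rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE listings (sqft REAL, price REAL)")
    conn.executemany(
        "INSERT INTO listings VALUES (?, ?)",
        [
            (1000, 200000),
            (1500, 300000),
            (None, 250000),  # missing feature, removed by cleanse()
            (2000, 400000),
            (-1, 100000),    # invalid feature, removed by cleanse()
        ],
    )
    return conn


if __name__ == "__main__":
    artifact = export(fit(cleanse(source(demo_connection()))))
    print(predict(artifact, 1200))
```

Even in this toy version, the modeling function is the shortest and least surprising part. The decisions about which rows to pull and which to discard are where most of the judgment goes.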
Newsletter Recommendation
For those interested in exploring data roles, consider subscribing to a recommended newsletter. It contains valuable content, including an archive detailing various roles in the data space and a machine learning engineering playbook. It's a free resource for gaining in-depth knowledge.
Data Engineering vs. Machine Learning Engineering
If you are already a data engineer, transitioning to an MLE role may not be necessary. Data engineers are highly sought after and well compensated. Top-tier data engineers can earn salaries comparable to those of MLEs, and the role is considered stable and crucial.