---
visibility: public
---

# Software Engineering for Machine Learning

**repo:** [SE-ML/awesome-seml](https://github.com/SE-ML/awesome-seml)  
**category:** [[computer-science|Computer Science]]
**related:** [[learning|Learning]]

---

# Awesome Software Engineering for Machine Learning

Software Engineering for Machine Learning covers techniques and guidelines for building ML applications that do not concern the core ML problem (e.g. the development of new algorithms) but rather the surrounding activities, such as data ingestion, coding, testing, versioning, deployment, quality control, and team collaboration.
Good software engineering practices enhance the development, deployment, and maintenance of production-level applications that use machine learning components.

⭐ Must-read

🎓 Scientific publication

<br>
Based on this literature, we compiled a survey on the adoption of software engineering practices for applications with machine learning components.

Feel free to [take and share the survey](https://se-ml.github.io/survey) and to [read more](https://se-ml.github.io/practices)!

## Contents

- [Broad Overviews](#broad-overviews)
- [Data Management](#data-management)
- [Model Training](#model-training)
- [Deployment and Operation](#deployment-and-operation)
- [Social Aspects](#social-aspects)
- [Governance](#governance)
- [Tooling](#tooling)

## Broad Overviews

These resources cover all aspects of software engineering for machine learning.
- [AI Engineering: 11 Foundational Practices](https://resources.sei.cmu.edu/asset_files/WhitePaper/2019_019_001_634648.pdf) ⭐
- [Best Practices for Machine Learning Applications](https://pdfs.semanticscholar.org/2869/6212a4a204783e9dd3953f06e103c02c6972.pdf)
- [Engineering Best Practices for Machine Learning](https://se-ml.github.io/practices/) ⭐
- [Hidden Technical Debt in Machine Learning Systems](https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf) 🎓⭐
- [Rules of Machine Learning: Best Practices for ML Engineering](https://developers.google.com/machine-learning/guides/rules-of-ml) ⭐
- [Software Engineering for Machine Learning: A Case Study](https://www.microsoft.com/en-us/research/publication/software-engineering-for-machine-learning-a-case-study/) 🎓⭐

## Data Management

How to manage the data sets you use in machine learning.

- [A Survey on Data Collection for Machine Learning: A Big Data - AI Integration Perspective (2019)](https://deepai.org/publication/a-survey-on-data-collection-for-machine-learning-a-big-data-ai-integration-perspective) 🎓
- [Automating Large-Scale Data Quality Verification](http://www.vldb.org/pvldb/vol11/p1781-schelter.pdf) 🎓
- [Data management challenges in production machine learning](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/46178.pdf)
- [Data Validation for Machine Learning](https://mlsys.org/Conferences/2019/doc/2019/167.pdf) 🎓
- [How to organize data labelling for ML](https://www.altexsoft.com/blognp/datascience/how-to-organize-data-labeling-for-machine-learning-approaches-and-tools/)
- [The curse of big data labeling and three ways to solve it](https://aws.amazon.com/blogs/apn/the-curse-of-big-data-labeling-and-three-ways-to-solve-it/)
- [The Data Linter: Lightweight, Automated Sanity Checking for ML Data Sets](http://learningsys.org/nips17/assets/papers/paper_19.pdf) 🎓
- [The ultimate guide to data labeling for ML](https://www.cloudfactory.com/data-labeling-guide)

## Model Training

How to organize your model training experiments.

- [10 Best Practices for Deep Learning](https://nanonets.com/blog/10-best-practices-deep-learning/#track-model-experiments)
- [Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement](https://dl.acm.org/doi/abs/10.1145/1882471.1882479) 🎓
- [Fairness On The Ground: Applying Algorithmic Fairness Approaches To Production Systems](https://scontent-amt2-1.xx.fbcdn.net/v/t39.8562-6/159714417_1180893265647073_4215201353052552221_n.pdf?_nc_cat=111&ccb=1-3&_nc_sid=ae5e01&_nc_ohc=6WFnNMmyp68AX95bRHk&_nc_ht=scontent-amt2-1.xx&oh=7a548f822e659b7bb2f58a511c30ee19&oe=606F33AD) 🎓
- [How do you manage your Machine Learning Experiments?](https://medium.com/@hadyelsahar/how-do-you-manage-your-machine-learning-experiments-ab87508348ac)
- [Machine Learning Testing: Survey, Landscapes and Horizons](https://arxiv.org/pdf/1906.10742.pdf) 🎓
- [Nitpicking Machine Learning Technical Debt](https://matthewmcateer.me/blog/machine-learning-technical-debt/)
- [On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach](https://link.springer.com/article/10.1023/A:1009752403260) 🎓⭐
- [On human intellect and machine failures: Troubleshooting integrative machine learning systems](https://arxiv.org/pdf/1611.08309.pdf) 🎓
- [Pitfalls and Best Practices in Algorithm Configuration](https://www.jair.org/index.php/jair/article/download/11420/26488/) 🎓
- [Pitfalls of supervised feature selection](https://academic.oup.com/bioinformatics/article/26/3/440/213774) 🎓
- [Preparing and Architecting for Machine Learning](https://www.gartner.com/en/documents/3889770/preparing-and-architecting-for-machine-learning-2018-upd)
- [Preliminary Systematic Literature Review of Machine Learning System Development Process](https://arxiv.org/abs/1910.05528) 🎓
- [Software development best practices in a deep learning environment](https://towardsdatascience.com/software-development-best-practices-in-a-deep-learning-environment-a1769e9859b1)
- [Testing and Debugging in Machine Learning](https://developers.google.com/machine-learning/testing-debugging)
- [What Went Wrong and Why? Diagnosing Situated Interaction Failures in the Wild](https://www.microsoft.com/en-us/research/publication/what-went-wrong-and-why-diagnosing-situated-interaction-failures-in-the-wild/) 🎓

## Deployment and Operation

How to deploy and operate your models in a production environment.

- [Best Practices in Machine Learning Infrastructure](https://algorithmia.com/blog/best-practices-in-machine-learning-infrastructure)
- [Building Continuous Integration Services for Machine Learning](http://pages.cs.wisc.edu/~wentaowu/papers/kdd20-ci-for-ml.pdf) 🎓
- [Continuous Delivery for Machine Learning](https://martinfowler.com/articles/cd4ml.html) ⭐
- [Continuous Training for Production ML in the TensorFlow Extended (TFX) Platform](https://www.usenix.org/system/files/opml19papers-baylor.pdf) 🎓
- [Fairness Indicators: Scalable Infrastructure for Fair ML Systems](https://ai.googleblog.com/2019/12/fairness-indicators-scalable.html) 🎓
- [Machine Learning Logistics](https://mapr.com/ebook/machine-learning-logistics/)
- [Machine learning: Moving from experiments to production](https://blog.codecentric.de/en/2019/03/machine-learning-experiments-production/)
- [ML Ops: Machine Learning as an Engineering Discipline](https://towardsdatascience.com/ml-ops-machine-learning-as-an-engineering-discipline-b86ca4874a3f)
- [Model Governance Reducing the Anarchy of Production](https://www.usenix.org/conference/atc18/presentation/sridhar) 🎓
- [ModelOps: Cloud-based lifecycle management for reliable and trusted AI](http://hummer.io/docs/2019-ic2e-modelops.pdf)
- [Operational Machine Learning](https://www.kdnuggets.com/2018/04/operational-machine-learning-successful-mlops.html)
- [Scaling Machine Learning as a Service](http://proceedings.mlr.press/v67/li17a/li17a.pdf) 🎓
- [TFX: A TensorFlow-based Production-Scale ML Platform](https://dl.acm.org/doi/pdf/10.1145/3097983.3098021?download=true) 🎓
- [The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction](https://research.google/pubs/pub46555/) 🎓
- [Underspecification Presents Challenges for Credibility in Modern Machine Learning](https://arxiv.org/abs/2011.03395) 🎓
- [Versioning for end-to-end machine learning pipelines](https://doi.org/10.1145/3076246.3076248) 🎓

## Social Aspects

How to organize teams and projects to ensure effective collaboration and accountability.

- [Data Scientists in Software Teams: State of the Art and Challenges](http://web.cs.ucla.edu/~miryung/Publications/tse2017-datascientists.pdf) 🎓
- [Machine Learning Interviews](https://github.com/chiphuyen/machine-learning-systems-design/blob/master/build/build1/consolidated.pdf)
- [Managing Machine Learning Projects](https://d1.awsstatic.com/whitepapers/aws-managing-ml-projects.pdf)
- [Principled Machine Learning: Practices and Tools for Efficient Collaboration](https://dev.to/robogeek/principled-machine-learning-4eho)

## Governance
- [A Human-Centered Interpretability Framework Based on Weight of Evidence](https://arxiv.org/pdf/2104.13299.pdf) 🎓
- [An Architectural Risk Analysis Of Machine Learning Systems](https://berryvilleiml.com/docs/ara.pdf)
- [Beyond Debiasing](https://complexdiscovery.com/wp-content/uploads/2021/09/EDRi-Beyond-Debiasing-Report.pdf)
- [Closing the AI Accountability Gap: Defining an End-to-End Framework for Internal Algorithmic Auditing](https://dl.acm.org/doi/pdf/10.1145/3351095.3372873) 🎓
- [Inherent trade-offs in the fair determination of risk scores](https://arxiv.org/abs/1609.05807) 🎓
- [Responsible AI practices](https://ai.google/responsibilities/responsible-ai-practices/) ⭐
- [Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims](https://arxiv.org/abs/2004.07213)
- [Understanding Software-2.0](https://dl.acm.org/doi/abs/10.1145/3453478) 🎓

## Tooling

Tooling can make your life easier.

We only share open source tools, or commercial platforms that offer substantial free packages for research.

- [Aim](https://aimstack.io) - Aim is an open source experiment tracking tool.
- [Airflow](https://airflow.apache.org/) - Programmatically author, schedule and monitor workflows.
- [Alibi Detect](https://github.com/SeldonIO/alibi-detect) - Python library focused on outlier, adversarial and drift detection.
- [Archai](https://github.com/microsoft/archai) - Neural architecture search.
- [Data Version Control (DVC)](https://dvc.org/) - DVC is a data and ML experiments management tool.
- [Facets Overview / Facets Dive](https://pair-code.github.io/facets/) - Robust visualizations to aid in understanding machine learning datasets.
- [FairLearn](https://fairlearn.github.io/) - A toolkit to assess and improve the fairness of machine learning models.
- [Git Large File Storage (LFS)](https://git-lfs.github.com/) - Replaces large files such as datasets with text pointers inside Git.
- [Great Expectations](https://github.com/great-expectations/great_expectations) - Data validation and testing with integration in pipelines.
- [HParams](https://github.com/PetrochukM/HParams) - A thoughtful approach to configuration management for machine learning projects.
- [Kubeflow](https://www.kubeflow.org/) - A platform for data scientists who want to build and experiment with ML pipelines.
- [Label Studio](https://github.com/heartexlabs/label-studio) - A multi-type data labeling and annotation tool with standardized output format.
- [LiFT](https://github.com/linkedin/LiFT) - LinkedIn Fairness Toolkit.
- [MLflow](https://mlflow.org/) - Manage the ML lifecycle, including experimentation, deployment, and a central model registry.
- [Model Card Toolkit](https://github.com/tensorflow/model-card-toolkit) - Streamlines and automates the generation of model cards; for model documentation.
- [Neptune.ai](https://neptune.ai/) - Experiment tracking tool bringing organization and collaboration to data science projects.
- [Neuraxle](https://github.com/Neuraxio/Neuraxle) - Sklearn-like framework for hyperparameter tuning and AutoML in deep learning projects.
- [OpenML](https://www.openml.org) - An inclusive movement to build an open, organized, online ecosystem for machine learning.
- [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning) - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.
- [REVISE: REvealing VIsual biaSEs](https://github.com/princetonvisualai/revise-tool) - Automatically detect bias in visual data sets.
- [Robustness Metrics](https://github.com/google-research/robustness_metrics) - Lightweight modules to evaluate the robustness of classification models.
- [Seldon Core](https://github.com/SeldonIO/seldon-core) - An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models on Kubernetes.
- [Spark Machine Learning](https://spark.apache.org/mllib/) - Spark’s ML library consisting of common learning algorithms and utilities.
- [TensorBoard](https://www.tensorflow.org/tensorboard/) - TensorFlow's Visualization Toolkit.
- [TensorFlow Extended (TFX)](https://www.tensorflow.org/tfx/) - An end-to-end platform for deploying production ML pipelines.
- [TensorFlow Data Validation (TFDV)](https://github.com/tensorflow/data-validation) - Library for exploring and validating machine learning data. Similar to Great Expectations, but for TensorFlow data.
- [Weights & Biases](https://www.wandb.com/) - Experiment tracking, model optimization, and dataset versioning.

## Contribute

Contributions welcome! Read the [contribution guidelines](contributing.md) first.