Hello World - Applied ML Dev Ops
Scope of Applied ML Dev Ops Repo
We deliver publicly reusable ML dev ops reference approaches and code that are ready for external customers. We develop and pilot these approaches through engagements, then generalize and publish them for public use.
0. Efficient Model Design and Evaluation
a.) Provide methods to version, organize, and share data, environments, models, and hyperparameters for reproducibility.
b.) Provide utilities and methods to log models and dependencies, and evaluate the impact of changes to target metrics.
c.) Provide utilities and methods for pipelining data of variable size and type.
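As an illustration of item a.), one way to make a run traceable is to derive a single fingerprint from the data, the environment, and the hyperparameters. The following is a minimal sketch using only the Python standard library; the function name `run_fingerprint` and its argument layout are hypothetical and are not part of this repo.

```python
import hashlib
import json


def run_fingerprint(data_bytes, hyperparameters, environment):
    """Return a short hex digest that identifies one training run.

    Hypothetical helper: identical inputs always yield the same digest,
    and changing any input yields a different one.
    """
    payload = {
        # Hash the raw data separately so large inputs are not embedded in JSON.
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "hyperparameters": hyperparameters,
        "environment": environment,
    }
    # sort_keys makes the serialization independent of dict insertion order.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


if __name__ == "__main__":
    # Illustrative values only; they do not describe a real experiment.
    fingerprint = run_fingerprint(
        b"feature,label\n1.0,0\n2.0,1\n",
        {"learning_rate": 0.01, "epochs": 5},
        {"python": "3.11"},
    )
    print(fingerprint)
```

Storing such a fingerprint next to each logged model would let two engineers confirm that they trained on the same inputs before comparing metrics.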
1. Continuous Deployment
a.) Provide methods and code to automate the entire deployment process for new models, iterated models, new data sources, and updated data schemas.
b.) Let ML, data engineering, and development teams work in parallel by minimizing dependencies between them and automating the dependencies that remain.
c.) Enable scaling up or down in operation.
d.) Enable differential privacy or other privacy methods.
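Item d.) mentions differential privacy. One common building block is the Laplace mechanism, which adds noise scaled to sensitivity divided by epsilon. The sketch below assumes a simple counting query with sensitivity 1; the names `laplace_noise` and `private_count` are hypothetical, and this is an illustration rather than the approach this repo takes.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent exponential samples with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng=None):
    """Return a noisy count using the Laplace mechanism.

    Assumption: adding or removing one record changes the count by at
    most 1, so the sensitivity of the query is 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    sensitivity = 1.0
    # Smaller epsilon means a larger noise scale and stronger privacy.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    print(private_count(1000, epsilon=0.5))
```

A production deployment would also need to track the cumulative privacy budget across queries, which this sketch does not attempt.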
2. Continuous Integration
a.) Provide methods and code to enable continuous integration of machine learning projects, from data pipelining, to model development and training, to model deployment and retraining.
b.) Reduce friction in hand-offs between ML, data engineering and development.
c.) Ease and automate the data governance and curation workload so that benchmarking across ML engineers and model iterations is easier.
3. Continuous Learning & Monitoring
a.) Provide methods and code to benchmark model performance against a reference data set.
b.) Provide methods and code to measure compute consumption against benchmarks and in continuous operation.
c.) Provide tools to monitor data quality and schema conformance.
d.) Provide methods and code to perform experimentation, from manual experiments to flighting.
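To make item c.) concrete, a schema conformance check can be as small as comparing each incoming record against the expected field names and types. This is a minimal sketch under assumed conventions: records are plain dictionaries, the schema maps field names to Python types, and the function name `schema_violations` is hypothetical.

```python
def schema_violations(records, schema):
    """Return a list of human-readable schema violations.

    `schema` maps each required field name to its expected Python type.
    An empty list means every record conforms.
    """
    violations = []
    for index, record in enumerate(records):
        for field, expected_type in schema.items():
            if field not in record:
                violations.append(f"row {index}: missing field '{field}'")
            elif not isinstance(record[field], expected_type):
                actual = type(record[field]).__name__
                violations.append(
                    f"row {index}: '{field}' is {actual}, "
                    f"expected {expected_type.__name__}"
                )
        # Report fields that the schema does not know about, in sorted order.
        for field in sorted(record.keys() - schema.keys()):
            violations.append(f"row {index}: unexpected field '{field}'")
    return violations


if __name__ == "__main__":
    schema = {"age": int, "name": str}
    batch = [{"age": 31, "name": "a"}, {"age": "31"}]
    for line in schema_violations(batch, schema):
        print(line)
```

A monitoring job could run a check like this on every incoming batch and raise an alert when the returned list is not empty.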
Tenets
For each of the areas, we have a few guiding tenets that focus our deliverables.
- Simplicity. – It’s easy for engineers to understand and use.
- Productivity. – We make ML engineering work more efficient and productive, whether done alone or alongside data engineering and other engineering teams.
- Flexibility. – We don’t constrain ML approaches.
- Scalability. – We start by enabling simple deployments, then enable scale.
- Privacy. – We enable the deployment of privacy controls that protect private information from disclosure.
- Azure First. – We start with coverage of deployments in the Azure environment.
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.