ML Model Deployment Concepts (Part 1)
Deployment

Deployment is the process of running an application on a server or in a pipeline: given a trained model object, how do you make it part of your application? It consists of two components:

- Deploying the model
  - Low-level solution
  - Off-the-shelf (high-level) solution
- Serving the model
  - Batch serving
  - Low-latency serving

Batch Serving

Pros:
- Very flexible
- Easy to set up; the model object/data can live anywhere

Cons:
- High latency

Process:
1. Read input
2. Run inference
3. Return output

Low-Latency Serving

Pros:
- Low latency

Cons:
- Rigid schema
- Infrastructure setup overhead
- Only handles relatively simple models?

Complexity:
- Communication interface setup
  - Format (gRPC/JSON)
  - Security
- Managing multiple deployed models

Hybrid

Still has the same complexity as low-latency serving, but with:
- A more accurate model
- Feasible infrastructure
- Input that doesn't need to contain all features

Packages

To properly productionise model deployment and serving, we may want to consider requirements related to:
- Model versioning
- Model metadata
- Model artifacts
- Lo...
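The three-step batch serving process (read input, run inference, return output) could be sketched as below. This is a minimal illustration, not a prescribed implementation: the `ThresholdModel` stand-in, the file paths, and the JSON-lines format are all assumptions made for the example; in practice the model object would be a real estimator (e.g. scikit-learn) stored anywhere convenient.

```python
import json
import pickle
from pathlib import Path


class ThresholdModel:
    """Hypothetical stand-in for a trained model object (e.g. a scikit-learn estimator)."""

    def predict(self, rows):
        # Classify each row by a fixed threshold on its "score" feature.
        return [1 if row["score"] > 0.5 else 0 for row in rows]


def batch_serve(model_path: Path, input_path: Path, output_path: Path) -> None:
    # Load the model object from wherever it was stored (batch serving is
    # flexible about this -- local disk, object store, etc.).
    with model_path.open("rb") as f:
        model = pickle.load(f)

    # 1. Read input: here, one JSON record per line.
    with input_path.open() as f:
        rows = [json.loads(line) for line in f]

    # 2. Run inference over the whole batch at once.
    preds = model.predict(rows)

    # 3. Return output: write predictions next to their inputs.
    with output_path.open("w") as f:
        for row, pred in zip(rows, preds):
            f.write(json.dumps({"input": row, "prediction": pred}) + "\n")
```

The high latency comes from the fact that nothing is returned until the whole batch has been read and scored; the flexibility comes from the model and data being plain files that can be placed anywhere.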
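The "rigid schema" con and the "format (gRPC/JSON)" complexity of low-latency serving can be illustrated with a small request handler. This is a sketch under assumed names: `SCHEMA`, `handle_request`, and the `user_id`/`score` fields are invented for the example, and the inference step is a trivial stand-in; a real endpoint would sit behind an HTTP or gRPC server and call an actual model.

```python
import json

# Hypothetical rigid schema: every request must supply exactly these typed
# fields, which is what makes low-latency endpoints less flexible than batch.
SCHEMA = {"user_id": int, "score": float}


def handle_request(body: str) -> str:
    """Validate a JSON request body against the schema, run inference,
    and return a JSON response, as an HTTP/gRPC handler would."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return json.dumps({"error": "invalid JSON"})

    # Schema enforcement: reject anything missing or mistyped.
    for field, ftype in SCHEMA.items():
        if field not in payload or not isinstance(payload[field], ftype):
            return json.dumps(
                {"error": f"field '{field}' missing or not {ftype.__name__}"}
            )

    # Inference step: trivial stand-in for a real (and necessarily fast) model.
    prediction = 1 if payload["score"] > 0.5 else 0
    return json.dumps({"prediction": prediction})
```

Each request is scored individually as it arrives, which is what keeps latency low; the trade-off is that any change to the input features means changing the schema, the clients, and the deployed handler together.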