Facebook offers a wide variety of products and services, most of which rely on machine learning. Each machine learning model first goes through a training phase and then, once deployed, an inference phase. During training, a model can ingest hundreds of terabytes of data. During inference, depending on the product, a model may be run tens of trillions of times per day, and generally the predictions have to be served in real time. System design is therefore key to meeting these requirements while keeping performance high.
This post analyzes two papers published by Facebook ([1], [2]) to highlight the importance of system design in machine learning, and draws out three lessons that should be useful for any machine learning engineer. …