Designing privacy-proven pipelines
When an ML model is deployed in production, it needs a privacy-preserving pipeline that takes in data, preprocesses it, and makes it suitable for training and prediction. In this section, let us walk through some of the important concepts to keep in mind while designing pipelines that continuously ingest terabytes or even petabytes of data.
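As a rough illustration of the idea above, the following minimal sketch chains an ingestion stage and a preprocessing stage so that later stages never see raw identifiers. It is only one possible arrangement, not a prescribed design: the field names (`user_id`, `name`, `email`, `heart_rate`), the scaling constant, and the hard-coded key are all hypothetical, and the choice of a keyed hash (HMAC-SHA256) for pseudonymization is an assumption made for this example.

```python
import hashlib
import hmac

# Hypothetical secret for illustration; a real deployment would load it
# from a secrets manager rather than hard-coding it.
PSEUDONYM_KEY = b"example-key-not-for-production"

# Hypothetical field names, chosen only for this sketch.
DIRECT_IDENTIFIERS = ("name", "email")


def pseudonymize(value: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Replace an identifier with a keyed, non-reversible token (HMAC-SHA256)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def ingest(record: dict) -> dict:
    """Ingestion stage: drop direct identifiers and tokenize the user ID."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["user_id"] = pseudonymize(str(record["user_id"]))
    return cleaned


def preprocess(record: dict) -> dict:
    """Preprocessing stage: scale the raw reading and clamp it to [0, 1]."""
    out = dict(record)
    # 250.0 is an assumed upper bound used only to demonstrate scaling.
    out["heart_rate"] = min(max(record["heart_rate"] / 250.0, 0.0), 1.0)
    return out


def run_pipeline(records: list) -> list:
    """Chain the stages so training code only receives cleaned records."""
    return [preprocess(ingest(r)) for r in records]
```

Because the same key always maps the same user ID to the same token, records from one user can still be joined downstream without exposing the original identifier.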
Big data pipelines
In a big data pipeline, we incorporate security and privacy across the entire design: data aggregation, data processing, feature engineering, model training, evaluation, and serving of the trained models. Data can arrive from innumerable sources, including mobile devices, sensors, and IoT and Internet of Medical Things (IoMT) devices, in the form of text, numbers, images, or video frames. To architect such an IoT-to-cloud privacy- and security-enabled data pipeline, we follow a hierarchical layered deployment strategy, with four access layers primarily designed to...