Data science projects typically involve several streams of roles performing different functions: big data engineers, data scientists, and operations teams. Data engineers primarily handle ETL, prepare data sets for analysis, and code the models developed by data scientists into scripts. Data scientists develop the models, evaluate different algorithms and models on sample test data, and validate them with real data.
When these teams work in silos, their output may be confined to proofs of concept (PoCs) and never extend to large projects. Yet even minor tasks require overlapping skills, along with multiple interactions and design sessions among big data engineers, data scientists, and operations teams.
DevOps can bridge the gap between these streams and enable collaborative working through the following best practices:
- Tools and platforms evolution: Engineers and data scientists should continuously evaluate and contribute to the creation of new...