In this chapter, we focus on the actual implementation of the core tasks in the data science life cycle using the Greenplum analytics platform. As a quick recap, let us look at what we have covered so far. We have defined the characteristics of Big Data and the requirements for a next-generation analytics and business intelligence platform. We have also learned about the various phases of the data science life cycle and seen what Greenplum offers to address these analytics requirements. We have covered a little theory on some standard analytical methods and had a quick onboarding exercise for the R, Weka, and MADlib frameworks. We now know the analytics requirements and where the Greenplum product suite can be leveraged.
Let's now look at the implementation using Greenplum products. We will also look at the integration between the various components.
This chapter covers the following topics:
Data loading
Structured (into Greenplum)
Using Greenplum...