IBM Cloud Pak for Data

By : Hemanth Manda, Sriram Srinivasan, Deepak Rangarao


Overview of this book

Cloud Pak for Data is IBM's modern data and AI platform that includes strategic offerings from its data and AI portfolio delivered in a cloud-native fashion with the flexibility of deployment on any cloud. The platform offers a unique approach to addressing modern challenges with an integrated mix of proprietary, open-source, and third-party services. You'll begin by getting to grips with key concepts in modern data management and artificial intelligence (AI), reviewing real-life use cases, and developing an appreciation of the AI Ladder principle. Once you've gotten to grips with the basics, you will explore how Cloud Pak for Data helps in the elegant implementation of the AI Ladder practice to collect, organize, analyze, and infuse data and trustworthy AI across your business. As you advance, you'll discover the capabilities of the platform and extension services, including how they are packaged and priced. With the help of examples present throughout the book, you will gain a deep understanding of the platform, from its rich capabilities and technical architecture to its ecosystem and key go-to-market aspects. By the end of this IBM book, you'll be able to apply IBM Cloud Pak for Data's prescriptive practices and leverage its capabilities to build a trusted data foundation and accelerate AI adoption in your enterprise.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the color images

Conventions used

Get in touch

Share Your Thoughts

Section 1: The Basics

Chapter 1: The AI Ladder – IBM's Prescriptive Approach

Market dynamics and IBM's Data and AI portfolio

Introduction to the AI ladder

Collect – making data simple and accessible

Organize – creating a trusted analytics foundation

Analyze – building and scaling models with trust and transparency

Infuse – operationalizing AI throughout the business

Summary

Chapter 2: Cloud Pak for Data: A Brief Introduction

The case of a data and AI platform – recap

Overview of Cloud Pak for Data

Exploring unique differentiators, key use cases, and customer adoption

Cloud Pak for Data: additional details

Red Hat OpenShift

Summary

Section 2: Product Capabilities

Chapter 3: Collect – Making Data Simple and Accessible

Data – the world's most valuable asset

Challenges with data-centric delivery

Enterprise data architecture

Data virtualization – accessing data anywhere

Data virtualization versus ETL – when to use what?

Platform connections – streamlining data connectivity

Data estate modernization using Cloud Pak for Data

Summary

Chapter 4: Organize – Creating a Trusted Analytics Foundation

Introducing Data Operations (DataOps)

Organizing enterprise information assets

Establishing metadata and stewardship

Profiling to get a better understanding of your data

Classifying data for completeness

Enabling trust with data quality

Data privacy and activity monitoring

Data integration at scale

Master data management

Extending MDM toward a Digital Twin

Summary

Chapter 5: Analyzing: Building, Deploying, and Scaling Models with Trust and Transparency

Self-service analytics of governed data

BI and reporting

Predictive versus prescriptive analytics

Understanding AI

AI life cycle – Transforming insights into action

AI governance: Trust and transparency

Automating the AI life cycle using Cloud Pak for Data

Summary

Chapter 6: Multi-Cloud Strategy and Cloud Satellite

IBM's multi-cloud strategy

Supported deployment options

Cloud Pak for Data as a Service

IBM Cloud Satellite

A data fabric for a multi-cloud future

Summary

Chapter 7: IBM and Partner Extension Services

IBM and third-party extension services

Collect extension services

Organize extension services

Infuse cartridges

Modernization upgrades to Cloud Pak for Data cartridges

Summary

Chapter 8: Customer Use Cases

Improving health advocacy program efficiency

Voice-enabled chatbots

Risk and control automation

Enhanced border security

Unified Data Fabric

Financial planning and analytics

Summary

Section 3: Technical Details

Chapter 9: Technical Overview, Management, and Administration

Technical requirements

Architecture overview

Infrastructure requirements, storage, and networking

Foundational services and the control plane

Multi-tenancy, resource management, and security

Day 2 operations

Summary

References

Chapter 10: Security and Compliance

Technical requirements

Security and Privacy by Design

Secure operations in a shared environment

User access and authorizations

Meeting compliance requirements

Summary

References

Chapter 11: Storage

Understanding the concept of persistent volumes

Off-cluster storage

Operational considerations

Summary

Organize – creating a trusted analytics foundation

Given that data sits at the heart of AI, organizations will need to focus on the quality and governance of their data, ensuring it's accurate, consistent, and trusted. However, many organizations struggle to streamline their operating model when it comes to developing data pipelines and flows.

Some of the most common data challenges include the following:

Lack of data quality, governance, and lineage
Trustworthiness of structured and unstructured data
Searchability and discovery of relevant data
Siloed data across the organization
Slow time-to-insight for issues that call for real-time answers
Compliance, privacy, and regulatory pressures
Providing self-service access to data

To address these many data challenges, organizations are transforming their approach to data: they are undergoing application modernization and refining their data strategies to stay compliant while still fueling innovation.

Delivering trusted data throughout your organization requires the adoption of new methodologies and automation technologies to drive operational excellence in your data operations. This practice is known as DataOps. Many also refer to it as an "enterprise data fabric," and it plays a critical role in ensuring that enterprises gain value from their data.

DataOps corresponds to the Organize rung of IBM's AI ladder; it helps answer questions such as the following:

What data does your enterprise have, and who owns it?
Where is that data located?
What systems are using the data in question and for what purposes?
Does the data meet all regulatory and compliance requirements?
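The questions above map naturally onto the metadata a data catalog keeps for each asset. The following is a minimal sketch under stated assumptions, not a Cloud Pak for Data API: the `CatalogEntry` class, its fields, the `assets_missing` helper, and the sample assets are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical metadata record for one data asset (illustrative only)."""
    name: str                                   # what data the enterprise has
    owner: str                                  # who owns it
    location: str                               # where the data is located
    consumers: dict = field(default_factory=dict)       # system -> purpose of use
    regulations_met: set = field(default_factory=set)   # compliance checks passed


def assets_missing(catalog, regulation):
    """Return the names of assets not yet marked compliant with a regulation."""
    return [entry.name for entry in catalog if regulation not in entry.regulations_met]


# Sample catalog; every value here is made up for the example.
catalog = [
    CatalogEntry("customers", "sales-ops", "warehouse.crm.customers",
                 {"billing": "invoicing"}, {"GDPR"}),
    CatalogEntry("web_logs", "platform", "lake/raw/web_logs"),
]

print(assets_missing(catalog, "GDPR"))  # prints ['web_logs']
```

Keeping ownership, location, usage, and compliance status on the same record is one simple way to make all four questions answerable from a single lookup.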

DataOps also introduces agile development processes into data analytics so that data citizens and business users can work together more efficiently and effectively, resulting in a collaborative data management practice. And by using the power of automation, DataOps helps solve the issues associated with inefficiencies in data management, such as accessing, onboarding, preparing, integrating, and making data available.
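As one illustration of the kind of preparation step that automation can take over, the sketch below profiles a small dataset for completeness and flags columns that fall under a quality threshold. It is a plain-Python sketch, not how any Cloud Pak for Data service works; the function names, the 0.8 threshold, and the sample rows are assumptions made for the example.

```python
def completeness(rows, columns):
    """Fraction of non-missing values per column.

    None and the empty string are treated as missing (an assumption of this sketch).
    """
    total = len(rows)
    profile = {}
    for col in columns:
        filled = sum(1 for row in rows if row.get(col) not in (None, ""))
        profile[col] = filled / total if total else 0.0
    return profile


def failing_columns(profile, threshold):
    """Return, in sorted order, the columns whose completeness is below the threshold."""
    return sorted(col for col, ratio in profile.items() if ratio < threshold)


# Made-up sample data.
rows = [
    {"id": 1, "email": "a@example.com", "country": "DE"},
    {"id": 2, "email": "", "country": "US"},
    {"id": 3, "email": None, "country": "US"},
    {"id": 4, "email": "d@example.com", "country": None},
]

profile = completeness(rows, ["id", "email", "country"])
print(profile)                        # {'id': 1.0, 'email': 0.5, 'country': 0.75}
print(failing_columns(profile, 0.8))  # ['country', 'email']
```

Running a check like this automatically whenever data is onboarded lets issues surface before the data reaches analysts and business users.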

DataOps is defined as the orchestration of people, processes, and technology to deliver trusted, high-quality data to whoever needs it.

People – empowering your data citizens

A modern enterprise consists of many different "data citizens": the chief data officer; data scientists, analysts, architects, and engineers; and the individual line-of-business users who need insights from their data. The Organize rung is about creating and sustaining a data-driven culture that enables collaboration across an organization to drive agility and scale.

Each organization has unique requirements, and stakeholders in IT, data science, and the business lines all need to add value for the business to succeed. Also, because governance is one of the driving forces needed to support DataOps, organizations can leverage existing data governance committees and lessons from tenured data governance programs to help establish this culture and commitment.

With DataOps, businesses function more efficiently once they implement the right technology and develop self-service data capabilities that make high-quality, trusted data available to the right people and processes as quickly as possible. The following diagram shows what a DataOps workflow might look like: architects, engineers, and analysts collaborate on infrastructure and raw data profiling; analysts, engineers, and scientists collaborate on building analytics models (whether or not those models use AI); and architects work with business users to operationalize those models, govern the data, and deliver insights to the points where they're needed.

Individuals within each role are designated as data stewards for a particular subset of data. The point of the DataOps methodology is that data citizens in each of these different roles can rely on seeing data that is accurate, comprehensive, secure, and governed:

Figure 1.4 – DataOps workflow by roles

IBM has a rich portfolio of offerings (now available as services within Cloud Pak for Data) that address all the different requirements of DataOps, including data governance, automated data discovery, centralized data catalogs, ETL, governed data virtualization, data privacy/masking, master data management, and reference data management.
