Steven Astorino

Artificial Intelligence – Evolution and Revolution

I’ve just released the second printing of a book I co-authored titled “Artificial Intelligence – Evolution and Revolution”. Here’s Chapter 8 as an extract that you might find interesting; it covers the value of taking a hybrid multicloud platform approach to implementing Data and AI-related projects.

If you enjoy the read, a complimentary copy of the book is available to download at the end of this blog post.

“Delivering a Hybrid Multicloud Data and AI Platform”

So far, we have discussed the value, as well as many aspects, challenges, and capabilities of ML and AI. But what good is something of value if it remains out of reach for the vast majority or is limited to a select few highly skilled people in the data science community?

To that end, the industry needs to make the aforementioned technologies more accessible and consumable. Some vendors or companies publish APIs to a range of ML and AI services, but that alone still implies a level of technical ability that might be out of reach for many. APIs are just one small aspect of the overall data science experience. While some people may like to build a vehicle from a kit or individual components, the vast majority of the public prefers to buy a ready-to-drive vehicle that meets their long-term needs and takes them on their many journeys.

The Best Performers Are Data-Driven
While many organizations are struggling with the challenges of data complexity, some organizations are finding success as they embrace a modern data strategy. Data-savvy organizations are more likely to leverage data in a manner that informs decision-making and to strategically address unmet needs with new data-driven business models. When you provide organizationwide access to previously siloed data, configure governance policies, and address data-quality concerns, you are ready to make large strategic AI investments that can ultimately lead to outperforming revenue targets and thereby increase profitability.

The IBM Institute for Business Value (IBV) conducts regular surveys of organizations to identify market outperformers and looks for patterns that set them apart. The 20th edition of the C-Suite study was published in 2020 and draws input from over 13,000 respondents across multiple C-suite roles, industries, and countries. In this most recent edition of the study, companies are categorized based on their ability to create value from data and the degree to which they have integrated their data and business strategy. Identified as “torchbearers,” 9% of companies surveyed have shown the most leadership in this area. There are some striking numbers in this study about these “torchbearer” companies:

Source: IBM Institute for Business Value study of 13,000 C-suite leaders

The bottom line: You must outperform your competitors or risk being outperformed by them.

What IBM has learned from countless AI projects is that every step of the journey is critical. AI is not magic; it requires a thoughtful and well-architected approach. For example, the majority of AI failures are due to problems in data preparation and data organization, not the AI models themselves. Success with AI models depends on achieving success first with how you collect and organize data.

The AI Ladder, shown in Figure 8.1, represents a prescriptive approach to help customers overcome data challenges and accelerate their journey to AI, no matter where they are on their journey. It enables them to simplify and automate how an organization turns data into insights by unifying the collection, organization, and analysis of data, regardless of where it lives. By climbing the ladder to AI, enterprises can build a governed, efficient, agile, and future-proof approach to AI.

The AI Ladder has four steps (often referred to as “rungs”):

Figure 8.1: Four steps of the AI Ladder

  1. Collect: Make data simple and accessible
    Collect data of every type, regardless of where it lives, enabling flexibility in the face of ever-changing data sources. Note that “collect” does not mean put data all in one place. In fact, quite the opposite. It means virtualizing the data, allowing access to wherever it lives as if it were consolidated.
  2. Organize: Create a business-ready analytics foundation
    Organize collected data into a trusted, business-ready foundation with built-in governance, protection, and compliance.
  3. Analyze: Build and scale AI with trust and transparency
    Analyze data in automated ways and benefit from AI models that empower teams to gain new insights and make better, smarter decisions.
  4. Infuse: Operationalize AI throughout the business
    Infuse AI throughout the business (across multiple departments and within various processes), drawing on predictions, automation, and optimization.

These steps can be further broken down into a set of key capabilities, shown in Figure 8.2.

Figure 8.2: AI Ladder capabilities

Supporting the AI Ladder is the concept of modernization, which is how customers can simplify and automate how they turn data into insights by unifying the collection, organization, and analysis of data, regardless of where it lives, within a secure hybrid cloud platform.

The following priorities are built into the IBM technologies that support this AI ladder:

Reducing Complexity with a Data Fabric
Enterprises face all sorts of complexities in implementing their use cases with current approaches. Examples include providing a 360-degree view of the data, Master Data Management, regulatory compliance, operational analytics, business intelligence, and data science, to name a few.

As many infrastructures grow, enterprises can often face higher compliance, security, and governance risks. This can result in complexity and a high level of effort to enforce policies and perform stewardship. Complex infrastructures can lead to higher costs of integrating data and stitching data pipelines across multiple platforms and tools. In turn, these can bring more reliance on IT, making collaboration more challenging and possibly slowing time to value, whereas business-led self-service analytics, insights, and democratization of data could help deliver greater business agility.

What’s needed is a new design or approach that provides an abstraction layer to share and use data, with data and AI governance, across a hybrid cloud landscape, without a massive pendulum swing to having everything decentralized. It’s a balance between what needs to be logically or physically decentralized and what needs to be centralized. For example, an enterprise can have multiple catalogs, but there can be only one source of truth for the global catalog.

A data fabric is a data management architecture that helps optimize access to distributed data and intelligently curate and orchestrate it for self-service delivery to data consumers. Some of a data fabric’s key capabilities are listed below:

The data fabric approach should enable organizations to better manage, govern, and use data to balance agility, speed, SLAs, and trust. Trust covers deep enforcement of governance, security, and compliance. There is also the total cost of ownership and performance (TCO/P). This covers integration costs, egress costs, bandwidth costs, processing costs vs. performance, etc. A data fabric could offer these benefits by orders of magnitude over the complexities often seen across many enterprise infrastructures.

IBM Cloud Pak for Data: A Hybrid Cloud Data and AI Platform
The IBM Cloud Pak for Data embodies everything you have just read about in a unified Enterprise Insight Platform (EIP) that runs on multiple vendors’ clouds and infrastructures. EIP is a term used by industry analysts and consultants as a category for describing integrated sets of data management, analytics, and development tools.

The first core tenet of Cloud Pak for Data is that you can run it anywhere. You can co-locate it where you are making your infrastructure investments. This means you can deploy Cloud Pak for Data on many major cloud vendors’ platforms, as well as the IBM Cloud. You can also deploy it on premises if you are developing a hybrid cloud approach. Finally, on IBM Cloud, you can subscribe to Cloud Pak for Data-as-a-Service if you need a fully managed option where you only pay for what you use. In short, Cloud Pak for Data gives organizations the deployment flexibility to run anywhere.

Figure 8.3: Cloud Pak for Data

Cloud Pak for Data is built on the foundation of Red Hat OpenShift. This provides the flexibility for customers to scale across any infrastructure using the leading open-source steward: Red Hat. Red Hat OpenShift is a Kubernetes-based platform that allows IBM to deploy software through a container-based model, delivering greater agility, control, and portability.

IBM’s Cloud Pak offerings all share a common control plane, which makes administration and integration of diverse services easy.

Cloud Pak for Data includes a set of preintegrated data services that allow you to collect information from any repository, such as databases, data lakes, data warehouses, etc. The design point here is for customers to leave the data in all the places where it already resides, but to its users it seems like the enterprise data is in one spot.
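To make the “leave the data where it resides” idea concrete, here is a minimal sketch in plain Python. It is not the Cloud Pak for Data API; the names `Source`, `VirtualTable`, and the sample rows are hypothetical and chosen only for illustration. The point is that a query is sent to each source on demand and the results appear to come from one place, without first copying everything into a central store.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]


@dataclass
class Source:
    """A hypothetical data source that is read in place, never copied."""
    name: str
    fetch: Callable[[], Iterable[Row]]


class VirtualTable:
    """Presents several sources as if they were one table (illustrative only)."""

    def __init__(self, sources: List[Source]) -> None:
        self.sources = sources

    def query(self, predicate: Callable[[Row], bool]) -> Iterator[Row]:
        # Rows are pulled from each source at query time and tagged with
        # their origin; nothing is consolidated ahead of the query.
        for source in self.sources:
            for row in source.fetch():
                if predicate(row):
                    yield {**row, "_source": source.name}


# Made-up sample data standing in for a warehouse and a data lake.
warehouse_rows = [
    {"customer": "Acme", "revenue": 120},
    {"customer": "Globex", "revenue": 80},
]
lake_rows = [{"customer": "Initech", "revenue": 200}]

customers = VirtualTable([
    Source("warehouse", lambda: warehouse_rows),
    Source("data_lake", lambda: lake_rows),
])

big_accounts = list(customers.query(lambda r: r["revenue"] >= 100))
```

A real virtualization layer would also push filters down to each source and handle credentials, schemas, and caching; this sketch only shows the access pattern.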

Once all of an enterprise’s data has been connected, industry-leading data organization services can be deployed that allow for the development of an enterprise data catalog. This capability enables a “shop for data” type of experience and enforces governance across all data sources, thereby enabling data consumers to have a single place to go for all their data needs.
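The “shop for data” experience with built-in governance can be sketched the same way. The example below is an assumption-laden illustration, not how any IBM catalog service is implemented: `Catalog`, `CatalogEntry`, the `steward` role, and the masking rule are all hypothetical. It shows one place to search for data sets, with a policy applied on every read so that sensitive columns are masked for anyone who is not a data steward.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class CatalogEntry:
    """A hypothetical catalog record: a data set plus its governance metadata."""
    name: str
    rows: List[Dict[str, Any]]
    sensitive_columns: Set[str] = field(default_factory=set)


class Catalog:
    """A single place to find data, enforcing a masking policy on read."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> List[str]:
        # Case-insensitive match on data set names.
        return sorted(n for n in self._entries if keyword.lower() in n.lower())

    def read(self, name: str, role: str) -> List[Dict[str, Any]]:
        entry = self._entries[name]
        if role == "steward":  # assumed role allowed to see raw values
            return [dict(r) for r in entry.rows]
        # Everyone else gets sensitive columns masked.
        return [
            {k: ("***" if k in entry.sensitive_columns else v) for k, v in r.items()}
            for r in entry.rows
        ]


catalog = Catalog()
catalog.register(CatalogEntry(
    name="customer_accounts",
    rows=[{"customer": "Acme", "email": "ops@acme.example"}],
    sensitive_columns={"email"},
))

found = catalog.search("customer")
analyst_view = catalog.read("customer_accounts", role="analyst")
steward_view = catalog.read("customer_accounts", role="steward")
```

Keeping the policy inside `read` means every consumer goes through the same rule, which is the property the catalog is meant to guarantee across all data sources.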

With your enterprise data connected and cataloged, Cloud Pak for Data presents a wide variety of data analysis tools out of the box. For example, there is a wealth of data science capabilities that cater to all skill levels (meaning no-code, low-code, and all code). Users can quickly grab data from the catalog and instantly start working toward generating insights in a common workflow built around the “project” concept.

For additional capabilities, a large set of extended services is available for Cloud Pak for Data that presents more-specialized data management and analytics capabilities. These range from powerful IBM solutions, like Planning Analytics with Watson, to solutions from IBM Partners that offer business ontology creation, open-source databases, and more.

Automation: The Key to Agility
Cloud Pak for Data takes automation to the next level. Watson Query capabilities allow you to leave your data where it resides and connect to all structured or unstructured data sources in your enterprise without data movement. Building on that data collection, AutoCatalog and AutoPrivacy supercharge data discovery and ensure enforcement of governance policies across many sources and users. On top of this, AutoAI makes it easy for data analysts and data scientists to generate new models in a fast, low-code manner with an award-winning graphical interface and design. Figure 8.4 summarizes this.

Figure 8.4: Automation capabilities within Cloud Pak for Data

Let’s dive a little deeper into these automation capabilities.

Your Data and AI: How and Where You Want It
IBM’s open information architecture for AI is built upon Cloud Pak for Data on Red Hat OpenShift, built for a hybrid cloud world. What does this mean? In one word: flexibility. To further explain, consider the following:

Regardless of the form factor and the degree of management control needed, Cloud Pak for Data provides cloud-native data management services that modernize how businesses collect, organize, and analyze data and then infuse AI throughout their organizations.

In summary, Cloud Pak for Data is designed to provide a unified, integrated user experience to collect, organize, and analyze data and infuse AI throughout the enterprise. Many of the complexities of managing and orchestrating data and other artifacts can be abstracted through the data fabric architectural approach. Think of the data fabric as the “magic” that can help make more of an organization’s data, applications, and services ready for AI by automating and augmenting a lot of the steps that would otherwise have to be undertaken by large groups of architects, administrators, and data scientists.

Having read this extract, if you are interested in reading the whole book, you can download it at no cost here.
