FAQ

General

Q: What is the difference between Kubernetes and OpenShift?
Q: What is an "add-on" or what does "add-on" mean here?
A: An "add-on" represents functionality that you introduce into Cloud Pak for Data. It is a declaration that is typically represented as a tile in the catalog and carries descriptions, links to documentation, images, etc. An add-on is usually introduced via a ConfigMap.
Q: We would like to see a user-side walkthrough of “Deploy” for a service, like the Db2... or any other one that uses the Helm “Deploy” button. Is this possible?
A: A simple video example is available. See your contact for access.
Q: How can a partner get access to the Cloud Pak for Data software?
A: Access to Cloud Pak for Data and other IBM software is through the PartnerWorld program (https://ibm-10.gitbook.io/certified-for-cloud-pak-onboarding/prereqs/ibm-partnerworld). A short-term option is to sign up for a free trial of Cloud Pak for Data (https://www.ibm.com/account/reg/us-en/signup?formid=urx-42212). The trial gives the partner an API key they can use to access the services. Access to updates expires at the end of 60 days.
Q: What is Red Hat Certification?
Q: Where does the code for the add-on reside?
A: The add-on code resides in the partner's code repository. The client will have a link from CPD to the partner's code repository. If the partner is an OEM solution, then part of offering readiness is to make the code available via IBM Passport Advantage (PPA).
Q: Where do the add-on installation instructions reside?
A: The add-on installation instructions are hosted on the partner site. A link from CPD to the partner site is required.
Q: What is the level of verification/audit for compliance for a successful onboard to CPD?
A: The IBM design and content team will review the artifacts stored in the Playground application to verify completion and approval.
Q: How can the partner obtain Portworx?
A: There are multiple ways to obtain Portworx. CPD comes with a limited-use license for Portworx.
1. You can download the entitled version of Portworx from Passport Advantage. See "Installing the entitled Portworx instance" for additional detail.
2. You can also get Portworx Essentials, the free licensed version, here: https://docs.portworx.com/concepts/portworx-essentials/ Note that this won't work in fully air-gapped environments, but it is technically a "permanent" license as well.

Technical

Q: What is the base image required for Red Hat certification?
A: The base image must be (or must be based on) a supported Red Hat image, such as Red Hat Enterprise Linux or Red Hat Universal Base Image. Third-party or community-supported images such as Ubuntu, Debian, Alpine, CentOS, etc. are not supported by Red Hat and cannot be certified. More info here: https://redhat-connect.gitbook.io/partner-guide-for-red-hat-openshift-and-container/program-on-boarding/technical-prerequisites
Q: When building our own CASE for integration, would it be possible to get access to the CASE spec?
A: The CASE specification is available publicly at https://github.com/IBM/case
Q: Can I get access to the Cloudctl command line tool to manage Container Application Software for Enterprises (CASEs)?
Q: Can I get access to the Cloud Pak for Data installer to install the Cloud Pak for Data control plane and services on your Red Hat OpenShift cluster?
Q: Is there any restriction to running kubectl or oc commands inside pod images during a “Deploy”?
A: No specific restrictions. The main concern will be on oc ‘adm’ commands, for which a regular pod will not typically be granted the kind of privileges needed.
Q: Are there any restrictions running oc commands during install, inside of Helm Chart hooks?
A: None, except for oc adm commands. However, if that level of complexity is needed, it may be best to consider an Operator instead of a Helm chart.
Q: For the ConfigMap, can it be its own Helm chart, or does it need to be inside the same Helm chart as our offering?
A: Yes. It can be in one Helm chart.
Q: For security hardening related to inter-microservice HTTPS/TLS communication, is this for our service interacting with other services, or does this also apply to intra-pod communication within our own service?
A: TLS is a requirement for any inter-pod communications, even within the same namespace. The expectation is that there is some level of auth as well, perhaps protected by secrets. For example, you do not want arbitrary user code, say some scripts in a Jupyter notebook or a Job to accidentally or maliciously access your pods.
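One way to combine TLS with "some level of auth" is mutual TLS, where the server also requires a client certificate. The following is only a minimal sketch under that assumption; the answer above does not prescribe mutual TLS, and the file paths passed in (for example, files mounted from a Kubernetes secret) are hypothetical.

```python
import ssl


def build_server_context(certfile=None, keyfile=None, cafile=None):
    """Build a server-side TLS context that insists on client certificates.

    All three paths are hypothetical; in a pod they would typically point at
    files mounted from a secret. They are optional here so the sketch can be
    exercised without real certificates.
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Require the peer (another pod) to present a certificate we can verify.
    context.verify_mode = ssl.CERT_REQUIRED
    if certfile:
        context.load_cert_chain(certfile, keyfile)
    if cafile:
        context.load_verify_locations(cafile=cafile)
    return context


context = build_server_context()
print(context.verify_mode == ssl.CERT_REQUIRED)
```

A real service would pass all three paths and wrap its listening socket with this context.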
Q: Security Hardening. We start our image as root and run a few commands to install a few packages, then run our service as non-root. Is this okay?
A: No. Running anything as root is something that customers push back on if their scans find that the image manifest indicates USER 0 or USER root. The general expectation is for images to be immutable (i.e., all OS packages are pre-loaded). You can keep some packages and libraries in persistent volumes and include them appropriately in LD_LIBRARY_PATH, PYTHONPATH, CLASSPATH, etc.
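To catch this before a customer's scanner does, a partner could check their own Dockerfile for root USER instructions. The following is a rough sketch, not the check any particular scanner performs; the sample Dockerfile and the function name are illustrative assumptions.

```python
import re

# Hypothetical Dockerfile that switches to root to install a package.
SAMPLE = """FROM registry.access.redhat.com/ubi8/ubi-minimal
USER 0
RUN microdnf install -y tar
USER 1001
"""


def root_user_lines(dockerfile_text):
    """Return (line_number, line) pairs for USER instructions that select root."""
    findings = []
    for number, line in enumerate(dockerfile_text.splitlines(), start=1):
        match = re.match(r"^\s*USER\s+(\S+)", line, flags=re.IGNORECASE)
        if match is None:
            continue
        # USER may be written as user:group; only the user part matters here.
        user = match.group(1).split(":")[0]
        if user in ("root", "0"):
            findings.append((number, line.strip()))
    return findings


print(root_user_lines(SAMPLE))
```

This only inspects the Dockerfile text. It does not detect a root default inherited from the base image.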
Q: Are we expected to set the annotations via Helm chart or through our code (especially operator code, i.e., using operator to set up the annotations)?
A: The general guideline is to apply annotations in every YAML file (or template), including the Operator's own YAML. However, dynamic workloads, for example pods or jobs that get spun up by end-user interactions (say, via the Operator), may not use a typical Kubernetes YAML file in the first place. In that case, the Operator itself (or an equivalent) is responsible for the labels and annotations.
Q: What is the Values.addon.version field? For example, how do we decide on the correct value for the version? Is it an arbitrary value that we can choose, or does it have to be the same as the "version" field in Helm's Chart.yaml file?
A: Versioning is entirely up to you; ideally it follows the SemVer convention and corresponds to a "release" of your service for Cloud Pak for Data. It has nothing to do with the Helm chart version (if you are indeed using Helm templates) or with any image version tag.
Q: For the deployment documentation, we have documentation on our company's website. We will provide the URL in our offering and we will document additional steps needed for deployment on OpenShift clusters. Do we need to provide additional documentation on the Cloud Pak platform? For example, do users expect to see the documentation's content without leaving the IBM Cloud Pak platform/website?
A: Documentation can stay on the partner website. The page that the URL leads to should take into account that the customer is coming from, and is using, CPD.
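For dynamically created workloads, the component that builds the manifest can stamp the metadata itself. The sketch below shows one way to do that on a manifest held as a Python dictionary. The label and annotation keys are placeholders, not the keys Cloud Pak for Data requires, and the helper name is hypothetical.

```python
import copy


def with_metadata(manifest, labels, annotations):
    """Return a copy of a manifest dict with labels and annotations applied.

    The metadata is applied to the object itself and, when present, to the
    pod template under spec.template (as used by Deployments and Jobs).
    """
    result = copy.deepcopy(manifest)
    targets = [result]
    template = result.get("spec", {}).get("template")
    if isinstance(template, dict):
        targets.append(template)
    for target in targets:
        metadata = target.setdefault("metadata", {})
        metadata.setdefault("labels", {}).update(labels)
        metadata.setdefault("annotations", {}).update(annotations)
    return result


# Placeholder keys and values, for illustration only.
job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "example-job"},
    "spec": {"template": {"spec": {"containers": []}}},
}
stamped = with_metadata(
    job,
    labels={"example.com/addon": "myaddon"},
    annotations={"example.com/product": "My Add-on"},
)
print(stamped["spec"]["template"]["metadata"]["labels"])
```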
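If you do follow SemVer, a small check can keep release versions well-formed. This is a simplified sketch: the pattern accepts MAJOR.MINOR.PATCH with optional pre-release and build suffixes but does not implement the full SemVer grammar or pre-release precedence, and the function names are illustrative.

```python
import re

# Simplified SemVer pattern: numeric core without leading zeros, plus
# loosely matched optional "-prerelease" and "+build" suffixes.
SEMVER = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def parse_version(text):
    """Return (major, minor, patch) or raise ValueError for malformed input."""
    match = SEMVER.match(text)
    if match is None:
        raise ValueError(f"not a SemVer-style version: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_upgrade(old, new):
    """Compare numeric cores only; pre-release ordering is ignored here."""
    return parse_version(new) > parse_version(old)


print(parse_version("1.2.3"))
print(is_upgrade("1.9.0", "1.10.0"))
```

Comparing parsed tuples rather than strings matters: as plain strings, "1.10.0" sorts before "1.9.0".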
Q: In the configmap, we are to provide addon.id. What is this ID? It says it will be provided by IBM onboard team. So how do we apply for it?
A: Our platform core team employs an "add-ons.json" file that uses this name (the add-on key) when it installs placeholder tiles as part of the CPD install. You can choose the name. Use alphanumeric characters, and make sure the add-on key is the same in both places. Once it is finalized, do not change it, so that upgrades continue to work.
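The two rules in this answer (alphanumeric only, identical in both places) can be checked mechanically. The sketch below reads "alphanumeric" strictly as ASCII letters and digits, which is an assumption; the function and argument names are hypothetical.

```python
import re

# Strict reading of "alphanumeric": ASCII letters and digits only (assumption).
ADDON_KEY = re.compile(r"^[A-Za-z0-9]+$")


def check_addon_key(configmap_key, addons_json_key):
    """Return a list of problems; an empty list means the key looks consistent."""
    problems = []
    if ADDON_KEY.match(configmap_key) is None:
        problems.append(f"key {configmap_key!r} is not alphanumeric")
    if configmap_key != addons_json_key:
        problems.append("key differs between the configmap and add-ons.json")
    return problems


print(check_addon_key("myaddon", "myaddon"))
print(check_addon_key("my-addon", "myaddon"))
```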
Q: Hardware configurations and sizes? Does IBM control these or is OpenShift installed on top of customer hardware? Are there minimum requirements?
A: IBM provides recommendations based on the service and the expected workload. Customers can use their own hardware, use Managed OpenShift services like IBM ROKS, or use Cloud Pak for Data Systems. Minimum requirements depend on the services that are expected to be deployed. For example, the smallest possible production system for running the Cloud Pak for Data base capability plus the Watson Studio and Watson Machine Learning assemblies for up to 5 concurrent data scientists requires 3 worker nodes for HA with 16 vCPUs and 64 GB RAM each (that is, 48 vCPUs in total across the three worker nodes), plus 3 OpenShift master nodes. Activating more functions or increasing usage requires more capacity. For example, some customers with many data scientists who need large, fast Python ML environments run on larger nodes with 8 GPUs, 64 vCPUs, and 256 GB or more of RAM per node, and may have many of these nodes for larger data science groups. Generally, we strive to keep the hardware capacity requirements for low use reasonably low, and allow the customer to scale to larger workloads as needed by using larger nodes and/or adding more nodes to the OpenShift cluster.
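The totals in the smallest-system example can be worked out as follows. This is just the arithmetic from the answer above; the class name is illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePool:
    """A group of identically sized worker nodes."""
    nodes: int
    vcpus_per_node: int
    ram_gb_per_node: int

    @property
    def total_vcpus(self):
        return self.nodes * self.vcpus_per_node

    @property
    def total_ram_gb(self):
        return self.nodes * self.ram_gb_per_node


# Smallest production example from the answer: 3 workers, 16 vCPUs, 64 GB each.
smallest = NodePool(nodes=3, vcpus_per_node=16, ram_gb_per_node=64)
print(smallest.total_vcpus)   # 48
print(smallest.total_ram_gb)  # 192
```

The 3 OpenShift master nodes are not included in these worker totals.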
Q: Cluster sizes & resource footprint; 10 nodes, 10's of nodes, 100's of nodes? More?
A: It depends on the expected workload, HA requirements, etc. It also depends on the tenancy model, for example whether a customer uses a "shared" OpenShift cluster for different departments/tenants or runs development, test, and production in the same cluster.
Q: Virtualized disk setups; CEF running on hyper-converged hardware? SAN? Something else? Performance details of these will be very helpful.
A: Storage is abstracted via Kubernetes/OpenShift dynamic provisioners and CSI drivers. Customers decide what kind of devices or IaaS disks to use, or the choice may depend on what the Managed OpenShift service offers.
Q: Container registries; can we assume they exist? Can we assume external connectivity to any public registries?
A: Yes. There is usually a registry used within enterprises. You cannot assume external outbound access in typical data centers. We provide mechanisms for customers to pull/replicate images from IBM's registries via a "Bastion" node through to their air-gapped environments.
Q: Operating systems; RHEL7 vs RHCL vs something else?
A: The operating system is completely abstracted. In OpenShift Container Platform, from a security perspective, the expectation is that workloads are not granted access to the Kubernetes hosts. OpenShift v4 uses Red Hat Enterprise Linux CoreOS (RHCOS), which is considered an "immutable" Linux operating system specially optimized for containerized workloads.
Q: Intra-environment Ingress/Egress networking; I believe the OpenShift default is HAProxy based. Is it expected we use that? I suspect we’ll want to run our own proxies within the cluster. Any issues with that?
A: Depends on the technology and how it fits into OpenShift/Kubernetes. For example, in many environments, there is an external load balancer that integrates well with OpenShift routes/ingresses.
In Cloud Pak for Data, we internally use nginx as a reverse proxy, but it is behind an OpenShift route, which allows the customer to externalize the HTTPS endpoints in a different way. Inside the cluster, we simply rely on Kubernetes Service primitives for load balancing and communication between microservices. Network Policies and namespaces help isolate "tenants" or at least development vs. production.
Q: External Ingress/Egress networking. If/when there is connectivity to the outside world, what are the restrictions and expectations?
A: The actual technology depends on the IaaS or private cloud where OpenShift gets deployed. However, it’s quite rare to see any outbound access being permitted inside the secure data centers where Cloud Pak for Data is typically deployed.
Q: Can we configure Auth-n/Auth-z/Admissions control webhooks?
A: Technically yes, but it may depend on the security policies that the enterprise has. For example, for customers who don't permit our orchestration to set things up, we have provided manual steps, documentation, and justification for security teams to review and "permit" anything that is cluster scoped. This is especially true in "shared" OpenShift environments where tenancy is strictly controlled and no individual tenant is given the authority to create cluster-level artifacts.
Q: Is the expectation that our software will be isolated to specific nodes or co-located on nodes with other software offerings?
A: Customers generally do not like to use dedicated nodes, but that option (in an isolated namespace) is possible if necessary.
Q: We rely on Cilium (CNI) for network security. Can we configure the CNI where our pods are running? Using CNI chaining to allow for Cilium to do enforcement but no IPAM is probably sufficient.
A: This would depend on whether OpenShift supports this and perhaps on whether customers have policies that allow for alternate CNIs.
Q: Certificate management within the environment; what is usually used? We have an internal solution that is similar to (but pre-dates) SPIFFE/SPIRE.
A: There are choices, but this is the one that IBM Cloud Paks are standardizing around: https://cert-manager.io/docs/ (also a CNCF project).
Q: Where can I find the value for cloudpakInstanceId? What config map in the CPD namespace should we be looking at and what value in that config map specifically?
A: The value is found in the product-configmap in the CLOUD_PAK_INSTANCE_ID variable.
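As a sketch of how the value could be read programmatically, the function below extracts it from the JSON form of the config map, such as the output of oc get configmap product-configmap -o json run against the CPD namespace. The sample document and its instance ID are made up for illustration.

```python
import json


def cloud_pak_instance_id(configmap_json):
    """Extract CLOUD_PAK_INSTANCE_ID from a ConfigMap given as JSON text."""
    configmap = json.loads(configmap_json)
    # ConfigMap key/value pairs live under the top-level "data" field.
    return configmap["data"]["CLOUD_PAK_INSTANCE_ID"]


# Made-up sample standing in for real "oc get configmap ... -o json" output.
sample = json.dumps({
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "product-configmap"},
    "data": {"CLOUD_PAK_INSTANCE_ID": "example-instance-id"},
})
print(cloud_pak_instance_id(sample))
```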