Ramon Perez


Running CNF Cert Suite certification with dci-openshift-app-agent

The dci-openshift-app-agent supports the execution of multiple test suites to validate containers, virtual functions, Helm charts, and operators. These suites are built as Ansible roles, and they help partners either prepare for the Red Hat certifications or run the actual certification process on the workloads deployed via DCI.

One of the test suites included in dci-openshift-app-agent is the CNF Cert Suite (old and new repo); the integration simplifies the use of this set of testing tools. Thanks to it, the certification tools can be run on a daily basis with the automation capabilities provided by DCI, validating that the tested workloads are ready for certification.

This blog post is aimed at people getting familiar with running the CNF Cert Suite through dci-openshift-app-agent to automate the whole process. We are going to focus mainly on three areas:

  1. The code structure of dci-openshift-app-agent with regard to the CNF Cert Suite integration, focusing on the cnf-cert role.
  2. A practical example already defined in dci-openshift-app-agent, called tnf_test_example, which shows how to define a workload, based on containers and operators, that is deployed with DCI on a running OpenShift cluster and then tested by the CNF Cert Suite.
  3. The configuration needed to deploy tnf_test_example and have it tested by the CNF Cert Suite, all via dci-openshift-app-agent.

The target audience for this blog post is people who already have some experience with the CNF Cert Suite and dci-openshift-app-agent. For a more general overview, please see the following presentation. Also, please refer to the documentation of tools like dci-openshift-app-agent, CNF Cert Suite, etc. for more specific details about them.

Code structure: the cnf-cert role

This Ansible role, included in dci-openshift-app-agent, encapsulates the logic for the CNF Cert Suite, based on the following assumptions regarding the certification suite:

  • The configuration file used by the suite is reduced to the minimum, mainly using the auto-discovery capabilities to detect the resources under test.
  • The suite is executed with a pre-built container, running the tests on DCI.

Tasks executed

After deploying the workloads to be tested by CNF Cert Suite, in the DCI Red Hat tests phase, the main cnf-cert role is executed, following these steps sequentially:

  • Save the images related to the CNF Cert Suite execution in the provided local registry when running in a disconnected environment.
  • Create a temporary directory in which to clone the test-network-function (TNF) repo.
  • Clone the correct TNF version, depending on whether we are testing a stable branch or a pull request from the CNF Cert Suite repository:
    • If testing a pull request, the container image is built, based on the code included in the pull request. A customized DCI component is also created based on the latest commit SHA hash in the pull request.
    • If testing a stable branch, download the container image from Quay.
  • Generate the configuration file, based on a template, which takes care of filling the following fields:
    • targetNameSpaces, including the namespaces in which the certification suite has to look for auto-discovery labels.
    • targetPodLabels, defining the auto-discovery labels to be checked by the suite.
    • acceptedKernelTaints, including the tainted modules.
  • Run the CNF Cert Suite with the correct parameters. Settings such as the location of the partner repository, the log level, and the type of tests (intrusive or safe) can be tuned.
  • Copy the log files generated during the execution to a log folder, to be uploaded to DCI afterwards. Four main files are gathered after the execution:
    • The created configuration file.
    • The claim.json file generated by the CNF Cert Suite.
    • The XML file containing the test results in JUnit format.
    • A file called execution.log, containing the standard output and standard error from the execution of the certification suite.
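The configuration step above can be pictured with a small Python sketch. It is not the template used by the cnf-cert role; it only illustrates how the tnf_config and accepted_kernel_taints variables could be mapped to the three fields named in the list. The function name build_suite_config, the nested "name"/"value" keys, and the module name example_module are assumptions made for this illustration.

```python
import json


def build_suite_config(tnf_config, accepted_kernel_taints):
    """Map DCI variables to the three fields filled in the config file.

    The nested key names ("name", "value") are assumptions for this
    sketch; check the template shipped with the role for the real layout.
    """
    namespaces = []
    pod_labels = []
    for entry in tnf_config:
        namespaces.append({"name": entry["namespace"]})
        for label in entry.get("targetpodlabels", []):
            # Labels are written as "key=value" in the DCI settings.
            key, _, value = label.partition("=")
            item = {"name": key, "value": value}
            if item not in pod_labels:  # avoid duplicates across namespaces
                pod_labels.append(item)
    return {
        "targetNameSpaces": namespaces,
        "targetPodLabels": pod_labels,
        "acceptedKernelTaints": list(accepted_kernel_taints),
    }


example = build_suite_config(
    [
        {"namespace": "test-cnf", "targetpodlabels": ["environment=test"]},
        {"namespace": "production-cnf", "targetpodlabels": ["environment=production"]},
    ],
    [{"module": "example_module"}],  # hypothetical module name
)
print(json.dumps(example, indent=2))
```

Keeping the mapping in one function makes it easy to see that every namespace listed in tnf_config ends up in targetNameSpaces, while the labels are merged into a single list.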

Then, after finishing the tests, in the DCI post-run phase, the environment is cleaned in the following way:

  • Clean CNF Cert Suite resources if desired (e.g. default namespaces, daemon sets, etc. created during the execution).
  • Delete the temporary directory.

Variables to have in mind

The tasks executed on the cnf-cert role rely on variables that allow DCI users to provide the configuration needed by dci-openshift-app-agent to run the CNF Cert Suite properly.

The configuration does not include the deployment of the workloads (containers, operators, etc.); those steps are done in the dci-openshift-app-agent hooks. The configuration for the CNF Cert Suite then acts on top of the workloads deployed in the hooks.

The main variables to keep in mind are listed below. Their default values are these for the generic variables, and these for the variables specific to the certification suite:

  • Generic:
    • do_cnf_cert: boolean variable that enables or disables the execution of the CNF Cert Suite.
    • dci_disconnected: boolean variable that indicates whether we are in a disconnected environment.
    • provisionhost_registry: registry to be used on disconnected environments.
    • partner_creds: file including partner credentials to access private registries.
    • sync_cnf_cert_and_preflight: boolean variable that enables or disables the gathering of the data related to the operators tested by the CNF Cert Suite, as well as the validation of operators and container images via a role called preflight. More information regarding this functionality is available in its specific role.
  • Specific:
    • test_network_function_version: CNF Cert Suite version to use. It can point to a specific release version or to the latest code, referenced as HEAD. The HEAD version (in the main branch) can be used, but it is not guaranteed to work.
    • tnf_suites: list of test suites executed by the CNF Cert Suite, separated by spaces.
    • tnf_config: complex variable used to fill the CNF Cert Suite configuration file. It allows testing multiple resources in different namespaces, and consists of a list of elements composed of:
      • namespace: namespace in which we want to autodiscover workloads.
      • targetpodlabels: list of auto-discovery labels to be considered by the CNF Cert Suite for pod testing.
      • operators_regexp [1] (optional): a regular expression (regex) to select the operators to be tested by the CNF Cert Suite.
      • exclude_connectivity_regexp [1] (optional): a regex to exclude containers from the connectivity test.
    • accepted_kernel_taints: allow-list of tainted modules. It must be a list of elements of the form module: "<module_name>".
    • tnf_non_intrusive_only: skip intrusive tests which may disrupt cluster operations.
    • tnf_run_cfd_test: run the test suites from openshift-kni/cnf-feature-deploy prior to the actual CNF certification test execution. The results are incorporated in the same claim.
    • tnf_log_level: log level in the CNF Cert Suite.
    • tnf_postrun_delete_resources: boolean variable that controls whether the resources are deleted or kept after the CNF Cert Suite execution. Keeping them is useful for debugging purposes.

    [1] The logic behind these settings has to be implemented in the hooks. See the examples in the following section.
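Since several of these variables have a nested structure, it can help to sanity-check them before launching a job. The following Python sketch is not part of dci-openshift-app-agent; the function name check_cnf_cert_settings and the exact rules it applies are assumptions derived from the descriptions above.

```python
def check_cnf_cert_settings(settings):
    """Return a list of problems found in the cnf-cert related variables.

    The rules below are illustrative assumptions, not checks performed
    by the agent itself.
    """
    problems = []
    # tnf_suites is a single string with suite names separated by spaces.
    if not str(settings.get("tnf_suites") or "").split():
        problems.append("tnf_suites does not name any test suite")
    for index, entry in enumerate(settings.get("tnf_config") or []):
        if not entry.get("namespace"):
            problems.append(f"tnf_config[{index}]: namespace is missing")
        if not isinstance(entry.get("targetpodlabels"), list):
            problems.append(f"tnf_config[{index}]: targetpodlabels must be a list")
    for index, taint in enumerate(settings.get("accepted_kernel_taints") or []):
        # Each element is expected to look like {"module": "<module_name>"}.
        if not (isinstance(taint, dict) and list(taint) == ["module"]):
            problems.append(f"accepted_kernel_taints[{index}]: expected a single 'module' key")
    return problems


settings = {
    "tnf_suites": "access-control networking lifecycle",
    "tnf_config": [
        {"namespace": "test-cnf", "targetpodlabels": ["environment=test"]},
    ],
}
print(check_cnf_cert_settings(settings))
```

In practice the dictionary would come from loading settings.yml with a YAML parser; it is written inline here to keep the sketch free of third-party dependencies.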

Example: the tnf_test_example use case

Before executing the CNF (Cloud-native network function) Cert Suite, you need to deploy the workloads and label the pods and operators under test with the auto-discovery labels required by the CNF Cert Suite. This can be done manually or programmatically; tnf_test_example shows one way to do it.

This example deploys a couple of pods in two different namespaces, to be used with the CNF Cert Suite in a multi-namespace scenario.

The Deployment specification of these pods, obtained from this repository, is suitable for passing all the test suites of the CNF Cert Suite.

It also deploys an operator in one of the namespaces, based on simple-demo-operator-bundle, in order to run the CNF Cert Suite and Preflight tests against this operator.

Hooks implemented

Here are the steps on each hook for this example:

  • pre-run:
    • Install required RPM packages.
    • Prepare the operator for disconnected environments.
  • install:
    • Create the namespaces and deploy (based on this template) the test pods in each namespace. At this point, the skip_connectivity_tests label can be created when the exclude_connectivity_regexp variable is defined in tnf_config.
    • Deploy simple-demo-operator in one of the two namespaces under test.
    • Tag the simple-demo-operator CSV with the auto-discovery label when operators_regexp is defined in tnf_config.
  • teardown:
    • Delete the namespaces under test if not done before (so that the workloads are removed automatically).
    • Delete the resources related to simple-demo-operator.
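The selection logic behind operators_regexp and exclude_connectivity_regexp lives in the hooks, which are written as Ansible tasks. The Python sketch below only mirrors the idea: pick the resources whose name matches the regex, and treat an empty value as "not defined". The helper name select_by_regexp and the resource names are made up for the illustration.

```python
import re


def select_by_regexp(names, pattern):
    """Return the names matching pattern.

    An empty or missing pattern selects nothing, which matches the
    "only when the variable is defined" behaviour described above.
    """
    if not pattern:
        return []
    regexp = re.compile(pattern)
    return [name for name in names if regexp.search(name)]


# Hypothetical CSV names found in a namespace under test.
csvs = ["simple-demo-operator.v0.0.5", "another-operator.v1.2.0"]

# These CSVs would receive the auto-discovery label.
print(select_by_regexp(csvs, "simple-demo-operator"))

# An empty regex means no resource is selected.
print(select_by_regexp(csvs, ""))
```

The explicit check for an empty pattern matters: an empty regular expression matches every string, so without it an unset variable would select all resources instead of none.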

Variables to have in mind

To deploy this example, the following variables must be defined:

  • dci_config_dir: it must point to "/var/lib/dci-openshift-app-agent/samples/tnf_test_example", the location where this example is defined. This variable makes dci-openshift-app-agent include the hooks defined there in its execution.
  • dci_openshift_app_image: it references the image to be used by the workloads. In this case, it must point to "quay.io/testnetworkfunction/cnf-test-partner:latest".
  • dci_openshift_app_ns: base namespace to deploy workloads. It must be set to "test-cnf".
  • tnf_config: it defines two elements, so that the workload is deployed in two different namespaces. One of them references simple-demo-operator. The full definition of this variable is provided in the DCI job example.
  • tnf_operator_to_install: references the information related to simple-demo-operator, including:
    • operator_name: it would be "simple-demo-operator".
    • operator_version: referencing the correct version of the operator. The tests use "v0.0.5".
    • operator_bundle: including the bundle for the operator. In our case, it would be "quay.io/telcoci/simple-demo-operator-bundle@sha256:8a4b6e4a430a520b438d91c5ecb815de3c49b204488dafd910d2a450ede1692a".

Example of DCI job running tnf_test_example with CNF Cert Suite

In order to execute an example of a DCI job, managed by dci-openshift-app-agent, making use of the tnf_test_example and running CNF Cert Suite, just follow these steps:

  1. Confirm you have a cluster up and running:

    $ export KUBECONFIG=/var/lib/dci-openshift-app-agent/kubeconfig
    $ oc version
    Client Version: 4.11.0-0.nightly-2022-04-24-135651
    Kustomize Version: v4.5.4
    Server Version: 4.11.0-0.nightly-2022-04-24-135651
    Kubernetes Version: v1.23.3+d464c70

    $ oc get nodes
    NAME       STATUS   ROLES           AGE   VERSION
    master-0   Ready    master,worker   12h   v1.23.3+54654d2
    master-1   Ready    master,worker   12h   v1.23.3+54654d2
    master-2   Ready    master,worker   12h   v1.23.3+54654d2

  2. Create a settings.yml file and place it in /etc/dci-openshift-app-agent/settings.yml, with the following content:

    $ cat /etc/dci-openshift-app-agent/settings.yml
    ---
    # dci-openshift-app-agent settings
    # defaults from /usr/share/dci-openshift-app-agent/group_vars/all
    # Remove "debug" when your jobs are working to get them in the
    # statistics:
    dci_tags: ["debug", "blog-post", "dci-openshift-app-agent", "tnf_v3.3.3"]
    dci_config_dir: "/var/lib/dci-openshift-app-agent/samples/tnf_test_example"
    dci_openshift_app_image: quay.io/testnetworkfunction/cnf-test-partner:latest
    dci_openshift_app_ns: "test-cnf"
    do_cnf_cert: true
    test_network_function_version: "v3.3.3"
    tnf_suites: "access-control networking lifecycle observability platform-alteration operator"
    tnf_config:
      - namespace: "test-cnf"
        targetpodlabels: [environment=test]
        operators_regexp: "simple-demo-operator"
        exclude_connectivity_regexp: ""
      - namespace: "production-cnf"
        targetpodlabels: [environment=production]
        operators_regexp: ""
        exclude_connectivity_regexp: ""
    tnf_operator_to_install:
      operator_name: simple-demo-operator
      operator_version: "v0.0.5"
      operator_bundle: "quay.io/telcoci/simple-demo-operator-bundle@sha256:8a4b6e4a430a520b438d91c5ecb815de3c49b204488dafd910d2a450ede1692a"
    ...
    

  3. Run dci-openshift-app-agent:

    $ dci-openshift-app-agent-ctl -s -- -v
    

  4. Check the status of the DCI job until it finishes.

  5. Check the results.
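The cluster check from step 1 can also be scripted. The sketch below assumes the oc client is available and reads the JSON produced by oc get nodes -o json; the helper names fetch_nodes and not_ready_nodes are invented for this example, and the sample data is made up.

```python
import json
import subprocess


def fetch_nodes():
    """Return the parsed output of "oc get nodes -o json" (needs a cluster)."""
    result = subprocess.run(
        ["oc", "get", "nodes", "-o", "json"],
        capture_output=True, check=True, text=True,
    )
    return json.loads(result.stdout)


def not_ready_nodes(node_list):
    """Return the names of the nodes whose Ready condition is not "True"."""
    pending = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        if not ready:
            pending.append(node["metadata"]["name"])
    return pending


# Made-up sample with the same shape as the real JSON output.
sample = {
    "items": [
        {"metadata": {"name": "master-0"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "master-1"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]
}
print(not_ready_nodes(sample))
```

Against a real cluster, not_ready_nodes(fetch_nodes()) should return an empty list before the job is launched.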

Finally, you should have a DCI job like this one, which was run in a connected environment. There, you can observe the results obtained. Mainly, pay attention to the following:

  • In the Tests section, you will see the results of the CNF Cert Suite execution, in JUnit format, clearly showing which tests have passed, failed, or been skipped.
  • In the Files section, you can see the logs generated during the execution, including the execution.log and claim.json files, useful for troubleshooting purposes.
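If you prefer to inspect the JUnit file offline, a few lines of Python are enough to get the same passed/failed/skipped overview. This is a generic sketch for JUnit-style XML, not a tool shipped with DCI; the function name summarize_junit and the sample document are made up for the illustration.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Count passed, failed and skipped test cases in a JUnit XML document."""
    root = ET.fromstring(xml_text)
    summary = {"passed": 0, "failed": 0, "skipped": 0}
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            summary["failed"] += 1
        elif case.find("skipped") is not None:
            summary["skipped"] += 1
        else:
            summary["passed"] += 1
    return summary


# Made-up sample; a real file would be downloaded from the Files section.
SAMPLE = """<testsuite name="example" tests="3">
  <testcase name="test-a"/>
  <testcase name="test-b"><failure message="boom"/></testcase>
  <testcase name="test-c"><skipped/></testcase>
</testsuite>"""

print(summarize_junit(SAMPLE))  # {'passed': 1, 'failed': 1, 'skipped': 1}
```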

Conclusions

This blog post has summarized the details to keep in mind when automating the CNF Cert Suite through the dci-openshift-app-agent on top of an OpenShift cluster.

To that end, we described the cnf-cert role in full, with the help of a workload composed of a deployment created in two different namespaces and an operator running in one of them.

Finally, the post closed with an example of a DCI job that runs the certification against the workload, showing the main aspects to consider when checking the logs and the job status.