########################################################
Thursday B: Adding QC filter and QC filter test to JEDI
########################################################

Introduction
----------------

This practical is on the JEDI Unified Forward Operator (UFO) code and quality control (QC) filters and tests. It has three main parts. First, you will create a feature branch in the ufo repository. Second, you will add a new simple QC filter to UFO. Third, you will add a test for your new QC filter.

Filters are an essential component in a data assimilation workflow. Filters can change quality control flags (i.e., to reject or retain observations) and observation error variances (e.g., one might wish to increase observation error variances to decrease the observation weight in the analysis instead of rejecting observations altogether). In JEDI, filters are customizable and generic. This means that you can use the same code (written in C++) to accomplish different tasks (specified by you in a YAML file).

This activity assumes that you have completed the previous activities and still have access to a JupyterLab or SSH session.

Step 1: Access your AWS instance and enter the Singularity container
----------------------------------------------------------------------

Connect to your assigned compute node. You will use the same method as yesterday. You already have the Singularity container that contains the JEDI dependencies. Enter the container using:

.. code:: bash

   cd ~/
   singularity shell -e jedi-gnu-openmpi-dev_latest.sif

Once in the container, be sure to also remove the limits on stack and virtual memory to prevent spurious failures, as noted before:

.. code-block:: bash

   ulimit -s unlimited
   ulimit -v unlimited

Step 2: Make a new feature branch in the UFO repository
-------------------------------------------------------

You invoked ecbuild in Monday's Getting Started activity. Ecbuild cloned the stable branches of several repositories.
However, in this tutorial we want to make modifications to the UFO code. In JEDI, we aim to follow the "git flow" paradigm when developing, and we will discuss this in depth in a later lecture on Friday. In summary, the ``develop`` branch contains the development version of each repository. This version of the code should always build and test successfully. Whenever you want to add a new feature to the code, you should do your work in another branch of the repository. Once the work is done, you can issue a "Pull Request" to have other JEDI users review your code and merge your changes into the ``develop`` branch.

With every release of the JEDI code, we create a snapshot of the JEDI repositories by copying the development branch to a git "tag" (an immutable snapshot). The meanings of "master or main", "develop", and "tags" will be discussed in the git-flow lecture and later practical exercise.

Because of the ordering of the lectures, and because we want stable, reproducible academy exercises, we have a special copy of the UFO repository in the jedi-da-academy organization on GitHub. In this repository, the 1.1.0 tag and the ``develop`` branch are identical. This is **not** the case in actual development. Ordinarily, the top-level ``CMakeLists.txt`` file would not reference ``TAG 1.1.0`` when describing each package and instead would reference ``BRANCH develop``.

Open the top-level ``CMakeLists.txt`` file in the source code (``~/jedi/fv3-bundle/CMakeLists.txt``). Change line 49 from:

.. code:: bash

   ecbuild_bundle( PROJECT ufo GIT "https://github.com/jedi-da-academy/ufo.git" TAG 1.1.0 )

to:

.. code:: bash

   ecbuild_bundle( PROJECT ufo GIT "https://github.com/jedi-da-academy/ufo.git" BRANCH feature/new_qc_test_ )

Then, enter the source code's ``ufo`` subdirectory (``cd ~/jedi/fv3-bundle/ufo``).

NOTE: There is also a ``ufo`` directory in your current directory at ``~/jedi/build/ufo``. This is not the directory that you want.

Check out the ``develop`` branch and then create a new branch as follows:

.. code:: bash

   git checkout develop
   git checkout -b feature/new_qc_test_

The ``-b`` option to ``git checkout`` creates the new branch, effectively as a copy of the ``develop`` branch, and switches to it.

Don't forget to set ``LOCAL_PATH_JEDI_TESTFILES``; otherwise the test data will not be linked correctly. This step is only needed for the practical sessions. In other cases, cmake will download and link the correct version of the test data.

.. code:: bash

   export LOCAL_PATH_JEDI_TESTFILES=$HOME/jedi/test-data-release

Step 3: Add a new filter
---------------------------------------------------------------------------

We are going to re-implement a **simplified** version of the Bounds Check filter. This filter checks that observation data are within certain user-specified bounds. You can refer to the JEDI documentation for more details about `creating a new filter`_.

Step 3a: The backend logic
~~~~~~~~~~~~~~~~~~~~~~~~~~

Navigate into the ``~/jedi/fv3-bundle/ufo/src/ufo/filters`` directory. Copy the ``DifferenceCheck.cc`` and ``DifferenceCheck.h`` files to ``PracticalBoundsCheck.cc`` and ``PracticalBoundsCheck.h``, respectively. Open these files in your editor of choice.

In ``PracticalBoundsCheck.h``:

- Rename all references of ``DifferenceCheck`` to ``PracticalBoundsCheck``. Search for all possible capitalizations. Don't forget the capitalized text on lines 8, 9, and 87!
- Change the line ``int qcFlag() const override {return QCflags::diffref;}`` to return a different flag: ``QCflags::bounds``. This QC flag is conveniently already defined in ``ufo/filters/QCflags.h``.
- Remove lines with ``ref`` and ``val`` parameters. We do not use them in this filter.

In ``PracticalBoundsCheck.cc``:

- Rename all references of ``DifferenceCheck`` to ``PracticalBoundsCheck``.
- In ``PracticalBoundsCheck::applyFilter(...)``, replace the function body with something like this:

  .. code:: cpp

     ufo::Variables testvars;
     testvars += ufo::Variables(filtervars, "ObsValue");

     // Retrieve the bounds. A bound that is not set in the YAML is
     // represented by the missing-value indicator.
     const float missing = util::missingValue(missing);
     const float vmin = parameters_.minvalue.value().value_or(missing);
     const float vmax = parameters_.maxvalue.value().value_or(missing);

     // Sanity checks
     if (filtervars.nvars() == 0) {
       oops::Log::error() << "No variables will be filtered out in filter "
                          << config_ << std::endl;
       ABORT("No variables specified to be filtered out in filter");
     }

     // Loop over all variables to filter
     for (size_t jv = 0; jv < testvars.nvars(); ++jv) {
       // get test data for this variable
       std::vector<float> testdata;
       data_.get(testvars.variable(jv), testdata);
       // apply the filter
       for (size_t jobs = 0; jobs < obsdb_.nlocs(); ++jobs) {
         if (apply[jobs]) {
           ASSERT(testdata[jobs] != missing);
           if (vmin != missing && testdata[jobs] < vmin) flagged[jv][jobs] = true;
           if (vmax != missing && testdata[jobs] > vmax) flagged[jv][jobs] = true;
         }
       }
     }

- Remove lines with ``ref`` and ``val`` parameters. We do not use them in this filter.
- Feel free to customize the function further.

Step 3b: Add your new filter to the build system
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Edit ``~/jedi/fv3-bundle/ufo/src/ufo/filters/CMakeLists.txt`` and add ``PracticalBoundsCheck.cc`` and ``PracticalBoundsCheck.h`` to ``filters_files``.
- UFO needs to be told that another filter is available. The list of known filters is located in ``~/jedi/fv3-bundle/ufo/src/ufo/instantiateObsFilterFactory.h``. To add the new filter, first add ``#include "ufo/filters/PracticalBoundsCheck.h"`` to the top of ``instantiateObsFilterFactory.h``. At the end of ``instantiateObsFilterFactory.h``, follow the pattern of the existing entries (keep the template arguments consistent with the neighboring lines) and add:

  .. code:: cpp

     static oops::FilterMaker<OBS, oops::ObsFilter<OBS, ufo::PracticalBoundsCheck> >
              practicalBoundsCheckMaker("Practical Bounds Check");

- The filter is added!
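
The flagging rule used in ``applyFilter`` can be tried out in isolation. The following is a minimal stand-alone sketch, not UFO code: the function name ``flagOutOfBounds`` and the sentinel ``kMissing`` are hypothetical stand-ins (for the inner loop of the filter and for ``util::missingValue``, respectively), and plain ``std::vector`` objects take the place of the UFO data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sentinel meaning "this bound was not specified".
// It only stands in for the missing-value indicator used in UFO.
const float kMissing = std::numeric_limits<float>::lowest();

// Hypothetical stand-alone helper (not part of the UFO API).
// Returns one flag per observation; true means "reject".
std::vector<bool> flagOutOfBounds(const std::vector<float>& values,
                                  const std::vector<bool>& apply,
                                  float vmin, float vmax) {
  assert(values.size() == apply.size());
  std::vector<bool> flagged(values.size(), false);
  for (std::size_t jobs = 0; jobs < values.size(); ++jobs) {
    if (!apply[jobs]) continue;  // observation not selected for filtering
    // Each bound is checked only if it was specified.
    if (vmin != kMissing && values[jobs] < vmin) flagged[jobs] = true;
    if (vmax != kMissing && values[jobs] > vmax) flagged[jobs] = true;
  }
  return flagged;
}
```

Passing ``kMissing`` for a bound disables that side of the check, which mirrors the ``vmin != missing`` and ``vmax != missing`` guards in the filter body. For example, ``flagOutOfBounds({10.0f, 14.0f, 19.0f}, {true, true, true}, 14.0f, 19.0f)`` flags only the first value.
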

Step 4: Compile your code
------------------------------

Return to the build directory (``$HOME/jedi/build-release``) and run ecbuild again. We want to re-run ecbuild because we added source code files to UFO. Once ecbuild completes, verify that it reports that configuration has succeeded. If the configuration step has succeeded you should see a line like this:

.. code:: bash

   -- Build files have been written to: /home/ubuntu/jedi/build-release

Now that you have modified the ufo source code, recompile it. To save a little time, you can go directly to the ufo directory and just compile that:

.. code:: bash

   cd $HOME/jedi/build-release/ufo
   make -j8

If an error is reported, review the console output to see what went wrong. If you do not know what to fix, please ask for help.

Once the build succeeds, you need to run ctest from the UFO directory and ensure all tests pass. Step 5 provides more details about testing in JEDI.

Step 5: Testing in JEDI
------------------------------

Each JEDI repository has its own suite of tests. In this step, we introduce some of the ctest commands that can help you test and debug your code. Please refer to the `JEDI documentation`_ for more information on the JEDI test suite.

After building and compiling the bundle, you can run the tests using :code:`ctest` from the build directory.

.. code:: bash

   cd $HOME/jedi/build-release
   ctest

Here :code:`$HOME/jedi/build-release` is the build directory. To run only the tests in UFO, change into the ufo directory and run the ctest command there.

.. code:: bash

   cd $HOME/jedi/build-release/ufo
   ctest

After the tests are complete, ctest will print out a summary, highlighting which tests, if any, failed.

To run a single test, you can use :code:`-R` followed by the test's name, for example:

.. code:: bash

   ctest -R ufo_coding_norms

The output from these tests will be printed to the screen and written to the file :code:`LastTest.log` in the :code:`Testing/Temporary` subdirectory of the directory where you ran ctest, in this example :code:`$HOME/jedi/build-release/ufo/Testing/Temporary`.
In the same directory, :code:`LastTestsFailed.log` lists the last tests that failed.

You can run ctest with the verbose option to get more information, which can be helpful for debugging.

.. code:: bash

   ctest -V -R test_ufo_geovals

and for extra verbose output:

.. code:: bash

   ctest -VV -R test_ufo_geovals

ctest also has an option to re-run only the tests that failed last time:

.. code:: bash

   ctest --rerun-failed

A note on the ufo_coding_norms test
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This test runs ``cpplint``, which is a command-line tool to check C/C++ files for style issues following `Google's C++ style guide <https://google.github.io/styleguide/cppguide.html>`_. We use several rules in this style guide to ensure that the code we write is readable by other people. If you see an error in the ``ufo_coding_norms`` test, this indicates that the style checker has detected an issue. To view the output of a failed ``ufo_coding_norms`` test, run:

.. code:: bash

   ctest -V -R ufo_coding_norms

Then, apply any fixes to your code, rerun ``make -j8``, and run the test again.

Keep in mind that when you add a new feature to the JEDI repository you need to write a test for your code. This way you ensure your code is working properly, and it will help us review and merge your code more quickly. You will add a test for your new filter in the next section of this practical.

Step 6: Add YAML configuration file for the new test
----------------------------------------------------

We will use :code:`filters_testdata.nc4`, a simplified IODA format file, for testing our new filter. Take a look at this dataset by using the :code:`h5dump` command.

.. code:: bash

   cd $HOME/jedi/build-release/ufo/test/Data/ufo/testinput_tier_1
   h5dump filters_testdata.nc4 | less

Here :code:`$HOME/jedi/build-release` is the build directory.

To test your filter, you first need to add a YAML configuration file in your source directory :code:`$HOME/jedi/fv3-bundle/ufo/test/testinput`. In this directory, YAML configuration files with the prefix :code:`qc_` are used for testing various filters in UFO.
In the YAML configuration file, you can specify the details of how you want to test your filter, for example the name of the input file and the list of variables in the file to which you want to apply the filter.

Create a new YAML file in :code:`$HOME/jedi/fv3-bundle/ufo/test/testinput` called :code:`qc_practical_boundscheck.yaml`. Copy and paste the following into your YAML file. If you are using the ``vim`` editor, it may be helpful to open the editor and immediately type ``:set paste`` so that the indentation shown below is preserved.

.. code:: yaml

   window begin: 2018-01-01T00:00:00Z
   window end: 2019-01-01T00:00:00Z

   observations:
   - obs space:
       name: test data
       obsdatain:
         obsfile: Data/ufo/testinput_tier_1/filters_testdata.nc4
       simulated variables: [variable1, variable2, variable3]
     obs filters:
     - filter: Practical Bounds Check  # test min/max value with all variables
       filter variables:
       - name: variable1
       - name: variable2
       - name: variable3
       minvalue: 14.0
       maxvalue: 19.0
   #   Compare variables with minvalue/maxvalue
   #   variable1@ObsValue = 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
   #   variable2@ObsValue = 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
   #   variable3@ObsValue = 25, 24, 23, 22, 21, 20, 19, 18, 17, 16
     passedBenchmark: 13

The test will pass when the number of data points that pass the filter is equal to the :code:`passedBenchmark` value. The developer of the test is responsible for finding the correct :code:`passedBenchmark` value. You can determine this number by examining :code:`obsfile`, in this case:

.. code:: bash

   h5dump $HOME/jedi/build-release/ufo/test/Data/ufo/testinput_tier_1/filters_testdata.nc4 | less

You can define multiple mini-tests for your filter in one YAML configuration file. Now add a new test that uses your new :code:`Practical Bounds Check` filter to reject data points with ObsValues outside the range 15.0 to 20.0, for variable2 and variable3 only. Notice that all data points in variable1 will pass because variable1 is not specified in this test.
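
A :code:`passedBenchmark` value can also be cross-checked by hand or with a few lines of code. The sketch below is a stand-alone illustration, not part of UFO: ``countPassed`` and ``firstTestBenchmark`` are hypothetical helper names, the ObsValue lists are copied from the comments in the first test, and it assumes that every observation is selected by the filter and that none is missing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts the values that lie inside [vmin, vmax],
// i.e. the observations a bounds check with these limits would retain.
std::size_t countPassed(const std::vector<float>& values,
                        float vmin, float vmax) {
  assert(vmin <= vmax);
  std::size_t passed = 0;
  for (float v : values) {
    if (v >= vmin && v <= vmax) ++passed;
  }
  return passed;
}

// Hypothetical helper: reproduces the benchmark of the first test,
// where all three variables are filtered with minvalue 14 and maxvalue 19.
std::size_t firstTestBenchmark() {
  const std::vector<float> variable1 =
      {10.0f, 11.0f, 12.0f, 13.0f, 14.0f, 15.0f, 16.0f, 17.0f, 18.0f, 19.0f};
  const std::vector<float> variable2 =
      {10.0f, 12.0f, 14.0f, 16.0f, 18.0f, 20.0f, 22.0f, 24.0f, 26.0f, 28.0f};
  const std::vector<float> variable3 =
      {25.0f, 24.0f, 23.0f, 22.0f, 21.0f, 20.0f, 19.0f, 18.0f, 17.0f, 16.0f};
  return countPassed(variable1, 14.0f, 19.0f) +
         countPassed(variable2, 14.0f, 19.0f) +
         countPassed(variable3, 14.0f, 19.0f);
}
```

``firstTestBenchmark()`` returns 13 (6 + 3 + 4), which matches the first test. For a test that lists only some of the variables, every observation of an unlisted variable counts as passed, so add the full number of values of that variable instead of calling ``countPassed`` on it.
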
You can copy the :code:`obs filters` section from the previous test and modify it, or you can use the template below to add this test.

.. code:: yaml

   - obs space:
       name: test data
       obsdatain:
         obsfile: Data/ufo/testinput_tier_1/filters_testdata.nc4
       simulated variables: [variable1, variable2, variable3]
     obs filters:
     - filter: ...  # test min/max value with all variables
       filter variables:
       - name: ...
       - name: ...
       minvalue: ...
       maxvalue: ...
   #   Compare variables with minvalue/maxvalue
   #   variable1@ObsValue = 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
   #   variable2@ObsValue = 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
   #   variable3@ObsValue = 25, 24, 23, 22, 21, 20, 19, 18, 17, 16
     passedBenchmark: ...

Note that the ``test/testinput`` directory exists in both the source and build directories. Change to ``$HOME/jedi/build-release/ufo/test/testinput`` and then execute the command ``ls -l``. You can see that the YAML files in the build directory are linked to the YAML files in the source directory. So, you can edit the YAML file in either the build or the source directory; both will work.

Step 7: Register your test to CMakeLists.txt
--------------------------------------------

Now you need to register your new test with CMake by adding it to :code:`$HOME/jedi/fv3-bundle/ufo/test/CMakeLists.txt`. First, add your YAML configuration file to the :code:`ufo_test_input` list. Next, under the :code:`Test UFO ObsFilters (generic)` section, add your test using the :code:`ecbuild_add_test` command.

.. code:: cmake

   ecbuild_add_test( TARGET  test_ufo_qc_gen_practical_boundscheck
                     COMMAND ${CMAKE_BINARY_DIR}/bin/test_ObsFilters.x
                     ARGS    "testinput/qc_practical_boundscheck.yaml"
                     ENVIRONMENT OOPS_TRAPFPE=1
                     DEPENDS test_ObsFilters.x
                     TEST_DEPENDS ufo_get_ufo_test_data )

Step 8: Run your new test
-------------------------

Now you are ready to test your filter! Don't forget to rebuild UFO first. To rebuild UFO with the new changes, enter :code:`$HOME/jedi/build-release/ufo` and run the command :code:`make -j8`.
Next, you can list all the UFO tests using :code:`ctest -N`, or only the matching ones using :code:`ctest -N -R practical`. Can you find your new test on the list? Now run your test using:

.. code:: bash

   ctest -R name_of_your_test

Did your test pass?

When writing a new test, it is always a good idea to also test failure conditions. Modify your YAML configuration file to make your test fail. You do not need to rebuild the bundle if you are only making changes to the YAML files. You can simply rerun your test after modifying the YAML file. Run your test in verbose mode to see the detailed output.

.. code:: bash

   ctest -VV -R name_of_your_test

Did your test fail as expected? Don't forget to change your YAML file back to the passing condition.

You can add more tests to your YAML configuration file to make your new filter robust.

Execute the command ``ls -al`` in ``$HOME/jedi/build-release/ufo/test/testinput`` to see the linked YAML files, including your new one.

You can find the solution for this practical under the ``feature/new_qc_test_solution`` branch in the ``jedi-da-academy/ufo`` repository: https://github.com/jedi-da-academy/ufo/tree/feature/new_qc_test_solution

.. _JEDI documentation: https://jointcenterforsatellitedataassimilation-jedi-docs.readthedocs-hosted.com/en/latest/inside/testing/unit_testing.html
.. _creating a new filter: https://jointcenterforsatellitedataassimilation-jedi-docs.readthedocs-hosted.com/en/latest/inside/jedi-components/ufo/qcfilters/NewFilter.html