Thursday B: Adding a QC filter and a QC filter test to JEDI
Introduction
This practical is on the JEDI Unified Forward Operator (UFO) code and quality control (QC) filters and tests. It has three main parts. First, you will create a feature branch in the ufo repository. Second, you will add a new simple QC filter to UFO. Third, you will add a test for your new QC filter.
Filters are an essential component in a data assimilation workflow. Filters can change quality control flags (i.e., to reject or retain observations) and observation error variances (e.g., one might wish to increase observation error variances to decrease the observation weight in the analysis instead of rejecting observations altogether). In JEDI, filters are customizable and generic. This means that you can use the same code (written in C++) to accomplish different tasks (specified by you in a YAML file).
This activity assumes that you have completed the previous activities and still have access to a JupyterLab or SSH session.
Step 1: Access your AWS instance and enter the Singularity container
Connect to your assigned compute node. You will use the same method as yesterday.
You already have the Singularity container that contains the JEDI dependencies. Enter the container using:
cd ~/
singularity shell -e jedi-gnu-openmpi-dev_latest.sif
Once in the container, be sure to remove the limits on stack and virtual memory to prevent spurious failures, as noted before:
ulimit -s unlimited
ulimit -v unlimited
Step 2: Make a new feature branch in the UFO repository
You invoked ecbuild in Monday’s Getting Started activity. Ecbuild cloned the stable branches of several repositories. However, in this tutorial we want to make modifications to the UFO code. In JEDI, we aim to follow the “git flow” paradigm when developing, and we will discuss this in depth in a later lecture on Friday. In summary, the develop branch contains the development version of each repository. This version of the code should always build and test successfully. Whenever you want to add a new feature to the code, you should do your work in another branch of the repository. Once the work is done, you can issue a “Pull Request” to have other JEDI users review your code and merge your changes into the develop branch. With every release of the JEDI code, we create a snapshot of the JEDI repositories by copying the develop branch to a git “tag” (an immutable branch).
The meanings of “master or main”, “develop”, and “tags” will be discussed in the git-flow lecture and a later practical exercise. Because of the ordering of the lectures, and because we want stable, reproducible academy exercises, we have a special copy of the UFO repository under jedi-da-academy on GitHub. In this repository, the 1.1.0 tag and the develop branch are identical. This is not the case in actual development. Ordinarily, the top-level CMakeLists.txt file would not reference TAG 1.1.0 when describing each package and would instead reference BRANCH develop.
Open the top-level CMakeLists.txt file in the source code (~/jedi/fv3-bundle/CMakeLists.txt).
Change line 49 from:
ecbuild_bundle( PROJECT ufo GIT "https://github.com/jedi-da-academy/ufo.git" TAG 1.1.0 )
to:
ecbuild_bundle( PROJECT ufo GIT "https://github.com/jedi-da-academy/ufo.git" BRANCH feature/new_qc_test_<yourname> )
Then, enter the source code’s ufo subdirectory (cd ~/jedi/fv3-bundle/ufo).
NOTE: There is also a ufo directory at ~/jedi/build/ufo. This is not the directory that you want.
Check out the develop branch and then create a new branch as follows:
git checkout develop
git checkout -b feature/new_qc_test_<yourname>
The -b option to git checkout creates the branch by effectively making a copy of the develop branch.
Don’t forget to set LOCAL_PATH_JEDI_TESTFILES; otherwise the test data will not be linked correctly. This step is only needed for the practical sessions. In other cases, cmake will download and link the correct version of the test data.
export LOCAL_PATH_JEDI_TESTFILES=$HOME/jedi/test-data-release
Step 3: Add a new filter
We are going to re-implement a simplified version of the Bounds Check filter. This filter checks that observation data are within certain user-specified bounds. You can refer to the JEDI documentation for more details about creating a new filter.
Step 3a: The backend logic
Navigate into the ~/jedi/fv3-bundle/ufo/src/ufo/filters directory. Copy the DifferenceCheck.cc and DifferenceCheck.h files to PracticalBoundsCheck.cc and PracticalBoundsCheck.h, respectively.
Open these files in your editor of choice.
In PracticalBoundsCheck.h:
Rename all references of DifferenceCheck to PracticalBoundsCheck. Search for all possible capitalizations. Don’t forget the capitalized text on lines 8, 9, and 87!
Change the line
int qcFlag() const override {return QCflags::diffref;}
to return a different flag: QCflags::bounds. This QC flag is conveniently already defined in ufo/filters/QCflags.h.
Remove lines with ref and val parameters. We do not use them in this filter.
In PracticalBoundsCheck.cc:
Rename all references of DifferenceCheck to PracticalBoundsCheck.
In PracticalBoundsCheck::applyFilter(...), replace the function body with something like this:
  ufo::Variables testvars;
  testvars += ufo::Variables(filtervars, "ObsValue");

  // Retrieve the bounds.
  const float missing = util::missingValue(missing);
  const float vmin = parameters_.minvalue.value().value_or(missing);
  const float vmax = parameters_.maxvalue.value().value_or(missing);

  // Sanity checks
  if (filtervars.nvars() == 0) {
    oops::Log::error() << "No variables will be filtered out in filter "
                       << config_ << std::endl;
    ABORT("No variables specified to be filtered out in filter");
  }

  // Loop over all variables to filter
  for (size_t jv = 0; jv < testvars.nvars(); ++jv) {
    // get test data for this variable
    std::vector<float> testdata;
    data_.get(testvars.variable(jv), testdata);

    // apply the filter
    for (size_t jobs = 0; jobs < obsdb_.nlocs(); ++jobs) {
      if (apply[jobs]) {
        ASSERT(testdata[jobs] != missing);
        if (vmin != missing && testdata[jobs] < vmin) flagged[jv][jobs] = true;
        if (vmax != missing && testdata[jobs] > vmax) flagged[jv][jobs] = true;
      }
    }
  }
Remove lines with ref and val parameters. We do not use them in this filter.
Feel free to customize the function further.
Step 3b: Add your new filter to the build system
Edit ~/jedi/fv3-bundle/ufo/src/ufo/filters/CMakeLists.txt and add PracticalBoundsCheck.cc and PracticalBoundsCheck.h to filters_files.
UFO needs to be told that another filter is available. The list of known filters is located in ~/jedi/fv3-bundle/ufo/src/ufo/instantiateObsFilterFactory.h.
To add the new filter, first add
#include "ufo/filters/PracticalBoundsCheck.h"
to the top of instantiateObsFilterFactory.h.
At the end of instantiateObsFilterFactory.h, follow the pattern and add:
static oops::FilterMaker<OBS, oops::ObsFilter<OBS, ufo::PracticalBoundsCheck> > practicalBoundsCheckMaker("Practical Bounds Check");
The filter is added!
Step 4: Compile your code
Return to the build directory ($HOME/jedi/build-release) and run ecbuild again. We want to re-run ecbuild because we added source code files to UFO.
Once ecbuild completes, verify that it reports that configuration has succeeded. If the configuration step has succeeded you should see a line like this:
-- Build files have been written to: /home/ubuntu/jedi/build-release
Now that you have modified the ufo source code, recompile it. To save a little time, you can go directly to the ufo directory and just compile that:
cd $HOME/jedi/build-release/ufo
make -j8
If an error is reported, review the console to see what went wrong. If you do not know what to fix, please ask for help.
Once the build succeeds, you need to run ctest from the UFO directory and ensure all tests pass. Step 5 provides more details about testing in JEDI.
Step 5: Testing in JEDI
Each JEDI repository has its own suite of tests. In this step, we introduce some of the ctest commands that can help you test and debug your code. Please refer to the JEDI documentation for more information on the JEDI test suite.
After building and compiling the bundle, you can run the tests using ctest.
cd <build-directory>
ctest
Here <build-directory> is $HOME/jedi/build-release.
To run only the UFO tests, cd into the ufo directory and run ctest there.
cd $HOME/jedi/build-release/ufo
ctest
After the tests are complete, ctest will print out a summary, highlighting which tests, if any, failed.
To run a single test, you can use -R followed by the test’s name, for example:
ctest -R ufo_coding_norms
The output from these tests will be printed to the screen and written to the file LastTest.log in the directory <build-directory>/Testing/Temporary (in this example, $HOME/jedi/build-release/ufo/Testing/Temporary).
In the same directory, LastTestsFailed.log lists the tests that failed in the last run.
You can run ctest with the verbose option to get more information, which can be helpful for debugging.
ctest -V -R test_ufo_geovals
and for extra verbose:
ctest -VV -R test_ufo_geovals
ctest also has an option to only re-run the tests that failed last time:
ctest --rerun-failed
A note on the ufo_coding_norms test
This test runs cpplint, a command-line tool that checks C/C++ files for style issues following Google’s C++ style guide. We use several rules in this style guide to ensure that the code we write is readable by other people.
If you see an error in the ufo_coding_norms test, this indicates that the style checker has detected an issue. To view the output of a failed ufo_coding_norms test, run:
ctest -V -R ufo_coding_norms
Then, apply any fixes to your code, rerun make -j8, and run the test again.
Keep in mind that when you add a new feature to a JEDI repository, you need to write a test for your code. This ensures that your code is working properly and helps us review and merge your code more quickly. You will add a test for your new filter in the next section of this practical.
Step 6: Add a YAML configuration file for the new test
We will use filters_testdata.nc4, a simplified IODA format file, for testing our new filter. Take a look at this dataset using the h5dump command.
cd <build-directory>/ufo/test/Data/ufo/testinput_tier_1
h5dump filters_testdata.nc4 | less
Here <build-directory> is $HOME/jedi/build-release.
To test your filter, you first need to add a YAML configuration file in your source directory $HOME/jedi/fv3-bundle/ufo/test/testinput. In this directory, YAML configuration files with the prefix qc_ are used for testing various filters in UFO.
In the YAML configuration file, you can specify the details of how you want to test your filter, for example the name of the input file and the list of variables in that file to which the filter should be applied.
Create a new YAML file in $HOME/jedi/fv3-bundle/ufo/test/testinput called qc_practical_boundscheck.yaml. Copy and paste the following into your YAML file. If you are using the vim editor, it may be helpful to open the editor and immediately type :set paste so that the indentation shown below is preserved.
window begin: 2018-01-01T00:00:00Z
window end: 2019-01-01T00:00:00Z

observations:
- obs space:
    name: test data
    obsdatain:
      obsfile: Data/ufo/testinput_tier_1/filters_testdata.nc4
    simulated variables: [variable1, variable2, variable3]
  obs filters:
  - filter: Practical Bounds Check  # test min/max value with all variables
    filter variables:
    - name: variable1
    - name: variable2
    - name: variable3
    minvalue: 14.0
    maxvalue: 19.0
# Compare variables with minvalue/maxvalue
# variable1@ObsValue = 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
# variable2@ObsValue = 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
# variable3@ObsValue = 25, 24, 23, 22, 21, 20, 19, 18, 17, 16
  passedBenchmark: 13
The test will pass when the number of data points that pass the filter is equal to the passedBenchmark value. The developer of the test is responsible for finding the correct passedBenchmark value. You can determine this number by examining obsfile, in this case:
h5dump <build-directory>/ufo/test/Data/ufo/testinput_tier_1/filters_testdata.nc4 | less
You can define multiple mini-tests for your filter in one YAML configuration file. Now add a new test to filter out data points with ObsValues greater than 15.0 and less than 20.0 only for variable2 and variable3 using your new Practical Bounds Check filter. Notice that all data points in variable1 will pass because variable1 is not specified in this test. You can copy the obs filters section from the previous test and modify it, or you can simply use the template below to add this test.
- obs space:
    name: test data
    obsdatain:
      obsfile: Data/ufo/testinput_tier_1/filters_testdata.nc4
    simulated variables: [variable1, variable2, variable3]
  obs filters:
  - filter: ...  # test min/max value with all variables
    filter variables:
    - name: ...
    - name: ...
    minvalue: ...
    maxvalue: ...
# Compare variables with minvalue/maxvalue
# variable1@ObsValue = 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
# variable2@ObsValue = 10, 12, 14, 16, 18, 20, 22, 24, 26, 28
# variable3@ObsValue = 25, 24, 23, 22, 21, 20, 19, 18, 17, 16
  passedBenchmark: ...
Note that the test/testinput directory exists in both the source and build directories. cd to $HOME/jedi/build-release/ufo/test/testinput and then execute the command ls -l. You can see that the YAML files in the build directory are linked to the YAML files in the source directory, so you can edit a YAML file in either the build or the source directory; both work.
Step 7: Register your test in CMakeLists.txt
Now you need to register your new test with CMake by adding it to $HOME/jedi/fv3-bundle/ufo/test/CMakeLists.txt.
First, add your YAML configuration file to the ufo_test_input list.
Next, under the Test UFO ObsFilters (generic) section, add your test using the ecbuild_add_test command.
ecbuild_add_test( TARGET  test_ufo_qc_gen_practical_boundscheck
                  COMMAND ${CMAKE_BINARY_DIR}/bin/test_ObsFilters.x
                  ARGS    "testinput/qc_practical_boundscheck.yaml"
                  ENVIRONMENT OOPS_TRAPFPE=1
                  DEPENDS test_ObsFilters.x
                  TEST_DEPENDS ufo_get_ufo_test_data )
Step 8: Run your new test
Now you are ready to test your filter! Don’t forget to rebuild UFO first. To rebuild UFO with the new changes, enter <build-directory>/ufo and run the command make -j8.
Next, you can list all the UFO tests using ctest -N, or only the matching ones using ctest -N -R practical.
Can you find your new test in the list?
Now run your test using:
ctest -R name_of_your_test
Did your test pass? When writing a new test, it is always a good idea to also test failure conditions. Modify your YAML configuration file to make your test fail. You do not need to rebuild the bundle if you are only making changes to the YAML files. You can simply rerun your test after modifying the YAML file. Run your test in verbose mode to see the detailed output.
ctest -VV -R name_of_your_test
Did your test fail as expected? Don’t forget to change your YAML file back to the passing condition. You can add more tests to your YAML configuration file to make your new filter robust.
Execute the command ls -al in <build-directory>/ufo/test/testinput to see the links to the YAML files in the source directory.
You can find the solution for this practical under the feature/new_qc_test_solution branch in the jedi-da-academy/ufo repository: https://github.com/jedi-da-academy/ufo/tree/feature/new_qc_test_solution