Friday: Pull Requests and Code Reviews

In this activity you will:

  • Push your feature branch to GitHub

  • Issue a pull request on GitHub

  • Engage in the code review process

So, let’s say that you have downloaded one or more components of the JEDI code and you have made some modifications. Maybe you have added a new observation operator or a new data assimilation algorithm. And, now you want your code to become incorporated into the JEDI code base.

First, we’d like to express our gratitude. JEDI is a community effort and we welcome contributions from you and your colleagues.

Second, we’d like to outline the process that you should follow in order to get your code changes into JEDI; how you too can become a JEDI developer. That is the purpose of today’s activity.

Step 1: Fork the ufo repository

In order to illustrate how to contribute code to JEDI, we will use the feature branch of ufo that you have been working with the last few days.

This feature branch exists on your AWS node. But, in order to integrate this into JEDI, you need to upload your changes to GitHub.

Those familiar with git and GitHub may be aware that, in many cases, you can do this with a git push command. However, in order to push to a remote repository, you need to have permission to write to it. External contributors do not in general have permission to write/push directly to JCSDA repositories.

In this situation, you would work from a fork. In GitHub (or similar platforms like bitbucket), a fork is a copy of a repository that you can modify as you wish. But, it’s not just a copy. A fork maintains a connection to the parent repository. So, after making your changes, you can easily submit them for possible inclusion in the parent repository, as we will see below.

So, the first step in contributing your ufo feature branch to JEDI is to create a fork of ufo on GitHub that you can modify. This being the academy, there are a few things we will do a little differently, just for illustration. First, if you were doing this outside of the academy, you would likely create your fork before making any code changes. And, you would fork the appropriate public repository from the JCSDA organization on GitHub. But, these are just academy exercises intended for illustration - we do not really want to incorporate these changes into the JEDI source code. So, we will instead make a fork of a fork. The version of ufo we have been working with is at jedi-da-academy/ufo. This is what we will fork, even though this itself is a fork of the main JEDI code at jcsda/ufo.

The second difference from a normal workflow is that, for the purpose of the academy, we will create this fork in your own personal GitHub account. This is a valid way of proceed even in normal circumstances but in many cases, it might make more sense for your institution or university department to maintain a fork for a group of JEDI contributors.

So, with these caveats in mind, proceed to log in to GitHub and navigate to the jedi-da-academy version of ufo at https://github.com/jedi-da-academy/ufo.

In the upper right corner of the page, look for the Fork button. Press this and select your personal Github account from the drop-down menu as the destination for the fork.

Step 2: Set up a Github access token

The JCSDA (and jedi-da-academy) repositories we have been using throughout the week are public, meaning that you were able to clone them without the need for any GitHub authentication. In fact, you do not even need a GitHub account to do any of the previous activities.

However, in order to push code to GitHub you do need an account. And, since we will be pushing code to GitHub from our AWS node rather than our own machines, you will need to set up your GitHub credentials.

You could in principle do this with your Github user name and password. But, since you are working from a cloud instance over http, it is more secure to set up a personal access token that can be deleted after your push.

So, the next step is to log into your own personal GitHub account and follow these instructions to set up a personal access token.

For the Note you can just put “JEDI Academy” and For the scope just select public_repo as shown here:

../../_images/GitHub_token.png

Then select Generate token. Important before leaving the page, copy your token by selecting the disk symbol immediately to the right, and paste it into your jupyterlab terminal as a new environment variable. But make sure you do this inside the container. So, if you are not already in the container, enter it first:

singularity shell -e jedi-gnu-openmpi-dev_latest.sif
TOKEN=<copy-and-paste-token>

Also, for convenience, create another environment variable that contains your Github user name:

MYACCOUNT=<your-GitHub-user-name>

You can also put your name in your git configuration so your name appears as the one who is committing:

git config --global user.name $MYACCOUNT

Step 3: Push your feature branch to GitHub

Now we have a GitHub home for our feature branch. But, back on our AWS node, our local git repository does not yet know that this new fork exists. As far as git is concerned, it’s origin is still https://github.com/jedi-da-academy/ufo.

First, let’s make sure all of our local changes are committed in the ufo branch that we have been working with. So, go back to your AWS node and enter these commands:

cd ~/jedi/fv3-bundle/ufo
git add *
git status

Here it good practice to pause and review what files you will be committing. Make sure they are only the files you changed and that the list does not include files that do not belong in the GitHub repository, such as *.o files from your build directory or any backup files you may have created.

If there are files that don’t belong you can remove them by first entering git reset. If there are files that you wish to remove from the repo (such as the .ipynb_checkpoints files, then you can just remove them and run git add * again. Alternatively, if there are files that you want to keep but do not want to commit, then you can run git add <filename> only on the specific files you wish to commit (this accepts wildcards).

When everything looks good you can commit your changes with this command:

git commit -m "<message>"

Where <message> is some commit message of your choice, briefly describing the changes you made.

Now run the following commands to let git know that your fork exists (if you set up the MYTOKEN and MYACCOUNT variables as described above you can just copy and paste):

git remote add myfork https://$MYACCOUNT:$TOKEN@github.com/$MYACCOUNT/ufo

Now you can enter this to verify that git now knows about two remote ufo repositories, one called origin that is at jedi-da-academy and one called myfork that is on your personal GitHub account (you should also see your token there).

git remote -v

Next let git retrieve information about that new remote from GitHub:

git fetch myfork

You should always make sure your feature branch is up to date with the JCSDA develop branch before you issue a pull request. In our case - the way we have proceeded with this week’s academy - our local develop branch is tracking the parent repository at jedi-da-academy. This means that if we check out the develop branch locally with git checkout develop, this branch will track the develop branch on the jedi-da-academy repository, not necessarily the develop branch on the <myaccount> (forked) repository. The two should be the same because we just created our fork moments ago. But, it is important to keep straight such distinctions when working with forks. For further information, see GitHub’s documentation on remotes and forks.

So, with this in mind, we can bring our feature branch up to date with the following commands:

git checkout develop
git pull origin
git checkout feature/new_qc_test_<name>
git merge develop

Sometimes when you run these commands, you may see that there are merge conflicts that you will have to resolve. This may occur if you edited a file in your feature branch and that same file has changed on the develop branch since your last update. So, you would have to edit the file or files in question and resolve all conflicts. But, you probably won’t run into any conflicts within the context of the Academy.

Warning

When you execute the git push command below, make sure you do so inside the container. Otherwise you will likely get a warning that Git Large File Support is enabled for this repo, but is not installed in your environment. git-lfs is installed in the container so you should not see this message. For more information see the git-lfs page in JEDI Documentation.

When you have resolved all merge conflicts, you can push your feature branch to your fork:

git checkout feature/new_qc_test_<name>
git push myfork --set-upstream feature/new_qc_test_<name>

Now, and only now, your feature branch exists on GitHub, not just on your local AWS node.

Step 4: Issue a Pull Request

The only way you, or anyone, can change the main code in a JEDI repository is to issue a Pull Request on GitHub. The term “pull request” can be a little confusing. The idea is that you are requesting that your code be pulled into the main code base for the JEDI project. In keeping with the git flow paradigm, this means that you are requesting that your code change be incorporated into the develop branch of the repository in question. The content of the develop branch, including your changes if they are accepted, will eventually make it into the master branch at the time of the next release.

So, for this next step, we must go back to GitHub. Navigate to your forked copy of ufo at https://github.com/<myaccount>/ufo.

Near the top of the page you should see a drop-down menu displaying the name of the default branch, which is master. If all went well in Step 2 above, then that drop-down menu should include your feature branch. Select it from the menu.

Now, look for a big green button in the upper right of the page that says Compare & pull request. Select this button.

Now look closely at the page that comes up. There is a lot there to consider.

We highly recommend that you follow the specific instructions in the JEDI Documentation for filling in this form. We also recommend reading our more general guidance on how to create a good pull request.

We briefly summarize the highlights of these documents here but please do take the time to read them carefully if you wish to become a JEDI contributor.

Near the top of the page you will see several drop-down menus that verify which branch will be merged into which: the source branch and the destination branch. Make sure the destination branch is the develop branch in the jedi-da-academy/ufo base repository. And, make sure the source branch is your feature branch in the head repository located in your personal GitHub account.

Below that you will see a box for you to enter the title. Give a good descriptive title that summarizes what changed and/or why. Then provide further details in the box below it.

You will see headings in this description box indicated by ##. Replace the text below each of these headings with your own text. In the Description session provide a description of what has been changed and why it was changed.

The Definition of Done provides further clarification on what it means for this feature or bugfix to be done. What should work now that this feature has been added? What should the JEDI user now be able to do that she or he could not do before?

The Issues addressed section is used if the code changes in this pull request address any previously defined issues. You can leave this blank for now but possible things to mention here include a discussion thread from our public forums.

Dependencies and Impact describe how the code changes introduced in this pull request relate to other pull requests in this or other repositories and how these code changes may alter the compilation or functionality of other repositories. This is also the place to describe whether or not the test data stored on AWS needs to be updated.

If you scroll down further you can see the code changes you are proposing to make, highlighted by GitHub. Review these and look for unintended changes (like print statements for debugging) that you might want to remove or modify (you can still do so after submitting the pull request).

Now look at the columns on the right of the window. Most of these are for internal use by the JEDI team. But take particular note of the entry for Reviewers. Select this and request a review from at least two of your fellow Padawans (perhaps others in your breakout room). JEDI instructors may also assign other reviewers. You will have the opportunity to review the code of others in Step 4 of this activity.

If this were an actual pull request to the repositories on the JCSDA GitHub organization (instead of jedi-da-academy), you would see more information on the Conversation page of the pull request that reflects our Continuous Integration (CI) testing. All pull requests must pass all tests before they can be merged in to the develop branch. So, as discussed in today’s lectures, we have an automated system that will run the tests whenever a pull request is issued and whenever changes are made to an existing pull request. Furthermore, our CI testing also checks to see if the code you have added is included in any tests. If it is not, then your pull request will fail the CI testing and it will not be merged.

This is what you would see if you were to issue a pull request to the JEDI repositories on the JCSDA GitHub organization. What you will not see is a second round of CI testing that occurs internally at JCSDA. This will run more comprehensive tests across different compiler and MPI implementations, including GNU/OpenMPI, CLANG/MPICH, and INTEL/IMPI. If your code fails any of these tests, you would be informed in the pull request discussion thread.

Step 5: Engage in the Code Review Process

Check your email (the email that is linked to your GitHub account). You should soon see emails that request you to review pull requests from other Padawans. Follow the link and look over their code changes. You can also access your review requests by going to the top of any GitHub page and selecting Pull Requests from the menu bar. Then, on that page, select review Requests.

Again, as with pull requests, we take code reviews very seriously and we encourage you to read our documentation on reviewing code carefully before proceeding.

When you navigate to the pull request web page, you’ll see a few components. The default landing page shows the author’s title and description of the code changes and a discussion thread from various reviewers. You’ll see this page listed as Conversation in a menu bar right above the description. On the right end of that menu bar, you’ll see an item called Files changed. Select that to see the code changes being proposed.

Now examine the code carefully; be critical, but polite! Keep in mind the common goal we all have of making the code better. Some questions you may want to consider as you go through the code include:

  • Is it clear from the title and description what is being done and why?

  • Can the desired goal be achieved in a different way that is more readable, more efficient, or more generic?

  • Is there extraneous code that should be removed, such as print statements used for debugging or unnecessary include statements?

  • Consider the code that has been added or modified. Is it executed by any existing or new tests? Is its functionality properly tested?

  • Does it pass all tests (this would be more readily assessed with an actual pull request to the JCSDA repositories due to the CI infrastructure; see comments at the end of Step 3 above)?

  • Has the author added doxygen comments in the source code to explain aspects of the new or modified code, covering for example, usage, background, and known bugs or limitations?

If these or other questions enable you to identify aspects of the proposed code changes that could be improved, then let the author know. If there is a particular line of code that you want to comment on, move your cursor just to the left of it and hover. This should bring up a blue box with a plus sign. Selecting this box will allow you to comment. And/or, if you have general comments or questions on the approach or strategy, as opposed to specific lines of code, you can go back the the Conversation page, scroll down to the bottom of the discussion thread, and enter your comments there.

Meanwhile, you may receive comments and suggestions from others who are reviewing your code. You can respond to these comments directly on GitHub. But, if code changes are requested, you should go back to your AWS node and edit the code in your branch there. You may need to re-compile your code and re-run the tests. When you have modified the code to address the review comments, you can push the changes to GitHub with the following commands:

git add *
git commit
git push myfork

The reviewers will now be able to see your changes. You can repeat this process indefinitely as more comments and suggestions come in.

Eventually, when reviewers are satisfied with your changes, they will approve your pull request. Likewise, when you are satisfied with the pull requests that you are reviewing, you can formally approve the changes. To approve a pull request, you can go to the Files changed section of the pull request web page. There you will see a green drop-down menu on the upper left labelled Review changes. With that, you can comment, request changes, or approve. You can also approve with a comment. Select the appropriate item from the list and submit your review with the green button at the bottom of the box.

All JEDI pull requests require at least two approvals from reviewers before they can be merged. Normally, a repository administrator would merge your PR after it has received at least two approvals. However, in this artificial Academy situation, merging the PRs from all the Padawans would likely lead to conflicts, since everybody is changing the same code. So, we will skip that merge step today.

Note

If you were able to create a pull request but your were not able to add reviewers or if you have not received any requests to review others’ code, then this may be because you don’t have write access to the repository. In order to serve as a reviewer, you need to be explicitly added to the repository as an external collaborator. Earlier in the week you should have received an email inviting you to join the repository as an external collaborator. If you have not received any review requests, try to find that email and click on the link to accept the invitation. In order for this to work, you need to be logged into GitHub. If you are still having problems serving as a reviewer, consult with a JEDI master.

Step 6: Delete your Academy GitHub Token

At some point, after you are done with the practical (your AWS node does not need to be up), you should delete the GitHub access token you created in Step 2.

To delete your token, log in to your GitHub account and, as before, access your account Settings through the drop-down menu you see by selecting your account icon on the upper right of any GitHub page. Then select Developer Settings and Personal access tokens from the menus on the left. Finally, select the Delete button next to the JEDI Academy token.