###################################### Friday: Pull Requests and Code Reviews ###################################### .. _pull_request_goals: In this activity you will: - Push your feature branch to GitHub - Issue a pull request on GitHub - Engage in the code review process So, let's say that you have downloaded one or more components of the JEDI code and you have made some modifications. Maybe you have added a new observation operator or a new data assimilation algorithm. And, now you want your code to become incorporated into the JEDI code base. First, we'd like to express our gratitude. JEDI is a community effort and we welcome contributions from you and your colleagues. Second, we'd like to outline the process that you should follow in order to get your code changes into JEDI; **how you too can become a JEDI developer**. That is the purpose of today's activity. .. _pull_request_fork: Step 1: Fork the ufo repository =============================== In order to illustrate how to contribute code to JEDI, we will use the feature branch of ufo that you have been working with the last few days. This feature branch exists on your AWS node. But, in order to integrate this into JEDI, you need to upload your changes to GitHub. Those familiar with ``git`` and ``GitHub`` may be aware that, in many cases, you can do this with a ``git push`` command. However, in order to push to a remote repository, you need to have permission to write to it. External contributors do not in general have permission to write/push directly to JCSDA repositories. In this situation, you would work from a `fork `_. In GitHub (or similar platforms like bitbucket), a fork is a copy of a repository that you can modify as you wish. But, it's not just a copy. A fork maintains a connection to the parent repository. So, after making your changes, you can easily submit them for possible inclusion in the parent repository, as we will see below. So, the first step in contributing your ufo feature branch to JEDI is to create a fork of ufo on GitHub that you can modify. This being the academy, there are a few things we will do a little differently, just for illustration. First, if you were doing this outside of the academy, you would likely create your fork *before* making any code changes. And, you would fork the appropriate public repository from the `JCSDA organization on GitHub `_. But, these are just academy exercises intended for illustration - we do not really want to incorporate these changes into the JEDI source code. So, we will instead make a fork of a fork. The version of ufo we have been working with is at `jedi-da-academy/ufo `_. This is what we will fork, even though this itself is a fork of the main JEDI code at `jcsda/ufo `_. The second difference from a normal workflow is that, for the purpose of the academy, we will create this fork in your own personal GitHub account. This is a valid way of proceed even in normal circumstances but in many cases, it might make more sense for your institution or university department to maintain a fork for a group of JEDI contributors. So, with these caveats in mind, proceed to log in to GitHub and navigate to the jedi-da-academy version of ufo at ``_. In the upper right corner of the page, look for the ``Fork`` button. Press this and select your personal Github account from the drop-down menu as the destination for the fork. .. _pull_request_credentials: Step 2: Set up a Github access token ==================================== The JCSDA (and jedi-da-academy) repositories we have been using throughout the week are public, meaning that you were able to clone them without the need for any GitHub authentication. In fact, you do not even need a GitHub account to do any of the previous activities. However, in order to push code to GitHub you do need an account. And, since we will be pushing code to GitHub from our AWS node rather than our own machines, you will need to set up your GitHub credentials. You could in principle do this with your Github user name and password. But, since you are working from a cloud instance over http, it is more secure to set up a personal access token that can be deleted after your push. So, the next step is to log into your own personal GitHub account and `follow these instructions to set up a personal access token `_. For the Note you can just put "JEDI Academy" and For the scope just select ``public_repo`` as shown here: .. image:: images/GitHub_token.png Then select ``Generate token``. **Important** before leaving the page, copy your token by selecting the disk symbol immediately to the right, and paste it into your jupyterlab terminal as a new environment variable: .. code-block:: bash MYTOKEN= Also, for convenience, create another environment variable that contains your Github user name: .. code-block:: bash MYACCOUNT= You can also put your name in your ``git`` configuration so your name appears as the one who is committing: .. code-block:: bash git config --global user.name $MYACCOUNT .. _pull_request_push: Step 3: Push your feature branch to GitHub ========================================== Now we have a GitHub home for our feature branch. But, back on our AWS node, our local git repository does not yet know that this new fork exists. As far as git is concerned, it's ``origin`` is still ``_. First, let's make sure all of our local changes are committed in the ufo branch that we have been working with. So, go back to your AWS node and enter these commands: .. code-block:: bash cd ~/jedi/fv3-bundle/ufo git add * git status Here it good practice to pause and review what files you will be committing. Make sure they are only the files you changed and that the list does not include files that do not belong in the GitHub repository, such as ``*.o`` files from your build directory or any backup files you may have created. If there are files that don't belong you can remove them by first entering ``git reset``. If there are files that you wish to remove from the repo (such as the ``.ipynb_checkpoints`` files, then you can just remove them and run ``git add *`` again. Alternatively, if there are files that you want to keep but do not want to commit, then you can run ``git add `` only on the specific files you wish to commit (this accepts wildcards). When everything looks good you can commit your changes with this command: .. code-block:: bash git commit -m "" Where ```` is some commit message of your choice, briefly describing the changes you made. Now run the following commands to let ``git`` know that your fork exists (if you set up the ``MYTOKEN`` and ``MYACCOUNT`` variables as described above you can just copy and paste): .. code-block:: bash git remote add myfork https://$MYACCOUNT:$TOKEN@github.com/$MYACCOUNT/ufo Now you can enter this to verify that git now knows about two remote ufo repositories, one called ``origin`` that is at ``jedi-da-academy`` and one called ``myfork`` that is on your personal GitHub account (you should also see your token there). .. code-block:: bash git remote -v Next let ``git`` retrieve information about that new remote from GitHub: .. code-block:: bash git fetch myfork You should always make sure your feature branch is up to date with the JCSDA develop branch before you issue a pull request. In our case - the way we have proceeded with this week's academy - our local ``develop`` branch is tracking the parent repository at ``jedi-da-academy``. This means that if we check out the develop branch locally with ``git checkout develop``, this branch will track the develop branch on the ``jedi-da-academy`` repository, not necessarily the develop branch on the ```` (forked) repository. The two should be the same because we just created our fork moments ago. But, it is important to keep straight such distinctions when working with forks. For further information, see GitHub's documentation on `remotes and forks `_. So, with this in mind, we can bring our feature branch up to date with the following commands: .. code-block:: bash git checkout develop git pull origin git checkout feature/new_qc_test_ git merge develop Sometimes when you run these commands, you may see that there are merge conflicts that you will have to resolve. This may occur if you edited a file in your feature branch and that same file has changed on the develop branch since your last update. So, you would have to edit the file or files in question and resolve all conflicts. But, you probably won't run into any conflicts within the context of the Academy. .. warning:: When you execute the ``git push`` command below, make sure you do so *inside* the container. Otherwise you will likely get a warning that ``Git Large File Support is enabled for this repo, but is not installed in your environment``. ``git-lfs`` *is* installed in the container so you should not see this message. For more information see the `git-lfs page in JEDI Documentation `_. When you have resolved all merge conflicts, you can push your feature branch to your fork: .. code-block:: bash git checkout feature/new_qc_test_ git push myfork --set-upstream feature/new_qc_test_ Now, and only now, your feature branch exists on GitHub, not just on your local AWS node. .. _pull_request_issue_pr: Step 4: Issue a Pull Request ============================ The only way you, or anyone, can change the main code in a JEDI repository is to issue a **Pull Request** on GitHub. The term "pull request" can be a little confusing. The idea is that you are requesting that your code be pulled into the main code base for the JEDI project. In keeping with the `git flow paradigm `_, this means that you are requesting that your code change be incorporated into the ``develop`` branch of the repository in question. The content of the ``develop`` branch, including your changes if they are accepted, will eventually make it into the ``master`` branch at the time of the next release. So, for this next step, we must go back to GitHub. Navigate to your forked copy of ufo at ``https://github.com//ufo``. Near the top of the page you should see a drop-down menu displaying the name of the default branch, which is ``master``. If all went well in Step 2 above, then that drop-down menu should include your feature branch. Select it from the menu. Now, look for a big green button in the upper right of the page that says **Compare & pull request**. Select this button. Now look closely at the page that comes up. There is a lot there to consider. We highly recommend that you follow the specific instructions in the `JEDI Documentation for filling in this form `_. We also recommend reading our `more general guidance on how to create a good pull request `_. We briefly summarize the highlights of these documents here but please do take the time to read them carefully if you wish to become a JEDI contributor. Near the top of the page you will see several drop-down menus that verify which branch will be merged into which: the source branch and the destination branch. Make sure the destination branch is the ``develop`` branch in the ``jedi-da-academy/ufo`` base repository. And, make sure the source branch is your feature branch in the head repository located in your personal GitHub account. Below that you will see a box for you to enter the title. Give a good descriptive title that summarizes what changed and/or why. Then provide further details in the box below it. You will see headings in this description box indicated by ``##``. Replace the text below each of these headings with your own text. In the **Description** session provide a description of what has been changed and why it was changed. The **Definition of Done** provides further clarification on what it means for this feature or bugfix to be done. What should work now that this feature has been added? What should the JEDI user now be able to do that she or he could not do before? The **Issues addressed** section is used if the code changes in this pull request address any previously defined issues. You can leave this blank for now but possible things to mention here include a discussion thread from our `public forums `_. **Dependencies** and **Impact** describe how the code changes introduced in this pull request relate to other pull requests in this or other repositories and how these code changes may alter the compilation or functionality of other repositories. This is also the place to describe whether or not the test data stored on AWS needs to be updated. If you scroll down further you can see the code changes you are proposing to make, highlighted by GitHub. Review these and look for unintended changes (like print statements for debugging) that you might want to remove or modify (you can still do so after submitting the pull request). Now look at the columns on the right of the window. Most of these are for internal use by the JEDI team. But take particular note of the entry for **Reviewers**. Select this and request a review from at least two of your fellow Padawans (perhaps others in your breakout room). JEDI instructors may also assign other reviewers. You will have the opportunity to review the code of others in Step 4 of this activity. If this were an actual pull request to the repositories on the ``JCSDA`` GitHub organization (instead of ``jedi-da-academy``), you would see more information on the **Conversation** page of the pull request that reflects our Continuous Integration (CI) testing. All pull requests must pass all tests before they can be merged in to the ``develop`` branch. So, as discussed in today's lectures, we have an automated system that will run the tests whenever a pull request is issued and whenever changes are made to an existing pull request. Furthermore, our CI testing also checks to see if the code you have added is included in any tests. If it is not, then your pull request will fail the CI testing and it will not be merged. This is what you would see if you were to issue a pull request to the JEDI repositories on the `JCSDA GitHub organization `_. What you will not see is a second round of CI testing that occurs internally at JCSDA. This will run more comprehensive tests across different compiler and MPI implementations, including GNU/OpenMPI, CLANG/MPICH, and INTEL/IMPI. If your code fails any of these tests, you would be informed in the pull request discussion thread. .. _pull_request_code_review: Step 5: Engage in the Code Review Process ========================================= Check your email (the email that is linked to your GitHub account). You should soon see emails that request you to review pull requests from other Padawans. Follow the link and look over their code changes. You can also access your review requests by going to the top of any GitHub page and selecting ``Pull Requests`` from the menu bar. Then, on that page, select ``review Requests``. Again, as with pull requests, we take code reviews very seriously and we encourage you to read our documentation on `reviewing code `_ carefully before proceeding. When you navigate to the pull request web page, you'll see a few components. The default landing page shows the author's title and description of the code changes and a discussion thread from various reviewers. You'll see this page listed as **Conversation** in a menu bar right above the description. On the right end of that menu bar, you'll see an item called **Files changed**. Select that to see the code changes being proposed. Now examine the code carefully; be critical, but polite! Keep in mind the common goal we all have of making the code better. Some questions you may want to consider as you go through the code include: * Is it clear from the title and description what is being done and why? * Can the desired goal be achieved in a different way that is more readable, more efficient, or more generic? * Is there extraneous code that should be removed, such as print statements used for debugging or unnecessary ``include`` statements? * Consider the code that has been added or modified. Is it executed by any existing or new tests? Is its functionality properly tested? * Does it pass all tests (this would be more readily assessed with an actual pull request to the JCSDA repositories due to the CI infrastructure; see comments at the end of Step 3 above)? * Has the author added doxygen comments in the source code to explain aspects of the new or modified code, covering for example, usage, background, and known bugs or limitations? If these or other questions enable you to identify aspects of the proposed code changes that could be improved, then let the author know. If there is a particular line of code that you want to comment on, move your cursor just to the left of it and hover. This should bring up a blue box with a plus sign. Selecting this box will allow you to comment. And/or, if you have general comments or questions on the approach or strategy, as opposed to specific lines of code, you can go back the the **Conversation** page, scroll down to the bottom of the discussion thread, and enter your comments there. Meanwhile, you may receive comments and suggestions from others who are reviewing your code. You can respond to these comments directly on GitHub. But, if code changes are requested, you should go back to your AWS node and edit the code in your branch there. You may need to re-compile your code and re-run the tests. When you have modified the code to address the review comments, you can push the changes to GitHub with the following commands: .. code-block:: bash git add * git commit git push myfork The reviewers will now be able to see your changes. You can repeat this process indefinitely as more comments and suggestions come in. Eventually, when reviewers are satisfied with your changes, they will approve your pull request. Likewise, when you are satisfied with the pull requests that you are reviewing, you can formally approve the changes. To approve a pull request, you can go to the **Files changed** section of the pull request web page. There you will see a green drop-down menu on the upper left labelled **Review changes**. With that, you can comment, request changes, or approve. You can also approve with a comment. Select the appropriate item from the list and submit your review with the green button at the bottom of the box. All JEDI pull requests require at least two approvals from reviewers before they can be merged. Normally, a repository administrator would merge your PR after it has received at least two approvals. However, in this artificial Academy situation, merging the PRs from all the Padawans would likely lead to conflicts, since everybody is changing the same code. So, we will skip that merge step today. .. note:: If you were able to create a pull request but your were not able to add reviewers or if you have not received any requests to review others' code, then this may be because you don't have write access to the repository. In order to serve as a reviewer, you need to be explicitly added to the repository as an external collaborator. Earlier in the week you should have received an email inviting you to join the repository as an external collaborator. If you have not received any review requests, try to find that email and click on the link to accept the invitation. In order for this to work, you need to be logged into GitHub. If you are still having problems serving as a reviewer, consult with a JEDI master. Step 6: Delete your Academy GitHub Token ======================================== At some point, after you are done with the practical (your AWS node does not need to be up), you should delete the GitHub access token you created in Step 2. To delete your token, log in to your GitHub account and, as before, access your account ``Settings`` through the drop-down menu you see by selecting your account icon on the upper right of any GitHub page. Then select ``Developer Settings`` and ``Personal access tokens`` from the menus on the left. Finally, select the ``Delete`` button next to the ``JEDI Academy`` token.