Biostatistical Computing, PHC 6068
GitHub
Instructor: Zhiguang Huo (Caleb)
Guest Speaker: Shangchen Song
Monday October 4, 2021
Outline
- Git
- Version control
- Basic commands about Git
- GitHub
- Introduction
- GitHub commands
- Host R packages on GitHub
Git
- Git is a distributed version-control system for tracking changes in source code during software development.
- It is designed for coordinating work among programmers.
- It can be used to track changes in any set of files.
Benefits of version control
- For yourself:
- Keep complete history of changes, and rationale for all changes
- Go back to a previous version of your code
- Support multiple versions of the same basic project
- Collaborative project:
- Simplifies concurrent work and merging changes
- You can back up your code remotely (on GitHub), where it can be easily distributed
Setting up Git
- Install Git
- Start Git
- Windows: Git Bash
- Mac: Terminal
- RStudio: there is a terminal window as well (handy for R)
- Set up a user.name and user.email (if this is your first time using Git)
- You only need to do this once
git config --global user.name "Sam Smith"
git config --global user.email sam@example.com
git config --list
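The same settings can also be stored for a single repository by dropping --global, which is handy when one project needs a different identity. A minimal sketch, assuming a throwaway repository created with mktemp (the directory and the values are illustrative only):

```shell
# Create a scratch repository so nothing else on your machine is changed.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Without --global, the setting is written to this repository's .git/config only.
git config user.name "Sam Smith"
git config user.email sam@example.com

# Read the values back; repository-level values take precedence over global ones.
git config user.name
git config user.email
```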
Git basics
- The Git directory (repository)
- stores the metadata and object database for your project.
- what is copied when you clone a repository from another computer.
- offers snapshots (commits) of your project history. You can go back to them later.
- The working directory
- the snapshot you are currently working on
- a single checkout of one version of the project.
- its files are checked out from the Git directory, which is also how you update it.
- The staging area
- stores information about what will go into your next commit: the list of changed files that will be committed to the repository
- sometimes referred to as the index
- intermediate step between the working directory and the Git directory
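The three areas above can be inspected directly. The sketch below is only an illustration in a scratch repository; the file names f.R and g.R and their contents are hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Working directory: two new files that Git does not track yet.
echo 'f <- function(x, y) x + y' > f.R
echo 'g <- function(x) x^2' > g.R

# Staging area (index): only f.R is selected for the next commit.
git add f.R
staged=$(git ls-files)   # lists the files currently recorded in the index

# Git directory: the hidden .git folder that holds the object database.
ls -d .git
echo "staged: $staged"
```

Here g.R exists in the working directory but is absent from the index, so it would not be part of the next commit.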
Git workflow
- You modify files in your working directory.
- You stage the files, adding snapshots of them to your staging area (index).
git add XXX
- You do a commit, which takes the files in the staging area as snapshots to the Git directory.
git commit -m "your message"
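Put together, the modify, stage, commit cycle might look like the following sketch. It runs in a scratch repository, and the identity, file names, and messages are placeholders.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"        # identity for this scratch repository only
git config user.email sam@example.com

# 1. Modify files in the working directory.
echo 'f <- function(x, y) x + y' > f.R
# 2. Stage the change.
git add f.R
# 3. Commit the staged snapshot to the Git directory.
git commit -q -m "add f.R"

# Repeat the same cycle for a second change.
echo 'g <- function(x) x^2' > g.R
git add g.R
git commit -q -m "add g.R"

commits=$(git rev-list --count HEAD)    # number of commits reachable from HEAD
echo "commits so far: $commits"
```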
Example: making changes in a Git repository (using R code as an example)
- Initialize an R package (macOS/Linux)
WD <- '~/Desktop'
setwd(WD)
usethis::create_package("GatorPKG", open = FALSE) ## open = FALSE prevents RStudio from opening a new session.
WD2 <- '~/Desktop/GatorPKG'
setwd(WD2)
- Initialize an R package (Windows)
WD <- 'C:/Users/ss/Desktop/'
setwd(WD)
usethis::create_package("GatorPKG", open = FALSE) ## open = FALSE prevents RStudio from opening a new session.
- Put f.R in the R folder
##' Add up two numbers (Description)
##'
##' We want to add up two numbers, blalala... (Details)
##' @title add two numbers
##' @param x first number
##' @param y second number
##' @return sum of two numbers
##' @author Caleb
##' @export
##' @examples
##' f(1,2)
f <- function(x, y) x + y
Example: making changes in a Git repository (Git part)
cd ~/Desktop/GatorPKG
git init
- Check the status of files in the working directory
git status
git add -A ## -A stages all changes (new, modified, and deleted files)
git status
git commit -m "first commit"
git status
git status
- Tells what Git thinks is going on
- Do this frequently!
- before staging your workspace
- after staging your workspace, before committing
- after committing
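As a rough illustration of those three checkpoints, the sketch below records the compact output of git status --short at each one. The scratch repository and the file f.R are placeholders.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"
git config user.email sam@example.com

echo 'f <- function(x, y) x + y' > f.R

before_staging=$(git status --short)   # "?? f.R": the file is untracked
git add f.R
after_staging=$(git status --short)    # "A  f.R": the file is staged as a new file
git commit -q -m "first commit"
after_commit=$(git status --short)     # empty: nothing left to commit

echo "before staging: $before_staging"
echo "after staging:  $after_staging"
echo "after commit:   $after_commit"
```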
git add
git add newFile ## stage a single file
git add newFile1 newFile2 newFile3 ## stage several files at once
git add -A ## stage all changes in the repository
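To make the difference concrete, here is a small sketch of what git add -A picks up: a modification, a deletion, and a new file in one go. The file names are made up for the example.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"
git config user.email sam@example.com

echo 'first file' > keep.txt
echo 'a completely different second file' > remove.txt
git add keep.txt remove.txt             # stage several named files at once
git commit -q -m "two files"

echo 'edited' >> keep.txt               # a modification
rm remove.txt                           # a deletion
echo 'brand new notes' > new.txt        # a new file

git add -A                              # stages all three kinds of change
staged=$(git status --short)
echo "$staged"
```

In the short status format, the first column shows the staged state: M for modified, A for added, D for deleted.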
git commit
- committing makes a snapshot of everything that has been staged in your repository
- a short message is necessary
git commit -m "any message you want to make here"
- The message is required and will be helpful for the future you and your teammates
- If you omit the message, Git opens an editor (usually vi) for you to enter one, and aborts the commit if the message is left empty. Vi is hard for beginners; see https://kb.iu.edu/d/afcz for how to quit vi, just in case.
- Each commit creates a commit object
- You can view all commit objects with
git log
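A brief sketch of inspecting commit objects, assuming a scratch repository with two placeholder commits. Each commit is identified by a hash, which is 40 hexadecimal characters under Git's default SHA-1 object format.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"
git config user.email sam@example.com

echo 'f <- function(x, y) x + y' > f.R
git add f.R
git commit -q -m "first commit"
echo 'g <- function(x) x^2' > g.R
git add g.R
git commit -q -m "second commit"

git log --oneline                       # one line per commit: short hash and message
full=$(git rev-parse HEAD)              # full hash of the latest commit
short=$(git rev-parse --short HEAD)     # abbreviated form of the same hash
echo "full:  $full"
echo "short: $short"
```

The abbreviated hash is simply a prefix of the full one, and either form can be used to refer to the commit.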
Other files in the repository
ls -a ## list all files, including hidden ones
More on git
- Git stores each unchanged file only once and compresses its objects, so frequent commits won’t waste space
- Just make commits often
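One way to see why frequent commits stay cheap: Git names stored content by its hash, so a file that does not change between commits points to the same stored object. The sketch below illustrates this with two placeholder files, a.txt and b.txt.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"
git config user.email sam@example.com

echo 'stable' > a.txt
echo 'draft' > b.txt
git add -A
git commit -q -m "first commit"

echo 'final' > b.txt                    # only b.txt changes
git commit -q -am "update b.txt"

# Object IDs of each file as recorded in the previous and the latest commit.
old_a=$(git rev-parse HEAD~1:a.txt)
new_a=$(git rev-parse HEAD:a.txt)
old_b=$(git rev-parse HEAD~1:b.txt)
new_b=$(git rev-parse HEAD:b.txt)
echo "a.txt: $old_a -> $new_a"
echo "b.txt: $old_b -> $new_b"
```

The ID for a.txt is identical in both commits, so its content is stored once; only b.txt gets a new object.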
So far, local version control
Distributed version control
- GitHub
- GitHub provides free repositories for both public and private projects.
- GitHub provides individual accounts and organizational accounts.
- Academic users (you need to apply) can have free private repositories. https://education.github.com/students
- Usually 1 GB limit per repository.
- https://github.com
- When you prepare R packages for CRAN or Bioconductor, you will need GitHub
- BitBucket
- Similar to GitHub; provides free private repositories as well.
- GitLab
Connect the local repository with GitHub
- Benefit
- You can work anywhere with any computer, as long as you can pull/fetch your project from the remote.
- You and your teammates can work on the same project from the same remote.
Set up GitHub and connect with your local repository (1)
Set up GitHub and connect with your local repository (2)
- clone to your local computer (e.g., desktop)
cd ~/Desktop
git clone https://github.com/lovestat/GatorPKG.git
Cloning creates your local repository with the GitHub remote (origin) already configured, so initialize your repository this way in order to connect to GitHub.
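The clone, commit, push round trip can be rehearsed offline. The sketch below uses a bare repository on the local disk as a stand-in for GitHub; this is an assumption made for the example, and the paths and the branch name main are placeholders.

```shell
work=$(mktemp -d)

# Stand-in for GitHub: a bare repository (history only, no working files).
git init -q --bare "$work/remote.git"

# Clone it; the warning about cloning an empty repository is silenced here.
git clone -q "$work/remote.git" "$work/GatorPKG" 2>/dev/null
cd "$work/GatorPKG"
git config user.name "Sam Smith"
git config user.email sam@example.com

git remote -v                           # the clone already knows its remote as "origin"

echo 'f <- function(x, y) x + y' > f.R
git add -A
git commit -q -m "first commit"
git branch -M main                      # make sure the branch is called main
git push -q -u origin main              # -u links local main to origin/main

local_head=$(git rev-parse HEAD)
remote_head=$(git -C "$work/remote.git" rev-parse main)
echo "local:  $local_head"
echo "remote: $remote_head"
```

After the first push with -u, a plain git push is enough for later commits on the same branch.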
Set up GitHub and connect with your local repository (3)
Put your code in the GatorPKG package
- You should have at least the following files or folders:
Set up GitHub and connect with your local repository (4)
- In R, generate the documentation for f.R
devtools::document()
- At the directory of your local repository (e.g., ~/Desktop/GatorPKG), do the following in the terminal
git add -A
git commit -m "1st R package"
git log
You will see a SHA-1 hash (Secure Hash Algorithm 1); this is the unique identifier of a commit object.
- Push the changes in your local repository to GitHub
git push
- Then, in R, install the package from GitHub and try it
devtools::install_github("lovestat/GatorPKG")
library(GatorPKG)
f(1,2)
?f
Your turn
Goals:
- make an R package on your GitHub repository
- install this package by devtools::install_github
Add README.md on GitHub
- md stands for Markdown
- .md is very similar to .Rmd
- except it won’t evaluate R code
- Below is a toy example of README.md
# testGatorPKG
This is a fancy R package
## how to install the R package
`devtools::install_github("lovestat/GatorPKG")`
## example
`f(1,2)`
- You can add it locally, and push back to GitHub, just like f.R
- Or you can directly edit on the remote, then fetch/pull to the local repository
More on the remote
Two ways to get your PC up to date with the latest remote repository
- If you already have a copy of the remote repository on your computer: fetch or pull
- If you don’t have a copy of the remote repository on your computer: clone
Keep updated with the remote
- Update your local repository
git fetch
git pull
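The difference between the two commands can be sketched offline: git fetch downloads new commits without touching your branch, while git pull also merges them in. The example assumes a local bare repository standing in for GitHub and two hypothetical clones named laptop and desktop.

```shell
work=$(mktemp -d)
git init -q --bare "$work/remote.git"                 # local stand-in for GitHub

# First copy: create a commit and publish it.
git clone -q "$work/remote.git" "$work/laptop" 2>/dev/null
cd "$work/laptop"
git config user.name "Sam Smith"
git config user.email sam@example.com
echo 'v1' > f.R
git add -A
git commit -q -m "v1"
git branch -M main
git push -q -u origin main

# Second copy: cloned while the remote only holds v1.
git clone -q --branch main "$work/remote.git" "$work/desktop"

# The laptop publishes a newer commit.
echo 'v2' > f.R
git commit -q -am "v2"
git push -q

# On the desktop, fetch downloads the commit but leaves main where it was.
cd "$work/desktop"
git fetch -q
local_after_fetch=$(git rev-parse main)
remote_after_fetch=$(git rev-parse origin/main)

# pull also merges; --ff-only refuses anything but a simple fast-forward.
git pull -q --ff-only
local_after_pull=$(git rev-parse main)
echo "after fetch: main=$local_after_fetch origin/main=$remote_after_fetch"
echo "after pull:  main=$local_after_pull"
```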
Clone an existing remote repository to your local PC
git clone https://github.com/cran/mclust.git
After making some changes to this package
If you are the owner/team member of the origin remote repository, you are able to push back.
If you are not the owner, you won’t be able to push back
Fork a repository on your GitHub account
- Click on Fork, the repository will be copied to your GitHub account
- Then you can do anything you want on the forked repository, under their license
Whole picture for Git operations
- So far we have covered initializing, updating, and making changes
- Next we will cover branching, then revert and diff
Git branches
- master
- hotfixes
- release branches
- develop
- feature branches
Basic commands for Branching
- By default, we are on the main/master branch
- Now we want to create some new features on the develop/feature branch
- Create a new branch and switch to it:
git checkout -b <branchname>
- Switch from one branch to another:
git checkout <branchname>
- List all branches and show which one you are on:
git branch
- Delete a branch:
git branch -d <branchname>
- Push the branch to the remote
git push origin <branchname>
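The local branching commands can be tried out in a scratch repository. In this sketch the branch name feature is a placeholder, and the push step is left out because it needs a remote.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"
git config user.email sam@example.com
echo 'f <- function(x, y) x + y' > f.R
git add -A
git commit -q -m "first commit"
git branch -M main                      # make sure the starting branch is called main

git checkout -q -b feature              # create a new branch and switch to it
current=$(git rev-parse --abbrev-ref HEAD)
git branch                              # the current branch is marked with *

git checkout -q main                    # switch back
git branch -d feature                   # delete; -d refuses if the branch has unmerged work
remaining=$(git branch --list feature)
echo "was on: $current"
```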
Exercise for creating an additional branch
git checkout -b vignette
- In R, create a vignette
usethis::use_vignette("GatorPKG")
- Back in the terminal, stage and commit it
git add -A
git commit -m "include vignette"
- Check all available branches (your current branch has a * on it)
git branch
- Switch back to the main branch
git checkout main
Git revert
Go back to a previous snapshot by checking out its SHA-1 hash (here f0d8506) on a new branch
git checkout -b old-state f0d8506
git branch
- To go back to where you were, just check out the branch you were on again.
git checkout main
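A self-contained sketch of the same idea: the hash of the first commit is captured in a variable, which plays the role of f0d8506 above, and the old snapshot is checked out on a new branch. The file notes.txt is hypothetical.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"
git config user.email sam@example.com

echo 'version 1' > notes.txt
git add -A
git commit -q -m "v1"
git branch -M main
first=$(git rev-parse --short HEAD)     # hash of the snapshot we will return to

echo 'version 2' > notes.txt
git commit -q -am "v2"

git checkout -q -b old-state "$first"   # new branch starting at the old commit
old=$(cat notes.txt)                    # the file as it was in the first commit

git checkout -q main                    # back to where we were
new=$(cat notes.txt)
echo "old-state: $old / main: $new"
```

Nothing is lost by looking at the old state: main still points to the latest commit.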
Git merge (when there are no conflicts)
- --no-ff keeps the merge history by always creating a merge commit
- If you want to merge branch B (vignette) into branch A (main)
- Go to branch A (main)
- merge with --no-ff
git checkout main
git merge --no-ff vignette
git log
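The effect of --no-ff can be checked in a scratch repository: the merge produces a commit with two parents even though a fast-forward would have been possible. The file names and the merge message are placeholders.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"
git config user.email sam@example.com
echo 'f <- function(x, y) x + y' > f.R
git add -A
git commit -q -m "first commit"
git branch -M main

git checkout -q -b vignette             # do some work on a side branch
echo 'How to use GatorPKG' > vignette.txt
git add -A
git commit -q -m "include vignette"

git checkout -q main
git merge -q --no-ff -m "merge vignette" vignette   # -m avoids opening an editor

parents=$(git log -1 --pretty=%P)       # parent hashes of the merge commit
git log --oneline --graph               # the side branch stays visible in the graph
echo "parents: $parents"
```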
Compare with another commit object
- compare unstaged changes in the working directory with the staging area (use git diff HEAD to compare with the last commit)
git diff
- show the changes for a specific file in the working directory
git diff -- <filename>
- compare two commits by their SHA-1 hashes
git diff aSHA1 bSHA1
- compare two branches
git diff <sourcebranch> <targetbranch>
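As a small illustration of comparing two commits, the sketch below records two hashes and asks which files differ between them. The repository and the files f.R and g.R are placeholders.

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Sam Smith"
git config user.email sam@example.com

echo 'f <- function(x, y) x + y' > f.R
echo 'g <- function(x) x^2' > g.R
git add -A
git commit -q -m "first commit"
old=$(git rev-parse HEAD)

echo 'f <- function(x, y, z = 0) x + y + z' > f.R
git commit -q -am "change f.R"
new=$(git rev-parse HEAD)

changed=$(git diff --name-only "$old" "$new")   # only the names of changed files
git diff --stat "$old" "$new"                   # a per-file summary of the change
echo "changed: $changed"
```

Branch names can be used in place of the two hashes in the same way.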
Set an alias for a Git command
- my favorite colorful logs
git config --global alias.lg "log --color --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --"
git lg
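Aliases do not have to be global or elaborate. A minimal sketch, assuming a scratch repository and a hypothetical alias named st, stored for that repository only:

```shell
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Without --global, the alias lives in this repository's .git/config.
git config alias.st "status --short"

echo 'scratch' > new.txt
out=$(git st)                           # runs: git status --short
echo "$out"
```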
Add R CMD check on GitHub
- Running the following code sets up a GitHub Action (a tool that automates software workflows) for R CMD check.
- No need to run R CMD check on your computer.
- It will automatically run R CMD check for each commit pushed to GitHub.
- In detail, this workflow runs R CMD check via the rcmdcheck package on the three major operating systems (Linux, macOS, and Windows) on the latest release of R and on R-devel. This workflow is appropriate for a package that is (or will hopefully be) on CRAN or Bioconductor.
usethis::use_github_action_check_standard()
Add README badges/icons for your package
- The badges can be informative regarding your package status
- Run the following code in your package project and do as the output suggests
usethis::use_cran_badge()
usethis::use_lifecycle_badge("stable")
usethis::use_github_actions_badge(name = "R-CMD-check", repo_spec = NULL)
Build a package website for your package (1)
usethis::use_pkgdown()
pkgdown::build_site()
usethis::use_github_action("pkgdown")
git add -A
git commit -m "add pkgdown"
git push
- usethis::use_github_action("pkgdown") could take a few minutes to complete.
- After usethis::use_github_action("pkgdown") is finished, there will be an additional branch called gh-pages.
Build a package website for your package (2)
- Open Settings in the repository GatorPKG page
- Select Pages in the sidebar
- Under the Source section, select Branch: gh-pages and /(root). Then hit Save
- Wait a few minutes, then the package website will be published at the URL shown on that page. The URL can be changed in Settings.
Display HTML on GitHub
- an HTML file in a repository won’t automatically be rendered as a webpage
- need the following trick
Use GitHub as a raw code repository
address <- "https://raw.githubusercontent.com/lovestat/misc-code/main/emojiFace.R"
source(address, echo = TRUE)