Biostatistical Computing, PHC 6068
GitHub
Zhiguang Huo (Caleb)
Monday October 1, 2018
Outline
- Why we need Git and GitHub
- Basic Git commands
- How to host R packages on GitHub
- More on GitHub commands
- More on hosting R packages on GitHub
Why Git 1 (Version control)
- Your R package is working
- You make a small change to the R package
- The R package breaks
- You change it back to the original package
- The R package is still broken
Why Git 2 (Version control)
- Your R package worked very well yesterday
- You made a lot of improvements last night
- but you haven’t gotten them to work yet
- The homework is due today
Why Git 3 (Version control)
You worked on a collaborative project with your teammates
- You make a lot of improvements to some functions.
- Your teammate makes a lot of improvements to the same functions.
- How do you merge these improvements together?
Why Git 4 (Version control)
You worked on a collaborative project with your teammates
- You change one part of the R package, it works
- Your teammate changes another part, it works
- When you put them together, it doesn’t work
- Some change in one part must have broken something in the other part.
Benefits of version control
- For yourself:
- Go back to a previous version of your code.
- Support multiple versions of the same basic project
- Collaborative project:
- Simplifies concurrent work and merging changes
A typical Git (version control) workflow
Basic Git commands
- Start Git
- Windows: open Git Shell
- Mac: open terminal
- Set up username
git config --global user.name "Sam Smith"
git config --global user.email sam@example.com
- You only need to do this once
The repository
- The repository contains everything about your project
- code
- data
- documentation
- .git, which is a hidden folder
- At any time, you can take a snapshot of the repository
- The snapshot is called a commit object
- You can revisit the snapshot at any time
Initialize a git repository
cd Desktop
mkdir testGit
cd testGit
git init
- initialize on GitHub and clone to your PC
- We will revisit this item shortly
Make changes in the git repository
- Your changes: happen in your working directory (untracked)
- Index/Stage your changes:
git add XXX
- Commit the indexed/staged changes to the git repository
git commit -m "your message"
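The point that a commit records only what has been staged can be checked with a small sketch. The scratch directory and the file names a.txt and b.txt are hypothetical, and the identity is set for this repository only so that global settings stay untouched.

```shell
set -e
repo=$(mktemp -d)                      # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Sam Smith"       # repo-local identity
git config user.email sam@example.com

echo "one" > a.txt                     # two changes in the working directory
echo "two" > b.txt

git add a.txt                          # stage only a.txt
git commit -q -m "add a.txt"           # commit what is staged

committed=$(git ls-tree --name-only HEAD)   # files stored in the snapshot
leftover=$(git status --porcelain)          # what Git still sees as uncommitted
echo "committed: $committed"
echo "left over: $leftover"
```

Here b.txt stays in the working directory as an untracked file until it is added and committed as well.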
Example to make change in git repository (R part)
WD <- '~/Desktop'
setwd(WD)
devtools::create("GatorPKG")
WD2 <- '~/Desktop/GatorPKG'
setwd(WD2)
- Put f.R in the R folder
##' Add up two numbers (Description)
##'
##' We want to add up two numbers, blalala... (Details)
##' @title add two numbers
##' @param x first number
##' @param y second number
##' @return sum of two numbers
##' @author Caleb
##' @export
##' @examples
##' f(1,2)
f <- function(x, y) x + y
Example to make change in git repository (git part)
- Initialize git (in git bash)
cd ~/Desktop/GatorPKG
git init
- Check the status of files in the working directory
git status
git add -A
git status
git commit -m "first commit"
git status
git status
- Tells what Git thinks is going on
- Do this frequently!
- before staging your workspace
- after staging your workspace, before committing
- after committing
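The three checkpoints above can be sketched as follows, using the machine-readable `--porcelain` form of `git status` so the result is easy to capture. The scratch directory is an assumption of this example, not part of the lecture setup.

```shell
set -e
repo=$(mktemp -d)                        # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Sam Smith"         # repo-local identity
git config user.email sam@example.com

echo 'f <- function(x, y) x + y' > f.R

before_staging=$(git status --porcelain)   # "?? f.R": untracked
git add f.R
after_staging=$(git status --porcelain)    # "A  f.R": staged, not committed
git commit -q -m "first commit"
after_commit=$(git status --porcelain)     # empty: working directory is clean

echo "before staging: $before_staging"
echo "after staging:  $after_staging"
echo "after commit:   '$after_commit'"
```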
git add
git add newFile
git add newFile1 newFile2 newFile3
git add -A
git rm --cached <file>..
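A minimal sketch of how these commands change what is staged, assuming a scratch repository with three empty placeholder files. `git diff --cached --name-only` is used here only to list the staged files.

```shell
set -e
repo=$(mktemp -d)                  # throwaway directory (assumption)
cd "$repo"
git init -q
touch newFile1 newFile2 newFile3   # placeholder files

git add newFile1                   # stage a single file
git add newFile2 newFile3          # stage several files at once
git rm -q --cached newFile3        # unstage newFile3; the file stays on disk

staged_now=$(git diff --cached --name-only | wc -l)

git add -A                         # stage everything in the working directory
staged_all=$(git diff --cached --name-only | wc -l)
echo "staged after rm --cached: $staged_now, after add -A: $staged_all"
```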
git commit
- committing makes a snapshot of everything that has been staged in your repository
- a short message is necessary
git commit -m "any message you want to make here"
- The message will be helpful for future you and your teammates
- If you don’t type in a message, an editor will open for you to enter the message
git commit
- After you commit your changes to git, it creates a commit object
- You can view all commit objects by
git log
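As a small illustration, two commits produce two commit objects in the log. The file name notes.txt and the scratch directory are hypothetical, and `--no-pager` is used only so that the output is printed directly.

```shell
set -e
repo=$(mktemp -d)                        # throwaway directory (assumption)
cd "$repo"
git init -q
git config user.name "Sam Smith"         # repo-local identity
git config user.email sam@example.com

echo "a" > notes.txt
git add notes.txt
git commit -q -m "first commit"
echo "b" >> notes.txt
git commit -q -a -m "second commit"      # -a stages tracked files that changed

git --no-pager log --oneline             # one line per commit object
n_commits=$(git rev-list --count HEAD)   # number of commits reachable from HEAD
latest_msg=$(git log -1 --pretty=%s)     # message of the most recent commit
sha=$(git rev-parse HEAD)                # full identifier of the latest commit
echo "$n_commits commits, latest: $latest_msg"
```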
Connect the local repository with GitHub
- Benefit
- You can work anywhere with any computer, as long as you can pull/fetch your project from the remote.
- You and your teammates can work on the same project from the same remote.
More on the remote
- GitHub:
- GitHub provides free repositories, provided that you make your project public.
- GitHub provides individual accounts and organizational accounts.
- Academic users (you need to apply) can have free private repositories.
- Usually a 1 GB limit per repository.
- BitBucket:
- Similar to GitHub, but provides free private repositories.
Set up GitHub and connect with your local repository
- Create a new repository on GitHub (GatorPKG).
- To avoid errors, do not initialize the new repository with README, license, or gitignore files.
- You can add these files after your project has been pushed to GitHub.
- connect your local repository with GitHub
git remote add origin https://github.com/Caleb-Huo/GatorPKG.git
- Push your changes in local repository to GitHub
git push --set-upstream origin master ## needed only for the first push
git push ## for later pushes
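The connect-and-push steps can be tried without a GitHub account. In this sketch a local bare repository stands in for the empty GitHub repository (an assumption of the example); with GitHub, the URL from the slide replaces the local path.

```shell
set -e
work=$(mktemp -d)                            # throwaway directory (assumption)
git init -q --bare "$work/remote.git"        # stand-in for the empty GitHub repository

git init -q "$work/GatorPKG"
cd "$work/GatorPKG"
git symbolic-ref HEAD refs/heads/master      # make sure the branch is called master
git config user.name "Sam Smith"             # repo-local identity
git config user.email sam@example.com
echo "Package: GatorPKG" > DESCRIPTION       # placeholder content
git add -A
git commit -q -m "first commit"

git remote add origin "$work/remote.git"     # connect the local repository to the remote
git push -q --set-upstream origin master     # first push: also records the upstream branch

local_head=$(git rev-parse HEAD)
remote_head=$(git --git-dir="$work/remote.git" rev-parse refs/heads/master)
echo "local:  $local_head"
echo "remote: $remote_head"
```

After the first push the two identifiers agree, and a plain `git push` is enough from then on.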
Exercise: put the R package on GitHub
devtools::document()
- add, commit, and push back to GitHub
git add -A
git commit -m "1st R package"
git push
git log
You will see a SHA1 number (Secure Hash Algorithm 1); this is the identifier of a commit object.
- Install the package from GitHub
devtools::install_github("Caleb-Huo/GatorPKG")
library(GatorPKG)
f(1,2)
?f
More on the remote
Two ways to connect your PC with the remote repository
- If you already have a copy of the remote repository
- If you don’t have a copy of the remote repository
Add README.md on GitHub
- md stands for Markdown
- .md is very similar to .Rmd
- except it won’t evaluate R code
# testGatorPKG
This is a fancy R package
## how to install the R package
devtools::install_github("Caleb-Huo/GatorPKG")
## example
f(1,2)
- now the remote repository is ahead of our local repository
Keep updated from the remote
- If the repository on your local PC already refers to the remote
- Download objects and refs from another repository.
git fetch
- Fetch from and integrate with another repository or a local branch
git pull
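The difference between the two commands can be seen in a sketch with two local copies of one remote. A local bare repository stands in for GitHub, and the directory names first and second are hypothetical. `--ff-only` is added to `git pull` so that it only fast-forwards.

```shell
set -e
work=$(mktemp -d); cd "$work"                  # throwaway directory (assumption)

# A bare repository stands in for GitHub in this sketch.
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/master

# First copy: create a commit and push it to the remote.
git init -q first; cd first
git symbolic-ref HEAD refs/heads/master
git config user.name "Sam Smith"; git config user.email sam@example.com
echo "v1" > README.md
git add -A; git commit -q -m "add README"
git remote add origin "$work/remote.git"
git push -q --set-upstream origin master

# Second copy: clone it, as a teammate or another computer would.
cd "$work"; git clone -q remote.git second

# The first copy pushes again, so the remote is now ahead of the clone.
cd "$work/first"
echo "v2" > README.md
git commit -q -a -m "update README"; git push -q

cd "$work/second"
git fetch -q                                          # download only
behind=$(git rev-list --count HEAD..origin/master)    # commits not yet integrated
after_fetch=$(cat README.md)                          # working files are unchanged
git pull -q --ff-only                                 # fetch and integrate
after_pull=$(cat README.md)
echo "behind after fetch: $behind, README: $after_fetch -> $after_pull"
```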
Clone an existing remote repository to your local PC
git clone https://github.com/cran/mclust.git
If you are the owner of the origin remote repository, you are able to push back.
If you are not the owner, you won’t be able to push back.
Fork a repository on your GitHub account
- Click on Fork, the repository will be copied to your GitHub account
- Then you can do anything you want on the forked repository
Initialize the repository from GitHub
- initialize the new repository with:
- README
- license
- gitignore files
git clone https://github.com/Caleb-Huo/GatorPKG2.git
- make some changes, add, commit, and push back
git add -A
git commit -m "touch"
git push
Create another branch
- By default, we are on the master branch
- Now we want to create some new features on the develop/feature branch
- Create a new branch and switch to it:
git checkout -b <branchname>
- Switch from one branch to another:
git checkout <branchname>
- List all branches and show which one you are on:
git branch
- Delete a branch:
git branch -d <branchname>
- Push the branch to the remote
git push origin <branchname>
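The local branch commands can be sketched in a scratch repository. The branch name feature is hypothetical, and an empty commit is used only because a branch needs a commit to point to.

```shell
set -e
repo=$(mktemp -d); cd "$repo"               # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Sam Smith"; git config user.email sam@example.com
git commit -q --allow-empty -m "first commit"

git checkout -q -b feature                  # create a new branch and switch to it
current=$(git rev-parse --abbrev-ref HEAD)  # name of the branch we are on

git checkout -q master                      # switch back
git --no-pager branch                       # list branches; * marks the current one
git branch -q -d feature                    # delete; allowed because nothing is unmerged

remaining=$(git --no-pager branch --list | wc -l)
echo "was on: $current, branches left: $remaining"
```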
Git branches
- master
- hotfixes
- release branches
- develop
- feature branches
Exercise for creating another branch
git checkout -b vignette
devtools::use_vignette("tutorial")
Compare with another commit object
- compare all changes in the working directory with the last commit
git diff HEAD
- compare the changes for a specific file in the working directory with the last commit
git diff HEAD -- <filename>
- compare two commits by their SHA1 numbers
git diff aSHA1 bSHA1
- compare two branches
git diff <sourcebranch> <targetbranch>
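A short sketch of what each comparison reports, assuming a scratch repository with one hypothetical file notes.txt. It also shows that a plain `git diff` compares against the staging area, so it goes quiet once a change is staged, while `git diff HEAD` keeps comparing against the last commit.

```shell
set -e
repo=$(mktemp -d); cd "$repo"                 # throwaway directory (assumption)
git init -q
git config user.name "Sam Smith"; git config user.email sam@example.com

echo "line 1" > notes.txt
git add notes.txt; git commit -q -m "v1"
first=$(git rev-parse HEAD)                   # SHA1 of the first commit

echo "line 2" >> notes.txt
unstaged=$(git diff --name-only)              # working directory vs staging area
git add notes.txt
after_add=$(git diff --name-only)             # empty: the change is now staged
vs_head=$(git diff --name-only HEAD)          # working directory vs last commit

git commit -q -m "v2"
between=$(git diff --name-only "$first" HEAD) # one commit vs another
echo "unstaged: $unstaged | after add: '$after_add' | vs HEAD: $vs_head | between: $between"
```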
Git merge (when there are no conflicts)
- keep merging history
- If you want to merge branch B (vignette) into branch A (master)
- Go to branch A (master)
- merge with --no-ff
git checkout master
git merge --no-ff vignette
git log
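What "keep merging history" means can be checked in a scratch repository: with `--no-ff` the merge is recorded as its own commit with two parents, even when a fast-forward would have been possible. The file tutorial.Rmd is a placeholder, and `-m` is passed to `git merge` only to avoid opening an editor.

```shell
set -e
repo=$(mktemp -d); cd "$repo"               # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Sam Smith"; git config user.email sam@example.com
git commit -q --allow-empty -m "first commit"

git checkout -q -b vignette                 # branch B
echo "tutorial" > tutorial.Rmd              # placeholder file
git add -A; git commit -q -m "add tutorial"

git checkout -q master                      # go to branch A
git merge -q --no-ff -m "merge vignette" vignette

parents=$(git log -1 --pretty=%P | wc -w)   # a merge commit has two parents
total=$(git rev-list --count HEAD)          # first commit + tutorial + merge commit
echo "parents of last commit: $parents, commits on master: $total"
```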
Set an alias for a Git command
- my favorite colorful logs
git config --global alias.lg "log --color --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --"
git lg
Git merge (when there are conflicts)
- make some changes in the master branch
- make some changes (with conflicts) in the vignette branch
- notice that after you switch to the vignette branch, the changes committed on the master branch disappear from the working directory
git checkout master
git merge --no-ff vignette ## fails because of the conflict
git diff ## check the difference
## open the conflict file, resolve conflicts
git commit -a -m "conflict resolved"
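The whole conflict cycle can be rehearsed in a scratch repository. In this sketch both branches edit the same line of a hypothetical file notes.txt, and the "manual" resolution is simulated by overwriting the file with the text we decide to keep.

```shell
set -e
repo=$(mktemp -d); cd "$repo"               # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Sam Smith"; git config user.email sam@example.com
echo "version: base" > notes.txt
git add -A; git commit -q -m "base"

git checkout -q -b vignette
echo "version: vignette" > notes.txt
git commit -q -a -m "edit on vignette"

git checkout -q master
echo "version: master" > notes.txt
git commit -q -a -m "edit on master"

# The merge stops because both branches changed the same line.
if git merge --no-ff vignette >/dev/null 2>&1; then merged=yes; else merged=no; fi
conflicted=$(git diff --name-only --diff-filter=U)   # files with unresolved conflicts

# Resolve: replace the file (which now contains conflict markers) with the final text.
printf 'version: master\nversion: vignette\n' > notes.txt
git commit -q -a -m "conflict resolved"              # concludes the merge

parents=$(git log -1 --pretty=%P | wc -w)
echo "merged cleanly: $merged, conflicted file: $conflicted, parents: $parents"
```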
Display HTML on GitHub
- the html file won’t automatically show up in a webpage
- need the following trick
Go back to a previous snapshot
Go back to the previous snapshot by SHA1 number
git checkout -b old-state f0d8506
git branch
- To go back to where you were, just check out the branch you were on again.
git checkout master
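A minimal sketch of visiting an old snapshot and coming back, assuming a scratch repository in which a hypothetical file f.txt has two versions. The SHA1 is read from the repository instead of being typed in.

```shell
set -e
repo=$(mktemp -d); cd "$repo"               # throwaway directory (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master     # make sure the branch is called master
git config user.name "Sam Smith"; git config user.email sam@example.com

echo "v1" > f.txt
git add -A; git commit -q -m "version 1"
old=$(git rev-parse --short HEAD)           # abbreviated SHA1 of the first snapshot
echo "v2" > f.txt
git commit -q -a -m "version 2"

git checkout -q -b old-state "$old"         # new branch starting at the old snapshot
at_old=$(cat f.txt)                         # the file as it was in version 1
git checkout -q master                      # back to where we were
at_new=$(cat f.txt)                         # the latest version again
echo "old-state has $at_old, master has $at_new"
```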
Delete commits
Not recommended, unless you don’t want other people to see your code history.
.gitignore
- a hidden file
- allows you to control what files shouldn’t be tracked by git
- e.g., add the following in .gitignore
GatorPKG.Rproj
.DS_Store
*/.DS_Store
- .DS_Store files are unwanted files created by macOS
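The effect of these patterns can be checked in a scratch repository. The file layout below is hypothetical and mimics a small package; `git check-ignore` reports whether a given path is covered by an ignore rule.

```shell
set -e
repo=$(mktemp -d); cd "$repo"               # throwaway directory (assumption)
git init -q

printf 'GatorPKG.Rproj\n.DS_Store\n*/.DS_Store\n' > .gitignore
mkdir R
touch GatorPKG.Rproj .DS_Store R/.DS_Store R/f.R   # placeholder files

# Only files that are not ignored show up as untracked.
visible=$(git status --porcelain --untracked-files=all)
echo "$visible"

# Exit status 0 means the path is ignored.
if git check-ignore -q R/.DS_Store; then nested_ignored=yes; else nested_ignored=no; fi
n_visible=$(printf '%s\n' "$visible" | wc -l)
```

Only .gitignore itself and R/f.R remain visible to Git. A pattern without a slash, such as .DS_Store, already matches at every directory level, so the `*/.DS_Store` line is an extra safeguard.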
Commonly used git commands
- GitHub cheat sheet
- Another summary of Git commands
Reference:
Much of the content is from the following resources: