
Streamlining Your Workflow: Linking Git/GitHub with R Studio for Efficient Version Control

The Beginner’s Guide to Linking Git/GitHub with R Studio:

Now that you have both R Studio and Git set up on your computer, along with a GitHub account, it’s time to link them together so that you can get the full benefit of R Studio in your version control workflow.

First we will link R Studio and Git, and then R Studio and GitHub. We will also link an existing Project with Git and GitHub.



Linking R Studio and Git

In R Studio, go to Tools > Global Options > Git/SVN

Use the Global Options menu to tell R Studio you are using Git as your version control system

Sometimes the default path to the Git executable is not correct. Confirm that the Git executable (git.exe on Windows, git on macOS and Linux) resides in the directory that R Studio has specified; if it does not, change the path to the correct one. Then click OK or Apply.

Confirm that the directory R Studio points to for the Git executable is correct
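
If you are unsure where Git is installed, you can ask the shell. The following is a minimal sketch, assuming you run it in Git Bash or Terminal and that Git is on your PATH; the variable name git_path is only illustrative. Note that Git Bash on Windows reports the location in Unix style, so you may need to translate it to the matching Windows path before entering it in R Studio.

```shell
# Print the full path of the Git executable found on your PATH.
git_path="$(command -v git)"
echo "Git executable: $git_path"

# Print the installed Git version to confirm the executable actually runs.
git --version
```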

R Studio and Git are now linked.

Linking R Studio and GitHub

In that same R Studio option window, click “Create RSA Key” and when this completes, click “Close.”

Following this, in that same window again, click “View public key” and copy the string of numbers and letters. Close this window.

Generate an RSA key and copy the public key to your clipboard

You have now created a key that is specific to you. We will provide it to GitHub so that it knows who you are when you push changes from within R Studio.
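
For reference, roughly the same thing can be done from the command line with ssh-keygen. This is a sketch under stated assumptions, not what R Studio runs internally: the key is written to a throwaway directory so that no existing key is overwritten, the passphrase is left empty, and the e-mail label is a placeholder.

```shell
# Work in a throwaway directory so no existing key in ~/.ssh is overwritten.
key_dir="$(mktemp -d)"

# Generate a 4096-bit RSA key pair with an empty passphrase (-N "").
# The comment (-C) is a hypothetical label; use your own e-mail address.
ssh-keygen -q -t rsa -b 4096 -N "" -C "you@example.com" -f "$key_dir/id_rsa"

# The .pub file holds the public key: this is the text you paste into GitHub.
cat "$key_dir/id_rsa.pub"
```

The file without the .pub extension is the private key and should never be shared or pasted anywhere.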

To do so, go to github.com, log in if you have not already, and go to your account settings. There, go to “SSH and GPG keys” and click “New SSH key”. Paste the public key you copied from R Studio into the Key box and give it a Title related to R Studio. Confirm the addition of the key with your GitHub password.

Location of “SSH and GPG keys” on your profile settings
Telling GitHub the public SSH key generated in R Studio

GitHub and R Studio are now linked. From here, we can create a repository on GitHub and link to R Studio.

Create a new repository and edit it in R Studio

On GitHub, create a new repository (github.com > Your Profile > Repositories > New). Name your new test repository and give it a short description. Click Create repository. Copy the URL for your new repository.

Location of the “Repositories” link on your profile
Creating a new repository on GitHub

In R Studio, go to File > New Project. Select Version Control, then select Git as your version control software. Paste in the repository URL from before and select the location where you would like the project stored. When done, click “Create Project”. Doing so will initialize a new project, linked to the GitHub repository, and open a new session of R Studio.

Creating a version controlled project on R Studio
Cloning your Git repository to R Studio
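
Behind the scenes, this step amounts to a git clone. The sketch below shows the idea with a stand-in: a bare repository in a temporary folder plays the role of the GitHub repository so the example runs without a network connection. The names work_dir and my-project are hypothetical; with a real repository you would pass the URL you copied from GitHub instead.

```shell
# Stand-in for the GitHub repository: a bare repository in a temporary folder.
# With a real repository you would use the URL copied from GitHub instead.
work_dir="$(mktemp -d)"
git init -q --bare "$work_dir/remote.git"

# Clone the repository into a local project folder (hypothetical name "my-project").
# Git may warn that the repository is empty; that is expected here.
git clone -q "$work_dir/remote.git" "$work_dir/my-project"

# The clone remembers where it came from under the remote name "origin".
git -C "$work_dir/my-project" remote -v
```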

Create a new R script (File > New File > R Script) and copy and paste the following code:

print("This file was created within R Studio")

print("And now it lives on GitHub")

Save the file. Note that when you do so, the default location for the file is within the new Project directory you created earlier.

Saving your first script for this project

Once that is done, look back at R Studio: in the Git tab of the environment quadrant, you should see the file you just created! Click the checkbox under “Staged” to stage your file.

All files that have been modified since your last commit appear in the Git tab

Click “Commit”. A new window should open that lists all of the changed files from earlier and, below that, shows how the staged files differ from their previous versions. In the upper quadrant, in the “Commit message” box, write yourself a commit message. Click Commit. Close the window.

Committing your R Script to the repository!

So far, you have created a file, saved it, staged it, and committed it. If you remember your version control lecture, the next step is to push your changes to your online repository. Push your changes to the GitHub repository by clicking the “Push” button in the Git tab.

How to push your commit to the GitHub repository
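
The stage, commit, and push buttons correspond to three Git commands. Here is a minimal sketch of the same sequence on the command line, again using a local bare repository as a stand-in for GitHub so that it runs offline. The file name first_script.R, the identity values, and the commit message are all hypothetical.

```shell
# Stand-in for GitHub: a bare repository in a temporary folder, cloned locally.
work_dir="$(mktemp -d)"
git init -q --bare "$work_dir/remote.git"
git clone -q "$work_dir/remote.git" "$work_dir/my-project"
cd "$work_dir/my-project"

# Identity recorded in the commit (hypothetical values, set for this repository only).
git config user.name "Your Name"
git config user.email "you@example.com"

# Create the script (hypothetical file name), then stage, commit, and push it.
printf '%s\n' 'print("This file was created within R Studio")' > first_script.R
git status --short                      # the new file is listed as untracked (??)
git add first_script.R                  # same as ticking the "Staged" checkbox
git commit -q -m "Add first R script"   # same as the Commit button
git push -q origin HEAD                 # same as the Push button
```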

Go to your GitHub repository and see that the commit has been recorded.

You’ve just successfully pushed your first commit from within R Studio to GitHub!

Projects under version control

So far, we have linked R Studio with Git and GitHub. In doing this, we created a repository on GitHub and linked it to R Studio. Sometimes, however, you may already have an R Project that isn’t yet under version control or linked with GitHub. Let’s fix that!

Linking an existing Project with Git

So what if you already have an R Project that you’ve been working on, but don’t have it linked up to any version control software (tut tut!)?

Thankfully, R Studio and GitHub recognize this can happen and have steps in place to help you (admittedly, this is slightly more troublesome to do than just creating a repository on GitHub and linking it with R Studio before starting the project…).

So first, let’s set up a situation where we have a local project that isn’t under version control. Go to File > New Project > New Directory > New Project and name your project. Since we are trying to emulate a time when you have a project not currently under version control, do NOT check the “Create a git repository” box. Click Create Project.

Creating a project that is not under version control

We’ve now created an R Project that is not currently under version control. Let’s fix that. First, let’s set it up to interact with Git. Open Git Bash or Terminal and navigate to the directory containing your project files. Move between directories by typing cd followed by the path, for example cd ~/dir/name/of/path/to/file

When the command prompt on the line before the dollar sign shows the directory of your project, you are in the correct location. Once there, type git init and then git add . (note the trailing dot). The first command initializes (init) this directory as a Git repository, and the second stages all of the files in the directory (.) so that they are included in the next commit. Commit these changes to the repository using git commit -m "Initial commit"

Linking the project folder with Git so it is now under version control
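
Put together, the commands described above look roughly like this. The sketch creates a throwaway folder with one R file to stand in for your existing project; the folder name, file name, and identity values are hypothetical, and in practice you would cd into your real project directory instead.

```shell
# Stand-in for an existing R project folder that is not yet under version control.
project_dir="$(mktemp -d)/my-project"
mkdir -p "$project_dir"
printf '%s\n' 'print("Existing analysis")' > "$project_dir/analysis.R"

cd "$project_dir"      # navigate to the project directory
pwd                    # confirm you are in the right place

git init -q                               # turn the folder into a Git repository
git config user.name "Your Name"          # hypothetical identity, this repository only
git config user.email "you@example.com"
git add .                                 # stage every file in the folder
git commit -q -m "Initial commit"         # record the staged files
git log --oneline                         # lists the commit that was just created
```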

At this point, we have created an R Project and have now linked it to Git version control. The next step is to link this with GitHub.

Linking this project with GitHub

To do this, go to GitHub.com, and again, create a new repository:
1) Make sure the name is the exact same as your R project;
2) Do NOT initialize a README file, .gitignore, or license.

Creating a repository on GitHub that is named the same as your R project

Upon creating the repository, you should see a page like this:

Your new repository on GitHub containing code to push from the command line

You should see that there is an option to “Push an existing repository from the command line” with instructions below containing code on how to do so. In Git Bash or Terminal, copy and paste these lines of code to link your repository with GitHub. After doing so, refresh your GitHub page and it should now look something like the image below.
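
The lines GitHub displays usually take the following form, although you should copy the exact lines from your own repository page. In this sketch a local bare repository stands in for the empty GitHub repository so the example runs offline; the paths, file name, and identity values are hypothetical, and the branch name main is an assumption based on GitHub's current default.

```shell
# Stand-in for the empty GitHub repository: a bare repository in a temporary folder.
work_dir="$(mktemp -d)"
git init -q --bare "$work_dir/remote.git"

# Stand-in for the local project that already has an initial commit.
mkdir "$work_dir/my-project"
cd "$work_dir/my-project"
git init -q
git config user.name "Your Name"          # hypothetical identity
git config user.email "you@example.com"
printf '%s\n' 'print("Existing analysis")' > analysis.R
git add .
git commit -q -m "Initial commit"

# The linking steps; replace the path with your own repository URL.
git remote add origin "$work_dir/remote.git"   # tell Git where the remote lives
git branch -M main                             # name the current branch "main"
git push -q -u origin main                     # upload and set the upstream branch
```

The -u flag records origin/main as the upstream branch, so later pushes need no extra arguments.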

When you re-open your project in R Studio, you should now have access to the Git tab in the upper right quadrant and can push any future changes to GitHub from within R Studio.

You’ve now pushed your R project repository to your GitHub repository of the same name

Working on an existing GitHub repository

If you are asked to contribute to an existing project that others are working on, you can link that project with R Studio. This follows exactly the same steps as above, where you created a GitHub repository and then cloned it to your local computer using R Studio. In brief, in R Studio, go to File > New Project > Version Control. Select Git as your version control system and, like above, provide the URL of the repository that you want to clone and select a location on your computer to store the files locally. Create the project.

Follow the same steps as previously done to clone your own repository to a new project in R Studio
Clone an existing project from GitHub from within R Studio

All the existing files in the repository should now be stored locally on your computer, and you have the ability to push edits from your R Studio interface. The only difference from the above is that you did not create the original repository; instead, you cloned somebody else’s.

Summary

In this lesson, we linked Git and R Studio, so that R Studio recognizes you are using Git as your version control software. Following that, we linked R Studio to GitHub, so that you can push and pull repositories from within R Studio. To test this, we created a repository on GitHub, linked it with a new project within R Studio, created a new file, and then staged, committed, and pushed the file to your GitHub repository!

We also went over how to convert an existing project to be under Git version control using the command line. Following this, we linked your newly version-controlled project to GitHub by running the commands that GitHub provides on the command line. We then briefly recapped how to clone an existing GitHub repository to your local machine using R Studio.
