Tuesday, 3 January 2017


Let’s make life easier with GitHub.

Git is a version control system, a tool that tracks changes to our code and shares those changes with others. Git is most useful when combined with GitHub, a website that hosts Git repositories and allows us to share our code with the world. Git is the most popular version control system among developers of R packages.

Git and GitHub are generally useful for all software development and data analysis, not just R packages. I’ve included them here because they are so useful when you’re making a package.

·       It makes sharing your package easy. Any R user can install your package with just two lines of code.

·       Readers can easily browse your code and read its documentation (via Markdown). They can report bugs and suggest new features with GitHub issues, and propose improvements to your code with pull requests.

·       It makes collaboration easier. With Git, you and a collaborator can work on the same file at the same time. Git will either combine your changes automatically, or it will show you all the ambiguities and conflicts.

·       It’s very easy to accidentally introduce a mistake that takes time to track down. Git makes this problem easier to spot because it lets you see exactly what has changed and undo any mistakes.
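
The last point can be sketched from the shell. This is only a minimal illustration, assuming Git is installed; the temporary repository, the file name hello.R, and the user name and email are hypothetical placeholders, not part of any real package.

```shell
# Work in a throwaway repository (hypothetical, for illustration only)
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Commit a first, working version of a file
echo 'hello <- function() "hello"' > hello.R
git add hello.R
git commit -q -m "Add hello()"

# Accidentally introduce a mistake
echo 'hello <- function() "helo"' > hello.R

# See exactly what has changed since the last commit
git --no-pager diff

# Undo the uncommitted change by restoring the committed version
git checkout -- hello.R
cat hello.R
```

Here git diff shows the old and new line side by side, and git checkout -- hello.R discards the edit. Newer versions of Git also offer git restore for the same purpose.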

You can do many of these same things with other version control tools (like Subversion or Mercurial) and other websites (like GitLab and Bitbucket).

RStudio makes day-to-day use of Git simpler. Once you’ve set up a project to use Git, you’ll see a new pane and toolbar icon. These provide shortcuts to the most commonly used Git commands. However, because only a handful of the 150+ Git commands are available in RStudio, you also need to be familiar with using Git from the shell (aka the command line or the console). Knowing the shell commands also helps when you get stuck, because you’ll need the Git command names to search for a solution.
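
As a rough sketch of what using Git from the shell looks like, the commands below cover the same basic steps that RStudio's Git pane exposes (checking status, staging, committing, and viewing history). It assumes Git is installed; the temporary repository, the DESCRIPTION contents, and the identity settings are hypothetical.

```shell
# Throwaway repository (hypothetical, for illustration only)
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Create a file and ask Git what it sees
echo "Package: demo" > DESCRIPTION
git status --short

# Stage the file, then record it in a commit
git add DESCRIPTION
git commit -q -m "Add DESCRIPTION"

# Review the history, one line per commit
git --no-pager log --oneline

# For anything else, "git help <command>" opens that command's manual,
# and the command name is what to search for when you get stuck.
```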




Posted by  Buddihi Anuradha
