Posts

Showing posts with the label Data science specialization with R

Mastering Data Science Experimental Design: From Hypothesis to Results

The Beginner’s Guide to Data Science Experimental Design: Now that we’ve looked at the different types of data science questions, we are going to spend some time on experimental design concepts in the last lesson of our first course, "The Data Science Toolbox", in our Data Science Specialization using R Programming. As a data scientist, you are a scientist and, as such, need to be able to design proper experiments to best answer your data science questions! What does experimental design mean? Experimental design is organizing an experiment so that you have the correct data (and enough of it!) to clearly and effectively answer your data science question. This process involves clearly formulating your question in advance of any data collection, designing the best set-up possible to gather the data to answer your question, identifying problems or sources of error in your design, and only then, collecting the app...

Demystifying Big Data: Understanding the Fundamentals and Real-world Applications

The Beginner’s Guide to Big Data: A term you may have heard before this course is “Big Data”. There have always been large datasets, but lately the term has become a buzzword in data science. But what does it mean? What is big data? We talked a little about big data in the second lecture of this course. As the name suggests, big data are very large data sets. We previously discussed three qualities that are commonly attributed to big data sets: Volume, Velocity, and Variety. From these three qualities, we can see that big data involves large data sets of diverse data types that are being generated very rapidly. None of these qualities seems particularly new, so why has the concept of big data been popularized so recently? In part, as technology and data storage have evolved to hold larger and larger data sets, the definition of “big” has evolved too. Also, our ability to collect...

Unlocking Insights with Data Science: Exploring the Types of Questions Data Scientists Can Answer

The Beginner’s Guide to Types of Questions in Data Science: In this lesson, we’re going to be a little more conceptual and look at some of the types of analyses data scientists employ to answer questions in data science. The main divisions of data science questions: There are, broadly speaking, six categories into which data analyses fall. In approximate order of difficulty, they are: Descriptive, Exploratory, Inferential, Predictive, Causal, and Mechanistic. Let’s explore the goals of each of these types and look at some examples of each analysis! 1. Descriptive analysis: The goal of descriptive analysis is to describe or summarize a set of data. Whenever you get a new dataset to examine, this is usually the first kind of analysis you will perform. Descriptive analysis will generate simple summaries about the samples and their measurements. You may be familiar with common descriptive statistics: measures...
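As a rough illustration of the "simple summaries" that descriptive analysis produces, here is a minimal Python sketch. The sample values and the helper name `describe` are made up for this example and do not come from the lesson, which works in R; the sketch only uses Python's standard `statistics` module.

```python
import statistics

# Hypothetical sample: heights in centimetres (made-up values for illustration)
heights_cm = [150.0, 160.0, 165.0, 170.0, 180.0]


def describe(values):
    """Return simple descriptive summaries of a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary = describe(heights_cm)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The summary only describes the sample at hand; it makes no claim about a wider population, which is what separates descriptive from inferential analysis.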

Streamlining Your Workflow: Linking Git/GitHub with R Studio for Efficient Version Control

The Beginner’s Guide to Linking Git/GitHub with R Studio: Now that you have both R Studio and Git set up on your computer, as well as a GitHub account, it’s time to link them together so that you can maximize the benefits of using R Studio in your version control pipelines. First we will link R Studio and Git, and then we will link R Studio and GitHub. We will also link an existing Project with Git and GitHub. Linking R Studio and Git: In R Studio, go to Tools > Global Options > Git/SVN and use this menu to tell R Studio that you are using Git as your version control system. Sometimes the default path to the Git executable is not correct. Confirm that git.exe resides in the directory that R Studio has specified; if not, change the directory to the correct path. Otherwise, click OK or Apply. R Studio and Git are now linked. Linking R Studio and GitHub: In that same R Studio option window, clic...
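If you are unsure where the Git executable lives when checking the path in the Git/SVN options, one way to find it outside R Studio is sketched below in Python. This is only an assumption-laden helper, not part of the lesson: the function names `find_git` and `git_version` are hypothetical, and the sketch assumes Git is reachable on your system PATH.

```python
import shutil
import subprocess


def find_git():
    """Return the full path of the Git executable found on PATH, or None."""
    return shutil.which("git")


def git_version(git_path):
    """Return the output of `git --version` for the given executable."""
    result = subprocess.run(
        [git_path, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


git_path = find_git()
if git_path is None:
    print("Git was not found on PATH; check your installation.")
else:
    # Compare this path with the one shown in the Git/SVN options.
    print(f"Git executable: {git_path}")
    print(git_version(git_path))
```

If the printed path differs from the one R Studio shows, the printed path is a reasonable candidate to enter in the options window.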