Scatter Plots In R

Introduction to Scatter Plots in R Language

Scatter plots (scatter diagrams) are bivariate graphical representations for examining the relationship between two quantitative variables. They are essential for visualizing correlations and trends in data. Scatter plots can be drawn in R in several ways; here we discuss how to make several kinds of them.

The plot function in R

When two numeric vectors are passed to the plot() function (one for the horizontal and the other for the vertical coordinates), its default behavior is to draw a scatter diagram. For example,

library(car)       # makes the Prestige data available
attach(Prestige)   # lets income, prestige, and type be used by name
plot(income, prestige)

will draw a simple scatterplot of prestige by income.
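The same kind of plot can be drawn without attach() by using the formula interface of plot() with a data argument. The following is a minimal sketch that uses the built-in cars data set as a stand-in (an assumption made here so that the example runs without the car package); the axis labels are illustrative.

```r
# Base-R sketch: scatter plot without attach().
# 'cars' ships with R (datasets package) and stands in for Prestige here.
x <- cars$speed   # horizontal coordinates
y <- cars$dist    # vertical coordinates

# Two numeric vectors: plot() draws a scatter diagram by default
plot(x, y, xlab = "Speed (mph)", ylab = "Stopping distance (ft)")

# Equivalent formula interface; variables are looked up in 'data'
plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
```

Passing the data frame explicitly avoids name clashes that attach() can cause when several data sets share variable names.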

The interpretation of a scatterplot is often assisted by enhancing the plot with least-squares or non-parametric regression lines. The scatterplot() function in the car package can be used for this purpose; it also adds marginal boxplots for the two variables:

scatterplot(prestige ~ income, lwd = 3 )

Note that in the scatterplot, the non-parametric regression curve is drawn by a local regression smoother. Local regression works by fitting a least-squares line in the neighborhood of each observation, placing greater weight on points closer to the focal observation. A fitted value for the focal observation is extracted from each local regression, and the resulting fitted values are connected to produce the non-parametric regression line.
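The idea of local regression described above can be sketched by hand in base R. This is an illustrative sketch only, not the smoother that scatterplot() uses internally: the function name local_fit, the tricube weights, and the span value are assumptions, and the built-in cars data stands in for Prestige.

```r
# Hand-rolled local linear regression (illustrative sketch only).
# Assumes 'span' is large enough that each neighborhood holds
# several distinct x values.
local_fit <- function(x, y, span = 0.5) {
  n <- length(x)
  k <- max(2, ceiling(span * n))        # number of points per neighborhood
  dat <- data.frame(x = x, y = y)
  fitted <- numeric(n)
  for (i in seq_len(n)) {
    d <- abs(x - x[i])                  # distance from the focal observation
    h <- sort(d)[k]                     # half-width of the neighborhood
    # Tricube weights: closer points get more weight, distant points none
    w <- ifelse(d < h, (1 - (d / h)^3)^3, 0)
    fit <- lm(y ~ x, data = dat, weights = w)   # weighted least-squares line
    fitted[i] <- predict(fit, newdata = data.frame(x = x[i]))
  }
  fitted
}

smooth <- local_fit(cars$speed, cars$dist, span = 0.5)
ord <- order(cars$speed)

plot(dist ~ speed, data = cars)
lines(cars$speed[ord], smooth[ord], lwd = 2)   # connect the fitted values
```

Each pass through the loop fits one weighted line and keeps only the fitted value at the focal point, which mirrors the description above.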

Coded Scatterplots

The scatterplot() function can also be used to create coded scatterplots, in which a categorical variable determines the color or plotting symbol used for each category. For example, let us plot prestige by income, coded by the type of occupation:

scatterplot(prestige ~ income | type)

Note that the variables are given in formula style (as y ~ x | groups).

The coded scatterplot indicates that the relationship between prestige and income may well be linear within occupation types. The slope of the relationship looks steepest for blue-collar (bc) occupations, and least steep for professional and managerial occupations.
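A coded scatterplot with one least-squares line per group can also be sketched in base R. The example below uses the built-in iris data as a stand-in for Prestige; the color palette and symbols are arbitrary choices made for illustration.

```r
# Base-R sketch of a coded scatterplot; 'iris' stands in for Prestige.
grp <- iris$Species                      # categorical coding variable
pal <- c("black", "red", "blue")         # one color per category
cols <- pal[as.integer(grp)]
pchs <- (1:3)[as.integer(grp)]           # one plotting symbol per category

plot(Sepal.Length ~ Petal.Length, data = iris, col = cols, pch = pchs)

# Add one least-squares line per group
for (i in seq_along(levels(grp))) {
  d <- iris[grp == levels(grp)[i], ]
  abline(lm(Sepal.Length ~ Petal.Length, data = d), col = pal[i])
}
legend("topleft", legend = levels(grp), col = pal, pch = 1:3)

# Within-group slopes, to compare how steep each relationship is
slopes <- sapply(split(iris, grp), function(d)
  coef(lm(Sepal.Length ~ Petal.Length, data = d))[["Petal.Length"]])
print(slopes)
```

Comparing the printed slopes is the numerical counterpart of judging by eye which group has the steepest line.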

Jittering Scatter Plots

Jittering the data by adding a small random quantity to each coordinate serves to separate the overplotted points.

data(Vocab)
attach(Vocab)
# without jittering
plot(education, vocabulary)
# with jittering
plot(jitter(education), jitter(vocabulary))

The degree of jittering can be controlled via the factor argument. For example, specifying factor = 2 doubles the jitter.

plot(jitter(education, factor = 2), jitter(vocabulary, factor = 2))
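To see what jittering does to overplotted data, one can count the distinct plotting positions before and after. This is a small sketch on simulated discrete scores (the variables x and y are hypothetical, not the Vocab data), so it draws no plot.

```r
set.seed(1)                                   # reproducible noise
x <- sample(0:5, 200, replace = TRUE)         # hypothetical discrete scores
y <- sample(0:5, 200, replace = TRUE)

# Only 6 x 6 = 36 distinct positions are possible, so points overplot
n_before <- nrow(unique(data.frame(x, y)))

xj <- jitter(x, factor = 2)                   # add small uniform noise
yj <- jitter(y, factor = 2)
n_after <- nrow(unique(data.frame(xj, yj)))

cat("distinct positions before:", n_before, "after:", n_after, "\n")
```

For unit-spaced values such as these, jitter() by default draws uniform noise within factor/5 of each value, so factor = 2 moves each coordinate by at most 0.4 and the points stay close to their true positions.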

Let’s add the least-squares and non-parametric regression lines to the jittered plot.

abline(lm(vocabulary ~ education), lwd = 3, lty = 2)
lines(lowess(education, vocabulary, f = 0.2), lwd = 3)

The lowess function (short for locally weighted scatterplot smoothing) returns coordinates for the local regression curve, which is then drawn by lines(). The f argument sets the span of the local regression, that is, the proportion of points that influence the fit at each value.
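The effect of the span can be seen by drawing two lowess curves with different f values on the same plot. The sketch below uses the built-in cars data as a stand-in for Vocab, and the two span values are arbitrary.

```r
# Effect of the span 'f' on a lowess curve; 'cars' is a stand-in data set.
fit_small <- lowess(cars$speed, cars$dist, f = 0.2)   # small span: follows local wiggles
fit_large <- lowess(cars$speed, cars$dist, f = 0.8)   # large span: smoother curve

plot(dist ~ speed, data = cars)
lines(fit_small, lwd = 2, lty = 1)
lines(fit_large, lwd = 2, lty = 2)
legend("topleft", legend = c("f = 0.2", "f = 0.8"), lty = 1:2, lwd = 2)

# lowess() returns a list with components x and y, sorted by x
str(fit_small)
```

Because lowess() returns its coordinates already sorted by x, the result can be passed directly to lines().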

Using these different kinds of graphical representations may help to reveal information about the relationship between variables that would otherwise be hidden by overplotting.


Important MCQs R Language History & Basics 4

This post presents multiple-choice questions (MCQs) about the R language. The quiz covers some basics of the R language: its functionality, the concept of packages, and the history of R.

MCQs about R Language

1. The following packages are not contained in the “base” R system.
2. What is the output of getOption("defaultPackages") in RStudio?
3. Which package contains the most fundamental functions to run R?
4. Advanced users of R can write _______ code to manipulate R objects directly.
5. The primary source code copyright for R is held by the ______
6. Which of the following is a “base” package for the R language?
7. Which of the following are examples of variable names that can be used in R?
8. The primary R system is available from the ______
9. In which year was the R Core Team formed?
10. Which of the following are best practices for creating data frames?
11. R is published under the ______ General Public License version.
12. Which of the following is the wrong statement?
13. The “base” R system can be downloaded from ______
14. The wrong statement from the following is:
15. Which of the following is a recommended package in R?
16. Which of the following is used for statistical analysis in the R language?
17. The public version of R released in 2000 was ______
18. R functionality is divided into a number of _______
19. R runs on the _________ operating system.
20. One limitation of R is that its functionality is based on _________

The R language is a free and open-source language developed by Ross Ihaka and Robert Gentleman in 1991 at the University of Auckland, New Zealand. The R Language is used for statistical computing and graphics to clean, analyze, and graph your data.


The strengths of R programming language lie in its statistical capabilities, data visualization tools (such as ggplot2), and a vast ecosystem of packages contributed by the community. R Language remains a popular choice for statisticians and data scientists working on a wide range of projects.


JSON Files in R: Reading and Writing (2019)

Introduction to JSON Files in R

A JSON file stores simple data structures and objects in JavaScript Object Notation (JSON) format. JSON is a standard, lightweight data-interchange format used primarily for transmitting data between a web application and a server. A JSON file is a text file that is language-independent, self-describing, and easy to understand. In this article, we discuss reading and writing a JSON file in the R language in detail using the R package "rjson".

Since the JSON file format is text only, it can be sent to and from a server and used as a data format by any programming language. The data in a JSON file can be nested and hierarchical. Let us start reading and writing JSON files in R.

Creating JSON File

Let’s create a JSON file. Copy the following lines into a text editor such as Notepad. Save the file with a .json extension and choose the file type as all files(*.*). Let the file name be “data.json”, stored on the “D:” drive.

{ 
"ID":["1","2","3","4","5","6","7","8" ],
"Name":["Rick","Dan","Michelle","Ryan","Gary","Nina","Simon","Guru" ],
"Salary":["623.3","515.2","611","729","843.25","578","632.8","722.5" ],
"StartDate":[ "1/1/2012","9/23/2013","11/15/2014","5/11/2014","3/27/2015","5/21/2013",
"7/30/2013","6/17/2014"],
"Dept":[ "IT","Operations","IT","HR","Finance","IT","Operations","Finance"]
}

Installing rjson R Package

The R language can read JSON files using the rjson package. To read a JSON data file, first install the rjson package by issuing the following command in the R console.

install.packages("rjson")

After installation, the rjson package needs to be loaded with library(rjson) before its functions can be used.

Reading JSON Files in R

To read a JSON file, load the rjson package and use the fromJSON() function.

# Load the rjson package.
library(rjson)
# Give the data file name to the function.
result <- fromJSON(file = "D:\\data.json")
# Print the result.
print(result)

The list returned by fromJSON() can now be converted to a data frame for further analysis using the as.data.frame() function.

# Convert the list read from the JSON file to a data frame.
json_data_frame <- as.data.frame(result)
print(json_data_frame)
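Every value in data.json is a quoted string, so the columns of the resulting data frame hold text rather than numbers or dates. The sketch below shows one way to convert them. To keep it runnable without the file or the rjson package, it builds by hand a list with the shape fromJSON() is assumed to return for the first three records (a named list of character vectors); the object names are illustrative.

```r
# Stand-in for the list returned by fromJSON() on data.json
# (first three records only; an assumption for illustration).
result <- list(
  ID        = c("1", "2", "3"),
  Name      = c("Rick", "Dan", "Michelle"),
  Salary    = c("623.3", "515.2", "611"),
  StartDate = c("1/1/2012", "9/23/2013", "11/15/2014"),
  Dept      = c("IT", "Operations", "IT")
)

df <- as.data.frame(result, stringsAsFactors = FALSE)

# Convert text columns to appropriate types
df$ID        <- as.integer(df$ID)
df$Salary    <- as.numeric(df$Salary)
df$StartDate <- as.Date(df$StartDate, format = "%m/%d/%Y")  # month/day/year

str(df)
```

After the conversion, numeric summaries such as mean(df$Salary) and date arithmetic on df$StartDate work as expected.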

Writing JSON objects to a .json file

To write a JSON object to a file, use the toJSON() function from the rjson package to prepare the JSON object, and then use the write() function to write it to a local file.

Let’s create a list of objects as follows:

list1 <- vector(mode="list", length=2)
list1[[1]] <- c("apple", "banana", "rose")
list1[[2]] <- c("fruit", "fruit", "flower")

Convert the above list to JSON:

jsonData <- toJSON(list1)

Write the JSON object to a file:

write(jsonData, "output.json")
