Data analysis with R Programming Flashcards
What have you learned so far?
- Use structured thinking to define a problem and ask the right questions.
- Work with spreadsheets, databases, and tools like SQL to organize and transform data.
- Clean your data to make sure it has integrity before you analyze it.
- Create impactful data visualizations to illustrate key points.
- Craft a compelling story to communicate insights to stakeholders.
Computer programming
Giving instructions to a computer to perform an action or set of instructions.
What will you learn?
- Introduction to programming languages.
- Explore main features and functions.
- Basic programming concepts in R.
- How to work with data in R.
- Clean, transform, visualize, report data in R.
R programming language
Used for statistical analysis, visualization, and other data analysis.
Programming Languages
The words and symbols we use to write instructions for computers to follow.
Coding
Writing instructions to the computer in the syntax of a specific programming language.
Programming languages
- R
- Python
- JavaScript
- SAS
- Scala
- Julia
Benefits of using programming languages
- Clarify the steps of your analysis.
- Save time.
- Reproduce and share your work.
R
A programming language frequently used for statistical analysis, visualization, and other data analysis.
Open Source
Code that is freely available and may be modified and shared by the people who use it.
R Benefits
- Accessible
- Data-centric
- Open source
- Community
Uses of R
- Reproducing your analysis
- Processing lots of data
- Creating data visualizations
Integrated Development Environment (IDE)
A software application that brings together all the tools you may want to use in a single place.
R code known as a pipe
Helps make a sequence of code easier to work with and read.
The basic concepts of R
- Functions
- Comments
- Variables
- Data types
- Vectors
- Pipes
Functions (R)
A body of reusable code to perform specific tasks in R.
Argument (R)
Information that a function in R needs in order to run.
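As a small illustration of the function and argument cards, the sketch below calls a built-in function with arguments and then defines a reusable one. The helper `add_tax` and its default `rate` are hypothetical names chosen only for this example.

```r
# Call a built-in function: round() takes a number plus a `digits` argument.
rounded <- round(3.14159, digits = 2)
print(rounded)

# Define a reusable function (hypothetical helper, not from the flashcards).
# `price` and `rate` are its arguments; `rate` has a default value.
add_tax <- function(price, rate = 0.10) {
  price * (1 + rate)
}

total <- add_tax(100)                       # uses the default rate
total_custom <- add_tax(100, rate = 0.25)   # overrides the default
print(total)
print(total_custom)
```

Naming the argument (`digits = 2`, `rate = 0.25`) is optional here, but it tends to make a call easier to read later.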
Variable (R)
A representation of a value in R that can be stored for use later during programming.
Vector (R)
A group of data elements of the same type stored in a sequence in R.
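A minimal sketch of a vector in practice, assuming some made-up values: `c()` combines elements into a vector, and mixing types forces R to convert everything to a single type.

```r
# c() combines values into a vector (the values are hypothetical).
ages <- c(21, 35, 48)
n_ages <- length(ages)     # number of elements
second_age <- ages[2]      # R counts positions from 1

# All elements must share one type, so mixed input is coerced.
mixed <- c(1, "two", TRUE)
mixed_type <- typeof(mixed)

print(n_ages)
print(second_age)
print(mixed_type)
```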
Pipe (R)
A tool in R for expressing a sequence of multiple operations, represented with %>%.
Pipe (R) example
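library(dplyr)  # provides %>%, filter(), and arrange()
ToothGrowth %>%
  filter(dose == 0.5) %>%  # keep rows where dose is 0.5
  arrange(len)             # sort by tooth length, ascending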
Data Structure
Data structure is a format for organizing and storing data.
Types of atomic vectors
- Logical
- Double
- Integer
- Character
Logical Vector
TRUE/FALSE values
Logical vector example
TRUE
Integer vector
Positive and negative whole-number values
Integer vector example
3L (the L suffix tells R to store the number as an integer; a plain 3 is stored as a double)
Double vector
Decimal values
Double vector example
101.175
Character vector
String/character values
Character vector example
"Coding"
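One way to check the four atomic types above is `typeof()`. The sketch below reuses the example values from the cards; the variable names are only for illustration. Note that a bare whole number is stored as a double unless it carries the `L` suffix.

```r
# typeof() reports how R stores each value.
logical_type   <- typeof(TRUE)       # logical
integer_type   <- typeof(3L)         # integer (note the L suffix)
plain_3_type   <- typeof(3)          # double, even though 3 is a whole number
double_type    <- typeof(101.175)    # double
character_type <- typeof("Coding")   # character

print(c(logical_type, integer_type, plain_3_type, double_type, character_type))
```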
Data Frames
The most common way of storing and analyzing data in R.
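A minimal sketch of building a data frame in base R. The `orders` table and its columns are hypothetical; each column is a vector, and all columns have the same length.

```r
# Each argument becomes a column; columns may have different types.
orders <- data.frame(
  product = c("apple", "pear", "plum"),
  units   = c(3L, 5L, 2L),
  price   = c(0.50, 0.75, 0.40)
)

str(orders)                    # compact summary of columns and types
n_rows <- nrow(orders)
n_cols <- ncol(orders)
units_column <- orders$units   # $ extracts one column as a vector

print(n_rows)
print(n_cols)
print(units_column)
```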
Matrix
A two-dimensional collection of data elements, meaning it has both rows and columns. Like a vector, a matrix holds a single data type.
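As a rough sketch, a matrix can be created with `matrix()`, which by default fills the values column by column. The numbers below are arbitrary.

```r
# Six values arranged into 2 rows; R fills down each column first.
m <- matrix(c(3, 4, 5, 6, 7, 8), nrow = 2)
print(m)

m_dim <- dim(m)      # rows, then columns
corner <- m[1, 3]    # element in row 1, column 3

print(m_dim)
print(corner)
```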
Operator
A symbol that names the type of operation or calculation to be performed in a formula.
Assignment operators
Used to assign values to variables and vectors.
Assignment operator example
sales_1 <- c(67.00, 75.50, 90.00, 54.75)
Arithmetic Operators
Used to complete math calculations.
Arithmetic operators
+ (addition)
- (subtraction)
* (multiplication)
/ (division)
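The four operators above can be tried directly on numbers or on whole vectors. This is a small sketch; `a` and `b` are arbitrary values, and `sales_1` repeats the vector from the assignment example so the block runs on its own.

```r
a <- 10
b <- 4

sum_ab        <- a + b   # addition
difference_ab <- a - b   # subtraction
product_ab    <- a * b   # multiplication
quotient_ab   <- a / b   # division

# Operators also work element by element on vectors.
sales_1 <- c(67.00, 75.50, 90.00, 54.75)
doubled_sales <- sales_1 * 2
average_sale  <- sum(sales_1) / length(sales_1)

print(c(sum_ab, difference_ab, product_ab, quotient_ab))
print(doubled_sales)
print(average_sale)
```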
Function
A body of reusable code for performing specific tasks in R.
Argument
Information needed by a function in R in order to run.
Comment
Helpful text that describes or explains R code, preceded by #.
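A short sketch of comments in use; the variable names and values are hypothetical. R ignores everything from `#` to the end of the line.

```r
# This whole line is a comment; R skips it.
hours <- 40          # a comment can also follow code on the same line
rate <- 15.5         # hypothetical hourly rate
pay <- hours * rate  # the calculation itself still runs
print(pay)
```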
Variable
A representation of a value in R that can be stored for later use.
Data Types
An attribute that describes a piece of data based on its values, its programming language, or the operations it can perform.
Vector
A group of data elements of the same type stored in a one-dimensional sequence in R.
Pipe
A tool in R for expressing a sequence of multiple operations, represented with %>%.
Packages (R)
Units of reproducible R code
Packages include:
- Reusable R functions
- Documentation about the functions
- Sample datasets
- Tests for checking your code.
CRAN (Comprehensive R Archive Network)
An online archive with R packages, source code, manuals, and documentation.
R Packages
Packages offer a helpful combination of code, reusable R functions, descriptive documentation, tests for checking operability, and sample data sets.
Tidyverse (R)
A system of packages in R with a common design philosophy for data manipulation, exploration, and visualization.
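In practice, a package is installed once from CRAN and then loaded in each session. The sketch below leaves the install step commented out because it needs an internet connection, and uses the `datasets` package, which ships with R, to show the sample-data side of a package.

```r
# install.packages("tidyverse")   # one-time download from CRAN
# library(tidyverse)              # load it in each session after installing

# `datasets` comes with R, so this part runs without installing anything.
library(datasets)

head(ToothGrowth)                    # first rows of a bundled sample dataset
tooth_rows <- nrow(ToothGrowth)
tooth_columns <- names(ToothGrowth)

print(tooth_rows)
print(tooth_columns)
```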