Functions: Theory and Practice (in R)

Athit Kao, PhD
UCI Bioinformatics Support Group - April 25, 2018

whoami

  • R enthusiast

  • PhD, Biomedical Sciences, UCI

  • IGB Biomedical Informatics Training program alumnus

  • Focus on proteomics mass spectrometry and bioinformatics

Prerequisites

  • Fundamental awareness of computers and programming

  • No programming experience necessary

  • R and RStudio installed (for 2nd half)

whoareyou

  • Who is using Windows, Linux, and/or Mac OS?

  • How many people have programmed before? What languages?

  • Who has written a function before?

  • What is a use case for your field that may require programming?

Objects and Variables

  • In R, any data structure is considered an object:
    • Vector of length n
    • Matrix of size m × n
    • Function
  • Generally in programming, a storage location identified by a name is a variable:
    • May contain data structure(s)
    • Value can be modified by referencing the name

Figure 27
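As a minimal sketch of the points above (the names v, m, and add_one are made up for illustration and do not come from the slides), the following creates each kind of object and then modifies a variable through its name:

```r
# A vector of length n = 3
v <- c(1.5, 2.0, 3.5)

# A matrix of size m x n = 2 x 3 (filled column by column)
m <- matrix(1:6, nrow = 2, ncol = 3)

# A function is an object too, stored under a name like any other value
add_one <- function(x) x + 1

# The value of a variable can be modified by referencing its name
v[2] <- 10        # replace the second element
v <- add_one(v)   # overwrite v with the function's result

print(v)
print(dim(m))
print(class(add_one))
```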

Concept of Functions

Figure 26
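The figure for this slide is not reproduced here. As a rough stand-in, the sketch below labels the usual parts of an R function: a name, arguments, a body, and a return value. The function rectangle_area is a hypothetical example, not taken from the slides.

```r
# name            arguments (height has a default value)
rectangle_area <- function(width, height = 1) {
  # body: the work the function performs
  area <- width * height
  # return value: what the caller gets back
  return(area)
}

rectangle_area(3, 4)   # width = 3, height = 4
rectangle_area(5)      # height falls back to its default of 1
```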

Purpose of Functions

  • N.b. “Perfect is the enemy of progress”
  • Functions encapsulate and organize multiple tasks
  • Generally, they make life easier by improving:
    • Readability
    • Reusability
    • Abstraction

Readability and Reusability

Figure 11
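One way to picture readability and reusability, assuming a made-up task of rescaling measurements to the range 0 to 1 (the data and the name rescale01 are illustrative only):

```r
sample_a <- c(2, 4, 6)
sample_b <- c(10, 20, 30)

# Without a function: the same formula is copied for every sample
scaled_a <- (sample_a - min(sample_a)) / (max(sample_a) - min(sample_a))
scaled_b <- (sample_b - min(sample_b)) / (max(sample_b) - min(sample_b))

# With a function: the formula is written once, given a name, and reused
rescale01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}
scaled_a2 <- rescale01(sample_a)
scaled_b2 <- rescale01(sample_b)

# Both approaches give the same numbers
identical(scaled_a, scaled_a2)
```

If the formula ever needs to change, the function version only has to be edited in one place.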

Abstraction

Figure 6

Writing Script vs. Using Console

Figure 0

Why are we typing out everything?...

  • So you can make mistakes:
    • “That was case sensitive?”
    • “I forgot a parenthesis/bracket?”
    • “That wasn't a period/comma/semi-colon?”
  • We learn better from mistakes
  • We can help each other out here

Follow along with the red line numbers

Exercise #1: Our Function

Figure 0
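The exercise code lives in the slide image and is not reproduced here. The sketch below is a hypothetical stand-in, assuming a deliberately slow function built on Sys.sleep (which the summary questions mention); the name slow_square and the delay argument are inventions for illustration.

```r
# Hypothetical stand-in for the exercise function (not the code from the slide)
slow_square <- function(x, delay = 0.1) {
  Sys.sleep(delay)   # pause for `delay` seconds to mimic a slow computation
  x^2                # the last evaluated expression is the return value
}

slow_square(3)
```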

Exercise #2: Using lapply

  • lapply() is a function that wraps around your function and calls it once for each element of a list or vector
  • A clean and concise way to iterate values through your function in R (vs. using a loop)

Figure 15g
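A small sketch of lapply in use, assuming the same kind of hypothetical slow function (slow_square is an illustrative name, not the slide's code):

```r
# Hypothetical slow function used only for illustration
slow_square <- function(x, delay = 0.1) {
  Sys.sleep(delay)
  x^2
}

inputs <- 1:4

# lapply calls slow_square once per element and collects the results in a list
results <- lapply(inputs, slow_square)

str(results)      # a list with one entry per input
unlist(results)   # flatten the list into a plain vector

# Extra arguments after the function name are passed along to it
quick_results <- lapply(inputs, slow_square, delay = 0)
```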

Exercise #3: Package "parallel"

  • Parallel processing in R is very straightforward (the parallel package runs several R processes at once rather than true threads)

  • Instead of lapply, we will swap it for a similar function

  • Windows users will have extra steps, because mclapply relies on forking, which Windows does not support

  • All operating systems need to start with the following:

Figure 20
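The setup code is in the slide image and is not reproduced here. A typical starting point, offered as an assumption rather than the slide's exact code, is to load the package and check how many cores are available:

```r
# The parallel package ships with R, so no install.packages() call is needed
library(parallel)

# Number of CPU cores R can see (may be NA on unusual systems)
n_cores <- detectCores()
print(n_cores)
```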

Exercise #3: mclapply (for Linux/Mac OS)

Figure 0
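A minimal sketch of swapping lapply for mclapply, assuming a hypothetical slow function and two worker processes (both choices are illustrative, not the slide's code):

```r
library(parallel)

# Hypothetical slow function used only for illustration
slow_square <- function(x, delay = 0.1) {
  Sys.sleep(delay)
  x^2
}

# mclapply relies on forking; on Windows mc.cores must be 1 (no speed-up)
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

# Same call shape as lapply, plus the number of worker processes
results <- mclapply(1:4, slow_square, mc.cores = n_workers)
unlist(results)
```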

Exercise #3: clusterApply (for Windows/Linux/Mac OS)

Figure 22

  • Any objects the function needs, beyond the function itself, must be sent to the workers with clusterExport (see “?clusterApply”)

Figure 0
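A sketch of the cluster approach, assuming two workers and a made-up global object named delay that the function depends on (all names here are illustrative, not the slide's code):

```r
library(parallel)

delay <- 0.1   # an extra object that the function depends on

# Hypothetical slow function; note that `delay` is not one of its arguments
slow_square <- function(x) {
  Sys.sleep(delay)
  x^2
}

cl <- makeCluster(2)   # start 2 worker R processes

# Copy `delay` to every worker; envir = environment() means
# "look for it where this code is running"
clusterExport(cl, varlist = "delay", envir = environment())

results <- clusterApply(cl, 1:4, slow_square)

stopCluster(cl)        # always shut the workers down when finished

unlist(results)
```

Each worker starts as a fresh R session, which is why it does not automatically see objects created in the main session.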

Performance Benchmark

  • Exactly how much faster was that?

  • Time code using the function system.time()

  • Remember to wrap multi-line code in curly braces “{ }”

Figure 23
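A minimal timing sketch, assuming a hypothetical slow function (slow_square is an illustrative name). The "elapsed" entry is the wall-clock time in seconds:

```r
# Hypothetical slow function used only for illustration
slow_square <- function(x, delay = 0.1) {
  Sys.sleep(delay)
  x^2
}

# Everything inside the curly braces is timed as one block
timing <- system.time({
  results <- lapply(1:4, slow_square)
})

print(timing)
timing[["elapsed"]]   # wall-clock seconds for the whole block
```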

Performance Benchmark

Figure 16
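The benchmark figure is not reproduced here. One way to run a similar comparison yourself, assuming a hypothetical slow function and two workers, is sketched below. The actual numbers depend on your machine and operating system, so none are shown.

```r
library(parallel)

# Hypothetical slow function used only for illustration
slow_square <- function(x, delay = 0.1) {
  Sys.sleep(delay)
  x^2
}

# On Windows mclapply cannot fork, so it runs with a single worker
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

serial_time <- system.time({
  serial_results <- lapply(1:4, slow_square)
})

parallel_time <- system.time({
  parallel_results <- mclapply(1:4, slow_square, mc.cores = n_workers)
})

# Compare wall-clock seconds side by side
c(serial = serial_time[["elapsed"]], parallel = parallel_time[["elapsed"]])
```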

Summarize for me...

  1. What is the basic structure of a function?
  2. How do functions improve code?
  3. Anything weird or counterintuitive?
  4. Is Sys.sleep, system.time, or lapply a function?
  5. Which is faster, lapply or mclapply?
  6. How much does R cost after the trial period ends?
  Answers: #4, they're all functions; #5, it depends; #6, R is free!

Questions?

  • Just try new code and see what happens; this isn't wet lab
  • Google: Don't just search with “R”, use “R language”
  • Stack Overflow: Use tag “[r]”

Figure s2

Keep going, you got this!

Figure 0

App: Readability (cont.)

  • Encapsulation via functions is an aspect of code legibility
  • Others include:
    • Documentation (comments and READMEs)
    • Formatting (indentation and line wrapping)
    • Variable naming (obvious and consistent names)
  • There is a balance appropriate for you/your audience:

Figure 10

App: Generalized Concept

Figure 2 Figure 4

App: Nothing to Return = NULL

Figure 0
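The slide's example is not reproduced here. Two small cases, using made-up function names, show how a function with nothing to return gives back NULL:

```r
# A function with an empty body has nothing to return, so the result is NULL
do_nothing <- function() {}
is.null(do_nothing())      # TRUE

# An if () with a FALSE condition and no else branch also yields NULL
maybe_label <- function(x) {
  if (x > 0) "positive"
}
maybe_label(2)             # "positive"
is.null(maybe_label(-1))   # TRUE
```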

App: Function Argument Order
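This slide has no surviving text, so the following is only a sketch of the general rule in R: arguments are matched by position unless they are named, and named arguments can appear in any order. The function power is a hypothetical example.

```r
power <- function(base, exponent = 2) {
  base^exponent
}

power(3, 2)                     # by position: base = 3, exponent = 2, gives 9
power(2, 3)                     # swapping the order changes the result, gives 8
power(exponent = 3, base = 2)   # by name: order no longer matters, gives 8
power(3)                        # exponent falls back to its default, gives 9
```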

App: "Functionals" in R

  • Functional:
    • Function that takes another function as an argument
  • lapply:
    • Function that applies another function over each element of a list or vector
    • Returns a list
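A short sketch of lapply as a functional, using made-up data: the function mean is itself passed in as an argument, and the result comes back as a list.

```r
# Made-up data: three named groups of measurements
samples <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# mean is handed to lapply as an argument and applied to each element
means <- lapply(samples, mean)

is.list(means)   # TRUE: lapply always returns a list
names(means)     # names are carried over from the input
means$b          # 15
```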

App: lapply vs. Looping

Figure 0
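The comparison figure is not reproduced here. As an illustrative sketch with made-up inputs, the same task is written below once with a for loop and once with lapply:

```r
inputs <- c(2, 4, 6)

# Loop version: create an empty list first, then fill it one slot at a time
loop_results <- vector("list", length(inputs))
for (i in seq_along(inputs)) {
  loop_results[[i]] <- sqrt(inputs[i])
}

# lapply version: one line, no counter and no pre-allocated list
apply_results <- lapply(inputs, sqrt)

# Both produce the same list
identical(loop_results, apply_results)
```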

App: NIH Proficiency Scale

  1. Fundamental Awareness (basic knowledge): Common knowledge/understanding of basic techniques/concepts
  2. Novice (limited experience): Expected to need help when performing this skill
  3. Intermediate (practical application): Able to successfully complete tasks; expert required occasionally
  4. Advanced (applied theory): Able to successfully complete tasks without assistance
  5. Expert (recognized authority): Can provide guidance, troubleshooting, and answers related to this skill
  • Source: https://hr.nih.gov/working-nih/competencies/competencies-proficiency-scale