Workshops, Curriculum, and Tutorials

I am a big proponent of freely sharing knowledge. I especially enjoy breaking down complex topics and presenting them in fun and engaging ways. Topics include statistical analysis, modeling, cloud computing, machine learning, biology, data management, computational reproducibility, and general R and Python programming.

Community Knowledge Sharing

I believe one of the most powerful ways to build community is to draw on the knowledge of its members and create mechanisms for showcasing their unique voices. Below are a few communities for which I created websites to present our collective work to the world.

  • Data Science by Design (DSxD) - I am a co-founder and lead organizer of this organization, which is made up of data scientists, artists, and designers. One of the main activities of DSxD is sharing knowledge on best practices in data science. I head the Leadership Team in organizing, editing, and displaying all contributions. The posts on the blog are also available in print form as a book.
  • Curiosity Data Project - This website includes many tutorials using biodiversity and ecological data. Named after the "Cabinet of Curiosities", it covers wide-ranging types of analysis, including code for analyzing dinosaur fossils, 3D CT scans, Google Earth fire data, animal movement data, and more. Tutorials go through data retrieval, data cleaning, and exploratory analysis. These works were written by me, contributors, and a team of UC Berkeley undergraduate interns that I led during the 2018/2019 school year.
  • Eisen Lab Coding Club - As a postdoc, I led the organization of this body of work from the Eisen Lab at UC Berkeley. We met regularly to share knowledge of computational techniques for analyzing genomic and 3D image data.

In-person Workshops

I have extensive experience teaching coding and computational methodology to groups of people. I often created supplemental online material to accompany my in-person teaching. While I run in-person workshops less frequently now, because the pandemic made such learning experiences less accessible, I hope the material can still be used and remixed for other purposes.
  • Using Biodiversity and Natural History Museum Databases - This is a talk and workshop that introduces R users to natural history and biodiversity databases. We then work through an exploratory analysis in R using the Neotoma database.
  • The Data Science of Shape Using Momocs - This tutorial explores the use of the Momocs R package for doing 2D morphometric analysis.
  • Reproducible Research Version Control Lesson for Data Carpentry - This lesson teaches the concept of version control using Git. The lesson begins with slides describing the motivations for using version control and GitHub. The first activity is a follow-along partner activity that explores Git in GitHub only (02-git-in-github). Since the project is never brought onto the students' computers, the students do not need to install Git for this activity. The second activity starts with a demo in RStudio, which students can follow along with if they have installed Git (03-git-in-rstudio). This activity introduces the students to using Git on their own computers.
  • How to Fully Explore Your Clustering Results using ggplot and kohonen R packages - This tutorial uses the Titanic Survival dataset as a basis for learning clustering with Self-Organizing Maps (SOM). The focus is on how to visualize the results in order to fully understand what the clustering algorithm is doing.
  • Evo Devo Module - This class is about my favorite field of biology: Evolutionary Developmental Biology, often nicknamed Evo Devo. It is a lecture and lab that can be taught in about a three-hour period. The lab uses fresh-cut flowers to explore Evo Devo concepts and plant evolutionary biology. I taught this lesson at three different colleges and universities.
  • MACS2 to explore ChIP-Seq data - I built and taught this lesson when I was a teaching assistant for a genomics course (BIS180L) at UC Davis with Julin Maloof.
  • Introduction to Git and GitHub - This was written and taught with Matthias Bussonnier for the Hacker Within group at UC Berkeley.
  • Reproducible Science Workshop on Tools, Resources, and Practices
  • Reproducible Research Organization Lesson - A full workshop that I co-wrote and co-taught on two occasions.


These tutorials were created to stand alone for self-taught learning or to serve as a module in a larger course.
  • Become a Superhero, Handle Your Data with R: This is a beginner R course aimed at learners with no programming experience. I originally wrote this course for high school students, but it has been used in several undergraduate and graduate courses over the years. Since I wrote it when I first started using R, the tutorial includes insights and conceptual explanations that are usually glossed over in other beginner tutorials.
  • Gene Expression Analysis Self-Organizing Maps Tutorial: This tutorial was written for several research groups interested in using Self-Organizing Maps (SOM) for gene expression analysis. The benefit of SOM clustering, compared with other clustering algorithms for gene expression analysis, is that you can constrain clustering based on many variables, such as genotype.
  • Tutorial on Mixed-Effects Linear Modeling - This tutorial, which I adapted from Dan Chitwood, shows how to analyze data with mixed-effects linear models in R using the lme4 package.


  • Using AWS for Neural Networks - This documentation was written for my research team to build neural networks on Amazon Web Services.
  • BIS180L - This is a wonderful course designed by Julin Maloof. I was a teaching assistant for this course while pursuing my PhD at UC Davis.