
A Data Carpentry Workshop

NWU Genomics Data Carpentry

Sep 26 - 29, 2016

08:00 - 15:00

Instructors: Bianca Peterson, Maryke Schoonen, Jason Williams

Helpers: Martin Dreyer, Anelda van der Walt, Bertie Seyffert, Riaan van der Walt

General Information

Data Carpentry workshops are for any researcher who has data they want to analyze, and no prior computational experience is required. This hands-on workshop teaches basic concepts, skills and tools for working more effectively with data.

This event forms part of a series of Software and Data Carpentry workshops run by newly qualified African-based instructors in collaboration with international mentors. The workshops form part of a twelve-month programme developed by the North-West University, University of Cape Town, Talarify, the Software Carpentry Foundation, Data Carpentry, and Mozilla Science Lab. For more information about the programme, please see A Programme for the Development of Computational and Digital Research Capacity in South Africa and Africa.

We will cover four topics, one per day: Introductions, Cloud Computing and R; Data Cleaning and Manipulation in R; Continuation in R and Data Importing and Uploading; and Shell Continuation and Workflows. Participants should bring their laptops and plan to participate actively. By the end of the workshop, learners should be able to manage and analyze data more effectively and to apply the tools and approaches directly to their ongoing research.

Who: The course is specifically aimed at postgraduate students and other researchers who already engage in research projects where Next Generation Sequencing data will be generated. Preference will be given to researchers and students from the North-West University.

Cost: A non-refundable registration fee of R500 is charged to cover workshop costs.

Registration: Please register by completing the registration form.

Where: Room G18, Building G23, Potchefstroom Campus, NWU, Gerrit Dekker Street, Potchefstroom. Get directions with OpenStreetMap or Google Maps.

Requirements: Participants must bring a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges. They should have a few specific software packages installed (listed below). They are also required to abide by Data Carpentry's Code of Conduct.

Installation Party: Registered participants are invited to get support with installing the required software on 23 September between 11:00 and 13:00 at Room 236, Building F3, Potchefstroom Campus, NWU. Installation instructions are given below.

Contact: Please email eresearch@nwu.ac.za for more information.


Preliminary Schedule

Surveys

Please be sure to complete these surveys before and after the workshop.

Pre-workshop Survey

Post-workshop Survey

Day 1: Introductions, Cloud Computing and R

08:00 - 08:30 Rise 'n Shine Coffee Time
08:30 - 09:00 Welcome and Introductions
09:00 - 10:00 Introduction to Data Carpentry
Introduction to the Data Set
Genomics Data Tidiness
Connecting to the Cloud in 5 Minutes or Less
10:00 - 10:30 Coffee
10:30 - 12:30 R and RStudio Orientation
Introduction to R and RStudio
12:30 - 13:30 Break
13:30 - 14:50 Dataframes and Metadata
14:50 - 15:00 Feedback and Wrap-up
15:00 - later Networking opportunity at Drakenstein Restaurant

Day 2: Data Cleaning and Manipulation in R

08:00 - 08:30 Rise 'n Shine Coffee Time
08:30 - 10:00 Dataframes (continued)
10:00 - 10:30 Coffee
10:30 - 12:30 Data Cleaning and Manipulation With dplyr (continues on Day 3)
12:30 - 13:30 Break
13:30 - 14:50 Using Spreadsheets Effectively - Optional but highly recommended
14:50 - 15:00 Feedback and Wrap-up

Day 3: Continuation in R and Data Importing and Uploading

08:00 - 08:30 Rise 'n Shine Coffee Time
08:30 - 10:00 Data Cleaning and Manipulation With dplyr (continued)
Plotting and Visualizing in R
Data Importing and Uploading
10:00 - 10:30 Coffee
10:30 - 12:20 Introduction to Shell
12:20 - 12:30 Feedback and Wrap-up
12:30 - 13:30 Break
13:30 - 15:00 Optional: Discussion of individual projects

Day 4: Shell Continuation and Workflows

08:00 - 08:30 Rise 'n Shine Coffee Time
08:30 - 10:00 Shell (continued)
Project Organization and Documentation
10:00 - 10:30 Coffee
10:30 - 12:20 QC of Sequencing Data
Automating Analyses - Shell Scripting
Creating Variant Calling Workflows
12:20 - 12:30 Feedback and Wrap-up
12:30 - 13:30 Break
13:30 - 15:00 Optional: Discussion of individual projects

Etherpad: http://pad.software-carpentry.org/2016-09-26-nwu-genomics.
We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.


Setup

To participate in a Data Carpentry workshop, you will need working copies of the described software. Please make sure to install everything (or at least to download the installers) before the start of your workshop. You should bring and use your own laptop to ensure that the tools are properly set up for an efficient workflow once you leave the workshop.

Please follow these Setup Instructions.

We maintain a list of common installation issues, written as a reference for instructors, on the Configuration Problems and Solutions wiki page. It may also be useful if you run into problems during setup.