Summary and Schedule

This lesson is an introduction to programming in Python 3 for people with little or no previous programming experience. It uses plotting as its motivating example and is designed to be used in both Data Carpentry and Software Carpentry workshops. The lesson refers to JupyterLab throughout, but it can also be taught with other Python 3 environments (e.g., repl.it, Anaconda).

Prerequisites

  1. Learners need to understand what files and directories are, what a working directory is, and how to start a Python interpreter.

  2. Learners must install Python 3 before the class starts.

  3. Learners must get the gapminder data before the class starts: please download and unzip the file python-novice-gapminder-data.zip.

Please see the setup instructions for more details.
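
As a quick self-check for the first two prerequisites, the minimal sketch below reports whether the running interpreter is Python 3 and which directory it treats as the working directory. The helper name `check_environment` is made up for illustration and is not part of the lesson's setup instructions.

```python
import os
import sys


def check_environment():
    """Return (is_python3, working_directory) for the running interpreter."""
    # sys.version_info.major is 3 for any Python 3.x interpreter.
    is_python3 = sys.version_info.major == 3
    # The working directory is where relative file paths are resolved from.
    working_directory = os.getcwd()
    return is_python3, working_directory


is_python3, working_directory = check_environment()
print("Running Python 3:", is_python3)
print("Working directory:", working_directory)
```

If the first line reports `False`, or the working directory is not where you expect your files to be, revisit the setup instructions before the class.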

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.

Getting the Data


The data we will be using is taken from the gapminder dataset. To obtain it, download and unzip the file python-novice-gapminder-data.zip. To follow the material as presented, launch the JupyterLab server in the root directory (see Starting JupyterLab).
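
The unzip step can also be done from Python itself. The sketch below assumes the downloaded archive sits in the current working directory under the file name given above. The helper `unzip_data` and the choice to extract into the same directory are assumptions made for illustration, not part of the lesson.

```python
import zipfile
from pathlib import Path


def unzip_data(zip_path, dest_dir="."):
    """Extract every file in zip_path into dest_dir and return the member names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Assumed location: the archive was downloaded into the working directory.
archive_path = Path("python-novice-gapminder-data.zip")
if archive_path.exists():
    for name in unzip_data(archive_path):
        print("extracted:", name)
else:
    print("Archive not found in", Path.cwd())
```

Unzipping with your operating system's file manager works just as well; the point is only that the extracted files end up where your Python session can find them.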

Alternative Data Sources: Generalist Repositories

In addition to platforms like GitHub, researchers often use generalist repositories to store, share, and access datasets. These repositories are designed to preserve research outputs (like data, code, and workflows) in a citable, discoverable, and standardized way. They are particularly valuable for ensuring long-term accessibility and reproducibility of research.

Under the NIH Generalist Repository Ecosystem Initiative (GREI), several generalist repositories are recognized for hosting biomedical and scientific data. These include:

  • Zenodo (zenodo.org): An open repository operated by CERN; it was launched to support EU-funded research but is widely used globally.
  • Dryad (datadryad.org): A nonprofit repository focused on publishing and preserving research data.

These repositories assign DOIs (Digital Object Identifiers) to datasets, making them easier to cite and track.
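
Because a DOI resolves through the doi.org resolver, a citable link can be built from the identifier alone. The sketch below illustrates this. The helper `doi_to_url` is hypothetical, and the DOI shown is a placeholder that does not refer to a real dataset.

```python
def doi_to_url(doi):
    """Build a resolver link for a DOI, tolerating a leading 'doi:' prefix."""
    cleaned = doi.strip()
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    return "https://doi.org/" + cleaned


# Placeholder DOI for illustration only; it does not refer to a real dataset.
print(doi_to_url("doi:10.1234/example-dataset"))
```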

Why Use Generalist Repositories?

  • Long-term preservation: Ensures data remains accessible beyond project lifetimes.
  • Compliance: Meets funder (e.g., NIH) and publisher requirements for data sharing.
  • Interoperability: Standardized metadata makes data reusable across disciplines.

For this lesson, we’ve provided the data via GitHub for simplicity, but we encourage you to explore these repositories for your own work.

Installing Python Using Anaconda


Please refer to the Python section of the workshop website for installation instructions.