Summary and Schedule

This lesson is an introduction to programming in Python 3 for people with little or no previous programming experience. It uses plotting as its motivating example and is designed to be used in both Data Carpentry and Software Carpentry workshops. The lesson refers to JupyterLab throughout, but it can also be taught with other Python 3 environments (e.g., repl.it, Anaconda).

Prerequisites

Learners need to understand what files and directories are, what a working directory is, and how to start a Python interpreter.
Learners must install Python 3 before the class starts.
Learners must get the gapminder data before the class starts: please download and unzip the file python-novice-gapminder-data.zip.

Please see the setup instructions for more details.
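As a quick self-check of the prerequisites above, the following sketch reports whether the running interpreter is Python 3, what the current working directory is, and whether a data folder is visible from it. The function name `check_environment` and the folder name `data` are assumptions made for this illustration; adjust the folder name to wherever you unzipped the files.

```python
import os
import sys


def check_environment(data_dir="data"):
    """Return a dict describing whether the prerequisites appear to be met.

    data_dir is an assumed folder name, not something the lesson mandates.
    """
    return {
        "python3": sys.version_info.major == 3,
        "working_directory": os.getcwd(),
        "data_found": os.path.isdir(data_dir),
    }


if __name__ == "__main__":
    # Print one line per check so problems are easy to spot.
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

If `data_found` is reported as `False`, the interpreter was probably started in a different directory from the one holding the unzipped data.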

Setup Instructions

Download files required for the lesson

00h 00m  1. Running and Quitting

How can I run Python programs?

00h 15m  2. Variables and Assignment

How can I store data in programs?

00h 35m  3. Data Types and Type Conversion

What kinds of data do programs store?
How can I convert one type to another?

00h 55m  4. Built-in Functions and Help

How can I use built-in functions?
How can I find out what they do?
What kind of errors can occur in programs?

01h 20m  5. Morning Coffee

01h 35m  6. Libraries

How can I use software that other people have written?
How can I find out what that software does?

01h 55m  7. Reading Tabular Data into DataFrames

How can I read tabular data?

02h 15m  8. Pandas DataFrames

How can I do statistical analysis of tabular data?

02h 45m  9. Plotting

How can I plot my data?
How can I save my plot for publishing?

03h 15m  10. Lunch

04h 00m  11. Lists

How can I store multiple values?

04h 20m  12. For Loops

How can I make a program do many things?

04h 45m  13. Conditionals

How can programs do different things for different data?

05h 10m  14. Looping Over Data Sets

How can I process many data sets with a single command?

05h 25m  15. Afternoon Coffee

05h 40m  16. Writing Functions

How can I create my own functions?

06h 05m  17. Variable Scope

How do function calls actually work?
How can I determine where errors occurred?

06h 25m  18. Programming Style

How can I make my programs more readable?
How do most programmers format their code?
How can programs check their own operation?

06h 55m  19. Wrap-Up

What have we learned?
What else is out there and where do I find it?

07h 15m  20. Feedback

How did the class go?

07h 30m  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.

Getting the Data

The data for this lesson are taken from the gapminder dataset. To obtain them, download and unzip the file python-novice-gapminder-data.zip. To follow the presented material, launch the JupyterLab server in the root directory (see Starting JupyterLab).
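If you prefer to unzip the archive from Python rather than with your file manager, a minimal sketch using the standard-library `zipfile` module could look like the following. It assumes the zip file has already been downloaded to the location you pass in; the helper name `unzip_data` and the default destination (the current directory) are choices made for this example.

```python
import zipfile
from pathlib import Path


def unzip_data(zip_path, dest_dir="."):
    """Extract zip_path into dest_dir and return the sorted archive entry names."""
    zip_path = Path(zip_path)
    if not zip_path.is_file():
        # The archive has to be downloaded before it can be extracted.
        raise FileNotFoundError(f"{zip_path} not found; download it first")
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())
```

Calling `unzip_data("python-novice-gapminder-data.zip")` from the directory that holds the downloaded file would extract its contents there and return the list of extracted names, which is a convenient way to confirm what the archive contained.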

Alternative Data Sources: Generalist Repositories

In addition to platforms like GitHub, researchers often use generalist repositories to store, share, and access datasets. These repositories are designed to preserve research outputs (like data, code, and workflows) in a citable, discoverable, and standardized way. They are particularly valuable for ensuring long-term accessibility and reproducibility of research.

NIH-GREI Recommended Repositories

Under the NIH Generalist Repository Ecosystem Initiative (GREI), several generalist repositories are recognized for hosting biomedical and scientific data. These include:

Zenodo (zenodo.org): An open repository operated by CERN, created to support EU-funded research and now widely used globally.
Dryad (datadryad.org): A nonprofit repository focused on publishing and preserving research data.

These repositories assign DOIs (Digital Object Identifiers) to datasets, making them easier to cite and track.
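To illustrate why a DOI makes a dataset easy to cite, the sketch below turns a DOI string into a persistent link by prefixing it with the doi.org resolver. The helper name `doi_to_url` and the DOI used in the usage note are made-up placeholders for illustration, not identifiers of a real dataset.

```python
def doi_to_url(doi):
    """Return the doi.org resolver link for a DOI string."""
    doi = doi.strip()
    # Accept the common "doi:" prefix found in citations.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Every DOI begins with the directory indicator "10."
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return f"https://doi.org/{doi}"
```

For example, `doi_to_url("doi:10.1234/example.5678")` returns `https://doi.org/10.1234/example.5678`. Because the resolver redirects to wherever the dataset currently lives, the link in a citation stays valid even if the repository reorganizes its pages.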

Why Use Generalist Repositories?

Long-term preservation: Ensures data remains accessible beyond project lifetimes.
Compliance: Meets funder (e.g., NIH) and publisher requirements for data sharing.
Interoperability: Standardized metadata makes data reusable across disciplines.

For this lesson, we’ve provided the data via GitHub for simplicity, but we encourage you to explore these repositories for your own work.

Installing Python Using Anaconda

Please refer to the Python section of the workshop website for installation instructions.