Day 3: Setting Up Cont'd: Environments and Contexts

Raymond Yee

January 28, 2014 (http://is.gd/wwod1403)

Goals Today

I want people to ask questions and flag stuff that's confusing

Meet AJ Renold, the Course Tutor

AJ Renold is wonderful: kind, smart, and responsive. Get to know him.

His role:

Assumed Background Knowledge

The skills include:

This is material you'll find at http://software-carpentry.org/v4/index.html, for example.

Office Hours

Thanks to everyone who took the Doodle: Office Hour choices. I'm going to leave my office hours at Tues, Thurs 3:30-4:30pm in 302 South Hall. We start today and run until the end of the semester -- unless I make an announcement canceling an office hour.

I'm willing to schedule other meetings and virtual office hours as the need arises. (For example, I'm a fan of Google Hangout, and the campus has a new video-conferencing system we can try out, or the Web Conferences in Working with Open Data.)

Faculty Office Hours | School of Information

For info about AJ's office hours, see bCourses: AJ's Office Hours for Working With Open Data.

Assignments from Last Week

Working with Open Data: Assignments -- about 38 people submitted assignments.

When can I expect the remainder of class to submit assignments?

For this round, I will accept late submissions.

Results of survey of student background: Survey about Technical Background: Statistics

Abstractions in our work

Lots of things happen when we run a Python program or run a cell in an IPython notebook. Most we can take for granted, but sometimes we need to understand some of the underlying complexity.

Much work goes into allowing us to work with abstractions but beware The Law of Leaky Abstractions.

The core of this course is using Python to work with data (particularly open data).

Many Things at Work in "Simple" Computation

But consider, for a moment, all the things associated with the "simple" act of running a Python program or an IPython notebook:

The Culture of Software Development

A few Questions about the Data

RY's philosophy for learning how to work with data

I have a philosophy to guide us in this course.

Learning to Program is a bit Like learning a natural language

Key Concept for Today: Execution Environment of Python

What is an execution environment? See Python Essential Reference, Fourth Edition > Execution Environment (Safari Books Online).

Python: (Core Language + Standard Library) + other installed packages
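To make that breakdown concrete, here is a minimal sketch that asks a running interpreter to describe its own execution environment. It uses only the standard library. The helper name `describe_environment` is made up for illustration and is not part of the course materials.

```python
from __future__ import print_function  # keeps print() consistent on Python 2.7

import platform
import sys


def describe_environment():
    """Return basic facts about the interpreter running this code."""
    return {
        "executable": sys.executable,          # path of the python binary in use
        "version": platform.python_version(),  # version of the core language
        "prefix": sys.prefix,                  # root of the installation or environment
        "search_path": list(sys.path),         # where `import` looks for packages
    }


if __name__ == "__main__":
    env = describe_environment()
    for key in ("executable", "version", "prefix"):
        print(key, "=", env[key])
    print("first entries on the import search path:")
    for entry in env["search_path"][:5]:
        print("  ", entry)
```

Running this in two different environments (for example Wakari and a laptop install) should show different values for `executable` and `prefix`, which is a quick way to see which Python you are actually using.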

Typically, the quotidian question we have is: how do we install a given Python package?

Major Alternatives to Package Installers

I would say that pip is a central tool for installing packages -- especially because of the great PyPI "repository of software for the Python programming language."

But there are times when it's really hard to get certain packages installed, and that's where I really appreciate Python distributions along with their package installers.

For this course, we recommend Anaconda: see IPython Installation Options · rdhyee/working-open-data-2014 Wiki.

(BTW, anyone can edit the course GitHub wiki.)

Wakari vs Anaconda

I recommended Wakari to start because it's easy to set up: create an account on wakari.io and you can go...

Think about the trade-offs between running on your laptop (Anaconda) and running in a virtualized Python environment on the continuum.io servers (Wakari).

Wakari is good to have around but I think everyone will be happier having IPython notebook running on their own laptops.

I encourage you to install the Anaconda Python distribution even if you already have another one running.

Conda

Conda (see the Continuum documentation) is an Anaconda-specific alternative to virtualenv/virtualenvwrapper.

Conda [to quote the Conda docs]:

Examples — Continuum documentation

How I use conda

Conda invocations I've found Useful

conda info: tells you things like what your default environment is.

conda --help: to get help.

Packages included in Anaconda 1.8.0 — Continuum documentation

creating a minimalist conda environment

Just Python 2.7.6 and pip (and some other basic packages deemed important by conda):

To create minimal env and activate it:

conda create --no-default-packages -n minimal python=2.7.6 pip

source activate minimal

To list packages installed by conda:

conda list
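As a rough Python-side counterpart to `conda list`, the sketch below lists the distributions that the current interpreter can see. Note the assumptions: it relies on `importlib.metadata`, which is only in the standard library from Python 3.8 onward (so it will not run in the Python 2.7.6 environment created above), and it reports Python distributions only, whereas `conda list` also reports non-Python packages. The function name `installed_distributions` is hypothetical.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_distributions():
    """Return a sorted list of (name, version) pairs visible to this interpreter."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken or missing metadata
            found.add((name, dist.version))
    # Sort case-insensitively by distribution name.
    return sorted(found, key=lambda pair: pair[0].lower())


if __name__ == "__main__":
    for name, version in installed_distributions():
        print(name, version)
```

Comparing this output before and after `source activate minimal` is one way to see that each environment carries its own set of packages.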

To leave this environment:

source deactivate

[One issue that I've not worked out: source activate minimal should actually remove ~/anaconda/bin from my $PATH so I don't end up accidentally using the base installation when I don't want to.]
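The $PATH concern can be checked from inside Python. The sketch below lists the directories on PATH in lookup order and reports which `python` executable would be found first. The helper names `path_entries` and `first_on_path` are illustrative, and `shutil.which` assumes Python 3.3 or later, so this would need to run under a Python 3 interpreter rather than the 2.7.6 environment above.

```python
import os
import shutil


def path_entries(path_value=None):
    """Split a PATH-style string into its directories, in lookup order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    return [entry for entry in path_value.split(os.pathsep) if entry]


def first_on_path(command="python"):
    """Return the full path of the first match for `command` on PATH, or None."""
    return shutil.which(command)


if __name__ == "__main__":
    for position, entry in enumerate(path_entries()):
        print(position, entry)
    print("python resolves to:", first_on_path("python"))
```

If the environment's `bin` directory appears before `~/anaconda/bin` in the listing, commands installed in the environment win; anything missing from the environment can still fall through to the base installation, which is the situation described above.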

a maximalist environment

What I call myenv -- install everything in Anaconda:

conda create -n myenv anaconda

an IPython-dev environment

conda create -n ipython-dev ipython-notebook ipython-qtconsole pip matplotlib numpy pandas

I then install the master branch of IPython into this environment: Quickstart — IPython 1.1.0: An Afternoon Hack documentation

Using the Course Github site:

git clone https://github.com/rdhyee/working-open-data-2014.git

If you don't know how to use git/github, it's time to learn.

Conda environments to Possibly Use

It is possible to use Anaconda without defining any environments, but I recommend using them to keep dependencies under control.

installing census package

pip install -U census
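After running the install command, it helps to confirm that the package landed in the environment you are actually using. The sketch below checks whether a top-level module can be found without importing it. It assumes Python 3.4 or later for `importlib.util.find_spec`, and it assumes the package installs a module named `census`; `is_importable` is a hypothetical helper name.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module with this name can be found.

    The module itself is not imported; only the import search path is consulted.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "json" is in the standard library; "census" is only there after pip install.
    for name in ("json", "census"):
        if is_importable(name):
            print(name, "-> available")
        else:
            print(name, "-> not installed in this environment")
```

If `census` shows up as not installed right after a successful `pip install`, the likely cause is that `pip` and the running interpreter belong to different environments.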

Wakari custom environments

I'll show you in class how to use conda in Wakari -- think about what's going on in the Linux environments running on the Wakari machines

Custom Python Environments in Wakari

Assignments / Homework

Day_02_A_US_Census_API.ipynb: Fill in the notebook working-open-data-2014/notebooks/Day_02_A_US_Census_API.ipynb at master · rdhyee/working-open-data-2014. Due Friday, Jan 31, 2014 at 11:59pm PST