Raymond Yee
January 28, 2014 (http://is.gd/wwod1403)
Checklist: can everyone run a sample IPython notebook and upload to bCourses
My goal: to make WwOD FUN, ENGAGING, and NOT OVERWHELMING
I want people to ask questions and flag stuff that's confusing
AJ Renold is wonderful: kind, smart, and responsive. Get to know him.
His role:
The skills include:
Material you'll find in http://software-carpentry.org/v4/index.html for example.
Please flag things that are unclear by posting questions to bCourses.
Work with AJ to get up to speed.
Should many people struggle with certain skills/knowledge areas, we'll consider providing more training and taking up some issues during class.
Thanks to everyone who took the Doodle: Office Hour choices. I'm going to leave my office hours at Tues, Thurs 3:30-4:30pm in 302 South Hall. We start today and run until the end of the semester -- unless I make an announcement canceling an office hour.
I'm willing to schedule other meetings and virtual office hours as the need arises. (For example, I'm a fan of Google Hangout, and the campus has a new video-conferencing system we can try out, or we can use the Web Conferences in Working with Open Data.)
Faculty Office Hours | School of Information
For info about AJ's office hours, see bCourses: AJ's Office Hours for Working With Open Data.
Working with Open Data: Assignments -- about 38 people submitted assignments.
When can I expect the remainder of class to submit assignments?
For this round, I will accept late submissions.
Results of survey of student background: Survey about Technical Background: Statistics
Lots of things happen when we run a Python program or run a cell in an IPython notebook. Most we can take for granted, but sometimes we need to understand some of the underlying complexity.
Much work goes into allowing us to work with abstractions but beware The Law of Leaky Abstractions.
The core of this course is using Python to work with data (particularly open data).
But consider, for a moment, all the things associated with the "simple" act of running a Python program or an IPython notebook:
I have a philosophy to guide us in this course
What is an execution environment? Python Essential Reference, Fourth Edition> Execution Environment : Safari Books Online
Python: (Core Language + Standard Library) + other installed packages
versions matter: the version of Python (see Python versions) + the versions of packages
Python core language + any Python packages installed --> often run across different contexts (operating systems)
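One way to see what "versions matter" means in practice is to ask the running interpreter what it is and which package versions it can see. The following is a minimal sketch, not part of the course materials: the function name describe_environment is hypothetical, the package names are only examples, and it assumes Python 3.8 or later (for importlib.metadata) rather than the Python 2.7 used elsewhere in these notes.

```python
import platform
import sys
from importlib import metadata  # assumption: Python 3.8+


def describe_environment(package_names):
    """Return a dict describing the interpreter, the OS, and package versions."""
    report = {
        "python": "%d.%d.%d" % sys.version_info[:3],
        "platform": platform.platform(),
        "packages": {},
    }
    for name in package_names:
        try:
            report["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            report["packages"][name] = None
    return report


if __name__ == "__main__":
    # Example package names only; substitute whatever you care about.
    print(describe_environment(["pip", "numpy", "pandas"]))
```

Running the same script in two different environments and comparing the output is a quick way to spot a version mismatch.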
Typically, the quotidian question we have is: how to install a given Python package
I would say that pip is a central tool for installing packages -- especially because of the great PyPI "repository of software for the Python programming language."
But there are times when it's actually really hard to get certain packages installed, and that's where I really appreciate Python distributions along with their package installers.
For this course, we recommend Anaconda: see IPython Installation Options · rdhyee/working-open-data-2014 Wiki.
(BTW, anyone can edit the course github wiki).
I recommended Wakari to start because it's easy to set up: create an account on wakari.io and you can go...
Think about the trade-offs between running on your laptop (Anaconda) vs running on a virtualized Python environment running on the continuum.io servers (Wakari).
Wakari is good to have around but I think everyone will be happier having IPython notebook running on their own laptops.
I encourage you to install the anaconda python distribution even if you already have another one running.
Conda — Continuum documentation is an Anaconda-specific alternative to virtualenv/virtualenvwrapper.
Conda [to quote the Conda docs]:
In my specific case, I'm running OS X 10.6.8. When I downloaded and installed anaconda, all my anaconda files ended up in ~/anaconda.
A really nice feature: "installs into a single directory and doesn't affect other Python installations on your system. Doesn't require root or local administrator privileges."
On my Mac, the key to swapping between using Anaconda environments and my virtualenvs is manipulating the $PATH environment variable. If I want to use anaconda, I need to put ~/anaconda/bin at the beginning of $PATH -- see Anaconda FAQ — Continuum documentation: conda environments
I have an alias defined in my ~/.profile to make this easier to do on the fly:
# have alias for using anaconda
alias use_conda='export ANACONDA_HOME=$HOME/anaconda; export PATH=$ANACONDA_HOME/bin:$PATH;'
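To see why the position of ~/anaconda/bin in $PATH matters, the lookup can be imitated from Python. This is a rough sketch: prepend_to_path is a hypothetical helper, and unlike the alias it also drops any later duplicate of the directory. shutil.which searches the directories of the given path string in order, which is the same rule the shell follows when it resolves python.

```python
import os
import shutil


def prepend_to_path(path_value, directory):
    """Return a PATH string with `directory` moved to the front."""
    entries = [e for e in path_value.split(os.pathsep) if e and e != directory]
    return os.pathsep.join([directory] + entries)


if __name__ == "__main__":
    # Location taken from the notes; adjust for your own install.
    anaconda_bin = os.path.expanduser("~/anaconda/bin")
    new_path = prepend_to_path(os.environ.get("PATH", ""), anaconda_bin)
    # Prints the first "python" found along new_path, or None if there is none.
    print(shutil.which("python", path=new_path))
```

This only computes a new string inside the Python process; it does not change the $PATH of the shell that launched it, which is why the alias uses export.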
My conda environments are found in ~/anaconda/envs. To get rid of an environment, just delete the corresponding subdirectory.
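Since each environment is just a subdirectory, listing the environments amounts to listing directories. A small sketch, assuming the ~/anaconda/envs location described above (list_envs is a hypothetical helper name, not a conda function):

```python
import os


def list_envs(envs_dir):
    """Return the sorted names of the subdirectories of envs_dir."""
    if not os.path.isdir(envs_dir):
        # No such directory, so there are no environments to report.
        return []
    return sorted(
        name
        for name in os.listdir(envs_dir)
        if os.path.isdir(os.path.join(envs_dir, name))
    )


if __name__ == "__main__":
    # Path taken from the notes; yours may differ.
    print(list_envs(os.path.expanduser("~/anaconda/envs")))
```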
Note: wakari also uses Conda: Custom Python Environments in Wakari
conda info: tells you things like what your default environment is.
conda --help: to get help
Packages included in Anaconda 1.8.0 — Continuum documentation
Just Python 2.7.6 and pip (and some other basic packages deemed important by conda):
To create minimal env and activate it:
conda create --no-default-packages -n minimal python=2.7.6 pip
source activate minimal
To list packages installed by conda:
conda list
To leave this environment:
source deactivate
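After activating or deactivating, it can be reassuring to confirm from inside Python which installation is actually running. The sketch below is illustrative only: active_environment is a hypothetical helper, and reading the CONDA_DEFAULT_ENV variable is an assumption about what the activation script sets in your shell; sys.executable and sys.prefix are reliable either way.

```python
import os
import sys


def active_environment(environ=None):
    """Report which interpreter is running and, if known, the conda env name."""
    environ = os.environ if environ is None else environ
    return {
        "interpreter": sys.executable,  # full path of the running python
        "prefix": sys.prefix,           # root directory of this installation
        # Assumption: the activation script exports CONDA_DEFAULT_ENV.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    print(active_environment())
```

If the prefix points somewhere under ~/anaconda/envs/minimal, the minimal environment is the one in use.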
[One issue that I've not worked out: source activate minimal should actually remove ~/anaconda/bin from my $PATH so I don't end up accidentally using the base installation when I don't want to.]
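For what it's worth, here is one way the "remove an entry from $PATH" step could be computed. This is only a sketch of the string manipulation, not a fix for the activate script: remove_from_path is a hypothetical helper, and the resulting value would still have to be exported by the shell itself.

```python
import os


def remove_from_path(path_value, directory):
    """Return a PATH string with every occurrence of `directory` removed."""
    unwanted = os.path.normpath(directory)
    kept = [
        entry
        for entry in path_value.split(os.pathsep)
        # normpath makes "/a/bin" and "/a/bin/" compare as equal
        if entry and os.path.normpath(entry) != unwanted
    ]
    return os.pathsep.join(kept)


if __name__ == "__main__":
    base_bin = os.path.expanduser("~/anaconda/bin")  # location from the notes
    print(remove_from_path(os.environ.get("PATH", ""), base_bin))
```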
What I call myenv -- install everything in anaconda:
conda create -n myenv anaconda
conda create -n ipython-dev ipython-notebook ipython-qtconsole pip matplotlib numpy pandas
I then install the master branch of IPython into this environment: Quickstart — IPython 1.1.0: An Afternoon Hack documentation
git clone https://github.com/rdhyee/working-open-data-2014.git
If you don't know how to use git/github, it's time to learn.
It is possible to use anaconda w/o defining any environments, but I recommend using them to control for dependencies.
pip install -U census
I'll show you in class how to use conda in Wakari -- think about what's going on in the Linux environments running on the Wakari machines
Day_02_A_US_Census_API.ipynb: Fill in the notebook working-open-data-2014/notebooks/Day_02_A_US_Census_API.ipynb at master · rdhyee/working-open-data-2014. Due Friday, Jan 31, 2014 at 11:59pm PST