ASTR 9 Python Introduction

An Introduction to Programming with Python

AY9 - Winter 2018

The goal of this notebook is to give a brief introduction to how to begin programming with Python. Since it is nowhere near complete in its coverage of the topic, I've listed some additional resources below for self-directed learning.

  • Read the docs: python.org
  • The official Python tutorial is fantastic!
    • chapters 3 through 5 cover a lot of good material for starting out
  • Object-Oriented Programming in Python. Despite the name, this free online textbook comprehensively covers much of the language beyond just object-oriented programming
  • Stack Overflow: you may end up here anyway when googling your problems, and it can be a good source of programming questions and answers.

Note that we'll be using Python3. While there aren't too many major differences between the different versions of Python3, there are some notable differences between Python2 and Python3 that may come up from time to time.

Jupyter notebook

A Jupyter notebook is a browser-based environment for running computations. We'll be using this to run Python code on the class server. You can read the official documentation here, but the main thing we need to know for now is that we can enter text into cells, which we can evaluate by either hitting "Shift + Enter" or selecting "run cell" from the menubar.

A rough overview of using these notebooks can be found on the main Canvas page for the course.

https://canvas.ucsc.edu/courses/10107/files/374251?module_item_id=31967

Glossary

Python

  • Python - a dynamically-typed interpreted programming language
  • script - a file with lines of Python code
  • module - a script that can be imported into another script
  • package - a module or group of modules that are easily installed
  • Jupyter notebook - a browser-based environment for running programs, AKA an IPython notebook
  • Jupyterhub - a server that provides users with individual Jupyter notebooks
  • hyperion - the name of our Jupyterhub server, also a moon of Saturn

Programming terms

  • syntax - the set of rules that define a legal statement in a language
  • variable - a label for an object in the computer's memory
  • assignment - the act of storing a value in a variable
  • scope - for a particular variable, its scope refers to the parts of the code where its assignment is valid; it can be global to the whole script, or local to a particular function or class
  • literal - a fixed value of a data type, e.g., 42 is a literal int, "42" is a literal string
  • binary operator - a function that takes two values and returns a new value, e.g., +, -, *, /
  • unary operator - a function that takes one value and returns a new value, e.g., -, not
  • loop - an evaluation of the same logic multiple times
  • conditional - a control statement that determines which of some possible statements will be evaluated
  • function - an object that can be called with some input values and can return some output values
  • class - a "prototype" of an object, e.g., int is a class, and 1, 2, and 3 are instances (or objects) of this class
  • object - an instance of a particular class; in Python, just about everything is some kind of object
  • method - a function associated with objects of a particular class; e.g., for the list class, the sort() function is a method of the class
  • object-oriented programming - a programming style that relies on organizing logic into classes of objects which can have their own attributes and functions
  • functional programming - a programming style that only uses "pure functions" (that is, functions which do not have side effects, and only interact with the rest of the scope through their return values)

Basic variable types

  • string - a sequence of characters, created by quotation marks, e.g., "Hello Python!"
  • int - a type of numeric value, short for integer
  • float - a type of numeric value, short for floating point (an approximation of a real number)
  • bool - short for Boolean, a value that is either True or False

Data structures

  • list - a sequence of values, indexed by the natural numbers (0, 1, 2, ...)
  • dictionary - a set of values, indexed by arbitrary objects
  • array - a sequence of values, indexed as a list, but fixed in size
  • index - a natural number (0, 1, 2, ...) that acts as an address into a list or an array
  • slice - a range of indices that allows you to select some subset of a list or an array

Hello, world

This is a simple program. Its output is the text, "Hello, everyone!".

In [ ]:
print("Hello, everyone!")

So what's the big deal? That it isn't a big deal to do something as "simple" as outputting text! Compare this with the equivalent program in Java:

public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello, everyone!");
    }
}

That's not to say that programming languages like Java or C++ don't have their proper purpose or place. But the biggest benefit of Python is that it decreases the barrier between thinking something and doing something. In this tutorial, we'll get a taste of just what we can do.

A meta-comment: lines that start with the hash symbol, #, are comments; they do nothing, but are a great way of making notes for both yourself and others who will be looking at your code.

Peppered throughout this notebook are some caution signs, indicating some common pitfalls. Here's our first one! Evaluating a code cell in a Jupyter notebook will print out the output from the last line. So

"hello"
"world"

will print out "world", while

print("hello")
print("world")

will explicitly print out both "hello" and "world".

Using Python as a scientific calculator

In [ ]:
2 + 2
In [ ]:
4 - 2
In [ ]:
4 * 2
In [ ]:
4 / 2
In [ ]:
# Note that the power operation is "**", not "^"!
4 ** 2

In fact, the caret operator, ^, is used in a completely different context in Python, and so

1 ^ 2

is a legal operation which happens to return 3. See bitwise XOR if you're interested, and ignore it otherwise.
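
To make the difference concrete, here is a small sketch that applies both operators to the same pair of numbers. The variable names are only for illustration.

```python
# exponentiation versus bitwise XOR on the same operands
power = 4 ** 2   # 4 squared
xor = 4 ^ 2      # XOR of the binary numbers 100 and 010, which is 110
print(power)     # 16
print(xor)       # 6
```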

In [ ]:
(1 + 2) * 3 ** 4 / 5 - 6
In [ ]:
# one less-used operator: modulo
# a mod b is the remainder of a / b
15 % 4

Assigning values to variables

Variables, like in algebra, are a container to hold an arbitrary value. We can store a value in a variable through the process of assignment, using the assignment operator, =.

In [ ]:
a = 1
b = 2
print(a)
print(b)
In [ ]:
# we can reassign the variable to have a new value
a = 1 + 2
print(a)
a = 27
print(a)
In [ ]:
# we can perform operations on variables as we did for normal numbers
a = 1
b = 2
print(a + b)
In [ ]:
# note that in this notebook environment, variables will stick around outside of their cell
print(a, b)

This is useful in that you don't need to write all of your code to be evaluated in one cell. But it can lead to unexpected behavior if you are constantly assigning to the same variable names.

Beware! While the expression "3 + 1 = 4" makes sense mathematically, it is nonsense in Python. The assignment operator, "=", goes to the right of the variable to be assigned, and is followed by the value with which the variable is assigned.

3 + 1 = 4

will raise a syntax error.

Since updating a variable based on its previous value is a common operation, there's a shortcut for doing this assignment.

In [ ]:
a = 1
print(a)
a += 1  # is the same thing as writing a = a + 1
print(a)
a *= 3  # is the same thing as writing a = a * 3
print(a)

Functions

We've already seen one example of a function: print(). This function takes one or more objects as input and prints out a representation of those objects. Strictly speaking, it doesn't have any output, so assigning x = print('test') would cause the variable x to have the special value of None.
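
A quick way to check this claim is to capture what print() returns and test it, as in the short sketch below.

```python
# print() displays its argument but returns nothing, so x ends up as None
x = print('test')
print(x is None)  # True
```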

We'll now see how to define our own functions.

We define a new function with the def keyword. There's a colon after the initial line and all other lines in the function are indented. In the example below, the function name is norm, and the variables x and y are called the parameters of the function. The function can be called with specific parameters (e.g., norm(4, 3)), in which case we call those specific values the arguments of the function. The return keyword at the end of the function block indicates that the expression to the right will be outputted by the function.

It's good practice to include comments that specify what kinds of input the function expects and what it will return. A triple-quoted string placed on the first line of the function body (called a docstring) works as a multi-line comment for this purpose. Remember, code is read much more often than it is written, so well-written comments are incredibly important.

In [ ]:
def norm(x, y):
    """
    Calculate the Euclidean norm of a vector in 2D with coordinates x and y.
    
    Parameters
    ----------
    x, y : float, coordinates of vector
    
    Returns
    -------
    n : float, the Euclidean norm of the vector
    """
    n = (x ** 2 + y ** 2) ** (1 / 2)
    return n

print(norm(4, 3))
print(norm(1, -1))

Functions define their own variable scope. This means that any variables assigned in the function are not accessible outside of the function. The examples below demonstrate the difference between the local variable, dist, defined within the function distance, and the global variable of the same name defined in the cell below that. While functions create their own local scope, conditional blocks share the global variable scope. For more background, see the section on variables and scope in the Object-Oriented Programming textbook.

In [ ]:
def distance(x1, y1, x2, y2):
    dist = norm(x2 - x1, y2 - y1)
    print("dist = ", dist)
    return dist

result = distance(1, 2, 3, 4)
print(result)

# print(dist) # will get a NameError since dist is not defined
In [ ]:
# Now if we assign a value to dist outside of the function, it won't be changed by the function
dist = "outside variable"
result = distance(1, 2, 3, 4)
print("dist = ", dist)

One nice thing about working in the notebook environment: much of the documentation for functions and other objects is built into their definitions. You can access it interactively with the help function.

In [ ]:
x = [0, 1, 2]
help(x.append) # prints out the documentation string for the append function
In [ ]:
# this even works for functions that you define!
help(norm)

Strings

Aside from just doing number crunching, sometimes we'll need to manipulate strings of characters. Python provides a lot of built-in tools for doing so.

In [ ]:
s = "Hello, World!"
print(s)
print(s.lower())
print(s.upper())
print(s.replace('World', 'AY9'))

Quick tangent on methods

Here we see that there are certain functions (like lower() and replace()) that are called with an associated object. These special functions are called methods, and they use their associated object to accomplish their job. So when we evaluate s.lower(), we are implicitly giving the string s as input to the function lower(), which then returns a new string of lower-case characters in s.
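
One way to see this implicit input is to call the same function through the str class and pass the string explicitly. This is only a sketch to illustrate the idea; in practice the s.lower() form is what you would normally write.

```python
s = "Hello, World!"
# calling the method on the object...
print(s.lower())
# ...gives the same result as calling it on the str class and passing s explicitly
print(str.lower(s))
# the method returned a new string, so the original is unchanged
print(s)
```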

Strings can even be "added" together and "multiplied" by integers.

In [ ]:
new_string = s + " It's a beautiful day!"
print(new_string)
print(2 * s)

Lists

A list is a sequence of objects, indexed by the natural numbers (0, 1, 2,...).
Lists are denoted with square brackets, e.g., [1, 2, 3] is the list containing 1, 2, and 3 in that order.

To fetch the entry at a given index, $i$, for the list, x, we use square brackets enclosing the index: x[i].

We start counting indices at 0, not 1! Some languages (e.g., FORTRAN, Matlab, and Julia) start counting at 1.

In [ ]:
# note the two uses of square brackets; we use them to create a list,
# but also to indicate that we are indexing into a list variable
x = [0, 2, 4, 6]
print(x)
print(x[0])
print(x[1])
print(x[2])
print(x[3])

If you try to get an index outside of the size of the list, you'll get an error. So if you need to know how long a list is, use the len function.

In [ ]:
x = [0, 1, 2, 3]
print('length of x:', len(x))
# x[4] # will raise an IndexError

You can count entries from the end of the list by indexing with negative integers.

In [ ]:
x = [0, 1, 2, 3]
print(x[-1])
print(x[-2])
print(x[-3])
print(x[-4])

You can slice through a list to get a copy of a subsequence of the list. The notation is x[a:b], where x is the list, a is the lower index (inclusive) and b is the upper index (exclusive).

In [ ]:
x = [0, 1, 2, 3]
print(x[0:2])
print(x[0:3])
print(x[:3]) # note that if there is nothing to the left of the colon, the 0 is implicit
print(x[1:]) # if there is nothing to the right of the colon, it will slice through the end
# you can get fancy and slice with negative integers!  see if you can make sense of this one
print(x[:-2]) 
In [ ]:
# You can reassign individual entries in a list.
x = [1, 2, 3, 4]
print(x)
x[1] = 486.1
print(x)
In [ ]:
# Lists can hold anything!  Even other lists.
a = [1, "a string", [1, 2, 3]]
print(a)
print(a[0])
print(a[1])
print(a[2])
# note that the second set of square brackets refers 
# to the inner list at index 2 in the outer list
print(a[2][0])

There are lots of built-in tools for manipulating lists in Python. While slicing makes a copy of the list, some list operations will change the list.

In [ ]:
# construct a list, starting at 1, ending at 10 (exclusive) and stepping by 2's
# in python3, the builtin range function needs to be collected into a list as shown below
x = list(range(1, 10, 2))
print(x)
In [ ]:
# construct a list, implicitly starting at 0, and ending at 5 (exclusive), 
# stepping implicitly by 1
y = list(range(5))
print(y)

We often want to know how long a list is before doing some operations with it. We can use the builtin len function to get the length of a list or any other sequential data structure.

In [ ]:
x = list(range(3))
print(x)
print(len(x))
In [ ]:
# adding lists will make a new list by combining the two
print(x + y)
In [ ]:
# multiplying lists by integers
print(2 * y)
In [ ]:
# make a new list and sort it
z = x + y
print(z)
z.sort()
print(z)
In [ ]:
# appending to lists
x = [3, 2]
x.append(1)
print(x)
x.append(0)
print(x)
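
The difference between copying and changing a list in place, mentioned at the start of this set of examples, can be sketched as follows. The variable names are made up for illustration.

```python
original = [3, 1, 2]

# slicing returns a new list, so changing the copy leaves the original alone
sliced_copy = original[:]
sliced_copy[0] = 99
print(original)     # [3, 1, 2]
print(sliced_copy)  # [99, 1, 2]

# sort() and append() change the list itself rather than making a new one
original.sort()
original.append(4)
print(original)     # [1, 2, 3, 4]
```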

Tuples

A tuple is like a list, but is immutable, meaning you can't change the value of an entry after it's created. Tuples are specified with parentheses instead of square brackets, but we still read their entries using square brackets (since parentheses are reserved for function calls).

In [ ]:
letters = ('a', 'b', 'c')
print(letters)
print(letters[1]) # will print 'b', recall we index starting at 0
# letters[0] = 'A' # would get a TypeError since tuples cannot be changed after creation

Dictionaries

A dictionary is another data structure, but instead of being indexed by a sequence of natural numbers, it can be indexed by almost any immutable object (such as a string, a number, or a tuple), so entries are looked up by key rather than by position. The indices of a dictionary are called keys. The entries are called values.

We can use curly braces to define a dictionary, but as with tuples and lists, the entries are read using square brackets.

In [ ]:
# create a dictionary with keys 'a', 'b', and 'c'
d = {'a': 1, 'b': 2, 'c': 3}

print(d['a'])
print(d['b'])
print(d['c'])
In [ ]:
# trying to access the dictionary where there is no key leads to... a KeyError!
# d['d']
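
Dictionaries can also be changed after they are created, and there are ways to sidestep the KeyError. The sketch below uses the in keyword (covered in the section on conditionals) and the built-in get method of dictionaries; the keys and values are arbitrary.

```python
d = {'a': 1, 'b': 2, 'c': 3}

# assigning to a new key adds an entry; assigning to an existing key replaces its value
d['d'] = 4
d['a'] = 10
print(d['a'], d['d'])  # 10 4

# check for a key before reading it to avoid a KeyError
if 'e' in d:
    print(d['e'])
else:
    print("no entry for 'e'")

# get() returns a fallback value instead of raising an error when the key is missing
print(d.get('e', 0))  # 0
```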

Arrays

An array is like a list, but has a fixed size (i.e., we can't append values to it, but we can change entries within the array). The Python package numpy provides a fairly comprehensive set of modules for working with arrays. We won't cover them here, but you'll find them useful for working with large sets of numerical values.

Conditionals and booleans

A boolean variable has one of two states, either True or False. Boolean variables can be combined with the binary operators and and or, and inverted with the unary operator not.

In [ ]:
sky_is_blue = True
grass_is_pink = False
print(sky_is_blue and grass_is_pink)
print(sky_is_blue or grass_is_pink)
print(not sky_is_blue)
In [ ]:
# Numerical comparison operators (e.g., "equal to" or "less than") will return booleans.
print(1 == 4)  # equal to
print(1 != 2)  # not equal to
In [ ]:
print(1 < 2)   # less than
print(1 <= 2)  # less than or equal to
In [ ]:
print(1 > 2)   # greater than
print(1 >= 2)  # greater than or equal to

A conditional is a statement that lets us choose different code evaluations depending on the state of a boolean value. They follow a simple form:

if condition evaluates to True:
    do something

Note the colon after the condition, and also the indentation of the block inside of the conditional statement.

Indentation matters in Python! While some programming languages use braces to separate between different blocks of code, Python uses the indentation as a way of distinguishing whether code is inside or outside of a block. For example, here

if condition:
    print('condition is True')
print('condition may be True or False')

the first message will only print if the condition is True, while the second message will always print.

To do some action given that the condition is false, use the else keyword.

In [ ]:
x = 5
if x > 10:
    print(x, 'is greater than 10')
else:
    print(x, 'is less than or equal to 10')

Conditionals can be chained together with the elif (short for else if) keyword.

In [ ]:
x = 10
if x > 10:
    print(x, 'is greater than 10')
elif x > 0:
    print(x, 'is greater than 0 and less than or equal to 10')
else:
    print(x, 'is less than or equal to 0')

The in keyword has a few purposes (one of which will be covered with loops), but one relevant one is for asking whether or not an item is in a list.

In [ ]:
my_list = [0, 1, 2]

x = 1
if x in my_list:
    print(x, "is in my_list")
else:
    print(x, "is NOT in my_list")

Loops

Sometimes we want to repeat the same series of logical steps, or iterations, many times. Python has two kinds of loops: for loops and while loops.

A while loop will repeat a block of code as long as some condition is satisfied.

In [ ]:
countdown = 3
while countdown > 0:
    print(countdown)
    countdown = countdown - 1
print('liftoff!')

Make sure that the body of the loop eventually makes the condition False, or else you'll end up in an endless loop! For instance, the following while loop is nearly identical to the previous example, but will never finish:

countdown = 3
while countdown > 0:
    print(countdown)
    countdown = countdown + 1
print('liftoff!')

To make a for loop, we'll need a list we can iterate over. The iteration will define an iteration value (i in the example below), which will cycle through the values in the list.

In [ ]:
mylist = [0, 1, 2, 3]
for i in mylist:
    print(i)

As with conditionals, the indentation indicates whether or not a line is in the loop.

In [ ]:
mylist = [0, 1, 2, 3]
for i in mylist:
    print(i)
print('all done')

Be careful about changing a list as you iterate over it. The following loop will never end, despite starting out with a finite list!

mylist = [0, 1, 2, 3]
for i in mylist:
    print(i)
    mylist.append(2 * i)
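
One common way around this, sketched below, is to loop over a copy of the list (made here with a slice) while changing the original. The loop then visits only the entries that existed when the copy was made.

```python
mylist = [0, 1, 2, 3]
# mylist[:] is a copy, so the loop visits only the four original entries
for i in mylist[:]:
    mylist.append(2 * i)
print(mylist)  # [0, 1, 2, 3, 0, 2, 4, 6]
```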

You can have loops in loops!

In [ ]:
for i in range(2):
    for j in range(3):
        print('outer index:', i, ' inner index:', j)
    print('done with outer iteration', i)

Sometimes we want to skip to the next iteration without completing the current one. In this case, we use the continue keyword.

In [ ]:
for i in range(10):
    # skip even numbers
    if i % 2 == 0: 
        continue
    print(i)

We might also decide we need to end the loop prematurely, in which case we can use the break keyword.

In [ ]:
for i in range(10):
    if i > 5:
        break
    print(i)

Importing other modules

Whenever possible, it's nice to not have to "reinvent the wheel". That is, if someone already figured out a good solution for what you want to do, make use of their hard work (with proper credit of course). The Python standard library includes lots of packages, and many people publish third-party Python packages under free and open source software licenses.

Let's say we want to do some trigonometry. We could look up algorithms for approximating a sine function from the basic operations we've already used. But the standard library module math already does this. We can bring in this module to our global scope using the import keyword.

In [ ]:
import math
In [ ]:
# now we have access to the function sin, and the float pi, as attributes of math
help(math.sin)
print('pi = ', math.pi)
angles = [0, math.pi / 4, math.pi / 2, math.pi]
for angle in angles:
    print('sin(', angle, ') = ', math.sin(angle))

There are variants on this import statement that do somewhat different things. For instance, if we knew we only wanted to use pi, sin, cos, and tan, we could do the following:

In [ ]:
from math import pi, sin, cos, tan
In [ ]:
# now we have access to these objects without explicitly referring to the math module
a = pi / 4
print(sin(a))
print(cos(a))
print(sin(a) / cos(a))
print(tan(a))
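
Another variant you are likely to run into imports a module under a shorter name with the as keyword. The alias m below is an arbitrary choice for this sketch.

```python
# import the whole module under a shorter alias
import math as m

# the module's functions and constants are now reached through the alias
print(m.sqrt(16))   # 4.0
print(m.cos(m.pi))  # -1.0
```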

Putting it all together

Let's look at an example that uses everything we've looked at so far. Suppose we need to get some targets for an upcoming observing run. We know their spatial coordinates and their brightness, and we've stored these values in lists.

In [ ]:
# There are twenty potential targets in a particular galaxy cluster.
# Each target's position and brightness is given by a particular index into the lists below

# right ascension, or longitude on the sky, measured in degrees
x_values = [182.47525964,  181.71979313,  182.01188744,  182.38965086,
            181.33694731,  181.81918184,  182.12838376,  182.24847701,
            181.96221347,  182.1016029 ,  181.81358103,  182.36078662,
            182.47452539,  182.20960101,  182.44307302,  181.32153755,
            181.95492827,  182.37629458,  181.89390667,  182.04027463]
# declination, or latitude on the sky, measured in degrees
y_values = [14.57684245,  14.0413467 ,  14.42422892,  13.72048441,
            14.32234319,  13.93295502,  14.66749945,  14.04502233,
            14.44982115,  14.59439852,  14.84721984,  15.02108029,
            14.28635226,  14.36287386,  14.37801609,  14.25512123,
            14.29892983,  15.04374266,  14.08003671,  14.48721843]
# magnitude, a (literally) backwards system of measuring light flux,
# such that larger magnitude is fainter
magnitudes = [22.8003388 ,  19.76338384,  19.38678407,  22.77904055,
              19.40079346,  16.90532576,  20.5119855 ,  17.20244618,
              21.32856068,  23.49640448,  21.59814329,  16.54993158,
              20.86924288,  23.69540075,  23.25189441,  20.16867178,
              18.06682487,  20.9476858 ,  21.22210576,  16.55301419]
In [ ]:
# print out the potential targets, their positions and magnitude
for i in range(len(x_values)):
    x = x_values[i]
    y = y_values[i]
    magnitude = magnitudes[i]
    print('Target', i, ': x =', x, 'deg') 
    print('           y =', y, 'deg')
    print('           mag =', magnitude)
In [ ]:
# we want targets that are bright enough (magnitude < 23)
# and that are close enough in distance to the cluster's center
x_center = 182.07387
y_center = 14.36368
magnitude_limit = 23
distance_limit = 0.5 # in degrees

In the function below, we'll use the small-angle approximation for the angular separation between points on the sphere:

$$ \Delta \theta ^ 2 \approx (\Delta \alpha \cos(\delta)) ^ 2 + (\Delta \delta) ^ 2$$

where $(\alpha, \delta)$ are the spherical coordinates in radians. See here for more on angular distances.
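
As a rough check on how good this approximation is at the sub-degree separations used in this example, the sketch below compares it with the exact result from the spherical law of cosines. The function names approx_separation and exact_separation and the two test points are made up for illustration. Evaluating the cosine at the mean declination of the two points is an assumption of this sketch, since the formula above does not say which declination to use.

```python
import math

def approx_separation(ra1, dec1, ra2, dec2):
    """Small-angle separation in degrees; all inputs are in degrees."""
    # evaluate cos(declination) at the mean declination of the two points
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    d_ra = (ra2 - ra1) * math.cos(mean_dec)
    d_dec = dec2 - dec1
    return math.sqrt(d_ra ** 2 + d_dec ** 2)

def exact_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees from the spherical law of cosines."""
    a1 = math.radians(ra1)
    d1 = math.radians(dec1)
    a2 = math.radians(ra2)
    d2 = math.radians(dec2)
    cos_sep = (math.sin(d1) * math.sin(d2)
               + math.cos(d1) * math.cos(d2) * math.cos(a2 - a1))
    # clamp to [-1, 1] to guard against floating-point round-off
    cos_sep = max(-1.0, min(1.0, cos_sep))
    return math.degrees(math.acos(cos_sep))

# two made-up points roughly half a degree apart, near a declination of 14 degrees
approx = approx_separation(182.0, 14.0, 182.3, 14.4)
exact = exact_separation(182.0, 14.0, 182.3, 14.4)
print('approximate:', approx, 'deg')
print('exact:      ', exact, 'deg')
print('difference: ', abs(approx - exact), 'deg')
```

For separations this small the two values should agree closely, which is why the simpler formula is adequate for this example.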

In [ ]:
# let's define a function that will help us figure out if we have any good targets
def angular_distance(x1, y1, x2, y2):
    '''
    Calculate the distance between two points on the celestial sphere, (x1, y1) and (x2, y2).
    
    Parameters
    ----------
    x1 : x coordinate of the first object in degrees
    y1 : y coordinate of the first object in degrees
    x2 : x coordinate of the second object in degrees
    y2 : y coordinate of the second object in degrees
    
    Returns
    -------
    separation : the angular separation of the two objects, in degrees
    '''
    # the difference in each coordinate, still in degrees
    dx = x2 - x1
    dy = y2 - y1
    
    # the formula needs the cosine of the declination itself (not of the difference),
    # so take the mean declination of the two points and convert it to radians
    mean_y_radians = 0.5 * (y1 + y2) * math.pi / 180
    
    separation = math.sqrt((dx * math.cos(mean_y_radians)) ** 2 + dy ** 2)
    return separation
In [ ]:
# now let's loop through each potential target and evaluate if each is close enough
# and also bright enough
# we'll store the boolean variables in a list of the same size as the original data

# make empty lists for recording whether each target is close and bright
close_enough_list = []
bright_enough_list = []

# let's keep track of how many are both close and bright
count = 0 

# iterate across each of the indices of the data lists
for i in range(len(x_values)):
    x = x_values[i]
    y = y_values[i]
    magnitude = magnitudes[i]
    separation = angular_distance(x, y, x_center, y_center)
    print('Target', i, 'has a separation of', separation, 'degrees')
    # isclose is a boolean variable that is True if the target is close enough,
    # and False otherwise; ditto for isbright
    isclose = separation < distance_limit
    isbright = magnitude < magnitude_limit
    
    if isclose and isbright:
        # only add to our count if the target is close and bright
        count += 1
    # but we want to add the state of a target regardless of if it is selected or not
    close_enough_list.append(isclose)
    bright_enough_list.append(isbright)
    
print('There are', count, 'targets that are observable.')

So we found that we have 11 observable targets, but which ones are they? We could have stored the indices as we went through the loop above. But we can also just get those now too.

In [ ]:
targets = []
for i in range(len(close_enough_list)):
    isclose = close_enough_list[i]
    isbright = bright_enough_list[i]
    if isclose and isbright:
        targets.append(i)

# print out the observable targets, their positions and magnitude
for i in targets:
    x = x_values[i]
    y = y_values[i]
    magnitude = magnitudes[i]
    print('Target', i, ': x =', x, 'deg') 
    print('           y =', y, 'deg')
    print('           mag =', magnitude)

Of course, in this example, looking at twenty objects wouldn't have been too bad to do by eye. But what if you needed to evaluate thousands of objects? Automate it!

What didn't we cover?

A lot! But that's okay; hopefully this has been enough to get started. You might see some of the topics listed below during the course of your work.

Plotting

There are a few options for plotting in Python, but arguably the most complete plotting library is matplotlib. The official documentation is rather intimidating, but there are some tutorials and helpful examples here.

Arrays

The unofficial standard for array manipulation in Python is the numpy package. Arrays are like lists but have a fixed size, and all of their entries share a single data type. Because the values are packed together in memory and numpy operates on the whole array at once, computations with arrays can be much faster than looping over a list of numbers.

File input/output

Python makes it easy to read and write text files. See the official docs here.
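
As a small taste, the sketch below writes two lines to a scratch file and reads them back. It uses the built-in open function together with the with keyword (not covered above), which closes the file automatically at the end of the indented block. The file name targets.txt and its contents are made up for this example.

```python
import os
import tempfile

# put the scratch file in the system's temporary directory
path = os.path.join(tempfile.gettempdir(), 'targets.txt')

# mode 'w' opens the file for writing, replacing any existing contents
with open(path, 'w') as f:
    f.write('target 0\n')
    f.write('target 1\n')

# mode 'r' opens the file for reading; looping over it gives one line at a time
lines = []
with open(path, 'r') as f:
    for line in f:
        # strip() removes the newline character at the end of each line
        lines.append(line.strip())
print(lines)

# clean up the scratch file
os.remove(path)
```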

Classes and objects

A class is a prototype of an object, and we've been using objects this whole time. Objects have attributes, which can be the usual types we've talked about like floats and strings. They can also have methods, like the sort() and append() functions shared by all lists. Attributes are accessed using the "dot" notation, in that object.attribute is a reference to the attribute held by that particular object. For more on writing your own classes, see here.
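
To give a flavor of what writing a class looks like, here is a minimal sketch. The class name Target, its attributes, and its method are all invented for illustration; the magnitude is the value for target 19 in the lists above.

```python
class Target:
    """A made-up class describing an observing target by name and magnitude."""

    def __init__(self, name, magnitude):
        # attributes are stored on the object itself, which is called self here
        self.name = name
        self.magnitude = magnitude

    def is_brighter_than(self, limit):
        # a method can use the attributes of the object it is called on;
        # recall that a smaller magnitude means a brighter object
        return self.magnitude < limit

# create an object (an instance of the class) and use the dot notation
target = Target('Target 19', 16.55301419)
print(target.name)                  # reads an attribute
print(target.is_brighter_than(23))  # calls a method
```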

In [ ]:
# example plot for our target sample
# more to be seen in the plotting tutorial!

%matplotlib inline
from matplotlib import pyplot as plt

plt.scatter(x_values, y_values, c=magnitudes, 
            cmap='viridis_r', s=200, marker='o', lw=1, edgecolor='k', alpha=0.8)
cb = plt.colorbar()
plt.xlim(plt.xlim()[::-1])  # reverses the x-axis, since the sky is backwards
plt.xlabel('RA [deg]')
plt.ylabel('Dec [deg]')
cb.set_label('Magnitude')
plt.title('Target candidates')
