# Python lists and numpy arrays

Lists and arrays are both useful data structures. When would we want to use one over the other?

## Lists

Lists are built into Python; there's no need to import anything. Lists are very flexible. They can hold any type of object, and they can change their size and content on demand.

### Creating lists

We can directly make a list of objects by wrapping a comma-separated sequence of objects with square brackets as shown below.

```
a = [0, "one", 2.0]
print(a)
```

If we have an object that looks like a sequence, we can make a list from that object using the `list()` function.

```
# create a tuple, an immutable (non-rewritable) data structure that behaves in many ways like a list
a = (0, "one", 2.0)
print("a = {}, type of a is {}".format(a, type(a)))
# create a list from this tuple
b = list(a)
print("b = {}, type of b is {}".format(b, type(b)))
```

We can create a range of integers with the `range()` function. This returns a special object (creatively called a `range` object) that produces its integers on demand rather than storing them all. A `range` can be looped over, indexed, and sliced directly, but it is immutable and has no list methods such as `append()`. If we want to use it like a list, we have to convert it with the `list()` function.

```
print(range.__doc__)
```

```
a = range(5)
print("a = {}, type of a is {}".format(a, type(a)))
b = list(a)
print("b = {}, type of b is {}".format(b, type(b)))
```
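As a small illustration of the two uses described above, the sketch below loops over a `range` directly and converts one to a list only when list methods are needed. The variable names `total` and `numbers` are illustrative.

```python
# Loop over a range directly; no list is ever built.
total = 0
for i in range(5):
    total += i
print(total)

# Convert to a list only when we need list behaviour, such as appending.
numbers = list(range(5))
numbers.append(5)
print(numbers)
```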

### Accessing and changing list elements

We can *index* into a list (and most other sequence-like objects in Python) with square brackets. Recall that in Python we start indexing from 0!

```
a = [0, "one", 2.0]
print(a[0])
print(a[1])
print(a[2])
```

We can access a range of elements with a colon in the index.

```
# 0:2 picks out elements from indices 0 (inclusive) to 2 (exclusive)
print(a[0:2])
# leaving the right index in a range empty will take everything from the specified index to the end
print(a[1:])
# similarly, leaving the left index in a range empty will take everything from the start to the specified index
print(a[:2])
# indexing with negative integers counts back from the end of the list
print(a[-1])
print(a[-2])
print(a[-3])
```
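The same square-bracket notation also changes a list. The sketch below assigns to a single index, assigns to a slice, and uses the optional third slice number (the step). The replacement values are illustrative.

```python
a = [0, "one", 2.0]

# Assign to a single index to replace that element in place.
a[1] = 1
print(a)

# Assign to a slice to replace several elements at once.
a[0:2] = ["zero", "one"]
print(a)

# A third number in a slice is the step: 2 takes every other element,
# and -1 walks backwards through the list.
every_other = a[::2]
backwards = a[::-1]
print(every_other)
print(backwards)
```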

### Common list methods

Lists have useful functions built into them. These functions, accessed as attributes of a list with "dot" notation, are generically called *methods* of the object.

```
a = [0, "one", 2.0]
print(a)
```

Append an object to the back of the list.

```
a.append("3")
print(a)
```

"Pop", or spit out and remove, the element of the list at some index.

```
x = a.pop(2)
print(x)
print(a)
```

Insert an element before some index.

```
x = 2.0
a.insert(2, x)
print(a)
```

Remove the first instance of some value.

```
a.remove('3')
print(a)
```

Extend a list with another sequence.

```
a.extend(range(4, 6))
print(a)
```

Reverse the list.

```
a.reverse()
print(a)
```

Sort the list.

```
a = [5, 2, 7, 32]
a.sort()
print(a)
```
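The `sort()` method reorders the list in place and returns `None`. When the original order should be kept, the built-in `sorted()` function returns a new sorted list instead. A minimal sketch:

```python
a = [5, 2, 7, 32]

# sorted() returns a new list and leaves the original untouched.
b = sorted(a)
# reverse=True sorts from largest to smallest.
c = sorted(a, reverse=True)

print(a)
print(b)
print(c)
```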

Sorting only works if the elements of the list are comparable. For instance, it doesn't make sense to numerically rank the string "one" against the integer 0, whereas it does make sense to compare the values 5 and 2.

```
a = [0, "one", 2.0]
try:
    a.sort()
except TypeError as e:
    print("TypeError: {}".format(e))
```
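If a mixed list really does need an ordering, one option is to pass a `key` function that maps every element to something comparable. The sketch below uses `str` as the key, so the elements are ordered by their text form; that choice is only an illustration, not a generally meaningful ranking.

```python
a = [0, "one", 2.0]

# key=str compares "0", "one" and "2.0" instead of the raw objects.
ordered = sorted(a, key=str)
print(ordered)
```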

### Binary operators on lists

We can concatenate two lists with the addition operator.

```
a = [0, 1, 2]
b = [3, 4, 5, 6]
c = a + b
print(c)
```

If we multiply a list by an integer, we get a new list made of that many copies of the original, concatenated together.

```
a = [0, 1, 2]
b = 3 * a
print(b)
```
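One caveat worth knowing: the repeated list holds references to the same elements, not fresh copies. With numbers this makes no difference, but with nested lists every "row" is the same object. The names `row`, `grid`, and `independent` below are illustrative.

```python
row = [0, 0, 0]

# Both entries of grid refer to the very same inner list.
grid = 2 * [row]
grid[0][0] = 99
print(grid)

# A list comprehension builds a separate inner list for each row.
independent = [[0, 0, 0] for _ in range(2)]
independent[0][0] = 99
print(independent)
```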

Multiplying a list by anything other than an integer, such as a float, raises a `TypeError`.

```
try:
    a * 2.1
except TypeError as e:
    print("TypeError: {}".format(e))
```

## Arrays

Arrays are sequences that in some ways behave like lists, but are more restricted in what they can do. In particular, every element of an array has the same basic data type, and the size of an array is fixed when it is created. While Python has one implementation of arrays built into the standard library (the `array` module), we're going to be using a third-party numerical array library called `numpy`, the de facto standard library for numerical computation in Python.

```
import numpy as np
```

```
a = np.array([1, 1, 2, 3, 5, 8])
print("a = {}, type of a is {}".format(a, type(a)))
```
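The single data type of an array is stored in its `dtype` attribute. As a short sketch, numpy picks a type that can hold every input value, and the `dtype` argument lets us choose one explicitly. The exact integer type depends on the platform, so it is not shown as expected output.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])              # one float makes the whole array float
forced = np.array([1, 2, 3], dtype=float)  # request floats explicitly

print(ints.dtype)
print(mixed.dtype)
print(forced.dtype)
```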

We can reassign the values in an array, as long as each new value can be converted to the array's data type.

```
a[0] = 42
print(a)
```

```
try:
    a[0] = 'This is not an integer!'
except ValueError as e:
    print("ValueError: {}".format(e))
```
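Not every mismatch raises an error. Assigning a float into an integer array converts the value to the array's type, which silently drops the fractional part. The sketch below shows this, along with `astype()` as one way to get an array that can hold the float.

```python
import numpy as np

a = np.array([1, 1, 2, 3, 5, 8])

# The array stays an integer array, so 3.7 is truncated to 3.
a[0] = 3.7
print(a)

# astype() returns a converted copy that can store the fractional part.
b = a.astype(float)
b[0] = 3.7
print(b)
```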

We can also use the function `np.arange()` in a similar way to the built-in `range()` function.

```
a = np.arange(5)
print(a)
a = np.arange(10, 15)
print(a)
a = np.arange(0, 10, 2)
print(a)
```

To generate linearly spaced numbers in some range, we can use the `np.linspace()` function.

```
a = np.linspace(0, 1, 11)
print(a)
```

By default, giving only a start and a stop number will make an array with 50 values.

```
a = np.linspace(0, 98)
print(a)
```
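Roughly speaking, `np.linspace()` fixes the number of points and includes the stop value, while `np.arange()` fixes the step and excludes the stop value. The sketch below compares the two and uses the `retstep` and `endpoint` arguments of `np.linspace()`.

```python
import numpy as np

# retstep=True also returns the spacing between neighbouring values.
values, step = np.linspace(0, 1, 11, retstep=True)
print(step)

# endpoint=False leaves out the stop value, like range() and np.arange().
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)

# np.arange() takes the step itself rather than the number of points.
stepped = np.arange(0, 1, 0.25)
print(stepped)
```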

We can create arrays of zeros or ones with the `np.zeros()` and `np.ones()` functions.

```
a = np.zeros(10)
print(a)
a = np.ones(10)
print(a)
```
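Both functions produce floating-point arrays by default. As a brief sketch, the `dtype` argument changes that, `np.full()` fills an array with any constant, and `np.ones_like()` copies the shape and type of an existing array.

```python
import numpy as np

default = np.zeros(4)            # floats by default
z = np.zeros(4, dtype=int)       # integer zeros
f = np.full(4, 7)                # four copies of the value 7
like = np.ones_like(z)           # ones with the same shape and dtype as z

print(default.dtype)
print(z)
print(f)
print(like)
```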

### Multi-dimensional arrays

`numpy` naturally deals with n-dimensional arrays.

We can create a multi-dimensional array in similar ways to a 1-d array. We can pass `np.array()` a sequence of sequences as shown below.

```
a = np.array([[1, 2], [3, 4]])
print(a)
```

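Note that `np.array()` expects rectangular input. If we pass nested lists of unequal length, recent versions of numpy (1.24 and later) raise a `ValueError`, while older versions silently built a 1-d array of lists rather than a 2-d array. To get a 1-d array of lists on purpose, pass `dtype=object`.

```
a = np.array([[1, 2, 3], [4, 5]], dtype=object)
print(a)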
```

We can make multi-dimensional arrays of ones and zeros by passing a sequence of integers to `np.ones()` and `np.zeros()` respectively.

```
a = np.ones([3, 5])
print(a)
```

```
a = np.zeros([5, 2])
print(a)
```

### Attributes and methods of arrays

We can get the size (total number of elements) and shape (length along each axis) of an array with the `size` and `shape` attributes respectively.

```
a = np.zeros([5, 2])
print('a.size = {}, a.shape = {}'.format(a.size, a.shape))
```
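A few related attributes are sketched below: `ndim` gives the number of axes, `len()` gives the length of the outermost axis only, and `size` is the product of the entries in `shape`.

```python
import numpy as np

a = np.zeros([5, 2])

print(a.ndim)    # number of axes
print(len(a))    # length of the outermost axis
print(a.size == a.shape[0] * a.shape[1])
```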

There are a vast number of useful methods built into arrays. You can get a sense of what is available in the online documentation.

`np.min()` and `np.max()` find the minimum and maximum of an array. `np.argmin()` and `np.argmax()` find the indices of the minimum and maximum values of an array. These functions can either be accessed through the top-level numpy module with an array as an argument, or as methods of an individual array.

```
a = np.array([3,78,3,7,3,21,7.1])
print(a)
```

```
print("minimum of a is {}, occurs at index i = {}".format(a.min(), a.argmin()))
print("maximum of a is {}, occurs at index i = {}".format(a.max(), a.argmax()))
```
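On a multi-dimensional array these functions take an optional `axis` argument. The sketch below, using a small made-up 2-d array `m`, reduces along each axis and converts the flat index returned by `argmax()` into a (row, column) position with `np.unravel_index()`.

```python
import numpy as np

m = np.array([[3, 78, 3],
              [7, 21, 1]])

col_min = m.min(axis=0)   # minimum of each column
row_max = m.max(axis=1)   # maximum of each row
print(col_min)
print(row_max)

# Without an axis, argmax() returns an index into the flattened array.
flat_index = m.argmax()
position = np.unravel_index(flat_index, m.shape)
print(flat_index, position)
```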

We can sort an array with `np.sort()`. Note that calling this function as `np.sort(a)` *does not* sort the array in place! Rather, it returns a sorted copy of the array. If we call it as a method, `a.sort()`, then it does sort the array in place.

```
b = np.sort(a)
print(a)
print(b)
```
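To make the in-place behaviour concrete, the sketch below calls the `sort()` method on a copy and also uses `np.argsort()`, which returns the indices that would sort the array rather than the sorted values. The names `original`, `order`, and `by_order` are illustrative.

```python
import numpy as np

original = np.array([3, 78, 3, 7, 3, 21, 7.1])

# argsort() gives the indices that put the array in sorted order.
order = np.argsort(original)
by_order = original[order]

# The sort() method changes the array itself and returns None.
a = original.copy()
result = a.sort()

print(by_order)
print(a)
print(result)
```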

We can calculate some basic statistics for an array.

```
print("mean of a is {:.2f}".format(np.mean(a)))
print("median of a is {:.2f}".format(np.median(a)))
print("standard deviation of a is {:.2f}".format(np.std(a)))
```
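One detail to keep in mind: `np.std()` computes the population standard deviation by default (it divides by the number of elements). Passing `ddof=1` gives the sample standard deviation instead. A small sketch with made-up values:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])

population_std = np.std(a)          # divides by n
sample_std = np.std(a, ddof=1)      # divides by n - 1

print("population: {:.4f}".format(population_std))
print("sample:     {:.4f}".format(sample_std))
```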

Arrays can be indexed and sliced with the same square-bracket notation as lists.

```
a = np.array([3,78,3,7,3,21,7.1])
print(a)
print("indices 2 through 5: {}".format(a[2:6]))
```

A slice can take a third integer, the step, which sets how many indices to advance at a time. A step of -1 walks backwards through the array, which gives us a tricky way to reverse it.

```
b = a[::-1]
print(a)
print(b)
```

Note that slicing an array like this returns a *view* or a *reference* into the array. That means that if we make a "new" array by slicing an old one, changing one will change the other. If we explicitly copy the array with the `np.copy()` function (or the `copy()` method), then the two arrays will not be connected.

```
b = a.copy() # make a copy of a
c = b[:] # take a view into all of b and store in c
print(b)
print(c)
b[0] = 100
print(b)
```

```
print("original array, unaffected because b is a copy: a = {}".format(a))
print("viewed array, changed along with b: c = {}".format(c))
```
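When it is unclear whether two arrays are connected, `np.shares_memory()` can check. The sketch below also contrasts this with lists, where a slice is always a new list. The variable names are illustrative.

```python
import numpy as np

a = np.arange(5)
view = a[1:4]            # a slice of an array is a view
copied = a[1:4].copy()   # an explicit copy owns its own data

print(np.shares_memory(a, view))
print(np.shares_memory(a, copied))

# Writing through the view changes the original array.
view[0] = 100
print(a)

# Slicing a list, by contrast, makes a new list.
items = [0, 1, 2, 3, 4]
part = items[1:4]
part[0] = 100
print(items)
```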

We can index into a multi-dimensional array by separating slices with commas. Left to right in indexing goes from outermost axis to innermost axis. Let's create a 3-d array with the `np.reshape()` function to try this out.

```
# create a 3 by 4 by 5 shaped array from 0 to 59
# outermost axis has dimension 3, middle axis has dimension 4, and innermost axis has dimension 5
a = np.arange(3 * 4 * 5).reshape((3, 4, 5))
print(a)
```

We can pick out a single slice in this 3-d array.

```
print(a[0, :, :])
```

```
print(a[:, 0, :])
```

```
print(a[:, :, 0])
```

Or we could pick out multiple slices by giving a range of indices.

```
print(a[:, 1:3, 0])
```
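A quick way to keep track of what these expressions return is to look at the shape of each result. In the sketch below, an integer index removes its axis, while a slice keeps the axis even when it selects a single element.

```python
import numpy as np

a = np.arange(3 * 4 * 5).reshape((3, 4, 5))

print(a[0, :, :].shape)     # integer on the outermost axis drops that axis
print(a[:, 1:3, 0].shape)   # slice keeps the middle axis, integer drops the innermost
print(a[0:1, :, :].shape)   # a length-1 slice keeps the axis

# Three integers pick out a single element: 1*20 + 2*5 + 3
print(a[1, 2, 3])
```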

### Broadcasting and mathematical operations

Many functions in numpy work on arrays of any size or shape. Many of these functions do their calculations on an element-by-element basis much more efficiently than we can by looping. You can find all of the available functions in the documentation. Many of them also work with the normal binary operators. For example, to add two arrays together elementwise, we can just write `a + b` rather than having to type out `np.add(a, b)`.

```
a = np.arange(3 * 4).reshape((3, 4))
b = np.ones((3, 4))
print("a = \n{}\n".format(a))
print("b = \n{}\n".format(b))
print("a + b = \n{}\n".format(a + b))
print("exp(a) = \n{}\n".format(np.exp(a)))
print("sin(a) = \n{}\n".format(np.sin(a)))
```
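*Broadcasting* is what lets these operations combine arrays of different but compatible shapes: numpy stretches axes of length 1 (or missing leading axes) to match the other operand. The sketch below adds a row, a column, and a scalar to a 3 by 4 array; the arrays `row` and `col` are made up for illustration.

```python
import numpy as np

a = np.arange(3 * 4).reshape((3, 4))

row = np.array([10, 20, 30, 40])        # shape (4,): added to every row
col = np.array([[100], [200], [300]])   # shape (3, 1): added to every column

print(a + row)
print(a + col)
print(a * 2)                            # a scalar applies to every element
```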

## What's the benefit of using arrays?

Arrays are in many ways less flexible than lists, as we can't change their size or data type very easily. So why use them? Let's do a short experiment to find out. We'll try implementing one nice feature (easy elementwise addition) in lists and see how fast we can make it.

```
# create two large lists
n = 1000
a = list(range(n))
b = list(range(n))
```

There are a few ways of doing this, some of which may be more efficient than others.

```
def add_lists_elementwise(a, b):
    # build the result one element at a time
    c = []
    for i in range(len(a)):
        c.append(a[i] + b[i])
    return c
```
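One of the other ways is a list comprehension over `zip()`, sketched below. The function name `add_lists_zip` is hypothetical, and no timing is claimed for it here. Note that `zip()` stops at the end of the shorter list rather than raising an error.

```python
def add_lists_zip(a, b):
    # Pair up elements from both lists and add each pair.
    return [x + y for x, y in zip(a, b)]


print(add_lists_zip([1, 2, 3], [10, 20, 30]))
```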

The `%timeit` magic command in IPython runs a set of timing tests on a line of Python code.

```
%timeit add_lists_elementwise(a, b)
```

I get a little over 120 microseconds. Not too bad, certainly faster than I'll ever notice. Let's try the same task with arrays.

```
a_array = np.array(a)
b_array = np.array(b)
```

```
%timeit a_array + b_array
```

And here I get around 1 microsecond, a speedup of over *two orders of magnitude*. Now if we only had to do this once, it would be no big deal. But if we have to do calculations like this millions of times on arrays that are even larger, this speedup makes difficult tasks easy and makes impossibly slow tasks feasible.
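Outside of IPython, where `%timeit` is not available, the standard library's `timeit` module can run a similar comparison. The sketch below is one way to set it up; the repeat count of 1000 is an arbitrary choice, and the timings it prints will vary from machine to machine.

```python
import timeit

import numpy as np

n = 1000
a = list(range(n))
b = list(range(n))
a_array = np.array(a)
b_array = np.array(b)


def add_lists_elementwise(a, b):
    c = []
    for i in range(len(a)):
        c.append(a[i] + b[i])
    return c


# timeit.timeit() returns the total time in seconds for `number` calls.
repeats = 1000
list_time = timeit.timeit(lambda: add_lists_elementwise(a, b), number=repeats)
array_time = timeit.timeit(lambda: a_array + b_array, number=repeats)

print("list:  {:.2f} microseconds per call".format(1e6 * list_time / repeats))
print("array: {:.2f} microseconds per call".format(1e6 * array_time / repeats))
```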