4. Loops

Things are getting more advanced now. There is one programming practice that you should master to fully control your data.

What we’ve seen so far is straightforward: defining collections like lists and dictionaries, and retrieving data from them through indexing. Sometimes, we want to do something with each element in a data collection - perhaps call a function on it, read a file, remove a piece of data, etc. How can we do this without accessing each element manually?

The answer is the for loop, a special piece of syntax that allows us to iterate over a sequence or collection and retrieve the values inside by temporarily assigning each one to a new variable of whatever name we choose.

The format looks like this:

for variable in sequence:
    do something

Note the indentation on the second line - as you’ll see, this is part of the code. Python uses it to distinguish code that is inside the for loop from code that is outside it.
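
To make the role of indentation concrete, here is a minimal sketch. The list `numbers` and the variable `total` are made-up names used only for illustration. The indented line runs once per element, while the unindented print runs a single time, after the loop has finished.

```python
# A hypothetical list of numbers, used only for illustration
numbers = [1, 2, 3]
total = 0

for number in numbers:
    # Indented: this line is inside the loop and runs once per element
    total = total + number

# Not indented: this line is outside the loop and runs once, at the end
print(total)
```

If the print line were indented as well, it would run three times and show the running total at each step.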

For loops can be confusing, so some examples are required!

# Make a list - these values all need to be rounded to two decimal places!
data_list = [12.394, 9.0948, 1.02, 39.4023, 2.392, 50.12948912]

# Write a for loop that prints each element in turn
for score in data_list:
    print(score)
12.394
9.0948
1.02
39.4023
2.392
50.12948912

At each iteration, the value of score changed to the next element in the list. So it started with the element in position [0], then [1], and so on until the end of data_list was reached.
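
As a rough sketch of what the loop is doing, the snippet below records the value of score at each iteration and compares the first three with manual indexing. The names `seen` and `manual` are introduced here purely for illustration.

```python
data_list = [12.394, 9.0948, 1.02, 39.4023, 2.392, 50.12948912]

# Record the value of score at each iteration
seen = []
for score in data_list:
    seen.append(score)

# Manually index the first three elements for comparison
manual = [data_list[0], data_list[1], data_list[2]]

# True if the loop visited the elements in index order
print(seen[:3] == manual)
```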

Let’s say we want to modify each element of a list, by rounding the numbers. This could be achieved as follows:

# First, make an empty list to store our new data - just square brackets with nothing inside
rounded_data = []

# Iterate over the original list - round each element - and append to our empty list
for value in data_list:
    round_value = round(value, 2)
    rounded_data.append(round_value)
    
print(rounded_data)
[12.39, 9.09, 1.02, 39.4, 2.39, 50.13]

4.1. Building complexity with iteration

You aren’t limited to iterating just once. You can nest an iteration in the same way you can nest lists and dictionaries. This sometimes happens if you need to dig deeper into data structures to manipulate values.

# Define a nested list
nest_list = [ ['001', 23, 'female'], ['002', 31, 'male'], ['003', 40, 'female'] ]

# Iterate over the main list
for participant in nest_list:
    
    # Iterate over the sub-list
    for score in participant:
        print(score)
001
23
female
002
31
male
003
40
female
# Or access the sublists to do what we want!
for participant in nest_list:
    
    line = 'Participant ID = ' + participant[0] + ', age = ' + str(participant[1]) + ', sex = ' + participant[2]
    print(line)
Participant ID = 001, age = 23, sex = female
Participant ID = 002, age = 31, sex = male
Participant ID = 003, age = 40, sex = female
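
Nested loops are also useful for calculating something for each sub-list. The sketch below assumes a hypothetical nested list, `score_table`, holding one sub-list of scores per participant, and adds up each participant's scores. The names `score_table`, `totals`, and `total` are made up for illustration.

```python
# Hypothetical nested list: one sub-list of scores per participant
score_table = [[3, 5, 4], [2, 2, 5], [4, 4, 4]]

totals = []
for participant_scores in score_table:
    # Start a fresh running total for each participant
    total = 0

    # Inner loop: add up the scores in this sub-list
    for score in participant_scores:
        total = total + score

    # Back in the outer loop: store the finished total
    totals.append(total)

print(totals)
```

The indentation shows which loop each line belongs to: the append sits inside the outer loop but outside the inner one, so it runs once per participant.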

4.2. enumerate

It should be clear from the above examples that when iterating over a list using for, we get back a variable that changes at each iteration. This is usually fine, but sometimes, we want to know the actual position of the element and do something with that.

The enumerate function is designed for this. We call enumerate on a sequence when we set up a for loop, and at each iteration it gives back a pair of variables - a counter, which holds the index, and a value, which holds the element, just like in a normal for loop.

The syntax looks like this!

for index, value in enumerate(sequence):
    do something
# Define a list
data_list = [123, 381, 'content', 'value']

for index, value in enumerate(data_list):
    line = 'Item in position ' + str(index) + ' is ' + str(value)
    print(line)
Item in position 0 is 123
Item in position 1 is 381
Item in position 2 is content
Item in position 3 is value
# To demonstrate the use of index to actually access elements!
for index, value in enumerate(data_list):
    print(data_list[index])
123
381
content
value
# Or to more efficiently replace elements - the original list is now lost
data_list = [12.394, 9.0948, 1.02, 39.4023, 2.392, 50.12948912]
print(data_list)

for index, value in enumerate(data_list):
    
    # Use the index to access the element and assign a new value
    data_list[index] = round(value, 2)

print(data_list)
[12.394, 9.0948, 1.02, 39.4023, 2.392, 50.12948912]
[12.39, 9.09, 1.02, 39.4, 2.39, 50.13]
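
It can help to look at what enumerate actually produces. The sketch below wraps it in list() to show the (index, value) pairs, and then uses the optional start argument of enumerate to begin counting from 1 instead of 0. The list `labels` and the other variable names are made up for illustration.

```python
# A hypothetical list, used only for illustration
labels = ['a', 'b', 'c']

# Wrapping enumerate in list() shows the (index, value) pairs it yields
pairs = list(enumerate(labels))
print(pairs)

# The optional start argument changes the number the counter begins at
numbered = []
for position, label in enumerate(labels, start=1):
    numbered.append(str(position) + ': ' + label)

print(numbered)
```

Note that start only changes the counter. It does not skip any elements, so a counter that starts at 1 no longer matches the list index.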

4.3. Iterating over dictionaries

Dictionaries are a somewhat more complex structure compared to lists. For example, whether they ‘remember’ the order things were put in them depends on the version of Python you use: from Python 3.7 onwards insertion order is preserved, but older versions make no such guarantee, so iterating over them in the same way as lists doesn’t always make sense.
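
Here is a minimal sketch of the ordering behaviour, assuming Python 3.7 or later, where dictionaries keep the order in which keys were inserted. If a particular order matters regardless of version, sorting the keys with sorted() is one option. The dictionary `ages` and the list names are made up for illustration.

```python
# Build a hypothetical dictionary one key at a time
ages = {}
ages['zoe'] = 31
ages['adam'] = 23
ages['mia'] = 40

# Assuming Python 3.7+, keys come back in the order they were inserted
insertion_order = []
for name in ages.keys():
    insertion_order.append(name)

# sorted() gives alphabetical order no matter how the dictionary was built
sorted_order = []
for name in sorted(ages.keys()):
    sorted_order.append(name)

print(insertion_order)
print(sorted_order)
```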

If we remember the methods dictionaries have (.keys(), .values(), and .items()), we can iterate over those to access and play around with the dictionaries we have created.

# Make a dictionary
data_dict = {'Score_A': 45.2912, 'Score_B': 68.2945, 'Score_C': 88.9873}
# Print out the keys
for key in data_dict.keys():
    print(key)
    
Score_A
Score_B
Score_C
# Or use the keys to access the data!
for key in data_dict.keys():
    print(data_dict[key])
45.2912
68.2945
88.9873
# Use this to grab the values - beware that before Python 3.7 they may not be in the order you think
for value in data_dict.values():
    print(value)
45.2912
68.2945
88.9873
# Finally, use .items() to get each key and its value out together!
for key, value in data_dict.items():
    line = 'The key ' + key + ' returns the value: ' + str(value)
    print(line)
The key Score_A returns the value: 45.2912
The key Score_B returns the value: 68.2945
The key Score_C returns the value: 88.9873
# Makes sense to use this to modify the original dictionary
# Raise each value to the power of 3 and round to two decimal places
for k, v in data_dict.items():
    data_dict[k] = round(v ** 3, 2)
    
print(data_dict)
{'Score_A': 92905.51, 'Score_B': 318535.02, 'Score_C': 704667.25}
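
If the original values are still needed, one alternative is to start with an empty dictionary and fill it inside the loop, in the same spirit as the empty-list approach used for rounded_data earlier. This is only a sketch: the name `cubed_dict` is introduced for illustration, and `data_dict` is redefined so the snippet runs on its own.

```python
# Redefine the original dictionary so this snippet is self-contained
data_dict = {'Score_A': 45.2912, 'Score_B': 68.2945, 'Score_C': 88.9873}

# Start with an empty dictionary - just curly brackets with nothing inside
cubed_dict = {}

for key, value in data_dict.items():
    # Store the result under the same key, in the new dictionary
    cubed_dict[key] = round(value ** 3, 2)

print(cubed_dict)

# The original dictionary is unchanged
print(data_dict)
```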

The basics are covered!

That’s a lot of information to take in, but these are the fundamentals of working with data in Python. Next, we will cover the NumPy and Pandas libraries, which are built to process data efficiently and are the real workhorses of data analysis in Python.

If you’re struggling or stuck, do not worry. The learning curve is steep and it will take time. The only way to improve is to keep coding. Try the exercises, and use the internet to look for answers where you aren’t sure.

# Embed the accompanying video in the notebook
from IPython.display import YouTubeVideo, display
display(YouTubeVideo('HluANRwPyNo'))