# list comprehension and generators

list comprehension and generators

# list comprehensions and generators¶

### Nested list comprehensions¶

• [[output expression] for iterator variable in iterable]
• Collapse for loops for building lists into a single line
• Components
• Iterable
• Iterator variable (represent members of iterable)
• Output expression
In [1]:
# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
print(row)

[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]

In [7]:
pair_2=[(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]
pair_2

Out[7]:
[(0, 6), (0, 7), (1, 6), (1, 7)]

### Using conditionals in comprehensions¶

• [ output expression for iterator variable in iterable if predicate expression ].
In [2]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member for member in fellowship if len(member) >= 7]

# Print the new list
print(new_fellowship)

['samwise', 'aragorn', 'legolas', 'boromir']

In [3]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member if len(member) >= 7 else '' for member in fellowship]

# Print the new list
print(new_fellowship)

['', 'samwise', '', 'aragorn', 'legolas', 'boromir', '']


## Dict comprehensions¶

• Recall that the main difference between a list comprehension and a dict comprehension is the use of curly braces {} instead of []. Additionally, members of the dictionary are created using a colon :, as in key:value
• Create dictionaries
• Use curly braces {} instead of brackets []
In [4]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create dict comprehension: new_fellowship
new_fellowship = {member:len(member) for member in fellowship}

# Print the new list
print(new_fellowship)

{'aragorn': 7, 'frodo': 5, 'samwise': 7, 'merry': 5, 'gimli': 5, 'boromir': 7, 'legolas': 7}


# Generator expressions¶

• Recall list comprehension
• Use ( ) instead of [ ]
In [9]:
g = (2 * num for num in range(10))
g

Out[9]:
<generator object <genexpr> at 0x0000000004335A20>

## List comprehensions vs. generators¶

• List comprehension - returns a list
• Generators - returns a generator object
• Both can be iterated over
In [13]:
(num for num in range(10*1000000) if num % 2 == 0)

Out[13]:
<generator object <genexpr> at 0x0000000004335E10>

## Generator functions¶

#### Generator functions are functions that, like generator expressions, yield a series of values, instead of returning a single value. A generator function is defined as you do a regular function, but whenever it generates a value, it uses the keyword yield instead of return.¶

• Produces generator objects when called
• Defined like a regular function - def
• Yields a sequence of values instead of returning a single value
• Generates a value with yield keyword
In [15]:
def num_sequence(n):

"""Generate values from 0 to n."""
i = 0
while i < n:
yield i
i += 1

In [17]:
test=num_sequence(7)
print type(test)

<type 'generator'>

In [21]:
next(test)

Out[21]:
3
In [22]:
test.next()

Out[22]:
4

## List comprehensions for time-stamped data¶

### the pandas Series¶

• single-dimension arrays
• Extract the column 'created_at' from df and assign the result to tweet_time. Fun fact: the extracted column in tweet_time here is a Series data structure!
• reate a list comprehension that extracts the time from each row in tweet_time. Each row is a string that represents a timestamp, and you will access the 11th to 18th characters in the string to extract the time. Use entry as the iterator variable and assign the result to tweet_clock_time.
In [27]:
import pandas as pd

# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']

# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time]

# Print the extracted times
print(tweet_clock_time[:100])

['05:24:51', '05:24:57', '05:25:38', '05:25:42', '05:25:48', '05:25:53', '05:25:58', '05:26:12', '05:26:27', '05:26:30', '05:26:35', '05:26:48', '05:27:56', '05:28:28', '05:28:28', '05:28:40', '05:28:55', '05:30:06', '05:30:18', '05:30:20', '05:30:53', '05:30:55', '05:31:41', '05:32:20', '05:32:23', '05:32:32', '05:34:11', '05:34:17', '05:36:07', '05:38:17', '05:38:26', '05:39:39', '05:39:48', '05:40:07', '05:40:19', '05:40:58', '05:41:06', '05:41:21', '05:41:34', '05:41:51', '05:42:13', '05:42:51', '05:43:20', '05:43:24', '05:43:34', '05:44:36', '05:45:16', '05:45:40', '05:46:38', '05:46:40', '05:46:56', '05:47:07', '05:47:36', '05:47:44', '05:47:50', '05:48:01', '05:48:19', '05:49:10', '05:49:31', '05:49:36', '05:49:39', '05:49:39', '05:49:48', '05:49:52', '05:49:54', '05:50:04', '05:50:07', '05:50:16', '05:50:21', '05:50:35', '05:50:46', '05:50:49', '05:50:49', '05:50:56', '05:51:15', '05:51:26', '05:51:28', '05:51:43', '05:52:27', '05:52:32', '05:52:35', '05:52:45', '05:53:00', '05:53:33', '05:53:37', '05:53:55', '05:53:59', '05:54:14', '05:54:26', '05:54:55', '05:54:59', '05:55:25', '05:55:31', '05:55:39', '05:55:53', '05:55:57', '05:56:02', '05:56:14', '05:56:17', '05:56:29']


### Conditional list comprehesions for time-stamped data¶

• add a conditional expression to the list comprehension so that you only select the times in which entry[17:19] is equal to '19'
In [28]:
# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']

# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time if entry[17:19] == '19']

# Print the extracted times
print(tweet_clock_time)

['05:40:19', '05:48:19', '06:02:19', '06:03:19', '04:56:19', '05:40:19', '05:48:19', '06:02:19', '06:03:19', '03:31:19', '03:54:19', '04:23:19']

In [ ]: