list comprehensions and generators¶

Nested list comprehensions¶

[[output expression] for iterator variable in iterable]

Collapse for loops for building lists into a single line
- Components
  - Iterable
  - Iterator variable (represent members of iterable)
  - Output expression
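To make the three components concrete, here is a minimal sketch that builds the same list twice, first with an explicit for loop and then with a comprehension. The names squares_loop and squares_comp are illustrative and not part of the original notes.

```python
# Build a list with an explicit for loop
squares_loop = []
for num in range(5):               # num: iterator variable, range(5): iterable
    squares_loop.append(num ** 2)  # num ** 2: output expression

# The same list as a single-line comprehension
squares_comp = [num ** 2 for num in range(5)]

print(squares_loop)
print(squares_comp)
```

Both print statements show the same list, which is the sense in which a comprehension "collapses" the loop.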

# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
    print(row)

[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]

pair_2=[(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]
pair_2

[(0, 6), (0, 7), (1, 6), (1, 7)]
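A comprehension with two for clauses behaves like two nested loops, with the leftmost clause as the outer loop. The sketch below spells that out; pairs_loop and pairs_comp are illustrative names.

```python
# Equivalent nested loops: the first 'for' clause is the outer loop
pairs_loop = []
for num1 in range(0, 2):
    for num2 in range(6, 8):
        pairs_loop.append((num1, num2))

# The comprehension lists its 'for' clauses in the same order
pairs_comp = [(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]

print(pairs_loop == pairs_comp)
```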

Using conditionals in comprehensions¶

[output expression for iterator variable in iterable if predicate expression]

# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member for member in fellowship if len(member) >= 7]

# Print the new list
print(new_fellowship)

['samwise', 'aragorn', 'legolas', 'boromir']

# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member if len(member) >= 7 else '' for member in fellowship]

# Print the new list
print(new_fellowship)

['', 'samwise', '', 'aragorn', 'legolas', 'boromir', '']
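The two examples above place the condition in different positions, and the position changes the meaning. A trailing if filters items out, while an if/else before the for is a conditional expression that keeps one output per input. The following sketch puts them side by side and also combines them; the variable names filtered, replaced, and shouted are illustrative.

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Trailing 'if': a filter, so the result can be shorter than the input
filtered = [member for member in fellowship if len(member) >= 7]

# 'if/else' before 'for': a conditional expression, one output per input
replaced = [member if len(member) >= 7 else '' for member in fellowship]

# Both together: filter first, then choose the output for each survivor
shouted = [member.upper() if member.startswith('a') else member
           for member in fellowship if len(member) >= 7]

print(len(filtered), len(replaced))
print(shouted)
```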

Dict comprehensions¶

Recall that the main difference between a list comprehension and a dict comprehension is the use of curly braces {} instead of []. Additionally, members of the dictionary are created using a colon :, as in key:value
- Create dictionaries
- Use curly braces {} instead of brackets []

# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create dict comprehension: new_fellowship
new_fellowship = {member:len(member) for member in fellowship}

# Print the new dict
print(new_fellowship)

{'aragorn': 7, 'frodo': 5, 'samwise': 7, 'merry': 5, 'gimli': 5, 'boromir': 7, 'legolas': 7}
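A dict comprehension accepts the same trailing if filter as a list comprehension. The sketch below keeps only the longer names; long_names is an illustrative name. Note that the output shown above lists keys in an arbitrary order, which is what older Python versions produce; from Python 3.7 onward a dict preserves insertion order, so the keys come out in the order of the source list.

```python
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# key: value pairs, filtered by a predicate on the key
long_names = {member: len(member) for member in fellowship if len(member) >= 7}

print(long_names)
```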

Generator expressions¶

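Recall list comprehensions
- Use ( ) instead of [ ]
- The result is a generator object that computes values lazily, one at a time, instead of building the whole list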

g = (2 * num for num in range(10))
g

<generator object <genexpr> at 0x0000000004335A20>

List comprehensions vs. generators¶

List comprehension - returns a list
Generator expression - returns a generator object
Both can be iterated over
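Iteration over both can be sketched as follows. One practical difference, shown here, is that a list can be traversed repeatedly while a generator is used up after a single pass. The names doubled_list and doubled_gen are illustrative.

```python
doubled_list = [2 * num for num in range(5)]
doubled_gen = (2 * num for num in range(5))

# A list can be iterated any number of times
first_pass = [value for value in doubled_list]
second_pass = [value for value in doubled_list]

# A generator can be advanced manually with next() ...
first = next(doubled_gen)
# ... and list() (or a for loop) continues from where next() stopped
rest = list(doubled_gen)

# Once exhausted, a generator produces nothing more
leftover = list(doubled_gen)

print(first_pass == second_pass)
print(first, rest, leftover)
```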

(num for num in range(10*1000000) if num % 2 == 0)

<generator object <genexpr> at 0x0000000004335E10>
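The reason to prefer a generator for a range this large is memory: the list holds every element at once, while the generator holds only its current state. A rough sketch of the difference, using a smaller range so it runs quickly, is below. sys.getsizeof measures only the container itself, not the integers it refers to, so treat the numbers as indicative rather than exact.

```python
import sys

n = 100000  # smaller than the example above, chosen only to keep the demo fast

as_list = [num for num in range(n) if num % 2 == 0]
as_gen = (num for num in range(n) if num % 2 == 0)

# The list grows with n; the generator object stays small
print(sys.getsizeof(as_list))
print(sys.getsizeof(as_gen))

# The generator can still feed any consumer of an iterable, such as sum()
total = sum(as_gen)
print(total)
```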

Generator functions¶

Generator functions are functions that, like generator expressions, yield a series of values instead of returning a single value. A generator function is defined like a regular function, but whenever it produces a value it uses the keyword yield instead of return.

Produces generator objects when called

Defined like a regular function - def

Yields a sequence of values instead of returning a single value

Generates a value with yield keyword

def num_sequence(n):
    """Generate values from 0 up to, but not including, n."""
    i = 0
    while i < n:
        yield i
        i += 1

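test = num_sequence(7)
print(type(test))

<class 'generator'>

# Each call to next() resumes the function and returns the next yielded value.
# The outputs below imply that next() had already been called three times on test.
next(test)

3

# Python 3 has no .next() method on generators; use the built-in next() instead
next(test)

4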
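In practice a generator function is usually consumed by a for loop or by list() rather than by repeated next() calls. The sketch below shows that, along with what happens when the values run out: next() raises StopIteration. The function is repeated so the block runs on its own; values, gen, and exhausted are illustrative names.

```python
def num_sequence(n):
    """Yield the integers 0, 1, ..., n - 1."""
    i = 0
    while i < n:
        yield i
        i += 1

# list() (like a for loop) drives the generator until it finishes
values = list(num_sequence(4))

# Calling next() past the end raises StopIteration
gen = num_sequence(2)
next(gen)
next(gen)
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(values, exhausted)
```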

List comprehensions for time-stamped data¶

the pandas Series¶

A pandas Series is a one-dimensional labeled array.

Extract the column 'created_at' from df and assign the result to tweet_time. Fun fact: the extracted column in tweet_time here is a Series data structure!

Create a list comprehension that extracts the time from each row in tweet_time. Each row is a string that represents a timestamp, and the slice entry[11:19] picks out the characters at indices 11 through 18, which hold the clock time. Use entry as the iterator variable and assign the result to tweet_clock_time.

import pandas as pd

df = pd.read_csv('tweets.csv')
    
# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']

# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time]

# Print the extracted times
print(tweet_clock_time[:100])

['05:24:51', '05:24:57', '05:25:38', '05:25:42', '05:25:48', '05:25:53', '05:25:58', '05:26:12', '05:26:27', '05:26:30', '05:26:35', '05:26:48', '05:27:56', '05:28:28', '05:28:28', '05:28:40', '05:28:55', '05:30:06', '05:30:18', '05:30:20', '05:30:53', '05:30:55', '05:31:41', '05:32:20', '05:32:23', '05:32:32', '05:34:11', '05:34:17', '05:36:07', '05:38:17', '05:38:26', '05:39:39', '05:39:48', '05:40:07', '05:40:19', '05:40:58', '05:41:06', '05:41:21', '05:41:34', '05:41:51', '05:42:13', '05:42:51', '05:43:20', '05:43:24', '05:43:34', '05:44:36', '05:45:16', '05:45:40', '05:46:38', '05:46:40', '05:46:56', '05:47:07', '05:47:36', '05:47:44', '05:47:50', '05:48:01', '05:48:19', '05:49:10', '05:49:31', '05:49:36', '05:49:39', '05:49:39', '05:49:48', '05:49:52', '05:49:54', '05:50:04', '05:50:07', '05:50:16', '05:50:21', '05:50:35', '05:50:46', '05:50:49', '05:50:49', '05:50:56', '05:51:15', '05:51:26', '05:51:28', '05:51:43', '05:52:27', '05:52:32', '05:52:35', '05:52:45', '05:53:00', '05:53:33', '05:53:37', '05:53:55', '05:53:59', '05:54:14', '05:54:26', '05:54:55', '05:54:59', '05:55:25', '05:55:31', '05:55:39', '05:55:53', '05:55:57', '05:56:02', '05:56:14', '05:56:17', '05:56:29']
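The slice indices are easier to check on a couple of literal strings than on the full CSV. The sample entries below are hypothetical and assume a timestamp layout like 'Tue Mar 29 23:40:17 +0000 2016', which is consistent with the HH:MM:SS output above but is not confirmed by the notes. Slices work on the plain strings, so pandas is not needed for this check.

```python
# Hypothetical sample entries (assumed layout, not taken from tweets.csv)
sample_times = [
    'Tue Mar 29 23:40:17 +0000 2016',
    'Tue Mar 29 23:40:19 +0000 2016',
]

# Indices 11..18 hold HH:MM:SS (the end index 19 is excluded)
clock_times = [entry[11:19] for entry in sample_times]

# Indices 17..18 hold just the seconds
seconds = [entry[17:19] for entry in sample_times]

print(clock_times)
print(seconds)
```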

Conditional list comprehensions for time-stamped data¶

Add a conditional expression to the list comprehension so that you only select the times in which entry[17:19] is equal to '19', that is, timestamps whose seconds field is 19.

# Extract the created_at column from df: tweet_time
tweet_time = df['created_at']

# Extract the clock time: tweet_clock_time
tweet_clock_time = [entry[11:19] for entry in tweet_time if entry[17:19] == '19']

# Print the extracted times
print(tweet_clock_time)

['05:40:19', '05:48:19', '06:02:19', '06:03:19', '04:56:19', '05:40:19', '05:48:19', '06:02:19', '06:03:19', '03:31:19', '03:54:19', '04:23:19']
