This is a quick data visualization project that consolidates four streaming-service datasets from Shivam Bansal’s Kaggle repo. The streaming services included are Amazon Prime, Disney Plus, Hulu, and Netflix. All datasets are current as of Dec 12, 2021.
I implement the project using the following tools and steps:
Jupyter Notebook, Python – with the CSV files downloaded, I clean and combine the various datasets
Google Drive (Google Sheets) – upload the combined dataset for storage and later retrieval
Tableau (Public) – use the built-in Google Sheets connector and visualize the data using a dashboard
Results
Jupyter Notebook
I use Pandas to transform the CSV files into dataframes and combine them. The initial result includes listings for movies and TV shows, so movies are later removed. Some columns for cohorts (such as release_decade) are also included in the final output to anticipate categorizations in the visualization. The file can be downloaded using the link below.
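For reference, a minimal sketch of this cleaning-and-combining step might look like the code below. The file names are placeholders, and the type and release_year columns are assumed from the Kaggle datasets’ schema:

import pandas as pd

# Placeholder file names for the four Kaggle CSVs
files = {
    "Amazon Prime": "amazon_prime_titles.csv",
    "Disney Plus": "disney_plus_titles.csv",
    "Hulu": "hulu_titles.csv",
    "Netflix": "netflix_titles.csv",
}

frames = []
for service, path in files.items():
    df = pd.read_csv(path)
    df["service"] = service  # tag each row with its streaming service
    frames.append(df)

combined = pd.concat(frames, ignore_index=True)

# Keep TV shows only, since movies are removed from the final output
shows = combined[combined["type"] == "TV Show"].copy()

# Cohort column: release decade (e.g. 1998 -> 1990)
shows["release_decade"] = (shows["release_year"] // 10) * 10

# Export for upload to Google Sheets
shows.to_csv("streaming_tv_shows.csv", index=False)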
This dashboard is my foray into a more ‘fluid’ layout: making strong use of floating objects (vs. tiled) and opting out of the default tabular headers in favor of my own labels built from icons and other graphic cues.
I’m not necessarily following Basho’s advice, but I figured if I were to stray, I’d make the detour pythonic. So I made a haiku generator called randomBasho, which uses a simple randomizer to derive ‘new’ haikus from over a hundred Basho haikus. My goals are as follows:
to put a fresh spin on something centuries old; to generate poems that still retain the same contemplative energy and poetic tone as their source, but unearth new interpretations or meanings behind Basho’s lines
from a technical standpoint, to reorient myself with basic Pythonic concepts, such as iterators, functions, and data sets
from a creative standpoint, to use this as a starting point to experiment and explore poetic possibilities in code
I wrote my first few attempts at randomizing simply to get reacquainted with Python. Having learned some of these basic concepts a few years ago (in Python 2.x), I wanted to check my comfort level with Pythonic building blocks and with the Python 3.x updates.
Step 1. Randomization
So I could focus on this task, I initially narrowed the technical scope by creating three lists with distinct items. The lists are line1, line2, and line3, which respectively contain (and correspond to) the haikus’ first, second, and third lines.
By making the poem number and line number explicit in the item names in all of the lists below, I was able to test whether the randomization actually worked. My expected end result was a Frankenstein haiku, with lines from different poems.
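The test lists aren’t reproduced here, but a placeholder version might look something like this (each item name encodes its poem and line number so a shuffled result is easy to verify by eye):

# Hypothetical test data: five poems, three lines each
line1 = ['poem1-line1', 'poem2-line1', 'poem3-line1', 'poem4-line1', 'poem5-line1']
line2 = ['poem1-line2', 'poem2-line2', 'poem3-line2', 'poem4-line2', 'poem5-line2']
line3 = ['poem1-line3', 'poem2-line3', 'poem3-line3', 'poem4-line3', 'poem5-line3']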
I used an iterator, since the task of printing a haiku line (after grabbing an item) needed to be repeated for each list.
import random

def randombasho1(x, y, z):
    r = [x, y, z]
    j = 0
    for i in r:
        # Print a random line (index 0-4) from the current list
        print(r[j][random.randint(0, 4)])
        j = j + 1

randombasho1(line1, line2, line3)
I dropped the j = 0 and j = j + 1 and opted for a range(0, 3), since my haikus all have 3 lines.
def randombasho2(x, y, z):
    r = [x, y, z]
    for i in range(0, 3):
        print(r[i][random.randint(0, 4)])

randombasho2(line1, line2, line3)
I removed the r = [x, y, z] and used *args so I could repurpose the code in non-haiku use cases.
def randombasho3(*args):
    for i in range(0, len(args)):
        print(args[i][random.randint(0, 4)])

randombasho3(line1, line2, line3)
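A further tweak I could make (not part of the original code) is to use random.choice, which removes the hard-coded assumption of five items per list:

def randombasho4(*args):
    # random.choice works for any list length, so the
    # hard-coded index range is no longer needed
    for lines in args:
        print(random.choice(lines))

randombasho4(line1, line2, line3)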
Additional Code
After I figured out the randomization code, the next milestone for me was to automatically generate lists line1, line2 and line3.
I decoupled the data (in this case, the haikus) from the code itself, so that I can potentially reuse the randomization code in another application (for instance, another haiku poet or, potentially, another poetic form).
The .py file of the haiku data contains one list with three items: set1, set2, and set3. Each item is a long string containing multiple full haikus. The excerpt below is from set1.
# Some of the poems in the rbasho02_haikus list
The door of thatched hut
Also changed the owner.
At the Doll's Festival.
Spring is passing.
The birds cry, and the fishes fill
With tears on their eyes.
Grasses in summer.
The warriors' dreams
All that left.
The early summer rain
Leaves behind
Hikari-do.
Ah, tranquility!
Penetrating the very rock,
A cicada's voice.
The early summer rain,
Gathering it and fast
Mogami River.
To an old pond
A frog leaps in.
And the sound of the water.
Saying something,
The lip feeling cold.
The Autumn wind.
Tying the Chimaki,
Other hand hold,
Her bangs.
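One thing to note: the code below calls .splitlines() on rbh.haikus, so it expects a single string rather than a list. A minimal sketch of a data module that satisfies that assumption might look like this (the set1/set2/set3 contents are abbreviated placeholders, not the full data):

# rbasho02_haikus.py (abbreviated sketch)
set1 = """The door of thatched hut
Also changed the owner.
At the Doll's Festival."""

set2 = """..."""  # remaining haikus
set3 = """..."""

# Join the three sets into one long string so that
# haikus.splitlines() yields every line in order
haikus = "\n".join([set1, set2, set3])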
I ended up using .splitlines() to handle the splitting. Then I used index() to determine each line’s position within its haiku, and therefore which list it should be placed in. As an homage, I named the final argument basho, which is a list of line1, line2, and line3.
import random
import rbasho02_haikus as rbh

# rbh.haikus is expected to be one long string of haiku lines
haikus = rbh.haikus.splitlines()

h_list = []
h_dict = {}
line1 = []
line2 = []
line3 = []
basho = [line1, line2, line3]

# Keep non-empty lines, capitalizing the first letter of each
for h in haikus:
    if len(h) > 0:
        h = h[0].upper() + h[1:]
        h_list.append(h)

# Number each line starting at 1 (index() assumes the lines are unique)
for h in h_list:
    k = h_list.index(h) + 1
    h_dict.update({k: h})

# A line's number modulo 3 gives its position within its haiku
for h in h_dict:
    if h % 3 == 0:
        line3.append(h_dict[h])
    elif h % 3 == 2:
        line2.append(h_dict[h])
    else:
        line1.append(h_dict[h])

def randomhaiku(h):
    for i in h:
        r = random.randint(0, len(i) - 1)
        print('[{0:02}] {1}'.format(r + 1, i[r]))

randomhaiku(basho)
Notes
Samples
Here are some sample generated poems:
[07] To an old pond
[19] The shallows—
[36] Look like someone else

[41] Trickles all night long
[22] Indeed this is just
[02] With tears on their eyes.
Other potential projects
Here are some potential projects that can make good use of the existing code:
contemporaryHaiku – a haiku generator that uses the classic 5-7-5 form, but references themes of contemporary / modern life, especially technology
randomRilke – almost the same internal code, but references Rilke’s Sonnets to Orpheus
Attribution
Translated versions came from the following sources: