Six years of my Spotify playlists

ggplot2, gganimate, spotifyr, data wrangling, data visualization

An analysis of acoustic features with {spotifyr}

June Choe (University of Pennsylvania Linguistics) https://live-sas-www-ling.pantheon.sas.upenn.edu/
07-29-2020

Background

One of my longest running habits is making monthly playlists. At the start of every month, I create a new playlist for that month and add songs that I like. Some songs are carried over from the previous month’s playlist and others are songs that I newly discover, but they’re all representative of the songs that I’m “into” for that month.

I’ve been doing this for many years, and my best record of these monthly playlists covers the past six years, which is how long I’ve been using Spotify. So when I saw people on Twitter talking about {spotifyr}, an R wrapper for Spotify’s web API, I decided to take a stab at analyzing my monthly playlists (code here).

Analysis #1 - Size of monthly playlists over time

When I first pulled information about my Spotify account, I noticed that I had some gaps in my monthly playlists. This was a special case of non-random missing data: when I didn’t make a new playlist for a month, it was because I didn’t think there had been a substantial change in what I was jamming to since the previous month. The {zoo} package, which I hadn’t known about before, came in very handy for dealing with this missingness with its na.locf() (Last Observation Carried Forward) function.
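As a minimal sketch of how na.locf() fills this kind of gap, here is a toy example. The data frame and its column names (`month`, `playlist_name`, `playlist_filled`) are made up for illustration and are not the actual data or code from this analysis.

```r
library(zoo)

# Hypothetical data: one row per calendar month.
# NA marks a month where no new playlist was created.
playlists <- data.frame(
  month = seq(as.Date("2019-01-01"), by = "month", length.out = 6),
  playlist_name = c("Jan 2019", NA, NA, "Apr 2019", "May 2019", NA),
  stringsAsFactors = FALSE
)

# Carry the most recent playlist forward into the months that have none.
# na.rm = FALSE keeps the vector the same length as the input.
playlists$playlist_filled <- na.locf(playlists$playlist_name, na.rm = FALSE)

print(playlists)
```

In this toy example, February and March inherit the January playlist and June inherits the May playlist, which matches the idea that a skipped month means "same songs as before".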

After some more cleaning steps, I first made a simple plot that counted the number of songs that each monthly playlist had.

There are some interesting things I notice, and here’s some context for them.

Analysis #2 - Audio features

The real draw of the Spotify API is the audio features, which Spotify calculates using its own algorithms. Some of the features are listed in the table below (adapted from the documentation). Of these, I decided to narrow down to acousticness, danceability, energy, and valence because the others didn’t show much variation (e.g., I don’t listen to live-recorded music on Spotify, so liveness is always near zero).

| Feature | Description |
| --- | --- |
| acousticness | A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic. |
| danceability | Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable. |
| energy | Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy. |
| instrumentalness | Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0. |
| liveness | Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live. |
| loudness | The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB. |
| speechiness | Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks. |
| tempo | The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration. |
| valence | A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry). |

I was interested in looking at how my music taste changed over time, so for each monthly playlist, I calculated the mean values for these four features and made a line plot:
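The per-playlist averaging described here can be sketched in base R roughly as follows. The `tracks` data frame, its values, and the `playlist_month` column are hypothetical stand-ins for one-row-per-song data pulled from the API; the real analysis may well have been structured differently.

```r
# Hypothetical track-level data: one row per song, tagged with the month
# of the playlist it belongs to. Values are invented for illustration.
tracks <- data.frame(
  playlist_month = as.Date(c("2019-01-01", "2019-01-01",
                             "2019-02-01", "2019-02-01")),
  acousticness = c(0.10, 0.30, 0.20, 0.40),
  danceability = c(0.60, 0.80, 0.50, 0.70),
  energy       = c(0.70, 0.90, 0.60, 0.80),
  valence      = c(0.40, 0.60, 0.50, 0.50)
)

# Mean of each of the four features within each monthly playlist
monthly_means <- aggregate(
  cbind(acousticness, danceability, energy, valence) ~ playlist_month,
  data = tracks,
  FUN = mean
)

print(monthly_means)
```

The result has one row per month and one column per feature, which is the shape needed for a line plot with time on the x-axis and one line per feature.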

Some things that pop out:

Analysis #3 - Songs during college

Next, I wanted to focus on my years in college, from Fall 2016 to Spring 2020. For this analysis, I defined time in terms of school years and quarters. While I was a college student, I often felt like the passage of time was defined in terms of quarters, so this scale felt appropriate.

Here is the same line plot, except the feature values are averaged by quarter instead of month, and the plot is now faceted by school year:
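One way to re-bin monthly values into school years and quarters is sketched below. The month-to-quarter boundaries (Fall = Sep to Dec, Winter = Jan to Mar, Spring = Apr to Jun, Summer = Jul to Aug) and the helper names `to_quarter()` and `to_school_year()` are assumptions made for this example; the post does not spell out the exact cutoffs that were used.

```r
# Hypothetical helper: map a date to an academic quarter.
# The month boundaries are an assumption, not taken from the analysis.
to_quarter <- function(date) {
  m <- as.integer(format(date, "%m"))
  q <- ifelse(m >= 9, "Fall",
       ifelse(m <= 3, "Winter",
       ifelse(m <= 6, "Spring", "Summer")))
  factor(q, levels = c("Fall", "Winter", "Spring", "Summer"))
}

# Hypothetical helper: a school year starts in September, so January to
# August belong to the school year that began the previous calendar year.
to_school_year <- function(date) {
  y <- as.integer(format(date, "%Y"))
  m <- as.integer(format(date, "%m"))
  start <- ifelse(m >= 9, y, y - 1L)
  paste0(start, "-", start + 1L)
}

# Invented monthly averages for a single feature
monthly <- data.frame(
  month  = as.Date(c("2016-09-01", "2016-10-01", "2017-01-01",
                     "2017-02-01", "2017-04-01")),
  energy = c(0.70, 0.80, 0.60, 0.70, 0.90)
)
monthly$quarter     <- to_quarter(monthly$month)
monthly$school_year <- to_school_year(monthly$month)

# Average within each school year and quarter
quarterly <- aggregate(energy ~ school_year + quarter, data = monthly, FUN = mean)
print(quarterly)
```

Storing `quarter` as a factor with an explicit level order keeps Fall first on the x-axis, and `school_year` can then serve as the faceting variable.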

Observations:

Analysis #4 - “My Top 100” playlists

In my last analysis, I move from my monthly playlists to the end-of-the-year playlists that Spotify makes for you every year.

For this, I grabbed the audio features of the songs in my yearly top 100 playlists from 2016 to 2019. In this graph, each line represents a song, and the 10 most-listened-to songs of each year are emphasized in black. The thick red line in each panel represents the average of the songs for that year.

Some observations on the variation in audio features among my top 100 playlists:

Conclusion

I didn’t really dig too deeply into the acoustic profile of the songs I listen to in this post, and I doubt that Spotify’s list of audio features is comprehensive enough to describe my music taste, but this was a cool exercise!

And although I ignored several of the audio features because they weren’t very informative for the songs I listen to, I thought I should at least leave a summary table showing the mean values for all features that I gave in the table above!

| Feature | Description | Average |
| --- | --- | --- |
| acousticness | A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic. | 0.16 |
| danceability | Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable. | 0.64 |
| energy | Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy. | 0.75 |
| instrumentalness | Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0. | 0.02 |
| liveness | Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live. | 0.19 |
| loudness | The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB. | -4.95 |
| speechiness | Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks. | 0.08 |
| tempo | The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration. | 120.25 |
| valence | A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry). | 0.55 |
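For anyone who wants to build a similar summary, overall averages like the ones in the table above could be computed along these lines. The `all_tracks` data frame and its values are invented for the example (only three features are shown to keep it short), so the output will not match the numbers reported in the table.

```r
# Hypothetical track-level data with a few of the audio features.
# Values are invented and do not reproduce the averages reported above.
all_tracks <- data.frame(
  acousticness = c(0.10, 0.30),
  loudness     = c(-6.0, -4.0),
  tempo        = c(110, 130)
)

# Mean of every feature column, rounded to two decimal places
feature_means <- round(colMeans(all_tracks), 2)

# Reshape into a two-column summary table
summary_table <- data.frame(
  feature = names(feature_means),
  average = unname(feature_means)
)

print(summary_table)
```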