
If the dashboard doesn't load or display fully, view it directly on my Tableau page.

Key Takeaways

2022 was my biggest year for streaming: 15 485 total streams and 768 hours of listening.
I listened to more unique artists and tracks in 2020 (1 205 artists and 3 549 tracks) than in 2022 (817 artists and 2 705 tracks), even though I spent more time listening in 2022. This suggests that 2020 was a more varied year, while 2022 was concentrated on favorites and repeat listens.
My peak listening week was the week of January 30, 2022, when I streamed over 44 hours of music, my highest weekly total on record.
My listening habits shifted over time:
- From 2016 to 2018, I mainly listened around 7 AM and 5 PM, likely during commutes.
- From 2019 on, patterns gradually changed with remote classes (COVID-19 lockdowns), internships, and later office work. Listening became more evenly spread across the day, with peaks in the afternoon and evening.

Data Overview

Using Spotify’s “Download your data” tool, users can request a copy of their personal data. Spotify offers three types of data packages, which can be downloaded individually or together:

- Account data
- Extended streaming history
- Technical log information

For this project, I requested my extended streaming history, which contains detailed records of all audio and video content I’ve streamed since opening my account, including track metadata, timestamps, and playback behavior. According to Spotify, preparation can take up to 30 days.

A few days after submitting my request, I received a ZIP file containing 11 JSON files, each representing a segment of my listening history:

Streaming_History_Audio_2014-2018_0.json
Streaming_History_Audio_2018-2019_1.json
Streaming_History_Audio_2019-2020_2.json

...
Streaming_History_Audio_2023-2024_8.json
Streaming_History_Audio_2024-2025_9.json
Streaming_History_Video_2020-2024.json

Here’s a sample JSON object and what each field means:

{
  "ts": "2024-11-11T03:30:29Z",                     // Date and time when the stream ended (UTC)
  "platform": "ios",                                // Platform used to stream the track
  "ms_played": 1514,                                // Duration the track was played (in milliseconds)
  "conn_country": "CA",                             // Country code where the stream occurred
  "ip_addr": "24.202.7.143",                        // IP address used during the stream
  "master_metadata_track_name": "Supernova",        // Name of the track
  "master_metadata_album_artist_name": "aespa",     // Name of the artist or band
  "master_metadata_album_album_name": "Armageddon - The 1st Album",  // Name of the album
  "spotify_track_uri": "spotify:track:5lKnZbdGCBViitE1Ce5TZh",       // Spotify URI identifying the track
  "episode_name": null,                             // Name of the podcast episode (if applicable)
  "episode_show_name": null,                        // Name of the podcast show (if applicable)
  "spotify_episode_uri": null,                      // Spotify URI identifying the podcast episode
  "audiobook_title": null,                          // Name of the audiobook (if applicable)
  "audiobook_uri": null,                            // Spotify URI identifying the audiobook
  "audiobook_chapter_uri": null,                    // Spotify URI identifying the audiobook chapter
  "audiobook_chapter_title": null,                  // Name of the audiobook chapter
  "reason_start": "clickrow",                       // Why the track started (e.g., clickrow, autoplay)
  "reason_end": "endplay",                          // Why the track ended (e.g., endplay, forwardbutton)
  "shuffle": true,                                  // Whether shuffle mode was used
  "skipped": true,                                  // Whether the user skipped the track
  "offline": false,                                 // Whether the track was played offline
  "offline_timestamp": 1731295828,                  // Timestamp of when offline mode was used (if used)
  "incognito_mode": false                           // Whether the track was played in a private session
}

This data structure is consistent across all audio and video streaming history files, with one JSON object representing each individual stream.
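
Because each stream is one JSON object, yearly totals like those in Key Takeaways can be approximated by grouping records on the year of ts. The following is a minimal plain-Python sketch, not the method behind the dashboard. The function name summarize_by_year is hypothetical, years are taken from the UTC timestamp rather than local time, and every music record counts as one stream regardless of how long it played, so the results may differ from the figures reported above.

```python
from collections import defaultdict
from datetime import datetime


def summarize_by_year(records):
    """Return {year: {"streams", "hours", "artists", "tracks"}} for music records."""
    buckets = defaultdict(
        lambda: {"streams": 0, "ms": 0, "artists": set(), "tracks": set()}
    )
    for rec in records:
        # Skip podcasts, audiobooks, and rows without track metadata.
        if rec.get("master_metadata_track_name") is None:
            continue
        # "ts" is an ISO 8601 UTC string such as "2024-11-11T03:30:29Z".
        year = datetime.strptime(rec["ts"], "%Y-%m-%dT%H:%M:%SZ").year
        bucket = buckets[year]
        bucket["streams"] += 1
        bucket["ms"] += rec["ms_played"]
        bucket["artists"].add(rec["master_metadata_album_artist_name"])
        bucket["tracks"].add(rec["spotify_track_uri"])
    return {
        year: {
            "streams": b["streams"],
            "hours": round(b["ms"] / 3_600_000, 2),  # milliseconds -> hours
            "artists": len(b["artists"]),
            "tracks": len(b["tracks"]),
        }
        for year, b in sorted(buckets.items())
    }
```

Passing the list obtained from json.load on one of the history files returns one summary dictionary per year.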

For detailed definitions of each field, Spotify provides a reference guide titled “Read Me First – Extended Streaming History.” You can access it on my GitHub.

Data Preparation

To prepare the data for analysis and visualization in Tableau, I followed these key steps using Python and pandas:

<aside>

1. Merge all files: Scanned the target folder for all .json files, loaded each with pandas, and concatenated them into a single master dataset.

2. Convert timestamps: Parsed the raw UTC timestamps (ts) into timezone-aware datetime objects.

3. Filter for music only: Removed every row in which a podcast or audiobook column (such as episode_name or audiobook_title) was non-null.

4. Remove duplicates: Dropped exact duplicate rows across all fields to ensure data integrity.

5. Handle missing data: Filtered out any rows missing either the track name or artist name.

6. Adjust for local time: Converted UTC timestamps to local time using country-specific time zones, falling back to UTC when no time zone was available for the country.

7. Export to CSV: Dropped temporary helper columns and saved the final cleaned dataset as a .csv file, ready for use in Tableau.

</aside>
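
The steps above can be sketched in pandas roughly as follows. This is a simplified illustration under stated assumptions, not the full script: the function names (load_history, clean_history, export_csv), the COUNTRY_TZ mapping, and the tz_name and ts_local columns are all hypothetical, and assigning a single time zone to each country is a simplification.

```python
import json
from pathlib import Path

import pandas as pd

# Hypothetical country -> IANA time zone mapping (one zone per country is a
# simplification); countries missing from it fall back to UTC.
COUNTRY_TZ = {"CA": "America/Toronto", "FR": "Europe/Paris"}

NON_MUSIC_COLS = [
    "episode_name", "episode_show_name", "spotify_episode_uri",
    "audiobook_title", "audiobook_uri",
    "audiobook_chapter_uri", "audiobook_chapter_title",
]


def load_history(folder):
    """Step 1: read every .json file in `folder` into one DataFrame."""
    frames = []
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as fh:
            frames.append(pd.DataFrame(json.load(fh)))
    return pd.concat(frames, ignore_index=True)


def clean_history(df):
    """Steps 2-6: parse timestamps, keep music, dedupe, add local time."""
    df = df.copy()
    # Step 2: parse UTC strings into timezone-aware datetimes.
    df["ts"] = pd.to_datetime(df["ts"], utc=True)
    # Step 3: keep rows where every podcast/audiobook column is null.
    present = [c for c in NON_MUSIC_COLS if c in df.columns]
    df = df[df[present].isna().all(axis=1)]
    # Step 4: drop exact duplicate rows.
    df = df.drop_duplicates()
    # Step 5: drop rows missing the track or artist name.
    df = df.dropna(
        subset=["master_metadata_track_name", "master_metadata_album_artist_name"]
    ).copy()
    # Step 6: convert each timestamp to naive local time for its country.
    df["tz_name"] = df["conn_country"].map(COUNTRY_TZ).fillna("UTC")
    df["ts_local"] = [
        ts.tz_convert(zone).tz_localize(None)
        for ts, zone in zip(df["ts"], df["tz_name"])
    ]
    return df.reset_index(drop=True)


def export_csv(df, path):
    """Step 7: drop the helper column and write the CSV for Tableau."""
    df.drop(columns=["tz_name"]).to_csv(path, index=False)
```

The local time is stored without a UTC offset so that every row shares one datetime column regardless of country, which keeps hour-of-day charts simple to build.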

Full Python code