Trouble retrieving artist name from Billboard top 100 site using BeautifulSoup


I am trying to retrieve the most popular songs from the Billboard Hot 100 page (the URL in the code below) using the Python package BeautifulSoup. When I look up the span with the artist name, it finds the proper span, but when I call .text on the span it doesn't return the text between the span tags.

Here is my code:

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.billboard.com/charts/hot-100/')
soup = BeautifulSoup(r.content, 'html.parser')
result = soup.find_all('div', class_='o-chart-results-list-row-container')
for res in result:
    songName = res.find('h3').text.strip()
    artist = res.find('span',class_='c-label a-no-trucate a-font-primary-s [email protected] [email protected] u-letter-spacing-0021 lrv-u-display-block a-truncate-ellipsis-2line u-max-width-330 [email protected]').text
    print("song: "+songName)
    print("artist: "+ str(artist))
    print("___________________________________________________")

Which currently prints the following per song:

song: Waiting On A Miracle
artist: <span class="c-label a-no-trucate a-font-primary-s [email protected] [email protected] u-letter-spacing-0021 lrv-u-display-block a-truncate-ellipsis-2line u-max-width-330 [email protected]">

        Stephanie Beatriz
</span>
___________________________________________________

How do I pull only the artist’s name?

How to solve:


Method 1

If even a single character in the class string is off, the lookup won't match. I'd simplify it: once you have the song title, the artist is in the next <span> tag. So get the <h3> tag as you already do for the song, then use .find_next() to get the artist:

import requests
from bs4 import BeautifulSoup

r = requests.get('https://www.billboard.com/charts/hot-100/')
soup = BeautifulSoup(r.content, 'html.parser')
result = soup.find_all('div', class_='o-chart-results-list-row-container')
for res in result:
    songName = res.find('h3').text.strip()
    artist = res.find('h3').find_next('span').text.strip()
    print("song: "+songName)
    print("artist: "+ str(artist))
    print("___________________________________________________")

Output:

song: Heat Waves
artist: Glass Animals
___________________________________________________
song: Stay
artist: The Kid LAROI & Justin Bieber
___________________________________________________
song: Super Gremlin
artist: Kodak Black
___________________________________________________
song: abcdefu
artist: GAYLE
___________________________________________________
song: Ghost
artist: Justin Bieber
___________________________________________________
song: We Don't Talk About Bruno
artist: Carolina Gaitan, Mauro Castillo, Adassa, Rhenzy Feliz, Diane Guerrero, Stephanie Beatriz & Encanto Cast
___________________________________________________
song: Enemy
artist: Imagine Dragons X JID
___________________________________________________

....
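
To illustrate why the long class string is fragile, here is a minimal sketch that runs against a small inline HTML snippet instead of the live site. The markup, the shortened class list, and the variable names are assumptions made up for the example. The real Billboard page may be structured differently and can change at any time.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating one chart row; the live page's classes may differ.
SAMPLE_HTML = """
<div class="o-chart-results-list-row-container">
  <h3 class="c-title">
      Heat Waves
  </h3>
  <span class="c-label a-no-trucate a-font-primary-s">
      Glass Animals
  </span>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
row = soup.find("div", class_="o-chart-results-list-row-container")

# A multi-class string must equal the class attribute exactly, so the same
# classes in a different order find nothing.
reordered = row.find("span", class_="a-no-trucate c-label a-font-primary-s")

# A single class token matches any element that carries that class.
by_token = row.find("span", class_="a-no-trucate")

# A CSS selector matches the listed classes regardless of their order.
by_selector = row.select_one("span.a-no-trucate.c-label")

# Positional lookup, as in Method 1: the first <span> after the <h3>.
by_position = row.find("h3").find_next("span")

print(reordered)
for tag in (by_token, by_selector, by_position):
    print(tag.get_text(strip=True))
```

Matching on one stable class token or a CSS selector avoids copying the full class attribute, and get_text(strip=True) trims the surrounding whitespace in one call. Which lookup holds up best depends on how the real page is laid out.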


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
