beautifulsoup Tutorial => Accessing internal tags and their attributes of initially selected tag

Example

Assume you have selected part of a page with soup.find('div', class_='base class') and printed the result:

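from bs4 import BeautifulSoup

# SomePage is assumed to hold the page's HTML source as a string
soup = BeautifulSoup(SomePage, 'lxml')  # the 'lxml' parser must be installed separately
html = soup.find('div', class_='base class')
print(html)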

<div class="base class">
  <div>Sample text 1</div>
  <div>Sample text 2</div>
  <div>
    <a class="ordinary link" href="https://example.com">URL text</a>
  </div>
</div>

Other elements on the page, such as <div class="Confusing class"></div>, are not included, because find() returns only the matching tag.
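
A note on the selector used here: a multi-word string passed to class_ is compared against the exact value of the class attribute, so word order matters. The following is a minimal sketch of that behaviour. It assumes a small made-up page string, and it uses the built-in 'html.parser' instead of 'lxml' only so that no extra parser has to be installed:

```python
from bs4 import BeautifulSoup

# Hypothetical page source, modelled on the snippet in this example.
page = '''
<div class="base class">
  <div>Sample text 1</div>
</div>
<div class="Confusing class"></div>
'''

# 'html.parser' ships with Python; the original example uses 'lxml'.
soup = BeautifulSoup(page, 'html.parser')

# The full string matches the class attribute exactly as written.
exact = soup.find('div', class_='base class')

# Reordering the words no longer matches the attribute string.
reordered = soup.find('div', class_='class base')

# A single class name matches every tag that carries it.
by_single_class = soup.find_all('div', class_='class')

# A CSS selector matches on both classes regardless of their order.
by_selector = soup.select_one('div.class.base')

print(exact is not None)
print(reordered is None)
print(len(by_single_class))
print(by_selector['class'])
```

If the order of the classes in the page is not guaranteed, the CSS selector form is probably the safer choice.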

To read the href attribute of the nested <a> tag, access the tag by name on the selected element and then index it by attribute name:

a_tag = html.a        # first <a> tag found inside the selected <div>
link = a_tag['href']  # attributes are read like dictionary keys
print(link)

https://example.com
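
Indexing a tag with ['href'] raises KeyError when the attribute is absent, and html.a is None when the selected element contains no <a> tag. Below is a hedged sketch of a more defensive lookup. The helper name first_link_href and the page string are hypothetical, and 'html.parser' stands in for 'lxml':

```python
from bs4 import BeautifulSoup

def first_link_href(tag):
    """Return the href of the first <a> inside tag, or None if unavailable."""
    if tag is None:           # the parent itself was not found
        return None
    a_tag = tag.a             # None when there is no <a> descendant
    if a_tag is None:
        return None
    return a_tag.get('href')  # .get() returns None instead of raising KeyError

# Hypothetical page covering the normal case and two failure cases.
page = '''
<div class="base class">
  <div><a class="ordinary link" href="https://example.com">URL text</a></div>
</div>
<div class="no link"><span>nothing here</span></div>
<div class="bare link"><a>no href</a></div>
'''
soup = BeautifulSoup(page, 'html.parser')

print(first_link_href(soup.find('div', class_='base class')))
print(first_link_href(soup.find('div', class_='no link')))
print(first_link_href(soup.find('div', class_='bare link')))
print(first_link_href(soup.find('div', class_='missing')))
```

Only the first call prints a URL; the other three print None rather than raising an exception.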

This is useful when you can't select the <a> tag directly because its attributes don't identify it uniquely, for example when the parsed page contains other "twin" <a> tags. In that case you can uniquely select a parent tag that contains the <a> you need and reach the link from there.
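
The "twin" tag situation can be sketched as follows. The page string, the "sidebar" class and the second URL are made up for illustration, and 'html.parser' is used in place of 'lxml' only to avoid an extra dependency:

```python
from bs4 import BeautifulSoup

# Hypothetical page with two "twin" links that share the same class.
page = '''
<div class="sidebar">
  <a class="ordinary link" href="https://example.org/other">Other link</a>
</div>
<div class="base class">
  <div>Sample text 1</div>
  <div>
    <a class="ordinary link" href="https://example.com">URL text</a>
  </div>
</div>
'''
soup = BeautifulSoup(page, 'html.parser')

# Selecting by the link's own class is ambiguous: both links match.
twins = soup.find_all('a', class_='ordinary link')
print(len(twins))

# soup.a returns the first <a> in the whole document, here the unwanted one.
print(soup.a['href'])

# Narrowing to the uniquely identifiable parent first removes the ambiguity.
parent = soup.find('div', class_='base class')
print(parent.a['href'])
```

Here the last line prints https://example.com, while soup.a on its own picks up the sidebar link.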
