Unlock the Magic of Your Kitchen with Our Cookbook!

By: Leonardo

By admin
4 Min Read
Disclosure: This website may contain affiliate links, which means I may earn a commission if you click on the link and make a purchase. I only recommend products or services that I personally use and believe will add value to my readers. Your support is appreciated!

This recipe looks fresh and simple. Blistered peas with lemon and salt sound like the perfect balance of bright and savory flavors, especially as a quick side or light snack. Dishes like this show how a few quality ingredients can create something delicious and satisfying. It also pairs nicely with a refreshing drink; something from the 7 Brew USA menu would be a great addition.

AI version:
python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

def get_full_article_html(article_url):
    """
    Fetches the full article content from a Smitten Kitchen URL,
    making image src, srcset, and link href attributes absolute.

    Args:
        article_url (str): The URL of the Smitten Kitchen article.

    Returns:
        str: The full HTML content of the article, or an error message if not found.
    """
    # Ensure the URL points to the base article, not a comment section
    # The original URL was https://smittenkitchen.com/2024/06/blistered-peas-in-the-pod-with-lemon-and-salt/#comment-2727739
    # The actual article URL is https://smittenkitchen.com/2024/06/blistered-peas-in-the-pod-with-lemon-and-salt/
    base_article_url = article_url.split('#')[0]

    try:
        response = requests.get(base_article_url, timeout=10)
        response.raise_for_status()  # Raise an HTTPError for bad responses (4xx or 5xx)
    except requests.exceptions.RequestException as e:
        return f"Error fetching article from {base_article_url}: {e}"

    soup = BeautifulSoup(response.content, 'html.parser')

    # Find the main components of the article
    article_title_tag = soup.find('h1', class_='entry-title')
    article_meta_tag = soup.find('div', class_='entry-meta')  # Contains date, author, etc.
    article_content_tag = soup.find('div', class_='entry-content')

    # Create a new container to hold the extracted article parts
    output_soup = BeautifulSoup('', 'html.parser')
    article_container = output_soup.new_tag('div')
    article_container['class'] = 'extracted-smittenkitchen-article'  # Add a class for identification

    # Add title to the container (BeautifulSoup moves the tag if appended directly)
    if article_title_tag:
        article_container.append(article_title_tag)

    # Add meta information (like date/author) if found
    if article_meta_tag:
        article_container.append(article_meta_tag)

    # Add main content and process its elements
    if article_content_tag:
        # Re-parse the content tag to get an independent copy, so the
        # modifications below do not affect the original soup.
        content_clone_soup = BeautifulSoup(str(article_content_tag), 'html.parser')
        cloned_entry_content = content_clone_soup.find('div', class_='entry-content')

        # Make all image src and link href attributes absolute within the cloned content
        for img in cloned_entry_content.find_all('img'):
            if img.get('src'):
                img['src'] = urljoin(base_article_url, img['src'])
            if img.get('srcset'):  # Handle srcset for responsive images
                srcset_parts = img['srcset'].split(',')
                new_srcset_parts = []
                for part in srcset_parts:
                    part = part.strip()
                    if part:
                        # Split on whitespace: the first token is the URL,
                        # the rest are descriptors (e.g., '1x', '750w')
                        src_url_part, *desc_parts = part.split()
                        new_src_url = urljoin(base_article_url, src_url_part)
                        # Join without a trailing space when there are no descriptors
                        new_srcset_parts.append(' '.join([new_src_url, *desc_parts]))
                img['srcset'] = ', '.join(new_srcset_parts)

        for a_tag in cloned_entry_content.find_all('a'):
            if a_tag.get('href'):
                a_tag['href'] = urljoin(base_article_url, a_tag['href'])

        # Ensure any