{ "cells": [ { "cell_type": "markdown", "id": "2ed9a4c2", "metadata": {}, "source": [ "# Beautiful Soup\n", "\n", ">[Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/) is a Python package for parsing \n", "> HTML and XML documents (including those with malformed markup, such as unclosed tags; it is named after tag soup). \n", "> It creates a parse tree for parsed pages that can be used to extract data from HTML, which \n", "> is useful for web scraping.\n", "\n", "`Beautiful Soup` offers fine-grained control over HTML content, enabling specific tag extraction, removal, and content cleaning. \n", "\n", "It's suited for cases where you want to extract specific information and clean up the HTML content according to your needs.\n", "\n", "For example, we can scrape text content within `
,