
Validating Sitemap Namespace with Pytest
Basic Pytest setup
Sitemaps help search engines index your website correctly, but a malformed or incorrectly namespaced sitemap can cause indexing problems. This post walks through using pytest to validate that your sitemap uses the correct XML namespace.
Why Namespace Validation Matters
The XML namespace ensures the sitemap conforms to the sitemaps.org protocol. If the namespace is incorrect or missing, search engines may ignore your sitemap, leading to poor visibility.
The Test Code
Below is the code that checks whether the sitemap at https://www.americanfreight.com/sitemap.xml is valid and correctly namespaced:
import pytest
import requests
import xml.etree.ElementTree as ET

# Define the sitemap URL
SITEMAP_URL = "https://www.americanfreight.com/sitemap.xml"


@pytest.fixture
def sitemap_content():
    """Fixture to fetch and return the sitemap XML content."""
    # A timeout keeps the test from hanging if the server does not respond
    response = requests.get(SITEMAP_URL, timeout=10)
    assert response.status_code == 200, f"Failed to fetch sitemap: {response.status_code}"
    return response.content


@pytest.fixture
def sitemap_root(sitemap_content):
    """Fixture to parse the sitemap XML and return the root element."""
    return ET.fromstring(sitemap_content)


def test_sitemap_fetch(sitemap_content):
    """Test that the sitemap can be fetched successfully."""
    assert sitemap_content is not None, "Sitemap content is empty"
    assert b"<?xml" in sitemap_content, "Sitemap is not valid XML"


def test_sitemap_root_namespace(sitemap_root):
    """Test that the sitemap has the correct XML namespace."""
    expected_namespace = "http://www.sitemaps.org/schemas/sitemap/0.9"
    # Parentheses let the assertion message continue on the next line
    assert sitemap_root.tag == f"{{{expected_namespace}}}urlset", (
        f"Unexpected root tag or namespace: {sitemap_root.tag}"
    )
Running the Test
To run the tests, use the following command in your terminal:
pytest test_sitemap_namespace.py
Test Breakdown
- sitemap_content: Fetches the sitemap and ensures it's accessible (status 200).
- sitemap_root: Parses the XML content for further inspection.
- test_sitemap_fetch: Confirms the sitemap content contains the <?xml declaration.
- test_sitemap_root_namespace: Verifies the root element includes the correct namespace defined by sitemaps.org.
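ElementTree stores a namespaced tag as {namespace}localname, which is why the root tag is compared against a string wrapped in braces. The sketch below illustrates that on an inline sample document, so it runs without a network call. The sample XML and the helper names split_tag and get_locs are assumptions made for illustration and are not part of the test file above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical sample sitemap, used only for illustration
SAMPLE_SITEMAP = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/about</loc></url>
</urlset>"""


def split_tag(tag):
    """Split an ElementTree tag of the form '{namespace}local' into its parts."""
    if tag.startswith("{"):
        namespace, _, local_name = tag[1:].partition("}")
        return namespace, local_name
    return "", tag


def get_locs(root):
    """Return the text of every namespaced <loc> element under <url>."""
    ns = {"sm": SITEMAP_NS}  # "sm" is an arbitrary prefix chosen for the lookup
    return [loc.text for loc in root.findall("sm:url/sm:loc", ns)]


root = ET.fromstring(SAMPLE_SITEMAP)
namespace, local_name = split_tag(root.tag)
```

The same namespace mapping could be reused in further tests, for example to assert that the sitemap contains at least one loc entry.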
Conclusion
These simple but powerful tests can catch issues early in your CI pipeline. Whether you're managing SEO for a large e-commerce platform or a personal blog, namespace correctness should be a non-negotiable in your automated checks.
How to Check URL Validity
Quick and Easy
A quick guide to validating URLs after a release using Pytest and Python's requests library.
Introduction
After a software release, ensuring that all URLs are accessible is critical. This blog post demonstrates how to use Pytest and the requests library to check if URLs return a 200 status code, indicating they are valid and accessible. This approach is fast, reliable, and easy to integrate into your testing pipeline.
Prerequisites
- Python 3.6 or higher
- Pytest (pip install pytest)
- Requests library (pip install requests)
Sample Code
Below is a simple Pytest script to check if a URL returns a 200 status code:
import pytest
import requests


def test_url_returns_200():
    url = "https://www.cryan.com"
    try:
        response = requests.get(url, timeout=5)
        assert response.status_code == 200
    except requests.RequestException as e:
        pytest.fail(f"Request failed: {e}")
This test sends a GET request to the specified URL and checks if the response status code is 200. If the request fails (e.g., due to a timeout or network issue), the test fails with an error message.
Running the Test
Save the code in a file (e.g., test_urls.py) and run it using the following command:
pytest test_urls.py -v
The -v flag provides verbose output, showing the test results in detail.
Scaling to Multiple URLs
To test multiple URLs, you can use Pytest's parameterization feature. Here's an example:
import pytest
import requests


@pytest.mark.parametrize("url", [
    "https://www.cryan.com",
    "https://www.example.com",
    "https://www.python.org",
])
def test_url_returns_200(url):
    try:
        response = requests.get(url, timeout=5)
        assert response.status_code == 200
    except requests.RequestException as e:
        pytest.fail(f"Request failed for {url}: {e}")
This script tests multiple URLs in a single test function, making it efficient for checking several endpoints after a release.
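When the URL list grows, it may be easier to keep it outside the test file. The following is a minimal sketch of one way to do that, assuming a plain-text file with one URL per line. The function name load_urls and the file name urls.txt mentioned below are hypothetical and not part of the original script.

```python
from pathlib import Path


def load_urls(path):
    """Read one URL per line, skipping blank lines and '#' comment lines."""
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls
```

The result could then be passed to the decorator, for example @pytest.mark.parametrize("url", load_urls("urls.txt")), so that adding a URL to check only requires editing the text file.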
Best Practices
- Set a reasonable timeout (e.g., 5 seconds) to avoid hanging tests.
- Use parameterization to test multiple URLs efficiently.
- Integrate tests into your CI/CD pipeline for automated checks post-release.
- Log failures with detailed messages to aid debugging.
Conclusion
Using Pytest and the requests library, you can quickly validate URLs after a release. This approach is simple, scalable, and integrates well with automated testing workflows. By incorporating these tests into your pipeline, you can ensure your application's URLs remain accessible and reliable.
In Pytest, what is the Best Date/Time Format for Filenames
Basic Filename Format to find Files
The best date/time format for filenames is ISO 8601 with modifications to ensure compatibility and readability. I recommend using YYYYMMDD_HHMMSS (e.g., 20250430_143022) for the following reasons:
- Sortability: ISO 8601 (year-month-day) ensures files sort chronologically when listed.
- Uniqueness: Including seconds prevents filename collisions during rapid test runs.
- Readability: The format is clear and universally understood.
- Filesystem Safety: Dropping the colons (:) used in standard ISO 8601 times and separating the date from the time with an underscore (_) avoids issues on filesystems that don't allow colons in filenames.
Here's an example of generating a timestamped filename in pytest:
import pytest
from datetime import datetime


def get_timestamped_filename(base_name, extension):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{base_name}_{timestamp}.{extension}"


# Example usage in a test
def test_example():
    filename = get_timestamped_filename("screenshot", "png")
    # Save screenshot with filename like "screenshot_20250430_143022.png"
    print(f"Saving screenshot as {filename}")
Tip: If you need microseconds for high-frequency tests, use datetime.now().strftime("%Y%m%d_%H%M%S_%f") to include microseconds (e.g., 20250430_143022_123456).
Alternative: For human-readable logs, you might include a readable date in the file content but keep the filename simple and sortable. For example, save a log with a header like Test Run: April 30, 2025 14:30:22 inside the file, but name the file log_20250430_143022.txt.
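To illustrate the sortability point, here is a small sketch showing that a plain alphabetical sort of filenames in this format matches chronological order. The helper timestamped_name and the sample dates are made up for the demonstration. The helper takes the moment as an argument rather than calling datetime.now(), so the result is repeatable.

```python
from datetime import datetime, timedelta


def timestamped_name(base_name, extension, when):
    """Build a filename such as log_20250430_143022.txt for the given moment."""
    return f"{base_name}_{when.strftime('%Y%m%d_%H%M%S')}.{extension}"


# Sample moments, deliberately listed out of order
start = datetime(2025, 4, 30, 14, 30, 22)
moments = [
    start + timedelta(hours=9),
    start - timedelta(days=400),
    start,
    start + timedelta(seconds=1),
]

names = [timestamped_name("log", "txt", m) for m in moments]
chronological = [timestamped_name("log", "txt", m) for m in sorted(moments)]

# Zero-padded, fixed-width fields make string order equal time order
assert sorted(names) == chronological
```

This works because every field is zero-padded to a fixed width and ordered from most to least significant, which is not the case for formats such as MM-DD-YYYY.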
Check Image Sizes on Production Websites
Using Python and Pytest to Catch Bloated Images Before They Slow Down Your Site
Large images can slow down page performance and negatively impact user experience. During one of my past QA roles, I created a simple Python function to detect when an image might be too large based on its content length header.
Use Case
Our QA team was tasked with verifying the size of images on the production site to ensure they met optimization standards. While tools like Lighthouse can highlight these issues, I needed something scriptable, and Pytest + requests was a lightweight fit.
Sample Code
Here’s a simplified version of the utility function I used in our automated tests:
import requests


def check_image_size(img_url, base_url="https://www.cryan.com"):
    # Construct full URL if necessary
    if not img_url.startswith(('http://', 'https://')):
        img_url = base_url + img_url if img_url.startswith('/') else base_url + '/' + img_url
    try:
        # stream=True reads the headers without downloading the whole body
        response = requests.get(img_url, stream=True, timeout=10)
        response.raise_for_status()
        # Check content length if provided in headers
        content_length = response.headers.get('Content-Length')
        # The limit is in bytes (1000 bytes = 1KB); raise it to match your standards
        if content_length and int(content_length) > 1000:
            return f"Image at {img_url} might be too large: Header indicates {int(content_length)/1000}KB"
    except Exception as e:
        return f"Error checking image size: {e}"
    return f"Image at {img_url} appears to be an acceptable size."
How to Use It in Pytest
Integrate this into your test suite by looping through known image paths:
def test_image_sizes():
    image_paths = [
        "/images/logo.png",
        "/media/banner.jpg",
        "/assets/hero-large.jpg"
    ]
    for img in image_paths:
        result = check_image_size(img)
        assert "too large" not in result, result
Pro Tip
This check relies on the Content-Length header, which isn't always accurate for dynamically generated or CDN-compressed images. For full accuracy, you could download the stream and measure bytes.
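As a rough sketch of the download-and-measure idea, the helper below adds up chunk sizes and stops as soon as a limit is crossed, so an oversized image is not downloaded in full. The function names, the 200,000-byte default limit, and the 8192-byte chunk size are assumptions chosen for illustration. Note that iter_content yields decoded data, so the count reflects the bytes after any gzip or deflate transfer encoding has been undone.

```python
def measure_streamed_bytes(chunks, limit_bytes):
    """Sum chunk sizes, stopping once the running total exceeds limit_bytes.

    Returns a (bytes_read, exceeded) tuple.
    """
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > limit_bytes:
            return total, True
    return total, False


def image_exceeds_limit(img_url, limit_bytes=200_000, timeout=10):
    """Download img_url in chunks and report whether it is larger than limit_bytes."""
    # Imported here so measure_streamed_bytes stays usable without requests
    import requests

    with requests.get(img_url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        _, exceeded = measure_streamed_bytes(
            response.iter_content(chunk_size=8192), limit_bytes
        )
    return exceeded
```

Keeping the counting logic separate from the HTTP call means it can be exercised with any iterable of byte strings, without touching the network.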
Dependencies
This example uses the requests library:
pip install requests
Verifying Your Pytest Setup with a Simple Selenium Test
Quick Test to make sure proper installation
In this post, we'll walk through creating a basic Pytest test that uses Selenium WebDriver to navigate to www.google.com and validate that the word "Google" appears in the page title. This test serves as a lightweight check to ensure your Pytest environment is ready to go.
Prerequisites
Before we dive in, make sure you have the following installed:
- Python: Version 3.7 or higher.
- Pytest: Install it using pip install pytest or python3 -m pip install pytest --user.
- Selenium WebDriver: Install it using pip install selenium or python3 -m pip install selenium --user.
- Web Browser Driver: For this example, we'll use ChromeDriver. Download it from the ChromeDriver website and ensure it matches your Chrome browser version. Place the chromedriver executable in your system’s PATH or specify its location in the code.
Setting Up the Project
- Create a new directory for your project, e.g., pytest_setup_test.
- Inside the directory, create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install the required packages:
pip install pytest selenium
- Create a file named test_google.py to hold our test.
Writing the Pytest Test
Here’s the code for test_google.py, which uses Selenium to navigate to www.google.com and checks the page title:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
import pytest


# Fixture to set up and tear down the WebDriver
@pytest.fixture
def browser():
    # Specify the path to ChromeDriver if not in PATH
    # service = Service('/path/to/chromedriver')  # Uncomment and update if needed
    service = Service()
    driver = webdriver.Chrome(service=service)
    yield driver
    driver.quit()


# Test to validate the page title of google.com
def test_google_title(browser):
    browser.get("https://www.google.com")
    title = browser.title
    assert "Google" in title, f"Expected 'Google' in title, but got '{title}'"
Explanation
- Fixture (browser): The pytest.fixture decorator defines a reusable setup/teardown function. Here, it initializes a Chrome WebDriver instance and yields it to the test. After the test runs, driver.quit() closes the browser.
- Test Function (test_google_title): This function uses the browser fixture to:
  - Navigate to https://www.google.com using browser.get().
  - Retrieve the page title with browser.title.
  - Assert that the word "Google" is in the title. If not, the test fails with a descriptive message.
- ChromeDriver Path: If ChromeDriver is not in your system’s PATH, uncomment the service line and specify the path to the chromedriver executable.
Running the Test
To run the test, navigate to your project directory in the terminal and execute:
pytest test_google.py -v
Expected Output
If everything is set up correctly, you should see output similar to this:
============================= test session starts ==============================
platform linux -- Python 3.9.5, pytest-7.4.3, pluggy-1.0.0
collected 1 item
test_google.py::test_google_title PASSED [100%]
============================== 1 passed in 2.34s ===============================
- PASSED: Indicates the test ran successfully, confirming that Pytest and Selenium are configured correctly, and the word "Google" was found in the page title.
- FAILED: If the test fails, check the error message. Common issues include:
  - ChromeDriver not found (verify the path or PATH configuration).
  - Network issues preventing access to www.google.com.
  - Mismatched ChromeDriver and Chrome browser versions.
Troubleshooting Tips
- ChromeDriver Issues: Ensure ChromeDriver matches your Chrome version (check Chrome’s version in Settings > About Chrome). Update or reinstall ChromeDriver if needed.
- Pytest Not Found: Verify Pytest is installed in your active environment (pip show pytest).
- Selenium Errors: Confirm Selenium is installed (pip show selenium) and that you’re using a version compatible with your browser.
- Test Fails Due to Title: Occasionally, Google’s title might vary (e.g., due to localization). You can modify the assertion to be more flexible if needed.
Why This Test?
This test is a simple yet effective way to validate your Pytest setup because it:
- Confirms Pytest is installed and running tests correctly.
- Verifies Selenium WebDriver is configured and can interact with a browser.
- Checks network connectivity and browser compatibility.
- Provides a clear pass/fail outcome with minimal code.
Next Steps
Once this test passes, your Pytest environment is ready! You can expand your test suite to include more complex scenarios, such as:
- Testing form submissions or button clicks.
- Validating other websites or web applications.
- Integrating with CI/CD pipelines for automated testing.
Feel free to tweak this test to suit your needs. For example, you could test a different website or validate other page elements like text or links. Happy testing!
Spell-Check Your Site Using Pytest
Use Automation to Check Spelling
Here's a clever way to catch embarrassing spelling mistakes on your website using pytest and pyspellchecker. This script can be scheduled to run periodically to ensure nothing slips through the cracks!
Why Check for Spelling?
Spelling errors can reduce trust, affect SEO, and just look unprofessional. With this script, you can automatically scan your site's content and get alerted if anything seems off.
Dependencies
pytest
requests
beautifulsoup4
pyspellchecker
Install them with:
pip install pytest requests beautifulsoup4 pyspellchecker
The Test Code
Here is the full test you can drop into your test suite:
import pytest
import requests
from bs4 import BeautifulSoup
from spellchecker import SpellChecker


def get_visible_text(url):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
    except requests.RequestException as e:
        pytest.fail(f"Failed to fetch {url}: {e}")
    soup = BeautifulSoup(response.text, 'html.parser')
    # Drop elements whose text is not part of the main page content
    for element in soup(['script', 'style', 'header', 'footer', 'nav', 'aside']):
        element.decompose()
    text = soup.get_text(separator=' ', strip=True)
    return text


def check_spelling(text, custom_words=None):
    spell = SpellChecker()
    if custom_words:
        spell.word_frequency.load_words(custom_words)
    words = text.split()
    misspelled = spell.unknown(words)
    return misspelled


def test_spelling_cryan_com():
    url = "https://www.cryan.com"
    custom_words = ["cryan", "blog", "tech", "xai"]
    text = get_visible_text(url)
    misspelled_words = check_spelling(text, custom_words=custom_words)
    assert not misspelled_words, (
        f"Spelling errors found on {url}: {misspelled_words}"
    )


if __name__ == "__main__":
    pytest.main(["-v", __file__])
Customization Tips
- Add more custom words to avoid false positives like brand names or domain-specific terms.
- Expand to multiple pages by looping through URLs and running the same logic.
- Integrate with CI/CD for automatic detection during deployment.
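Another likely source of false positives is punctuation: splitting on whitespace leaves tokens such as "release," or "(beta)" that a dictionary lookup may not recognize. The sketch below shows one possible way to extract clean words before checking them. The helper name extract_words and the regular expression are assumptions, not part of the original script, and the pattern only covers unaccented ASCII letters.

```python
import re

# Letters, optionally joined by straight apostrophes (e.g. "it's", "don't")
WORD_PATTERN = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def extract_words(text):
    """Return lowercase alphabetic words, dropping punctuation, digits, and symbols."""
    return [word.lower() for word in WORD_PATTERN.findall(text)]
```

In check_spelling, the line words = text.split() could be swapped for words = extract_words(text) to try this out.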
Using pytest.raises to Validate Exceptions Like a Pro
Negative Tests are useful too!
As a QA Engineer or automation enthusiast, writing tests that validate correct behavior is only half the battle. The other half? Making sure the app handles wrong behavior gracefully. That's where negative testing comes in - and pytest.raises is your secret weapon.
In this post, we'll explore how pytest.raises lets you assert exceptions are raised without failing the test. This is perfect for validating edge cases, bad input, or failed operations.
What is pytest.raises?
In Pytest, if your code raises an exception during a test, the test normally fails - as it should. But what if you're expecting the exception? That's where pytest.raises comes in.
It wraps a block of code and passes the test only if the specified exception is raised. If it's not raised, the test fails.
Why Use pytest.raises?
- Makes negative testing clean and readable
- Helps document edge-case handling
- Prevents false positives in error conditions
- Encourages testing of robust, defensive code
A Real-World Example
Let's say we're testing a simple division function that raises a ZeroDivisionError when the denominator is zero.
def safe_divide(x, y):
    return x / y
Now for the test:
import pytest


def test_safe_divide_zero_division():
    with pytest.raises(ZeroDivisionError):
        safe_divide(10, 0)
This test will pass if safe_divide(10, 0) throws ZeroDivisionError. If it doesn't (for example, if the code silently returns None), the test fails - telling us something's broken.
Accessing the Exception
You can even inspect the exception message or attributes:
def test_value_error_with_message():
    with pytest.raises(ValueError) as excinfo:
        int("hello")  # not a valid integer
    # Inspect the exception after the with block has exited
    assert "invalid literal" in str(excinfo.value)
This is powerful when you want to verify the type and details of the exception.
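pytest.raises also accepts a match argument, a regular expression that is searched for in the string form of the exception, which can replace the separate assert on excinfo. The following is a short sketch of that. The parse_port helper is hypothetical and exists only to give the tests something to call.

```python
import pytest


def parse_port(value):
    """Hypothetical helper: convert value to a TCP port number or raise ValueError."""
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def test_parse_port_out_of_range():
    # match is applied with re.search to str(exception)
    with pytest.raises(ValueError, match=r"port out of range: \d+"):
        parse_port("70000")


def test_parse_port_not_a_number():
    # int("http") raises ValueError with an "invalid literal" message
    with pytest.raises(ValueError, match="invalid literal"):
        parse_port("http")
```

If the exception type is right but the message does not match the pattern, the test fails, which helps tell apart two code paths that raise the same exception class.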
Clean Up with pytest.raises
Before pytest.raises, Python developers would clutter tests with try/except blocks and fail manually. Compare:
Old way:
def test_safe_divide_old():
    try:
        safe_divide(10, 0)
        assert False, "Expected ZeroDivisionError"
    except ZeroDivisionError:
        pass
Pytest way:
def test_safe_divide_pytest():
    with pytest.raises(ZeroDivisionError):
        safe_divide(10, 0)
Much cleaner, right?
Use Case Ideas for pytest.raises
- Invalid API parameters (TypeError, ValueError)
- Database connection failures (ConnectionError)
- File not found or permission issues (IOError, PermissionError)
- Custom business rule exceptions
Final Thought
In automation testing, you should never be afraid of exceptions - you should expect them when the input is bad. pytest.raises gives you the confidence to write bold, bulletproof test cases that ensure your code handles errors on purpose - not by accident.
Have a favorite exception handling trick or a real bug you caught using pytest.raises? Share it in the comments below.
Level Up Your Pytest WebDriver Game
Essential Options for SQA Engineers
Why WebDriver Options Matter
WebDriver options allow you to customize the behavior of your browser instance, enabling you to optimize performance, handle specific scenarios, and mitigate common testing challenges. By strategically applying these options, you can create more robust, stable, and efficient automated tests.
1. Headless Mode with GPU Disabled: Speed and Stability Combined
Running tests in headless mode (without a visible browser window) is a game-changer for speed and resource efficiency. However, GPU-related issues can sometimes lead to crashes. The solution? Disable the GPU while running headless.
- --headless=new: Activates the newer, more efficient headless mode.
- --disable-gpu: Prevents GPU-related crashes, ensuring test stability.
This combination provides a significant performance boost and enhances the reliability of your tests, especially in CI/CD environments.
2. Evading Detection: Disabling DevTools and Automation Flags
Websites are increasingly sophisticated in detecting automated browsers. To minimize the risk of your tests being flagged, disable DevTools and automation-related flags.
- --disable-blink-features=AutomationControlled: Prevents the navigator.webdriver property from being set to true.
- excludeSwitches, enable-automation: Removes the "Chrome is being controlled by automated test software" infobar.
- useAutomationExtension, False: Disables the automation extension.
3. Ignoring Certificate Errors: Simplifying HTTPS Testing
When testing HTTPS websites with self-signed or invalid certificates, certificate errors can disrupt your tests. The --ignore-certificate-errors option allows you to bypass these errors.
This option is invaluable for testing development or staging environments where certificate issues are common. However, remember to avoid using this in production tests, as it can mask real security vulnerabilities.
4. Disabling Extensions and Popup Blocking: Minimizing Interference
Browser extensions and pop-up blockers can interfere with your tests, leading to unpredictable behavior. Disabling them ensures a clean and consistent testing environment.
- --disable-extensions: Prevents extensions from loading, reducing potential conflicts.
- --disable-popup-blocking: Turns off Chrome's pop-up blocker, so pop-up windows your application opens are not suppressed during a test.
Integrating with Pytest Fixtures
To streamline your Pytest setup, encapsulate your WebDriver options within a fixture.
This fixture sets up a Chrome browser with your desired options and makes it available to your test functions.
Conclusion
Mastering WebDriver options is essential for SQA engineers seeking to optimize their Pytest automation workflows. By leveraging these options, you can create faster, more stable, and reliable tests, ultimately improving the overall quality and efficiency of your testing efforts. Experiment with these options and discover how they can enhance your testing practices.
Capturing Screenshots in Fixture Teardown
Cool Trick with Teardown
Pytest has solidified its position as a go-to testing framework for Python developers due to its simplicity, extensibility, and powerful features. In this blog post, we'll dive deep into using Pytest, specifically focusing on its integration with Playwright for browser automation, and explore how to capture screenshots during fixture teardown for enhanced debugging and result analysis.
Capturing Screenshots in Fixture Teardown
To capture a screenshot before the browser closes, we can modify the page fixture to include a teardown phase. This makes debugging easier and gives you a chance to review what the automation saw and spot anything odd.
Any code in the Fixture that appears after "yield page" will run at the conclusion of the test.
import pytest
from playwright.sync_api import sync_playwright
import os


@pytest.fixture
def page(request):
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        yield page
        # Teardown: runs after the test finishes, while the browser is still open
        screenshot_path = f"screenshots/{request.node.name}.png"
        os.makedirs(os.path.dirname(screenshot_path), exist_ok=True)
        page.screenshot(path=screenshot_path)
        browser.close()


def test_example_with_screenshot(page):
    page.goto("https://www.cryan.com")
    assert "cryan.com" in page.title()


def test_example_fail(page):
    page.goto("https://www.cryan.com")
    assert "Wrong Title" in page.title()
After running the tests, you'll find screenshots in the screenshots directory. These screenshots will help you understand the state of the browser at the end of each test, especially during failures.
Benefits of Screenshot Capture
- Debugging: Quickly identify issues by visually inspecting the browser state.
- Reporting: Include screenshots in test reports for better documentation.
- Visual Validation: Verify UI elements and layout.
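One thing to watch for: request.node.name for a parametrized test includes the parameter in square brackets, and a URL parameter brings slashes and colons into the filename. The sketch below shows one way to turn a test name into a safe path. The helper name screenshot_path_for and the choice of allowed characters are assumptions made for illustration.

```python
import re


def screenshot_path_for(node_name, directory="screenshots"):
    """Build a screenshot path from a pytest node name.

    Runs of characters other than letters, digits, dot, underscore, and
    hyphen are collapsed into a single underscore.
    """
    safe_name = re.sub(r"[^A-Za-z0-9._-]+", "_", node_name).strip("_")
    return f"{directory}/{safe_name}.png"
```

In the fixture, screenshot_path could then be set with screenshot_path_for(request.node.name) instead of the f-string.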
Parametrization in Pytest
Use the same code over and over
Parametrization in Pytest allows you to run the same test function multiple times with different inputs. Instead of writing separate test functions for each set of data, you can define a single test and provide various argument sets using the @pytest.mark.parametrize decorator. This approach is especially useful for testing functions that need to handle a variety of inputs, edge cases, or data types.
Why Use Parametrization?
- Code Reusability: Write one test function and reuse it for multiple test cases.
- Efficiency: Reduce boilerplate code and make your test suite easier to maintain.
- Clarity: Clearly define the inputs and expected outputs for each test case.
- Comprehensive Testing: Easily test a wide range of scenarios without extra effort.
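As a small illustration of the clarity point, each case can carry both its input and its expected output, and pytest.param can attach a readable id that shows up in the test report. The normalize_url helper below is hypothetical and exists only so the test has something to check.

```python
import pytest


def normalize_url(url):
    """Hypothetical helper: add a scheme if missing and drop trailing slashes."""
    url = url.strip()
    if not url.startswith(("http://", "https://")):
        url = "https://" + url
    return url.rstrip("/")


CASES = [
    pytest.param("https://www.company.com/", "https://www.company.com", id="trailing-slash"),
    pytest.param("qa1.company.com", "https://qa1.company.com", id="missing-scheme"),
    pytest.param("  https://stage.company.com  ", "https://stage.company.com", id="whitespace"),
]


@pytest.mark.parametrize("raw, expected", CASES)
def test_normalize_url(raw, expected):
    assert normalize_url(raw) == expected
```

Run with -v, each case is reported separately under its id, for example test_normalize_url[missing-scheme], which makes a failing input easy to spot.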
Code Example
This code checks whether various internal sites are up and running. I ran similar code in the past so that I could spot any issues before the morning standup.
Without parametrization, each site would need its own test function, which adds maintenance overhead whenever a change is needed.
import pytest
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.common.exceptions import WebDriverException

# List of websites to test
WEBSITES = [
    "https://www.company.com",
    "https://qa1.company.com",
    "https://qa2.company.com",
    "https://stage.company.com"
]


@pytest.fixture
def chrome_driver():
    """Fixture to set up and tear down Chrome WebDriver"""
    # Set up Chrome options
    chrome_options = Options()
    chrome_options.add_argument("--headless")  # Run in headless mode
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    # Initialize driver
    driver = webdriver.Chrome(options=chrome_options)
    driver.set_page_load_timeout(30)  # Set timeout to 30 seconds
    yield driver
    # Teardown
    driver.quit()


@pytest.mark.parametrize("website", WEBSITES)
def test_website_is_up(chrome_driver, website):
    """
    Test if a website loads successfully by checking:
    1. Page loads without timeout
    2. Navigation completes without a WebDriver error
       (Selenium does not expose the HTTP status code)
    3. Page title is not empty
    """
    try:
        # Attempt to load the website
        chrome_driver.get(website)
        # Check if page title exists and is not empty
        title = chrome_driver.title
        assert title, f"Website {website} loaded but has no title"
        # Optional: Check if body element exists
        body = chrome_driver.find_element(By.TAG_NAME, "body")
        assert body is not None, f"Website {website} has no body content"
        print(f"✓ {website} is up and running (Title: {title})")
    except WebDriverException as e:
        pytest.fail(f"Website {website} failed to load: {str(e)}")
    except AssertionError as e:
        pytest.fail(f"Website {website} loaded but content check failed: {str(e)}")


if __name__ == "__main__":
    pytest.main(["-v"])
About
Welcome to Pytest Tips and Tricks, your go-to resource for mastering the art of testing with Pytest! Whether you're a seasoned developer or just dipping your toes into the world of Python testing, this blog is designed to help you unlock the full potential of Pytest - one of the most powerful and flexible testing frameworks out there. Here, I'll share a treasure trove of practical insights, clever techniques, and time-saving shortcuts that I've gathered from years of writing tests and debugging code.