SitemapReaderQuick module

Read and parse a sitemap and return its data.

class SitemapReaderQuick.SitemapReaderQuick(sitemap_url, sitemap_data={}, conn_limit=None, target_loc=None, target_scheme=None, verbosity=50)

Bases: object

Sitemap reader that uses asynchronous communication to read sitemaps quickly.

Parameters
  • sitemap_url (str) – URL to the sitemap

  • sitemap_data (dict, optional) – data from a parsed sitemap (default: {})

  • conn_limit (int, optional) – maximum number of connections to use at once (default: 100)

  • verbosity (int, optional) – verbosity setting (default: 50)

sitemap_url

URL to the sitemap

Type

str

sitemap_data

data from a parsed sitemap

Type

dict

conn_limit

maximum number of connections to use at once

Type

int
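
The idea of capping simultaneous connections can be illustrated with a minimal sketch. This is not the reader's own implementation, which is not shown here; it only demonstrates the general technique using asyncio.Semaphore. The names run_limited and fake_fetch are hypothetical, and fake_fetch stands in for a real network request.

```python
import asyncio


async def run_limited(jobs, conn_limit):
    """Run coroutine factories with at most conn_limit active at once.

    Hypothetical helper for illustration; not part of SitemapReaderQuick.
    """
    semaphore = asyncio.Semaphore(conn_limit)
    active = 0
    peak = 0

    async def guarded(job):
        nonlocal active, peak
        async with semaphore:  # waits here once conn_limit jobs are running
            active += 1
            peak = max(peak, active)
            try:
                return await job()
            finally:
                active -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_fetch():
    # Stand-in for a network request (assumption: no real I/O is done).
    await asyncio.sleep(0.01)
    return "ok"


results, peak = asyncio.run(run_limited([fake_fetch] * 10, conn_limit=3))
print(len(results), "jobs finished; peak concurrency:", peak)
```

All ten jobs complete, but the semaphore keeps the number running at any moment at or below the limit.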

verbosity

verbosity setting

Type

int

Note

See https://docs.python.org/3/library/logging.html#logging-levels for more information on using the verbosity setting.
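
As a quick reference, the standard logging levels can be listed with the snippet below. It uses only the Python standard library and shows that the documented default of 50 corresponds to logging.CRITICAL. The variable names are illustrative.

```python
import logging

# Numeric values of the standard logging levels.
levels = {
    name: getattr(logging, name)
    for name in ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
}
for name, value in levels.items():
    print(f"{name:<8} {value}")

# The documented default verbosity is 50.
default_verbosity = 50
default_name = logging.getLevelName(default_verbosity)
print("default verbosity maps to:", default_name)
```

A lower value such as logging.INFO (20) would presumably produce more output than the default, assuming the reader passes the value through to the logging module unchanged.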

get_sitemap_data() → dict

Getter for the parsed sitemap data.

Returns

URL and lastmod data from the sitemap

Return type

dict
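
The exact layout of the returned dict is not specified here. The sketch below assumes, purely for illustration, that it maps each URL to a lastmod string (or None when lastmod is absent) and that each lastmod begins with a full YYYY-MM-DD date. The helper urls_modified_since and the sample data are hypothetical.

```python
from datetime import date


def urls_modified_since(sitemap_data, cutoff):
    """Return the URLs whose lastmod date is on or after cutoff, sorted.

    Assumes sitemap_data maps URL -> lastmod string or None; this layout
    is an assumption, not something the documentation confirms.
    """
    recent = []
    for url, lastmod in sitemap_data.items():
        if not lastmod:
            continue  # skip entries without a lastmod value
        # Assumes the value starts with YYYY-MM-DD; any time part is ignored.
        if date.fromisoformat(lastmod[:10]) >= cutoff:
            recent.append(url)
    return sorted(recent)


# Made-up sample data in the assumed layout.
sample = {
    "https://example.com/": "2023-05-01",
    "https://example.com/about": "2021-01-15T08:30:00+00:00",
    "https://example.com/contact": None,
}
recent = urls_modified_since(sample, date(2022, 1, 1))
print(recent)
```

If the real structure differs, for example nested dicts per URL, only the loop body would need to change.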

get_sitemap_url() → str

Getter for the sitemap_url.

Returns

URL to the sitemap

Return type

str

async parse_sitemap()

Process the data within the sitemap.
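
Because parse_sitemap() is a coroutine, it has to be awaited or run on an event loop. The sketch below shows one way to drive it from synchronous code. The helper load_sitemap is hypothetical, and it assumes the parsed data is available through get_sitemap_data() once parse_sitemap() has finished.

```python
import asyncio


def load_sitemap(reader):
    """Run reader.parse_sitemap() to completion and return the parsed data.

    Hypothetical helper; works with any object exposing the two methods
    documented above.
    """
    asyncio.run(reader.parse_sitemap())
    return reader.get_sitemap_data()


# Intended use (needs the SitemapReaderQuick module and network access,
# so it is left commented out; the URL is a placeholder):
#
#   from SitemapReaderQuick import SitemapReaderQuick
#   reader = SitemapReaderQuick("https://example.com/sitemap.xml")
#   data = load_sitemap(reader)
#   reader.print_stats()
```

Inside code that is already asynchronous, calling await reader.parse_sitemap() directly is the appropriate form, since asyncio.run() cannot be called from a running event loop.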

print_stats() → None

Print the number of URLs found in the sitemaps.