generic.spiders.base#

Module Contents#

Classes#

GenericSpiderConfig

A base config for spiders.

GenericSpider

A base spider class that inherits from scrapy.Spider.

Data#

T

API#

class generic.spiders.base.GenericSpiderConfig(/, **data: Any)#

Bases: pydantic.BaseModel

A base config for spiders.

Initialization

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

urls: Union[pydantic.HttpUrl, List[pydantic.HttpUrl]] = None#

A string of comma-separated URLs or a list of URL strings.

classmethod split_urls(v)#

Convert a string of comma-separated URLs to a list of strings.

classmethod convert_to_string(v)#

Convert HttpUrl objects to strings after validation.
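The validator bodies are not shown on this page. As a rough illustration of the kind of normalization the two classmethods describe, here is a minimal standard-library sketch. The function body, the whitespace stripping, and the handling of None are assumptions, not the module's actual implementation, and the sketch does no URL validation (pydantic does that in the real model).

```python
from typing import List, Optional, Union


def split_urls(value: Union[str, List[object], None]) -> Optional[List[str]]:
    """Hypothetical stand-in: normalize `urls` input to a list of strings."""
    if value is None:
        # Mirror the documented default of None.
        return None
    if isinstance(value, str):
        # Split on commas, trim whitespace, and drop empty pieces.
        return [part.strip() for part in value.split(",") if part.strip()]
    # Already a list: coerce each item to str, similar in spirit
    # to what convert_to_string is described as doing.
    return [str(item) for item in value]


urls = split_urls("https://example.com/a, https://example.com/b")
print(urls)
```

In the real class, the same idea would presumably be expressed as pydantic field validators on urls, so a spider argument such as a single comma-separated string and an explicit list would both end up as a list of strings.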

generic.spiders.base.T = 'TypeVar(...)'#

class generic.spiders.base.GenericSpider(*args, **kwargs)#

Bases: scrapy_spider_metadata.Args[generic.spiders.base.T], scrapy.Spider, typing.Generic[generic.spiders.base.T]

A base spider class that inherits from scrapy.Spider.

Initialization

allowed_domains = []#

abstractmethod classmethod get_config_class() → Type[generic.spiders.base.T]#

Returns the configuration class for this spider. The configuration class must be GenericSpiderConfig or one of its subclasses.
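The pattern of an abstract classmethod returning a config type can be sketched with the standard library alone. In the sketch below, BaseConfig and BaseSpider are hypothetical stand-ins for GenericSpiderConfig and GenericSpider so that it runs without scrapy or pydantic installed. ArticleConfig, ArticleSpider, max_pages, and the args attribute name are likewise assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Generic, List, Type, TypeVar


@dataclass
class BaseConfig:
    """Stand-in for GenericSpiderConfig (a pydantic model in the real module)."""
    urls: List[str] = field(default_factory=list)


T = TypeVar("T", bound=BaseConfig)


class BaseSpider(ABC, Generic[T]):
    """Stand-in for GenericSpider: builds its config from keyword arguments."""

    def __init__(self, **kwargs):
        # Ask the subclass which config class to use, then instantiate it.
        self.args: T = self.get_config_class()(**kwargs)

    @classmethod
    @abstractmethod
    def get_config_class(cls) -> Type[T]:
        """Return the config class for this spider."""


@dataclass
class ArticleConfig(BaseConfig):
    # Hypothetical extra option added by the subclass config.
    max_pages: int = 1


class ArticleSpider(BaseSpider[ArticleConfig]):
    @classmethod
    def get_config_class(cls) -> Type[ArticleConfig]:
        return ArticleConfig


spider = ArticleSpider(urls=["https://example.com"], max_pages=3)
print(type(spider.args).__name__, spider.args.max_pages)
```

Because get_config_class is abstract, instantiating the base class directly raises TypeError, which pushes every concrete spider to declare its own config class.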