# DirectorySpider

A spider that crawls pages under a directory. It is useful for scraping `ArticleItem` from one part of a website.

## How It Works

The directory to crawl is the one that contains the last path component of the given URL.

- When the URL is `http://example.org/index.html`, it crawls all pages under `/`.
- When the URL is `http://example.org/foo/index.html`, it crawls pages under `/foo/`.
- When the URL is `http://example.org/foo/bar/index.html`, it crawls pages under `/foo/bar/` but not `/foo/bar.html`.

When `start_urls` is `http://example.org/a/b/c.html`:

- it crawls `/a/b/index.html`.
- it crawls `/a/b/foo.html`.
- it crawls `/a/b/c/bar.html`.
- it does not crawl `/index.html`.
- it does not crawl `/a/index.html`.
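
The directory rule described above can be sketched in a few lines of Python. This is only an illustration of the matching logic, not the spider's actual implementation: the helper names `directory_prefix` and `is_under_directory` are hypothetical, and the same-host check is an assumption that the text above does not state.

```python
from urllib.parse import urlsplit


def directory_prefix(url):
    """Return the directory part of the URL path, with a trailing slash.

    Hypothetical helper: drops the last path component of an absolute URL.
    """
    path = urlsplit(url).path or "/"
    # Keep everything up to and including the last slash.
    return path[: path.rfind("/") + 1]


def is_under_directory(start_url, candidate_url):
    """Return True if candidate_url lies under the directory of start_url."""
    start = urlsplit(start_url)
    candidate = urlsplit(candidate_url)
    # Assumption: only pages on the same host are considered.
    if candidate.netloc != start.netloc:
        return False
    return (candidate.path or "/").startswith(directory_prefix(start_url))


start = "http://example.org/a/b/c.html"
for path in ["/a/b/index.html", "/a/b/c/bar.html", "/index.html", "/a/index.html"]:
    print(path, is_under_directory(start, "http://example.org" + path))
```

Matching on the prefix with its trailing slash is what keeps `/foo/bar.html` out when the directory is `/foo/bar/`.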