HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Exclude File Name
Author: Santiago
Date: 07/31/2025 06:11
 
If you are using Scrapy, you can filter out the URLs you don't want in your
spider's parse callback, before following each link:
import scrapy

class MySpider(scrapy.Spider):
    name = "my_spider"

    def start_requests(self):
        urls = ['<URL>']
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        for href in response.css('a::attr(href)').getall():
            if 'Wf6hTr0x' not in href:  # Exclude unwanted files
                yield response.follow(href, self.save_file)

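    def save_file(self, response):
        # Minimal example: save the body under the last path segment of the URL
        # (adjust the naming if your URLs carry query strings)
        filename = response.url.rstrip('/').split('/')[-1] or 'index.html'
        with open(filename, 'wb') as f:
            f.write(response.body)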
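
If you need to exclude more than one name, the check can be pulled out into a small helper that you can test without running Scrapy. This is only a rough sketch: `should_follow` and `EXCLUDE_PATTERNS` are names I made up for illustration, and unlike the substring check above it matches wildcard patterns against the file name part of the link only, not the whole href.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Hypothetical exclude list: shell-style wildcards matched against the file name
EXCLUDE_PATTERNS = ["*Wf6hTr0x*"]


def should_follow(href, patterns=EXCLUDE_PATTERNS):
    """Return True if the link's file name matches none of the exclude patterns."""
    # Keep only the last path segment; the query string and fragment are ignored
    filename = urlparse(href).path.rsplit("/", 1)[-1]
    # fnmatchcase is case-sensitive on every platform
    return not any(fnmatchcase(filename, pattern) for pattern in patterns)
```

In the spider you would then write `if should_follow(href):` in place of the `'Wf6hTr0x' not in href` test inside parse.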
 