@get-set-fetch/extension is an open-source browser extension for web scraping, with CSV and ZIP export capabilities.
Built on a modular architecture, the extension provides a series of scraping scenarios with predefined default values for fast, minimal-configuration scraping. All scraped resources are saved in the browser's IndexedDB database.
Binary data (images, PDF files, …) can be exported as ZIP archives. Text-based data can be exported as CSV files.
Create a new project
See Examples to get an idea of what types of content can be scraped.
You can see the newly created project on the project list page. Clicking “scrape” in the actions column starts the scraping process. URLs to be scraped open sequentially in an additional tab, separated by the delay defined at project creation.
You can stop the scraping process at any time by closing the browser tab in which it is running. The next time you start scraping, the process resumes from where it was interrupted.
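The stop-and-resume behaviour described above can be illustrated with a minimal sketch. It is not the extension's actual implementation: the `QueuedUrl` shape and the `nextToScrape` and `runScrape` helpers are hypothetical, the queue is kept in memory rather than in IndexedDB, and the delay between URLs is left out for brevity. The point is only that progress is recorded per URL, so a later run can skip what is already done.

```typescript
// Hypothetical in-memory stand-in for a project's URL queue.
interface QueuedUrl {
  url: string;
  scraped: boolean;
}

// Return the first URL that has not been scraped yet, or undefined when done.
function nextToScrape(queue: QueuedUrl[]): QueuedUrl | undefined {
  return queue.find((entry) => !entry.scraped);
}

// Visit unscraped URLs in order. `limit` models the user closing the tab
// after a number of URLs; the default lets the run finish.
function runScrape(
  queue: QueuedUrl[],
  visit: (url: string) => void,
  limit: number = Infinity
): number {
  let visited = 0;
  while (visited < limit) {
    const entry = nextToScrape(queue);
    if (entry === undefined) break; // nothing left to scrape
    visit(entry.url);
    entry.scraped = true; // recorded progress is what makes resuming possible
    visited += 1;
  }
  return visited;
}

const queue: QueuedUrl[] = [
  { url: "https://example.com/a", scraped: false },
  { url: "https://example.com/b", scraped: false },
  { url: "https://example.com/c", scraped: false },
];
const seen: string[] = [];
const record = (url: string): void => {
  seen.push(url);
};

const firstRun = runScrape(queue, record, 2); // interrupted after two URLs
const secondRun = runScrape(queue, record); // resumes with the remaining URL
console.log(firstRun, secondRun, seen);
```

The second call visits only the third URL because the first two are already marked as scraped.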
Export scraped data
On the project list page, click “results” in the actions column. All resources scraped so far are displayed in a table.
Depending on the selected scraping scenario, you can export the data as either CSV or ZIP.
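To make the CSV side of the export concrete, here is a minimal sketch of turning tabular scraped data into CSV text. It is an illustration under stated assumptions, not the extension's export code: the `ScrapedRow` type, the `escapeCsvField` and `toCsv` helpers, and the `url` and `title` columns are all hypothetical. Quoting follows the common RFC 4180 convention of wrapping a field in double quotes when it contains a comma, a quote, or a line break.

```typescript
// Hypothetical shape of a scraped row; the extension's real schema may differ.
type ScrapedRow = Record<string, string | number | null | undefined>;

// Quote a field only when it contains a comma, a double quote, or a line break.
// Embedded double quotes are doubled.
function escapeCsvField(value: string | number | null | undefined): string {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build a CSV document: one header line followed by one line per row.
function toCsv(rows: ScrapedRow[], columns: string[]): string {
  const header = columns.map(escapeCsvField).join(",");
  const lines = rows.map((row) =>
    columns.map((col) => escapeCsvField(row[col])).join(",")
  );
  return [header, ...lines].join("\r\n");
}

const csv = toCsv(
  [
    { url: "https://example.com/a", title: 'He said "hi"' },
    { url: "https://example.com/b", title: "Plain, with comma" },
  ],
  ["url", "title"]
);
console.log(csv);
```

Passing the column list explicitly keeps the column order stable even when some rows lack a value; missing values become empty fields.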
Look for warning or error entries on the logs page.
You can adjust the log level from the settings page.
If you find a bug, please open an issue and include any relevant log entries in its description.