https://github.com/potahtml/mpa-archive
potahtml / mpa-archive (MIT license, 122 stars, 1 fork)

# Multi-Page Application Archive

Crawls a Multi-Page Application into a zip file. Serves the Multi-Page Application from the zip file. An MPA archiver. Could be used as a Site Generator.

## Installation

    npm install -g mpa-archive

## Usage

### Crawling

    mpa http://example.net

Crawls the URL recursively and saves it in `example.net.zip`. Once done, it displays a report and can serve the files from the zip.

### Serving

    mpa

Creates a server for each zip file in the current directory. The host is `localhost`, with a port seeded from the zip file path.
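The README says the port is seeded from the zip file path but does not say how. As a rough illustration only, one way to get a stable port per file is to hash the absolute path and map the hash into a port range. The helper name `portForZip`, the SHA-256 hash, and the 1025-65535 range are all assumptions made for this sketch, not the tool's actual algorithm.

```javascript
const crypto = require('node:crypto');
const path = require('node:path');

// Hypothetical helper (not part of mpa-archive): derives a stable port
// from a zip file path, so the same file always gets the same port.
function portForZip(zipPath, min = 1025, max = 65535) {
  // Resolve to an absolute path so the result does not depend on how
  // the path was spelled relative to the current directory.
  const absolute = path.resolve(zipPath);
  const digest = crypto.createHash('sha256').update(absolute).digest();
  // Use the first 4 bytes of the digest as an unsigned 32-bit integer.
  const seed = digest.readUInt32BE(0);
  return min + (seed % (max - min + 1));
}

console.log(`http://localhost:${portForZip('example.net.zip')}`);
```

A scheme like this keeps bookmarks working between runs, at the cost of occasional collisions when two paths map to the same port.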
## Features

- Uses headless Puppeteer
- Crawls http://example.net with (CPU count / 2) threads
- Displays progress in the console
- Fetches `sitemap.txt` and `sitemap.xml` as seed points
- Reports HTTP status codes other than 200, 304, 204, 206
- Crawls on-site URLs only, but fetches external resources
- Intercepts site resources and saves them too
- Generates `mpa/sitemap.txt` and `mpa/sitemap.xml`
- Saves site sourcemaps
- Can resume if the process exits; saves a checkpoint every 250 URLs

## To consider

- Save in an incremental compression format that doesn't require re-compressing the whole file when it changes (maybe it already does that?)
- URLs to external resources are not rewritten to be local resources; if this is done, then stuff loaded from the root will break
- It should crawl the site by clicking the links instead of opening a full tab
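Several of the crawl rules in the Features list can be expressed as small pure functions. The sketch below is illustrative and not taken from the mpa-archive source: the function names are hypothetical, rounding the worker count down (with a minimum of one) is an assumption, and "on site" is assumed to mean "same origin".

```javascript
const os = require('node:os');

// Status codes the README lists as not worth reporting.
const QUIET_STATUSES = new Set([200, 204, 206, 304]);

// Half the CPU count; floor and the minimum of 1 are assumptions.
function workerCount(cpus = os.cpus().length) {
  return Math.max(1, Math.floor(cpus / 2));
}

// True when a response status should appear in the final report.
function shouldReport(status) {
  return !QUIET_STATUSES.has(status);
}

// True when `url` has the same origin as the crawled site, i.e. it is a
// page to crawl rather than an external resource to fetch only.
function isOnSite(url, siteUrl) {
  return new URL(url).origin === new URL(siteUrl).origin;
}

// Builds sitemap.txt content: one unique URL per line, sorted.
function sitemapTxt(urls) {
  return [...new Set(urls)].sort().join('\n') + '\n';
}
```

For example, `shouldReport(404)` is true while `shouldReport(304)` is false, and `isOnSite('https://cdn.example.com/x.js', 'http://example.net')` is false, so that script would be saved but its links would not be followed.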