https://github.com/potahtml/mpa-archive
potahtml / mpa-archive Public
License
MIT license
Multi-Page Application Archive
Crawls a Multi-Page Application (MPA) into a zip file and serves the
application back from that zip file. An MPA archiver. Could also be
used as a site generator.
Installation
npm install -g mpa-archive
Usage
Crawling
mpa http://example.net
Crawls the URL recursively and saves the result in example.net.zip.
Once done, it displays a report and can serve the files from the zip.
Serving
mpa
Creates a server for each zip file in the current directory. The host
is localhost, with a port seeded from the zip file path.
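The readme does not say how the port is derived from the zip file path. As a minimal sketch of the idea, assuming a hash-based mapping, the hypothetical helper below (`portForZip`, with an assumed port range of 3000 to 60000) turns a path into a stable port, so the same zip file always gets the same port. It is not the tool's actual implementation.

```javascript
const crypto = require("node:crypto");
const path = require("node:path");

// Hypothetical helper, not part of mpa-archive: derives a stable port
// from a zip file path. The hash function and port range are assumptions.
function portForZip(zipPath, min = 3000, max = 60000) {
  // Resolve to an absolute path so "./a.zip" and "a.zip" map to the same port.
  const absolute = path.resolve(zipPath);
  const digest = crypto.createHash("sha256").update(absolute).digest();
  // Use the first 4 bytes of the digest as an unsigned integer seed.
  const seed = digest.readUInt32BE(0);
  return min + (seed % (max - min + 1));
}

console.log(portForZip("example.net.zip"));
```

Two different zip files can still collide on the same port with this approach, so a real server would need a fallback when the port is already in use.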
Features
* Uses headless Puppeteer
* Crawls http://example.net with (CPU count / 2) threads
* Displays progress in the console
* Fetches sitemap.txt and sitemap.xml as a seed point
* Reports HTTP status codes other than 200, 304, 204, 206
* Crawls on-site URLs only, but will fetch external resources
* Intercepts site resources and saves them too
* Generates mpa/sitemap.txt and mpa/sitemap.xml
* Saves site sourcemaps
* Can resume if the process exits; saves a checkpoint every 250 URLs
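Several of the rules in the list above are small enough to sketch as plain functions. The snippet below is an illustration only, assuming nothing about the tool's internals: every function and constant name is hypothetical, and only the numbers (CPU count / 2, the four status codes, the 250-URL checkpoint interval) come from the feature list.

```javascript
const os = require("node:os");

// Status codes the feature list treats as normal; anything else is reported.
const QUIET_STATUSES = new Set([200, 204, 206, 304]);
const CHECKPOINT_EVERY = 250;

// Half the CPU count, but never fewer than one worker.
function workerCount() {
  return Math.max(1, Math.floor(os.cpus().length / 2));
}

function shouldReport(status) {
  return !QUIET_STATUSES.has(status);
}

// A link is "on site" when it resolves to the same origin as the crawled site.
function isOnSite(link, siteUrl) {
  try {
    return new URL(link, siteUrl).origin === new URL(siteUrl).origin;
  } catch {
    return false; // unparsable links are never crawled
  }
}

function shouldCheckpoint(crawledCount) {
  return crawledCount > 0 && crawledCount % CHECKPOINT_EVERY === 0;
}

function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// sitemap.txt: one URL per line.
function sitemapTxt(urls) {
  return urls.join("\n") + "\n";
}

// sitemap.xml: minimal urlset following the sitemaps.org schema.
function sitemapXml(urls) {
  const entries = urls
    .map((u) => `  <url><loc>${escapeXml(u)}</loc></url>`)
    .join("\n");
  return (
    '<?xml version="1.0" encoding="UTF-8"?>\n' +
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
    `${entries}\n</urlset>\n`
  );
}
```

Comparing origins rather than hostnames means a link to the same host on a different scheme or port counts as external, which is one possible reading of "on-site".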
To consider
* Save it in an incremental compression format that doesn't require
re-compressing the whole file when it changes (maybe it already does
that?)
* URLs to external resources are not rewritten to be local
resources; if this were done, anything loaded from the root would
break
* It should crawl the site by clicking the links instead of opening
a full tab