Show HN: Webclone.js – A simple tool to clone websites
I needed a lightweight way to archive documentation from a website. wget and similar tools failed to clone the site reliably (missing assets, broken links, etc.), so I ended up building a full website-cloning tool using Node.js + Puppeteer.
Repo: https://github.com/jademsee/webclone
Feedback, issues, and PRs are very welcome.
Really nice tool, thank you! I ran into an issue after cloning the repo and installing node_modules. I'm using Node.js v20, and when I ran "node webclone.js --help" the terminal threw this error: "Error [ERR_REQUIRE_ESM]: require() of ES Module /Users/fangyexu/Desktop/dev-s/_github/webclone/node_modules/yargs/index.mjs not supported."
Then I just threw it at Cursor and it helped me solve the issue.
By the way, would it be possible to share how the issue was resolved on your end?
Thank you for the feedback. I will check it out.
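For anyone else hitting ERR_REQUIRE_ESM: it generally means a CommonJS file called require() on a package that only ships ES modules. One common workaround is to load that package with a dynamic import() instead. The sketch below is an assumption about how this could be handled, not what the repo or Cursor actually did. The helper name loadEsm is hypothetical, and 'node:path' stands in for the ESM-only dependency so the snippet runs without installing anything.

```javascript
// Hypothetical helper (not from the webclone repo): dynamic import() is
// allowed inside CommonJS files and can load ES modules that require() cannot.
async function loadEsm(specifier) {
  const ns = await import(specifier);
  // Prefer the default export when the module provides one.
  return ns.default ?? ns;
}

async function main() {
  // 'node:path' is a stand-in for an ESM-only package such as the yargs
  // build named in the error message above.
  const path = await loadEsm('node:path');
  console.log(path.basename('/tmp/site/index.html'));
}

main().catch((err) => {
  console.error(err);
  process.exitCode = 1;
});
```

Pinning the dependency to a release that still ships a CommonJS build may also work, but I have not checked that against this repo.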
Looks good! You could push to npm so that running it could be as easy as:
npx webclone URL (no repo cloning required)
Also, FYI, when running the example code
node webclone.js https://www.example.com/
It fails (at least for me) until I either install yt-dlp or ignore videos via:
node webclone.js https://www.example.com/
Hello. I have tested this, and it does indeed look for yt-dlp at startup even when the site is not a video platform. I have logged the issue on GitHub and am working on a fix. Thank you for the feedback!
Great feedback! Will get this fixed. Thank you.
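One possible shape for that fix is to defer the yt-dlp check until a page actually contains video, and to cache the result. This is only a sketch under assumptions: commandExists, makeLazyCheck, handlePage, and the hasVideo field are hypothetical names for illustration and are not taken from the webclone code.

```javascript
const { spawnSync } = require('node:child_process');

// Returns true if `command --version` can be started and exits with status 0.
function commandExists(command) {
  const result = spawnSync(command, ['--version'], { stdio: 'ignore' });
  return !result.error && result.status === 0;
}

// Hypothetical lazy guard: the probe runs on the first call only,
// and the answer is cached for later calls.
function makeLazyCheck(command) {
  let cached;
  return () => {
    if (cached === undefined) cached = commandExists(command);
    return cached;
  };
}

const hasYtDlp = makeLazyCheck('yt-dlp');

// Hypothetical page handler: yt-dlp is only probed when a page has video.
function handlePage(page) {
  if (!page.hasVideo) return 'saved';
  return hasYtDlp()
    ? 'saved with video'
    : 'saved, video skipped (yt-dlp not found)';
}

console.log(handlePage({ hasVideo: false }));
```

With this layout a site without videos never triggers the probe, so a missing yt-dlp would only matter once a video is found.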
Can it also clone games? You know, those HTML web games.
I haven't tried. Would you be able to share some links so I could test?
Hey, thanks! This is quite handy. I often do this "manually", lol.
Glad you find it useful. Please feel free to share with your friends.