Self-hosted Archiving and Digital Preservation (DP)
Software that empowers institutions to safeguard and perpetuate digital cultural heritage by developing, managing, and preserving complex archives for future generations.
-   ArchiveBox ⭐️ 20.0k Self-hosted _wayback machine_ that creates HTML & screenshot archives of sites from your bookmarks, browsing history, RSS feeds, or other sources. 
-   Wallabag ⭐️ 9.8k Wallabag, formerly Poche, is a web application that lets you save articles and read them later in a cleaner, more readable layout.
-   CKAN ⭐️ 4.3k CKAN is a data management system for building open data portals.
-   Tube Archivist ⭐️ 4.1k Organize, search, and enjoy your YouTube collection. Subscribe, download, and track viewed content with metadata indexing and a user-friendly interface. 
-   bitmagnet ⭐️ 2.0k A self-hosted BitTorrent indexer, DHT crawler, content classifier and torrent search engine with web UI, GraphQL API and Servarr stack integration. 
-   Wayback ⭐️ 1.7k A self-hosted toolkit for archiving webpages to the Internet Archive, archive.today, IPFS, and local file systems. 
-   Omeka S ⭐️ 389 Omeka S is a web publication system for universities, galleries, libraries, archives, and museums. It consists of a local network of independently curated exhibits sharing a collaboratively built pool of items, media, and their metadata. 
-   Ganymede ⭐️ 378 Twitch VOD and Live Stream archiving platform. Includes a rendered chat for each archive. 
-   ArchivesSpace ⭐️ 329 Archives information management application for managing and providing Web access to archives, manuscripts and digital objects. 
-   LiveStreamDVR ⭐️ 299 An automatic Twitch recorder capable of capturing live streams, chat messages and stream metadata. 
-   Collective Access - Providence ⭐️ 283 Highly configurable Web-based framework for management, description, and discovery of digital and physical collections supporting a variety of metadata standards, data types, and media formats. 
-   Webarchive ⭐️ 88 Lightweight self-hosted _wayback machine_ that creates HTML and PDF files from your bookmarks. 
-   Sosse Selenium-based search engine and crawler with offline archiving.
-   Readeck Readeck is a simple web application that lets you save the readable content of web pages you like and want to keep forever. Think of it as a bookmark manager and a read-later tool.