/webarchive/ will be pruned of most archives

Most if not all of the WARC files (warc.gz, cdx, log) and 7z files from https://www.quaddicted.com/files/webarchive/ (aka https://www.quaddicted.com/webarchive/) will be taken offline soon (end of May?). The website archives hosted there in a browser-accessible way will stay where they are for now.

I might put them up on archive.org, but might not. If you would like to do that, please feel welcome to, and just leave a note. If you do, please make sure to mention their provenance for future archivists. If you ever need any of those files in the future, just contact me, as I should still have them. I might also reupload them at some point in the future, in a different kind of archive, but you know how that goes…

Why: Trying to make the project more maintainable, I am cleaning up some of the messy old cruft.

My 2 cents (it’s so cool of you to ask first):

If the content is not available elsewhere (e.g., it did not come from plundering the Wayback Machine), then I’d recommend uploading it to archive.org.

If it is a local duplicate of the Wayback Machine archive, then it’s less critical to re-archive.

My guess is that a lot of it is based on your own crawls, in which case it complements the Wayback Machine archive and is important/valuable to preserve publicly, such as on archive.org (unless it’s expensive in time/money to do so).

Cheers!

Those were all done by me. Some of those sites might already be well archived in the Wayback Machine, others not. I decided to go the extra mile today instead of working on other stuff, so I am currently uploading them to archive.org. They might be ingested into the Wayback Machine later.

The script is uploading and removing them as it goes, so if someone is looking for a particular WARC, check https://archive.org/details/@o373nmu6mf?and[]=mediatype%3A"web"
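For anyone wanting to do something similar, here is a minimal sketch of an upload-then-remove loop. It is not the script used here. The function name `upload_and_remove` and the `upload` callable are hypothetical; `upload` stands in for whatever actually sends a file to archive.org and is assumed to return `True` only once the upload is confirmed.

```python
from pathlib import Path
from typing import Callable, Iterable


def upload_and_remove(
    files: Iterable[Path],
    upload: Callable[[Path], bool],
) -> list[Path]:
    """Upload each file and delete the local copy only after a confirmed upload.

    `upload` is a hypothetical callable supplied by the caller; it should
    return True on success and may raise on network errors.
    """
    removed = []
    for path in files:
        try:
            ok = upload(path)
        except Exception as exc:
            # Keep the local file if anything goes wrong during the upload.
            print(f"upload failed, keeping {path}: {exc}")
            continue
        if ok:
            path.unlink()  # remove the local copy only after success
            removed.append(path)
        else:
            print(f"upload not confirmed, keeping {path}")
    return removed
```

Deleting only after a confirmed upload means an interrupted run can simply be restarted on whatever files are still on disk.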

I will publish all the archive.org URLs here once done.

The first bunch is uploaded and removed from quaddicted.com:

And some more, which were missing either .cdx, .log or both:

This means all WARC archives are migrated off the site.
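As a rough illustration of how one could spot archives that lack a .cdx or .log companion, here is a small sketch. It assumes a naming scheme where `name.warc.gz` sits next to `name.cdx` and `name.log` in the same directory; that layout and the function name `missing_companions` are assumptions, not a description of how the files here were actually organised.

```python
from pathlib import Path

WARC_SUFFIX = ".warc.gz"


def missing_companions(
    directory: Path,
    companions: tuple[str, ...] = (".cdx", ".log"),
) -> dict[str, list[str]]:
    """Map each WARC file name to the companion extensions it lacks.

    Assumes companions share the WARC's base name, e.g. site.warc.gz,
    site.cdx and site.log. WARCs with all companions present are omitted.
    """
    result = {}
    for warc in sorted(directory.glob("*" + WARC_SUFFIX)):
        base = warc.name[: -len(WARC_SUFFIX)]
        missing = [ext for ext in companions
                   if not (directory / (base + ext)).exists()]
        if missing:
            result[warc.name] = missing
    return result
```

Calling `missing_companions(Path("webarchive"))` on a local copy would list only the incomplete sets, which could then be uploaded as a separate batch.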

I will upload the 7z archives some other day.

This is a labour of love. Thank you!
