Comparison between the Wayback Machine and Archive.Today
Article far from finished. Everyone is invited to contribute.
Archive.Today was founded in 2012 as a Wayback Machine alternative with support for fragments (#) in URLs and so-called hash bang URLs.
The Internet Archive, which is the organisation that powers the Wayback Machine, was founded in 1996.
Archive.Today stores pages in a modified memento format.
All CSS is converted to inline CSS in a single HTML document, which strips off certain selectors.
The Wayback Machine retains a verbatim copy of a site's contents; the original source code of each file is retrievable using URL parameters.[which?]
Supported media types
For HTML5 videos, only the placeholder is archived.
Archive.Today does, however, support video archival on Imgur.
While Archive.Today often handles URL submissions in a queue, the Wayback Machine can archive immediately.
Both services usually index captures immediately, making them retrievable and searchable. However, before 2013, the Wayback Machine indexed pages with a delay of 6 to 18 months. In late 2019, the Wayback Machine suffered from several technical issues with its indexing, causing indexing delays and temporary unavailability of pages.
In mid-2019, both the Wayback Machine and Archive.Today imposed strict rate limits, which have changed over time.
The Wayback Machine limits submissions to a seemingly random number of pages per minute and sometimes shows HTTP 429 on the first submission, although a former HTTP 429 error page suggested a limit of 15 pages per minute.
Archive.Today's rate limits appear to allow an unspecified number of pages over an unspecified time span. One hour may be used as a rule of thumb.
While the Wayback Machine shows an HTTP 429 error page whose appearance has changed over time, Archive.Today used to stop responding to browser requests. Later, Archive.Today started showing a Google CAPTCHA to users who hit the rate limit.
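Since neither service documents its limits, a client that receives HTTP 429 can only wait and retry. The following is a minimal sketch of a generic exponential backoff, not a strategy recommended by either service. The `submit` callable is hypothetical: it stands for whatever code performs one submission and returns the HTTP status code. The base delay, growth factor and cap are arbitrary assumptions.

```python
import time
from typing import Callable, Iterator


def backoff_delays(base: float = 4.0, factor: float = 2.0,
                   cap: float = 300.0) -> Iterator[float]:
    """Yield wait times in seconds: base, base*factor, ..., never above cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def submit_with_backoff(submit: Callable[[], int], max_attempts: int = 5,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Retry `submit` while it returns 429, waiting longer after each failure.

    `submit` is a hypothetical callable that performs one submission and
    returns the HTTP status code. The last status seen is returned.
    """
    delays = backoff_delays()
    status = 429
    for attempt in range(max_attempts):
        status = submit()
        if status != 429:
            break
        # Do not sleep after the final attempt; there is nothing left to retry.
        if attempt < max_attempts - 1:
            sleep(next(delays))
    return status
```

Passing `sleep` as a parameter keeps the function easy to try out without actually waiting.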
Using the respective submission forms, Archive.Today allows archiving the same page every 5 minutes, while the Wayback Machine allows archiving the same page only once per 20 minutes (formerly 10 and 5 minutes).
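The per-page intervals can be expressed as a small check that a script might run before resubmitting a URL. This is only a sketch: the service keys and function names are made up for illustration, and the intervals are the ones stated in this article, which may have changed since.

```python
from datetime import datetime, timedelta

# Minimum gap between two captures of the same page, as given in this article.
# The dictionary keys are illustrative labels, not official identifiers.
MIN_INTERVAL = {
    "archive.today": timedelta(minutes=5),
    "wayback": timedelta(minutes=20),
}


def next_allowed_capture(service: str, last_capture: datetime) -> datetime:
    """Return the earliest time at which the same page may be captured again."""
    return last_capture + MIN_INTERVAL[service]


def can_capture(service: str, last_capture: datetime, now: datetime) -> bool:
    """Return True if enough time has passed since the last capture."""
    return now >= next_allowed_capture(service, last_capture)
```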
The Wayback Machine allows saving the outlinks of a page along with the page itself. Since early 2020, this feature has only been available to users who are logged in.
On bottomless pages such as YouTube comments and Quora, Archive.Today is able to scroll down and crawl a few screen heights of additional content.
Archive.Today has more seamless support for crawling websites that rely on AJAX and XHR.
Progressive web applications
On November 29th 2019, Archive.Today upgraded the headless browsing engine it uses for crawling sites from the abandonware PhantomJS to Chromium, enabling the archival of content served through progressive web applications such as Instagram and Twitter Lite.
However, only pages captured via PhantomJS can be downloaded as a ZIP file.
When archiving a page through the live web, which means appending the URL to web.archive.org/save/, the user agent of the browser in use is passed through to the website, which can influence the appearance of sites that use dynamic serving.
In 2020, this feature was occasionally (including for a few days from March 13, and on October 13th) replaced with a behaviour in which the site is archived as if it had been submitted through the submission form.
In that case, the request to open the site with the live web URL is answered with a redirect to the permanent URL (with time stamp) as soon as the capture is finished. This has always been the default behaviour for non-HTML content such as images, videos, plain text files and other binary data.
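The two URL forms mentioned here can be illustrated with a short sketch. The `https://` scheme and the layout of the permanent URL (`/web/`, a 14-digit time stamp, then the original URL) are assumptions based on commonly seen Wayback Machine links, not on an official specification, and the function names are made up for this example.

```python
import re
from datetime import datetime
from typing import Tuple

# Assumed prefix; the article only gives "web.archive.org/save/".
SAVE_PREFIX = "https://web.archive.org/save/"

# Assumed layout of a permanent URL: /web/<YYYYMMDDhhmmss>/<original URL>
_CAPTURE_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def live_save_url(target: str) -> str:
    """Build the live web URL by appending the target URL to the save prefix."""
    return SAVE_PREFIX + target


def parse_capture_url(url: str) -> Tuple[datetime, str]:
    """Split a permanent URL into its capture time and the original URL."""
    match = _CAPTURE_RE.match(url)
    if match is None:
        raise ValueError("not a permanent capture URL: " + url)
    captured_at = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured_at, match.group(2)
```

For example, `live_save_url("https://example.com/")` gives `https://web.archive.org/save/https://example.com/`, which is the address a browser would open to trigger a capture.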
Archival of URL fragments
Archive.Today supports the archival of URLs with URL fragments (hash sign #) and hash bangs.
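To make the terms concrete: the fragment is the part of a URL after the first `#`, and a hash bang URL is one whose fragment begins with `!`. Browsers do not send the fragment to the server, which is why a crawler needs explicit support for it. The sketch below uses Python's standard `urllib.parse.urldefrag`; the helper names and example URLs are made up for illustration.

```python
from typing import Tuple
from urllib.parse import urldefrag


def split_fragment(url: str) -> Tuple[str, str]:
    """Return the URL without its fragment, and the fragment without the '#'."""
    base, fragment = urldefrag(url)
    return base, fragment


def is_hash_bang(url: str) -> bool:
    """Return True if the fragment starts with '!', i.e. the URL contains '#!'."""
    return urldefrag(url).fragment.startswith("!")
```

An archiver that discards the fragment would treat `https://example.com/#!/user/42` and `https://example.com/` as the same page, even though a script on the page may render different content for each.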
Archive.Today supports backing up pages from the Wayback Machine, WebCitation.org, Google Cache and other sites, while adopting the date and time of the source, which is shown as the original time stamp.
In comparison, the Wayback Machine is unable to back up pages from Archive.Today.
- Imgur video archived by Archive.Today (short URL)
- Baron, Alexander (2013-10-23). "The new Internet Archive Wayback Machine now online". www.digitaljournal.com. Retrieved 2020-09-10.
- Poal.co post: Did you know?: Before 2013, the Wayback Machine indexed new crawls in 6 to 18-month intervals.