
Frequently Asked Questions about WARP

Questions

<General Information about Web Archiving Project (WARP)>

<How WARP Archives Websites>

<Online Periodicals>

<Browsing Archived Websites>

<Websites of Official Institutions>

<Privately Operated Websites>

Questions and Answers

<General Information about Web Archiving Project (WARP)>

What kinds of websites are archived by WARP?
We archive the websites of the national government and of local governments (prefectures, designated cities, cities, and towns), as well as those of committees for municipal mergers, independent administrative corporations, semi-governmental corporations and agencies, and universities. We also archive websites for events, online periodicals, and similar sites.


How often do you archive these websites?
National institution websites are archived once per month, and the websites of other institutions subject to collection under the provisions of the National Diet Library Law are archived four times per year. Websites of institutions that are not subject to collection under the provisions of the National Diet Library Law are generally archived once per year. Online periodicals are archived on a schedule that follows their publication frequency and takes into account whether past issues remain available, so that no issues are missed.
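As a rough illustration only, the fixed frequencies described above can be written as a small lookup table. The category labels and function names below are hypothetical and chosen for this sketch; they are not part of WARP. Online periodicals are left out because their schedule depends on each publication.

```python
from datetime import date, timedelta

# Captures per year, taken from the answer above.
# The category keys are hypothetical labels used only in this sketch.
CAPTURES_PER_YEAR = {
    "national": 12,         # once per month
    "other_statutory": 4,   # other institutions collected under the NDL Law
    "non_statutory": 1,     # generally once per year
}


def approx_interval_days(category: str) -> int:
    """Return the approximate number of days between two captures."""
    return 365 // CAPTURES_PER_YEAR[category]


def next_capture(last_capture: date, category: str) -> date:
    """Estimate the next capture date from the previous one."""
    return last_capture + timedelta(days=approx_interval_days(category))
```

For example, `next_capture(date(2024, 1, 1), "other_statutory")` gives an estimate about three months later. Actual capture dates may differ.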


For how long do you archive data?
Our goal is to preserve and to make available the data we archive for as long as technically possible.


Our institution would like to link our website to WARP. Are there any procedures we should follow?
We welcome and encourage links from other websites to WARP pages. No permission is necessary, but please label each link clearly to indicate that WARP is the destination site.


<How WARP Archives Websites>

By what means does WARP archive websites?
We use an automated program, known as a web crawler, to archive websites.


Do web crawlers cause congestion at the servers of the websites to be archived?
No, they do not. Web crawlers are designed to ensure that they do not cause congestion. For example, the interval between downloads is always at least one second.
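The one-second rule can be sketched as follows. This is a minimal illustration of a fixed delay between downloads, not the actual WARP crawler. The function name and its parameters are assumptions made for this example, and the caller supplies the `fetch` function.

```python
import time
from typing import Callable, Iterable, List

MIN_INTERVAL_SECONDS = 1.0  # "at least one second", per the answer above


def polite_fetch_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    min_interval: float = MIN_INTERVAL_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[bytes]:
    """Download each URL in turn, pausing between downloads.

    The pause is measured from the end of one download to the start of
    the next, so the gap is never shorter than min_interval seconds.
    """
    results: List[bytes] = []
    last_finished = None
    for url in urls:
        if last_finished is not None:
            wait = min_interval - (clock() - last_finished)
            if wait > 0:
                sleep(wait)
        results.append(fetch(url))
        last_finished = clock()
    return results
```

The `sleep` and `clock` parameters exist only so that the delay logic can be tested without waiting in real time.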


Do you also archive data intended only for internal use?
No, we do not. We only archive data that has been made available to the public via the Internet.


Can you archive all kinds of files?
No, we cannot. There are technical limitations on the types of files that we can archive. We do not archive some files, including the types listed below.
  • Files that are stored in a database
  • Files that can be streamed or played back as they are downloaded
  • Files that are set to exclude bots
  • Files for which links are generated dynamically with JavaScript
  • Style sheets and JavaScript files
  • Files with corrupt character codes
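One common way for a website to exclude bots is a robots.txt file. The sketch below assumes that mechanism and uses Python's standard `urllib.robotparser` module to show how a crawler might check whether a file is excluded. The user agent name and the sample rules are invented for this example and do not describe WARP's own configuration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules that exclude all bots from one directory.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""
```

With these sample rules, a file under `/private/` would be skipped, while files elsewhere on the site could still be archived.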


<Online Periodicals>

What is an online periodical?
At the National Diet Library (NDL), the term "online periodicals" is used to refer to any digital information that is published periodically on a network under a single title and on an ongoing basis with successive volume numbers and dates. Of these, the NDL preserves those that are available free of charge on the Internet.


What volume numbers are used for the online periodical metadata?
We use the same volume numbers used by the online periodicals themselves, and take the volume number from the original article or selection itself. Volume numbers for issues that contain only tables of contents are not recorded in the metadata even when archived.


<Browsing Archived Websites>

What is the purpose of the banner displayed at the top of archived webpages?
Websites archived by WARP are displayed with a banner at the top of the webpage that shows when the page was archived. The banner helps ensure that users are aware that the content they are browsing is an archive and that the information may be out of date.
* The banner does not always display properly, depending on the page layout or the type of file.


What does it mean when an item is labelled as available only at the NDL?
All websites archived by WARP are available for browsing on the premises at the NDL. In cases where the copyright holder has granted permission to do so, the NDL also makes archived content available via the Internet. There are some websites, however, for which the copyright holder has not granted such permission, and these are available for browsing only on the premises at the NDL.


I can't get some areas to display.
There are technical limitations on the types of files that we can archive. We do not archive some files, including the types listed below.
  • Files that are stored in a database
  • Files that can be streamed or played back as they are downloaded
  • Files that are set to exclude bots
  • Some files for which links are generated dynamically with JavaScript
  • Style sheets and JavaScript files
  • Files with corrupt character codes
Please note that some links will direct you to the live website rather than to an archived page. You can verify whether the webpage you are viewing has been archived by WARP or is part of a live website by checking the URL in the address bar of your browser. Also, archived websites generally have the WARP banner displayed at the top of the page.
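The address-bar check described above amounts to comparing the host name in the URL with the host name of the archive. The following sketch shows the idea. The host `archive.example.org` is a placeholder for this example and is not the real WARP address, so substitute the host shown in your browser when viewing a page that carries the WARP banner.

```python
from urllib.parse import urlsplit


def is_archive_url(url: str, archive_host: str) -> bool:
    """Return True if url is served from archive_host rather than a live site."""
    host = urlsplit(url).hostname or ""
    return host.lower() == archive_host.lower()


# Placeholder host used only for illustration.
EXAMPLE_ARCHIVE_HOST = "archive.example.org"
```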


The page does not refresh even after clicking the link.
WARP uses JavaScript to display archived content. Please enable JavaScript in your browser settings. Issues with your network may also cause a delay when following links. If you have trouble loading a particular webpage, wait a while and then try to load it again.


The time and date displayed on the archived webpage doesn't match the date shown in the WARP banner.
WARP archives information from the Web by copying it to the NDL server and then providing access to users. Thus, webpages that display the time and date dynamically might not always display this information correctly.


I continually get the error message "This file was not saved on [date]."
WARP archives websites according to date. This error message is displayed when the target page of the link you followed was not archived on the same date as the previous page. Even when this message is displayed, the page you are looking for could be included in another title or under a different date. Please search again based on the results that appear on the webpage.


There are no search results for the URL I am looking for.
The URL might have been abbreviated when it was archived. Try searching again without the file name "index.html" or "index.htm" at the end of the URL.
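If you need to shorten many URLs this way, the step can be automated. The sketch below is one possible approach using Python's standard `urllib.parse` module. The function name is our own, and the example domain is fictitious.

```python
from urllib.parse import urlsplit, urlunsplit

INDEX_FILE_NAMES = ("index.html", "index.htm")


def strip_index_file(url: str) -> str:
    """Remove a trailing index.html or index.htm from the URL path."""
    parts = urlsplit(url)
    directory, _, file_name = parts.path.rpartition("/")
    if file_name.lower() in INDEX_FILE_NAMES:
        parts = parts._replace(path=directory + "/")
    return urlunsplit(parts)
```

For example, `strip_index_file("http://www.example.go.jp/info/index.html")` returns `"http://www.example.go.jp/info/"`, while URLs that end in any other file name are returned unchanged.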


Are archived websites still protected by copyright?
The copyright to archived websites is retained by the original copyright owner. Care is needed when using archived materials to respect the copyright and limit your reuse of such material to the extent permitted by copyright law. You are responsible for obtaining permission from the original copyright holder when you intend to reprint images, documents, articles, data, or other content from archived websites.


What does it mean to say that search results are sorted by "relevance"?
Search results are sorted by relevance according to the frequency with which keywords appear in the content. Webpages with the highest frequency appear at the top of the search results. In addition to frequency, document size and other factors affect the order of relevance.
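To give a feel for frequency-based ordering, here is a simplified sketch that ranks pages only by how often the keywords appear. It is not the WARP search engine: it ignores document size and the other factors mentioned above, and the function names are hypothetical.

```python
import re
from typing import Dict, List


def keyword_frequency(text: str, keywords: List[str]) -> int:
    """Count how many times any of the keywords occurs in the text."""
    words = re.findall(r"\w+", text.lower())
    return sum(words.count(keyword.lower()) for keyword in keywords)


def rank_by_frequency(pages: Dict[str, str], keywords: List[str]) -> List[str]:
    """Return page identifiers ordered from highest to lowest frequency."""
    return sorted(
        pages,
        key=lambda page_id: keyword_frequency(pages[page_id], keywords),
        reverse=True,
    )
```

A page in which the keyword appears twice is listed before a page in which it appears once, and pages with no matches come last.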


Can users browse archived data at the NDL?
In general, users can browse all archived data at all three premises of the NDL: the Tokyo Main Library, the Kansai-kan of the NDL, and the International Library of Children's Literature.


Is archived data available to the general public via the Internet?
The NDL makes available to the general public via the Internet any archived webpages for which it has obtained permission from the copyright holder. Pages for which permission has not been obtained are available for browsing by the general public only at the Tokyo Main Library, the Kansai-kan of the NDL, and the International Library of Children's Literature.


Can I request photoduplication services at the NDL for archived webpages?
The NDL accepts requests for photoduplication of archived webpages only from websites for which it has already obtained permission from the copyright holder. We cannot make copies of webpages from websites for which this permission has not been obtained.


There are some archived PDF files for which printing has been disabled. Can I still request photoduplication services for these files?
The NDL is unable to provide photoduplication services for PDF files that are print disabled.


Is the NDL willing to modify or remove archived material for which a claim of copyright violation is made after the material is archived?
The NDL will consider usage restrictions and other measures based on its rules and regulations. Please contact the NDL directly via email to initiate discussion of this kind of issue.


<Websites of Official Institutions>

What is meant by the phrase "collection under the provisions of the National Diet Library Law"?
The National Diet Library Law allows the NDL to archive the websites of official institutions that are available to the general public via the Internet.


Does this include the websites of all official institutions?
Yes. Every website that is published by an official institution and is available to the general public via the Internet is subject to collection under the provisions of the National Diet Library Law.


If these websites contain links to non-government agencies, are the targets of those links archived as well?
No, they are not.


How is content from institutional repositories archived?
When content from institutional repositories is available to the general public via the Internet, it is considered to be continually available to the public for an extended period of time. Since this availability is not subject to change except under special circumstances, there is no need for the NDL to archive such content. Therefore, content on the National Institute of Informatics Current Institutional Repositories (IR) List is not archived by the NDL.


<Privately Operated Websites>

What is the process for archiving content from privately operated websites?
We archive the content of privately operated websites only after obtaining permission from their operators. Our main focus is on the websites of non-profit incorporated associations and foundations, private universities, political parties, and international or cultural events, as well as websites pertaining to the Great East Japan Earthquake, online periodicals, and others. The process is as follows.
  1. The NDL selects candidates.
  2. The NDL sends the website operator a request for permission to archive.
  3. Interested website operators respond to the request.
  4. The NDL configures a web crawler based on information contained in the response and begins to archive the website's content. In the event of technical issues, archiving might be cancelled.
  5. Webpages become available to the general public via WARP once we have confirmed that there are no technical issues related to archiving.
