
Spanish Web Archive

What is an internet archive?

The term “internet archive” refers to a collection built by the automated harvesting of websites. A web archive comprises websites and pages whose contents were designed for publication on communication networks. Its purpose is to preserve and disseminate these “born-digital” resources so that they can serve as tools of knowledge for present and future generations.

Spanish Web Archive

In 2009, the Biblioteca Nacional de España (BNE) created the Spanish Web Archive with the aim of preserving and facilitating future access to Spanish content published online (websites, blogs, forums, distribution lists, documents, images, videos, etc.). In Spain, the PADICAT (Patrimonio Digital de Cataluña) and the ONDARENET (Archivo del Patrimonio Digital Vasco) have been managing the Catalan and Basque digital heritage archives since 2005 and 2007, respectively.

Based on the UNESCO Charter for the Preservation of Digital Heritage (2003) and on the European Commission Recommendation on the digitisation and online accessibility of cultural material and digital preservation, the BNE captures Spanish websites and pages hosted on the .es domain, as well as on other generic domains and subdomains (.com, .edu, .gob, .org, .net, etc.).

As part of this project, the BNE has been a member of the International Internet Preservation Consortium (IIPC) since 2010 and has formed part of its Management Board since 2014. The IIPC brings together the most important web archiving initiatives worldwide and includes national libraries from all over the world as well as heritage institutions such as university and research archives and libraries.

From the start of the BNE project in 2009 to the end of 2013, eight broad crawls of the .es domain and two focused crawls were performed. The first focused crawl provided monographic coverage of the General Election of 20 November 2011, and the second gathered Spanish resources in the field of the Humanities. The results of these crawls, carried out by the Internet Archive for the BNE, were transferred to the Biblioteca's servers at the end of 2014 thanks to a cooperation agreement signed with Red.es. Red.es cooperates actively with the Biblioteca to develop the technology and infrastructure needed to manage the legal deposit of online publications.

In 2014, the Biblioteca installed the open-source NetarchiveSuite toolkit in a test environment to crawl and archive the web. This open-source software is also used by other national libraries, such as those of Denmark, France, Austria and Estonia, to crawl their respective national webs. With this system the Biblioteca has since carried out various focused crawls on events relevant to Spanish history and culture, such as the death of Adolfo Suárez, the abdication of Juan Carlos I, the proclamation of Felipe VI, the 9N consultation in Catalonia, the 2014 European elections, and the 2015 local and regional elections.

With the approval of the Royal Decree regulating the legal deposit of online publications, the BNE and the conservation centres of the Autonomous Communities have the legal backing to gather websites as part of their mission to preserve documentary heritage.