The National Széchényi Library is home to the largest state-of-the-art digitization center in Central Europe. What does this mean in practice? What kinds of (potentially digital) proportions are we talking about?
The National Széchényi Library stores hundreds of millions of pages of documents. That is a vast amount. In earlier decades, the library fell somewhat behind when it came to mass digitization. Some good initiatives were launched, but the institution was slow to move from large-scale production to “mass production,” that is, to mass content services. Private companies, on the other hand, had already started doing this work. When the machines finally arrived and the digitization center was established, we did not have the right workflow to meet our objectives, so we had to reorganize our work procedures completely. We managed to multiply our skills with basically the same staff, through training and retraining, and last year, we were able to start mass-producing digital documents at the National Széchényi Library. At present, we digitize 1 to 1.1 million pages every month, which is quite significant even at an international level. According to the plans, we digitize journals, books, and special documents, as well as codices, maps, posters, and, in collaboration with the Haydneum–Hungarian Centre for Early Music, sheet music.
Could you say a word or two about what a digitization center like this would look like to the layman?
Imagine an area of over a thousand square meters with robotic scanners and V-shaped book scanners, which are used to digitize sensitive documents. There are tools that make it possible to digitize large-format materials, and high-quality cameras are used to digitize codices and other manuscripts. Post-production is also done here. We have a storage department for the raw material produced, which stores files so that they can be repeatedly converted to other formats without losing data, and the resulting digital objects are also made ready here for content services. The workflow is more complex here than at for-profit companies, since we need to preserve the original documents as well.
Would the ultimate goal be to digitize the entire collection?
The goal is to make all Hungarian printed matter available online, but this means digitizing hundreds of millions of pages. Even if we doubled our annual capacity, that is, if we were to digitize 20 million pages per year, we would still need decades. Our microfilmed stock, which amounts to more than 100 million pages, could be made available in ten years’ time, because we can work at a speed of tens of thousands of pages a day. However, if we consider the fact that about 16,000 to 17,000 books are published every year, and clearly one can hardly ignore the innumerable periodicals that also come out, it is quite likely that the entire collection will only be digitized in my lifetime if I am lucky enough to live an exceptionally long life.
Some scientific publications and fiction books are published online, and also on Facebook, for instance. Is there any protocol for adding these works to the collection?
The internet was not originally invented for the various uses to which it is so often put today. No one thought thirty years ago that it would become a medium we simply could not live without. When librarians realized that it had indeed become a new place and a new means of generating and transmitting knowledge, they started to ponder how the content generated on the web could be preserved. The very first internet archives appeared after the turn of the millennium, managed by national libraries in more than 50 countries. It is impossible to archive every state of every website, or every new item on a news website. The established method is that each country preserves its own web space, so we save the .hu web domain and all Hungarian-related sites beyond the borders twice a year. We have a robot run through the .hu web space, down to a certain depth, saving the state of each page at that moment. In the case of some major events, such as the 2021 International Eucharistic Congress, the Olympics, and the UEFA Euro championship, we preserve every single news article and piece of web content. This is referred to as event-based web archiving.
Full digitization does seem to be the ultimate goal, then. If this is achieved and everything is made available in a digital format 70 years after the author’s death when the copyright expires, will there be any need for libraries?
We are already processing content with this mission in mind: all publications in the public domain, which means works whose authors passed away more than 70 years ago or works that a given institution has permission to make accessible, are digitized and made available as public domain works. There is an intermediate area: EU copyright law now allows publications that are no longer commercially available but are still protected by copyright to be made available as part of the public domain after they have been “parked” in the EU database for six months, with the provision that the author or the heir can remove them at any time. This is usually the case with works published before 1999, so we digitize and make available works published in the 1990s. Our goal is not only to digitize as much of our cultural heritage as possible but to make it available, either online or on a network dedicated to this. Will there be any need for a library? I imagine there will. If we cannot understand the library as something more than a collection of printed books, this would mean that we consider the library as an institution to be little more than a warehouse. But a library is much more than that, of course. Libraries mean books, whether in a physical or digital format, but they also mean qualified professionals, that is, librarians, not to mention readers. The digitized content online with all the metadata is hardly a sufficient substitute or replacement for a library. Libraries are places that provide reliable, credible information, places where culture and literacy are nurtured, encouraged, and facilitated by qualified staff. Libraries have been around for thousands of years, in various forms and using various media, and I believe they will be around for many more decades and centuries to come.
Perhaps they will not be located in buildings, or perhaps not in buildings of this size, but there will always be a need for people who can help guide us through the flood of information.
Although we are talking about a library, books are only one part of the collection. What else does the National Széchényi Library have in its collections?
In addition to books, periodicals are another huge and frequently used part of the collection, but half of our documents are, in fact, posters and small print. The latter come in an infinite variety of forms, such as business cards, program guides, menu cards, invitations, stamps, or obituaries. Small print is the largest unit of our collection, and this collection is also available online. It is so vast that processing all this material is both difficult and slow. The National Széchényi Library has the largest collection of manuscripts in the country, with approximately one and a half million items and legacies from renowned figures, such as Ferenc Kölcsey and Sándor Petőfi, both well-known figures in the Hungarian literary canon. The library contains manuscripts from the library of the Hungarian king Mátyás Hunyadi, as well as seven hundred codices, mostly in Latin but also in Slavonic, German, and Greek, and one or two codices written entirely in Hungarian. We have the largest collection of old books in the country, including the Buda Chronicle, the very first book printed in Hungary. Ferenc Széchényi’s collection of maps includes both manuscripts and printed works. We also have the largest collection of gramophone records in Hungary, and our photographic archive contains 600,000 photographs. There are more than 1,500 historical interviews to choose from. The library is practically an inexhaustible resource.
How long can all this be saved from decay?
If we asked a hundred people, most would guess that the oldest items in the holdings of the National Széchényi Library are codices from the medieval period, but this is not the case. The oldest items are two papyrus fragments, over three thousand years old. Our codices are written on all sorts of materials, from parchment to cotton rag paper and thick, high-quality paper. There is evidence that the latter can survive for hundreds or thousands of years. On the other hand, some documents that were written or printed on paper used during wartime, for example, or acidic, poor-quality materials have a much lower chance of survival than an 800-year-old codex, not to mention today’s low-quality paperback editions or books bound using glue. To survive for hundreds or thousands of years, books need continuous preservation work, done by our conservators. In an ideal case, several generations will be able to admire the library’s holdings. A common problem with manuscripts, including the manuscript of the poem Himnusz by Ferenc Kölcsey, which is now the text of the Hungarian national anthem, is that they were written using unsuitable inks or have been exposed to forces (light, temperature, etc.) that affect their condition and their future. Documents or manuscripts written in iron gall ink or on which iron gall ink was spilled require conservation efforts if the paper is going to survive.
So, the holdings of the National Széchényi Library are intended, ideally, to last forever, and some of the items in the collection are also preserved in the Arctic World Archive in Svalbard. Which are the works designated for “survival”?
People perhaps do not always fully grasp that librarians are experimental and innovative. We are constantly searching for new technologies that help the long-term preservation of data and digital objects. As early as the 2000s, a seed bank was established in Svalbard, preserving a huge proportion of the seeds of the world in permafrost. A few years ago, an archive was set up along the same lines. It is used to store film copies of the digitized material from each major library in permafrost. The National Széchényi Library was one of the first national institutions and the first national library worldwide to join this endeavor. We have recorded digitized versions of the manuscripts from the library of Mátyás Hunyadi, Count Széchényi’s map collection, as well as posters from the 1920s and 1930s on special tapes. This is another way of ensuring long-term preservation. Experiments indicate that the film reels can be preserved for a thousand years, at least, in the ice cave.
The original version of this interview was published in the special issue of the journal Magyar Kultúra, titled “Books” (2023/5).