The success of many professionals depends on their ability to extract information from large collections of documents. For example, financial analysts form valuations of a company by reading material published by that company, including stakeholder reports.
Many of these document collections are already digital, and an increasing number will be stored digitally in the future. Digital libraries are less expensive to maintain: printing costs are avoided, and digital libraries do not require the storage space that physical libraries need. Moreover, it is much faster to extract information from a digital library than from a physical collection. In particular, digital libraries can be integrated with search engines to speed up information retrieval for the user.
The most popular information retrieval system is Google, which retrieves information from files made public on the World Wide Web. Over the past 22 years, Google has dedicated itself to building scalable search engine software capable of querying the massive amount of information available on the web (as of October 2020, the web contained at least 5.43 billion pages). Suffice it to say, the search engine software currently used at Google is extremely powerful.
Google’s search engine works well when a user wants to retrieve information from the World Wide Web, but many professionals need customized search engines that are configured for a specific digital library. Many professional document collections are private and are therefore not indexed by public search engines. For example, law firms review documents that cannot be published on the web due to privacy concerns. In addition, professional users want to query their company’s library without also querying documents outside that library. For these reasons, customized search engines are an attractive solution for many companies.