READ: PRHLT indexing and search engine. Guidelines for the Bentham collection

  1. Overview
  2. Confidence level
  3. Maximum results
  4. Searching
  5. Viewing search results
  6. Starting a new search
  7. Advanced searching
  8. Example search queries
  9. Additional help

Overview

This interface allows users to search over 90,000 images comprising the main collections of manuscripts written by the English philosopher Jeremy Bentham (1748-1832), which are held by University College London Library and The British Library. This interface is in its testing stage. Feedback is welcome at transcribe.bentham@ucl.ac.uk

The PRHLT research center has processed the Bentham Papers with cutting-edge handwritten text recognition and probabilistic word indexing technologies. The result is that this vast collection of Bentham's papers can now be efficiently searched with a fair level of accuracy, including those papers that have not yet been transcribed.

This work is the result of a collaboration of the PRHLT research center at the Universitat Politècnica de València with the Bentham Project at University College London, as part of the READ project which has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 674943.

Happy searching!

Confidence level

At the top of the page, a text box is provided for you to enter particular words and phrases that you wish to find among the manuscripts.

Beneath the search box at the top of the page, a confidence box and a confidence slider are provided. These allow you to specify, as a number between 1 and 100, the degree of confidence that you wish to search at.

If the confidence level is set to a high number, the platform will return fewer results, but the retrieved words are more likely to be correct. If the confidence level is set to a lower number, the platform will return more results, but the retrieved words are less likely to be correct.
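
The trade-off can be illustrated with a small Python sketch. The list of spots and their confidence values are invented for illustration, and the function name filter_spots is hypothetical. The sketch assumes only that a spot is returned when its confidence is at or above the chosen level; the platform's actual filtering may differ.

```python
# Hypothetical spots for one query word: (word, confidence as a percentage).
# These values are invented for illustration, not taken from the collection.
spots = [("pleasure", 93), ("pleasure", 71), ("pleasure", 48), ("pleasure", 12)]


def filter_spots(spots, confidence_level):
    """Keep spots whose confidence is at or above the chosen level (assumed rule)."""
    return [spot for spot in spots if spot[1] >= confidence_level]


high = filter_spots(spots, 80)  # strict: fewer results, more likely correct
low = filter_spots(spots, 10)   # lenient: more results, more false matches
print(len(high), len(low))
```

With this invented data, the strict setting keeps a single spot while the lenient setting keeps all four.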

Maximum results

A maximum results box is also provided to allow you to specify the number of search results you wish to see.

Searching

To begin searching, set your desired confidence level and maximum number of results. Type your query into the text box and click 'Search'. The default confidence level is 50%.

Viewing search results

Search results are displayed at 3 hierarchical levels: 1. collection, 2. box and 3. page image. It is best to open each level in a new tab in order to retain all of your search results.

Viewing search results - at collection level

After making your search, the system will display the results at collection level. You are presented with a banner stating the number of boxes which contain the relevant word or phrase.

Viewing search results - at box level

Click this banner and you will see each box listed individually, along with the number of pages it contains which match your search query.

Click on the thumbnail image of each box to view the search results for that box. It is best to open each box in a new tab in order to retain all of your search results.
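
The way results roll up from pages to boxes can be pictured with the following Python sketch. The box names, page numbers and the function name pages_per_box are all hypothetical; the sketch only mirrors the counts described above (boxes containing a match, and matching pages per box) and is not the platform's own code.

```python
from collections import defaultdict

# Hypothetical matches: one (box, page) pair per matching word. Invented data.
hits = [("Box 001", 12), ("Box 001", 12), ("Box 001", 40), ("Box 027", 3)]


def pages_per_box(hits):
    """Count the distinct matching pages in each box."""
    pages = defaultdict(set)
    for box, page in hits:
        pages[box].add(page)  # a page with several matches is counted once
    return {box: len(page_set) for box, page_set in pages.items()}


summary = pages_per_box(hits)
print(len(summary), "boxes contain a match")  # collection-level banner
for box, count in summary.items():            # box-level listing
    print(box, count, "matching pages")
```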

Viewing search results - at page level

Click on the thumbnail image of an individual box in order to see a display of relevant manuscript images.

Each entry specifies the page number, the name of the penner who wrote the page, a thumbnail of the manuscript, the number of matching words on that page and a confidence bar. Hover your mouse over the confidence bar to see the precise confidence value as a percentage.

You can also hover your mouse over the thumbnail image of a page to see its exact box and folio number. The page number is the number of the image; the folio number is the Library catalogue number. There are often several images, and thus several page numbers, for each folio (see a detailed explanation here).

Clicking on the thumbnail image of a manuscript will open the page in question. Again, it is best to open each page in a new tab in order to retain all of your search results.

The folio number and the penner (i.e., the hand) of each page are displayed at the top of the screen.

The results of the search queries (called 'spots') will be highlighted in the manuscript image. The colour of the box surrounding the relevant word indicates the confidence level, with green being the highest.

Starting a new search

You can start a new search at any time by typing a query into the search box.

If you are already viewing a particular box/page, the platform will only search that particular box/page.

To search the entirety of the Bentham papers, click 'HOME' at the top of the manuscript image or click on the 'Bentham Papers Indexing and Search' link at the top left of the webpage.

To search within a particular box, go to the searching homepage, click the banner and then click the thumbnail of the box you wish to search in.

Advanced searching

To obtain more specific and more accurate results, you can use a wide range of query formatting options. Several hints and tips for searching the collection are provided below.

Example search queries

Single words

Compound queries: OR

Compound queries: AND

Compound queries: NOT

Compound queries: mixed AND/OR/NOT

Compound queries: proximity

Compound queries: phrases

Compound queries: phrases + boolean

  • [New York] || [New Orleans]
  • [New (York || Orleans)]
  • [New York] && [New Jersey]
  • bentham &10& samuel - [Samuel Bentham]
  • [pain (and||or) pleasure] || [pleasure (and||or) pain]
  • [pain (and||or) pleasure] && [pleasure (and||or) pain]
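
If you build queries programmatically before pasting them into the search box, a few small Python helpers can keep the brackets and operators consistent. This is a sketch only: the helper names phrase, any_of and all_of are hypothetical, and the reading of square brackets as phrases, || as OR and && as AND is inferred from the headings and examples above rather than from a formal specification of the query language.

```python
def phrase(text):
    """Wrap words in square brackets, as in the phrase examples above."""
    return f"[{text}]"


def any_of(*terms):
    """Join terms with ||, read here as OR (an inference from the examples)."""
    return " || ".join(terms)


def all_of(*terms):
    """Join terms with &&, read here as AND (an inference from the examples)."""
    return " && ".join(terms)


either_city = any_of(phrase("New York"), phrase("New Orleans"))
both_places = all_of(phrase("New York"), phrase("New Jersey"))
print(either_city)
print(both_places)
```

The two printed strings reproduce the first and third example queries in the list above.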

Wildcard and Approximate queries

Additional help

Please feel free to contact transcribe.bentham@ucl.ac.uk with any questions or comments.