OCR

Source criticism, bias, and representativeness in the digital age: A case study of digitized newspaper archives

Historians must critically scrutinize their sources, a task further complicated in the digital age by the need to evaluate the technical infrastructure of digital archives. This article critically examines digital newspaper archives, revealing error rates in optical character recognition (OCR) that compromise result reliability, and word frequency-based datasets that introduce biases due to issues in […]


Are searches in OCR-generated archives trustworthy?

Digitised archives are revolutionary research tools that generate in seconds results that once often took years to obtain. But do they return every result for the terms searched for? The accuracy of searches was tested by running sample searches on leading newspaper databases. The test revealed several weaknesses in the search process,

