A repository of raw text of magazines and newspapers written in English, Spanish and Tagalog between 1898 and 1972.
Distant-reading analyses often need a stop word list, that is, a list of words that carry little semantic content. This is such a list for Tagalog.
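As a rough illustration of how such a list is typically applied, the sketch below removes stop words before counting word frequencies. The seven words in `STOPWORDS` are only a small illustrative sample of common Tagalog function words, not the project's actual list, and the function name `content_word_counts` and the sample sentence are assumptions made for this example.

```python
import re
from collections import Counter

# Illustrative sample only: in practice, load the full Tagalog list from its file.
STOPWORDS = {"ang", "ng", "sa", "na", "at", "mga", "ay"}


def content_word_counts(text, stopwords):
    """Count the words in `text` that are not in `stopwords` (case-insensitive)."""
    # Keep runs of letters only; digits, punctuation and underscores act as separators.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(token for token in tokens if token not in stopwords)


sample = "Ang mga pahayagan sa Maynila ay nagbalita ng digmaan at ng halalan."
counts = content_word_counts(sample, STOPWORDS)
print(counts.most_common())
```

This simple tokenizer also splits hyphenated words, so a real pipeline may need a tokenizer better suited to Tagalog orthography.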
To improve OCR results for newspapers, which were sometimes written in two or more languages, we trained a model on Transkribus using printed documents in three languages: Tagalog, English and Spanish. Would you like to use this model? Just let us know and we will share it with your collection!
Here you can find a list of repositories that hold digitized versions of Philippine newspapers written between 1860 and 1960.
We have explained some of the results of this project in an online exhibition on Philippine newspapers.