Text Recognition for Your Own Materials
The Transkribus way
Transkribus is a tool for researchers and archives that provides text recognition and layout analysis for both handwritten and printed text in several languages. It is currently free to use.
Install the Transkribus software on your own machine. It acts as the client that transfers data to the Transkribus servers, where all the "heavy lifting" with the materials is done.
It may also be useful to create your own account so that your own materials are easier to find.
Video example of Transkribus
Start the Transkribus software locally
Log in

Upload material to the Transkribus server
Compare the Transkribus usage terms with the terms that apply to your material. If there are no limitations, upload your material to Transkribus.
- Pick the 'Import' icon from the top toolbar.
- Select the folder on your machine whose images you want to upload to Transkribus.
Wait for the images to be transferred to the server. Depending on the amount of material and on how many other users are on the server, this can take a while or finish immediately.
Create a collection if needed. Open the collection.
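Because the transfer time depends on the amount of material, it can help to check what a folder contains before picking it in the import dialog. The following is only a minimal sketch in Python and is not part of Transkribus: the helper name summarize_image_folder and the list of image extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to match your own scans.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def summarize_image_folder(folder):
    """Return (sorted image file names, total size in bytes) for one folder.

    Hypothetical helper: it only inspects the local folder and does not
    talk to the Transkribus client or servers in any way.
    """
    images = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
    total_bytes = sum(p.stat().st_size for p in images)
    return [p.name for p in images], total_bytes
```

Calling summarize_image_folder on the folder you plan to import gives the image file names and their combined size, which gives a rough idea of how much data will be transferred. Files with other extensions and subfolders are ignored.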
Run the OCR
- Go to Tools > 'Text Recognition'
At the moment, Transkribus can use either HTR (Handwritten Text Recognition) or basic OCR (Optical Character Recognition).
Pick OCR and click 'Run...'.
Wait a moment while the Tesseract background servers perform the text recognition and store the results.
Export the OCR results
On the top menu bar, pick 'Export' and select the desired output formats.
Then click 'OK'; the export will be sent to your email. (TODO: how long does the exporting take...(?))
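Once the export has arrived, a quick check of its contents can confirm that the selected output formats are all there. The sketch below assumes that the export has been saved locally as a ZIP archive. That is an assumption, not something stated above, and the helper name count_export_files is hypothetical.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_export_files(zip_path):
    """Count the files inside a ZIP archive, grouped by lowercase extension.

    Assumption: the export was saved locally as a ZIP file. Entries
    without an extension are counted under "(none)".
    """
    with zipfile.ZipFile(zip_path) as archive:
        return Counter(
            PurePosixPath(info.filename).suffix.lower() or "(none)"
            for info in archive.infolist()
            if not info.is_dir()
        )
```

The result is a mapping from file extension to number of files, so a missing output format shows up as a missing key.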