Closed
Description
When text is extracted by OCR, it is cleaned afterwards by Docsplit::TextCleaner, which starts by converting the text to ASCII and replacing every non-ASCII character with '?'. As a result, all German umlauts (and probably special characters from other languages, too) are lost.
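To illustrate the effect, here is a minimal sketch of a lossy ASCII conversion in plain Ruby. It is not Docsplit's actual code: the helper name `to_ascii_lossy` is hypothetical, and it uses the standard `String#encode` only to reproduce the behaviour described above (each non-ASCII character becomes '?').

```ruby
# Hypothetical helper, NOT taken from Docsplit::TextCleaner.
# It forces text to US-ASCII and substitutes '?' for every character
# that has no ASCII representation, mimicking the reported behaviour.
def to_ascii_lossy(text)
  text.encode("US-ASCII", invalid: :replace, undef: :replace, replace: "?")
end

ocr_text = "Größe über Müller"
puts to_ascii_lossy(ocr_text)  # prints "Gr??e ?ber M?ller"
```

Each of 'ö', 'ß' and 'ü' is replaced by a single '?', so the original word can no longer be recovered from the cleaned text.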