I have some noisy OCR data. I want to train an LLM on it. What are the typical strategies/programs to clean noisy OCR data for the purpose of training LLMs?
submitted by /u/Franck_Dernoncourt to r/learnmachinelearning