Deep Learning-Based OCR Solution for Python
Extract and recognize text from images with high accuracy using docTR
What Is the docTR API for Python?
docTR (Document Text Recognition) is an open-source, deep learning-based optical character recognition (OCR) library for Python. It provides state-of-the-art text detection and recognition for scanned documents, images, and PDFs. By building on modern deep learning architectures, it extracts text accurately and efficiently while preserving document structure.
docTR is widely used for document digitization, automated data extraction, and AI-based text recognition applications. It supports multiple languages, handwriting recognition, and GPU acceleration for better performance.
Key Features of the docTR API
- Advanced deep learning OCR: precise text detection and recognition using neural networks
- Multi-format support: works seamlessly with images, PDFs, and scanned documents
- Handwriting recognition: detects and extracts handwritten text
- Multilingual recognition: supports a variety of languages and scripts
- Speed optimization: efficient text extraction with GPU acceleration
- Layout preservation: maintains document structure during text recognition
- Scalable and open source: free to use and continuously improved
Getting Started with the docTR API
To install docTR, use the following pip command:
Install docTR
pip install python-doctr
To enable GPU acceleration for faster processing, install a CUDA-enabled build of PyTorch for your platform:
Install GPU dependencies
pip install torch torchvision
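Before running OCR, it can help to confirm that a GPU is actually visible to the backend. The sketch below assumes the PyTorch backend; `pick_device` is a hypothetical helper name, not part of docTR. It falls back to the CPU when PyTorch is not installed or no CUDA device is found.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch can see a CUDA GPU, otherwise "cpu"."""
    # Assumption: the PyTorch backend is in use. If torch is not installed,
    # there is nothing to check, so report the CPU.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"OCR will run on: {device}")
```

With the PyTorch backend, the predictor returned by ocr_predictor is a torch module, so calling model.to(device) on it should move the model to the selected device.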
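Text Extraction Code Examples Using the docTR API
The following examples show how to extract text from images and documents with docTR.

Example 1: Extracting Text from an Image
This example loads an image, applies OCR with docTR, and extracts the text. The extracted text includes the position of each element within the image, which is useful for structured document processing.
Extract text from an image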
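from doctr.io import DocumentFile
from doctr.models import ocr_predictor
# Load one or more image files as a list of pages
doc = DocumentFile.from_images("sample.png")
# Build the OCR pipeline (text detection + recognition) with pretrained weights
model = ocr_predictor(pretrained=True)
# Run OCR and print the result as a nested dictionary
result = model(doc)
print(result.export())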
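The position information mentioned above can be turned into pixel coordinates. The following is a minimal sketch, assuming the exported dictionary nests pages, blocks, lines, and words, that each word carries a relative `geometry` of the form ((xmin, ymin), (xmax, ymax)), and that each page stores `dimensions` as (height, width). The `sample_export` dictionary is a hand-written stand-in for `result.export()`, and `iter_word_boxes` is a hypothetical helper, not part of docTR.

```python
from typing import Any, Dict, Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels

# Hand-written stand-in for result.export(); the values are illustrative only.
sample_export: Dict[str, Any] = {
    "pages": [
        {
            "page_idx": 0,
            "dimensions": (1000, 800),  # assumed order: (height, width)
            "blocks": [
                {
                    "lines": [
                        {
                            "words": [
                                {"value": "Hello", "confidence": 0.98,
                                 "geometry": ((0.10, 0.20), (0.30, 0.25))},
                                {"value": "world", "confidence": 0.91,
                                 "geometry": ((0.35, 0.20), (0.55, 0.25))},
                            ]
                        }
                    ]
                }
            ],
        }
    ]
}


def iter_word_boxes(exported: Dict[str, Any]) -> Iterator[Tuple[str, Box]]:
    """Yield (text, pixel box) for every word in the exported result."""
    for page in exported["pages"]:
        height, width = page["dimensions"]
        for block in page["blocks"]:
            for line in block["lines"]:
                for word in line["words"]:
                    (xmin, ymin), (xmax, ymax) = word["geometry"]
                    # Scale relative coordinates (0..1) to pixel positions.
                    box = (round(xmin * width), round(ymin * height),
                           round(xmax * width), round(ymax * height))
                    yield word["value"], box


for text, box in iter_word_boxes(sample_export):
    print(text, box)
```

Boxes in this form can be passed to an image library to crop or highlight individual words.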
Example 2: Processing a Multi-Page PDF Document
When you need to extract text from a PDF file with multiple pages, docTR simplifies the process. The following example shows how to extract text from every page efficiently.
Extract text from a PDF
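from doctr.io import DocumentFile
from doctr.models import ocr_predictor
# Read the PDF; each page becomes one image in the returned list
doc = DocumentFile.from_pdf("sample.pdf")
# Build the OCR pipeline with pretrained weights
model = ocr_predictor(pretrained=True)
# Run OCR on all pages and print the result as a nested dictionary
result = model(doc)
print(result.export())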
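For multi-page input, it is often more convenient to work with one text string per page than with the full nested dictionary. The sketch below assumes the same pages, blocks, lines, and words nesting in the exported dictionary. `page_texts` is a hypothetical helper, and `sample_export` is a hand-written stand-in for `result.export()`.

```python
from typing import Any, Dict, List


def page_texts(exported: Dict[str, Any]) -> List[str]:
    """Return one string per page, with one OCR line per text line."""
    texts: List[str] = []
    for page in exported["pages"]:
        lines: List[str] = []
        for block in page["blocks"]:
            for line in block["lines"]:
                # Join the words of a line with single spaces.
                lines.append(" ".join(word["value"] for word in line["words"]))
        texts.append("\n".join(lines))
    return texts


# Hand-written stand-in for a two-page result.export(); illustrative only.
sample_export: Dict[str, Any] = {
    "pages": [
        {"page_idx": 0, "blocks": [{"lines": [
            {"words": [{"value": "Invoice"}, {"value": "42"}]},
            {"words": [{"value": "Total"}]},
        ]}]},
        {"page_idx": 1, "blocks": [{"lines": [
            {"words": [{"value": "Page"}, {"value": "two"}]},
        ]}]},
    ]
}

for number, text in enumerate(page_texts(sample_export), start=1):
    print(f"--- page {number} ---")
    print(text)
```

If only the plain text of the whole document is needed, the result object's render() method should give a similar output without walking the dictionary by hand.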
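Example 3: Recognizing Handwritten Text
docTR can also recognize handwriting, which makes it a good fit for digitizing handwritten notes, forms, and historical documents. This example extracts text from a scanned handwritten page.
Extract handwritten text
from doctr.io import DocumentFile
from doctr.models import ocr_predictor
# Placeholder path: point this at your own handwritten scan
doc = DocumentFile.from_images("handwritten.png")
# Build the OCR pipeline with pretrained weights
model = ocr_predictor(pretrained=True)
# Run OCR and print the result as a nested dictionary
result = model(doc)
print(result.export())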
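Handwriting is usually harder to read than printed text, so it can be worth flagging uncertain words for manual review. The following is a minimal sketch, assuming each exported word carries a `confidence` score between 0 and 1. `low_confidence_words`, the 0.5 threshold, and `sample_export` are illustrative choices, not part of docTR.

```python
from typing import Any, Dict, List, Tuple


def low_confidence_words(
    exported: Dict[str, Any], threshold: float = 0.5
) -> List[Tuple[int, str, float]]:
    """Return (page index, word, confidence) for words below the threshold."""
    flagged: List[Tuple[int, str, float]] = []
    for page in exported["pages"]:
        for block in page["blocks"]:
            for line in block["lines"]:
                for word in line["words"]:
                    if word["confidence"] < threshold:
                        flagged.append(
                            (page["page_idx"], word["value"], word["confidence"])
                        )
    return flagged


# Hand-written stand-in for result.export(); the scores are illustrative only.
sample_export: Dict[str, Any] = {
    "pages": [
        {"page_idx": 0, "blocks": [{"lines": [{"words": [
            {"value": "Dear", "confidence": 0.93},
            {"value": "Jahn", "confidence": 0.41},
            {"value": "thanks", "confidence": 0.77},
        ]}]}]}
    ]
}

for page_idx, value, confidence in low_confidence_words(sample_export):
    print(f"page {page_idx}: '{value}' ({confidence:.2f}) needs review")
```

The threshold is a trade-off: a higher value sends more words to review, while a lower value trusts the model more.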
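Conclusion
The docTR API is a powerful deep learning-based OCR solution for extracting text from images, PDFs, and handwritten documents. It delivers high accuracy while preserving document structure, making it a valuable tool for AI-driven document processing, automation, and data extraction.
Whether you need document digitization, automated data entry, or AI-based text recognition, it offers a flexible and efficient solution.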