paddleocr package使用说明
本地构建并安装
pip install dist/paddleocr-0.0.3-py3-none-any.whl
- 检测+识别全流程
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg'
result = ocr.ocr(img_path)
for line in result:
print(line)
# 显示结果
from PIL import Image
image = Image.open(img_path).convert('RGB')
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts, scores, font_path='/path/to/PaddleOCR/doc/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
结果是一个list,每个item包含了文本框,文字和识别置信度
[[[24.0, 80.0], [172.0, 80.0], [172.0, 104.0], [24.0, 104.0]], ['产品信息/参数', 0.98069626]]
[[[24.0, 109.0], [333.0, 109.0], [333.0, 136.0], [24.0, 136.0]], ['(45元/每公斤,100公斤起订)', 0.9676722]]
......
结果可视化
- 单独执行检测
from paddleocr import PaddleOCR, draw_ocr
ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs/11.jpg'
result = ocr.ocr(img_path,rec=False)
for line in result:
print(line)
# 显示结果
from PIL import Image
image = Image.open(img_path).convert('RGB')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
结果可视化

- 单独执行识别
ocr = PaddleOCR() # need to run only once to download and load model into memory
img_path = 'PaddleOCR/doc/imgs_words/ch/word_1.jpg'
result = ocr.ocr(img_path,det=False)
for line in result:
print(line)
结果是一个list,每个item只包含识别结果和识别置信度
['韩国小馆', 0.9907421]
查看帮助信息
paddleocr -h
- 检测+识别全流程
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg
- 单独执行检测
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --rec false
结果是一个list,每个item只包含文本框
[[26.0, 457.0], [137.0, 457.0], [137.0, 477.0], [26.0, 477.0]]
[[25.0, 425.0], [372.0, 425.0], [372.0, 448.0], [25.0, 448.0]]
[[128.0, 397.0], [273.0, 397.0], [273.0, 414.0], [128.0, 414.0]]
......
- 单独执行识别
paddleocr --image_dir PaddleOCR/doc/imgs_words/ch/word_1.jpg --det false
结果是一个list,每个item只包含识别结果和识别置信度
当内置模型无法满足需求时,需要使用到自己训练的模型。 首先,参照inference.md 第一节转换将检测和识别模型转换为inference模型,然后按照如下方式使用
paddleocr --image_dir PaddleOCR/doc/imgs/11.jpg --det_model_dir {your_det_model_dir} --rec_model_dir {your_rec_model_dir}