diff --git a/README.md b/README.md
index 7f07d6e..35d7bcc 100644
--- a/README.md
+++ b/README.md
@@ -341,3 +341,36 @@
 else
 fi
 ```
+**OCR installation on Alpine (for server-side autofill)**
+```bash
+apk update
+apk add --no-cache \
+  tesseract-ocr \
+  tesseract-ocr-data-spa \
+  tesseract-ocr-data-eng \
+  python3 py3-pip py3-virtualenv \
+  build-base python3-dev musl-dev \
+  libffi-dev openssl-dev zlib-dev \
+  jpeg-dev tiff-dev freetype-dev lcms2-dev openjpeg-dev
+
+python3 -m venv /opt/saludut/venv
+. /opt/saludut/venv/bin/activate
+
+pip install --upgrade pip
+pip install pdfplumber pymupdf pytesseract pillow
+```
+
+In `backend/.env`:
+```
+PYTHON_PATH=/opt/saludut/venv/bin/python
+TESSERACT_PATH=/usr/bin/tesseract
+TESSDATA_PREFIX=/usr/share/tessdata
+```
+
+Verification:
+```bash
+tesseract --list-langs | grep spa
+/opt/saludut/venv/bin/python -c "import pytesseract; print(pytesseract.get_tesseract_version())"
+```
+
+
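
The three `backend/.env` settings can also be sanity-checked from Python before the backend starts. The following is only a minimal sketch using the standard library: the helper names `parse_env` and `check_ocr_config` are hypothetical and not part of this repository, and it assumes that `TESSDATA_PREFIX` points directly at the directory holding the `*.traineddata` files, as the value `/usr/share/tessdata` suggests.

```python
from pathlib import Path

# Keys taken from the backend/.env snippet in the README patch.
REQUIRED_KEYS = ("PYTHON_PATH", "TESSERACT_PATH", "TESSDATA_PREFIX")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_ocr_config(values, lang="spa"):
    """Return a list of problems found; an empty list means the paths look fine."""
    problems = []
    for key in REQUIRED_KEYS:
        if not values.get(key):
            problems.append(f"{key} is not set")
    # Both of these settings should point at an existing executable file.
    for key in ("PYTHON_PATH", "TESSERACT_PATH"):
        path = values.get(key)
        if path and not Path(path).is_file():
            problems.append(f"{key} does not point to a file: {path}")
    # Assumption: language models live directly under TESSDATA_PREFIX.
    prefix = values.get("TESSDATA_PREFIX")
    if prefix:
        model = Path(prefix) / f"{lang}.traineddata"
        if not model.is_file():
            problems.append(f"missing language data: {model}")
    return problems
```

A possible use is `check_ocr_config(parse_env(Path("backend/.env").read_text()))`, printing each returned problem and aborting startup if the list is not empty. It complements, rather than replaces, the `tesseract --list-langs` check, since it only inspects paths and never runs Tesseract.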