Channel: GhostScript – Ephesoft Docs

Create Multi Page Files (plugin)


2.4 Feature/Change

The Ghostscript command used for PDF creation is now configurable through the admin UI in the export plugin. A new plugin property has been added to the Create Multi Page Files plugin that lets the user set the parameters for PDF creation. This makes it possible to create custom PDF files such as PDF/A.





Convert PDF to TIF


The ImageMagick convert command takes the following form:

convert.exe [parameters for opening the file] inputfile [parameters for saving the file] output

Since PDFs are opened at 72 DPI by default, you need to open the file at a higher resolution. Suggested command:

convert -density 300 input.pdf -compress group4 output.tif

Here 300 is the DPI. You can use other values if you'd like. If -density changes the page size, use "-resample 300x300" instead of "-density 300x300". See here: http://www.imagemagick.org/script/command-line-options.php#resample

ImageMagick uses Ghostscript behind the scenes and can be slow on large PDF files. You can also call Ghostscript directly to convert PDFs to TIFFs. Commands:

gswin64c.exe -dNOPAUSE -r300 -sDEVICE=tiff24nc -dBATCH -sOutputFile=output.tif input.pdf

convert output.tif -compress group4 finaloutput.tif

-r300 specifies the DPI. You can change it.
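The two approaches above can be sketched as argument lists in Python. This is only an illustration: the function names are hypothetical, the executables are assumed to be on the PATH, and nothing is executed until the list is passed to subprocess.run.

```python
from typing import List


def imagemagick_pdf_to_tif(pdf: str, tif: str, dpi: int = 300) -> List[str]:
    """Build the ImageMagick command: read options go before the input file,
    write options go after it."""
    return ["convert", "-density", str(dpi), pdf, "-compress", "group4", tif]


def ghostscript_pdf_to_tif(pdf: str, tif: str, dpi: int = 300,
                           gs_exe: str = "gswin64c.exe") -> List[str]:
    """Build the Ghostscript command from the article with a configurable DPI."""
    return [gs_exe, "-dNOPAUSE", f"-r{dpi}", "-sDEVICE=tiff24nc", "-dBATCH",
            f"-sOutputFile={tif}", pdf]


if __name__ == "__main__":
    # Print the commands; pass a list to subprocess.run(...) to execute it.
    print(" ".join(imagemagick_pdf_to_tif("input.pdf", "output.tif")))
    print(" ".join(ghostscript_pdf_to_tif("input.pdf", "output.tif")))
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.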

Increase Performance/Output Quality Using Different Ghostscript commands


-sDEVICE parameter can be set to the following values:

-sDEVICE=tiff12nc

Produces 12-bit RGB output

-sDEVICE=tiff24nc

Produces 24-bit RGB output

-sDEVICE=tiff48nc

Produces 48-bit RGB output

-sDEVICE=tiff32nc

Produces 32-bit CMYK output

-sDEVICE=tiff64nc

Produces 64-bit CMYK output

-sDEVICE=tiffscaled24 -sCompression=lzw

Produces a 24-bit RGB image. The -sCompression option sets the compression used in the TIFF file (LZW here), which reduces the size of the image.

-sDEVICE=tifflzw

Produces black-and-white output and can be combined with various compression options.

The following results were produced by splitting a PDF with the specifications below under different Ghostscript parameters:

PDF size: 514 KB. Number of pages in PDF: 26. Note: the PDF contained a mixture of color and black-and-white images.

[Image: Ghostscript-stats]

To retain the color of the images, use -sDEVICE=tiffscaled24 -sCompression=lzw. If no color images are present, or color is not important to the user, use -sDEVICE=tifflzw. NOTE: These options are available only from Ghostscript 9.04 onwards, so make sure the required version is installed. The installer for Ephesoft 3.0 and later includes Ghostscript 9.05.
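The recommendation above can be expressed as a small helper. This is a sketch only: the function names are hypothetical, and the version check assumes Ghostscript version strings of the form "9.05".

```python
from typing import List, Tuple


def tiff_device_args(keep_color: bool) -> List[str]:
    """Return the device arguments recommended in the article."""
    if keep_color:
        # 24-bit RGB output with LZW compression
        return ["-sDEVICE=tiffscaled24", "-sCompression=lzw"]
    # Black-and-white output
    return ["-sDEVICE=tifflzw"]


def parse_gs_version(version: str) -> Tuple[int, int]:
    """Turn a version string such as '9.05' into (9, 5)."""
    major, minor = version.strip().split(".")[:2]
    return int(major), int(minor)


def supports_scaled_devices(version: str) -> bool:
    """The article states these options need Ghostscript 9.04 or later."""
    return parse_gs_version(version) >= (9, 4)
```

The version string can be obtained by running the Ghostscript executable with the --version option.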

Ghostscript – PDF to Tiff Conversion


Purpose

Ghostscript is used to convert PDF files into single-page TIFF files that can be used for learning and testing.

Running Ghostscript from command line

1. Navigate to Ephesoft\dependencies\gs\bin (on a 32-bit system, navigate to Ephesoft\dependencies\gs32bit\bin).

2. Decide which Ghostscript executable to use: gswin64 or gswin64c (gswin32 or gswin32c on a 32-bit system).

3. Open a command prompt as administrator and run the following command:

Ephesoft\Dependencies\gs\bin\gswin64c.exe -dNOPAUSE -r300 -sDEVICE=tiffscaled24 -sCompression=lzw -dBATCH -sOutputFile=\folder\to\output-%04d.tif \folder\to\input.pdf > result.txt

EXAMPLE

Ghostpdftotifcmd.png

Make sure to specify both the folder where the TIF files should be written and the location of the PDF you wish to convert.

4. After the command has run, the files are written to the folder specified for output.

EXAMPLE

Ghostpdftotifoutput.png

Your files are now ready to be used for learning and testing.
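The steps above can be sketched in Python as follows. The function names and the "output-%04d.tif" naming are illustrative assumptions; the Ghostscript options are the ones used in the command above. The %04d in the output file name is replaced by Ghostscript with a zero-padded page number, which is what produces one TIFF per page.

```python
import subprocess
from pathlib import Path
from typing import List


def split_pdf_command(gs_exe: str, pdf: str, out_dir: str, dpi: int = 300) -> List[str]:
    """Build the single-page TIFF conversion command."""
    pattern = str(Path(out_dir) / "output-%04d.tif")
    return [gs_exe, "-dNOPAUSE", f"-r{dpi}", "-sDEVICE=tiffscaled24",
            "-sCompression=lzw", "-dBATCH", f"-sOutputFile={pattern}", str(pdf)]


def page_file_name(page: int) -> str:
    """File name expected for a given 1-based page number."""
    return "output-%04d.tif" % page


def run_and_log(cmd: List[str], log_file: str = "result.txt") -> int:
    """Run the command and write its console output to a log file,
    like the '> result.txt' redirection in the article."""
    with open(log_file, "w") as log:
        return subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT).returncode
```

A return code of 0 from run_and_log indicates that Ghostscript finished without reporting an error; result.txt holds the details otherwise.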




List of command line options for all Ephesoft tools


Following are the commands for Ephesoft’s included applications:

  1. RECOSTAR HOCRING COMMAND

RecostarPlugin.exe "rsbatch file path" "FPR.rsp file path" bd OCR-core-mentioned-in-license "recostar license file path" _HOCR.xml ocrConfidenceSwitch deskewSwitch OFF/encryptionKey RECOSTAR_HOCR

Example:

RecostarPlugin.exe "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI25\FileList.rsbatch" "F:\Ephesoft31\SharedFolders\BC5\fixed-form-extraction\Fpr.rsp" bd 4 "F:\\Ephesoft31\\Application/native/RecostarPlugin/bin\RSO2-NET.707" _HOCR.xml ON OFF OFF RECOSTAR_HOCR

RecostarPlugin.exe "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI26\FileList.rsbatch" "F:\Ephesoft31\SharedFolders\BC5\fixed-form-extraction\Fpr.rsp" bd 8 "F:\\Ephesoft31\\Application/native/RecostarPlugin/bin\RSO2-NET.707" _HOCR.xml ON OFF f97beb0a4808f0cd0ce1a8fa7d9b686e RECOSTAR_HOCR

  2. RECOSTAR TIFF TO PDF CONVERSION COMMAND

RecostarPlugin.exe "TifToPDF rsbatch file path" "FPR_pdf.rsp file path" pdf OCR-core-mentioned-in-license "recostar license file path" "output pdf file path" OFF OFF OFF RECOSTAR_HOCR

Example:

RecostarPlugin.exe "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI24\TifToPdf.rsbatchDOC2" "F:\Ephesoft31\SharedFolders\BC5\fixed-form-extraction\FPR_Pdf.rsp" pdf 4 "F:\\Ephesoft31\\Application/native/RecostarPlugin/bin\RSO2-NET.707" "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI24\tempfile_BI24_documentDOC2.pdf" OFF OFF OFF RECOSTAR_HOCR

 

  3. RECOSTAR PDF TO TIFF CONVERSION COMMAND

saveimage.exe "pdf file path" compressionRatio appendPadding

Example:

saveimage.exe "pdf file path" 70 true

saveimage.exe "pdf file path" 10 false

  4. RECOSTAR EXTRACTION COMMAND

RecostarPlugin.exe "rsbatch file path" "extraction rsp file path" xml OCR-core-mentioned-in-license "recostar license file path" ++++ OFF OFF OFF RECOSTAR_HOCR

Example:

RecostarPlugin.exe "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI2A\aaaa.rsp.rsbatch" "F:\Ephesoft31\SharedFolders\BC4\fixed-form-extraction\aaaa.rsp" xml 4 "F:\\Ephesoft31\\Application/native/RecostarPlugin/bin\RSO2-NET.707" ++++ OFF OFF OFF RECOSTAR_HOCR

 

  5. GHOSTSCRIPT PDF TO TIFF CONVERSION COMMAND

gswin64c.exe -dNOPAUSE -r300 -sDEVICE=tiffscaled24 -sCompression=lzw -dBATCH -sOutputFile="output-tiff-filename-%04d.tif" "input pdf file path"

Example:

gswin64c.exe -dNOPAUSE -r300 -sDEVICE=tiffscaled24 -sCompression=lzw -dBATCH -sOutputFile="F:\Ephesoft31\SharedFolders\mailroom-import-copy1\multipage-pdf-pdf-1407824575554\multipage-pdf-%04d.tif" "F:\Ephesoft31\SharedFolders\mailroom-import-copy1\multipage-pdf-pdf-1407824575554\multipage-pdf.pdf"

 

  6. GHOSTSCRIPT PDF OPTIMIZATION COMMAND

gswin64c.exe -q -dNODISPLAY -P- -dSAFER -dDELAYSAFER -- pdfopt.ps "input pdf file path" "output pdf file path"

Example:

gswin64c.exe -q -dNODISPLAY -P- -dSAFER -dDELAYSAFER -- pdfopt.ps "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI25\tempfile_BI25_documentDOC1.pdf" "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI25\BI25_documentDOC1.pdf"

 

  7. IMAGEMAGICK COMMANDS

Convert.exe conversion-param "input-file-path" "output-file-path"

Example:

[TIFF TO TIFF Conversion] Convert.exe -limit area 100mb "F:\Ephesoft31\SharedFolders\mailroom-import-copy2\us-invoice\US-Invoice.tif" -compress LZW "F:\Ephesoft31\SharedFolders\mailroom-import-copy2\us-invoice\US-Invoice-%04d.tif"

[TIFF TO PNG Conversion] Convert.exe "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI23\BI23_0.tif" -colorspace gray -alpha off "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI23\BI23_PG0.png"

[TIFF TO PNG Conversion] Convert.exe "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI23\BI23_0.tif" -colorspace rgb -thumbnail 200x150 "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI23\BI23_PG0_displayThumb.png"

[COLORED TIFF TO PDF Conversion] Convert.exe test.tif -quality 100.0 -compress LZW out.pdf

[NON-COLORED TIFF TO PDF Conversion] Convert.exe test.tif -quality 100.0 -monochrome -compress LZW out.pdf

[PDF TO TIFF Conversion] Convert.exe -limit area 100mb "F:\Ephesoft31\SharedFolders\BC4\test-classification\multipage-pdf.pdf" -compress LZW "F:\Ephesoft31\SharedFolders\BC4\test-classification\multipage-pdf-%04d.tif"

 

  8. TESSERACT HOCR COMMAND

TesseractConsole.exe "input TIFF file path" "output html file path without .html extension" "-l eng" +"hocr.txt file path"

Example:

"F:\\Ephesoft31\\Application/native/Tesseract-OCR\TesseractConsole.exe" "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI29\BI29_5.tif" "F:\Ephesoft31\SharedFolders\ephesoft-system-folder\BI29\BI29_PG5" "-l eng" +"F:\\Ephesoft31\\Application/native/Tesseract-OCR\hocr.txt"

 

  9. ZXING COMMAND TO EXTRACT BARCODE VALUE

ZXing does not work on TIFF files, but it works on PNG files.

java -cp zxing-1.6.0.jar;.; com.google.zxing.client.j2se.CommandLineRunner "png file path"



KB0008528: Font substitution in Ghostscript


KB Articles

KB Article #8528

 

Topic/Category: GhostScript Font Substitution.

Issue: Ephesoft is not reading a PDF correctly, and the fonts in Review or Validate appear different from those in the original PDF.

Root Cause: When Ghostscript converts a PDF to single-page TIF files during folder import, Ghostscript converts the fonts to a more readable font. Some fonts, such as Lucida, are not supported by Ghostscript out of the box and are therefore substituted with a different font, such as Helvetica.

Solution:

Instructions: The font used in the document has to be added to the Ghostscript font library manually.

  1. Open the following file in Notepad: C:\Ephesoft\Application\native\RecostarPlugin\bin\GsRenderer\Fontmap.GS
  2. Each standard entry in this file gives the name of the original font in the document, followed by the name of the font to which it should be converted. If a font does not exist in the ghostscript directory
    /TimesNewRoman                          /TimesNewRomanPSMT ;
    /TimesNewRoman,Bold                  /TimesNewRomanPS-BoldMT ;
    /TimesNewRoman,Italic                  /TimesNewRomanPS-ItalicMT ;
    /TimesNewRoman,BoldItalic           /TimesNewRomanPS-BoldItalicMT ;
  3. At the bottom of the file, create a new entry with your font followed by the font to which it should be converted. In this example, LucidaSans,Bold is the original font and it is converted to the Windows TrueType font l_10646:
    /LucidaSans,Bold (C:/Windows/Fonts/l_10646.ttf) ;
    /Dialog (C:/Windows/Fonts/l_10646.ttf) ;
  4. Save the Fontmap.GS file and restart Ephesoft.
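Steps 3 and 4 can be scripted. The following is a minimal sketch, assuming the entry format shown in the example above; the function names are hypothetical, and the Fontmap.GS file should be backed up before it is modified.

```python
def fontmap_file_entry(font_name: str, font_file: str) -> str:
    """Format an entry that maps a font name to a font file on disk.
    Forward slashes are used in the path, as in the example entries."""
    return "/%s (%s) ;" % (font_name, font_file.replace("\\", "/"))


def append_fontmap_entry(fontmap_path: str, entry: str) -> bool:
    """Append the entry at the bottom of the file unless it is already there.
    Returns True when the file was changed."""
    with open(fontmap_path, "r", encoding="utf-8") as fh:
        existing = [line.strip() for line in fh]
    if entry in existing:
        return False
    with open(fontmap_path, "a", encoding="utf-8") as fh:
        fh.write("\n" + entry + "\n")
    return True
```

For example, fontmap_file_entry("LucidaSans,Bold", r"C:\Windows\Fonts\l_10646.ttf") yields the first entry shown in step 3. Ephesoft still has to be restarted afterwards.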

KB0010266 – Error out at Folder Import process.


KB Article #: KB0010266

Topic/Category: GhostScript

Applies to: All versions.

Issue: During the Folder Import process, the IMPORT_MULTIPAGE_FILES module errors out because the GhostScript directory is missing.

Root Cause: The GhostScript directory is empty.

Solution:

Steps to resolve:

  1. Stop Ephesoft service.
  2. Go to <Ephesoft installed directory>\Dependencies and remove the gs directory.
    For 32-bit server: Rename gs32bit folder to gs folder.
    For 64-bit server: Rename gs64bit folder to gs folder.
  3. Start Ephesoft service.
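Before applying the fix, it can help to confirm the root cause. The sketch below checks whether the gs directory under Dependencies is missing or empty; the function name is hypothetical and the installation path must be supplied by the caller.

```python
from pathlib import Path


def gs_directory_is_usable(dependencies_dir: str) -> bool:
    """Return True when <dependencies_dir>/gs exists and contains at least one entry."""
    gs_dir = Path(dependencies_dir) / "gs"
    return gs_dir.is_dir() and any(gs_dir.iterdir())
```

If the function returns False for the Dependencies folder of the installation, the steps above apply.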

 


KB0012936 – Exception in breaking the input file. Converted Tiff files count not equal to the TIFF pages count.


KB Article #: 12936

Issue:

When processing batches, the user gets the exception below during the FOLDER IMPORT process.

“Exception in breaking the input file. Converted Tiff files count not equal to the TIFF pages count.”

Solution:

1. The error could be due to the length of the file path. Recostar supports a maximum of 255 characters in the file path and Ghostscript supports a maximum of 238 characters. If the path exceeds the maximum, the batch will fail.

2. The user can modify the IMPORT_MULTIPAGEFILE plugin under the FOLDER IMPORT module: change PDF To TIFF Conversion Process to "Recostar".
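The limits in point 1 can be checked before a file is dropped into the import folder. This is a sketch only: the limits are the ones quoted above, while the function name and the engine keys are illustrative.

```python
# Maximum file path lengths quoted in the article, in characters.
MAX_PATH_LENGTH = {"recostar": 255, "ghostscript": 238}


def path_within_limit(file_path: str, engine: str) -> bool:
    """Return True when the full file path fits the limit of the given engine."""
    return len(file_path) <= MAX_PATH_LENGTH[engine.lower()]
```

A path of 240 characters, for instance, passes the Recostar check but fails the Ghostscript check, which matches the suggestion in point 2 to switch the conversion process to Recostar.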



KB20797: Ghostscript fails to convert PDF to TIFF due to file produced in Word 2016


Issue Description:

Batches fail in the Folder Import module when the PDF file was produced in Word 2016.

 

Error Message:

Exception in breaking the input file. The command: “[gs, -dNOPAUSE, -r300, -sDEVICE=tiff24nc, -sCompression=lzw, -dBATCH, -dNumRenderingThreads=64, -sOutputFile=/ephesoft/ShareFolders/Encore_SG_v1/BC1F_ephesoft_17-01-201803_16_33/_pdf_processing_folder/1-%04d.tif, /ephesoft/ShareFolders/Encore_SG_v1/BC1F_ephesoft_17-01-201803_16_33/_pdf_and_tiff_backup/1.pdf] executing in the working directory: “/usr/local/bin” failed to execute successfully due to: Process execution Time-out

 

Note that the above exception may also be caused by other errors. The only way to confirm that the Folder Import module is failing because of a file produced in Word 2016 is to run the Ghostscript command from the command line.

 

Correct Error Message:

 

ROOT CAUSE:

The current version of Ghostscript cannot process PDF files produced by Microsoft Word 2016.

WORKAROUND:

NOTE: These upgrade steps have only been tested on Linux with Ephesoft Transact 4.1.2.0 and are intended for customers on a similar environment. For other environments, the issue is fixed in Ephesoft Transact 4.5 and customers will need to wait for that resolution. It is expected that a different engine will be used to create PDF files.

The current workaround for this version is to upgrade Ghostscript to version 9.22.

The steps below upgrade Ghostscript to 9.22 on a Linux environment.

Steps to upgrade Ghostscript:

  1. Download the ghostscript-9.22-linux-x86_64.tgz file (Ghostscript AGPL Release, matching the 32-bit or 64-bit system) from http://www.ghostscript.com/download/gsdnld.html
  2. Extract the archive and copy "gs-922-linux_x86_64".
  3. Rename the file using the following command: mv gs-922-linux_x86_64 gs
  4. Set permissions on the file: chmod 755 gs
  5. Check where the current Ghostscript is installed using the command: type gs
  6. Rename the existing Ghostscript using the command below in the path where the current Ghostscript is installed: mv ${current_file_path}$ ${new_file_path}$
  7. Execute the following command: ln -s /path_where_new_gs_copied_with_file_name/ /path_where_current_gs_installed_with_file_name/

After upgrading Ghostscript, restart the Ephesoft service and restart the batch.
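Steps 4, 6 and 7 can be sketched in Python using the standard os module. The function name and all three paths are hypothetical placeholders; on a real system the script would need sufficient privileges, and the Ephesoft service should be stopped first.

```python
import os


def replace_gs_with_symlink(current_gs: str, new_gs: str, backup_path: str) -> None:
    """Back up the installed gs binary and point its old path at the new binary."""
    os.chmod(new_gs, 0o755)              # step 4: chmod 755 gs
    os.rename(current_gs, backup_path)   # step 6: mv current new
    os.symlink(new_gs, current_gs)       # step 7: ln -s new current
```

Keeping the renamed original makes it possible to roll back by removing the link and renaming the backup to its old name.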

 

KB00021994: Different results captured when defining KV overlays in KV Extraction


Issue:
During KeyValue training, the overlay boxes do not return the correct content of the document, so regular expressions cannot be created. The selected key overlay extracts completely different content.

Analysis:

The issue is seen when the Ghostscript parameters are changed but ghostScript.command.png.params in the application.properties file is not updated with the same parameters. To get correct results, make sure that the Ghostscript parameters in use and ghostScript.command.png.params in application.properties match.

In the above case, the GhostScript property was set with density -r200 without changing ghostScript.command.png.params in application.properties, which caused the irregularities in the OCR results.
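A quick consistency check of the two parameter strings can be sketched as follows. The function names are hypothetical, and the sketch assumes both values are plain Ghostscript option strings in which the resolution is given as -r followed by a number.

```python
import re
from typing import Optional


def resolution_of(params: str) -> Optional[int]:
    """Extract the DPI from a Ghostscript option string such as '-dNOPAUSE -r300 ...'."""
    match = re.search(r"(?:^|\s)-r(\d+)", params)
    return int(match.group(1)) if match else None


def resolutions_match(tiff_params: str, png_params: str) -> bool:
    """Return True when both option strings use the same -r value."""
    return resolution_of(tiff_params) == resolution_of(png_params)
```

With the plugin set to -r200 and the PNG parameters still at -r300, resolutions_match returns False, which is the situation described in the analysis.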

 

KB0022530 – Blank documents are getting generated with particular PDF.

KB Article #: KB0022530
Topic/Category: Folder Import module
Issue: A particular PDF file produces blank documents during PDF to TIFF conversion. When using Recostar, the batch errors out at the Folder Import module; when using Ghostscript, the generated documents are blank.
Solution: The PDF itself has some issues, which is why it fails with Recostar. It also fails with Ghostscript v9.05, available in Ephesoft v4110, but the issue cannot be reproduced with Ghostscript v9.22, available in Ephesoft v4500. Either correct the PDF at the originating source or upgrade to v4.5.
An Adobe tool also reports the following problems for the PDF:
a. Font not embedded
b. Required key missing
c. Value for this key not an indirect object

Convert PDFs to TIF Files


Converting PDFs to TIF Files

Overview

Ephesoft uses Imagemagick® for converting PDFs to TIF files.

Imagemagick® uses the following command syntax:

convert.exe [parameters for opening the file] inputfile [parameters for saving the file] output

Since PDFs are opened at 72 DPI by default, open them at a higher resolution. Recommended command:

convert -density 300 input.pdf -compress group4 output.tif

Where:
  • -density 300 specifies the DPI (value is configurable)

Imagemagick® uses Ghostscript behind the scenes, which may slow the process on large PDF files. It is also possible to use Ghostscript directly to convert PDFs to TIFFs.

Command example:

gswin64c.exe -dNOPAUSE -r300 -sDEVICE=tiff24nc -dBATCH -sOutputFile=output.tif input.pdf
convert output.tif -compress group4 finaloutput.tif
  • -r300 specifies the DPI (value is configurable)



Ghostscript – PDF to TIFF Conversion


PDF to TIFF Conversion using Ghostscript

Purpose

Ephesoft uses Ghostscript to convert PDFs to single-page TIF files for machine learning and testing.

Running Ghostscript from the command line

  1. Navigate to Ephesoft\dependencies\gs\bin (on a 32-bit system, navigate to Ephesoft\dependencies\gs32bit\bin).
  2. Decide which Ghostscript executable to use: gswin64 or gswin64c (gswin32 or gswin32c on a 32-bit system).
  3. Open a command prompt as administrator and run the following command:
{Ephesoft}\Dependencies\gs\bin\gswin64c.exe -dNOPAUSE -r300 -sDEVICE=tiffscaled24 -sCompression=lzw -dBATCH -sOutputFile="\folder\to\output-%04d.tif" "\folder\to\input.pdf" > result.txt

Make sure to specify both the location where the TIF files should be written and the location of the PDF you wish to convert.

  4. After the command has run, the files are written to the folder specified for output.

Output files are now ready to be used for machine learning and testing.





Note on the ImageMagick PDF to TIFF command in "List of command line options for all Ephesoft tools": Ephesoft itself does not use ImageMagick to convert PDF to TIFF. To run that conversion through an external command, either copy the gs32bit files to the gs directory path, or override the path in <Ephesoft>\Dependencies\ImageMagick\delegates.xml by replacing @PSDelegate@ with the complete path of the gs executable to be run. By default, ImageMagick constructs this path from the registry entry "HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\GPL Ghostscript".

 







