What is PDF Converter?
 
 


A PDF converter converts a PDF file into another file format, such as Word, Excel, PowerPoint, plain text, HTML, or an image. It must understand both the PDF document structure and the structure of the target file format. For instance, a PDF to Word converter must know PDF objects and the Word file structure. There is no one-to-one mapping between PDF objects (text streams, images, shapes, etc.) and Word document elements, so the converter has to create a compatible Word document element for each PDF object. The process is further complicated because PDF object attributes differ between PDF versions.
PDF can be converted to various document formats: doc, docx, xml, rtf, xls, xlsx, htm, etc. It can also be converted to many image formats:
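One way to see why the mapping is not one-to-one: a PDF page typically stores text as separately positioned runs, while a Word document stores lines and paragraphs. The sketch below is only an illustration under stated assumptions, not how any particular converter works. The `TextRun` type, the `group_runs_into_lines` function, and the 2-point tolerance are all hypothetical. It groups runs that share roughly the same baseline into one line of text, using the PDF convention that y grows upward from the bottom of the page.

```python
from dataclasses import dataclass


@dataclass
class TextRun:
    """Hypothetical stand-in for one positioned piece of text on a PDF page."""
    x: float   # horizontal position in points
    y: float   # baseline position in points (PDF y grows upward)
    text: str


def group_runs_into_lines(runs, y_tolerance=2.0):
    """Merge runs whose baselines are within y_tolerance into one line each."""
    lines = []  # each entry: (baseline_y, [runs on that line])
    # Visit runs from the top of the page downward.
    for run in sorted(runs, key=lambda r: (-r.y, r.x)):
        if lines and abs(lines[-1][0] - run.y) <= y_tolerance:
            lines[-1][1].append(run)
        else:
            lines.append((run.y, [run]))
    # Within a line, order runs left to right and join them with spaces.
    return [
        " ".join(r.text for r in sorted(group, key=lambda r: r.x))
        for _, group in lines
    ]
```

For example, three runs at (72, 700), (110, 700.5), and (72, 680) would become two lines, because the first two share a baseline within the tolerance. A real converter would also need fonts, spacing, and paragraph detection, which this sketch leaves out.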
AVS           JBIG         PGM          SUN
BMP Mono      JNG          PGM RAW      SVG
BMP Gray      JP2          PGNM         TGA
BMP Sep1      JPC          PGNM RAW     TIF Gray
BMP Sep8      JPG          PNG Mono     TIF 12 bit RGB
BMP 4 bit     JPG Gray     PNG Gray     TIF 24 bit RGB
BMP 8 bit     MNG          PNG 4 bit    TIF 48 RGB
BMP 24 bit    MPEG         PNG 8 bit    TIF 32 bit CMYK
BMP 32 bit    M2V          PNG 24 bit   TIF 64 bit CMYK
CIN           MTV          PKSM         TIF G3Fax no RLE
CMYK          OTB          PKSM RAW     TIF G3Fax RLE
CMYKA         P7           PKM          TIF 2DG3Fax
DCX           PALM         PKM RAW      TIF G4Fax
DIB           PAM          PNM          TIF LZW
DPX           PBM          PNM RAW      TIF PackBits
EMF           PBM RAW      PPM          TIF Sep
EPS 1         PCD          PPM RAW      TIF Sep1
EPS 1 Color   PCDS         PS 1         UIL
EPS 2         PCL          PS 1 Color   UYVY
FAX G3        PCX Mono     PS2          VICAR
FAX 2DG3      PCX Gray     PSD CMYK     VIFF
FAX G4        PCX 4 bit    PSD RGB      WBMP
FITS          PCX 8 bit    PTIF         XBM
GIF           PCX 24 bit   PXL Mono     XPM
GPLT          PCX CMYK     PXL Color    XWD
INFO          PDB          SGI          YCbCr
Several layout options are available for PDF conversion. The most used option converts the PDF with the same layout, keeping text, images, shapes, etc. Other options are formatted text, plain text, or simply extracting the images from the PDF.
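The layout options can be thought of as filters over the objects found on a page. The following is a minimal sketch under that assumption. The `Layout` enum, the `KEPT_KINDS` table, the `select_objects` function, and the dictionary shape used for page objects are all hypothetical names chosen for illustration and do not describe any specific product.

```python
from enum import Enum


class Layout(Enum):
    """Hypothetical names for the layout options described above."""
    SAME_AS_PDF = "same layout"
    FORMATTED_TEXT = "formatted text"
    PLAIN_TEXT = "plain text"
    IMAGES_ONLY = "images only"


# Which kinds of page objects each option keeps (an assumption of this sketch).
KEPT_KINDS = {
    Layout.SAME_AS_PDF: {"text", "image", "shape"},
    Layout.FORMATTED_TEXT: {"text"},
    Layout.PLAIN_TEXT: {"text"},
    Layout.IMAGES_ONLY: {"image"},
}


def select_objects(objects, layout):
    """Return the page objects that the chosen layout option would carry over."""
    kept = [obj for obj in objects if obj["kind"] in KEPT_KINDS[layout]]
    if layout is Layout.PLAIN_TEXT:
        # Plain text drops styling such as the font and keeps only the characters.
        kept = [{"kind": "text", "text": obj["text"]} for obj in kept]
    return kept
```

With a page holding one text object, one image, and one shape, `Layout.IMAGES_ONLY` would keep only the image, while `Layout.PLAIN_TEXT` would keep the text with its font information removed.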
Some PDF documents have text on images. Scanned PDFs usually result in text on images. Such text can be extracted through OCR (optical character recognition). Almost all GIRDAC PDF Converters use OCR technology to extract text and formatting from images.
PDF Converters do not convert documents that have either of these security settings:
Content Copying: Not Allowed
or
Page Extraction: Not Allowed
This information is shown in Adobe Reader, in the top-level menu under
File -> Document Properties, on the Security tab.
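These settings come from permission flags stored in the PDF itself. In the standard security handler defined by the PDF specification, the encryption dictionary has an integer entry P whose bits grant individual permissions, and bit 5 (value 16) covers copying or extracting text and graphics. The sketch below assumes that the "Content Copying" label corresponds to that bit. It does not cover "Page Extraction", and the function name `can_copy_content` is hypothetical. Reading P out of a file is also left out, so the function takes the integer directly.

```python
COPY_CONTENT_BIT = 1 << 4  # bit 5 in the specification's 1-based numbering


def can_copy_content(p_value):
    """Return True if the P permission flags allow copying text and graphics.

    p_value is the integer from the encryption dictionary's P entry. It is
    usually written as a signed 32-bit number, so it is masked to 32 bits first.
    """
    flags = p_value & 0xFFFFFFFF
    return bool(flags & COPY_CONTENT_BIT)
```

For example, a P value of -4 has every permission bit set, so copying is allowed, while -3904 clears bits 3 through 6, so copying is not allowed.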