Can Swordfish process PDF documents? Thread poster: Thomas Johansson
I will receive a PDF file with approx. 40,000 words and have been asked to process it with a CAT tool while generating a TM (for future versions). Is this something I can do with Swordfish?
Also, I got the impression Swordfish is written in Java (though I am not sure). Is it by any chance slow to work with or does it perform well?
Thomas
[Edited at 2011-04-21 19:12 GMT]
Hi,
Swordfish doesn't support PDF files. You will have to use OCR software to extract the text into a more workable format (.docx, for example).
Java is not slow; it is as fast as C++. Speed depends mostly on your hardware (memory and processor).
Regards,
Rodolfo
Laurent KRAULAND (X), France
PDF = pain in the back | Apr 22, 2011
Hi Thomas,
While I certainly understand that clients may have their reasons to request translations from PDF files, it must be said once again that PDF was designed as a non-editable format.
And if the document is as you described it, there must be an original in an editable and CAT-compatible format somewhere.
It is in an "editable" format | Apr 22, 2011
Well, it is in an "editable" format, at least in the sense that I can copy the text and paste it into, say, a Word document if I like. So OCR shouldn't really be needed. (I am not sure whether "editable" is the right word here, but it is in one of those modern PDF formats that started appearing a few years ago, where you can e.g. highlight text, copy it and paste it into some other file.)
Given this, is Swordfish still not able to process this? (I would prefer to deliver the translation back to the client as a PDF file, i.e. in the same format as the source file.)
Or otherwise, what CAT tool could process PDF files of this sort?
Thomas
PDF files are best served by OCR | Apr 22, 2011
PDF files are best served by OCR, unless your client has tools to convert PDF into some DTP formats that Swordfish can import/export.
OCR processing has its drawbacks, and you have to know the quirks of your OCR software. Many clients do not like the way FineReader formats documents, for example, so I learned to mark blocks for recognition manually to avert their rage.
I once prepared a 24-page PDF for translation manually; it had graphics, tables and occasional Greek characters. Afterwards it looked like the original PDF, but it took me more than two days!
Regards,
Piotr
Milos Prudek, Czech Republic
Print into PDF | May 6, 2011
"I would prefer to deliver the translation back to the client as a PDF file, i.e. in the same format as the source file."
Here is the workflow:
- Use OCR to convert PDF to MS Word (or read the text and translate it)
- Translate the MS Word file with any CAT
- Print the MS Word file into PDF (OpenOffice can do this)
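If you have many files, the last step of the workflow above can be scripted rather than done by hand. The following is only a rough sketch, assuming a LibreOffice-style `soffice` executable is on your PATH and supports headless conversion; the function names and the `translated.docx` file name are made up for illustration and are not part of Swordfish or any CAT tool.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Return the argument list for a headless Word-to-PDF conversion.

    Assumption: `soffice` is a LibreOffice-style binary that accepts
    --headless, --convert-to and --outdir.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_to_pdf(docx_path, out_dir, soffice="soffice"):
    """Run the conversion and return the path where the PDF is expected.

    Raises FileNotFoundError if the soffice binary cannot be found.
    """
    if shutil.which(soffice) is None:
        raise FileNotFoundError(f"{soffice!r} not found on PATH")
    subprocess.run(build_convert_command(docx_path, out_dir, soffice), check=True)
    # The output keeps the source file name with a .pdf extension.
    return Path(out_dir) / (Path(docx_path).stem + ".pdf")
```

Calling `convert_to_pdf("translated.docx", "out")` would then write the PDF into the `out` folder. The layout of the result still depends on how well the Word file was prepared in the earlier steps, so check the PDF against the original before delivery.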