If you upload a PDF file or a scanned image to a project, Smartcat will automatically recognize the text in it using Optical Character Recognition (OCR) and convert the file into a Word (DOCX) file.
The result of this process depends greatly on the image quality. Before you start translating, make sure that the file has been processed correctly and its content recognized without errors. In addition, if you need to translate the document into multiple languages, adjusting its formatting before translation saves time, because you won’t need to correct the same formatting issues separately for each language at the end of the process.
Checking and correcting the source text and layout
When creating a new project or after uploading a file to an existing project, select the Check and correct the source layout option. This way you will be able to check the OCR results and make the necessary corrections before you start translating.
Note: A post-translation layout check is also often recommended for this type of document. See the section on checking the post-translation layout below for details.
Please note that you have to choose this option separately for each file. If you choose to correct the layout, a paint roller icon will appear next to the file name. Once you’re done, finish creating the project.
Note: The files for which you chose to correct the layout will not be processed using any of the linguistic assets (translation memories or machine translation) until the layout verification is completed. Any statistics generated at that point will not include the word count from the files being processed this way.
Once on the project page, choose your file from the list and click Source layout and text check.
This will take you to the Source layout check page. Download the prepared file from the Work files section and make sure it has been properly recognized.
If everything is correct, press the Complete checking of source layout button. If not, make your corrections to the file and press the Upload button in the Work files area to upload the corrected file. Please note that the corrected file may only be uploaded in Word (DOCX) format.
After uploading the corrected file, press the Complete checking of source layout button. The corrected document will then be processed using your linguistic assets and the pretranslation rules that you have defined.
Note: As with any other task, you can assign the layout check task to anyone on your team or to suppliers from the Marketplace.
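Because the corrected file can only be uploaded in Word (DOCX) format, it can help to confirm locally that the file really is a DOCX before uploading it. The following is a minimal sketch in Python and is not part of Smartcat: the function name `looks_like_docx` is hypothetical, and the check relies on the assumption that a DOCX file is a ZIP archive with its main document part at the conventional location `word/document.xml`.

```python
import zipfile


def looks_like_docx(path):
    """Return True if `path` is a ZIP archive containing word/document.xml.

    Assumption: the main document part sits at its conventional location.
    This is a quick sanity check, not a full validation of the file.
    """
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return "word/document.xml" in archive.namelist()
```

A file that was merely renamed from .pdf or .doc to .docx would fail this check, which is a common cause of rejected uploads.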
If you notice after creating the project that the OCR results are not good enough, you can enable the source layout and text check in the document settings.
Click on the gear icon for the document you want to process and choose Check and correct source layout in the dialog box that appears.
Please note that any completed translations will be lost. The layout check then follows the same steps as described above.
Checking the post-translation layout
A translation is often significantly longer or shorter than the original, which can distort the layout of the completed file. To add a task to check the layout in the translated file, select the Check and correct the post-translation layout option when uploading the original file to your project.
Note: This option can be selected for all formats supported by Smartcat and the task can be assigned to team members or freelancers. This is helpful if you offer desktop publishing services to your customers and want to be able to assign that task from within Smartcat to benefit from our collaboration features.
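To get a rough idea of which parts of a document are most likely to need layout attention, you can compare the length of each translated segment with its source. The sketch below is an illustration only and does not use any Smartcat feature: the function name `flag_length_changes` and the thresholds 0.7 and 1.3 are hypothetical and should be tuned to your language pair and layout.

```python
def flag_length_changes(pairs, low=0.7, high=1.3):
    """Return (source, translation, ratio) for pairs whose character-length
    ratio (translation / source) falls outside the range [low, high].

    The default thresholds are illustrative assumptions, not recommendations.
    """
    flagged = []
    for source, translation in pairs:
        if not source:
            continue  # skip empty source segments to avoid dividing by zero
        ratio = len(translation) / len(source)
        if ratio < low or ratio > high:
            flagged.append((source, translation, round(ratio, 2)))
    return flagged


# Example input: a list of (source, translation) text pairs.
segments = [("Save", "Speichern"), ("Cancel", "Cancel")]
suspicious = flag_length_changes(segments)
```

Segments flagged this way, such as button labels or table cells that grow considerably, are good candidates to look at first during the post-translation layout check.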
Once the translation is completed, select the document in the list and click Post-translation layout check.
Download the translated file from the Work files area and check the layout.
If everything is correct, press the Save button. If not, make your corrections to the file, then click Upload in the Work files area and select the corrected file.
After uploading the corrected file, click Save. The corrected file can now be downloaded from the project page for client delivery.