Tag Archives: convert

Tutorial #4: Convert an Accessible Word Document into a DAISY eBook- 2

Introduction

The DAISY Add-In was designed to help content creators produce accessible documents from Microsoft Word documents for people with print disabilities. Installing the add-in lets you save Word documents automatically as DAISY XML and then as DAISY Digital Talking Books (DTBs). The DAISY Add-In for Microsoft Word 2003, 2007, and 2010 was released in December …

Tutorial #7: Explore IDEAL Group’s “Tesseract,” Online OCR Implementation

FIRST, SIGN UP:

To sign up for CRIS OCR, please go to SIGN UP or LOGIN and click on "I want to register". A "New User Registration" dialog box will appear. See Figure 1.

Type in an email address, name, and password. Click the Register button. Here are some credentials you can use to test the technology:

SECOND, DOWNLOAD DOCUMENTS FOR TESTS:

Test documents to download, submit to the OCR engine, and otherwise experiment with:

DOCUMENTATION AND INSTRUCTIONS:

Figure 1. Signup Page View

Upon successful sign-up you are logged directly into the system and see the user dashboard, as in Figure 2. The dashboard is described under "Using the Archives System" below.

Figure 2. Successful Sign Up

Using the Archives System

After you log in or register successfully in the CRIS Archives Application, a "Logout" button appears in the top right corner so that you can log out of the application when you have completed your work.

On the top left, there are two buttons for adding files to the CRIS Archives Application.

  1. "Upload File": Uploads a PDF file into the archives application. The application then performs OCR on the uploaded PDF file to extract its text.
  2. "Create File": Creates a fresh file instead of performing OCR on an already existing file.

The two tables below are initially empty.  

Uploaded:

Here you will see the files that you have uploaded or created using the "Upload File" or "Create File" buttons.

By default 100 records are shown, but you can customize the number of records you would like to see.

You can also type in the Search box to find matching file names.
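The search box and the record limit can be pictured with a short Python sketch. This is only an illustration of the idea, not the application's actual code: the function name `visible_rows`, the case-insensitive substring match, and the sample file names are all assumptions.

```python
def visible_rows(file_names, query="", limit=100):
    """Return at most `limit` file names that contain `query`.

    Assumptions (not confirmed by the application): matching is a
    case-insensitive substring test, and the default limit of 100
    mirrors the table's default record count.
    """
    needle = query.strip().lower()
    matches = [name for name in file_names if needle in name.lower()]
    return matches[:limit]


# Hypothetical file names, for illustration only.
uploaded = ["Newsletter-2015.pdf", "meeting-notes.pdf", "newsletter-draft.pdf"]
print(visible_rows(uploaded, query="newsletter"))
```

With an empty query every file matches, so only the record limit applies.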

After you upload a file to the system, it appears as a single row with the following entries.

  • File Name
  • Tesseract (the OCR engine): It has two buttons:
    • Edit Button: Edits the OCR output generated from the PDF file, or the text file you created.
    • Ebook Button: Downloads the ebook for the corresponding OCRed document.
  • Action: Actions that you can perform on each file:
    • Share: Invites another user to edit the OCR output. After clicking the button, enter the email address of the user with whom you would like to share the document. They must have an account in the system.
    • Delete: Deletes the entry for the file from the system.

Shared:

  • The system lets you invite other collaborators to edit a file that you have in the system. This table lists the files that other users have uploaded or created and invited you to edit.

Steps for Uploading a File for OCR

Click the "Upload File" button. You will be taken to a page where you can drag and drop the file you would like to upload, or click on the upload area to select a file.

  • Once you select the file, please wait for the file uploader to reach 100% and show the message "File uploaded successfully and queued for processing".
  • You can upload more files using the same process, or click on "Check Files" to go back to the list of files.

In the list of files in the Uploaded section, you can search for the name of the file you just uploaded.

  • Click on the "Edit" button. If processing is not yet complete, you will see the message "The file submitted by you is still being processed."
  • If the OCR process has completed successfully, you will be taken to the editor, where you can see the original file and the OCR output next to each other.

 

Steps for Creating an EPUB:

Once the OCR process has successfully completed, you are taken to the page where you can see the original file and the OCR output in an editor side by side.

The browser-based editor provides the standard editing functions found in MS Word. You can format the OCR output and correct it to match the original document.

Please make sure to mark the headings in the document accordingly, as the EPUB generator uses them to create the table of contents.

Once you have finished formatting and correcting the OCR output, click the EPUB button in the editor to export the document in EPUB format.

The exported EPUB file can be read in any fully compatible EPUB reader.
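To see why heading markup matters, the following Python sketch derives a simple indented table of contents from heading levels. It is a minimal illustration under stated assumptions, not the EPUB generator used by the application: it assumes the edited document is available as HTML with `h1` to `h6` elements, and the names `HeadingCollector` and `table_of_contents` are hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements, in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._parts = []    # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def table_of_contents(html):
    """Return one indented line per heading (two spaces per nesting level)."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + title for level, title in collector.headings]


# Hypothetical editor output, for illustration only.
sample = (
    "<h1>Collection of Writings</h1><p>Body text.</p>"
    "<h2>Cutting the Meat Bills with Milk</h2><p>More text.</p>"
    "<h2>Are You an Apathist?</h2>"
)
print("\n".join(table_of_contents(sample)))
```

Text that is only styled to look like a heading (for example, bold body text) would not be collected by a sketch like this, which is why headings should be marked as headings.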

 

Tutorial #1: Create an Accessible Word and PDF Document

If Adobe Acrobat Reader DC is not already installed on the computer you are using to take this tutorial, please install it from the following website: https://get.adobe.com/reader/

Background Information:

Microsoft Word is currently the most common word processor on the market. As such, the .docx format has become a popular format for …

Tutorial #2: Use Central Access Reader (CAR) to Read an Accessible Word Document

CAR Quick Tutorial
  1. Download and install Central Access Reader (CAR), a powerful open source accessible reader (Windows 64-bit only): http://archive.org/download/CARSetup64/CAR_Setup_64.exe
  2. We recommend making this reader available from your website if you decide to provide digital materials in accessible DOCX format.
  3. Open Central Access Reader.
  4. Go to the "Advanced Settings Menu", …

Tutorial #3: Generate an MP3 File From an Accessible Word Document Using Central Access Reader

In this tutorial you will convert and save all, or portions, of the DOCX file you created as an MP3 file.
  1. Open Central Access Reader.
  2. Press Ctrl-M to save the complete DOCX file as an MP3 file.
  3. Play the MP3 file with any MP3 player.
Next,
  1. Edit-Copy the title of any article in the newsletter.
  2. Highlight …

Tutorial #5: Create an EPUB eBook From an Accessible OpenOffice Document

  1. Use this link to download and install Apache OpenOffice V4.
  2. Use this link to download and install Writer2ePub (W2E). Writer2ePub (W2E) is an extension for OpenOffice (OO) Writer that allows you to create an ePub file from any file format that OO Writer can read. Important note: This conversion, in and of …

Tutorial #6: Create an Accessible, Human Narrated, DAISY Ebook Using Tobi

Note:

This tutorial was added to introduce you to DAISY eBooks that contain human-narrated recordings. Since this is a somewhat complex type of eBook to produce, you will need to watch the video tutorials, read all the reference materials, and experiment on your own in order to become proficient in producing your own human-narrated DAISY eBooks. This tutorial provides only an overview of the process.

Before going through this tutorial, we encourage you to unzip, open, and play the following completed, human-narrated DAISY eBook using DDReader+: Human-DTB.zip

You should have installed DDReader+ as part of the Tutorial #4 exercises. If you didn't, you can download and install DDReader+ now. Here's the download link: DDReader+

Introduction

Tobi is multimedia book production software that enables creation of full-text, full-audio digital talking books. The electronic books created using Tobi are fully accessible, navigable, and feature-rich as they conform to DAISY 3 or accessible EPUB 3 specifications. We highly recommend reviewing the Tobi User manual at this point.

Among its other features, Tobi pioneered the authoring of textual, pictorial, and audio image descriptions, which make otherwise inaccessible images comprehensible to people with disabilities.

Tobi is released under the LGPL license. It is free to run and distribute.

It is open source; any commercial or not-for-profit organization or individual can use its source code to make specialized or tailored solutions. Tobi is based on a plug-in architecture, which makes customization fairly simple.

We suggest that you become familiar with Tobi's User Interface before beginning this tutorial.

 

Instructions: Getting Started

  1. Download and install Tobi using this link.
  2. Download: Collection-of-Writings-from-WWI.docx
  3. Open the document in Word.
  4. See how the document was formatted using Headings 1 and 2.
  5. Click on the Accessibility Tab in Word’s Title Bar Menu.
  6. Click on the SaveAsDAISY Button.
  7. Select "DAISY XML (from single DOCX)". You will see the following dialog box:
  8. Enter the information contained in each field. Click the "Translate" button. You will see the following dialog box:
  9. Upon successful translation, you will see the following dialog box:
  10. Your file has now been saved as a DAISY XML file with the following name: Collection-of-Writings-from-WWI.xml
  11. Open Tobi.
  12. Click on "File" then "Import" to import the XML file into Tobi. Doing this should look something like this:
  13. At the end of the import process you will see the project audio dialog box:
  14. Click OK. You should see a screen similar to the following:
  15. This is the point at which you insert and synchronize your pre-recorded audio files with the text of the DAISY eBook.
  16. Download and unzip the following edited, human-narrated recordings: edited-audio-recordings.zip Note: The edited audio recordings were created by editing the original audio recordings (the unedited versions are available to download and listen to: unedited-audio-recordings.zip). We used an open source audio editor called Audacity to edit the audio files. The use of Audacity is not covered in this tutorial, but if you are interested in synchronizing pre-recorded audio files with text, we recommend that you download, install, and experiment with Audacity using this link.
  17. Using the edited audio files you just unzipped (edited-audio-recordings.zip), you can begin to add and synchronize the audio tracks with the text of the eBook. Start by highlighting the entry "Cutting the Meat Bills with Milk" displayed in Tobi's Index Window (the window to the left of the window containing the actual text of the DAISY book).
  18. Next, click on the blue file folder button on the bottom left of the Tobi window.
  19. Select the file named "1-Edited_Cutting_the_Meat_Bills_with_Milk.mp3". You will now see the waveform of the audio file you just opened in the eBook. Your screen will look similar to this:

You are now going to synchronize the recording with the sentences in the section entitled "Cutting the Meat Bills with Milk." You will then repeat the process for the next section of the eBook, "Are you an Apathist?": highlight the section, click the blue folder button, and open the next recording, "2-Edited_Are_You_An_Apathist.mp3". Continue in the same way for the remaining sections. You will use the "Ctrl-Return" shortcut key to synchronize the audio with the text. To learn how to do this, please view the following video. We recommend that you watch it several times before attempting the synchronization.
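Because each section is paired with one recording, it can help to check the pairing before you start. The Python sketch below matches section titles to recording file names. It is an illustration only: the naming pattern `<number>-Edited_<Title_With_Underscores>.mp3` is inferred from the two file names quoted in this tutorial, and the function names are hypothetical. Tobi itself does not require any of this.

```python
import re


def normalize(title):
    """Lower-case a title and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", title.lower())


def parse_recording(file_name):
    """Split '<number>-Edited_<Title_With_Underscores>.mp3' into (number, title)."""
    match = re.fullmatch(r"(\d+)-Edited_(.+)\.mp3", file_name)
    if match is None:
        raise ValueError(f"unexpected recording name: {file_name}")
    return int(match.group(1)), match.group(2).replace("_", " ")


def pair_sections(section_titles, file_names):
    """Map each section title to its recording, or None if none matches."""
    by_title = {}
    for name in file_names:
        _, title = parse_recording(name)
        by_title[normalize(title)] = name
    return {title: by_title.get(normalize(title)) for title in section_titles}


sections = ["Cutting the Meat Bills with Milk", "Are you an Apathist?"]
recordings = [
    "1-Edited_Cutting_the_Meat_Bills_with_Milk.mp3",
    "2-Edited_Are_You_An_Apathist.mp3",
]
for title, recording in pair_sections(sections, recordings).items():
    print(f"{title} -> {recording}")
```

Titles are compared after removing case and punctuation, so "Are you an Apathist?" still matches "Are_You_An_Apathist". A section that maps to None has no recording with a matching name.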

Important Note: Preview each audio recording and compare it with the corresponding text of the DAISY eBook before trying to synchronize the two. If they do not match exactly, you are going to run into major problems. If there is a mismatch, you will need to correct it by either:

  1. Editing the text, or,
  2. Editing the audio file.
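If you have a typed transcript of what was actually narrated, a word-level comparison can point to the places where the text and the audio differ. The Python sketch below uses the standard `difflib` module. It assumes that such a transcript exists (this tutorial does not provide one), and the sample sentences are made up for illustration.

```python
import difflib
import re


def words(text):
    """Split text into lower-case words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_mismatches(book_text, narration_transcript):
    """Return (operation, book_words, narrated_words) for each difference."""
    book, narrated = words(book_text), words(narration_transcript)
    matcher = difflib.SequenceMatcher(a=book, b=narrated, autojunk=False)
    return [
        (op, book[i1:i2], narrated[j1:j2])
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


# Made-up sentences, for illustration only.
book = "Milk is the cheapest food we have."
narrated = "Milk is the cheapest food that we have."
print(find_mismatches(book, narrated))
```

An empty result means the two word sequences match. Each reported difference shows which words to change in the text or to re-record in the audio.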

Before moving on, we recommend watching the following additional videos: