Word Embedded Indexing: A Trend in Publishing?

The content published in a print book may also be published on CD-ROM, on a website, or in a format we’re not yet aware of. To handle this repurposing, publishers are increasingly embedding book indexes into the Word manuscript documents so that the index marks are associated with the text and not with a particular publishing format.

image of embedded index marks

The index is considered “embedded” because the index entries are fields that can be “turned on or off,” that is, viewed or not seen at all. Therefore, the index is buried beneath the surface of or embedded within the text.

Why embed an index? If a work is going to be updated often, an embedded index can be regenerated quickly. If the document will be displayed online, it’s fairly easy to convert the index entries to links. Embedded indexing works well for custom publishing, where selected chapters are chosen from a book and perhaps combined with chapters from other books.

If you are considering writing an embedded index, I offer one suggestion: Use the James Lamb utility WordEmbed (http://www.jalamb.com/wordembed.html). The WordEmbed utility allows you to use your indexing software (Cindex, Sky, Macrex) to index in the usual manner: You can see the index grow; you can take advantage of your index software tools such as entry autofill, cross-reference checking, and the maximum page number allowed; and you can edit the index as usual. This results in a higher quality embedded index.

Let me give you a taste of embedded indexing without the WordEmbed utility so that you can appreciate the ease of using the utility: To mark an index entry in Word “by hand,” click the cursor into the text associated with the entry, bring up the Mark Index Entry dialog box, type in the main entry and subentry heading text, and click Mark. What you’re doing is inserting a field that contains the main entry and subentry heading texts, so you could copy and paste the field, altering the heading texts as necessary.

The Mark Index Entry dialog box can remain open to make the process more efficient. If text is selected, then that selected text is put into the main entry field, saving some typing. An index field is placed next to the selected text, or the index field is placed at the cursor location, if there is no selected text. If you want to indicate a range of text for an index entry, first define a bookmark for that chunk of text, then in your Mark Index Entry dialog box select the Page Range Bookmark and hit the dropdown list to find the name of the bookmark you just defined.
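To make the field idea concrete, here is a minimal Python sketch of the text that sits inside such a field. It assumes Word's standard XE field syntax: the entry text in quotes, a colon between main entry and subentry, and the \r switch naming a bookmark for a page range. The function name xe_field and the sample headings are my own, purely for illustration.

```python
def xe_field(main, sub=None, bookmark=None):
    """Return the text of a Word XE (index entry) field as Word displays it.

    The outer braces stand for Word's field characters; in Word they are
    created with Insert Field or Ctrl+F9, not typed as literal braces.
    """
    # Word separates main entry and subentry with a colon.
    text = main if sub is None else f"{main}:{sub}"
    code = f'XE "{text}"'
    if bookmark is not None:
        # \r points the entry at a bookmark, giving a page range.
        code += f" \\r {bookmark}"
    return "{ " + code + " }"


print(xe_field("indexes", "embedded"))
print(xe_field("sorting", bookmark="SortingSection"))
```

Seeing the field as plain text also shows why copying a field and altering the heading text works: only the quoted part changes.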

Indexer Lucie Haskins offers a PDF (http://luciehaskins.com/resources/Mauer_EmbeddedIndexing.pdf) with screen shots of the process of inserting index fields. James Lamb wrote an article for the British Society of Indexers journal (http://www.jalamb.com/2005 10 01 Embedded indexing The Indexer.pdf) that explains the embedding process. In that article, Lamb captured the mirror-image nature of embedded indexing by contrasting conventional indexes, in which each heading has a collection of references indicating locations to which the heading refers, against embedded indexes, in which each location in the document has associated headings.

Okay, this process of hand embedding doesn’t sound too bad—until you actually do it. For the entire length of a book. Then you understand the indexer estimate that embedded indexing takes three to five times as long as conventional indexing (see the Lucie Haskins PDF referred to earlier).

The current versions of indexing software provide another approach to embedded indexes: their drag-and-drop capability between the indexing software and Word documents. The index is written and edited as usual, then displayed in page order. Each entry is dragged to the proper location in the Word document; when dropped, an index field is created for that entry.

Now let’s see how embedded indexing works with my preferred method, using the James Lamb WordEmbed utility:

indexing software and Word document side by side

I keep my indexing software window side-by-side with my Word document window. I can select the text in the Word document or simply click the cursor, then hit the keyboard shortcut of Control-Shift-backslash (Ctrl-Shift-\ or Ctrl-|). This puts a comment into the Word document. The comment contains a number, which was also placed on the clipboard. In my indexing software, I enter the main heading and subheading—taking advantage of autofill—then paste the clipboard contents into the page field.

If you need to enter this number again later, it can be copied to the clipboard from the comment balloon or it can simply be typed in. The number consists of the current Word document page number then a period then the line number of the beginning of the selection. The final digit differentiates multiple markers on the same line. When the index is embedded at the end of the process, this number in the comment field matches up with the index entry to be embedded. With a click, these comments can be removed before the file is delivered.
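Since these numbers sometimes get retyped by hand, a quick sanity check can be useful. The sketch below is not part of WordEmbed. It assumes the layout described above, with the marker digit being exactly one digit directly after the line number, so that a made-up value such as 12.341 would mean page 12, line 34, marker 1. The names Locator and parse_locator are hypothetical.

```python
from typing import NamedTuple


class Locator(NamedTuple):
    page: int    # Word document page number
    line: int    # line number where the selection begins
    marker: int  # final digit separating markers on the same line


def parse_locator(text):
    """Split a locator such as '12.341' into page, line, and marker.

    Assumption: the part after the period is the line number followed by
    exactly one marker digit. Raises ValueError for anything else.
    """
    page_part, sep, rest = text.strip().partition(".")
    if not sep or not page_part.isdigit() or not rest.isdigit() or len(rest) < 2:
        raise ValueError(f"not a locator: {text!r}")
    return Locator(int(page_part), int(rest[:-1]), int(rest[-1]))


print(parse_locator("12.341"))
```

A check like this catches only malformed numbers, such as a missing period; a well-formed but wrong number still has to be caught at embed time.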

Indexing then becomes a natural rhythm of marking the text electronically, hitting the keyboard shortcut, and creating my index entry in my indexing software. A range of text can be indicated in a number of ways, all intuitive. I enjoy the visual nature of WordEmbed. The toolbar has a Go To Locator tool that moves you to the location of the entered locator number. You know the exact text associated with that locator; there’s no need to reconstruct which text on the page is being referred to. This aspect makes editing a joy.

And editing can follow its full path: Use your indexing software cross-reference tool to verify all the cross-references. Run up single subentries. When deciding if subentries can be combined, it’s a big help to easily know the exact text being referred to.

When you are happy with the edited index, the index is written to RTF with an equal sign (=) marking the subentries (easily achieved). Click on the Embed Index tool on the WordEmbed toolbar, point to the RTF, and watch WordEmbed work through the document, putting in the index markers. The index appears at the end of the document.

Be aware that Word hangs its main entry See also cross-references off the main heading text. The WordEmbed utility is able to put these See alsos as last subentries, which gives cleaner main headings that are more easily scanned by the index user. I currently hesitate to do this. My main embedded index client sends index writers instructions for hand-embedding in Word (along with a document that suggests using index cards), so I am assuming they are most familiar with the Word default cross-reference positioning.

However, you must set your indexing software to hang main entry See also cross-references off the main heading. If you set your software to put the cross-references as the last subentry, Word will eat all of the cross-references during the embed process: *poof* They’re gone.

I strongly encourage working with the book as one Word document rather than separate chapter files; though this is rarely an indexer’s decision to make, we can advise. If the book is sent as separate chapter files, then the indexer must ensure that each chapter is assigned a unique page range. The specific page range does not matter; it simply must be unique. Using roman numerals for the front matter does not make its range unique; what must be unique for each chapter is the range of Word document page numbers, because the page reference numbers are based on the Word document page number. Also, when working in separate chapter files, the cross-references must be suppressed (via a checkbox) for all chapters but one. If the book is one Word document, then the page ranges are not an issue, nor are the cross-references. In addition, it’s much easier to go to a specific locator when the book is one large document; you don’t have to figure out which chapter file the sought locator is in.
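When I do receive separate chapter files, the uniqueness rule is easy to check mechanically. Here is a minimal sketch, assuming you have noted the first and last Word document page number of each file yourself; the file names, page numbers, and the function name overlapping_chapters are invented for illustration.

```python
def overlapping_chapters(ranges):
    """Return pairs of chapter names whose page ranges overlap.

    ranges maps a chapter name to (first_page, last_page), both inclusive,
    counted in Word document page numbers.
    """
    # Sort by starting page so each range only needs comparing with later ones.
    items = sorted(ranges.items(), key=lambda kv: kv[1])
    clashes = []
    for i, (name_a, (_a_start, a_end)) in enumerate(items):
        for name_b, (b_start, _b_end) in items[i + 1:]:
            if b_start > a_end:
                break  # every later range starts after this one ends
            clashes.append((name_a, name_b))
    return clashes


# Front matter numbered i-xii still occupies document pages 1-12,
# which collides with a chapter 1 file that also starts at page 1.
book = {"front": (1, 12), "ch1": (1, 30), "ch2": (31, 58)}
print(overlapping_chapters(book))
```

An empty result means every chapter file has its own range and the locators cannot collide.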

Now let’s talk about sorting, because Word determines the final sort order of the embedded index, not your indexing software, and in addition, there are special characters that need to be handled.

In Word, tilde characters (~) around text hide that text, yet the hidden text still affects sorting. So putting ~000~ in front of an entry will force the entry to the top of the index. In my Cindex software, I have to double each tilde, because the tilde is a special character in Cindex that tells Cindex to “ignore” the next character and just pass it on to the final RTF; the first tilde tells Cindex to ignore the second: ~~000~~

Putting curly braces { } around text allows that text to show without it affecting sorting. So {The }Book Title allows “The” to show yet the entry sorts under “Book Title.” That’s straightforward enough, except that you need those special characters to survive export to RTF. In Cindex, this means that a tilde needs to be placed in front of each curly brace: ~{The ~}Company She Keeps. Curly braces are required when you want minor start words to be ignored in subentry sorting: ~{as ~}neutral or ~{of ~}children.

A colon is a special character in Word embedded indexing, separating entries and subentries. So if you need a colon in an entry, such as for a subtitle, it needs to be ignored by Word by being placed in curly braces (the space after included), and those curly braces need a tilde to survive export to the RTF: ~{The ~}Therapy of Desire~{: ~}Theory and Practice in Hellenistic Ethics.
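The tilde, curly brace, and colon rules above are mechanical enough to sketch in a few lines of Python. This is only an illustration of the escaping as I have described it for Cindex, not a feature of Cindex or WordEmbed; the function names are hypothetical, and the colon helper assumes every colon is followed by a space and should be kept as literal text.

```python
def protect_colons(heading):
    """Wrap each colon-plus-space in curly braces so Word does not read
    the colon as the separator between entry and subentry."""
    return heading.replace(": ", "{: }")


def cindex_escape(word_text):
    """Prefix each tilde and curly brace with a tilde so that Cindex
    passes the character through to the exported RTF."""
    return "".join("~" + ch if ch in "~{}" else ch for ch in word_text)


# Type the heading the way Word should receive it, then escape it for Cindex.
title = "{The }Therapy of Desire: Theory and Practice in Hellenistic Ethics"
print(cindex_escape(protect_colons(title)))
print(cindex_escape("~000~Preface"))
```

Protecting the colons before escaping matters, because the braces added for the colon need their tildes too.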

Double quote marks are apparently very special characters in Word embedded indexing, because in Cindex it takes many backslashes with curly braces to get Word to treat them as simple double quotation marks: The double quote mark needs a backslash in front of it to be ignored by Word, but the backslash needs a backslash to be ignored by Cindex. Then there’s something about two backslashes meaning something, so you need two more to counteract that. Then there are the curly braces you need to get Word to ignore the double quote mark when it sorts, and the tilde that Cindex needs to ignore the curly brace. So you get:

~{\\\\“~}words in double quotes~{\\\\”~}

I find it easier to just enter double quote marks as I’m gathering entries, then do a couple of Replaces when I’m finished gathering entries. (I need a couple because they’re curly quotes.)
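Those two Replaces amount to the following, shown here as a small Python sketch so the exact strings are easy to see. It assumes the entries use curly quotes and the four-backslash pattern shown above; the function name is my own, and in practice I do this with the Replace tool in my indexing software rather than with a script.

```python
OPEN_QUOTE = "\u201c"   # left curly double quote
CLOSE_QUOTE = "\u201d"  # right curly double quote


def escape_curly_quotes(entry):
    """Wrap each curly double quote in the tilde, brace, and backslash
    pattern, one replace per quote direction."""
    backslashes = "\\" * 4  # four literal backslashes
    entry = entry.replace(OPEN_QUOTE, "~{" + backslashes + OPEN_QUOTE + "~}")
    entry = entry.replace(CLOSE_QUOTE, "~{" + backslashes + CLOSE_QUOTE + "~}")
    return entry


print(escape_curly_quotes("\u201cwords in double quotes\u201d"))
```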

I back up my Word files often. When I finish gathering entries, I back up that version of the Word embedded file. Then I copy it to a “test” directory, into which I put the exported RTF from my indexing software and do an embed of the index to check for errors. One potential error is mistyping a reference number from the comment balloon; WordEmbed notes the errant number to an error document. The embed process can have problems if Word is not happy with a selection. For example, a selection of text before a table and then into only part of the table may make Word unhappy. A new selection of the entire table or a portion fully within the table needs to be created, and the page references replaced in the index file. Index marks can’t be placed in the caption box associated with a photo; they can be placed in the main text and in footnotes. You don’t see the comment balloon holding the reference ID in a footnote, but footnote index marks work fine.

And of course there may be sorting issues.

Every time I do an embed, I use a “fresh” copy of my embedded document; I don’t re-embed into a document that’s already been embedded into. If you have problems because the index RTF is huge, James Lamb gives instructions on breaking the RTF and embedding each piece, so re-embedding from that perspective is fine. But updating the index file then generating a new embedded index requires a “fresh” copy of the embedded document. I do all but my final embed(s) in a “test” folder on file copies so that I don’t affect my main embedded book file.

Remember: This main embedded book file is a live document in the publishing process.

My embedded indexes do not take multiples of time longer to write than conventional indexes; I find the WordEmbed process to be very efficient.

So what about placement of index marks within the paragraph? There was an interesting email discussion about translation jobs in InDesign that mentioned the practice of clustering the marks at the beginning or end of the paragraph, so that the translator could swipe the paragraph text without disrupting indexing tag markers. My current practice is to mark exact text, putting many of my marks buried within paragraphs.

If you index on a Macintosh, you will need a PC simulator to run the WordEmbed macro, and an Intel Macintosh to run the simulator. (Check About This Mac under the Apple menu to verify an Intel processor.) WordEmbed is a .dot template file of Visual Basic code, so if you have other Add-Ins installed in Word, especially “amateur-written” templates, and experience any difficulties with WordEmbed, try uninstalling the other Add-Ins.

With the James Lamb WordEmbed utility, your embedded indexes will have the same high quality as your conventional indexing jobs. The utility is available on the Web (http://www.jalamb.com/wordembed.html), currently priced at US$130 (conversion rates may have changed).

