Messy source files? Check. Short turnaround? Definitely. Complex work flow? That goes without saying. Multiple deliverables? Double-check. (And, really, do you need to ask?)

It seems as if Indian vendors live to take on any challenge that any publishing client (journal, K–12, higher ed, or STM) might throw at them. If you need more convincing, just take a look at the following projects, chosen at random, that showcase these vendors’ capabilities across segments and domains, and the extra mile they go to deliver them.

Amnet Systems

XML conversion of religious publications was the challenge for Amnet Systems. The production team had to provide native XML files based on the client’s DTD as well as EPub standards. Most of the 75,000-plus pages that arrived at its door were hard copies. Using an integrated XML EPub work flow, the team relied on its vast experience with theological works to design the pages in XML/HTML/SGML with style sheets matching the original source. The project also involved non-English languages such as German, Greek, Hebrew, Latin, Spanish, and Syriac. The team used biblio- and media-tagging—Amnet’s two new value-added services—to link resources, references, and images. Aside from the e-book formats determined by the client, the team also provided an optimally designed single output to fit any output device.

Another conversion project, from PDF to EPub, saw more than 100,000 pages processed per month. For titles from the client’s frontlist, the team had to deliver each title within one week of receipt; an extra week was given for the backlist, which usually involved scanned pages. The mostly non-English titles came in different complexity levels, thus requiring more time to fix errors that cropped up during the normal text extraction process and to meet the 99.995% content accuracy expectation.

Datamatics Global Services

Converting more than three million pages of legal content in multiple formats was what a major legal publisher wanted from Datamatics. Given the complex interlinking between the pages and issues, the team had to develop special tools, a tracking system, and quality mechanisms to help ensure accurate linking and content rendering. Its four delivery centers in India (located in Mumbai, Nashik, Chennai, and Puducherry) were ramped up to deliver the project within 10 weeks. Another four million to five million pages are to come.

A different project saw the team helping to convert academic titles containing mathematical, scientific, and graphical content into XML and e-books. Maintaining a high degree of accuracy during content conversion and rendering was crucial. Apart from content transformation, Datamatics was also called in to provide quality assurance for the publisher’s projects, including those done by other vendors. So far, more than one million pages have been processed.


diacriTech

Providing full project management services from India was the requirement set by diacriTech’s client for a math-laden science series. Composed in InDesign and managed in an XML-first work flow, the 2,600-page project employed a two-pronged approach to enable delivery in print PDF and e-format (HTML5 and enriched e-books). Following the success of their first collaboration, the publisher is working again with the team to create several titles with iBooks Author.

Another full-service project, involving a 2,900-page series with English, science, social science, and math components within each title, presented quite a challenge. Each title had to be carefully designed to ensure continuity of the look and style of the series. The team was instructed to create and embed interactive elements and read-along audio for the accompanying Flash e-book. A test generator was also developed to help teachers produce assessments based on the content of the series.

DiTech Process Solutions

Creating digital pages out of 100 print titles published between 1995 and 2000 was one of the many digital publishing projects handled by DiTech last year. The client gave the team just over one month to recreate more than 40,000 pages (with illustrations) and convert them into EPub, XML, Web PDF, and print-ready PDF, the last format being the most difficult. Although high-end scanners were employed to maintain color accuracy and clarity for images and illustrations, the team still had to do considerable color correction and retouching prior to generating print-ready PDFs. As for EPub and XML deliverables, many in-house tools and scripts were developed and written to speed up the tagging and QC processes.

Gantec Publishing Solutions/eBooks2go

Creating an iPad app out of a discrete math textbook was the challenge given to Gantec. First, pages laden with symbols, equations, charts, and graphs had to be converted from PDF to HTML. Then the team developed an HTML reader with built-in features such as a dictionary, notes, bookmarks, and interactive practice tests. With this app, users can convert finished tests into PDF, print them for review and submission, or e-mail them.

One of the complex projects received at eBooks2go involved photo research and management for more than 1,000 backlist titles. Tasks included identifying royalty-free photos, replacing those that were no longer available with free images in the public domain, and purchasing replacement photos if free images were unavailable. In some cases, the team had to decide if removing an image from the e-book would affect its content. Sales from the e-books through the client’s and eBooks2go’s channels are then reported for profit sharing, with the cost of purchasing images deducted accordingly.


Integra Software Services

Creating interactive content that is compatible with most Web browsers and interactive whiteboards and viewable on an autolaunching DVD was the project brief from one innovative publisher of supplemental teaching material. To meet the challenge, Integra’s new media services team came up with a multiplatform interactive learning solution that featured engaging content, user-friendly navigational tools and templates, animations to reinforce learning, as well as audio- and video-based elements. Integra’s U.S. team took on the project management, content development, and customization tasks, while the rest of the processes, involving new media, Flash, animation, and interactivity, were done in India.

Lapiz Online

One IWB (interactive whiteboard) project with 10,000-plus flip charts and as many pages of teacher notes tested the Lapiz team’s knowledge and skills. A short turnaround time was given to produce pages for different grades and deliver batches at specified intervals. The project scope included development, conversion, building interactivity, and testing of the flip charts.

MPS Limited

The MPS team customized a British journal publisher’s Web-based production system, Journal Track, to enable authors to track the production status of accepted articles and journal editors to track the status, content, and page and color budgets of journal issues. Functions such as article metadata capture and storage, article and issue tracking, and schedule creation were automated for the publisher, and special features such as measurement of supplier turnaround time and an e-commerce facility for the payment of nonsubscription revenue items were added. Journal Track offers a full overview of the real-time status of all material in production and provides increased transparency throughout the production process for all involved. Integrated into the accounting system, it also offers a widget for the finance team to check order details.

Planman Technologies

One of the most turbulent periods in U.S. history, the Civil War, was the topic of a full-service packaging project that an American publisher handed Planman Technologies. The eight-title series required meticulous content writing and copyediting to ensure factual accuracy. Much time was also spent on researching and procuring photos of that time period, with each title requiring between 80 and 100 photos. Next, the design team had to come up with interesting page layouts to make the content engaging to young readers (ages 10 to 13) and acquaint them with this important historical period.

SPi Global

A decade ago, it took SPi Global five years to convert more than 30 million print pages into PDF and XML formats for an STM publisher. Now, its team expects to process a similar number of pages but in more deliverable formats in less than two years. Special conversion tools developed for STM and professional publications (invariably sprinkled with equations, complex tables, and cross-references) speed up the process, and a global work-flow model helps considerably. For such large-scale projects, SPi Global’s six offices in three countries work seamlessly on a common conversion platform. For a content enrichment project, another team extracted information such as chemical formulas from various STM publications and updated the client’s databases. Such projects require collaborative partnership with the client and the skill of SPi Global’s 75 full-time postgraduate and Ph.D.-level subject matter experts.

Swift Prosys

To handle a polytechnic institute’s journals from the 1800s, the team at Swift Prosys first had to learn to recognize Gothic fonts. Scanned pages from the client were OCR-ed and proofread by two staff members, with a third person analyzing both sets for mismatches and corrections. Style was then applied and the documents converted into TEI P5 XML files. However, the liberal use of technical terms made spell-checking difficult, and special zoning software was required to work on the large-format illustrations. Nonetheless, within seven months, the team had completed 46 volumes (24,000 pages) with 250 large-format illustrations.

Another project involves scanned pages from handwritten burial and cremation registers dating back to the 18th century, where the biggest challenge lies in deciphering different people’s handwriting. The team’s tasks include analyzing each register, studying handwriting patterns, preparing a specific set of manuals for each register, double keyboarding, and counterchecking and validating each record against a huge database of first names, surnames, counties, and councils. A tool that provides 99.98% accuracy in reading the handwritten script is employed to aid the process. About two million records have been processed, with another three million to four million in the pipeline.

Thomson Digital

At Thomson Digital, a big project involving 154 chapters and nearly 2,500 typeset pages became even more complex with the requirement to produce three different types of deliverables for different chapters (print only, Web only, and print plus Web) and the need to liaise with nearly 154 contributors, several section editors, and the publisher. With some contributors preferring soft copy and others hard copy for review, job tracking became a challenge. The team had to engage subject matter experts to interpret handwritten corrections and monitor all project stages on the in-house project management system (TDPMS).

Another project, totaling 27,740 pages in 40 volumes (approximately 1,050 articles) on organic chemistry, took the team nearly a year to complete. The biggest challenge was index generation from the client’s inappropriately tagged XML files. Special tools were developed to extract the index terms before sorting them according to the client’s guidelines. The final index took up more than 500 pages.