Asking vendors to tell us about their most challenging or unusual projects is itself a challenge. They encounter dozens (if not hundreds) of complex projects in a year and can be hard-pressed to remember which was the worse nightmare: the one with an impossible deadline or the one with a bewildering project brief. Turning messy inputs into multiple outputs is what they do every day.

Here are some notable projects that showcase these vendors’ capabilities and expertise across segments, domains, and languages.

AEL Data Services

Its biggest and most challenging project involved portal development for a leading K–12 educational provider and publisher in the Middle East. The portal uses multiple open source Web technologies and frameworks, such as Liferay, eFront, CakePHP, MySQL, TCExams, and Java, and has Arabic and English interfaces. CMS, LMS, AMS (assessment management system), social networking, Web-based e-book readers (supporting both PDF and ePub formats), and mobile apps are integrated into the client’s educational and publishing portals. The multitier architecture also supports cloud hosting to enable thousands of users to access the system simultaneously. A single sign-on across all integrated systems eases access.

At the same time, the client also awarded the team a project to convert about 100 Arabic titles into ePub so that the e-books could be accessed via the portal when it went live. These educational titles were in Tashkeel-based text with complex layouts, formats, and mathematics symbols and equations. To meet the ultra-short deadline, tools and scripts were developed in-house to automate certain processes while two teams of Arabic language experts—based in Chennai’s delivery center and in Tunisia—worked in tandem to ensure the highest possible conversion quality.

Amnet Systems

For a project with scanning and POD requirements that kicked off with 25 black-and-white books per week back in June 2012, Amnet Systems’ team had to convert hard copies into editable PDFs, followed by further enhancements and revisions. Nondestructive scanning was deployed for some of the fragile books. Cleanup included removing moiré patterns from grayscale images, color correction, de-skewing, and even margin adjustment. The quality of the work coupled with timely delivery soon led to a contract covering 500 titles per week with a three-month turnaround time. So far, more than 2,000 titles of about 350 pages each have been processed. Another 250 full-color titles will be added to the weekly schedule in May.

Cenveo Publisher Services

A project covering three student books (totaling 528 pages and 600 photos) with accompanying teacher’s editions (at 672 pages) and workbooks (288 pages) was challenging due to its intensive design and large number of photos. Copyediting, proofreading, layout, and photo research were handled by Cenveo’s U.S. project management team while the rough page design was done in India. This hybrid workflow helped economize on time and cost while utilizing the strengths of each production facility in creating a successful product on schedule and under budget.

Another design-intensive project required the team to work on 294 chapters and deal with multiple contributors. Using only one copyeditor to maintain consistency of style made the project difficult to schedule. The team also faced a large number of figures and complex tables that did not transfer correctly from Word.

One automated XML legacy conversion project, on the other hand, covered more than one million journal pages spanning 25 years. The team had to put considerable development effort into dealing with the journal’s design changes over the decades. Since the publisher managed the content in-house with a proprietary ContentItem DTD, a transition to the NLM DTD was required, which necessitated the development of a special tool to enhance the conversion. Several rounds of testing and updates were performed until the software was confirmed stable enough for live production.

codeMantra

Direct-selling capabilities are now built into codeMantra’s cP3.0 platform. Encryption that optimizes ePub files as well as essential protocols and provisions have enabled cP clients to set practical parameters for direct content selling. These capabilities have proven to be profitable for such companies as Stylus/Potomac/Paradigm Publishers (U.S.) and Callan Publishing (U.K.), whose language learning content has reached a global audience with sales coming from as far afield as China. Two new direct sales installations are in progress now.

The team also perfected a robust catalogue creation tool for another cP3.0 client, the World Bank. The scripted dynamic workflow now allows users to prescribe a catalogue layout for a selection of titles. Using style sheet templates and an XML workflow, the system checks all pertinent metadata and delivers it to codeMantra’s InDesign server. Within minutes, a World Bank user can access a print-ready or Web-optimized PDF catalogue complete with an index and updated sales contacts. This tool can also auto-generate an order form.

Contentra Technologies

An iBooks Author project of middle and high school textbooks totaling 12,000 pages, with 13,300 illustrations and a huge number of math equations, was the challenge given to Contentra. The team enriched the original print content, provided in Quark, by creating multitouch iPad textbooks with such features as photo galleries, audio/video integration, interactive diagrams and animations, Keynote presentations, chapter reviews, and multiple-choice questions. Videos supplied in Flash also had to be converted into M4V format. Repagination in iBooks Author was complex, but with proper widget creation, the whole project was completed on time and to a high standard.

A psychology project requiring animation using HTML5 had the team creating static assets for all digital object storyboards, producing both Flash and HTML5 versions of the animation, and making the HTML5 animations compatible with both iOS and Android platforms. The varied screen resolutions of iOS and Android devices posed problems, and tweaking the content to suit them took considerable time.

Then there was the world history project for grades 9–12 that included the student’s edition, teacher’s edition, and ancillaries for a total of eight titles. The team had to turn it around within 12 weeks, providing editorial, design, page composition, and layout work on nearly 6,000 pages.

Datamatics

For one major client struggling with the poor quality of Spanish legal content that had been converted into XML in-house, the Datamatics team processed more than one million pages within the three-month deadline. Aside from mastering the required language skill set, the team had to match and correct the XML against the source content, which came in multiple formats. Installing a Spanish dictionary into the system eased the task for Datamatics’ Spanish language experts, thus shortening the turnaround time.

Another interesting project involved converting educational content for the college market into XML with interactive e-books and in-app Q&A. The client, having had production problems with the original vendor, faced huge financial losses if the digital content was not in the market on time. When the project landed at Datamatics, two locations were quickly ramped up and specialized tools developed for the most challenging part of the conversion process, i.e., math equations. In a parallel workflow, one team converted the content into XML, while another moved it into ePub and tested it on the QA platform. The project was also done on a chapter basis to save time and increase efficiency.

diacriTech

Conceptualizing a unique electronic-only project at a customer’s request was formidable because the team had to bear in mind, at all times throughout the process, that it was being made for a device and not as a print book. All page elements, from the text to the quizzes, were restricted to a much smaller word count in order to fit the device screen. Inputs from the team were considered and implemented by the publisher’s editors. This collaborative project took about seven months to develop and became very successful. The client has since requested a full series (of 25 modules) based on the initial product.

A design-intensive math series totaling 1,300 pages, on the other hand, came in as Word files and had to be converted into XML (using MathML) based on the customer’s DTD. Then the XML had to be validated on the customer’s content management system over the Web. Various views generated by the system then had to be checked and the CSS tweaked for accurate page rendering. The finalized XML was then paginated using InXML (diacriTech’s XML-first InDesign workflow), and at every pass a live XML feed was delivered into the client’s system together with page proofs.

DiTech Process Solutions

A partnership with a leading Spanish book distributor has seen DiTech Process Solutions’ team providing cross-platform publishing services, from typesetting to digitizing, to more than 300 Spanish publishers. A project for a travel guide publisher, meanwhile, required the use of a Typefi-based automated publishing platform for formatting and indexing more than half a million pages. The main challenge for the latter involved setting up processes for highly stylized formatting to produce visually attractive complex layouts, and compiling detailed thematic indexes from the content.

Another partnership deal, this time with Scandinavia’s largest magazine publisher, involved scanning/OCR and generating print PDFs for about 50,000 pages. The need to maintain scalability while delivering sustained quality output within four months saw the team adding automation and quality audits that exceeded the client’s expectations.

Gantec Publishing Solutions

Development of a PDF-based catalogue reader for iOS that can be integrated with existing reader applications or new reader apps was the big achievement at Gantec. This reader supports ePDF documents with various functions such as search, table of contents, bookmarks, notes, audio/video, two-page view, highlights, and annotations. Publishers can use this reader in their own e-bookstores.

Another development project involved ramping up capabilities to deliver fixed layout content for children’s books in KF8 format (which is supported by new Kindle Fire HD devices). These unique capabilities enabled the production of more than 100 fixed layout titles per month for KF8 format.

Harbinger Knowledge Products (HKP)

Two months was the time line given to the team to convert more than 200 desktop assets into a single content package that would work across multiple device platforms and browsers. The client, one of the world’s leading educational publishers with curriculum materials, multimedia learning tools, and testing programs for early learning up to professional certification, required each asset to be made available in four different file formats—Flash, HTML5, MP4, and OGG. Prior to delivery, the team had to check each asset for seamless integration with the client’s LMS.

Another project, for a leading K–12 online curriculum provider, was to provide single-source mobile content. More than 300 highly interactive and engaging courses were produced, each with HTML5-based, SCORM-compliant, single-source content for PCs, laptops, iPads, and similar tablets. The client continues to engage HKP in a variety of projects. In the first nine months of the relationship, billing has already exceeded $500,000.

Hurix

A project to deliver 150 multiplatform interactive e-books (about 35 pages each) in less than three months required the Hurix team to deploy two different readers (for kids and higher grade students) via Kitaboo. Options for word- and paragraph-level audio synchronization were also provided. Then there was encryption logic for user authentication to ensure secure delivery of e-books through the client’s portal, as well as a customized feature to allow users to launch the app from the iPad browser. These children’s books with customized navigation and kid-specific user interface design can be played in three modes: auto, read-to-me, and let-me-read.

Another Kitaboo project covered 24 teacher/student edition titles (each approximately 350 pages) together with assessments. The student app enabled students to access workbook exercises and submit completed tests. There are pen, type, and highlighter tools for annotation, and a check button for answering multiple-choice questions. The teacher app has a roster screen to ease selection of classes and students for assessment purposes. A toggle feature also enables teachers to switch between answers in the student book and those in the teacher’s guide.

Integra

Developing math-specific HTML5 tools and a user interface was the project brief Integra received from a major Canadian publisher specializing in French-language educational resources. The biggest problem was integrating the application into the client’s existing Flash-based Flipbook. The team also had to compress the application, create an installer using Adobe AIR, and ensure seamless embedding of the interactivity into the Flipbook.

For another client, this time a pre-K–12 publisher, the assignment was to convert titles into IWB-compatible content that would be used by teachers as supplementary teaching aids. A graphical user interface with vibrant colors and design was created with the needs of the young audience in mind, with each module packed with multiple assessments, narration, video, sound effects, and music. Twenty modules were developed, packaged, and delivered on DVDs as well as in a Web version. (The team is currently working on 80-plus new modules for the client.)

The team also identified third-party material in the manuscripts and cleared permissions for a complex ESL title. More than 222 selections were cleared in 40 chapters. Another project required the team to research, review, and obtain permissions for approximately 2,500 images. The whole process was tracked on Integra’s Asset Management (iAM) system.

Lapiz Online

Converting more than 22,000 pages of graphic novels has given the Lapiz team lots of insight into this niche segment. Complexity in terms of images and content layout was part of the package deal. With most comic (or manga) book pages laid out in double-page spreads, image processing vis-à-vis device requirement and readability became the biggest hurdle.

MPS Limited

Helping an independent academic publishing company move away from manual production tracking and management was the challenge facing MPS Limited. With production records managed through job cards and complex spreadsheets, the client had to deal with piles of documents that caused significant production delays. For the MPS team, this project came with five main goals: cutting operational costs, reducing production time, increasing transparency for each stakeholder including third-party vendors, enhancing author satisfaction, and reflecting author sign-off in real time. MPSTrak (the MPS-hosted platform for production workflow tracking, which was developed in collaboration with a major publisher and meets most client requirements off the shelf) was used throughout the process. The team’s thorough understanding of standard publishing production processes also meant minimum involvement of the publishing client in information gathering and workflow design.

Ninestars

When Bangalore-based online shopping portal Flipkart wanted to add e-books to its services in 2012, Ninestars was signed up as its ePub conversion partner. Since then, books in multiple formats (including print) have been turned into ePub 2.0 and ePub 3.0 formats. Academic content was more demanding, as math equations had to be captured using MathType before ePub conversion to ensure high resolution for equations and smaller file size.

The team also digitized more than 5.4 million pages for JSTOR. The work included debinding of bound copies; scanning images in bitonal, grayscale, or color mode; metadata capture; and OCR-ing and editing to achieve 99.95% accuracy. A semiautomated rule-based program for tagging reference citations was implemented. It identifies citation strings, classifies citation formats, and recognizes citation types to produce highly accurate, parsed XML-encoded data.
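For readers curious what rule-based citation tagging can look like in practice, here is a minimal Python sketch. It is not Ninestars' program: the two regex rules, the field names, and the `tag_citation` helper are all hypothetical, chosen only to illustrate the idea of classifying a citation string by type and emitting parsed XML.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical rules: each pairs a citation type with a pattern for one
# author-date citation style. A production system would need many more.
RULES = [
    ("journal", re.compile(
        r"^(?P<authors>.+?) \((?P<year>\d{4})\)\. (?P<title>.+?)\. "
        r"(?P<source>.+?), (?P<volume>\d+)\((?P<issue>\d+)\), "
        r"(?P<pages>\d+-\d+)\.$")),
    ("book", re.compile(
        r"^(?P<authors>.+?) \((?P<year>\d{4})\)\. (?P<title>.+?)\. "
        r"(?P<place>[^:]+): (?P<publisher>.+?)\.$")),
]


def tag_citation(text):
    """Return an XML string for one citation, typed by the first matching rule."""
    text = text.strip()
    for ctype, pattern in RULES:
        match = pattern.match(text)
        if match:
            # Wrap every captured field in an element named after the field.
            fields = "".join(
                f"<{name}>{escape(value)}</{name}>"
                for name, value in match.groupdict().items()
            )
            return f'<citation type="{ctype}">{fields}</citation>'
    # No rule matched: keep the raw string so an editor can review it.
    return f'<citation type="unparsed">{escape(text)}</citation>'


if __name__ == "__main__":
    print(tag_citation(
        "Smith, J. (2010). A study of things. Journal of Stuff, 12(3), 45-67."))
    print(tag_citation("Doe, A. (2005). Big Book. London: Acme Press."))
```

Strings that match no rule are passed through as "unparsed" rather than guessed at, which is one plausible reading of "semiautomated": the rules handle the routine cases and people handle the rest.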

Another digitization project, for the nonprofit organization Divine Life Society, came with 2,000 rare and out-of-print titles (nearly 300,000 pages in total) in English and Hindi. The use of high-accuracy OCR enabled the organization to preserve and repurpose its documentary heritage.

Qbend

Wolters Kluwer Law and Business found the perfect way to offer e-content to law students across the U.S. in Qbend. The result is the eChapters Store, an e-bookstore that works in tandem with Wolters Kluwer’s primary Web site to create a direct sales channel to reach out to consumers. This has led to incremental sales, with customers (who normally do not buy their books) buying specific chapters needed for their courses.

Similarly, the search for the right technology platform (without being burdened by development and implementation issues) led Spain’s first specialty e-tailer Blue Bottle Books to Qbend. A partnership was duly struck to provide the e-tailer with a turnkey solution for content acquisition and marketing. The e-store went “live” within three weeks and has been selling e-books to various countries since then. Offering two language options (English and Spanish), it features a few thousand business and management titles from leading business publishers such as LID Editorial, RA-MA, Amacom, and Berrett-Koehler.

Swift Prosys

Digitizing half a million ancient German newspapers without knowing the language was a challenge for Swift Prosys. Using the client’s auto-segmenting tool, the team had to correctly clip each article, sequence the text blocks, and tag each block according to defined parameters such as byline, title, and section. Clear understanding of the client’s instructions was critical to the project’s success due to the language barrier. This was effectively achieved by spending a whole month on training production staff both in the classroom and on the job. Another 100,000 pages are in production with a total of 500,000 pages to be delivered by the end of the year.

Another project involving 80 children’s educational books in four different languages (Norwegian, Finnish, Swedish, and German) called for the team to produce fixed layout ePub books with a read-aloud feature. Text lengths that varied from language to language made fitting the text blocks into heavily designed and illustrated pages a major challenge. In the three months it took to complete the project, the team also developed an automated tool to convert design-intensive print PDFs into fixed layout ePubs at 85%–95% accuracy.

Thomson Digital

Supporting Elsevier’s SciVerse Scopus, a bibliographic database of abstracts and citations for academic journal articles, has the Chennai team handling nearly 20,500 titles from more than 5,000 international publishers. It also means dealing with content in such languages as Spanish, French, Portuguese, and Dutch. In total, the team delivered ASCII files containing citations, abstracts, and references for nearly 500,000 articles last year. Books will be added to the database in August, and the team is expected to process and deliver approximately 40,000 titles by the end of this year.

The Chennai team also digitized more than 250,000 pages in different formats and languages last year, and approximately 400,000 pages will pass through its production floor by December.

Composition services in Portuguese for the Brazilian market, which started two years back, have been expanded to include meta-tagging and indexing. Since then, the team has completed 60,000 pages and is moving on to editorial services for both books and journals.