The publisher surely has the original TeX (or whatever format the author/editor used) file(s) that can easily be rendered to ePub/Mobi, right?!
Wrong.
Prior to 2000, the MS may never have been submitted electronically, or might be in an obsolete format (WordPerfect, anyone?). Then, until circa 2008, the book was almost certainly copy-edited on paper -- that is, the editor printed out a paper copy and the copy-editor corrected typos by hand; those corrections were then manually transcribed on a DTP system to produce the final output. The author therefore doesn't have an as-published electronic copy.
To make matters worse, all the big publishers outsourced copy-editing, typesetting, and printing many years ago. The typesetting was probably executed using Quark Publishing System on MacOS by an external agency, which then archived any backups on floppy disk or CD-ROM. The publishers do not own these, and in fact cannot acquire them without paying the typesetting bureau a three- (or four-) digit copying fee. So neither the authors nor the publishers own an as-published copy.
These days stuff is typeset on Adobe InDesign, which is a whole lot more ePub-friendly, and which can import Quark files ... except that Quark's format is notoriously idiosyncratic and import ops commonly lose some formatting info (such as italics, font changes, etc.).
TL;DR: any book published before 2007 or thereabouts may be impossible to republish without either OCR or re-typesetting from scratch.
(TeX is pretty much unheard of among authors outside the science fields, and thanks to M$ churning the MS Word file format repeatedly between 1990 and 2008 it may be difficult to do anything with the original manuscript.)
Going forward the picture is brighter: I have as-published ebook editions of all my books and know how to crack the DRM on them to pull that text out in a legible form, with formatting intact. (Yes, I appreciate the irony of this. Why, just last week I emailed three DRM-cracked novels I downloaded off BitTorrent to an editor. I wrote 'em, the editor has a license to publish them in the UK, so it's legal, but ... the irony! It burns!)