Quote:
Originally Posted by RogerBW
If I were to do this my inclination would be to parse the contents page for the article titles and page numbers and build it off that. (You'd get some full-page adverts at the ends of articles but that shouldn't be the end of the world.)
This is pretty much my go-to plan as well, but after several tries it seemed simpler to just write them down by hand. As it happens, I had some time to kill on a train today and managed to get the first 60 issues' indexes into a text file in the following format:
[article first page]-[article last page]; [article name]
Dash and article last page are omitted on one-page articles.
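For what it's worth, a line in that format could be parsed with something like the sketch below. The names (Article, parse_index_line) are made up for illustration, and I'm assuming page numbers are plain digits and that the first semicolon separates the pages from the title.

```python
import re
from typing import NamedTuple


class Article(NamedTuple):
    """One index entry (hypothetical structure, not from any library)."""
    first: int
    last: int
    title: str


# "<first>-<last>; <title>" or "<first>; <title>" for one-page articles.
LINE_RE = re.compile(r"^\s*(\d+)(?:\s*-\s*(\d+))?\s*;\s*(.+?)\s*$")


def parse_index_line(line: str) -> Article:
    """Parse one index line; a missing last page means a one-page article."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError(f"unrecognised index line: {line!r}")
    first = int(m.group(1))
    last = int(m.group(2)) if m.group(2) else first
    if last < first:
        raise ValueError(f"last page before first page: {line!r}")
    return Article(first, last, m.group(3))
```

Titles that themselves contain a semicolon are fine, since only the first one is treated as the separator.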
With this data it should be fairly simple to write a script that uses a utility like pdftk to split the issue PDFs into individual articles.
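A rough sketch of the pdftk step, assuming the usual "pdftk in.pdf cat 12-15 output out.pdf" invocation. The function names and the page_offset parameter are my own invention: the offset is there because the printed page numbers may not line up with the PDF page numbers (covers and so on), which would need checking per issue.

```python
import subprocess


def pdftk_command(issue_pdf, first, last, out_pdf, page_offset=0):
    """Build the pdftk argument list for one article.

    page_offset is an assumed correction added to the printed page
    numbers to get PDF page numbers; 0 means they already match.
    """
    start = first + page_offset
    end = last + page_offset
    page_range = str(start) if start == end else f"{start}-{end}"
    return ["pdftk", issue_pdf, "cat", page_range, "output", out_pdf]


def extract_article(issue_pdf, first, last, out_pdf, page_offset=0):
    """Run pdftk (must be installed and on PATH) to write one article."""
    cmd = pdftk_command(issue_pdf, first, last, out_pdf, page_offset)
    subprocess.run(cmd, check=True)
```

Building the command separately from running it makes it easy to print the commands first as a dry run before touching any files.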
As a side note, I'm wondering how to name the article files. I'm torn between putting the issue number before or after the article name; both seem to have benefits.