Convert to PDF and set PDF name according to folder name in multiple folders

Asked by sword_guy on 2012-05-13

Hello all. I'm looking for a terminal command that I may use with imagemagick to convert multiple jpg files to PDF. I need it to start in a parent directory and go into each subdirectory with picture files and combine them into a PDF in each directory and use the directory name as the name of the PDF.

I'm not going to embarrass myself by even trying to write an example. The only thing I think will be required will be that it starts with "convert". Is this even possible to do? I'm asking this assuming that the command which needs to be written has nothing to do with imagemagick, but rather the terminal. Thanks to anyone that can help.

Question information

Language:
English
Status:
Solved
For:
Ubuntu imagemagick
Assignee:
No assignee
Solved by:
sword_guy
Solved:
2012-05-15
Last query:
2012-05-15
Last reply:
2012-05-13

Are all the PDF files uniquely named?
Where are the files stored, so we can test? If they are uniquely named, this is simple.

sword_guy (sword-guy) said : #2

I should have been more specific. Let's say I have three folders. In each of those folders, there are multiple folders with pictures in them. I want to go into each folder individually and combine the pictures into a pdf and have the pdf be named based on the folder it's in.

An example would be three folders named Sharon, Lois, and Bram. In each of those folders, there are three (or more) folders with names like March 2012, April 2012, and May 2012, each containing any number of pictures which are in numerical order. I want to create one pdf in each of those subfolders. The pdf in the first folder would be named March 2012.pdf (or better yet, Sharon-March 2012) and it would be all of the images in the folder combined into that one pdf. Then I would want to do that for every other folder.

What I've been doing: In terminal, I cd to the folder with the pictures and run convert * (folder name).pdf (I manually type in the folder name). Then I cd to the next folder and run it again....and again...and again...and AGAIN.....AND AGAIN. You see where I'm going with this. I don't only have three folders. There are probably about 30 more and I'm already fed up. Any ideas? I guess I'm just being lazy and want to type one command and walk away from it.
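The naming scheme described above (parent folder, a hyphen, then the subfolder) can probably be derived from a folder's path with basename and dirname. This is only a sketch; pdf_path_for is a made-up helper name, and it assumes the PDF should be written inside the picture folder itself.

```shell
#!/bin/bash
# Sketch: build "<folder>/<parent>-<folder name>.pdf" from a folder path.
# pdf_path_for is a hypothetical helper name, not an existing command.
pdf_path_for() {
    local dir=${1%/}                              # drop a trailing slash, if any
    local name parent
    name=$(basename -- "$dir")                    # e.g. "March 2012"
    parent=$(basename -- "$(dirname -- "$dir")")  # e.g. "Sharon"
    printf '%s/%s-%s.pdf\n' "$dir" "$parent" "$name"
}
```

Calling pdf_path_for "Sharon/March 2012" prints the path of a file named "Sharon-March 2012.pdf" inside that folder, which could then be used as the last argument to convert.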

ahhh I see

try:

gksudo gedit /usr/bin/imagestopdf; sudo chmod +x /usr/bin/imagestopdf

add these lines:

#!/bin/bash
FILES=$(ls *jpg)
for file in $FILES; do
    echo $file
    BASE=$(echo $file | sed 's/.jpg//g')
    echo $BASE

    echo "convert $BASE.jpg $BASE.pdf"
    convert $BASE.jpg $BASE.pdf
done

pdftk *pdf cat output PictureBook.pdf

Save the new file. When you run 'imagestopdf' without the quotes, it will add all the images to a pdf file for you.

You will need to run:

sudo apt-get install pdftk imagemagick

To get the commands needed.

sword_guy (sword-guy) said : #5

I'm sorry, that didn't work. I still have to navigate to each sub folder individually and then run the command. Furthermore, it doesn't combine the files. It just creates a pdf for each of them. Will you please explain each line of that code you gave me? What script is it? Perl? I really don't know much about this, but I really want to learn.

I understand that you basically wrote a program named imagestopdf and then called it from terminal. It just needs to be modified slightly to start at a parent directory and automatically go into any directory with pictures. Then do the combining and create the one pdf per folder. If I knew what script it was, I could tinker with it myself. I would still like help because I'm sure I wouldn't be able to get what I need done with my limited knowledge.

Thanks so much!

P.S. To add to the issues, some of the folders have files ending in jpg and others have files ending in JPG. Would the easiest thing to do be to first make a command that will batch change all filenames to jpg from JPG? The command does seem to care if they're capitalized.
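Renaming may not be needed for the jpg/JPG problem, since bash can be told to match file names without regard to case. The sketch below uses the nocaseglob and nullglob shell options; list_jpgs is an illustrative name, and the options are set inside a subshell so they do not leak into the rest of the session.

```shell
#!/bin/bash
# Sketch: list the JPEG files in one directory regardless of extension case.
# list_jpgs is a hypothetical helper name, not part of ImageMagick.
list_jpgs() {
    local dir=$1
    (
        # nocaseglob: *.jpg also matches *.JPG and *.Jpg
        # nullglob: an unmatched pattern expands to nothing instead of itself
        shopt -s nocaseglob nullglob
        files=("$dir"/*.jpg)
        # Print one path per line; print nothing if there were no matches
        if ((${#files[@]})); then
            printf '%s\n' "${files[@]}"
        fi
    )
}
```

With find, the -iname test gives the same case-insensitive matching without changing any shell options.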

sword_guy (sword-guy) said : #6

Two things to add:

One, I did some testing and fooling around and figured out that your program does combine all the PDFs to one big one. I guess that's ok. Bit of a memory hog, but ok anyway. The one thing I would ask is if that's the way it has to be done, can there be a line in there to delete all the other converted PDFs? Also, the program doesn't handle spaces in the filenames. And there is still the problem of it not doing this conversion recursively and creating a final PDF for each folder and based on that folder's name. If it's too complicated, I understand. I'll just go through each folder individually and run the convert command. If it's possible to do it with a program, then that would be preferable. It's kind of interesting to me to see what can be done and how.

sword_guy (sword-guy) said : #7

Sorry to keep posting, but I'm still working on it myself. I've modified your code a bit and have gotten to the point where it will convert the individual files in all the folders to pdf and then it makes a file in the parent folder called PictureBook.pdf. That pdf contains only 1 page which is the last page it gets from the second find command. Also, it still won't accept spaces in file names. Here is the new code:

#!/bin/bash
FILES=$(find -iname '*.jpg')
for file in $FILES; do
    echo $file

    echo "convert $file $file.pdf"
    convert $file $file.pdf
done

NEWFILES=$(find -iname '*.pdf')
for file in $NEWFILES; do
    echo $file
done

pdftk $file cat output PictureBook.pdf

I've found it doesn't matter where the second "done" is placed. Same result.
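The one-page result follows from how the loop variable works: once the second loop has finished, $file holds only the last name it saw, so pdftk receives a single input. A possible fix, sketched here, is to collect every name into an array and pass the whole array to pdftk. collect_pdfs and pdfs are illustrative names; sort -z is assumed to be available, as it is with GNU coreutils on Ubuntu.

```shell
#!/bin/bash
# Sketch: gather every PDF under a directory into an array, so that all of
# them (not just the last one) can be handed to pdftk in a single call.
# collect_pdfs and the pdfs array are illustrative names.
collect_pdfs() {
    pdfs=()
    local f
    # -print0 and read -d '' keep names containing spaces intact
    while IFS= read -r -d '' f; do
        pdfs+=("$f")
    done < <(find "$1" -iname '*.pdf' -print0 | sort -z)
}

# Intended use (assumes pdftk is installed):
#   collect_pdfs .
#   pdftk "${pdfs[@]}" cat output PictureBook.pdf
```

If PictureBook.pdf already exists from an earlier run, it would be picked up as an input too, so it is safer to delete it or write the output somewhere outside the searched directory.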

sword_guy (sword-guy) said : #8

Ok, making more progress. My program now looks like this:

#!/bin/bash
FILES=$(find -iname '*.jpg')
for file in $FILES; do
    dir=$(dirname $file)
    echo $dir #This was so I could see what the program was outputting when I tried different things
    echo $file

    echo "convert $file $dir $file.pdf"
    convert $FILES $dir/$dir.pdf
done

Now it will create a pdf in each sub folder with the sub folder's name, but the pdfs are all the same because FILES refers to every single file it finds, not just the files in a particular directory. Any ideas?

sword_guy (sword-guy) said : #9

I got it! After hours of fooling around, I finally got it. Thank you, actionparsnip, for pointing me in the right direction! Your post taught me that one can make a program, and taught me a little of the syntax. For the rest, there was google. The only issue is it won't accept files or folders with a space (which is actually Imagemagick's issue, not this program's). To fix that, I found a wonderful other program to replace spaces with underscores (and change uppercase letters to lowercase...but I removed that part). If you navigate to the parent directory in terminal and run this (after creating it the same way imagestopdf was created), it will fix all of the files from the parent directory on. Here's the space removing program (I named it cleanfilename):

if [ -n "$1" ]
then
  if [ -d "$1" ]
  then
    cd "$1"
  else
    echo invalid directory
    exit
  fi
fi

for i in *
do
  OLDNAME="$i"
  NEWNAME=`echo "$i" | tr ' ' '_' | sed s/_-_/-/g`
  if [ "$NEWNAME" != "$OLDNAME" ]
  then
    TMPNAME="$i"_TMP
    echo ""
    mv -v -- "$OLDNAME" "$TMPNAME"
    mv -v -- "$TMPNAME" "$NEWNAME"
  fi
  if [ -d "$NEWNAME" ]
  then
    echo Recursing lowercase for directory "$NEWNAME"
    $0 "$NEWNAME"
  fi
done

____________________________________________________________________________________________________________

Ok, after running that, it was time to run mine. In terminal, cd to the parent directory. Then run imagestopdf. This is what it looks like:

#!/bin/bash

folders=$(find -type d)

for dir in $folders; do
    DIR=$dir
    echo $DIR

    for file in $DIR; do
        FILES=$(find $DIR -maxdepth 1 -iname '*.jpg')

        echo "Combining files in $DIR and converting to $DIR.pdf"
        convert $FILES $DIR.pdf

    done
    echo "Finished creating file."
done

That does exactly what I want. It creates a pdf named after whatever folder the images came from and places it in that folder's parent directory. That's fine for me. If anybody else is to use this, make sure your folders are aptly named! On my system this also required Imagemagick compiled from source (the version installed from Synaptic didn't handle filetypes such as jpg). Also, on my computer (4 GB RAM and 3.2 GHz), it wouldn't do more than about 200 pictures at a time, but I think this is related to the quality and size of the pictures. I just broke up the images into multiple folders and added "-Part1", "-Part2", etc. to the end of the folder name. Remember, this script is using Imagemagick's convert utility, so it's only capable of doing what Imagemagick's capable of doing.
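A note on the space problem: it most likely comes from the shell rather than from Imagemagick, because an unquoted variable such as $FILES is split on whitespace before convert ever sees it. The sketch below is one way to get the same result without renaming anything first. It assumes bash, a find that supports -print0, and a sort that supports -z; folders_to_pdf is an illustrative name. It also skips folders that contain no images.

```shell
#!/bin/bash
# Sketch of a space-tolerant variant of imagestopdf. folders_to_pdf is an
# illustrative name; like the script above, it writes <folder>.pdf next to
# each folder that directly contains JPEG files.
folders_to_pdf() {
    local root=${1:-.} dir f
    local files
    # -mindepth 1 skips the starting directory itself
    while IFS= read -r -d '' dir; do
        files=()
        # Collect this folder's own images, sorted, one array element each
        while IFS= read -r -d '' f; do
            files+=("$f")
        done < <(find "$dir" -maxdepth 1 -type f -iname '*.jpg' -print0 | sort -z)

        # A folder without images gives convert nothing to do, so skip it
        ((${#files[@]})) || continue

        echo "Combining ${#files[@]} files in $dir into $dir.pdf"
        # Quoting keeps each name, spaces included, as a single argument;
        # </dev/null stops convert from touching the loop's input
        convert "${files[@]}" "$dir.pdf" </dev/null
    done < <(find "$root" -mindepth 1 -type d -print0)
}

# Intended use, from the parent directory:
#   folders_to_pdf .
```

The arrays are the important part: each file name stays one array element, so a folder called "March 2012" is passed to convert as one argument instead of two.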

sword_guy (sword-guy) said : #10

Oops! I just finally looked further through the PDFs that were created. Some pages were out of order! I figured out that the easy way to fix it was just to change the line

"FILES=$(find $DIR -maxdepth 1 -iname '*.jpg')"

to

"FILES=$(find $DIR -maxdepth 1 -iname '*.jpg' | sort)"

Very simple fix, and seems to be working properly now.

I suppose the same thing could be added to the end of the folders line, but I'm not worried about in which order the files are created.
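One caveat with plain sort: it compares names as text, so unless the numbers are zero-padded, 10.jpg ends up ahead of 2.jpg. GNU sort, as shipped with Ubuntu, has a -V (version sort) option that orders embedded numbers naturally. A small sketch on made-up file names:

```shell
#!/bin/bash
# Sketch: compare plain and version-aware sorting on sample names.
# The three file names are made up for illustration.
names=$'1.jpg\n10.jpg\n2.jpg'

plain=$(printf '%s\n' "$names" | sort)       # text comparison
natural=$(printf '%s\n' "$names" | sort -V)  # -V: number-aware (GNU sort)

# The unquoted expansions join each result onto one line for display
printf 'plain:   %s\n' "$(echo $plain)"
printf 'natural: %s\n' "$(echo $natural)"
```

In the script, that would mean ending the find line with "| sort -V" instead of "| sort".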