Convert Media

Sometimes one simply needs to convert a video, audio file or document to another format.

Text encoding

Text encoding can go badly wrong, especially when the language requires special characters like àäç. The command iconv can convert from one encoding to another.

# iconv -f <from_encoding> -t <to_encoding> <input_file>
# iconv -f ISO8859-1 -t UTF-8 file.input > file_utf8
# iconv -l                           # List known coded character sets

Without the -f option, iconv will use the local char-set, which is usually fine if the document displays well.

Convert filenames from one encoding to another (not file content). This also works if only some of the files are already utf8.

# convmv -r -f utf8 --nfd -t utf8 --nfc /dir/* --notest

Unix - DOS newlines

Convert DOS (CR/LF) to Unix (LF) newlines and back within a Unix shell. See also dos2unix and unix2dos if you have them.

# sed 's/.$//' dosfile.txt > unixfile.txt                  # DOS to UNIX (assumes every line ends with CR)
# awk '{sub(/\r$/,"");print}' dosfile.txt > unixfile.txt   # DOS to UNIX
# awk '{sub(/$/,"\r");print}' unixfile.txt > dosfile.txt   # UNIX to DOS

Convert Unix to DOS newlines within a Windows environment. Use sed or awk from mingw or cygwin.

# sed -n p unixfile.txt > dosfile.txt
# awk 1 unixfile.txt > dosfile.txt   # UNIX to DOS (with a cygwin shell)

Replace the ^M (CR) Mac newlines with Unix newlines. To type a ^M, press CTL-V then CTL-M.

# tr '^M' '\n' < macfile.txt

PDF to Jpeg and concatenate PDF files

Convert a PDF document with gs (GhostScript) to jpeg (or png) images for each page. The same is much shorter with convert and mogrify (from ImageMagick or GraphicsMagick).
# gs -dBATCH -dNOPAUSE -sDEVICE=jpeg -r150 -dTextAlphaBits=4 -dGraphicsAlphaBits=4 \
 -dMaxStripSize=8192 -sOutputFile=unixtoolbox_%d.jpg unixtoolbox.pdf
# convert unixtoolbox.pdf unixtoolbox-%03d.png
# convert *.jpeg images.pdf          # Create a simple PDF with all pictures
# convert image000* -resample 120x120 -compress JPEG -quality 80 images.pdf
# mogrify -format png *.ppm          # convert all ppm images to png format

Ghostscript can also concatenate multiple pdf files into a single one. This only works well if the PDF files are "well behaved".

# gs -q -sPAPERSIZE=a4 -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=all.pdf \
 file1.pdf file2.pdf ...             # On Windows use '#' instead of '='

Extract images from a pdf document using pdfimages from poppler or xpdf (http://foolabs.com/xpdf/download.html).

# pdfimages document.pdf dst/        # extract all images and put in dst
# yum install poppler-utils          # install poppler-utils if needed. or:
# apt-get install poppler-utils

Convert video

Compress the Canon digicam video with an mpeg4 codec and repair the crappy sound.

# mencoder -o videoout.avi -oac mp3lame -ovc lavc -srate 11025 \
 -channels 1 -af-adv force=1 -lameopts preset=medium -lavcopts \
 vcodec=msmpeg4v2:vbitrate=600 -mc 0 videoin.AVI

See sox for sound processing.

Copy an audio cd

The program cdparanoia (http://xiph.org/paranoia/) can save the audio tracks (FreeBSD port in audio/cdparanoia/), oggenc can encode in Ogg Vorbis format, lame converts to mp3.

# cdparanoia -B                      # Copy the tracks to wav files in current dir
# lame -b 256 in.wav out.mp3         # Encode in mp3 256 kbit/s
# for i in *.wav; do lame -b 256 "$i" "$(basename "$i" .wav).mp3"; done
# oggenc -b 256 -o out.ogg in.wav    # Encode in Ogg Vorbis 256 kbit/s
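
The lame loop above can be made a little more defensive. The following is a minimal sketch under stated assumptions, not a definitive script: it assumes lame is installed, and the names encode_dir and ENCODER are made up for this illustration (ENCODER is only a hook for substituting another command that takes the same arguments). It quotes file names so that tracks with spaces in their names survive, and it does nothing when the directory holds no .wav files.

```shell
#!/bin/sh
# Sketch: encode every .wav file in a directory to an .mp3 next to the original.
# ENCODER is a hypothetical override (not a lame feature); it defaults to lame.
encode_dir() {
    dir=${1:-.}                            # directory to scan, default: current dir
    for wav in "$dir"/*.wav; do
        [ -e "$wav" ] || continue          # no match: the glob is left unexpanded
        mp3="${wav%.wav}.mp3"              # strip the suffix, keep the directory part
        "${ENCODER:-lame}" -b 256 "$wav" "$mp3" || return 1
    done
}
```

After cdparanoia -B has written the tracks, calling encode_dir with no argument encodes the wav files in the current directory. The loop stops at the first failed encode instead of continuing silently.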