In the STRATER Log Batch Production HOWTO I described how to produce a large number of PDF and JPG files as output of STRATER, and I vaguely promised to post a script that automates geo-tagging of the resulting JPG files.
With a combination of Adobe Acrobat and ARCInfo GIS it may be possible to convert technical drawings or PDF files to JPG files and georeference them. However, my intention was to achieve this with low-cost software tools and utilities. Here it is.
Let us assume that STRATER has produced a folder full of log files in PDF format, and all files were named using A-PDF Rename according to the borehole names. It is highly likely that somewhere on your system there is a spreadsheet file that contains the borehole names and the coordinates. If they are already in decimal latitude/longitude format: great! If not, you should convert them.
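If your spreadsheet holds degrees, minutes and seconds, the conversion to decimal degrees is simple arithmetic. The following is only a minimal sketch in Python (which bgt.bat itself does not need); the function name dms_to_decimal is made up for this illustration, and it assumes the hemisphere is given as a single letter N, S, E or W.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees."""
    letter = hemisphere.upper()
    if letter not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = abs(degrees) + minutes / 60.0 + seconds / 3600.0
    # By convention, southern latitudes and western longitudes are negative.
    if letter in ("S", "W"):
        value = -value
    return round(value, 6)


if __name__ == "__main__":
    # 52 deg 30 min 0 sec North, 13 deg 24 min 36 sec East
    print(dms_to_decimal(52, 30, 0, "N"))
    print(dms_to_decimal(13, 24, 36, "E"))
```

The rounded values can then be pasted into the Latitude and Longitude columns of the coordinate file.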
To be able to use this script, you will need to install the following software packages:
- IrfanView (http://www.irfanview.com) including all plugins; private use is free, and Irfan asks for a small and very reasonable license fee for professional use. If you are using my script (bgt.bat) in a commercial or business context, please register and buy a license.
- 32bit(!) Ghostscript (http://www.sourceforge.net/projects/ghostscript),
- Exiftool (http://www.sno.phy.queensu.ca/~phil/exiftool),
- Exiftool GUI is optional, and good for checking the results. It can be found at (http://www.freeweb.siol.net/hrastni3/foto/exif/exiftoolgui.htm),
- GNU Win Tools for MS Windows (http://sourceforge.net/projects/gnuwin32),
- GNU Win Linux Utilities util-linux-ng for Windows (http://gnuwin32.sourceforge.net/packages/util-linux-ng.htm),
- GNU gawk (http://gnuwin32.sourceforge.net/packages/gawk.htm),
- GPSBabel (http://www.gpsbabel.org).
At the time of writing of this script, all software except IrfanView (for professional purposes) was free of charge.
All directories of the executables mentioned above need to be in the PATH variable of your Windows system.
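A quick way to see whether the PATH is set up correctly is to look each executable up before running the batch file. The sketch below is only a suggestion and assumes Python is available; the list of names is taken from the commands that bgt.bat calls, and the function name missing_tools is hypothetical. Ghostscript is not in the list because the script does not call it directly.

```python
import shutil

# Executable names as they are called in bgt.bat.
REQUIRED_TOOLS = ["i_view32", "exiftool", "gawk", "gpsbabel",
                  "join", "cat", "cp", "mv", "rm", "ls", "gmkdir"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from the list that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

Any name reported as missing points to a directory that still needs to be added to the PATH variable.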
The script runs successfully on my Windows XP system; I have not checked it on others. Please let me know if you got it to work on other systems as well, and I will post that here.
The script expects a coordinate file named “coord.txt” in the working directory. It is a tab-delimited table with the columns
Name | Latitude | Longitude
The “Name” can be in any format (I think), but I am not sure whether it may contain spaces; that part still needs to be tested. The script expects latitude and longitude in decimal degrees. There should be *no* header row in this file; it will be generated automatically during the batch process.
The script also expects all pdf files that are to be transformed in the same directory as the coordinate file.
The script itself should be in the same directory as the coord.txt and pdf files.
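Before starting the batch run, it can save time to check the coordinate file against the rules above. This is a rough sketch in Python, not part of bgt.bat; the function name check_coord_lines and the wording of the messages are my own, and the range check simply assumes latitudes between -90 and 90 and longitudes between -180 and 180.

```python
import os


def check_coord_lines(lines):
    """Return a list of (line_number, message) problems found in coord.txt lines."""
    problems = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # blank lines are ignored here
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append((number, "expected 3 tab-separated columns"))
            continue
        name, lat_text, lon_text = (field.strip() for field in fields)
        try:
            lat, lon = float(lat_text), float(lon_text)
        except ValueError:
            problems.append((number, "not decimal numbers (header row?)"))
            continue
        if not -90 <= lat <= 90 or not -180 <= lon <= 180:
            problems.append((number, "coordinates out of range"))
        if " " in name:
            problems.append((number, "name contains a space (untested)"))
        if name in seen:
            problems.append((number, "duplicate name"))
        seen.add(name)
    return problems


if __name__ == "__main__" and os.path.exists("coord.txt"):
    with open("coord.txt") as handle:
        for number, message in check_coord_lines(handle):
            print("line %d: %s" % (number, message))
```

An empty result only means the file has the expected shape; it does not prove that the coordinates themselves are right.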
All else should run automatically.
Should there be any problems, send me a mail or comment in the blog.
This script is provided for free, as I have profited a lot from the GNU community, and I am happy to return something. Nevertheless, please be aware that I cannot spend too much effort on support as long as I am providing it for free.
I think some of the code may be less than optimal, so I am thankful for any optimisation of the script. Please feel free to comment.
Simply copy and paste the text below into a text editor and save as bgt.bat (bgt is for Batch Geo-Tagging) in the directory where the PDF files and the coordinate file are stored.
To produce the Google Earth kml file, exiftool will need a kml.fmt file to define the kml output format. You can find an example here. You may need or want to edit it to adapt it to your needs. It needs to live in the same directory as the other files.
Then open the directory and double-click the bgt.bat file. A command prompt window will open and report what is happening.
This script is under GNU General Public License or GPL (http://www.gnu.org/copyleft/gpl.html). However, any software used in this script is under its own and separate license, which may or may not be different from the GPL. Please make sure to comply with the respective licenses. I do not ask for a fee, and as it comes along under the GNU GPL, there is no warranty for this script, neither assuring that it is fit for your purpose, nor that it does not interfere with anything on your computer. The script is provided for free as in “free beer”, and may be modified and shared as long as the original source (i.e. this web site) and its copyright is mentioned.
------ Start of batch script ------
@echo off
cls
echo This batch file was developed by Thorsten Kallnischkies and is under GNU General Public License or GPL (http://www.gnu.org/copyleft/gpl.html). However, any software used in this script is under its own and separate license, which may or may not be different from the GPL. Please make sure to comply with the respective licenses. There is no warranty for this script, neither assuring that it is fit for your purpose, nor that it does not interfere with anything on your computer, and if something breaks you can keep the pieces.
rem START /min mplay32 /play /close %windir%\media\ding.wav
pause
echo CONVERTING ORIGINAL PDF FILES TO JPG FILES
i_view32 .\*.pdf /invert /invert /convert=*.jpg /one /killmesoftly
echo SETTING UP JPG FILES FOR GEOTAGGING...
rem Copy the file modification date into the EXIF date tags used for geotagging.
exiftool "-FileModifyDate>CreateDate" -overwrite_original .\*.jpg
exiftool "-CreateDate>DateTimeOriginal" -overwrite_original .\*.jpg
exiftool "-CreateDate>ModifyDate" -overwrite_original .\*.jpg
echo JPG FILE DATES ARE UPDATED, NOW PRODUCING A FILE WITH FILENAMES AND TIME STAMPS
exiftool .\*.jpg -FileName -CreateDate -T > borehole_date.txt
echo REMOVING ".JPG" FROM THE FILE NAMES IN THE FILE
gawk {gsub(/\.jpg$/,\"\",$1);print} borehole_date.txt > borehole_date1.txt
echo REMOVING LEADING "+" AND "-" FROM UTC RELATED TIME STAMPS
gawk {gsub(/+/,\",+\",$5);print} borehole_date1.txt > borehole_date_2a.txt
gawk {gsub(/-/,\",-\",$5);print} borehole_date_2a.txt > borehole_date_2b.txt
echo REPLACING COLONS WITH SLASHES IN DATES
gawk {sub(/:/,\"/\");print} borehole_date_2b.txt > borehole_date_3.txt
gawk {sub(/:/,\"/\");print} borehole_date_3.txt > borehole_date_4.txt
echo PRODUCING GPX RAW DATA FILE
rem join matches on the first column and expects both files to be sorted by it.
join coord.txt borehole_date_4.txt > gpx_raw_data.txt
gawk {print"$1"\",\""$2"\",\""$3"\",\""$4"\",\""$5"} gpx_raw_data.txt > gpx_raw_data.csv
rem echo CLEANING UP…
rem gmkdir .\config
rem mv coord.txt .\config
rem rm .\*.txt
echo CREATING HEADER FILE
echo name,lat,lon,date,time,utc > gpx_raw_file.csv
echo ADDING THE GPX RAW DATA
cat gpx_raw_data.csv >> gpx_raw_file.csv
echo GPX RAW DATA FILE PRODUCED
echo NOW CONVERTING RAW DATA TO GPX TRACK FILES...
gpsbabel -i unicsv -f gpx_raw_file.csv -o gdb -x transform,trk=wpt -F coord.gdb
gpsbabel -i gdb -f coord.gdb -o gpx -F coord.gpx
echo GPX TRACK FILES ARE PRODUCED, NOW GEOTAGGING THE JPG FILES...
exiftool -geotag coord.gpx -overwrite_original .\*.jpg
exiftool "-CreateDate>FileModifyDate" -overwrite_original .\*.jpg
echo JPG FILES ARE GEOTAGGED
echo NOW PRODUCING SMALLER VERSIONS OF JPG FILES...
gmkdir .\jpg_small
gmkdir .\pdf
i_view32 .\*.jpg /resize=(800,800) /aspectratio /convert=.\jpg_small\*.jpg
rem THIS NEEDS TO BE EDITED MANUALLY IF THE "RESIZE" PARAMETER IS TO BE CHANGED.
echo SMALLER JPGS ARE SAVED
echo NOW PREPARING THE KML FILE IN DIRECTORY JPG_SMALL ...
cp .\kml.fmt .\jpg_small
cd .\jpg_small
echo NOW PREPARING THE KML FILE FOR GOOGLE EARTH
exiftool -p .\kml.fmt .\*.jpg > .\out.kml
cd ..
echo ALL DONE, NOW CLEANING UP, PLEASE BE PATIENT...
gmkdir .\jpg_original
gmkdir .\config
mv .\*.jpg .\jpg_original
mv .\*.pdf .\pdf
mv .\*.g** .\config
mv .\coord.txt .\config
mv .\kml.fmt .\config
rm .\*.csv
rm .\*.txt
echo CLEANING UP IS COMPLETE, NOW SAVING README.TXT FILE WITH FILE INFORMATION...
echo All pdf files formerly in this directory should have been converted into jpg format and georeferenced, reduced in size and transformed into a Google Earth readable kml file. Please check the results: >> .\readme.txt
echo ----------------------------------- >> .\readme.txt
echo These files are the original PDF files, which have been moved to the subdirectory PDF: >> .\readme.txt
ls -al .\pdf >> .\readme.txt
echo ----------------------------------- >> .\readme.txt
echo These files have been produced, georeferenced and moved to the subdirectory JPG_ORIGINAL: >> .\readme.txt
echo - >> .\readme.txt
ls -al .\jpg_original >> .\readme.txt
echo ----------------------------------- >> .\readme.txt
echo These files have been reduced in size, georeferenced, converted into a Google Earth kml file, >> .\readme.txt
echo and moved to the subdirectory JPG_SMALL: >> .\readme.txt
echo - >> .\readme.txt
ls -al .\jpg_small >> .\readme.txt
echo ----------------------------------- >> .\readme.txt
echo The GPS files and the original coordinate file have been moved to the CONFIG directory. >> .\readme.txt
echo - >> .\readme.txt
ls -al .\config >> .\readme.txt
rem START /min mplay32 /play /close %windir%\media\tada.wav
echo DONE!
echo.
echo PLEASE READ THE README.TXT FILE AND CHECK YOUR FILE DATA
echo HIT ANY KEY TO EXIT.
pause
------ End of batch script ------
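The middle part of the script (the gawk and join steps) is the hardest to read, so here is a small Python sketch of what those steps are meant to produce: one CSV row per borehole with name, latitude, longitude, date and time. It is an illustration only and not a replacement for the batch file. The function name build_gpx_rows is hypothetical, the input is assumed to look like the tab-separated output of exiftool -T (file name, then a time stamp such as 2010:05:12 14:03:22), and the handling of UTC offsets is left out. Unlike GNU join, which expects both files to be sorted on the name column, this version does not depend on the order of the lines and reports names it cannot match.

```python
def build_gpx_rows(coord_lines, date_lines):
    """Join coordinates and time stamps on the borehole name.

    coord_lines: lines of the form "name<TAB>latitude<TAB>longitude"
    date_lines:  lines of the form "file.jpg<TAB>YYYY:MM:DD HH:MM:SS"
    Returns (rows, unmatched): CSV rows and names without coordinates.
    """
    coords = {}
    for line in coord_lines:
        if line.strip():
            name, lat, lon = line.rstrip("\r\n").split("\t")[:3]
            coords[name] = (lat, lon)

    rows, unmatched = [], []
    for line in date_lines:
        if not line.strip():
            continue
        file_name, stamp = line.rstrip("\r\n").split("\t")[:2]
        # Strip the ".jpg" extension so the name matches coord.txt.
        name = file_name[:-4] if file_name.lower().endswith(".jpg") else file_name
        date, _, time = stamp.partition(" ")
        # Only the date part changes: colons become slashes.
        date = date.replace(":", "/")
        if name in coords:
            lat, lon = coords[name]
            rows.append(",".join([name, lat, lon, date, time]))
        else:
            unmatched.append(name)
    return rows, unmatched


if __name__ == "__main__":
    rows, unmatched = build_gpx_rows(
        ["BH01\t52.5\t13.41"],
        ["BH01.jpg\t2010:05:12 14:03:22", "BH03.jpg\t2010:05:12 14:04:00"],
    )
    print(rows)
    print(unmatched)
```

If a log ends up without coordinates after the batch run, an unmatched or unsorted name in coord.txt is one of the first things worth checking.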
You can now check the coordinates in the EXIF data with Exiftool GUI: simply load all JPGs into it and flick through the files. If something is not working, let me know and I will try to assist you.
This batch file should run on any PDF/JPG files with an associated coordinate file, such as technical drawings of all kinds from a CAD system, etc.
I am looking at translating the script to Linux, but as IrfanView is used for PDF to JPG conversion here and does not run under Linux, I need to look deeper into the Ghostscript parameters to get that done first. It may take a while and does not look that urgent, as STRATER runs under Windows as well, which was the starting point of it all…
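As a starting point for that, a Ghostscript call for a single PDF might look roughly like the command assembled below. This is an untested sketch, not what bgt.bat does: the resolution of 150 dpi and the JPEG quality of 90 are arbitrary assumptions, the executable is assumed to be called gs as on most Linux systems, only the first page is rendered, and the function name ghostscript_jpg_command is made up for this example.

```python
def ghostscript_jpg_command(pdf_path, jpg_path, dpi=150, quality=90):
    """Build a Ghostscript command line that renders page 1 of a PDF as JPG."""
    return [
        "gs",
        "-dSAFER", "-dBATCH", "-dNOPAUSE",    # run unattended and exit when done
        "-sDEVICE=jpeg",                      # JPEG output device
        "-r%d" % dpi,                         # rendering resolution in dpi
        "-dJPEGQ=%d" % quality,               # JPEG quality, 0 to 100
        "-dFirstPage=1", "-dLastPage=1",      # first page only
        "-sOutputFile=" + jpg_path,
        pdf_path,
    ]


if __name__ == "__main__":
    print(" ".join(ghostscript_jpg_command("BH01.pdf", "BH01.jpg")))
```

The resulting list could be handed to subprocess.run(command, check=True) once the parameters have been tuned against real log files.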
Enjoy!


