Thursday, February 16, 2012

Adventures migrating photos from iPhoto 9 to Digikam 2.1.1

After we updated our Mac mini to Snow Leopard, the system was clearly underpowered. My wife and I both stay logged in most of the time, and just running browsers caused it to swap far too much. I decided to build a Shuttle SH67H3 with an i5-2500K CPU and 16 GB of RAM from NewEgg, and we are now running Linux Mint 12. The most important part of the transition was to not lose all our iPhoto albums. I did not want to redo them from scratch, so I set out on Google and found photokam ( https://sites.google.com/site/laurentbovet/photokam ), which did the majority of the lifting, but I did not want a straight copy from iPhoto.

iPhoto splits your image tree into Masters (Originals is now a link)/Year/Date(Roll)/etc... and Previews (Modified is now a link)/Year/Date(Roll)/etc..., but digikam ( http://www.digikam.org/ ) lets you manage your tree however you like and makes a new version of a photo by appending _v1 to the basename of the file.

Just to make sure I did not leave myself with the easiest transition possible :-P I decided to simply rsync my iPhoto Library Masters directory to /home/photos on the new machine. I did not want a separate directory tree for the modified versions, so once the Masters/Originals were synced I copied the Previews/Modified directory to /home/photo_temp and planned to move each of those files next to its master, in the same path, as a -v1 version. I'm not sure why, but iPhoto sometimes changes the extension from .jpg to .JPG during or after the copy process. So I wrote a script, but it only gets you 90% of the way there. You'll still have to do some manual cleanup, because iPhoto seems to put some modified versions (cropped ones, for example) in an entirely different directory (Roll).


Here is my update_photo script to move the Previews (Modified) versions over to the new master location:
----------------- cut -------------------------
#!/usr/bin/perl
# Run from the top of the copied Previews/Modified tree (/home/photo_temp).
# Each file is moved next to its master under /home/photos as <name>-v1.<ext>.

use strict;
use warnings;
use File::Copy;
use File::Find;

find({ wanted => \&process_file, no_chdir => 1 }, ".");

# Move $src next to $master, inserting -v1 before the final extension.
sub move_as_v1 {
  my ($src, $master) = @_;
  my $new_name = $master;
  # Skip names without an extension so a master is never overwritten
  unless ( $new_name =~ s/(\.[^.\/]+)$/-v1$1/ ) {
    warn "no extension, skipping: $src\n";
    return;
  }
  print "move $src $new_name\n";
  move($src, $new_name) or warn "move failed for $src: $!\n";
}

sub process_file {
  my $tempfilename = $File::Find::name;
  if ( -f $tempfilename ) {
    if ( -f "/home/photos/$tempfilename" ) {
      # A master exists under exactly the same name
      move_as_v1($tempfilename, "/home/photos/$tempfilename");
    } else {
      # iPhoto sometimes changes the extension case, so try .JPG first ...
      my $upper_name = $tempfilename;
      $upper_name =~ s/\.jpg$/.JPG/;
      if ( -f "/home/photos/$upper_name" ) {
        move_as_v1($tempfilename, "/home/photos/$upper_name");
      } else {
        # ... then fall back to .jpg without checking that a master exists
        my $lower_name = $tempfilename;
        $lower_name =~ s/\.JPG$/.jpg/;
        move_as_v1($tempfilename, "/home/photos/$lower_name");
      }
    }
  } else {
    print " This is NOT a file: $tempfilename\n";
  }
}
----------------- cut -------------------------
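Since the script only gets you most of the way, it can help to list up front which preview files have no master under any of the name variants it tries. Those are the ones that will need manual cleanup. This is only a minimal Python sketch, assuming the same two directory trees as above; the function names candidate_names and find_orphans are made up for illustration and are not part of photokam or digikam.

```python
import os

def candidate_names(rel_path):
    """Return rel_path plus the variant with the .jpg/.JPG extension case swapped."""
    root, ext = os.path.splitext(rel_path)
    names = [rel_path]
    if ext == ".jpg":
        names.append(root + ".JPG")
    elif ext == ".JPG":
        names.append(root + ".jpg")
    return names

def find_orphans(preview_root, master_root):
    """List preview files (as relative paths) with no master under any candidate name."""
    orphans = []
    for dirpath, _dirs, files in os.walk(preview_root):
        for name in files:
            rel = os.path.relpath(os.path.join(dirpath, name), preview_root)
            has_master = any(
                os.path.isfile(os.path.join(master_root, candidate))
                for candidate in candidate_names(rel)
            )
            if not has_master:
                orphans.append(rel)
    return sorted(orphans)
```

Calling find_orphans("/home/photo_temp", "/home/photos") returns the relative paths that would not line up with a master. It only reads the two trees and does not move anything.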
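Next you have to tweak the photokam scripts so they migrate the information from the AlbumData.xml file, which I copied over from the iPhoto Library directory. Since the files were already copied over, I just commented out the copy commands. In the diffs below, the lines marked < are my modified version and the lines marked > are the original photokam-0.5 code: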
# diff prepare.py photokam-0.5/prepare.py 
8c8
< file_extensions=('jpg', 'JPG', 'jpeg', 'JPEG')
---
> file_extensions=('jpg', 'JPG', 'jpeg', 'JPEG', 'tif', 'TIF', 'tiff', 'TIFF', 'avi', 'AVI')
23c23
<         input = '.'
---
>         input = 'iPhoto Library'
30c30
<     debug = True
---
>     debug = False
128,129c128
<         print('                     Fullname --- '+to_date_string(date))
<         fullname = to_date_string(date)+' - '+roll['RollName']
---
>         fullname = to_date_string(date)+' - '+roll['AlbumName']
166c165
<     #            copy_file(input, original_source_path, out+'/'+target_path, mtime, True)
---
>                 copy_file(input, original_source_path, out+'/'+target_path, mtime, True)
177,178c176,177
<     #            copy_file(input, original_source_path, out+'/'+original_target_path, mtime, True)
<     #            copy_file(input, source_path, out+'/'+target_path, mtime)
---
>                 copy_file(input, original_source_path, out+'/'+original_target_path, mtime, True)
>                 copy_file(input, source_path, out+'/'+target_path, mtime)
183c182
<     #        copy_file(input, source_path, out+'/'+target_path, mtime)
---
>             copy_file(input, source_path, out+'/'+target_path, mtime)
When I ran the process-digikam-db.py script I got errors immediately. The stock script looks for digikam3.db and its column names, while my database is digikam4.db, so I started hacking away. Here is the diff:
$ diff process-digikam-db.py photokam-0.5/process-digikam-db.py 
26c26
<         input = args[0]+'/digikam4.db'
---
>         input = args[0]+'/digikam3.db'
59,60d58
<             if debug:
<                 print( ' The pieces  '+pieces[0])
66,75c64,73
<     #-#print('Setting album dates')
<     #-#p=re.compile('[1-2][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]')
<     #-#c=con.cursor()
<     #-#c.execute("select id, relativePath from Albums")
<     #-#for id, relativePath in c.fetchall():
<     #-#    name=relativePath.split('/')[-1]                
<     #-#    if len(name) >= 10 and p.match(name):
<     #-#        date=time.strptime(name[:10]+' 12', "%Y-%m-%d %H") #12h offset to avoid tz shifts
<     #-#        params=(name[:10], id)
<     #-#        c.execute('update albums set date=date(?) where id=?', params)
---
>     print('Setting album dates')
>     p=re.compile('[1-2][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]')
>     c=con.cursor()
>     c.execute("select id, url from Albums")
>     for id, url in c.fetchall():
>         name=url.split('/')[-1]                
>         if len(name) >= 10 and p.match(name):
>             date=time.strptime(name[:10]+' 12', "%Y-%m-%d %H") #12h offset to avoid tz shifts
>             params=(name[:10], id)
>             c.execute('update albums set date=date(?) where id=?', params)
116d113
<     #print( '    Album path '+album_path+'      Image name '+image_name )
121c118
<             "where Images.album=Albums.id and relativePath=? and name=?", params)
---
>             "where Images.dirid=Albums.id and url=? and name=?", params)
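The query changes in this diff come down to table and column names that differ between database versions, so it can help to look at the schema of your own digikam4.db before editing the queries. Below is a small sketch using Python's sqlite3 module. The table names Albums and Images come from the diff; the helper names table_columns and print_schema are hypothetical, and the database path is whatever your digikam setup uses.

```python
import sqlite3

def table_columns(con, table):
    """Return the column names of `table` using SQLite's PRAGMA table_info."""
    # PRAGMA arguments cannot be bound as parameters, so quote the identifier by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    # Each result row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in con.execute("PRAGMA table_info(%s)" % quoted)]

def print_schema(db_path, tables=("Albums", "Images")):
    """Print the columns of the tables that process-digikam-db.py queries."""
    con = sqlite3.connect(db_path)
    try:
        for table in tables:
            print(table, table_columns(con, table))
    finally:
        con.close()
```

Note that sqlite3.connect creates an empty database file if the path does not exist, so double-check the path you pass to print_schema. An unknown table simply yields an empty column list.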
I thought that would get me home, but I kept getting errors about photos not being found in the database. It turned out that tag-mappings.txt was not quite right: its paths use photokam's "date - roll name" directory names rather than the directory names in my tree. A sample:

2011/2011-08-28 - 09-11-PhoneDump/IMAG0242.jpg=Albums/Favorites/Summer_2011
2011/2011-08-28 - 09-11-PhoneDump/IMAG0245.jpg=Albums/Favorites/Summer_2011
2011/2011-08-30 - Aug 27, 2011/P1090019.JPG=Albums/Favorites/Summer_2011
2011/2011-08-30 - Aug 27, 2011/P1090023.JPG=Albums/Favorites/Summer_2011

so I hacked up this:
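$ cat update_tag_map.sh 
#!/bin/bash
# Rewrite each "path=tag" line so the path matches the real tree under
# /home/photos. Uses locate, so run updatedb after copying the photos.

while IFS= read -r line
do
  # Skip blank lines; process-digikam-db.py fails on them
  [ -z "$line" ] && continue

  myyear=$(echo "$line" | cut -c 1-4)
  mypath=${line%%=*}     # e.g. 2011/2011-08-28 - 09-11-PhoneDump/IMAG0242.jpg
  mytag=${line#*=}       # e.g. Albums/Favorites/Summer_2011
  myfile=${mypath##*/}   # e.g. IMAG0242.jpg

  # First hit under /home/photos/<year>/, made relative to /home/photos/
  mynewfile=$(locate "$myfile" | grep "^/home/photos/$myyear/" | head -n 1 | cut -c 14-)
  echo " this is my new file --- $mynewfile"

  if [ -z "$mynewfile" ]; then
    echo " not found, skipping: $line" >&2
    continue
  fi
  echo "$mynewfile=$mytag" >> /home/photos/tag-mappings.txt
done < /home/photo_temp/tag-mappings.txt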
And the result is:
2011/09-11-PhoneDump/IMAG0242.jpg=Albums/Favorites/Summer_2011
2011/09-11-PhoneDump/IMAG0245.jpg=Albums/Favorites/Summer_2011
2011/Aug 27, 2011/P1090019.JPG=Albums/Favorites/Summer_2011
2011/Aug 27, 2011/P1090023.JPG=Albums/Favorites/Summer_2011
NOTE: make sure the tag-mappings.txt file does not have any blank lines or the process-digikam-db.py script will fail.
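
Both failure modes, blank lines and paths that do not exist on disk, can be checked before running process-digikam-db.py. This is a minimal Python sketch, assuming each line has the path=tag form shown above, that paths are relative to /home/photos, and that the file is UTF-8 text; the function name check_tag_mappings is hypothetical.

```python
import os

def check_tag_mappings(mapping_file, photo_root):
    """Return (blank_line_numbers, missing_paths) for a tag-mappings.txt file."""
    blank, missing = [], []
    with open(mapping_file, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                blank.append(lineno)
                continue
            # Everything before the first '=' is the photo path
            path = line.split("=", 1)[0]
            if not os.path.isfile(os.path.join(photo_root, path)):
                missing.append(path)
    return blank, missing
```

Calling check_tag_mappings("/home/photos/tag-mappings.txt", "/home/photos") should return two empty lists when the file is ready to use.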

Everything seems OK for now but I'll post back if I see any other gotchas.

