Add/subtract a delta time

Problem

A number of photo files were tagged as follows, with the date and the time:

20151205_17h48-img_0098.jpg
20151205_18h20-img_0099.jpg
20151205_18h21-img_0100.jpg

Turns out that they should be all an hour earlier (reminder: mixing pics from two camera's), so let's create a script to rename these files...

Solution

1. Start

Let's use pandas:

import datetime as dt
import pandas as pd
import re

df0=pd.io.parsers.read_table( '/u01/work/20151205_gran_canaria/fl.txt',sep=",", \
        header=None, names= ["fn"])
df=df0[df0['fn'].apply( lambda a: 'img_0' in a )]  # filter out certain pics

2. Make parseable

Now add a column to the dataframe that only contains the numbers of the date, so it can be parsed:

df['rawdt']=df['fn'].apply( lambda a: re.sub('-.*.jpg','',a))\
                 .apply( lambda a: re.sub('[_h]','',a))

Result:

df.head()
                             fn         rawdt
0   20151202_07h17-img_0001.jpg  201512020717
1   20151202_07h17-img_0002.jpg  201512020717
2   20151202_07h17-img_0003.jpg  201512020717
3   20151202_15h29-img_0004.jpg  201512021529
28  20151202_17h59-img_0005.jpg  201512021759

3. Convert to datetime, and subtract delta time

Convert the raw-date to a real date, and subtract an hour:

df['adjdt']=pd.to_datetime( df['rawdt'], format('%Y%m%d%H%M'))-dt.timedelta(hours=1)

Note 20190105: apparently you can drop the 'format' string:

df['adjdt']=pd.to_datetime( df['rawdt'])-dt.timedelta(hours=1)

Result:

                             fn         rawdt               adjdt
0   20151202_07h17-img_0001.jpg  201512020717 2015-12-02 06:17:00
1   20151202_07h17-img_0002.jpg  201512020717 2015-12-02 06:17:00
2   20151202_07h17-img_0003.jpg  201512020717 2015-12-02 06:17:00
3   20151202_15h29-img_0004.jpg  201512021529 2015-12-02 14:29:00
28  20151202_17h59-img_0005.jpg  201512021759 2015-12-02 16:59:00

4. Convert adjusted date to string

df['adj']=df['adjdt'].apply(lambda a: dt.datetime.strftime(a, "%Y%m%d_%Hh%M") )

We also need the 'stem' of the filename:

df['stem']=df['fn'].apply(lambda a: re.sub('^.*-','',a) )

Result:

df.head()
                             fn         rawdt               adjdt  \
0   20151202_07h17-img_0001.jpg  201512020717 2015-12-02 06:17:00   
1   20151202_07h17-img_0002.jpg  201512020717 2015-12-02 06:17:00   
2   20151202_07h17-img_0003.jpg  201512020717 2015-12-02 06:17:00   
3   20151202_15h29-img_0004.jpg  201512021529 2015-12-02 14:29:00   
28  20151202_17h59-img_0005.jpg  201512021759 2015-12-02 16:59:00   

               adj          stem  
0   20151202_06h17  img_0001.jpg  
1   20151202_06h17  img_0002.jpg  
2   20151202_06h17  img_0003.jpg  
3   20151202_14h29  img_0004.jpg  
28  20151202_16h59  img_0005.jpg

5. Cleanup

Drop columns that are no longer useful:

df=df.drop(['rawdt','adjdt'], axis=1)

Result:

df.head()
                             fn             adj          stem
0   20151202_07h17-img_0001.jpg  20151202_06h17  img_0001.jpg
1   20151202_07h17-img_0002.jpg  20151202_06h17  img_0002.jpg
2   20151202_07h17-img_0003.jpg  20151202_06h17  img_0003.jpg
3   20151202_15h29-img_0004.jpg  20151202_14h29  img_0004.jpg
28  20151202_17h59-img_0005.jpg  20151202_16h59  img_0005.jpg

6. Generate scripts

Generate the 'rename' script:

sh=df.apply( lambda a: 'mv {} {}-{}'.format( a[0],a[1],a[2]), axis=1)
sh.to_csv('rename.sh',header=False, index=False )

Also generate the 'rollback' script (in case we have to rollback the renaming) :

sh=df.apply( lambda a: 'mv {}-{} {}'.format( a[1],a[2],a[0]), axis=1)
sh.to_csv('rollback.sh',header=False, index=False )

First lines of the rename script:

mv 20151202_07h17-img_0001.jpg 20151202_06h17-img_0001.jpg
mv 20151202_07h17-img_0002.jpg 20151202_06h17-img_0002.jpg
mv 20151202_07h17-img_0003.jpg 20151202_06h17-img_0003.jpg
mv 20151202_15h29-img_0004.jpg 20151202_14h29-img_0004.jpg
mv 20151202_17h59-img_0005.jpg 20151202_16h59-img_0005.jpg

Notes by Willem Moors. Generated on momo:/home/willem/sync/20151223_datamungingninja/pythonbook at 2019-07-31 19:22