The Python Book

delta_time pandas
20151207

## Problem

A number of photo files were tagged as follows, with the date and the time:

``````20151205_17h48-img_0098.jpg
20151205_18h20-img_0099.jpg
20151205_18h21-img_0100.jpg``````

..

It turns out that they should all be an hour earlier (reminder: this comes from mixing pics from two cameras), so let's create a script to rename these files...

## Solution

### 1. Start

Let's use pandas:

``````import datetime as dt
import pandas as pd
import re

# df0 is assumed to hold the filenames in a column 'fn'
df=df0[df0['fn'].apply( lambda a: 'img_0' in a )].copy()  # keep only the 'img_0...' pics``````
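The snippet above never shows where `df0` comes from. A minimal sketch of how it could be built, assuming it is simply a DataFrame with one column `fn` holding the filenames (the literal list is a hypothetical stand-in for a directory listing such as `sorted(os.listdir('.'))`):

```python
import pandas as pd

# Hypothetical stand-in for a directory listing, e.g. sorted(os.listdir('.'))
filenames = [
    '20151205_17h48-img_0098.jpg',
    '20151205_18h20-img_0099.jpg',
    'notes.txt',
]

# One column, 'fn', as the rest of the walkthrough expects
df0 = pd.DataFrame({'fn': filenames})

# Keep only the camera pictures; .copy() gives an independent frame,
# so adding columns later does not trigger SettingWithCopyWarning
df = df0[df0['fn'].str.contains('img_0', regex=False)].copy()
print(df)
```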

### 2. Make parseable

Now add a column to the dataframe that only contains the numbers of the date, so it can be parsed:

``````df['rawdt']=df['fn'].apply( lambda a: re.sub(r'-.*\.jpg$','',a))\
.apply( lambda a: re.sub('[_h]','',a))``````

Result:

``````df.head()
                             fn         rawdt
0   20151202_07h17-img_0001.jpg  201512020717
1   20151202_07h17-img_0002.jpg  201512020717
2   20151202_07h17-img_0003.jpg  201512020717
3   20151202_15h29-img_0004.jpg  201512021529
28  20151202_17h59-img_0005.jpg  201512021759``````
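To see what the two substitutions in this step do, here is a sketch on a single filename, using only `re` (the variable names are illustrative):

```python
import re

fn = '20151205_17h48-img_0098.jpg'

# Step 1: drop everything from the first '-' up to the '.jpg' extension
date_part = re.sub(r'-.*\.jpg$', '', fn)

# Step 2: remove the '_' and 'h' separators so only digits remain
rawdt = re.sub(r'[_h]', '', date_part)

print(date_part)  # 20151205_17h48
print(rawdt)      # 201512051748
```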

### 3. Convert to datetime, and subtract delta time

Convert the raw-date to a real date, and subtract an hour:

``df['adjdt']=pd.to_datetime( df['rawdt'], format='%Y%m%d%H%M')-dt.timedelta(hours=1)``

Note 20190105: apparently pandas can infer the format here, so the 'format' argument can be dropped (passing it explicitly remains the safer choice):

``df['adjdt']=pd.to_datetime( df['rawdt'])-dt.timedelta(hours=1)``

Result:

``````                             fn         rawdt               adjdt
0   20151202_07h17-img_0001.jpg  201512020717 2015-12-02 06:17:00
1   20151202_07h17-img_0002.jpg  201512020717 2015-12-02 06:17:00
2   20151202_07h17-img_0003.jpg  201512020717 2015-12-02 06:17:00
3   20151202_15h29-img_0004.jpg  201512021529 2015-12-02 14:29:00
28  20151202_17h59-img_0005.jpg  201512021759 2015-12-02 16:59:00``````
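Doing the subtraction on real datetimes rather than on the digits matters when the shift crosses midnight. A small sketch with made-up values (not from the photo set), using `pd.Timedelta`, which behaves like `dt.timedelta` in this subtraction:

```python
import pandas as pd

# Made-up raw timestamps; the first one is shortly after midnight
raw = pd.Series(['201512030030', '201512021529'])

adj = pd.to_datetime(raw, format='%Y%m%d%H%M') - pd.Timedelta(hours=1)

# The first value rolls back to the previous day
result = adj.dt.strftime('%Y%m%d_%Hh%M').tolist()
print(result)  # ['20151202_23h30', '20151202_14h29']
```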

### 4. Convert adjusted date to string

``df['adj']=df['adjdt'].apply(lambda a: dt.datetime.strftime(a, "%Y%m%d_%Hh%M") )``

We also need the 'stem' of the filename:

``df['stem']=df['fn'].apply(lambda a: re.sub('^.*-','',a) )``

Result:

``````df.head()
                             fn         rawdt               adjdt  \
0   20151202_07h17-img_0001.jpg  201512020717 2015-12-02 06:17:00
1   20151202_07h17-img_0002.jpg  201512020717 2015-12-02 06:17:00
2   20151202_07h17-img_0003.jpg  201512020717 2015-12-02 06:17:00
3   20151202_15h29-img_0004.jpg  201512021529 2015-12-02 14:29:00
28  20151202_17h59-img_0005.jpg  201512021759 2015-12-02 16:59:00

               adj          stem
0   20151202_06h17  img_0001.jpg
1   20151202_06h17  img_0002.jpg
2   20151202_06h17  img_0003.jpg
3   20151202_14h29  img_0004.jpg
28  20151202_16h59  img_0005.jpg``````
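The two `apply` calls in this step can also be written with pandas' vectorised accessors. A sketch on a two-row sample frame, assuming filenames of the form shown here (one date prefix, then `-`, then the stem):

```python
import pandas as pd

# Small sample frame with the columns this step works on
df = pd.DataFrame({
    'fn': ['20151202_07h17-img_0001.jpg', '20151202_15h29-img_0004.jpg'],
    'adjdt': pd.to_datetime(['2015-12-02 06:17', '2015-12-02 14:29']),
})

# .dt.strftime formats a whole datetime column at once
df['adj'] = df['adjdt'].dt.strftime('%Y%m%d_%Hh%M')

# Everything after the last '-' is the stem, e.g. 'img_0001.jpg'
df['stem'] = df['fn'].str.rsplit('-', n=1).str[-1]

print(df[['adj', 'stem']])
```

Both forms give the same columns; the accessor version just avoids a Python-level lambda per row.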

### 5. Cleanup

Drop columns that are no longer useful:

``df=df.drop(['rawdt','adjdt'], axis=1)``

Result:

``````df.head()
                             fn             adj          stem
0   20151202_07h17-img_0001.jpg  20151202_06h17  img_0001.jpg
1   20151202_07h17-img_0002.jpg  20151202_06h17  img_0002.jpg
2   20151202_07h17-img_0003.jpg  20151202_06h17  img_0003.jpg
3   20151202_15h29-img_0004.jpg  20151202_14h29  img_0004.jpg
28  20151202_17h59-img_0005.jpg  20151202_16h59  img_0005.jpg``````

### 6. Generate scripts

Generate the 'rename' script:

``````sh=df.apply( lambda a: 'mv {} {}-{}'.format( a['fn'],a['adj'],a['stem']), axis=1)``````

Also generate the 'rollback' script (in case we have to roll back the renaming):

``````sh=df.apply( lambda a: 'mv {}-{} {}'.format( a['adj'],a['stem'],a['fn']), axis=1)``````
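Both scripts can also come from one helper. A minimal sketch, assuming the `fn`, `adj` and `stem` columns from the previous step; the helper name `mv_lines` is hypothetical, and `shlex.quote` is an addition so that filenames containing spaces would survive the shell:

```python
import shlex
import pandas as pd

# One-row sample with the columns left after the cleanup step
df = pd.DataFrame({
    'fn':   ['20151202_07h17-img_0001.jpg'],
    'adj':  ['20151202_06h17'],
    'stem': ['img_0001.jpg'],
})

def mv_lines(frame, rollback=False):
    """Return one 'mv' command per row; swap the arguments for a rollback."""
    lines = []
    for old, adj, stem in zip(frame['fn'], frame['adj'], frame['stem']):
        new = '{}-{}'.format(adj, stem)
        src, dst = (new, old) if rollback else (old, new)
        lines.append('mv {} {}'.format(shlex.quote(src), shlex.quote(dst)))
    return lines

rename_script = '\n'.join(mv_lines(df))
rollback_script = '\n'.join(mv_lines(df, rollback=True))
print(rename_script)
print(rollback_script)
```

The two strings can then be written to files of your choosing and run with `sh`.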
Result (the rename script):

``````mv 20151202_07h17-img_0001.jpg 20151202_06h17-img_0001.jpg