Reconciliation Tool Explanation


This post will briefly explain how the reconciliation tool, which can be found at https://scottclark.shinyapps.io/Reconciliation_Tool/, works.

The purpose of the app is to provide a simple and efficient way to take one or more files containing transaction data and match any equal but opposite transactions. For example, matching a positive 100 to a negative 100 means that those transactions have been successfully reconciled.
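As a minimal sketch of that core idea (toy data, not the app's actual logic): a transaction can potentially be reconciled whenever its equal-but-opposite amount appears somewhere in the data.

```r
library(data.table)

# toy data for illustration only
txns <- data.table(Amount = c(100, -100, 250))

# a transaction can be reconciled if its equal-but-opposite amount exists
txns[, Reconciled := (-Amount) %in% Amount]
txns
```

Note that this naive check ignores how many times each amount occurs (two +100s against one -100 would both appear reconcilable), which is exactly why the app uses the "Row Occurrence" helper column described below.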

We will use the single-file reconciliation process as the example, although the process for multiple-file reconciliation is almost identical.

Single File Reconciliation

Let's start by generating some example data.

library(data.table)

single_file <- data.table("Row" = c("A", "B", "C", "D", "E"),
                          "Date" = c("01/01/2021","01/01/2021","01/01/2021","02/01/2021","01/01/2021"),
                          "Amount" = c(100,-100,100,-100,200))

single_file
##    Row       Date Amount
## 1:   A 01/01/2021    100
## 2:   B 01/01/2021   -100
## 3:   C 01/01/2021    100
## 4:   D 02/01/2021   -100
## 5:   E 01/01/2021    200

The first step is to find transactions which have dates that match and an equal but opposite amount. This can be achieved with the use of a few ‘helper’ columns which we will keep during the processing but remove in our final output.

# add our 'helper' column which counts how many times a particular combination of date and amount occurs

single_file[,`Row Occurrence`:=1:.N, by=.(Date, Amount)]
setkeyv(single_file, c("Date", "Amount", "Row Occurrence"))


# next we duplicate this file and add another column which contains the inverse amount

duplicate_file <- copy(single_file)
duplicate_file[,`Inverse Amount`:=-Amount]
setkeyv(duplicate_file, c("Date", "Inverse Amount", "Row Occurrence"))

# this file is sorted by the date, amount and row occurrence
single_file
##    Row       Date Amount Row Occurrence
## 1:   B 01/01/2021   -100              1
## 2:   A 01/01/2021    100              1
## 3:   C 01/01/2021    100              2
## 4:   E 01/01/2021    200              1
## 5:   D 02/01/2021   -100              1
# this file is sorted by the date, inverse amount and row occurrence
duplicate_file
##    Row       Date Amount Row Occurrence Inverse Amount
## 1:   E 01/01/2021    200              1           -200
## 2:   A 01/01/2021    100              1           -100
## 3:   C 01/01/2021    100              2           -100
## 4:   B 01/01/2021   -100              1            100
## 5:   D 02/01/2021   -100              1            100

We now have the helper columns in place and can find out which rows have matching dates and equal but opposite amounts.

single_file_matching_rows <- single_file[duplicate_file, on = c("Amount"="Inverse Amount", "Date", "Row Occurrence"), which=TRUE]
duplicate_file_matching_rows <- duplicate_file[single_file, on = c("Inverse Amount"="Amount", "Date", "Row Occurrence"), which=TRUE]

# we can now add another helper column which will tell us which rows match
single_file[,"Row_duplicate_file" := duplicate_file_matching_rows]
duplicate_file[,"Row_single_file" := single_file_matching_rows]

single_file
##    Row       Date Amount Row Occurrence Row_duplicate_file
## 1:   B 01/01/2021   -100              1                  2
## 2:   A 01/01/2021    100              1                  4
## 3:   C 01/01/2021    100              2                 NA
## 4:   E 01/01/2021    200              1                 NA
## 5:   D 02/01/2021   -100              1                 NA
duplicate_file
##    Row       Date Amount Row Occurrence Inverse Amount Row_single_file
## 1:   E 01/01/2021    200              1           -200              NA
## 2:   A 01/01/2021    100              1           -100               1
## 3:   C 01/01/2021    100              2           -100              NA
## 4:   B 01/01/2021   -100              1            100               2
## 5:   D 02/01/2021   -100              1            100              NA
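The matching above relies on data.table's `which = TRUE` join argument, which returns row positions rather than the joined rows themselves. A minimal illustration with toy tables (not the app's data):

```r
library(data.table)

X <- data.table(id = c("a", "b", "c"))
Y <- data.table(id = c("c", "a", "z"))

# for each row of Y, the position of its match in X (NA where there is none)
X[Y, on = "id", which = TRUE]
## [1]  3  1 NA
```

Those NA positions are what flag a transaction as not (yet) reconciled in the helper columns above.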

The next step is to find the rows which only have matching amounts (equal but opposite), regardless of date.

  # we find the rows which haven't been matched by looking for the NAs in the helper column
  non_matching_rows_single_file <- single_file[,.I[is.na(Row_duplicate_file)]]
  non_matching_rows_duplicate_file <- duplicate_file[,.I[is.na(Row_single_file)]]
  
  # we then overwrite the row occurrence helper column, for the non-matching rows, with the row occurrence based on the amount alone (previously it was date and amount)
  single_file[non_matching_rows_single_file,`Row Occurrence`:=1:.N, by="Amount"]
  duplicate_file[non_matching_rows_duplicate_file,`Row Occurrence`:=1:.N, by="Amount"]
  
  # next we join the non-matching subsets on amount alone; an NA position indicates a row which still can't be matched
  unmatchable_rows_single_file <- single_file[non_matching_rows_single_file][duplicate_file[non_matching_rows_duplicate_file], on = c("Amount"="Inverse Amount", "Row Occurrence"), which=TRUE]
  unmatchable_rows_duplicate_file <- duplicate_file[non_matching_rows_duplicate_file][single_file[non_matching_rows_single_file], on = c("Inverse Amount"="Amount", "Row Occurrence"), which=TRUE]
  
  # we can now find the last rows which can be matched together based on the amount
  last_matching_single_file <- single_file[single_file[non_matching_rows_single_file][na.omit(unmatchable_rows_single_file)], on = c("Amount", "Date", "Row Occurrence", "Row_duplicate_file"), which=TRUE]
  last_matching_duplicate_file <- duplicate_file[duplicate_file[non_matching_rows_duplicate_file][na.omit(unmatchable_rows_duplicate_file)], on = c("Amount", "Date", "Row Occurrence", "Row_single_file"), which=TRUE]
  
  # we overwrite the helper column with the correct location of the last matching rows in the duplicate file and vice versa
  single_file[sort(last_matching_single_file),"Row_duplicate_file" := last_matching_duplicate_file]
  duplicate_file[sort(last_matching_duplicate_file),"Row_single_file" := last_matching_single_file]

  # sort the duplicate_file by the "Row_single_file" column, otherwise it won't line up
  setorderv(duplicate_file, "Row_single_file")
  
  single_file
##    Row       Date Amount Row Occurrence Row_duplicate_file
## 1:   B 01/01/2021   -100              1                  2
## 2:   A 01/01/2021    100              1                  4
## 3:   C 01/01/2021    100              1                  5
## 4:   E 01/01/2021    200              1                 NA
## 5:   D 02/01/2021   -100              1                  3
  duplicate_file
##    Row       Date Amount Row Occurrence Inverse Amount Row_single_file
## 1:   E 01/01/2021    200              1           -200              NA
## 2:   A 01/01/2021    100              1           -100               1
## 3:   B 01/01/2021   -100              1            100               2
## 4:   D 02/01/2021   -100              1            100               3
## 5:   C 01/01/2021    100              1           -100               5
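The `.I` expression used above to locate the unmatched rows simply returns row numbers. A toy example (illustrative only):

```r
library(data.table)

dt <- data.table(x = c(10, NA, 30, NA))

# .I holds 1:nrow(dt); subsetting it by a condition gives the matching row numbers
dt[, .I[is.na(x)]]
## [1] 2 4
```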

The last piece of the puzzle is to display the data in a way that makes it easy to see which lines have been reconciled and which have not. The best way I have found to do this is simply to combine the columns of the original data and the duplicated data, with the addition of a reconciliation status column.

  # create a blank datatable to combine to
  column_names <- colnames(single_file[,-c("Row Occurrence", "Row_duplicate_file")])
  blank_dt <- copy(single_file[FALSE,..column_names])
  
  # combine it all together to form the reconciliation report
  reconciliation_output <- rbindlist(list(cbind("Reconciliation Status"="Reconciled", 
                                                single_file[!is.na(single_file[["Row_duplicate_file"]]),..column_names], 
                                                duplicate_file[!is.na(duplicate_file[["Row_single_file"]]),..column_names], 
                                                stringsAsFactors=FALSE),
                                          cbind("Reconciliation Status"="Unreconciled", 
                                                single_file[is.na(single_file[["Row_duplicate_file"]]),..column_names], 
                                                blank_dt, 
                                                stringsAsFactors=FALSE),
                                          cbind("Reconciliation Status"="Unreconciled", 
                                                blank_dt, 
                                                duplicate_file[is.na(duplicate_file[["Row_single_file"]]),..column_names], 
                                                stringsAsFactors=FALSE)))
  
  
  setorderv(reconciliation_output, cols = c("Reconciliation Status", "Row"), na.last = TRUE)
  
  reconciliation_output
##    Reconciliation Status  Row       Date Amount  Row       Date Amount
## 1:            Reconciled    A 01/01/2021    100    B 01/01/2021   -100
## 2:            Reconciled    B 01/01/2021   -100    A 01/01/2021    100
## 3:            Reconciled    C 01/01/2021    100    D 02/01/2021   -100
## 4:            Reconciled    D 02/01/2021   -100    C 01/01/2021    100
## 5:          Unreconciled    E 01/01/2021    200 <NA>       <NA>     NA
## 6:          Unreconciled <NA>       <NA>     NA    E 01/01/2021    200

Summary

Reconciliation is a process which many people in accounting roles still do manually. By creating a logical matching process based on the dates and amounts of transactions, and then using amounts alone for the remaining unmatched entries, you can easily reduce the number of reconciling items. This does assume that the amounts in the data match on a one-to-one basis; there are occasions where a single transaction in one data set is made up of multiple transactions in another, and that has not been tackled in this program. However, it is definitely on the to-do list!

The code that generates the reconciliation output shown here is only one part of the puzzle, as there are many other things to consider when packaging this up into a usable application. One of the main challenges I face is anticipating all of the things that could go wrong and crash the application, so I have tried to include enough validation that it shouldn't topple over immediately if something isn't quite right. Nevertheless, there are bound to be some unforeseen situations, so I will keep my eyes peeled and update when I can.

Scott Clark
ACCA Qualified Finance Professional

My interests include R programming, finance and photography.