Best practices to get the most out of your duplicate payment AI

Sampling and search strategies for zap Cash

The AI learns from all your judgments (red, yellow, green). There are many strategies for reviewing the results of the zap Cash data analysis. Here are the best ones:

Starter strategies

We recommend judging at least 100 pairs right after the first AI-run. In the beginning, it is advisable to evaluate a very heterogeneous sample. To do so, stick to the following strategies.

by Rank

Always go through the top-ranked pairs initially and after each AI-run. Switch to the next strategy once you no longer find any duplicate payments.

Rank 1 is the pair most likely to be a duplicate payment.

by Cluster

All document pairs are clustered into up to n different buckets. The number of clusters depends on the size and complexity of the data set. Usually, there are between 5 and 20 clusters. Filter on each cluster and investigate at least a couple of pairs before the first AI-run. This ensures that the AI considers a wide range of data.

Note that a cluster may be empty, in which case filtering on it returns an empty list.

by Outlier

zap Cash calculates, for each document pair, how much it differs from the other pairs. Consider sorting by the Outlier rank to find the most unusual cases. This technique is particularly useful for helping the AI assess exceptions better.

Outlier rank 1 is the most unusual pair.


Advanced strategies

After using the starter strategies, you can dig deeper with the following techniques.

Using filters

You may already have some hypotheses about your organization and know how duplicate payments have occurred in the past. For example, the same invoice may have been sent to both the parent company and a subsidiary. You can find such cases by filtering the pairs for unequal company codes.


To do so, go to the filter tab. Select one or more items to filter on:

  • company code
  • fiscal year
  • document type
  • assignment or reference number
  • reference transaction
  • cost center
  • and more

Because you always look at pairs of documents, each filter can be set to one of four options (shown here with fiscal year as the example field):

  • Equals: Show only pairs where the fiscal year of both documents is the same.
  • Unequal: Show only pairs where the fiscal year of the two documents is not the same.
  • Contains: Show only pairs where at least one of the two documents contains the free input text (case insensitive).
  • Not contains: Show only pairs where at least one of the two documents does not contain the free input text (case insensitive).

You can also connect filters with logical OR and AND.
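
The pair semantics of the four filter options, and of combining them with AND and OR, can be illustrated with a minimal Python sketch. This is not how zap Cash is implemented; the field names (company_code, fiscal_year, reference), the sample data, and all helper functions are hypothetical and only mirror the descriptions above.

```python
from typing import Callable, Dict, Tuple

# Hypothetical data model: a document is a dict of field name -> text value,
# and a pair is a tuple of two documents.
Document = Dict[str, str]
Pair = Tuple[Document, Document]
PairFilter = Callable[[Pair], bool]


def equals(field: str) -> PairFilter:
    """Keep pairs where both documents have the same value for `field`."""
    return lambda pair: pair[0].get(field) == pair[1].get(field)


def unequal(field: str) -> PairFilter:
    """Keep pairs where the two documents differ in `field`."""
    return lambda pair: pair[0].get(field) != pair[1].get(field)


def contains(field: str, text: str) -> PairFilter:
    """Keep pairs where at least one document contains `text` (case insensitive)."""
    needle = text.lower()
    return lambda pair: any(needle in doc.get(field, "").lower() for doc in pair)


def not_contains(field: str, text: str) -> PairFilter:
    """Keep pairs where at least one document lacks `text` (case insensitive)."""
    needle = text.lower()
    return lambda pair: any(needle not in doc.get(field, "").lower() for doc in pair)


def all_of(*filters: PairFilter) -> PairFilter:
    """Logical AND: every filter must accept the pair."""
    return lambda pair: all(f(pair) for f in filters)


def any_of(*filters: PairFilter) -> PairFilter:
    """Logical OR: at least one filter must accept the pair."""
    return lambda pair: any(f(pair) for f in filters)


# Made-up sample pairs for illustration only.
pairs = [
    ({"company_code": "1000", "fiscal_year": "2023", "reference": "INV-77"},
     {"company_code": "2000", "fiscal_year": "2023", "reference": "inv-77"}),
    ({"company_code": "1000", "fiscal_year": "2022", "reference": "INV-12"},
     {"company_code": "1000", "fiscal_year": "2023", "reference": "INV-12"}),
]

# Hypothesis from the text: the same invoice was sent to parent and subsidiary.
cross_company = all_of(unequal("company_code"), equals("fiscal_year"))
hits = [pair for pair in pairs if cross_company(pair)]
```

In this sketch, hits contains only the first sample pair, because it is the only one whose company codes differ while the fiscal year matches.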

Nearest neighbors


If you have already found a few double-paid invoices and marked them with a red exclamation mark, click the 'Show most similar pairs' button. The list is then filtered to the 20 most similar candidates. Check most of them, especially those where other criteria also hold.

This technique may not improve the AI sample more than other strategies do, but it is the fastest way to find more hits.
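
To give an intuition for what a "most similar pairs" list does, here is a minimal Python sketch of a generic nearest-neighbor lookup. How zap Cash actually measures similarity is not described here, so everything in the sketch is an assumption: the numeric feature vectors, the Euclidean distance, and the function name most_similar are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def most_similar(
    candidates: Dict[str, Sequence[float]],
    confirmed: List[Sequence[float]],
    k: int = 20,
) -> List[str]:
    """Return the ids of the k candidates closest to any confirmed duplicate.

    Assumption: each pair is described by a numeric feature vector and
    similarity is plain Euclidean distance (smaller means more similar).
    """
    def distance_to_nearest_confirmed(features: Sequence[float]) -> float:
        return min(math.dist(features, c) for c in confirmed)

    ranked = sorted(
        candidates,
        key=lambda pair_id: distance_to_nearest_confirmed(candidates[pair_id]),
    )
    return ranked[:k]


# Made-up example: one pair already marked red, three unjudged candidates.
confirmed_duplicates = [(0.0, 0.0)]
unjudged = {"A": (0.1, 0.0), "B": (5.0, 5.0), "C": (0.0, 0.3)}
closest = most_similar(unjudged, confirmed_duplicates, k=2)
```

With this made-up data, closest is ["A", "C"]: the two candidates nearest to the confirmed duplicate, which is why such a list tends to surface further hits quickly.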