Best sampling and search strategies for zapCash

We show you the best sampling and search strategies to get the best out of zapCash

Sampling and search strategies for zapCash

The AI learns by all your judgments (red, yellow, green). There are many strategies to look at the zapCash data analysis results. Here are the best ones:

We recommend judging at least 100 pairs right after the first AI run. In the beginning, it is advisable to evaluate a very heterogeneous sample. To do so, you should stick to the following strategy.

Fig. 1 - Project List of candidates zapCash Version 1.1.0

Sort by Score

It would help if you always went through top pairs initially and after each AI run. Switch to the following strategy when you can not find more duplicate payments.

Scores from 50 to 100 have a very high probability (%) to be a duplicate payment!

Fig. 2 - Show and Hide Columns zapCash

by Cluster (show/hide columns)

All document pairs are clustered in up to n different buckets. The number of clusters depends on the size and complexity of the data set. Usually, there are at least 5 to 20 different clusters. Filter each cluster and investigate at least a couple of pairs before the first AI-run. This ensures that the AI considers a wide range of data.

Note that a cluster may be empty, and filtering can give you an empty list!

by Outlier rank (show/hide columns)

zapCash calculates for each document pair how it differs from other pairs. Consider sorting by the Outlier rank to find the most curious cases. This technique is particularly good so that the AI can assess better exceptions.

Outlier rank 1 is the most unusual pair.


Advanced strategies

After using the starter strategies, you may dig more in-depth with the following techniques.

Using the filter to get a more in-depth overview

You may already have some hypotheses about your organization and know-how duplicate payments have occurred in the past. For example, invoices may have been sent to the parent company and a subsidiary. You can find such cases by filtering the pairs for unequal company codes.

To do so, go to the filter tab. Select one or more items to filter on:

  • Company code
  • Fiscal year
  • Document type
  • Assignment or reference number
  • Reference transaction
  • Inter-company
  • Currency
  • Vendor ID or name
  • and more...

As you always look at pairs of two documents, you can set the filter to one of four filter options:

  • Equals: Show only pairs, where the fiscal year of both documents is the same.
  • Unequal: Show only pairs, where the fiscal year of both documents is not the same.
  • Contains: Show only pairs, where at least one of the two documents contains the free input text. (Case insensitive)
  • Not contains: Show only pairs, where at least one of the two documents does not contain the free input text. (Case insensitive)
  • One_Empty: Show only pairs, where one position is empty.
  • Both_Empty: Show pairs, where both positions are empty.
  • None_Empty: Show pairs, where both positions are NOT empty.
  • IsSimiliar: Shows pairs that are NOT AT ALL similar to VERY similar in text position (click here for more information to this feature)

You can also connect filters with logical OR and AND.


As we stated initially, it is advisable to evaluate a very heterogeneous sample. Therefore you will find some pairs with score = Q.

We recommend examining those to maximize your findings in the next run fully.