You can use whatever accuracy metric you want in the scans to monitor your model’s performance. However, if you are thinking about how the responses came about, some metrics will be more helpful than others. For instance, let’s take a case where you do not have control groups: a model where you are predicting default rates, and as a result of your model you are giving loans only to those people who present a low enough risk profile. Out of those people, looking at who kept paying their loans for the first time period would give you a reliable true positive rate, that might decrease over time if the loan period isn’t complete, but which at least is not misleading. However, trying to look at an overall accuracy rate would not make much sense, as you have not given loans to anyone for whom you predicted a low likelihood of repayment in the first place. A lot of algorithmic bias related problems stem from these issues.