How I Built a Customer Risk Scoring Model for BD Fintechs

Last quarter, while reviewing a batch of 80,000 MFS transactions for a leading BD fintech, I noticed that about 10% of the transactions exceeded the BDT 100,000 threshold, triggering alerts for potential money laundering. As I dug deeper, I realized that most of these alerts were false positives, wasting valuable time and resources for the compliance team.

The core problem most practitioners miss

When I was starting out as an AML analyst, I missed a crucial point about customer risk scoring models: they are not one-size-fits-all. In my experience, many practitioners make the same mistake, applying generic models that fail to account for the characteristics of their own customer base. I recall a particularly frustrating instance where our team spent weeks fine-tuning a model, only to realize that it was not tailored to our specific use case.

Background / why this matters in BD fintech context

The BFIU guidelines emphasize the importance of effective customer risk scoring models in preventing money laundering and terrorist financing. In the BD fintech context, this is particularly critical, given the high volume of MFS transactions and the need to balance regulatory compliance with customer convenience. For instance, usage patterns on bKash and Nagad show that most customers rely on these services for small, frequent transactions, which can trigger false alerts if the risk scoring model does not account for them.

Technical breakdown

In building a customer risk scoring model, I relied on a combination of machine learning algorithms and domain expertise. Here is an example of how I used Python to implement a basic risk scoring model:

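import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load one row per customer
data = pd.read_csv('customer_data.csv')

# Features and target. A classifier needs risk_score to be a categorical label
# (e.g. low/medium/high); for a continuous score, use RandomForestRegressor instead.
X = data[['transaction_amount', 'transaction_frequency', 'customer_age']]
y = data['risk_score']

# Hold out 20% of customers so performance is measured on unseen data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train the model; random_state makes the run reproducible
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Accuracy on the held-out customers
print(model.score(X_test, y_test))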

This code snippet demonstrates how to load customer data, define features and target variables, and train a random forest classifier to predict risk scores. I was surprised by how well this basic model performed, but I knew that I needed to incorporate more features and fine-tune the model to achieve optimal results.

Bangladesh-specific application

In applying this model to the BD fintech context, I considered factors such as the BDT 100,000 MFS threshold, the STR/SAR process, and bKash and Nagad usage patterns. For example, I incorporated a feature that flags transactions exceeding the threshold, as well as features that capture the frequency and amount of each customer's transactions. I also checked that the model was calibrated to detect suspicious activity while minimizing false positives.
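The features described above could be derived along the following lines. This is a minimal sketch, not a production pipeline: the column names customer_id and amount_bdt, the function build_customer_features, and the sample rows are all hypothetical, and the threshold constant simply reuses the BDT 100,000 figure mentioned in this post.

```python
import pandas as pd

# Threshold cited in this post; confirm the figure against your own policy.
MFS_THRESHOLD_BDT = 100_000


def build_customer_features(tx: pd.DataFrame) -> pd.DataFrame:
    """Aggregate raw transactions into one feature row per customer.

    Expects hypothetical columns: customer_id, amount_bdt.
    """
    tx = tx.copy()
    # Flag each transaction that exceeds the threshold
    tx["over_threshold"] = tx["amount_bdt"] > MFS_THRESHOLD_BDT

    features = tx.groupby("customer_id").agg(
        txn_count=("amount_bdt", "count"),           # frequency
        total_amount=("amount_bdt", "sum"),          # volume
        avg_amount=("amount_bdt", "mean"),           # typical ticket size
        over_threshold_count=("over_threshold", "sum"),
    )
    # Share of a customer's transactions that exceed the threshold
    features["over_threshold_share"] = (
        features["over_threshold_count"] / features["txn_count"]
    )
    return features.reset_index()


# Made-up sample data, for illustration only
sample_tx = pd.DataFrame({
    "customer_id": ["A", "A", "A", "B", "B"],
    "amount_bdt": [500, 1_200, 150_000, 99_000, 98_500],
})
customer_features = build_customer_features(sample_tx)
print(customer_features)
```

Customer B in the sample never crosses the threshold yet moves more money than customer A, which is why the frequency and volume features sit alongside the threshold flag rather than being replaced by it.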

Common mistakes analysts make

In my experience, analysts often make the following mistakes when building customer risk scoring models:

Over-reliance on generic models that fail to account for unique customer characteristics
Insufficient feature engineering, leading to poor model performance
Inadequate testing and validation, resulting in models that are not calibrated to detect suspicious activity
Failure to incorporate domain expertise and regulatory requirements
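
To make the testing and validation point concrete, here is a small sketch of how one might check precision and recall at different alert cut-offs on held-out data. The function precision_recall_at, the scores, and the labels are all hypothetical; in practice the labels would come from analyst-confirmed outcomes, and scikit-learn's precision_score and recall_score do the same job on real data.

```python
def precision_recall_at(scores, labels, threshold):
    """Return (precision, recall) when every score >= threshold raises an alert."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)  # true alerts
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)  # false positives
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)   # missed cases
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up held-out model scores and confirmed labels (1 = suspicious)
holdout_scores = [0.95, 0.80, 0.65, 0.60, 0.40, 0.30, 0.20, 0.10]
holdout_labels = [1, 1, 0, 1, 0, 0, 0, 0]

for threshold in (0.3, 0.5, 0.7):
    p, r = precision_recall_at(holdout_scores, holdout_labels, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

Raising the cut-off in this toy data removes false positives but starts missing a genuinely suspicious case, which is the trade-off a compliance team has to decide on explicitly rather than leave to a default.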

Counterintuitive insight

One counterintuitive insight I discovered is that high-risk customers are not always the ones who trigger the most alerts. In fact, some high-risk customers may be able to fly under the radar by spreading their transactions across multiple accounts or using complex money laundering schemes. This highlights the importance of incorporating behavioral analysis and network analysis into the risk scoring model.
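
One simple way to surface this kind of splitting is to aggregate across accounts that are believed to belong to the same person or entity. The sketch below assumes such a link already exists (for example via a shared ID document or device), which is a strong assumption; the columns linked_entity, account_id, date and amount_bdt, the function flag_split_activity, and the sample rows are hypothetical.

```python
import pandas as pd

# Threshold cited in this post; confirm the figure against your own policy.
MFS_THRESHOLD_BDT = 100_000


def flag_split_activity(tx: pd.DataFrame) -> pd.DataFrame:
    """Flag (linked_entity, date) groups whose combined total exceeds the
    threshold although no single transaction does, across multiple accounts.

    Expects hypothetical columns: linked_entity, account_id, date, amount_bdt.
    """
    grouped = tx.groupby(["linked_entity", "date"]).agg(
        total_amount=("amount_bdt", "sum"),
        max_single=("amount_bdt", "max"),
        accounts_used=("account_id", "nunique"),
    ).reset_index()

    grouped["split_flag"] = (
        (grouped["total_amount"] > MFS_THRESHOLD_BDT)    # combined total is large
        & (grouped["max_single"] <= MFS_THRESHOLD_BDT)   # no single alert fired
        & (grouped["accounts_used"] > 1)                 # spread over accounts
    )
    return grouped


# Made-up sample data, for illustration only
linked_tx = pd.DataFrame({
    "linked_entity": ["E1", "E1", "E1", "E2", "E2"],
    "account_id": ["a1", "a2", "a3", "b1", "b1"],
    "date": ["2024-01-05"] * 5,
    "amount_bdt": [45_000, 40_000, 30_000, 20_000, 15_000],
})
split_flags = flag_split_activity(linked_tx)
print(split_flags[["linked_entity", "total_amount", "accounts_used", "split_flag"]])
```

In the sample, entity E1 never triggers a per-transaction alert but moves BDT 115,000 in a day through three accounts, so it is flagged, while E2 is not. A real network analysis would go further (shared counterparties, graph features), but even this aggregation catches cases that per-account rules miss.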

Practical conclusion + next step

In conclusion, building a customer risk scoring model for BD fintechs requires a deep understanding of both the customer base and the regulatory requirements, combined with domain expertise. By avoiding the common mistakes above and looking beyond raw alert counts, analysts can develop models that detect suspicious activity while minimizing false positives. Your next step today: review your current customer risk scoring model and identify where additional features or further tuning would reduce false positives without missing genuinely suspicious activity.
