8 Years of Fighting Money Laundering in Bangladesh: How I Built a Customer Risk Scoring Model

I still remember the night we discovered a massive structuring ring at one of the local fintechs. It was a BDT 50 million case, with thousands of transactions flying under the radar. Our team worked for weeks to unravel the scheme, and it was a wake-up call: the risk scoring model we had at the time just wasn't cutting it.

So I set out to build a more effective customer risk scoring model, one that could help us identify high-risk customers and prevent money laundering in real time. It wasn't easy - we faced numerous challenges, from data quality issues to regulatory hurdles.

The Hidden Problem

In Bangladesh, the standard approaches to risk scoring often fall short. Why? Because they don't account for our unique fintech landscape, where mobile financial services (MFS) like bKash and Nagad dominate the market. The BFIU guidelines are clear - we need to monitor transactions above the BDT 100,000 threshold - but that's just the tip of the iceberg.

Our analysis revealed that most money laundering cases involve smaller, frequent transactions, often using multiple accounts and agents. To catch these cases, we needed a model that could analyze customer behavior, transaction patterns, and network relationships.
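To make the "smaller, frequent transactions" pattern concrete, here is a minimal sketch of one way to surface it with pandas. The column names (customer_id, amount, timestamp), the one-day window, and the minimum count of three transactions are assumptions for illustration, not the rules we ran in production.

```python
import pandas as pd

THRESHOLD_BDT = 100_000  # threshold cited in this article
MIN_TX_COUNT = 3         # assumed minimum number of small transactions to count as a pattern


def flag_possible_structuring(tx: pd.DataFrame) -> pd.DataFrame:
    """Return customer-days where sub-threshold transactions add up to the threshold or more."""
    # Only look at transactions that individually stay below the threshold
    small = tx[tx["amount"] < THRESHOLD_BDT].copy()
    small["day"] = small["timestamp"].dt.floor("D")
    daily = (
        small.groupby(["customer_id", "day"])["amount"]
        .agg(total="sum", n_tx="count")
        .reset_index()
    )
    return daily[(daily["total"] >= THRESHOLD_BDT) & (daily["n_tx"] >= MIN_TX_COUNT)]


# Tiny illustrative dataset (made-up values)
tx = pd.DataFrame({
    "customer_id": ["A", "A", "A", "A", "B", "B", "C", "C", "C"],
    "amount": [30_000, 30_000, 30_000, 30_000, 150_000, 20_000, 30_000, 30_000, 30_000],
    "timestamp": pd.to_datetime([
        "2024-01-05 09:00", "2024-01-05 11:00", "2024-01-05 14:00", "2024-01-05 17:00",
        "2024-01-05 10:00", "2024-01-05 12:00",
        "2024-01-03 10:00", "2024-01-04 10:00", "2024-01-05 10:00",
    ]),
})

flagged = flag_possible_structuring(tx)
print(flagged)
```

A calendar-day bucket is the simplest choice; a rolling multi-day window or a per-agent breakdown would catch slower patterns at the cost of more tuning.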

Technical Breakdown & Logic Flow

Our approach involved a multi-step process: data collection, data preprocessing, feature engineering, model training, and model evaluation. We experimented with several machine learning algorithms, including decision trees, random forests, and neural networks.

First, we collected data from various sources, including customer information, transaction history, and account activity. We then preprocessed the data, handling missing values, outliers, and data normalization.
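As a rough illustration of those three preprocessing steps, the sketch below imputes missing values with the median, clips extreme values to the 1st and 99th percentiles, and standardizes the result. The percentile cutoffs, the choice of median imputation, and the sample values are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def preprocess(df: pd.DataFrame, numeric_cols: list) -> pd.DataFrame:
    """Impute, clip, and standardize the given numeric columns."""
    out = df.copy()
    for col in numeric_cols:
        # Missing values: fill with the column median
        out[col] = out[col].fillna(out[col].median())
        # Outliers: clip to the 1st and 99th percentiles (assumed cutoffs)
        lower, upper = out[col].quantile([0.01, 0.99])
        out[col] = out[col].clip(lower, upper)
    # Normalization: zero mean and unit variance per column
    out[numeric_cols] = StandardScaler().fit_transform(out[numeric_cols])
    return out


# Made-up sample with a missing value and an extreme value in each column
raw = pd.DataFrame({
    "transaction_value": [500.0, 1200.0, np.nan, 800.0, 5_000_000.0],
    "account_age": [30.0, 400.0, 120.0, np.nan, 900.0],
})
cols = ["transaction_value", "account_age"]
clean = preprocess(raw, cols)
print(clean)
```

In a real pipeline the medians, percentiles, and scaler would be fitted on the training split only and then reused on new data, so that information from the test set does not leak into the model.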

Next, we engineered features that could help us identify high-risk customers. These included metrics like transaction frequency, average transaction value, and account balance volatility.
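If the raw data is one row per transaction rather than one row per customer, these metrics can be derived with a groupby. The sketch below is one possible way to do it; the columns customer_id, amount, balance_after, and timestamp are hypothetical, and "active days" is used as a stand-in for account age.

```python
import pandas as pd


def build_customer_features(tx: pd.DataFrame) -> pd.DataFrame:
    """Aggregate transaction rows into one feature row per customer."""
    features = tx.groupby("customer_id").agg(
        transaction_count=("amount", "size"),
        average_transaction_value=("amount", "mean"),
        balance_volatility=("balance_after", "std"),
        first_seen=("timestamp", "min"),
        last_seen=("timestamp", "max"),
    )
    # Days between first and last transaction, counting both ends
    active_days = (features["last_seen"] - features["first_seen"]).dt.days + 1
    features["transaction_frequency"] = features["transaction_count"] / active_days
    # A single transaction has no standard deviation; treat that as zero volatility
    features["balance_volatility"] = features["balance_volatility"].fillna(0.0)
    return features.drop(columns=["first_seen", "last_seen"])


# Made-up sample: customer A transacts daily for four days, customer B once
tx = pd.DataFrame({
    "customer_id": ["A", "A", "A", "A", "B"],
    "amount": [100.0, 200.0, 300.0, 400.0, 50.0],
    "balance_after": [1000.0, 800.0, 1100.0, 700.0, 500.0],
    "timestamp": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-01"]
    ),
})

features = build_customer_features(tx)
print(features)
```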

We trained our model on a labeled dataset in which each customer was assigned a risk label (the risk_score column) based on their historical activity. We then evaluated the model using precision, recall, and F1-score.

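import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Load data (one row per customer; 'risk_score' is assumed to be a binary label, 1 = high risk)
data = pd.read_csv('customer_data.csv')

# Preprocess data
data = data.dropna()  # drop rows with missing values
# Keep only valid records; this also prevents division by zero in the ratios below
valid = (data['transaction_value'] > 0) & (data['transaction_count'] > 0) & (data['account_age'] > 0)
data = data[valid].copy()

# Engineer features ('transaction_value' is assumed to be the customer's total transacted amount)
data['transaction_frequency'] = data['transaction_count'] / data['account_age']
data['average_transaction_value'] = data['transaction_value'] / data['transaction_count']

# Split data into training and testing sets, using numeric feature columns only
X = data.drop('risk_score', axis=1).select_dtypes(include='number')
y = data['risk_score']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)

# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate model (binary averaging by default; pass average='macro' for multi-class labels)
y_pred = model.predict(X_test)
print('Precision:', precision_score(y_test, y_pred))
print('Recall:', recall_score(y_test, y_pred))
print('F1-score:', f1_score(y_test, y_pred))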

For the final model we settled on a random forest rather than a single decision tree or a support vector machine, because random forests handle complex, non-linear relationships between features and are typically more robust than a single tree when high-risk customers make up only a small share of the data.
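A random forest does not correct for class imbalance on its own, so it usually helps to say so explicitly. The sketch below shows one common option in scikit-learn: a stratified split plus class_weight="balanced", scored with average precision. The synthetic data from make_classification and the 3% positive rate are assumptions for the example, not our real figures.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for customer features: roughly 3% of customers are high risk
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.97, 0.03], random_state=42
)

# Stratify so the rare class appears in both splits in the same proportion
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# class_weight="balanced" reweights classes inversely to their frequency
model = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=42
)
model.fit(X_train, y_train)

# Score with probabilities rather than hard labels; average precision
# summarizes the precision-recall trade-off for the rare class
scores = model.predict_proba(X_test)[:, 1]
ap = average_precision_score(y_test, scores)
print(f"Average precision: {ap:.3f}")
```

Accuracy is a poor guide here, since predicting "low risk" for everyone would already be about 97% accurate on this data.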

Local Application

Our customer risk scoring model is designed to work within the Bangladeshi fintech landscape. We've incorporated BFIU guidelines and MFS realities into our model, ensuring that we're monitoring the right transactions and identifying high-risk customers.

For example, we've set up rules to flag transactions above the BDT 100,000 threshold, as required by the BFIU. We're also monitoring transactions that involve multiple accounts and agents, which is a common pattern in money laundering cases.
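Rules like these are simple to express on a transaction table. The following is a minimal sketch, assuming hypothetical columns customer_id, agent_id, and amount and an illustrative cutoff of more than three distinct agents per customer; the real rule set would be driven by the applicable BFIU requirements.

```python
import pandas as pd

THRESHOLD_BDT = 100_000   # threshold cited in this article
MAX_DISTINCT_AGENTS = 3   # assumed cutoff, for illustration only


def apply_rules(tx: pd.DataFrame) -> pd.DataFrame:
    """Add boolean rule columns and an overall 'flagged' column."""
    out = tx.copy()
    # Rule 1: a single transaction above the threshold
    out["over_threshold"] = out["amount"] > THRESHOLD_BDT
    # Rule 2: the customer has transacted through many distinct agents
    agents_per_customer = out.groupby("customer_id")["agent_id"].transform("nunique")
    out["many_agents"] = agents_per_customer > MAX_DISTINCT_AGENTS
    out["flagged"] = out["over_threshold"] | out["many_agents"]
    return out


# Made-up sample transactions
tx = pd.DataFrame({
    "customer_id": ["A", "B", "B", "B", "B", "C", "C"],
    "agent_id": ["a1", "b1", "b2", "b3", "b4", "c1", "c1"],
    "amount": [150_000, 10_000, 10_000, 10_000, 10_000, 5_000, 5_000],
})

result = apply_rules(tx)
flagged_customers = sorted(result.loc[result["flagged"], "customer_id"].unique())
print(flagged_customers)
```

Keeping each rule in its own column makes it easy to see why a transaction was flagged, and the rule outputs can also be fed to the model as features.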

The BFIU guidelines state that 'all transactions above BDT 100,000 must be reported to the authorities.' Our model ensures that we're complying with these regulations, while also identifying potential money laundering cases.

Common Pitfalls & Edge Cases

One of the biggest challenges we faced was handling false positives. Our model was flagging too many legitimate transactions, which was causing unnecessary friction for our customers.

To address this issue, we implemented a secondary review process, where our team would manually review flagged transactions to determine whether they were legitimate or not.
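Manual review scales better when fewer weak alerts reach it. One option, sketched here as an illustration rather than a description of our exact process, is to choose the score cutoff from a precision-recall curve so that alerts meet a target precision. The 0.8 target and the sample scores are made up.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def pick_threshold(y_true, scores, target_precision=0.8):
    """Return the lowest score cutoff whose precision meets the target, or None."""
    precision, recall, thresholds = precision_recall_curve(y_true, scores)
    # precision has one more entry than thresholds; the final entry has no cutoff
    meets_target = precision[:-1] >= target_precision
    if not meets_target.any():
        return None
    # thresholds are sorted ascending, so the first match keeps recall as high as possible
    return float(thresholds[meets_target][0])


# Made-up validation labels (1 = confirmed suspicious) and model scores
y_true = np.array([0, 0, 0, 0, 1, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9])

threshold = pick_threshold(y_true, scores, target_precision=0.8)
to_review = scores >= threshold
print("Cutoff:", threshold, "| cases sent to review:", int(to_review.sum()))
```

Raising the cutoff trades missed cases for fewer false alarms, so the target precision is a policy decision as much as a technical one.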

We also encountered edge cases: customers who were using our services for legitimate purposes but had unusual transaction patterns. For example, one customer was sending large amounts of money to multiple recipients, but was doing so for a legitimate business purpose.

To handle these edge cases, we developed a set of rules and guidelines that our team could follow to determine whether a transaction was legitimate or not.

Counterintuitive Insight

One of the most surprising findings from our analysis was that an unusual transaction pattern, taken on its own, was a weak signal: many customers with unusual patterns turned out to be using our services for legitimate purposes, and they were less likely to be involved in money laundering than we had assumed.

This counterintuitive insight challenged our initial assumptions and forced us to rethink our approach to risk scoring. We realized that we needed to focus on customer behavior and transaction patterns, rather than just looking at individual transactions.

Conclusion

Building a customer risk scoring model for Bangladeshi fintechs is a complex task, but it's essential for preventing money laundering and ensuring regulatory compliance. Our approach, which combines machine learning algorithms with local regulations and MFS realities, has proven effective in identifying high-risk customers.

So, what's the weirdest transaction pattern you've seen? Drop a comment below and let's discuss. Have you implemented a customer risk scoring model at your organization? What challenges did you face, and how did you overcome them? Share your experiences and let's learn from each other.

Check out other resources on aitipseveryday.com, including our resources page, which provides a wealth of information on AML, Python, and data science.
