I Built a BFIU-Compliant AML Detection System in Python (Here's Why the Kaggle Approach Doesn't Work)

Most AML tutorials end with a confusion matrix and a 99% accuracy score. Here's why that doesn't work, and what I built instead.

I've been working in fintech compliance data for a while. The one thing I kept noticing: every "fraud detection project" on GitHub or Kaggle uses the same dataset, the ULB credit card fraud dataset from 2013. It has about 284,000 rows, 30 features (28 of them anonymized components labeled V1-V28), and approximately zero explanatory value for anyone who wants to understand how financial crime actually works. So I built something different.

The problem with the standard approach

Real transaction monitoring engines don't work like Kaggle competitions. They don't take a CSV, train a model, and output a probability score. They work like this:

A rule engine runs first: deterministic, auditable, regulatory-cited rules that generate alerts

Those alerts get scored and triaged by risk tier

An ML layer reduces false positives among the high-risk alerts

...

Python AML Toolkit

Python · AML · Fintech

Stop submitting the same Kaggle project as your fintech portfolio

A production-style AML & Fraud Detection Toolkit — rule engine, ML layer, SAR export — calibrated against BFIU guidelines and real Bangladesh MFS transaction patterns. The architecture works for any digital payments context globally.

Get the Toolkit — $39 one-time · all future updates included

The problem with standard AML tutorials

📊

Same dataset, every time

Kaggle's credit card fraud dataset has been reused in countless portfolio projects. Hiring managers stop reading at "accuracy: 99.2%".

⚠️

No rule engine

Real compliance teams run weighted rule sets. A classifier alone isn't how transaction monitoring actually works.

📄

No regulatory context

"I tuned a model" means nothing without knowing what BFIU Circular 02/2019 says about structuring thresholds.

🔢

Global false positives

Generic tools misfire constantly on MFS data — round-amount rules that fire on every BDT 500 bazar purchase.
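To make the round-amount problem concrete, here is a minimal sketch in plain Python that contrasts a generic round-amount rule with a baseline-relative one. The BDT 50,000 floor and the 5× median multiple come from the ROUND_AMOUNT rule listed on this page. The function names, the definition of "round" as a multiple of `step`, and the sample history are assumptions made for illustration, not the toolkit's actual code.

```python
from statistics import median


def naive_round_amount(amount, step=500):
    """Generic rule: flag any amount that is a multiple of `step` (assumed definition)."""
    return amount % step == 0


def baseline_round_amount(amount, sender_history, floor=50_000, multiple=5, step=1_000):
    """Baseline-relative rule: the amount must be round, at least `floor`,
    and at least `multiple` times the sender's own median amount."""
    if not sender_history:
        # No baseline yet. A real engine would need an explicit fallback policy here.
        return False
    return (
        amount % step == 0
        and amount >= floor
        and amount >= multiple * median(sender_history)
    )


# Hypothetical sender who mostly makes small everyday purchases (median is BDT 500).
history = [500, 300, 500, 1200, 500]

print(naive_round_amount(500))                 # the generic rule fires on a BDT 500 purchase
print(baseline_round_amount(500, history))     # the baseline-relative rule stays quiet
print(baseline_round_amount(60_000, history))  # a large round transfer far above the sender's norm
```

The design choice is that the same BDT 60,000 transfer is unremarkable for a sender whose median is already in that range, so the rule compares each amount against the sender's own history instead of a single global cutoff.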

How the toolkit works

Synthetic bKash / Nagad Transactions (10,000+)
↓
Rule Engine: 6 BFIU-calibrated rules
┌─────────────────────────────────────────────────────────────┐
  STRUCTURING     ≥3 txns below BDT 10,000 within 24h
  VELOCITY        ≥5 txns within any 60-min window
  DORMANT_SPIKE   30+ day inactive → sudden surge
  LATE_NIGHT      transactions between 01:00 – 04:00 AM
  ROUND_AMOUNT    ≥BDT 50,000 AND ≥5× sender's own median
  HIGH_VALUE      single transaction above BDT 20,000
└─────────────────────────────────────────────────────────────┘
↓
Composite Risk Score (0–100) → LOW / MEDIUM / HIGH
↓
Threshold Backtesting → tune precision vs recall
↓
LightGBM ML Layer → reduces false positives by ~40%
↓
SAR Candidates Export → compliance-ready .csv
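As a rough illustration of how the rule engine and composite score stages above could fit together, the sketch below evaluates four of the six rules in plain Python. The rule thresholds are taken from the rule table above. The weights, the tier cutoffs (40 and 70), the transaction dict layout, and all function names are assumptions for illustration only; the toolkit itself lists Pandas in its stack and may be structured quite differently. DORMANT_SPIKE and ROUND_AMOUNT are left out to keep the example short.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Illustrative weights (assumed, not the toolkit's calibration).
WEIGHTS = {"STRUCTURING": 35, "VELOCITY": 25, "LATE_NIGHT": 10, "HIGH_VALUE": 30}


def tier(score):
    """Map a 0-100 score to an alert tier (cutoffs are assumptions)."""
    if score >= 70:
        return "HIGH"
    if score >= 40:
        return "MEDIUM"
    return "LOW"


def evaluate(transactions):
    """Return one result dict per transaction, in chronological order."""
    history = defaultdict(list)  # sender -> [(timestamp, amount), ...] seen so far
    results = []
    for txn in sorted(transactions, key=lambda t: t["ts"]):
        ts, amount = txn["ts"], txn["amount"]
        past = history[txn["sender"]]
        past.append((ts, amount))
        # Amounts sent by this sender in the trailing windows, current txn included.
        last_24h = [a for t, a in past if ts - t <= timedelta(hours=24)]
        last_60m = [a for t, a in past if ts - t <= timedelta(minutes=60)]

        fired = []
        if amount < 10_000 and sum(a < 10_000 for a in last_24h) >= 3:
            fired.append("STRUCTURING")
        if len(last_60m) >= 5:
            fired.append("VELOCITY")
        if 1 <= ts.hour < 4:  # 01:00 up to, but not including, 04:00
            fired.append("LATE_NIGHT")
        if amount > 20_000:
            fired.append("HIGH_VALUE")

        score = min(100, sum(WEIGHTS[r] for r in fired))
        results.append({**txn, "rules": fired, "score": score, "tier": tier(score)})
    return results


# Hypothetical sample: sender A sends three sub-threshold transfers at night.
demo = [
    {"sender": "A", "amount": 9_500, "ts": datetime(2024, 1, 5, 2, 0)},
    {"sender": "A", "amount": 9_500, "ts": datetime(2024, 1, 5, 2, 10)},
    {"sender": "A", "amount": 9_500, "ts": datetime(2024, 1, 5, 2, 20)},
    {"sender": "B", "amount": 25_000, "ts": datetime(2024, 1, 5, 14, 0)},
]
results = evaluate(demo)
for r in results:
    print(r["sender"], r["amount"], r["rules"], r["score"], r["tier"])
```

Because every fired rule is recorded by name next to the score, each alert stays explainable: an analyst can see which rules contributed, which is the auditability a classifier alone does not give.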

What's inside the full toolkit

Synthetic MFS data generator — 10,000+ realistic txns with injected typologies

6-rule BFIU-calibrated rule engine with weighted scoring

Composite risk scoring 0–100 with tiered alert levels

Threshold backtesting — simulate rule changes before deployment

LightGBM ML layer trained on rule-enriched features

SAR candidate export in compliance-ready format

EDD regulatory profile builder per flagged account

Compliance dashboard — 6 charts, production Jupyter notebooks

RULE_CALIBRATION.md — every rule cited to BFIU circulars

Full test suite + CI/CD pipeline

All future updates — network graph, SAR PDF, REST API (roadmap)

Private GitHub repo access via Gumroad post-purchase
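The threshold backtesting item in the list above can be pictured with a short sketch: given risk scores with known labels (which synthetic data with injected typologies would provide), sweep the alert threshold and compare alert volume, precision, and recall. The function name, the input layout, and the sample scores are assumptions for illustration and are not taken from the toolkit.

```python
def backtest(scored, thresholds):
    """scored: list of (risk_score, is_suspicious) pairs with known labels.
    Returns one summary row per candidate alert threshold."""
    rows = []
    for th in thresholds:
        tp = sum(1 for s, y in scored if s >= th and y)      # suspicious and alerted
        fp = sum(1 for s, y in scored if s >= th and not y)  # clean but alerted
        fn = sum(1 for s, y in scored if s < th and y)       # suspicious but missed
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        rows.append(
            {"threshold": th, "alerts": tp + fp, "precision": precision, "recall": recall}
        )
    return rows


# Hypothetical labeled scores: (composite risk score, injected-typology flag).
sample = [
    (85, True), (72, True), (65, False), (55, True),
    (45, False), (30, False), (20, False), (10, False),
]

rows = backtest(sample, thresholds=[40, 60, 80])
for row in rows:
    print(row)
```

On this toy sample, raising the threshold from 40 to 80 cuts the alert count from 5 to 1 and lifts precision from 0.6 to 1.0, while recall drops from 1.0 to about 0.33. That trade-off is what a compliance team would want to see before changing a rule in production.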

Preview vs Full Toolkit

Feature Preview (Free) Full Toolkit ($39)
Data generation notebook
500-row sample dataset
Rule engine (6 BFIU rules)
Composite risk scoring (0–100)
Threshold backtesting
LightGBM ML layer
SAR candidates export
EDD regulatory profiler
Compliance dashboard (6 charts)
RULE_CALIBRATION.md (BFIU citations)
Full test suite + CI/CD
Future updates included
Private GitHub repo access

Who this is for

ML/Data job seekers

Targeting fintech, fraud, or AML roles? This project gets you a portfolio piece interviewers actually ask about — not just a confusion matrix.

AML analysts learning Python

You know BFIU guidelines. This turns that domain knowledge into working code, end-to-end, without starting from scratch.

Fintech developers

A working POC transaction monitoring engine you can adapt, extend, or demo to compliance stakeholders.

Freelancers & consultants

Pitching banks, MFIs, or compliance vendors in Bangladesh or South Asia? Show them a live demo of domain-calibrated AML capability.

Tech stack

Pure Python. No paid APIs. No cloud setup required. Runs locally on Windows / Mac / Linux.

Python 3.9+ Pandas NumPy LightGBM Scikit-learn Matplotlib Seaborn Jupyter

Common questions

Is this only useful for Bangladesh?

No. The BD calibration (bKash/Nagad thresholds, BDT amounts, BFIU citations) makes it a great portfolio piece for South Asia. But the architecture (rule engine + baseline-relative thresholds + ML layer) applies to any MFS or digital payments context globally. UPI, JazzCash, M-Pesa, and Paytm all share the same calibration problem.

What Python level do I need?

Intermediate. You should be comfortable with Pandas and Jupyter notebooks. The code is heavily commented and includes a walkthrough notebook. No prior AML/compliance knowledge required — RULE_CALIBRATION.md explains the regulatory logic.

How do I get the code after buying?

Immediately after purchase, Gumroad sends an email with a GitHub collaborator invitation. You accept it and get access to the private repo. The download link in Gumroad also includes a zip of all notebooks and scripts.

Does the $39 include future updates?

Yes. The roadmap includes transaction network graph (NetworkX), SAR PDF generator, and a REST API wrapper. All updates go into the same private repo — you get them automatically as a collaborator.

Ready to build a real AML portfolio project?

One-time payment. Instant GitHub access. All future updates included.

Get the Toolkit — $39 →
Questions? Email: monsurhabib01@gmail.com
