I Built a BFIU-Compliant AML Detection System in Python (Here's Why the Kaggle Approach Doesn't Work)

Most AML tutorials end with a confusion matrix and a 99% accuracy score. Here's why that doesn't work, and what I built instead.

I've been working in fintech compliance data for a while. The one thing I kept noticing: every "fraud detection project" on GitHub or Kaggle uses the same dataset, the ULB credit card fraud dataset from 2013. It has about 284,000 rows, 30 features (28 of them anonymized components labeled V1-V28), and approximately zero explanatory value for anyone who wants to understand how financial crime actually works. So I built something different.

The problem with the standard approach

Real transaction monitoring engines don't work like Kaggle competitions. They don't take a CSV, train a model, and output a probability score. They work like this:

  • A rule engine runs first: deterministic, auditable, regulatory-cited rules that generate alerts
  • Those alerts get scored and triaged by risk tier
  • An ML layer reduces false positives among the high-risk alerts

...
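To make the rule-first flow concrete, here is a minimal sketch of the three stages in plain Python. It is an illustration under stated assumptions, not the system described in this post: the rule names, the cash threshold, the scores, the tier cutoffs, and the `predict_fp_probability` callable are all hypothetical placeholders, and none of them come from BFIU guidance.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only (not regulatory figures).
CASH_THRESHOLD = 1_000_000      # assumed reporting threshold
STRUCTURING_MIN_COUNT = 3       # assumed count of near-threshold deposits


@dataclass
class Transaction:
    account: str
    amount: float
    is_cash: bool


@dataclass
class Alert:
    account: str
    rule_id: str   # records which rule fired, so the alert stays auditable
    score: int
    tier: str = ""


def run_rules(txns):
    """Stage 1: deterministic rules that generate alerts."""
    alerts = []
    near_threshold = {}
    for t in txns:
        if t.is_cash and t.amount >= CASH_THRESHOLD:
            alerts.append(Alert(t.account, "LARGE_CASH", score=60))
        elif t.is_cash and t.amount >= 0.9 * CASH_THRESHOLD:
            # Count cash deposits that sit just under the threshold.
            near_threshold[t.account] = near_threshold.get(t.account, 0) + 1
    for account, count in near_threshold.items():
        if count >= STRUCTURING_MIN_COUNT:
            alerts.append(Alert(account, "POSSIBLE_STRUCTURING", score=80))
    return alerts


def assign_tier(alert):
    """Stage 2: map the rule score to a risk tier (cutoffs are assumed)."""
    if alert.score >= 70:
        alert.tier = "high"
    elif alert.score >= 40:
        alert.tier = "medium"
    else:
        alert.tier = "low"
    return alert


def run_pipeline(txns, predict_fp_probability, fp_cutoff=0.9):
    """Stage 3: drop high-tier alerts the model rates as likely false positives.

    predict_fp_probability is any callable taking an Alert and returning a
    float in [0, 1]; a trained classifier would be plugged in here.
    """
    kept = []
    for alert in map(assign_tier, run_rules(txns)):
        if alert.tier == "high" and predict_fp_probability(alert) >= fp_cutoff:
            continue
        kept.append(alert)
    return kept


if __name__ == "__main__":
    sample = [
        Transaction("A", 1_200_000, True),
        Transaction("B", 950_000, True),
        Transaction("B", 920_000, True),
        Transaction("B", 990_000, True),
        Transaction("C", 5_000, False),
    ]
    # Stub model: treats every alert as unlikely to be a false positive.
    for alert in run_pipeline(sample, lambda a: 0.2):
        print(alert)
```

The point of the ordering is that the model never decides what counts as suspicious. It only filters alerts that a named rule has already raised, so every surviving alert can still be traced back to a `rule_id`.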

My Toolkit: The Essential Tools & Resources That Power My Workflow

In data science, having the right tools can make all the difference. This page is a curated list of the software, libraries, and resources I use daily to learn, code, and build projects. I hope this can be a helpful guide for other aspiring data scientists.

Core Development Environment

  • Code Editor: Visual Studio Code - My go-to editor for its flexibility, powerful extensions (like Python and Jupyter), and integrated terminal.
  • Version Control: Git & GitHub - Essential for tracking changes, collaborating, and showcasing my work.
  • Terminal: Git Bash (on Windows) - Provides a powerful Linux-like command-line experience on Windows.

Data Science & Machine Learning

  • Python: The primary language for my data science work.
  • Jupyter Notebooks: Perfect for interactive data analysis, prototyping, and storytelling with code and visualizations.
  • Pandas & NumPy: The backbone of my data manipulation and numerical analysis workflow.
  • Matplotlib & Seaborn: My chosen libraries for creating insightful and beautiful data visualizations.
  • Scikit-Learn: My entry point into the world of machine learning for building predictive models.

Learning & Knowledge

  • Kaggle: An incredible platform for finding datasets, practicing with notebooks, and learning from the community.
  • Coursera & freeCodeCamp: My primary sources for structured learning and high-quality educational content.
  • Medium & Towards Data Science: For staying updated with the latest trends, tutorials, and case studies.
