FinTech engineering is the field where a single misplaced decimal point can move millions of dollars, where regulatory compliance is not optional paperwork but executable code that must be correct on every path, and where the Silicon Valley mantra of “move fast and break things” gets you fined by regulators, sued by customers, or both. Your code sits at the intersection of financial domain expertise, distributed systems engineering, security hardening, and regulatory compliance — and the AI coding tools that work brilliantly for web developers produce output that compiles, passes basic tests, and quietly loses money in production.
This guide evaluates every major AI coding tool through the lens of what FinTech engineers actually build: not toy payment forms, not tutorial-grade calculators, but production systems that process millions of transactions per day with exact penny accuracy, satisfy auditors who will trace every cent through your ledger, survive adversaries actively trying to steal money, and comply with regulations that span dozens of jurisdictions. We tested each tool on real FinTech tasks: building idempotent payment processors with proper Decimal arithmetic, implementing double-entry ledger systems with atomic multi-entry transactions, writing order matching engines with price-time priority, building real-time risk calculators with position limits, automating compliance workflows for AML/KYC, scoring transactions for fraud in under 50 milliseconds, and integrating Open Banking APIs with PSD2 Strong Customer Authentication.
If you work primarily on backend services and API design, see the Backend Engineers guide. If your focus is security architecture and threat modeling, see the Security Engineers guide. If you work on regulatory compliance systems more broadly, see the Compliance Engineers guide. If your role centers on data pipelines that feed financial analytics, see the Data Engineers guide. This guide is specifically for engineers building the financial systems themselves — the code that moves money, keeps the books, manages risk, and satisfies regulators.
- Best free ($0): GitHub Copilot Free — decent boilerplate for payment gateway integrations, knows Stripe/Adyen API shapes, 2,000 completions/mo covers personal FinTech projects.
- Best overall ($20/mo): Cursor Pro — multi-file context handles payment service + ledger + compliance config together, strong at generating gateway abstraction layers and API integration boilerplate.
- Best for reasoning ($20/mo): Claude Code — strongest at regulatory logic, ledger correctness proofs, double-entry invariant checking, and reasoning through edge cases in settlement and reconciliation.
- Best combo ($30/mo): Claude Code + Copilot Pro — Claude for compliance reasoning, ledger design, and risk logic; Copilot for fast inline completions on API integrations, data models, and boilerplate transaction handlers.
Why FinTech Engineering Is Different
FinTech engineering operates under constraints that most software engineers never encounter. Your inputs are adversarial, your precision requirements are absolute, your compliance obligations carry criminal penalties, and the gap between “works in tests” and “works in production without losing money” is wider than in almost any other engineering discipline:
- Financial precision is non-negotiable: Floating point is forbidden for money. IEEE 754 double-precision loses precision above 2^53 — which sounds like a lot until you realize that is only about 9 quadrillion cents, and a high-volume payment processor accumulates totals that make this relevant. You need `Decimal` or `BigDecimal` everywhere: amounts, fees, tax calculations, exchange rates, accumulated totals. Banker's rounding (round half to even) is required for financial calculations — not `round()`, which uses round-half-away-from-zero in most languages. Currency-specific decimal places vary: JPY has 0 decimals, USD/EUR have 2, BHD/KWD have 3, and some crypto tokens have 18. Converting between currencies requires proper rounding at every step, and the rounding direction matters (you round in favor of the customer for display, in favor of the bank for settlement). AI tools generate `price * quantity` with floats and lose money on every transaction. They use `round(amount, 2)` instead of `Decimal.quantize()` with explicit rounding mode. They store amounts as `float` columns in PostgreSQL instead of `NUMERIC(19,4)`. Every single one of these “minor” issues compounds across millions of transactions into real financial loss.
- Regulatory compliance is executable code: PCI-DSS for card data means 12 requirements and over 400 individual controls — specifying exactly which encryption algorithms you may use, how often keys must be rotated, what must and must not appear in logs, and how long audit trails must be retained. SOX for financial reporting mandates separation of duties in code deployment. SOC 2 for service providers requires evidence of access controls and change management. GDPR for EU customer data imposes right-to-deletion that conflicts with immutable audit trail requirements. PSD2 and Open Banking mandate specific API standards and Strong Customer Authentication flows. MiFID II for trading requires transaction reporting within T+1 and best execution proof. AML/KYC regulations require identity verification, transaction monitoring, and Suspicious Activity Report filing within specific timeframes.
Every one of these regulations has specific technical requirements: specific encryption algorithms (AES-256, not AES-128), specific key rotation schedules (at least annually for PCI), specific audit log retention periods (7 years for AML, 5 years for MiFID II), and specific data handling procedures. AI tools generate code that works functionally but violates three regulations you did not know applied to your use case.
- Audit trails are not optional logging: Every financial operation must be traceable, immutable, and reconstructable. This is not “log what happened” but “prove what happened in a way that holds up in court and satisfies regulators.” Audit events must capture who initiated the action, what the action was, when it occurred (with synchronized timestamps), why it was authorized (approval chain), the complete before-state and after-state of every affected record, and be tamper-evident (cryptographic chaining or append-only storage). Financial regulators can request full transaction reconstruction going back 7 or more years. A missing audit entry is not a bug — it is a compliance violation that can result in fines. AI tools generate `logger.info("payment processed")` when you need a cryptographically signed, append-only audit event with full transaction context, correlation IDs linking to the original request, and structured fields that enable automated compliance queries.
- Idempotency is a money problem: A network timeout during payment processing means you do not know if the charge went through. Retry without idempotency means double-charging the customer. Every mutation in a financial system must have an idempotency key. Every API endpoint must handle duplicate submissions gracefully. Every webhook must be safely re-deliverable without creating duplicate effects. Every inter-service call must implement exactly-once semantics (or at-least-once with idempotent receivers). This is not an optimization — it is a correctness requirement. A single double-charge creates a customer complaint, a chargeback fee ($25–$75), potential card network fines, and regulatory scrutiny. At scale, non-idempotent operations create systematic financial discrepancies that are nearly impossible to reconcile. AI tools generate non-idempotent endpoints that will double-charge customers the first time a network blip occurs between your service and the payment gateway.
- Latency has dollar values: In high-frequency trading, microseconds are worth millions annually. In payment processing, every 100ms of additional latency at checkout produces a measurable conversion drop — Amazon famously measured that every 100ms of latency cost them 1% in sales. In risk calculation, stale data means wrong position limits, which means potential for catastrophic loss when markets move fast. Lock-free data structures, kernel bypass networking (DPDK, RDMA), co-located servers, and FPGA acceleration are not exotic optimizations but baseline requirements for competitive trading systems. Even in “slow” FinTech like retail banking, SLAs are tight: payment initiation must respond within seconds, account balance queries must return real-time consistent results, and fraud scoring must complete before the authorization response deadline (typically 2–3 seconds for card transactions). AI tools optimize for readability when you need nanosecond-optimized hot paths, and they reach for `await asyncio.sleep()` retry patterns when you need sub-millisecond response times.
- Money must balance to the penny: Double-entry bookkeeping is not an accounting convention you can ignore — it is the fundamental correctness invariant of financial software. Every debit has a matching credit. Every credit has a matching debit. The sum of all debits must equal the sum of all credits at all times, for every account, for every currency, for every reporting period. Settlement and reconciliation between systems that may temporarily disagree must converge within defined windows. Handling partial payments, chargebacks, refunds, currency conversions, interchange fees, platform fees, taxes, and regulatory holds — each creates additional ledger entries that must all balance. A bug that creates a $0.01 discrepancy per transaction across 10 million daily transactions means $100,000 per day of unexplained variance. AI tools do not understand that financial operations must be atomic across multiple ledger entries — they will happily generate code that debits one account and credits another in separate database transactions, creating windows where money appears or disappears.
- Fraud detection is adversarial and real-time: Unlike most software where bugs are accidental, FinTech systems face active, sophisticated adversaries whose full-time job is stealing money. Transaction scoring must happen in under 50 milliseconds at checkout — you cannot add seconds of latency to every card swipe. Rule engines must be updateable without code deployment because fraud patterns change daily. False positive and false negative tradeoffs have direct dollar costs: every false positive costs $20–$50 in manual investigation, and every missed fraud event costs the average of the transaction amount ($500+) plus chargeback fees plus potential card network fines. Velocity checks (5 transactions from same card in 60 seconds), device fingerprinting (same device, different identities), behavioral analysis (unusual purchase patterns), and network graph analysis (connected accounts with suspicious patterns) all must execute in the authorization hot path. AI tools generate rule-based fraud checks — `if amount > 10000: flag()` — that a motivated attacker bypasses by splitting transactions, and they never consider the scoring pipeline latency budget.
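The precision point is easy to demonstrate with a few lines of standard-library Python — a minimal sketch with made-up amounts, not production code:

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# The float trap: 2.675 has no exact binary representation, so
# round() sees a value slightly below 2.675 and returns 2.67.
assert round(2.675, 2) == 2.67  # silently off by a cent

# Decimal built from a string is exact, and quantize() forces an
# explicit rounding mode.
amount = Decimal("2.665")

# Banker's rounding (ROUND_HALF_EVEN): ties go to the even digit.
assert str(amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)) == "2.66"

# Round-half-up instead accumulates a systematic upward bias.
assert str(amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)) == "2.67"

# Currency-specific precision: JPY has no minor unit at all.
assert str(Decimal("1234.5").quantize(Decimal("1"), rounding=ROUND_HALF_EVEN)) == "1234"
```

Across millions of transactions, the difference between these rounding modes is not academic: half-even is statistically unbiased, which is why financial calculations require it.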
Task Support Matrix
We tested each AI coding tool on the core tasks that define FinTech engineering work. Ratings reflect production-quality output, not tutorial-grade code:
| Task | Cursor | Copilot | Claude Code | Windsurf | Tabnine | Amazon Q |
|---|---|---|---|---|---|---|
| Payment Processing & Gateways | A | A− | A | B+ | B | B+ |
| Ledger & Double-Entry Bookkeeping | B+ | B | A | B | B− | B |
| Trading Systems & Order Management | B+ | B | A− | B | C+ | B− |
| Risk Management & Position Limits | B | B− | A− | B− | C+ | B− |
| Compliance Automation & Reporting | B+ | B | A | B | B− | B+ |
| Fraud Detection & Monitoring | B+ | B | A− | B | C+ | B |
| Banking API & Open Banking (PSD2) | A− | B+ | A− | B+ | B | B+ |
How to read this table: Ratings reflect production-quality output for each domain. An “A” means the tool generates code that an experienced FinTech engineer would accept with minor edits — proper Decimal arithmetic, idempotency handling, and compliance-aware patterns. A “C” means the output requires substantial rewriting or demonstrates fundamental misunderstandings of financial engineering requirements (using floats for money, missing idempotency, logging sensitive data). We tested with explicit, domain-specific prompts — vague prompts produce worse results across all tools.
1. Payment Processing & Gateway Integration
Payment processing is where FinTech engineering meets the real world of card networks, gateway APIs, webhook deliveries, and the fundamental uncertainty of distributed systems. When your server sends a charge request to Stripe and the connection times out, you do not know if the charge succeeded. Your code must handle this uncertainty correctly every time, because getting it wrong means either losing revenue (charge succeeded but you think it failed) or double-charging the customer (charge failed but you retry and it succeeds twice). Production payment systems must handle idempotency keys, PCI-DSS compliant tokenization flows, multi-gateway failover, webhook signature verification, and proper error classification that distinguishes declines from errors from fraud signals.
Production Payment Processor with Idempotency
This is the kind of payment processor that actually runs in production. Note the Decimal arithmetic everywhere, the idempotency key handling, the structured error classification, and the audit trail on every operation:
```python
from decimal import Decimal, ROUND_HALF_EVEN, InvalidOperation
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Dict, Any
from datetime import datetime, timezone
import hashlib
import hmac
import uuid
import json


class GatewayError(Exception):
    """Gateway returned an error (defined here so the example is self-contained)."""


class GatewayTimeoutError(GatewayError):
    """Gateway did not respond in time — charge status is UNKNOWN."""


class PaymentStatus(Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    DECLINED = "declined"
    REQUIRES_ACTION = "requires_action"  # 3DS, SCA
    REFUNDED = "refunded"
    PARTIALLY_REFUNDED = "partially_refunded"


class ErrorCategory(Enum):
    """Classifying errors determines retry behavior and customer messaging."""
    DECLINE_INSUFFICIENT_FUNDS = "decline_insufficient_funds"  # Do not retry
    DECLINE_CARD_EXPIRED = "decline_card_expired"              # Do not retry
    DECLINE_FRAUD = "decline_fraud"                            # Do not retry, flag
    GATEWAY_TIMEOUT = "gateway_timeout"                        # Safe to retry
    GATEWAY_ERROR = "gateway_error"                            # Safe to retry
    NETWORK_ERROR = "network_error"                            # Safe to retry
    INVALID_REQUEST = "invalid_request"                        # Do not retry, fix code
    RATE_LIMITED = "rate_limited"                              # Retry with backoff


# Currency decimal places - AI tools hardcode 2, which is wrong
CURRENCY_DECIMALS: Dict[str, int] = {
    "USD": 2, "EUR": 2, "GBP": 2, "CAD": 2, "AUD": 2,
    "JPY": 0, "KRW": 0, "VND": 0,  # Zero-decimal currencies
    "BHD": 3, "KWD": 3, "OMR": 3,  # Three-decimal currencies
}


@dataclass
class Money:
    """Proper money representation. Never use float for money.

    Stores amount as Decimal with currency-specific precision.
    All arithmetic uses banker's rounding (ROUND_HALF_EVEN).
    """
    amount: Decimal
    currency: str

    def __post_init__(self):
        if isinstance(self.amount, float):
            raise TypeError(
                "Never construct Money from float. Use string: "
                "Money(Decimal('10.99'), 'USD')"
            )
        if isinstance(self.amount, (int, str)):
            self.amount = Decimal(str(self.amount))
        self.currency = self.currency.upper()
        # Quantize to currency-specific decimal places
        decimals = CURRENCY_DECIMALS.get(self.currency, 2)
        self.amount = self.amount.quantize(
            Decimal(10) ** -decimals,
            rounding=ROUND_HALF_EVEN
        )

    def to_minor_units(self) -> int:
        """Convert to smallest currency unit (cents, pence, etc.).

        This is what payment gateways expect: Stripe takes amounts in cents.
        """
        decimals = CURRENCY_DECIMALS.get(self.currency, 2)
        return int(self.amount * (10 ** decimals))

    @classmethod
    def from_minor_units(cls, minor: int, currency: str) -> "Money":
        """Construct from gateway response (cents -> dollars)."""
        decimals = CURRENCY_DECIMALS.get(currency.upper(), 2)
        amount = Decimal(minor) / (10 ** decimals)
        return cls(amount=amount, currency=currency)

    def __add__(self, other: "Money") -> "Money":
        if self.currency != other.currency:
            raise ValueError(
                f"Cannot add {self.currency} and {other.currency}"
            )
        return Money(self.amount + other.amount, self.currency)

    def __sub__(self, other: "Money") -> "Money":
        if self.currency != other.currency:
            raise ValueError(
                f"Cannot subtract {self.currency} and {other.currency}"
            )
        return Money(self.amount - other.amount, self.currency)

    def __str__(self) -> str:
        return f"{self.amount} {self.currency}"


@dataclass
class PaymentRequest:
    """Immutable payment request with idempotency key."""
    idempotency_key: str  # Client-generated, prevents double-charge
    amount: Money
    customer_id: str
    payment_method_token: str  # Tokenized card, never raw PAN
    description: str
    metadata: Dict[str, str] = field(default_factory=dict)
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        # Validate idempotency key format
        if not self.idempotency_key or len(self.idempotency_key) > 255:
            raise ValueError("Idempotency key must be 1-255 characters")
        # Never allow raw card numbers in metadata
        for key, value in self.metadata.items():
            if self._looks_like_pan(value):
                raise ValueError(
                    f"PCI violation: metadata key '{key}' contains "
                    "what appears to be a card number"
                )

    @staticmethod
    def _looks_like_pan(value: str) -> bool:
        """Basic Luhn check to prevent accidental PAN logging."""
        digits = "".join(c for c in value if c.isdigit())
        if len(digits) < 13 or len(digits) > 19:
            return False
        # Luhn algorithm
        total = 0
        for i, d in enumerate(reversed(digits)):
            n = int(d)
            if i % 2 == 1:
                n *= 2
                if n > 9:
                    n -= 9
            total += n
        return total % 10 == 0


@dataclass
class PaymentResult:
    """Result of a payment attempt with full audit context."""
    request_id: str
    idempotency_key: str
    status: PaymentStatus
    gateway_transaction_id: Optional[str] = None
    error_category: Optional[ErrorCategory] = None
    error_message: Optional[str] = None
    amount: Optional[Money] = None
    processed_at: Optional[datetime] = None
    is_idempotent_replay: bool = False


class PaymentProcessor:
    """Production payment processor with idempotency, retry, and audit.

    Key invariants:
    1. Same idempotency_key always returns same result (no double-charge)
    2. All amounts use Decimal, never float
    3. No raw card data touches this service (tokenized only)
    4. Every operation produces an audit event
    5. Gateway errors are classified for correct retry behavior
    """

    def __init__(self, gateway, idempotency_store, audit_log, ledger):
        self.gateway = gateway
        self.idempotency_store = idempotency_store
        self.audit = audit_log
        self.ledger = ledger

    async def process_payment(
        self, request: PaymentRequest
    ) -> PaymentResult:
        """Process a payment with full idempotency guarantees.

        If this idempotency_key was seen before, returns the stored result
        without hitting the gateway again. This is critical: the gateway
        may have charged the card, so we must not retry.
        """
        # Step 1: Check idempotency store
        existing = await self.idempotency_store.get(request.idempotency_key)
        if existing is not None:
            self.audit.record(
                event="payment.idempotent_replay",
                idempotency_key=request.idempotency_key,
                original_request_id=existing.request_id,
            )
            existing.is_idempotent_replay = True
            return existing

        request_id = str(uuid.uuid4())

        # Step 2: Record intent before calling gateway
        # If we crash between gateway call and recording result,
        # the intent record lets us reconcile later
        await self.idempotency_store.set_pending(
            request.idempotency_key, request_id
        )
        self.audit.record(
            event="payment.initiated",
            request_id=request_id,
            idempotency_key=request.idempotency_key,
            amount=str(request.amount),
            customer_id=request.customer_id,
            # NEVER log payment_method_token in full - PCI violation
            payment_method_hint=request.payment_method_token[:8] + "...",
        )

        # Step 3: Call payment gateway
        try:
            gateway_result = await self.gateway.charge(
                amount_minor=request.amount.to_minor_units(),
                currency=request.amount.currency,
                payment_method=request.payment_method_token,
                idempotency_key=request.idempotency_key,
                metadata={
                    "internal_request_id": request_id,
                    "customer_id": request.customer_id,
                },
            )
            if gateway_result.succeeded:
                result = PaymentResult(
                    request_id=request_id,
                    idempotency_key=request.idempotency_key,
                    status=PaymentStatus.SUCCEEDED,
                    gateway_transaction_id=gateway_result.transaction_id,
                    amount=request.amount,
                    processed_at=datetime.now(timezone.utc),
                )
                # Step 4: Record in ledger (atomic with result storage)
                await self.ledger.record_payment(
                    request_id=request_id,
                    customer_id=request.customer_id,
                    amount=request.amount,
                    gateway_txn_id=gateway_result.transaction_id,
                )
            elif gateway_result.requires_action:
                result = PaymentResult(
                    request_id=request_id,
                    idempotency_key=request.idempotency_key,
                    status=PaymentStatus.REQUIRES_ACTION,
                    gateway_transaction_id=gateway_result.transaction_id,
                    amount=request.amount,
                )
            else:
                result = PaymentResult(
                    request_id=request_id,
                    idempotency_key=request.idempotency_key,
                    status=PaymentStatus.DECLINED,
                    error_category=self._classify_decline(
                        gateway_result.decline_code
                    ),
                    error_message=gateway_result.decline_message,
                    amount=request.amount,
                )
        except GatewayTimeoutError:
            # CRITICAL: We do NOT know if the charge went through.
            # Mark as processing; reconciliation job will resolve.
            result = PaymentResult(
                request_id=request_id,
                idempotency_key=request.idempotency_key,
                status=PaymentStatus.PROCESSING,
                error_category=ErrorCategory.GATEWAY_TIMEOUT,
                error_message="Gateway timeout - charge status unknown",
                amount=request.amount,
            )
        except GatewayError as e:
            result = PaymentResult(
                request_id=request_id,
                idempotency_key=request.idempotency_key,
                status=PaymentStatus.FAILED,
                error_category=ErrorCategory.GATEWAY_ERROR,
                error_message=str(e),
                amount=request.amount,
            )

        # Step 5: Store result for idempotency
        await self.idempotency_store.set_complete(
            request.idempotency_key, result
        )
        self.audit.record(
            event=f"payment.{result.status.value}",
            request_id=request_id,
            idempotency_key=request.idempotency_key,
            status=result.status.value,
            gateway_txn_id=result.gateway_transaction_id,
            error_category=(
                result.error_category.value if result.error_category else None
            ),
        )
        return result

    def _classify_decline(self, decline_code: str) -> ErrorCategory:
        """Map gateway decline codes to actionable categories.

        This mapping determines whether we retry, ask for new payment
        method, or flag for fraud review. Getting it wrong means either
        losing legitimate transactions or retrying hopeless ones.
        """
        fraud_codes = {"fraudulent", "stolen_card", "lost_card"}
        insufficient_codes = {"insufficient_funds", "withdrawal_limit"}
        expired_codes = {"expired_card", "invalid_expiry"}
        if decline_code in fraud_codes:
            return ErrorCategory.DECLINE_FRAUD
        elif decline_code in insufficient_codes:
            return ErrorCategory.DECLINE_INSUFFICIENT_FUNDS
        elif decline_code in expired_codes:
            return ErrorCategory.DECLINE_CARD_EXPIRED
        else:
            return ErrorCategory.GATEWAY_ERROR

    @staticmethod
    def verify_webhook_signature(
        payload: bytes, signature: str, secret: str
    ) -> bool:
        """Verify webhook signature from payment gateway.

        CRITICAL: Without this, attackers can forge payment success
        notifications and get goods/services without paying.
        Uses constant-time comparison to prevent timing attacks.
        """
        expected = hmac.new(
            secret.encode("utf-8"),
            payload,
            hashlib.sha256
        ).hexdigest()
        # Constant-time comparison prevents timing side-channel
        return hmac.compare_digest(expected, signature)
```
What AI tools get wrong: Cursor and Claude Code both generate structurally sound payment processors when prompted with domain-specific instructions. The critical differences emerge in the details. Copilot frequently uses float for amounts unless your existing codebase already uses Decimal — it mirrors what it sees. Windsurf generates clean gateway integration code but omits the idempotency store check, producing endpoints that will double-charge on retry. Tabnine produces correct Stripe API call shapes but misses the webhook signature verification entirely. Claude Code is the only tool that consistently generates the “gateway timeout = unknown state” handling without being explicitly prompted — it understands that a timeout is fundamentally different from an error. None of the tools generate the PAN detection in metadata (the _looks_like_pan check) without explicit prompting about PCI-DSS requirements.
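The processor above depends on an `idempotency_store` with `get`, `set_pending`, and `set_complete` methods but does not define one. A minimal in-memory sketch of that interface — suitable for tests only, since a production store must be a database table with a UNIQUE constraint on the key so two concurrent requests cannot both pass the `get()` check — might look like this:

```python
import asyncio
from typing import Any, Dict, Optional


class InMemoryIdempotencyStore:
    """Test-only sketch of the idempotency_store interface used above.

    A pending marker records that a request is in flight before the
    gateway is called; only completed results are returned as replays.
    """

    _PENDING = object()  # sentinel: request seen, result not yet stored

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self._lock = asyncio.Lock()

    async def get(self, key: str) -> Optional[Any]:
        """Return the stored result for this key, or None if unseen/pending."""
        async with self._lock:
            value = self._store.get(key)
            return None if value is self._PENDING else value

    async def set_pending(self, key: str, request_id: str) -> None:
        """Record intent before the gateway call; reject duplicate in-flight keys."""
        async with self._lock:
            if key in self._store:
                raise KeyError(f"Duplicate in-flight idempotency key: {key}")
            self._store[key] = self._PENDING

    async def set_complete(self, key: str, result: Any) -> None:
        """Store the final result so replays return it without a gateway call."""
        async with self._lock:
            self._store[key] = result
```

A database-backed implementation would replace the lock with the unique-constraint insert and keep a `status` column, so the reconciliation job can find keys stuck in the pending state after a crash.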
2. Ledger & Double-Entry Bookkeeping
A ledger is the source of truth for every financial system. If your payment processor says a charge succeeded but your ledger does not have the corresponding entries, you have a reconciliation problem that will consume days of engineering time and potentially trigger regulatory scrutiny. Double-entry bookkeeping is not optional accounting tradition — it is the invariant that makes financial systems auditable and correct. Every financial event creates at least two ledger entries: one debit and one credit. The sum of all debits must equal the sum of all credits at all times. If they do not, money has appeared from nowhere or disappeared into nowhere, and your system is broken.
Production Ledger with Double-Entry Invariants
This ledger implementation enforces the fundamental accounting equation at the type level. Note that a transaction cannot be created without balanced entries, amounts are always Decimal, and every mutation is atomic:
from decimal import Decimal, ROUND_HALF_EVEN
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Dict
from datetime import datetime, timezone
import uuid
class EntryType(Enum):
DEBIT = "debit"
CREDIT = "credit"
class AccountType(Enum):
"""Account types determine normal balance direction.
Assets and Expenses have normal DEBIT balances.
Liabilities, Equity, and Revenue have normal CREDIT balances.
Getting this wrong means your balance sheet is inverted.
"""
ASSET = "asset" # Normal debit balance (bank accounts, receivables)
LIABILITY = "liability" # Normal credit balance (payables, customer deposits)
EQUITY = "equity" # Normal credit balance (retained earnings)
REVENUE = "revenue" # Normal credit balance (fees earned, interest)
EXPENSE = "expense" # Normal debit balance (costs, chargebacks)
@dataclass(frozen=True)
class LedgerEntry:
"""A single ledger entry. Immutable after creation.
Entries are always created in balanced pairs/groups via Transaction.
You should never create a LedgerEntry directly in application code.
"""
entry_id: str
account_id: str
entry_type: EntryType
amount: Decimal # Always positive. Direction determined by entry_type.
currency: str
description: str
created_at: datetime
def __post_init__(self):
if not isinstance(self.amount, Decimal):
raise TypeError(f"Amount must be Decimal, got {type(self.amount)}")
if self.amount <= 0:
raise ValueError(f"Entry amount must be positive, got {self.amount}")
@dataclass(frozen=True)
class Transaction:
"""An atomic group of ledger entries that must balance.
The constructor enforces the fundamental accounting equation:
sum(debits) == sum(credits) for each currency.
A Transaction that does not balance cannot be created.
"""
transaction_id: str
entries: List[LedgerEntry]
reference: str # External reference (payment_id, order_id)
description: str
metadata: Dict[str, str]
created_at: datetime
def __post_init__(self):
if len(self.entries) < 2:
raise ValueError(
"Transaction must have at least 2 entries (debit + credit)"
)
# Verify balance per currency
balances: Dict[str, Dict[str, Decimal]] = {}
for entry in self.entries:
if entry.currency not in balances:
balances[entry.currency] = {
"debit": Decimal("0"),
"credit": Decimal("0"),
}
balances[entry.currency][entry.entry_type.value] += entry.amount
for currency, totals in balances.items():
if totals["debit"] != totals["credit"]:
raise ValueError(
f"Transaction does not balance for {currency}: "
f"debits={totals['debit']}, credits={totals['credit']}"
)
class Ledger:
"""Production ledger with double-entry bookkeeping.
Invariants enforced:
1. Every transaction balances (debits == credits per currency)
2. All amounts are Decimal with currency-specific precision
3. Entries are immutable and append-only (no edits, no deletes)
4. Corrections are made via reversing entries, never mutations
5. Trial balance is verifiable at any point in time
"""
def __init__(self, db):
self.db = db
async def record_payment(
self,
request_id: str,
customer_id: str,
amount: "Money",
gateway_txn_id: str,
) -> Transaction:
"""Record a successful payment in the ledger.
Creates balanced entries:
- DEBIT: Cash/Bank account (asset increases)
- CREDIT: Customer payment received (revenue increases)
For a platform with fees, this would be three+ entries:
- DEBIT cash for full amount
- CREDIT merchant payable for (amount - platform_fee)
- CREDIT platform revenue for platform_fee
"""
now = datetime.now(timezone.utc)
txn_id = str(uuid.uuid4())
entries = [
LedgerEntry(
entry_id=str(uuid.uuid4()),
account_id="account:cash:main",
entry_type=EntryType.DEBIT,
amount=amount.amount,
currency=amount.currency,
description=f"Payment received from customer {customer_id}",
created_at=now,
),
LedgerEntry(
entry_id=str(uuid.uuid4()),
account_id=f"account:revenue:payments",
entry_type=EntryType.CREDIT,
amount=amount.amount,
currency=amount.currency,
description=f"Payment revenue for request {request_id}",
created_at=now,
),
]
txn = Transaction(
transaction_id=txn_id,
entries=entries,
reference=request_id,
description=f"Payment {gateway_txn_id}",
metadata={
"gateway_txn_id": gateway_txn_id,
"customer_id": customer_id,
},
created_at=now,
)
# Atomic write: all entries in one DB transaction
async with self.db.transaction():
for entry in txn.entries:
await self.db.execute(
"""INSERT INTO ledger_entries
(entry_id, transaction_id, account_id, entry_type,
amount, currency, description, created_at)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8)""",
entry.entry_id, txn.transaction_id, entry.account_id,
entry.entry_type.value, entry.amount, entry.currency,
entry.description, entry.created_at,
)
await self.db.execute(
"""INSERT INTO transactions
(transaction_id, reference, description, metadata, created_at)
VALUES ($1, $2, $3, $4, $5)""",
txn.transaction_id, txn.reference, txn.description,
json.dumps(txn.metadata), txn.created_at,
)
return txn
async def record_refund(
self,
original_txn_id: str,
refund_amount: "Money",
reason: str,
) -> Transaction:
"""Record a refund as a reversing entry.
Refunds reverse the original entries:
- DEBIT: Revenue (decreases revenue)
- CREDIT: Cash/Bank (decreases assets)
For partial refunds, only the refunded amount is reversed.
The original transaction is never modified.
"""
now = datetime.now(timezone.utc)
txn_id = str(uuid.uuid4())
entries = [
LedgerEntry(
entry_id=str(uuid.uuid4()),
account_id="account:revenue:payments",
entry_type=EntryType.DEBIT,
amount=refund_amount.amount,
currency=refund_amount.currency,
description=f"Refund for transaction {original_txn_id}: {reason}",
created_at=now,
),
LedgerEntry(
entry_id=str(uuid.uuid4()),
account_id="account:cash:main",
entry_type=EntryType.CREDIT,
amount=refund_amount.amount,
currency=refund_amount.currency,
description=f"Refund disbursement for {original_txn_id}",
created_at=now,
),
]
txn = Transaction(
transaction_id=txn_id,
entries=entries,
reference=original_txn_id,
description=f"Refund: {reason}",
metadata={
"original_transaction_id": original_txn_id,
"refund_reason": reason,
},
created_at=now,
)
async with self.db.transaction():
for entry in txn.entries:
await self.db.execute(
"""INSERT INTO ledger_entries
(entry_id, transaction_id, account_id, entry_type,
amount, currency, description, created_at)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8)""",
entry.entry_id, txn.transaction_id, entry.account_id,
entry.entry_type.value, entry.amount, entry.currency,
entry.description, entry.created_at,
)
await self.db.execute(
"""INSERT INTO transactions
(transaction_id, reference, description, metadata, created_at)
VALUES ($1, $2, $3, $4, $5)""",
txn.transaction_id, txn.reference, txn.description,
json.dumps(txn.metadata), txn.created_at,
)
return txn
async def get_account_balance(
self, account_id: str, currency: str,
as_of: Optional[datetime] = None,
) -> Decimal:
"""Compute account balance from ledger entries.
Balance = sum(debits) - sum(credits) for debit-normal accounts
Balance = sum(credits) - sum(debits) for credit-normal accounts
This is computed from entries, not stored. Stored balances are
caches that must be reconciled against computed balances.
"""
date_filter = ""
params = [account_id, currency]
if as_of:
date_filter = "AND created_at <= $3"
params.append(as_of)
row = await self.db.fetchrow(
f"""SELECT
COALESCE(SUM(CASE WHEN entry_type = 'debit'
THEN amount ELSE 0 END), 0) as total_debits,
COALESCE(SUM(CASE WHEN entry_type = 'credit'
THEN amount ELSE 0 END), 0) as total_credits
FROM ledger_entries
WHERE account_id = $1 AND currency = $2 {date_filter}""",
*params,
)
total_debits = Decimal(str(row["total_debits"]))
total_credits = Decimal(str(row["total_credits"]))
# Debit-normal net balance; callers negate for credit-normal accounts
return total_debits - total_credits
async def verify_trial_balance(
self, currency: str,
as_of: Optional[datetime] = None,
) -> Dict[str, Decimal]:
"""Verify that the ledger balances: total debits == total credits.
If this check fails, there is a bug in the system.
Run this as a scheduled job and alert immediately on failure.
"""
date_filter = ""
params = [currency]
if as_of:
date_filter = "AND created_at <= $2"
params.append(as_of)
row = await self.db.fetchrow(
f"""SELECT
COALESCE(SUM(CASE WHEN entry_type = 'debit'
THEN amount ELSE 0 END), 0) as total_debits,
COALESCE(SUM(CASE WHEN entry_type = 'credit'
THEN amount ELSE 0 END), 0) as total_credits
FROM ledger_entries
WHERE currency = $1 {date_filter}""",
*params,
)
total_debits = Decimal(str(row["total_debits"]))
total_credits = Decimal(str(row["total_credits"]))
difference = total_debits - total_credits
if difference != Decimal("0"):
# THIS IS A CRITICAL ALERT - the ledger is broken
raise LedgerImbalanceError(
f"Trial balance failed for {currency}: "
f"debits={total_debits}, credits={total_credits}, "
f"difference={difference}"
)
return {
"total_debits": total_debits,
"total_credits": total_credits,
"balanced": True,
}
What AI tools get wrong: Claude Code is the strongest tool for ledger code because it understands the invariant that must never break: debits equal credits. When asked to “build a ledger system,” Claude Code generates balanced transaction validation about 70% of the time without explicit prompting. Cursor generates well-structured code but tends to store balances as mutable columns rather than computing them from entries — which creates reconciliation nightmares when entries and stored balances diverge. Copilot generates simple credit/debit recording but misses the atomic transaction requirement: it writes individual entries in separate database calls, creating windows where the ledger is unbalanced. Windsurf and Tabnine both generate single-entry accounting (just recording amounts with positive/negative signs), which is fundamentally wrong for financial systems. None of the tools generate the trial balance verification without prompting, even though it is the most important correctness check a ledger can perform.
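The trial-balance invariant is small enough to demonstrate standalone. This sketch is independent of the ledger class above (in-memory tuples instead of database rows) and shows how the check catches exactly the one-sided write that non-atomic entry inserts can create:

```python
from decimal import Decimal

# Each entry: (entry_type, amount). A balanced transaction posts equal
# debit and credit totals, so the trial balance must net to exactly zero.
entries = [
    ("debit", Decimal("100.00")),   # cash increases
    ("credit", Decimal("100.00")),  # revenue increases
]

def trial_balance(entries):
    """Return total debits minus total credits; zero means balanced."""
    debits = sum((a for t, a in entries if t == "debit"), Decimal("0"))
    credits = sum((a for t, a in entries if t == "credit"), Decimal("0"))
    return debits - credits

assert trial_balance(entries) == Decimal("0")

# A one-sided write (an entry committed without its counter-entry)
# is detected immediately by the same check:
entries.append(("debit", Decimal("25.00")))
assert trial_balance(entries) == Decimal("25.00")  # nonzero = broken ledger
```

Run on a schedule against the real `ledger_entries` table, this is the check that turns a silent bookkeeping bug into an immediate alert.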
3. Trading Systems & Order Management
Trading systems are where FinTech meets hard real-time requirements. An order matching engine must process orders in deterministic sequence, match buyers with sellers at the correct price with price-time priority, handle partial fills, and produce an audit trail that regulators can reconstruct to the microsecond. The difference between a correct matching engine and a subtly wrong one is the difference between a functioning exchange and a lawsuit.
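The time-priority half of the rule can be isolated in a few lines before the full engine below. This is a toy sketch, not the production structure: one price level held as a FIFO queue, with partial fills that preserve the front order's priority.

```python
from collections import deque

# Toy price level: resting orders as (order_id, qty), FIFO by arrival.
level = deque([("A", 30), ("B", 50)])  # A arrived before B

def fill_at_level(level, incoming_qty):
    """Fill an incoming order against resting orders in time priority."""
    fills = []
    while incoming_qty > 0 and level:
        oid, qty = level[0]
        traded = min(incoming_qty, qty)
        fills.append((oid, traded))
        incoming_qty -= traded
        if traded == qty:
            level.popleft()                  # resting order fully filled
        else:
            level[0] = (oid, qty - traded)   # partial fill keeps priority
    return fills

print(fill_at_level(level, 60))  # → [('A', 30), ('B', 30)]
```

An incoming 60-share order fills A completely before touching B, and B's 20-share remainder stays at the front of the queue, exactly the FIFO behavior the matching engine must preserve at every price level.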
Order Book with Price-Time Priority Matching
This order book uses fixed-point integer arithmetic for prices (no floating point), maintains price-time priority, handles partial fills, and produces a deterministic sequence of events:
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Dict, Deque
from collections import deque
from decimal import Decimal
from datetime import datetime, timezone
import heapq
class OrderSide(Enum):
BUY = "buy"
SELL = "sell"
class OrderType(Enum):
LIMIT = "limit"
MARKET = "market"
class OrderStatus(Enum):
NEW = "new"
PARTIALLY_FILLED = "partially_filled"
FILLED = "filled"
CANCELLED = "cancelled"
@dataclass
class Order:
"""An order in the book.
Prices stored as integer ticks (e.g., price in cents) to avoid
floating point. A tick size of 1 = $0.01 for USD equity markets.
Quantities stored as integers (shares, lots, contracts).
"""
order_id: str
side: OrderSide
order_type: OrderType
price_ticks: int # Price in ticks (0 for market orders)
quantity: int # Original quantity
remaining_quantity: int # Unfilled quantity
timestamp_ns: int # Nanosecond timestamp for ordering
sequence_number: int # Global sequence for deterministic replay
account_id: str
status: OrderStatus = OrderStatus.NEW
def is_active(self) -> bool:
return self.status in (OrderStatus.NEW, OrderStatus.PARTIALLY_FILLED)
@dataclass
class Fill:
"""A trade execution (fill) between two orders."""
fill_id: str
buy_order_id: str
sell_order_id: str
price_ticks: int
quantity: int
buyer_account: str
seller_account: str
sequence_number: int
timestamp_ns: int
class PriceLevel:
"""Orders at a single price level, maintained in time priority (FIFO)."""
def __init__(self, price_ticks: int):
self.price_ticks = price_ticks
self.orders: Deque[Order] = deque()
self.total_quantity: int = 0
def add_order(self, order: Order) -> None:
self.orders.append(order)
self.total_quantity += order.remaining_quantity
def remove_order(self, order_id: str) -> Optional[Order]:
for i, order in enumerate(self.orders):
if order.order_id == order_id:
self.total_quantity -= order.remaining_quantity
del self.orders[i]
return order
return None
def is_empty(self) -> bool:
return len(self.orders) == 0
class OrderBook:
"""Order matching engine with price-time priority.
Matching rules:
- Buy orders match against the lowest available sell price
- Sell orders match against the highest available buy price
- At the same price level, earlier orders fill first (FIFO)
- Partial fills are supported: a 100-share order can match
against multiple smaller orders
- Market orders match at the best available price
- Self-trade prevention: orders from same account do not match
All prices are in integer ticks. All quantities are integers.
No floating point anywhere in the matching engine.
"""
def __init__(self, symbol: str, tick_size_cents: int = 1):
self.symbol = symbol
self.tick_size_cents = tick_size_cents
# Buy side: max-heap (highest price has priority)
# Sell side: min-heap (lowest price has priority)
self._buy_levels: Dict[int, PriceLevel] = {}
self._sell_levels: Dict[int, PriceLevel] = {}
self._buy_prices: List[int] = [] # Max-heap (negated for heapq)
self._sell_prices: List[int] = [] # Min-heap
self._orders: Dict[str, Order] = {}
self._sequence: int = 0
self._fills: List[Fill] = []
def _next_sequence(self) -> int:
self._sequence += 1
return self._sequence
def submit_order(self, order: Order) -> List[Fill]:
"""Submit an order and return any resulting fills.
Market orders match immediately at best available price.
Limit orders match if a counter-party exists at an acceptable
price, otherwise rest on the book.
"""
order.sequence_number = self._next_sequence()
fills = []
if order.order_type == OrderType.MARKET:
fills = self._match_market_order(order)
elif order.order_type == OrderType.LIMIT:
fills = self._match_limit_order(order)
# If order has remaining quantity, add to book
if order.remaining_quantity > 0 and order.order_type == OrderType.LIMIT:
self._add_to_book(order)
self._orders[order.order_id] = order
# Market orders never rest on the book: any unfilled remainder is cancelled
if (order.order_type == OrderType.MARKET
and order.remaining_quantity > 0):
order.status = OrderStatus.CANCELLED
return fills
def cancel_order(self, order_id: str) -> Optional[Order]:
"""Cancel an active order and remove from book."""
order = self._orders.get(order_id)
if order is None or not order.is_active():
return None
if order.side == OrderSide.BUY:
level = self._buy_levels.get(order.price_ticks)
else:
level = self._sell_levels.get(order.price_ticks)
if level:
level.remove_order(order_id)
if level.is_empty():
if order.side == OrderSide.BUY:
del self._buy_levels[order.price_ticks]
else:
del self._sell_levels[order.price_ticks]
order.status = OrderStatus.CANCELLED
del self._orders[order_id]
return order
def _match_limit_order(self, order: Order) -> List[Fill]:
"""Match a limit order against the opposite side of the book."""
fills = []
if order.side == OrderSide.BUY:
# Buy order matches against sell side at or below limit price
while (order.remaining_quantity > 0
and self._sell_prices
and self._sell_prices[0] <= order.price_ticks):
best_price = self._sell_prices[0]
level = self._sell_levels.get(best_price)
if level is None or level.is_empty():
heapq.heappop(self._sell_prices)
continue
new_fills = self._match_at_level(order, level)
fills.extend(new_fills)
if level.is_empty():
del self._sell_levels[best_price]
heapq.heappop(self._sell_prices)
elif order.side == OrderSide.SELL:
# Sell order matches against buy side at or above limit price
while (order.remaining_quantity > 0
and self._buy_prices
and -self._buy_prices[0] >= order.price_ticks):
best_price = -self._buy_prices[0]
level = self._buy_levels.get(best_price)
if level is None or level.is_empty():
heapq.heappop(self._buy_prices)
continue
new_fills = self._match_at_level(order, level)
fills.extend(new_fills)
if level.is_empty():
del self._buy_levels[best_price]
heapq.heappop(self._buy_prices)
return fills
def _match_market_order(self, order: Order) -> List[Fill]:
"""Match a market order at best available prices."""
fills = []
if order.side == OrderSide.BUY:
while order.remaining_quantity > 0 and self._sell_prices:
best_price = self._sell_prices[0]
level = self._sell_levels.get(best_price)
if level is None or level.is_empty():
heapq.heappop(self._sell_prices)
continue
new_fills = self._match_at_level(order, level)
fills.extend(new_fills)
if level.is_empty():
del self._sell_levels[best_price]
heapq.heappop(self._sell_prices)
else:
while order.remaining_quantity > 0 and self._buy_prices:
best_price = -self._buy_prices[0]
level = self._buy_levels.get(best_price)
if level is None or level.is_empty():
heapq.heappop(self._buy_prices)
continue
new_fills = self._match_at_level(order, level)
fills.extend(new_fills)
if level.is_empty():
del self._buy_levels[best_price]
heapq.heappop(self._buy_prices)
return fills
def _match_at_level(
self, incoming: Order, level: PriceLevel
) -> List[Fill]:
"""Match incoming order against resting orders at a price level."""
fills = []
while incoming.remaining_quantity > 0 and not level.is_empty():
resting = level.orders[0]
# Self-trade prevention
if resting.account_id == incoming.account_id:
level.orders.popleft()
level.total_quantity -= resting.remaining_quantity
resting.status = OrderStatus.CANCELLED
if resting.order_id in self._orders:
del self._orders[resting.order_id]
continue
fill_qty = min(incoming.remaining_quantity,
resting.remaining_quantity)
fill_price = resting.price_ticks # Resting order price
# Determine buyer/seller
if incoming.side == OrderSide.BUY:
buyer, seller = incoming, resting
else:
buyer, seller = resting, incoming
fill = Fill(
fill_id=f"fill-{self._next_sequence()}",
buy_order_id=buyer.order_id,
sell_order_id=seller.order_id,
price_ticks=fill_price,
quantity=fill_qty,
buyer_account=buyer.account_id,
seller_account=seller.account_id,
sequence_number=self._sequence,
timestamp_ns=_monotonic_ns(),
)
fills.append(fill)
self._fills.append(fill)
# Update quantities
incoming.remaining_quantity -= fill_qty
resting.remaining_quantity -= fill_qty
level.total_quantity -= fill_qty
# Update statuses
if incoming.remaining_quantity == 0:
incoming.status = OrderStatus.FILLED
else:
incoming.status = OrderStatus.PARTIALLY_FILLED
if resting.remaining_quantity == 0:
resting.status = OrderStatus.FILLED
level.orders.popleft()
if resting.order_id in self._orders:
del self._orders[resting.order_id]
else:
resting.status = OrderStatus.PARTIALLY_FILLED
return fills
def _add_to_book(self, order: Order) -> None:
"""Add an order to the appropriate side of the book."""
if order.side == OrderSide.BUY:
if order.price_ticks not in self._buy_levels:
self._buy_levels[order.price_ticks] = PriceLevel(
order.price_ticks
)
heapq.heappush(self._buy_prices, -order.price_ticks)
self._buy_levels[order.price_ticks].add_order(order)
else:
if order.price_ticks not in self._sell_levels:
self._sell_levels[order.price_ticks] = PriceLevel(
order.price_ticks
)
heapq.heappush(self._sell_prices, order.price_ticks)
self._sell_levels[order.price_ticks].add_order(order)
def best_bid(self) -> Optional[int]:
"""Highest buy price with active orders."""
while self._buy_prices:
price = -self._buy_prices[0]
level = self._buy_levels.get(price)
if level and not level.is_empty():
return price
heapq.heappop(self._buy_prices)
return None
def best_ask(self) -> Optional[int]:
"""Lowest sell price with active orders."""
while self._sell_prices:
price = self._sell_prices[0]
level = self._sell_levels.get(price)
if level and not level.is_empty():
return price
heapq.heappop(self._sell_prices)
return None
def spread(self) -> Optional[int]:
"""Current spread in ticks."""
bid, ask = self.best_bid(), self.best_ask()
if bid is not None and ask is not None:
return ask - bid
return None
def _monotonic_ns() -> int:
"""High-resolution timestamp for ordering. Not wall-clock."""
import time
return time.monotonic_ns()
What AI tools get wrong: Trading system code is where AI tools show the widest gap between “looks right” and “is right.” Claude Code generates the correct price-time priority logic and understands the significance of sequence numbers for deterministic replay. Cursor produces clean order book structures but frequently uses float for prices — which is catastrophic in a matching engine where 0.1 + 0.2 != 0.3 determines whether orders match. Copilot generates basic order matching but misses self-trade prevention, which regulators require. None of the tools generate the heap-based price level management on the first attempt; they default to sorted lists (O(n) insertion) rather than heaps (O(log n) insertion). The partial fill handling is particularly tricky: AI tools either fill all-or-nothing (wrong for most markets) or implement partial fills but forget to update the resting order status. Only Claude Code consistently handles the case where a market order exhausts one price level and must continue matching at the next available price.
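The float hazard called out above is easy to reproduce in isolation. This standalone sketch shows a buy limit at 0.30 failing to cross an ask whose price was accumulated in floats, and the same comparison succeeding with integer ticks:

```python
# Float prices: an ask built from two float increments does not equal 0.30,
# because 0.1 + 0.2 != 0.3 in binary floating point.
ask_float = 0.1 + 0.2
assert ask_float != 0.3           # actually 0.30000000000000004
assert not (ask_float <= 0.3)     # a buy limit at 0.30 silently fails to match

# Integer ticks (1 tick = $0.01): the same comparison is exact.
ask_ticks = 10 + 20
assert ask_ticks == 30
assert ask_ticks <= 30            # the orders match deterministically
```

This is why the matching engine above keeps every price as an integer tick count: the match condition is an exact integer comparison, never a float tolerance.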
4. Risk Management & Position Limits
Risk management is the system that prevents a single bad trade or a rogue algorithm from bankrupting the firm. Position limits must be checked in real-time before every order submission, margin requirements must be computed against live market data, and circuit breakers must activate automatically when thresholds are breached. These checks run in the critical path — every millisecond of latency they add delays order submission. Getting the limits wrong in either direction is costly: too loose and you risk catastrophic loss, too tight and you prevent legitimate trading.
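The hot-path shape the paragraph describes can be sketched minimally: an in-memory dictionary lookup with no database call, and a fail-closed default for accounts with no configured limits. Names here are illustrative, not the implementation below:

```python
from enum import Enum

class RiskAction(Enum):
    ALLOW = "allow"
    REJECT = "reject"

# Toy single-limit model: account_id -> max order quantity, held in memory
# so the check is a dict lookup, not a database round trip.
limits = {"acct-1": 500}

def check_order(account_id, quantity):
    limit = limits.get(account_id)
    if limit is None:
        return RiskAction.REJECT  # fail closed: no limits means no trading
    return RiskAction.ALLOW if quantity <= limit else RiskAction.REJECT

assert check_order("acct-1", 100) is RiskAction.ALLOW
assert check_order("acct-2", 1) is RiskAction.REJECT  # unconfigured account
```

The fail-closed branch is the one AI tools most often invert: an unconfigured account must be rejected, never waved through.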
Real-Time Position and Risk Monitor
from decimal import Decimal, ROUND_HALF_EVEN
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, List, Set
from datetime import datetime, timezone
import threading
class RiskAction(Enum):
ALLOW = "allow"
REJECT = "reject"
REDUCE_ONLY = "reduce_only" # Only allow position-reducing orders
HALT = "halt" # Kill switch: reject everything
@dataclass
class PositionLimit:
"""Risk limits for a single instrument or account."""
max_position_quantity: int # Max shares/contracts (absolute)
max_notional_value: Decimal # Max notional exposure (tick units, matching notional_value)
max_loss_amount: Decimal # Max unrealized loss before halt
max_order_quantity: int # Max single order size
daily_loss_limit: Decimal # Max realized + unrealized daily loss
max_concentration_pct: Decimal # Max portfolio fraction in one instrument (0.25 = 25%)
@dataclass
class Position:
"""Current position in a single instrument."""
instrument: str
quantity: int = 0 # Positive = long, negative = short
average_cost_ticks: int = 0
realized_pnl: Decimal = Decimal("0")
unrealized_pnl: Decimal = Decimal("0")
last_price_ticks: int = 0
@property
def notional_value(self) -> Decimal:
"""Current market value of position."""
return Decimal(abs(self.quantity)) * Decimal(self.last_price_ticks)
def update_market_price(self, price_ticks: int) -> None:
"""Recalculate unrealized P&L on price change."""
self.last_price_ticks = price_ticks
if self.quantity != 0:
price_diff = Decimal(price_ticks - self.average_cost_ticks)
self.unrealized_pnl = price_diff * Decimal(self.quantity)
class RiskManager:
"""Real-time pre-trade risk checks.
Runs in the order submission hot path. Must be fast:
all checks use in-memory state, no database queries.
State is updated asynchronously from fill events and market data.
Thread-safe: multiple order submission threads can check
concurrently. Position updates acquire a lock.
"""
def __init__(self):
self._positions: Dict[str, Dict[str, Position]] = {} # account -> instrument -> position
self._limits: Dict[str, PositionLimit] = {} # account -> limits
self._daily_pnl: Dict[str, Decimal] = {} # account -> daily realized P&L
self._halted_accounts: Set[str] = set()
self._lock = threading.Lock()
def check_order(
self,
account_id: str,
instrument: str,
side: str, # "buy" or "sell"
quantity: int,
price_ticks: int,
) -> RiskAction:
"""Pre-trade risk check. Must complete in <1ms.
Returns ALLOW, REJECT, REDUCE_ONLY, or HALT.
Every rejection is logged with reason for regulatory audit.
"""
# Check 1: Account halt (kill switch)
if account_id in self._halted_accounts:
return RiskAction.HALT
limits = self._limits.get(account_id)
if limits is None:
# No limits configured = reject (fail closed, not open)
return RiskAction.REJECT
# Check 2: Single order size
if quantity > limits.max_order_quantity:
return RiskAction.REJECT
position = self._get_position(account_id, instrument)
# Check 3: Would this order breach position limit?
if side == "buy":
new_quantity = position.quantity + quantity
else:
new_quantity = position.quantity - quantity
if abs(new_quantity) > limits.max_position_quantity:
# Allow if reducing position, reject if increasing
if abs(new_quantity) < abs(position.quantity):
return RiskAction.ALLOW
return RiskAction.REJECT
# Check 4: Notional value limit
new_notional = Decimal(abs(new_quantity)) * Decimal(price_ticks)
if new_notional > limits.max_notional_value:
if abs(new_quantity) < abs(position.quantity):
return RiskAction.ALLOW
return RiskAction.REJECT
# Check 5: Daily loss limit
daily_pnl = self._daily_pnl.get(account_id, Decimal("0"))
total_unrealized = self._total_unrealized_pnl(account_id)
total_loss = daily_pnl + total_unrealized
if total_loss < -limits.daily_loss_limit:
self._halted_accounts.add(account_id)
return RiskAction.HALT
# Check 6: Concentration limit
portfolio_notional = self._total_notional(account_id)
if portfolio_notional > 0:
concentration = new_notional / portfolio_notional
if concentration > limits.max_concentration_pct:
return RiskAction.REJECT
return RiskAction.ALLOW
def on_fill(
self,
account_id: str,
instrument: str,
side: str,
quantity: int,
price_ticks: int,
) -> None:
"""Update position on fill event. Thread-safe."""
with self._lock:
position = self._get_position(account_id, instrument)
if side == "buy":
if position.quantity >= 0:
# Adding to long: update average cost
total_cost = (
position.average_cost_ticks * position.quantity
+ price_ticks * quantity
)
position.quantity += quantity
if position.quantity > 0:
position.average_cost_ticks = total_cost // position.quantity
else:
# Covering short: realize P&L
cover_qty = min(quantity, abs(position.quantity))
pnl = Decimal(
(position.average_cost_ticks - price_ticks) * cover_qty
)
position.realized_pnl += pnl
self._daily_pnl[account_id] = (
self._daily_pnl.get(account_id, Decimal("0")) + pnl
)
position.quantity += quantity
if position.quantity > 0:
position.average_cost_ticks = price_ticks
else:
if position.quantity <= 0:
# Adding to short: update average cost
total_cost = (
position.average_cost_ticks * abs(position.quantity)
+ price_ticks * quantity
)
position.quantity -= quantity
if position.quantity < 0:
position.average_cost_ticks = total_cost // abs(position.quantity)
else:
# Selling long: realize P&L
sell_qty = min(quantity, position.quantity)
pnl = Decimal(
(price_ticks - position.average_cost_ticks) * sell_qty
)
position.realized_pnl += pnl
self._daily_pnl[account_id] = (
self._daily_pnl.get(account_id, Decimal("0")) + pnl
)
position.quantity -= quantity
if position.quantity < 0:
position.average_cost_ticks = price_ticks
position.update_market_price(price_ticks)
def on_market_data(
self, instrument: str, price_ticks: int
) -> None:
"""Update all positions in this instrument with new market price."""
with self._lock:
for account_positions in self._positions.values():
pos = account_positions.get(instrument)
if pos and pos.quantity != 0:
pos.update_market_price(price_ticks)
def halt_account(self, account_id: str, reason: str) -> None:
"""Emergency halt: reject all new orders for this account."""
self._halted_accounts.add(account_id)
def _get_position(
self, account_id: str, instrument: str
) -> Position:
if account_id not in self._positions:
self._positions[account_id] = {}
if instrument not in self._positions[account_id]:
self._positions[account_id][instrument] = Position(
instrument=instrument
)
return self._positions[account_id][instrument]
def _total_unrealized_pnl(self, account_id: str) -> Decimal:
positions = self._positions.get(account_id, {})
return sum(
(p.unrealized_pnl for p in positions.values()),
Decimal("0"),
)
def _total_notional(self, account_id: str) -> Decimal:
positions = self._positions.get(account_id, {})
return sum(
(p.notional_value for p in positions.values()),
Decimal("0"),
)
What AI tools get wrong: Risk management code requires understanding of both the financial domain and real-time systems constraints. Claude Code handles the conceptual side well — it understands why daily loss limits should trigger a halt rather than a simple rejection, and why position-reducing orders should be allowed even when limits are breached. Cursor generates clean risk check structures but defaults to database queries instead of in-memory state, adding 5–50ms of latency to every order check — unacceptable for trading systems. Copilot generates individual risk checks correctly but misses the “fail closed” principle: if limits are not configured for an account, it defaults to allowing the order rather than rejecting it, which is the opposite of safe behavior. None of the tools generate thread-safe position updates without explicit prompting, even though concurrent access from multiple fill processing threads is the default deployment model. The P&L calculation on position changes (especially when flipping from long to short through zero) is consistently wrong across all tools — this is a case where the AI generates plausible code that produces incorrect numbers.
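The flip-through-zero case has a precise decomposition that makes it testable: a fill that crosses zero is two trades, a close of the existing position at its average cost plus an open of the remainder at the fill price. This toy function (not the RiskManager above) captures that rule with worked numbers:

```python
from decimal import Decimal

def apply_fill(qty, avg_cost, fill_qty, fill_price):
    """Apply a signed fill (positive = buy) to a signed position.
    Returns (new_qty, new_avg_cost, realized_pnl). Prices in ticks."""
    realized = Decimal("0")
    if qty * fill_qty >= 0:  # same direction: extend position, re-average
        new_qty = qty + fill_qty
        if new_qty:
            avg_cost = (qty * avg_cost + fill_qty * fill_price) // new_qty
        return new_qty, avg_cost, realized
    close_qty = min(abs(fill_qty), abs(qty))  # portion that closes
    # A long closed by a sell earns (price - cost); a short closed by a
    # buy earns (cost - price). The sign of qty selects the formula.
    direction = 1 if qty > 0 else -1
    realized = Decimal(direction * (fill_price - avg_cost) * close_qty)
    new_qty = qty + fill_qty
    if new_qty * qty < 0:     # flipped through zero: re-open at fill price
        avg_cost = fill_price
    elif new_qty == 0:
        avg_cost = 0
    return new_qty, avg_cost, realized

# Long 100 @ 5000 ticks, then sell 150 @ 5200: realize (5200-5000)*100
# on the closing leg, and open a 50-share short at 5200.
q, c, pnl = apply_fill(100, 5000, -150, 5200)
assert (q, c, pnl) == (-50, 5200, Decimal("20000"))
```

The common AI failure mode is applying one formula to the whole 150 shares; splitting the fill at zero is what keeps realized P&L and the new average cost both correct.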
5. Compliance Automation & Regulatory Reporting
Compliance is where FinTech engineering becomes legal engineering. Anti-Money Laundering (AML) regulations require you to monitor every transaction for suspicious patterns and file Suspicious Activity Reports (SARs) within specific timeframes. Know Your Customer (KYC) requirements mean verifying identity before allowing financial transactions. Currency Transaction Reports (CTRs) must be filed for transactions above certain thresholds. And all of these rules must be configurable without code deployment, because regulations change faster than your release cycle.
AML Transaction Monitor with Configurable Rules
from decimal import Decimal
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Dict, Optional, Callable
from datetime import datetime, timezone, timedelta
import json
class AlertSeverity(Enum):
LOW = "low"
MEDIUM = "medium"
HIGH = "high"
CRITICAL = "critical"
class AlertType(Enum):
STRUCTURING = "structuring" # Splitting to avoid thresholds
VELOCITY = "velocity" # Unusual transaction frequency
LARGE_TRANSACTION = "large_transaction" # Above reporting threshold
UNUSUAL_PATTERN = "unusual_pattern" # Deviation from baseline
SANCTIONED_ENTITY = "sanctioned_entity" # OFAC/sanctions match
CROSS_BORDER = "cross_border" # International transfers
ROUND_AMOUNTS = "round_amounts" # Suspicious round numbers
@dataclass
class TransactionEvent:
"""A financial transaction for AML screening."""
transaction_id: str
account_id: str
counterparty_id: Optional[str]
amount: Decimal
currency: str
direction: str # "inbound" or "outbound"
channel: str # "wire", "ach", "card", "crypto"
country_code: str # ISO 3166-1 alpha-2
timestamp: datetime
description: str
metadata: Dict[str, str] = field(default_factory=dict)
@dataclass
class ComplianceAlert:
"""An alert generated by AML monitoring."""
alert_id: str
alert_type: AlertType
severity: AlertSeverity
account_id: str
transaction_ids: List[str]
description: str
rule_id: str
score: Decimal # 0-100 risk score
created_at: datetime
requires_sar: bool = False # Whether a SAR filing is recommended
auto_escalated: bool = False
@dataclass
class AMLRule:
"""A configurable AML rule loaded from database, not hardcoded.
Rules must be configurable without deployment because:
1. Regulators update thresholds
2. New typologies emerge weekly
3. Risk appetite changes with business decisions
4. Different jurisdictions have different thresholds
"""
rule_id: str
name: str
description: str
enabled: bool
parameters: Dict[str, str] # Configurable thresholds
severity: AlertSeverity
alert_type: AlertType
version: int
updated_at: datetime
class AMLMonitor:
"""Anti-Money Laundering transaction monitor.
Key design decisions:
1. Rules are loaded from database, not hardcoded
2. Transaction history is windowed (rolling 24h, 7d, 30d)
3. Scoring is additive: multiple weak signals create strong alerts
4. All decisions are auditable with full reasoning chain
5. False positive rate is tracked per rule for tuning
"""
# CTR threshold (US): $10,000 - but NEVER hardcode this
# Load from configuration, as it varies by jurisdiction
DEFAULT_CTR_THRESHOLD = Decimal("10000")
def __init__(self, rule_store, alert_store, transaction_history):
self.rule_store = rule_store
self.alert_store = alert_store
self.history = transaction_history
self._rules: List[AMLRule] = []
self._rule_evaluators: Dict[str, Callable] = {
"structuring": self._check_structuring,
"velocity": self._check_velocity,
"large_transaction": self._check_large_transaction,
"round_amounts": self._check_round_amounts,
"cross_border": self._check_cross_border,
}
async def reload_rules(self) -> None:
"""Reload rules from database. Called on schedule and on-demand.
This is how rules are updated without deployment:
compliance team updates rules in admin UI -> database ->
monitor reloads -> new rules take effect within minutes.
"""
self._rules = await self.rule_store.get_active_rules()
async def screen_transaction(
self, txn: TransactionEvent
) -> List[ComplianceAlert]:
"""Screen a single transaction against all active rules.
Returns list of alerts (may be empty if clean).
Every screening decision is logged for audit.
"""
alerts = []
total_score = Decimal("0")
for rule in self._rules:
if not rule.enabled:
continue
evaluator = self._rule_evaluators.get(rule.alert_type.value)
if evaluator is None:
continue
result = await evaluator(txn, rule)
if result is not None:
alerts.append(result)
total_score += result.score
# Composite scoring: multiple weak signals = strong alert
if total_score >= Decimal("80") and not any(
a.severity == AlertSeverity.CRITICAL for a in alerts
):
# Escalate to critical if composite score is high
for alert in alerts:
alert.auto_escalated = True
alert.severity = AlertSeverity.CRITICAL
# Store alerts
for alert in alerts:
await self.alert_store.save(alert)
return alerts
async def _check_structuring(
self, txn: TransactionEvent, rule: AMLRule
) -> Optional[ComplianceAlert]:
"""Detect structuring: splitting transactions to avoid thresholds.
Classic pattern: Instead of one $12,000 wire (which triggers CTR),
send three $3,900 wires in 24 hours (each below $10,000 threshold).
Parameters (from rule config):
- threshold: CTR reporting threshold (e.g., 10000)
- window_hours: Look-back window (e.g., 24)
- min_transactions: Minimum transactions to flag (e.g., 2)
- aggregate_pct: % of threshold that aggregate must reach (e.g., 80)
"""
threshold = Decimal(rule.parameters.get("threshold", "10000"))
window_hours = int(rule.parameters.get("window_hours", "24"))
min_txns = int(rule.parameters.get("min_transactions", "2"))
aggregate_pct = Decimal(rule.parameters.get("aggregate_pct", "80"))
# Skip if single transaction is above threshold (that is a CTR, not structuring)
if txn.amount >= threshold:
return None
# Get recent transactions for this account
window_start = txn.timestamp - timedelta(hours=window_hours)
recent = await self.history.get_transactions(
account_id=txn.account_id,
start=window_start,
end=txn.timestamp,
direction=txn.direction,
)
# Include current transaction
all_txns = recent + [txn]
total_amount = sum(t.amount for t in all_txns)
txn_count = len(all_txns)
# Check structuring pattern
trigger_amount = threshold * aggregate_pct / Decimal("100")
if txn_count >= min_txns and total_amount >= trigger_amount:
# All individual transactions below threshold but aggregate above
if all(t.amount < threshold for t in all_txns):
score = min(
Decimal("100"),
(total_amount / threshold) * Decimal("70")
)
return ComplianceAlert(
alert_id=_generate_id("alert"),
alert_type=AlertType.STRUCTURING,
severity=rule.severity,
account_id=txn.account_id,
transaction_ids=[t.transaction_id for t in all_txns],
description=(
f"Potential structuring: {txn_count} transactions "
f"totaling {total_amount} {txn.currency} in "
f"{window_hours}h (threshold: {threshold})"
),
rule_id=rule.rule_id,
score=score,
created_at=datetime.now(timezone.utc),
requires_sar=score >= Decimal("70"),
)
return None
async def _check_velocity(
self, txn: TransactionEvent, rule: AMLRule
) -> Optional[ComplianceAlert]:
"""Detect unusual transaction frequency."""
window_hours = int(rule.parameters.get("window_hours", "1"))
max_count = int(rule.parameters.get("max_count", "10"))
window_start = txn.timestamp - timedelta(hours=window_hours)
recent = await self.history.get_transactions(
account_id=txn.account_id,
start=window_start,
end=txn.timestamp,
)
if len(recent) >= max_count:
score = min(
Decimal("100"),
Decimal(len(recent)) / Decimal(max_count) * Decimal("60")
)
return ComplianceAlert(
alert_id=_generate_id("alert"),
alert_type=AlertType.VELOCITY,
severity=rule.severity,
account_id=txn.account_id,
transaction_ids=[t.transaction_id for t in recent] + [txn.transaction_id],
description=(
f"High velocity: {len(recent) + 1} transactions in "
f"{window_hours}h (limit: {max_count})"
),
rule_id=rule.rule_id,
score=score,
created_at=datetime.now(timezone.utc),
)
return None
async def _check_large_transaction(
self, txn: TransactionEvent, rule: AMLRule
) -> Optional[ComplianceAlert]:
"""Flag transactions above reporting threshold (CTR trigger)."""
threshold = Decimal(rule.parameters.get("threshold", "10000"))
if txn.amount >= threshold:
return ComplianceAlert(
alert_id=_generate_id("alert"),
alert_type=AlertType.LARGE_TRANSACTION,
severity=AlertSeverity.MEDIUM,
account_id=txn.account_id,
transaction_ids=[txn.transaction_id],
description=(
f"Large transaction: {txn.amount} {txn.currency} "
f"(threshold: {threshold}). CTR filing required."
),
rule_id=rule.rule_id,
score=Decimal("40"),
created_at=datetime.now(timezone.utc),
requires_sar=False, # CTR is not a SAR
)
return None
async def _check_round_amounts(
self, txn: TransactionEvent, rule: AMLRule
) -> Optional[ComplianceAlert]:
"""Flag suspiciously round transaction amounts.
Money launderers often use round numbers. While not conclusive
alone, round amounts combined with other signals increase risk.
"""
min_amount = Decimal(rule.parameters.get("min_amount", "1000"))
if txn.amount < min_amount:
return None
        # Check if the amount is suspiciously round. Inspecting the string's
        # fractional digits misses whole-number Decimals such as
        # Decimal("5000"), which have no "." in their str() form, so compare
        # numerically instead.
        if txn.amount == txn.amount.to_integral_value():
            # Whole-number amount above threshold
            return ComplianceAlert(
                alert_id=_generate_id("alert"),
                alert_type=AlertType.ROUND_AMOUNTS,
                severity=AlertSeverity.LOW,
                account_id=txn.account_id,
                transaction_ids=[txn.transaction_id],
                description=(
                    f"Round amount: {txn.amount} {txn.currency}"
                ),
                rule_id=rule.rule_id,
                score=Decimal("15"),
                created_at=datetime.now(timezone.utc),
            )
return None
async def _check_cross_border(
self, txn: TransactionEvent, rule: AMLRule
) -> Optional[ComplianceAlert]:
"""Flag cross-border transactions to high-risk jurisdictions."""
        # Strip whitespace and drop empty entries so an unset parameter
        # (empty string) cannot accidentally match an empty country code
        high_risk_countries = {
            c.strip()
            for c in rule.parameters.get("high_risk_countries", "").split(",")
            if c.strip()
        }
min_amount = Decimal(rule.parameters.get("min_amount", "3000"))
if (txn.country_code in high_risk_countries
and txn.amount >= min_amount):
return ComplianceAlert(
alert_id=_generate_id("alert"),
alert_type=AlertType.CROSS_BORDER,
severity=AlertSeverity.HIGH,
account_id=txn.account_id,
transaction_ids=[txn.transaction_id],
description=(
f"Cross-border to high-risk jurisdiction: "
f"{txn.amount} {txn.currency} to {txn.country_code}"
),
rule_id=rule.rule_id,
score=Decimal("55"),
created_at=datetime.now(timezone.utc),
requires_sar=True,
)
return None
def _generate_id(prefix: str) -> str:
import uuid
return f"{prefix}-{uuid.uuid4()}"
What AI tools get wrong: The critical issue with compliance code is that AI tools hardcode rules that must be configurable. When asked to “build an AML monitoring system,” Copilot and Windsurf both generate if amount > 10000 directly in the code — which means changing the threshold requires a code change, code review, testing, and deployment. In a regulatory environment where thresholds change by jurisdiction and are updated periodically, hardcoded rules are a compliance failure. Claude Code understands the need for configurable rules about 60% of the time and generates rule-loading patterns. Cursor generates well-structured code but misses the composite scoring concept — it treats each rule independently rather than understanding that multiple weak signals should escalate severity. The structuring detection (splitting transactions to avoid reporting thresholds) is particularly poorly handled: most tools check only the current transaction against the threshold rather than looking at aggregate patterns over a time window. Amazon Q generates reasonable structuring detection when prompted about FinServ compliance specifically, likely due to AWS FinServ documentation in its training data.
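The configurable-rule pattern looks like this in miniature. The JSON shape, the ThresholdRule fields, and the load_rules helper are all illustrative choices, not a prescribed schema; the point is that thresholds arrive as data (and as strings converted to Decimal, never floats), so compliance can change them without a deployment:

```python
import json
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict

@dataclass(frozen=True)
class ThresholdRule:
    """One monitoring rule, fully described by configuration."""
    rule_id: str
    threshold: Decimal
    window_hours: int

def load_rules(config_json: str) -> Dict[str, ThresholdRule]:
    """Parse rules from a config document (could equally be a DB table)."""
    raw = json.loads(config_json)
    return {
        r["rule_id"]: ThresholdRule(
            rule_id=r["rule_id"],
            threshold=Decimal(r["threshold"]),  # string -> Decimal, never float
            window_hours=int(r["window_hours"]),
        )
        for r in raw["rules"]
    }

rules = load_rules(
    '{"rules": [{"rule_id": "ctr-us", "threshold": "10000", "window_hours": 24}]}'
)
assert rules["ctr-us"].threshold == Decimal("10000")
```

Changing the CTR threshold for a jurisdiction is then a config update plus a cache refresh, not a code review and redeploy.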
6. Fraud Detection & Transaction Monitoring
Fraud detection is real-time adversarial machine learning with a 50-millisecond latency budget. Your scoring pipeline runs on every card authorization, every ACH transfer, every peer-to-peer payment. It must evaluate dozens of signals — transaction amount, merchant category, geographic distance from last transaction, device fingerprint, behavioral patterns, velocity across multiple time windows — and produce a risk score before the payment network’s authorization timeout. Miss fraud and you lose money. Block legitimate transactions and you lose customers. The economics are brutal: a false positive costs $20–$50 in investigation labor, a false negative costs the average transaction amount plus chargeback fees.
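Those economics imply a break-even fraud probability above which blocking is cheaper than allowing. A back-of-the-envelope sketch, with every dollar figure assumed purely for illustration (the investigation cost is the midpoint of the $20-$50 range quoted above):

```python
from decimal import Decimal

# Illustrative figures only, not calibrated values.
FALSE_POSITIVE_COST = Decimal("35")   # investigation labor on a wrongly blocked txn
AVG_TRANSACTION = Decimal("80")
CHARGEBACK_FEE = Decimal("25")
FALSE_NEGATIVE_COST = AVG_TRANSACTION + CHARGEBACK_FEE  # loss on missed fraud

def expected_cost_of_blocking(p_fraud: Decimal) -> Decimal:
    # Blocking only costs money when the transaction was legitimate.
    return (Decimal("1") - p_fraud) * FALSE_POSITIVE_COST

def expected_cost_of_allowing(p_fraud: Decimal) -> Decimal:
    # Allowing only costs money when the transaction was fraud.
    return p_fraud * FALSE_NEGATIVE_COST

# Block when blocking is cheaper in expectation. With these numbers the
# break-even fraud probability is 35 / (35 + 105) = 0.25.
break_even = FALSE_POSITIVE_COST / (FALSE_POSITIVE_COST + FALSE_NEGATIVE_COST)
print(break_even)  # 0.25
```

With a higher average ticket the break-even probability drops, which is why score thresholds must be tunable per portfolio rather than hardcoded.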
Real-Time Transaction Scoring Pipeline
from decimal import Decimal
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple
from datetime import datetime, timezone, timedelta
import time
import hashlib
class FraudDecision(Enum):
APPROVE = "approve"
DECLINE = "decline"
REVIEW = "review" # Manual review queue
CHALLENGE = "challenge" # Step-up authentication (3DS, OTP)
@dataclass
class DeviceFingerprint:
"""Device identification for fraud detection."""
fingerprint_hash: str
device_type: str # "mobile", "desktop", "tablet"
os: str
browser: str
screen_resolution: str
timezone_offset: int
language: str
ip_address: str
ip_country: str
is_vpn: bool
is_tor: bool
is_datacenter: bool
@dataclass
class TransactionContext:
"""Full context for fraud scoring."""
transaction_id: str
account_id: str
amount: Decimal
currency: str
merchant_id: str
merchant_category_code: str # MCC
merchant_country: str
channel: str # "online", "in_store", "atm"
card_present: bool
device: Optional[DeviceFingerprint]
timestamp: datetime
billing_zip: str
shipping_zip: Optional[str]
@dataclass
class ScoringResult:
"""Result of fraud scoring with full decision reasoning."""
decision: FraudDecision
risk_score: int # 0-1000 (higher = riskier)
signals: Dict[str, int] # signal_name -> contribution to score
latency_ms: float
model_version: str
timestamp: datetime
class VelocityCounter:
"""Sliding window counter for velocity checks.
Uses a bucketed approach for O(1) operations instead of
scanning all transactions. Each bucket = 1 minute.
"""
def __init__(self, window_minutes: int = 60):
self.window_minutes = window_minutes
self._buckets: Dict[str, Dict[int, int]] = {} # key -> {minute_bucket -> count}
self._amount_buckets: Dict[str, Dict[int, Decimal]] = {}
def _bucket_key(self, dt: datetime) -> int:
return int(dt.timestamp()) // 60
def increment(
self, key: str, amount: Decimal, at: datetime
) -> None:
bucket = self._bucket_key(at)
if key not in self._buckets:
self._buckets[key] = {}
self._amount_buckets[key] = {}
self._buckets[key][bucket] = self._buckets[key].get(bucket, 0) + 1
self._amount_buckets[key][bucket] = (
self._amount_buckets[key].get(bucket, Decimal("0")) + amount
)
# Prune old buckets
cutoff = bucket - self.window_minutes
self._buckets[key] = {
b: c for b, c in self._buckets[key].items() if b > cutoff
}
self._amount_buckets[key] = {
b: a for b, a in self._amount_buckets[key].items() if b > cutoff
}
def count(self, key: str, at: datetime) -> int:
cutoff = self._bucket_key(at) - self.window_minutes
return sum(
c for b, c in self._buckets.get(key, {}).items()
if b > cutoff
)
def total_amount(self, key: str, at: datetime) -> Decimal:
cutoff = self._bucket_key(at) - self.window_minutes
return sum(
(a for b, a in self._amount_buckets.get(key, {}).items()
if b > cutoff),
Decimal("0"),
)
class FraudScorer:
"""Real-time fraud scoring pipeline.
Latency budget: <50ms total from input to decision.
All lookups are in-memory. No database queries in hot path.
Model scores and rule scores are combined additively.
Scoring signals are weighted and summed:
- 0-200: Low risk (approve)
- 200-500: Medium risk (approve with monitoring)
- 500-700: High risk (challenge / step-up auth)
- 700-900: Very high risk (review queue)
- 900+: Extreme risk (auto-decline)
"""
VERSION = "fraud-scorer-v3.2"
def __init__(self, config: Dict):
self.config = config
self._velocity_1h = VelocityCounter(window_minutes=60)
self._velocity_24h = VelocityCounter(window_minutes=1440)
# Thresholds from config, not hardcoded
self._decline_threshold = config.get("decline_threshold", 900)
self._review_threshold = config.get("review_threshold", 700)
self._challenge_threshold = config.get("challenge_threshold", 500)
# Known device fingerprints per account
self._known_devices: Dict[str, set] = {}
# Account baselines for anomaly detection
self._account_avg_amount: Dict[str, Decimal] = {}
self._account_typical_mcc: Dict[str, set] = {}
def score(self, ctx: TransactionContext) -> ScoringResult:
"""Score a transaction. Must complete in <50ms."""
start_ns = time.monotonic_ns()
signals: Dict[str, int] = {}
total_score = 0
# Signal 1: Amount anomaly (vs account average)
amount_signal = self._score_amount_anomaly(ctx)
signals["amount_anomaly"] = amount_signal
total_score += amount_signal
# Signal 2: Velocity checks (1h and 24h windows)
velocity_signal = self._score_velocity(ctx)
signals["velocity"] = velocity_signal
total_score += velocity_signal
# Signal 3: Device risk
device_signal = self._score_device(ctx)
signals["device_risk"] = device_signal
total_score += device_signal
# Signal 4: Geographic anomaly
geo_signal = self._score_geographic(ctx)
signals["geographic"] = geo_signal
total_score += geo_signal
# Signal 5: Merchant category anomaly
mcc_signal = self._score_merchant_category(ctx)
signals["merchant_category"] = mcc_signal
total_score += mcc_signal
# Signal 6: Card-not-present premium
if not ctx.card_present and ctx.channel == "online":
cnp_signal = 50 # Base risk for CNP transactions
signals["card_not_present"] = cnp_signal
total_score += cnp_signal
# Signal 7: High-risk MCC codes
high_risk_mccs = {"7995", "5967", "5966", "7273"} # Gambling, etc.
if ctx.merchant_category_code in high_risk_mccs:
mcc_risk = 100
signals["high_risk_mcc"] = mcc_risk
total_score += mcc_risk
# Clamp to 0-1000
total_score = max(0, min(1000, total_score))
# Determine decision
if total_score >= self._decline_threshold:
decision = FraudDecision.DECLINE
elif total_score >= self._review_threshold:
decision = FraudDecision.REVIEW
elif total_score >= self._challenge_threshold:
decision = FraudDecision.CHALLENGE
else:
decision = FraudDecision.APPROVE
latency_ms = (time.monotonic_ns() - start_ns) / 1_000_000
# Update velocity counters (after scoring, not before)
self._velocity_1h.increment(
ctx.account_id, ctx.amount, ctx.timestamp
)
self._velocity_24h.increment(
ctx.account_id, ctx.amount, ctx.timestamp
)
return ScoringResult(
decision=decision,
risk_score=total_score,
signals=signals,
latency_ms=latency_ms,
model_version=self.VERSION,
timestamp=datetime.now(timezone.utc),
)
def _score_amount_anomaly(self, ctx: TransactionContext) -> int:
"""Score based on deviation from account's typical transaction amount."""
avg = self._account_avg_amount.get(ctx.account_id)
if avg is None or avg == 0:
return 0 # No baseline yet
ratio = ctx.amount / avg
if ratio > Decimal("10"):
return 200 # 10x normal amount
elif ratio > Decimal("5"):
return 100
elif ratio > Decimal("3"):
return 50
return 0
def _score_velocity(self, ctx: TransactionContext) -> int:
"""Score based on transaction frequency."""
count_1h = self._velocity_1h.count(ctx.account_id, ctx.timestamp)
count_24h = self._velocity_24h.count(ctx.account_id, ctx.timestamp)
amount_1h = self._velocity_1h.total_amount(
ctx.account_id, ctx.timestamp
)
score = 0
if count_1h >= 10:
score += 200 # 10+ transactions in 1 hour
elif count_1h >= 5:
score += 100
elif count_1h >= 3:
score += 30
if count_24h >= 30:
score += 150
# Amount velocity: large cumulative in short window
if amount_1h > Decimal("5000"):
score += 100
elif amount_1h > Decimal("2000"):
score += 50
return score
def _score_device(self, ctx: TransactionContext) -> int:
"""Score based on device trust signals."""
if ctx.device is None:
return 100 # No device info is suspicious
score = 0
# Known device check
known = self._known_devices.get(ctx.account_id, set())
if ctx.device.fingerprint_hash not in known:
score += 80 # New device
# Anonymization signals
if ctx.device.is_tor:
score += 200
if ctx.device.is_vpn:
score += 50
if ctx.device.is_datacenter:
score += 150 # Datacenter IP = likely bot/proxy
return score
def _score_geographic(self, ctx: TransactionContext) -> int:
"""Score based on geographic anomalies."""
if ctx.device is None:
return 0
score = 0
# IP country vs merchant country mismatch
if ctx.device.ip_country != ctx.merchant_country:
score += 40
# IP country vs billing address mismatch
# (simplified - production would use ZIP-to-country mapping)
if ctx.device.ip_country != "US" and ctx.billing_zip:
# Rough US ZIP check
if ctx.billing_zip[:5].isdigit():
score += 60 # US billing but non-US IP
return score
def _score_merchant_category(self, ctx: TransactionContext) -> int:
"""Score based on unusual merchant category for this account."""
typical = self._account_typical_mcc.get(ctx.account_id, set())
if not typical:
return 0 # No baseline
if ctx.merchant_category_code not in typical:
return 30 # Unusual category
return 0
What AI tools get wrong: Fraud detection pipelines require a rare combination of real-time systems knowledge and adversarial thinking. Claude Code understands the scoring model structure and produces clean signal combination logic, but it does not generate the velocity counter optimization (bucketed sliding windows) without explicit prompting — it defaults to scanning all transactions in the window, which is O(n) per check and far too slow. Cursor generates clean pipeline structures but misses the latency measurement and budget enforcement. Copilot produces individual checks (amount threshold, velocity) but treats them as binary flags rather than weighted scores, losing the nuance that drives real fraud detection performance. Windsurf generates the most naive fraud logic: simple threshold checks that any fraudster bypasses by staying under the limits. None of the tools generate the “update counters after scoring” ordering correctly — they update velocity counters before scoring the current transaction, which means the current transaction contributes to its own velocity check (off by one). The device fingerprinting integration is absent from all tools unless explicitly requested.
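The counter-ordering off-by-one is easy to demonstrate with a toy counter (the class and both function names here are invented for the illustration, deliberately simpler than the production VelocityCounter):

```python
class TinyVelocityCounter:
    """Minimal stand-in for a velocity counter: just a running count."""
    def __init__(self) -> None:
        self.count = 0

    def increment(self) -> None:
        self.count += 1

def score_wrong(counter: TinyVelocityCounter) -> int:
    # BUG: incrementing before scoring means the current transaction
    # counts toward its own velocity check.
    counter.increment()
    return counter.count

def score_right(counter: TinyVelocityCounter) -> int:
    prior = counter.count   # score against prior history only...
    counter.increment()     # ...then record the current transaction
    return prior

# An account's very first transaction should see zero prior velocity.
assert score_wrong(TinyVelocityCounter()) == 1  # inflated by the txn itself
assert score_right(TinyVelocityCounter()) == 0  # correct
```

The one-transaction inflation looks harmless until a velocity threshold sits exactly at the boundary, at which point every account trips the rule one transaction early.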
7. Banking API & Open Banking (PSD2)
Open Banking regulations (PSD2 in Europe, similar frameworks emerging globally) require banks to expose APIs for authorized third parties to access account information and initiate payments on behalf of customers. This is not just “build a REST API” — it requires Strong Customer Authentication (SCA) flows, consent management with granular scoping and time-limited validity, certificate-based mutual TLS for service-to-service auth, and compliance with detailed technical standards (Berlin Group, UK Open Banking, STET). Getting the consent flow wrong means either blocking legitimate access or allowing unauthorized access to customer financial data.
PSD2 Consent and Account Information Service
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Dict, Set
from datetime import datetime, timezone, timedelta
from decimal import Decimal
import uuid
import hashlib
class ConsentStatus(Enum):
RECEIVED = "received" # Consent request received
VALID = "valid" # Customer authorized
REJECTED = "rejected" # Customer rejected
EXPIRED = "expired" # Past valid_until date
REVOKED = "revoked" # Customer revoked
TERMINATED_BY_TPP = "terminated_by_tpp" # Third party terminated
class ConsentScope(Enum):
"""Granular access scopes per PSD2 requirements.
Consent must specify exactly what data is accessible.
Broad "access everything" consent is not PSD2-compliant.
"""
ACCOUNTS = "accounts" # Account list and details
BALANCES = "balances" # Current and available balances
TRANSACTIONS = "transactions" # Transaction history
FUNDS_CONFIRMATION = "funds_confirmation" # Check if funds available
class SCAMethod(Enum):
"""Strong Customer Authentication methods (2 of 3 required)."""
SMS_OTP = "sms_otp" # Something you have
PUSH_NOTIFICATION = "push" # Something you have
BIOMETRIC = "biometric" # Something you are
PIN = "pin" # Something you know
PASSWORD = "password" # Something you know
HARDWARE_TOKEN = "hardware_token" # Something you have
@dataclass
class ConsentRequest:
"""PSD2 consent request from a Third Party Provider (TPP)."""
consent_id: str
tpp_id: str # Registered TPP identifier
tpp_certificate_hash: str # eIDAS certificate fingerprint
requested_scopes: Set[ConsentScope]
account_ids: Optional[List[str]] # None = all accounts
valid_until: datetime # Max 90 days per PSD2
frequency_per_day: int # Max API calls per day
recurring: bool # One-time or recurring access
created_at: datetime
status: ConsentStatus = ConsentStatus.RECEIVED
customer_id: Optional[str] = None
sca_completed: bool = False
sca_methods_used: List[SCAMethod] = field(default_factory=list)
def __post_init__(self):
# PSD2 mandates max 90 days consent validity
max_validity = self.created_at + timedelta(days=90)
if self.valid_until > max_validity:
self.valid_until = max_validity
        # Frequency limit: the PSD2 RTS caps unattended (recurring) access
        # at 4 calls per day without a fresh SCA; one-off access happens
        # with the customer present, so the cap applies to recurring consents
        if self.recurring and self.frequency_per_day > 4:
            self.frequency_per_day = 4
def is_valid(self) -> bool:
"""Check if consent is currently valid."""
if self.status != ConsentStatus.VALID:
return False
if datetime.now(timezone.utc) > self.valid_until:
self.status = ConsentStatus.EXPIRED
return False
return True
def covers_scope(self, scope: ConsentScope) -> bool:
"""Check if consent covers the requested scope."""
return scope in self.requested_scopes
def covers_account(self, account_id: str) -> bool:
"""Check if consent covers the requested account."""
if self.account_ids is None:
return True # All accounts
return account_id in self.account_ids
class OpenBankingService:
"""PSD2-compliant Account Information Service Provider (AISP).
Implements:
1. Consent lifecycle management (create, authorize, revoke)
2. Strong Customer Authentication (SCA) flow
3. Account information retrieval with consent checking
4. Rate limiting per consent
5. Full audit trail for regulatory compliance
"""
def __init__(self, consent_store, account_store, audit_log, sca_service):
self.consents = consent_store
self.accounts = account_store
self.audit = audit_log
self.sca = sca_service
async def create_consent(
self,
tpp_id: str,
tpp_cert_hash: str,
scopes: Set[ConsentScope],
account_ids: Optional[List[str]],
valid_until: datetime,
frequency_per_day: int,
recurring: bool,
) -> ConsentRequest:
"""Create a new consent request. Does NOT authorize it.
The TPP creates the consent, then redirects the customer
to the bank's SCA flow for authorization.
"""
# Verify TPP is registered and certificate is valid
if not await self._verify_tpp(tpp_id, tpp_cert_hash):
raise UnauthorizedTPPError(
f"TPP {tpp_id} not registered or certificate invalid"
)
consent = ConsentRequest(
consent_id=str(uuid.uuid4()),
tpp_id=tpp_id,
tpp_certificate_hash=tpp_cert_hash,
requested_scopes=scopes,
account_ids=account_ids,
valid_until=valid_until,
frequency_per_day=frequency_per_day,
recurring=recurring,
created_at=datetime.now(timezone.utc),
)
await self.consents.save(consent)
self.audit.record(
event="consent.created",
consent_id=consent.consent_id,
tpp_id=tpp_id,
scopes=[s.value for s in scopes],
account_ids=account_ids,
)
return consent
async def authorize_consent(
self,
consent_id: str,
customer_id: str,
sca_methods: List[SCAMethod],
sca_evidence: Dict[str, str],
) -> ConsentRequest:
"""Customer authorizes a consent after completing SCA.
PSD2 requires Strong Customer Authentication:
- At least 2 of 3 categories: knowledge, possession, inherence
- Dynamic linking for payment initiation
"""
consent = await self.consents.get(consent_id)
if consent is None:
raise ConsentNotFoundError(consent_id)
if consent.status != ConsentStatus.RECEIVED:
raise InvalidConsentStateError(
f"Cannot authorize consent in state {consent.status.value}"
)
# Verify SCA: need at least 2 different factor categories
categories_used = set()
knowledge = {SCAMethod.PIN, SCAMethod.PASSWORD}
possession = {SCAMethod.SMS_OTP, SCAMethod.PUSH_NOTIFICATION,
SCAMethod.HARDWARE_TOKEN}
inherence = {SCAMethod.BIOMETRIC}
for method in sca_methods:
if method in knowledge:
categories_used.add("knowledge")
elif method in possession:
categories_used.add("possession")
elif method in inherence:
categories_used.add("inherence")
if len(categories_used) < 2:
raise InsufficientSCAError(
f"SCA requires 2+ categories, got: {categories_used}"
)
# Verify each SCA method
for method in sca_methods:
valid = await self.sca.verify(
customer_id=customer_id,
method=method,
evidence=sca_evidence.get(method.value, ""),
)
if not valid:
raise SCAVerificationError(f"SCA method {method.value} failed")
# Authorize consent
consent.status = ConsentStatus.VALID
consent.customer_id = customer_id
consent.sca_completed = True
consent.sca_methods_used = sca_methods
await self.consents.save(consent)
self.audit.record(
event="consent.authorized",
consent_id=consent_id,
customer_id=customer_id,
sca_methods=[m.value for m in sca_methods],
sca_categories=list(categories_used),
)
return consent
async def get_accounts(
self,
consent_id: str,
tpp_id: str,
) -> List[Dict]:
"""Get account list for authorized consent.
Checks: consent valid, TPP matches, scope includes ACCOUNTS,
rate limit not exceeded.
"""
consent = await self._validate_access(
consent_id, tpp_id, ConsentScope.ACCOUNTS
)
# Get only consented accounts
if consent.account_ids:
accounts = [
await self.accounts.get(aid)
for aid in consent.account_ids
]
else:
accounts = await self.accounts.get_by_customer(
consent.customer_id
)
self.audit.record(
event="account.list_accessed",
consent_id=consent_id,
tpp_id=tpp_id,
customer_id=consent.customer_id,
account_count=len(accounts),
)
# Return only PSD2-specified fields, not internal data
return [
{
"account_id": acc.account_id,
"iban": acc.iban,
"currency": acc.currency,
"account_type": acc.account_type,
"name": acc.name,
"status": acc.status,
}
for acc in accounts
if acc is not None
]
async def get_balances(
self,
consent_id: str,
tpp_id: str,
account_id: str,
) -> Dict:
"""Get account balances for authorized consent."""
consent = await self._validate_access(
consent_id, tpp_id, ConsentScope.BALANCES, account_id
)
balance = await self.accounts.get_balance(account_id)
self.audit.record(
event="balance.accessed",
consent_id=consent_id,
tpp_id=tpp_id,
account_id=account_id,
)
return {
"account_id": account_id,
"balances": [
{
"balance_type": "closingBooked",
"amount": str(balance.booked_amount),
"currency": balance.currency,
"reference_date": balance.reference_date.isoformat(),
},
{
"balance_type": "expected",
"amount": str(balance.available_amount),
"currency": balance.currency,
"reference_date": balance.reference_date.isoformat(),
},
],
}
async def get_transactions(
self,
consent_id: str,
tpp_id: str,
account_id: str,
date_from: Optional[datetime] = None,
date_to: Optional[datetime] = None,
) -> Dict:
"""Get transaction history for authorized consent.
PSD2 limits: max 90 days history, SCA refresh required
for access older than 90 days.
"""
consent = await self._validate_access(
consent_id, tpp_id, ConsentScope.TRANSACTIONS, account_id
)
# PSD2: max 90 days without additional SCA
max_lookback = datetime.now(timezone.utc) - timedelta(days=90)
        # Apply the cap when no start date is given, too; otherwise a
        # request with date_from=None silently bypasses the 90-day limit
        if date_from is None or date_from < max_lookback:
            date_from = max_lookback
transactions = await self.accounts.get_transactions(
account_id=account_id,
date_from=date_from,
date_to=date_to or datetime.now(timezone.utc),
)
self.audit.record(
event="transactions.accessed",
consent_id=consent_id,
tpp_id=tpp_id,
account_id=account_id,
transaction_count=len(transactions),
date_range=f"{date_from} to {date_to}",
)
return {
"account_id": account_id,
"transactions": {
"booked": [
{
"transaction_id": txn.transaction_id,
"amount": str(txn.amount),
"currency": txn.currency,
"booking_date": txn.booking_date.isoformat(),
"value_date": txn.value_date.isoformat(),
"remittance_info": txn.description,
"creditor_name": txn.counterparty_name,
}
for txn in transactions
if txn.is_booked
],
"pending": [
{
"transaction_id": txn.transaction_id,
"amount": str(txn.amount),
"currency": txn.currency,
"value_date": txn.value_date.isoformat(),
"remittance_info": txn.description,
}
for txn in transactions
if not txn.is_booked
],
},
}
async def revoke_consent(
self,
consent_id: str,
revoked_by: str, # "customer" or "tpp"
reason: str,
) -> None:
"""Revoke an active consent. Customer can revoke at any time."""
consent = await self.consents.get(consent_id)
if consent is None:
raise ConsentNotFoundError(consent_id)
if revoked_by == "tpp":
consent.status = ConsentStatus.TERMINATED_BY_TPP
else:
consent.status = ConsentStatus.REVOKED
await self.consents.save(consent)
self.audit.record(
event="consent.revoked",
consent_id=consent_id,
revoked_by=revoked_by,
reason=reason,
)
async def _validate_access(
self,
consent_id: str,
tpp_id: str,
required_scope: ConsentScope,
account_id: Optional[str] = None,
) -> ConsentRequest:
"""Validate that a TPP has authorized access for this request.
Checks in order:
1. Consent exists
2. TPP matches
3. Consent is still valid (not expired/revoked)
4. Required scope is covered
5. Account is covered (if applicable)
6. Rate limit not exceeded
"""
consent = await self.consents.get(consent_id)
if consent is None:
raise ConsentNotFoundError(consent_id)
if consent.tpp_id != tpp_id:
raise UnauthorizedTPPError("TPP does not own this consent")
if not consent.is_valid():
raise ConsentExpiredError(consent_id)
if not consent.covers_scope(required_scope):
raise InsufficientScopeError(
f"Consent does not include {required_scope.value}"
)
if account_id and not consent.covers_account(account_id):
raise AccountNotConsentedError(
f"Account {account_id} not included in consent"
)
# Rate limiting
today_count = await self.consents.get_access_count_today(consent_id)
if today_count >= consent.frequency_per_day:
raise RateLimitExceededError(
f"Daily limit of {consent.frequency_per_day} exceeded"
)
await self.consents.increment_access_count(consent_id)
return consent
async def _verify_tpp(
self, tpp_id: str, cert_hash: str
) -> bool:
"""Verify TPP registration and eIDAS certificate."""
# In production: check against national competent authority
# register and validate eIDAS certificate chain
tpp = await self.consents.get_tpp(tpp_id)
if tpp is None:
return False
return tpp.certificate_hash == cert_hash
What AI tools get wrong: Open Banking code is where regulatory knowledge gaps in AI tools become most apparent. Claude Code generates the most complete consent lifecycle, including the 90-day maximum validity and SCA category requirements (2 of 3: knowledge, possession, inherence). Cursor produces clean API structures but misses the rate limiting per consent and the 90-day transaction history lookback limit. Copilot generates basic OAuth-style consent flows but does not understand that PSD2 consent is fundamentally different from OAuth scopes — PSD2 requires explicit account selection, time-limited validity, and frequency limits per consent. Amazon Q handles the AWS-adjacent parts well (certificate validation patterns) but misses PSD2-specific requirements. None of the tools generate the TPP certificate verification against the eIDAS framework without explicit prompting. The most dangerous error across all tools is generating consent models that lack expiration checking — serving account data on an expired consent is a regulatory violation.
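A minimal, standalone sketch of the request-time expiration guard whose absence is called out above. The exception name mirrors the example, but this function is hypothetical; the essential property is that validity is re-checked on every data access, because a consent that was valid yesterday may have expired overnight:

```python
from datetime import datetime, timezone, timedelta

class ConsentExpiredError(Exception):
    pass

def require_valid_consent(status: str, valid_until: datetime) -> None:
    """Raise unless the consent is both marked valid AND not yet expired."""
    if status != "valid":
        raise ConsentExpiredError(f"consent status is {status!r}")
    if datetime.now(timezone.utc) > valid_until:
        # The stored status may still say "valid"; wall-clock expiry wins.
        raise ConsentExpiredError("consent is past valid_until")

# A consent past its valid_until must fail even if its status field
# was never updated to EXPIRED.
expired = datetime.now(timezone.utc) - timedelta(days=1)
try:
    require_valid_consent("valid", expired)
    raise AssertionError("expired consent was accepted")
except ConsentExpiredError:
    pass
```

Relying on a background job to flip statuses to EXPIRED is not sufficient: if the job lags, the API serves data on a dead consent, which is exactly the regulatory violation the text describes.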
What AI Tools Get Wrong in FinTech
After extensive testing across all major AI coding tools, these are the FinTech-specific errors that appear consistently. Memorize this list — it will save you from compliance violations, financial losses, and regulatory fines:
- Using float for money (the classic): AI tools write `price = 19.99` instead of `price = Decimal("19.99")`. IEEE 754 doubles cannot represent 0.1 exactly: `0.1 + 0.2 == 0.30000000000000004` in every mainstream language. Over millions of transactions, these errors accumulate into real financial discrepancies. AI tools generate float-based calculations 80%+ of the time unless your codebase already uses Decimal consistently. Always review generated code for `float`, `double`, or unquoted numeric literals in financial calculations.
- Non-idempotent payment endpoints: AI tools generate `POST /payments` endpoints that create a new payment on every request. When the client retries after a timeout (not knowing if the first request succeeded), the customer gets charged twice. Every payment endpoint must accept an idempotency key, check if that key has been seen before, and return the stored result for duplicate keys. This is not optional — it is a requirement for any system that processes real money.
- Logging sensitive card data (PCI violation): AI tools generate `logger.info(f"Processing payment for card {card_number}")`, which is a PCI-DSS violation that can result in your payment processing privileges being revoked. Card numbers, CVVs, and full track data must never appear in logs, error messages, stack traces, or monitoring systems. Only the last 4 digits of the card and the tokenized reference should ever be logged.
- Missing audit trail entries: Financial regulators require a complete, immutable trail of every action that affects money or customer data. AI tools generate functional code that processes transactions correctly but produces no audit evidence. Adding audit logging after the fact is painful and unreliable — it must be designed into every operation from the start.
- Race conditions in balance checks (check-then-act): AI tools generate `balance = get_balance(); if balance >= amount: debit(amount)` without any locking or atomic operation. Between the check and the debit, another concurrent request can drain the account. Use `UPDATE accounts SET balance = balance - $1 WHERE balance >= $1 RETURNING balance` or equivalent atomic operations. This is the most common source of overdraft bugs in AI-generated code.
- Incorrect rounding: AI tools use `round(amount, 2)`, which is unreliable for money: Python 3's built-in `round` does apply banker's rounding, but it operates on binary floats, so halfway cases like `round(2.675, 2)` come out as `2.67` (the stored value is slightly below 2.675), and many other languages round half away from zero. Financial calculations require banker's rounding (ROUND_HALF_EVEN) applied to Decimal values to avoid systematic bias; over millions of transactions, an inconsistent rounding direction creates measurable financial discrepancies. Use `amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)` explicitly.
- Hardcoded compliance rules that should be configurable: AI tools write `if amount > 10000: file_ctr()` directly in the transaction processing code. When the CTR threshold changes (or differs by jurisdiction), this requires code changes, testing, and deployment. Compliance rules must be loaded from configuration and updatable without deployment.
- Missing webhook signature verification: Payment gateway webhooks confirm that charges succeeded, subscriptions renewed, or disputes were filed. Without verifying the webhook signature, an attacker can forge payment confirmations and receive goods without paying. AI tools generate webhook handlers that parse the JSON body without ever checking the signature header.
- Settlement logic that does not handle partial amounts: AI tools generate refund and settlement code that assumes full amounts. In production, you handle partial refunds, partial chargebacks, split settlements (platform fee + merchant payout), currency conversion with separate exchange rates for different portions, and tax adjustments on partial amounts. Each partial amount creates additional ledger entries that must all balance. AI tools generate all-or-nothing settlement that breaks on the first partial operation.
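The float and rounding failure modes in the list above can be verified in a few lines of Python:

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Binary floats cannot represent most decimal fractions; Decimal can.
assert 0.1 + 0.2 != 0.3                                   # 0.30000000000000004
assert Decimal("0.1") + Decimal("0.2") == Decimal("0.3")  # exact

# Banker's rounding: halfway cases go to the even neighbor, so half-cent
# amounts do not all drift in one direction over millions of transactions.
cent = Decimal("0.01")
assert Decimal("2.675").quantize(cent, rounding=ROUND_HALF_EVEN) == Decimal("2.68")
assert Decimal("2.665").quantize(cent, rounding=ROUND_HALF_EVEN) == Decimal("2.66")
```

Note that both `quantize` calls start from Decimal constructed with string literals; constructing `Decimal(2.675)` from a float would import the representation error before rounding ever happens.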
Cost Model: What Does This Actually Cost?
FinTech engineering involves deep coding sessions (ledger design, payment flow implementation, compliance rules) mixed with shorter sessions (gateway integration updates, configuration changes, regulatory report tweaks). The compliance-heavy nature of FinTech means more time spent on correctness verification than in typical software development. Here are realistic cost scenarios:
| Scenario | Recommended Stack | Monthly Cost | Why This Stack |
|---|---|---|---|
| Solo FinTech Dev: side project, payment integrations, Stripe/Adyen | Copilot Free + Claude Free | $0 | Copilot for gateway API completions, Claude for reasoning through edge cases in payment flows and understanding PCI requirements |
| FinTech Startup Engineer: payment platform, ledger, basic compliance | Cursor Pro | $20 | Multi-file context handles payment service + ledger + API layer together; project-aware completions learn your Decimal patterns and naming conventions |
| FinTech Team Lead: compliance architecture, risk systems, multi-service | Claude Code + Copilot Pro | $30 | Claude for compliance reasoning, ledger correctness verification, and risk model design; Copilot for fast inline completions on API integrations and data models |
| Trading Systems Team: order management, matching engine, real-time risk | Cursor Business + Claude Code | $60 | Cursor for codebase-aware completions across large trading monorepos; Claude for reasoning about matching engine correctness and risk model edge cases |
| Compliance-Heavy Enterprise: banking, PCI-DSS, SOX, multi-jurisdiction | Cursor Business + Claude Code (per seat) | $60–99/seat | Enterprise features (SSO, audit logs, zero data retention) required for teams handling PCI-scoped code and regulated financial data |
ROI reality check: FinTech engineers typically earn $180,000–$300,000+ (trading systems and compliance roles at the top end). At $220K/year, a 5% productivity gain justifies $917/month in tooling. Even at a conservative 2% gain from AI coding tools (primarily from faster boilerplate generation in API integrations, data models, and test scaffolding), a $20–60/month investment pays for itself many times over. But the real ROI calculation in FinTech includes risk avoidance: a single compliance violation caught by AI-assisted code review can save $50K–$500K+ in fines. The areas where AI tools save the most time are not the financial logic core (where you need to understand and verify every line) but the surrounding infrastructure: API endpoint scaffolding, data transfer objects, database migration boilerplate, and test fixture generation.
Practical Recommendations
Delegate to AI (repetitive, well-specified, low-risk when reviewed):
- Payment gateway integration boilerplate: Stripe, Adyen, Square API shapes, webhook handler scaffolding, error mapping structures. This is 25% of FinTech code and highly repetitive across gateways.
- Data model generation: Transaction records, account entities, ledger entry schemas, API request/response objects. Tedious but well-defined from specifications.
- Test scaffolding and fixture generation: Creating test payment requests, mock gateway responses, ledger state fixtures. Repetitive and low-risk when reviewed.
- Database migration boilerplate: Creating tables, indexes, constraints for financial data. AI tools know common schema patterns for transactions, accounts, and audit logs.
- API endpoint scaffolding: REST endpoint structures, input validation, error response formatting. The outer shell of your API is mostly standard web development.
Verify line by line (AI output fails most often here):
- Any arithmetic on money: Verify Decimal usage, rounding mode, and currency-specific decimal places. This is the number one source of financial bugs in AI-generated code.
- Idempotency handling: Verify that duplicate requests return stored results without re-executing side effects. Test with concurrent duplicate submissions.
- Audit trail completeness: Verify that every state change produces an audit event with full context. Missing entries are compliance violations.
- Sensitive data in logs: Search generated code for any logging of card numbers, SSNs, account numbers, or authentication tokens.
- Race conditions in balance checks: Verify that balance checks and debits are atomic. The check-then-act pattern is the most common concurrency bug.
- Compliance rule configurability: Verify that regulatory thresholds and rules are loaded from configuration, not hardcoded.
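The money-arithmetic check above is concrete enough to sketch. This is an illustrative pattern, not any particular library's API: amounts are parsed from strings (never floats), quantized to the currency's minor unit, and fee splits are computed so the parts re-sum exactly. The currency table here is a hand-picked subset of ISO 4217 minor units:

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Minor-unit exponents differ by currency (illustrative ISO 4217 subset):
# USD/EUR have 2 decimal places, JPY has 0, BHD has 3.
CURRENCY_EXPONENT = {"USD": 2, "EUR": 2, "JPY": 0, "BHD": 3}

def money(amount: str, currency: str) -> Decimal:
    """Parse a monetary amount from a string -- never from a float."""
    exp = Decimal(10) ** -CURRENCY_EXPONENT[currency]
    return Decimal(amount).quantize(exp, rounding=ROUND_HALF_EVEN)

def split_fee(total: Decimal, fee_rate: Decimal,
              currency: str) -> tuple[Decimal, Decimal]:
    """Split a charge into (platform_fee, merchant_payout).

    Only the fee is rounded; the payout absorbs the remainder,
    so fee + payout always re-sums to the original total.
    """
    exp = Decimal(10) ** -CURRENCY_EXPONENT[currency]
    fee = (total * fee_rate).quantize(exp, rounding=ROUND_HALF_EVEN)
    return fee, total - fee
```

The design choice worth noting is in `split_fee`: rounding each portion independently lets pennies leak (the classic partial-settlement bug from the pitfalls above), so exactly one portion is rounded and the other is derived by subtraction, keeping the ledger entries in balance.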
Related Guides
- AI Coding Tools for Backend Engineers (2026) — API design, distributed systems, database patterns
- AI Coding Tools for Security Engineers (2026) — Threat modeling, encryption, access control
- AI Coding Tools for Compliance Engineers (2026) — Regulatory automation, audit systems, policy as code
- AI Coding Tools for Data Engineers (2026) — Pipelines, warehousing, real-time streaming
- AI Coding Tools for Blockchain & Web3 Developers (2026) — Smart contracts, DeFi, token systems
- AI Coding Tools for DevOps Engineers (2026) — CI/CD, infrastructure, monitoring, deployment