Context

At FiveBy, I had spent years on sanctions and export control compliance, work primarily conducted for Microsoft, covering Russia and CIS, China, Iran, North Korea, dual-use defense technology, and the fintech and GDPR frameworks around them. Sanctions work is not organized around controls. It is organized around actors. You do not start with what rules should be enforced. You start with who the adversary is, what they are trying to move, and how they are currently evading detection.

Amazon Merchant Risk had the opposite default. The team was asking a reasonable-sounding question: how do we catch more of these bad actors? The answer almost always pointed toward extrapolating new risk signals from data we already had. That approach produces incremental gains. What it does not produce is a current model of the adversary. Without that, any control you build is solving the version of the problem you could already see.

The pattern that crossed domains

At FiveBy, one of the most reliable evasion patterns in sanctions and export-control work was what I internally called Shared Leadership Across Multiple Entities. An individual would register or operate multiple legal entities, often across jurisdictions, using unrelated nominees as representatives of record. Each entity looked clean on its own. The connections were visible only if you were looking for the person across entities rather than evaluating each entity in isolation.

That pattern showed up, almost unchanged, on Amazon.

Sophisticated seller fraud operations rarely run one account. They run portfolios: multiple legal entities, multiple registered agents, sometimes multiple jurisdictions. Each account passed its own KYC review because KYC was, by default, an entity-scoped exercise. The system was asking whether this entity was legitimate when it should have been asking whether this actor was legitimate across everywhere they appeared.

I led the build of Amazon's first entity-linkage database for Merchant Risk, a system designed to maintain unique identification attributes across individuals and entities so that when the same person appeared as a registered party, beneficial owner, or operational signal across multiple companies, the system could surface that fact rather than miss it.
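The core idea behind that linkage is simple enough to sketch. The following is a minimal illustration, not the production system: it assumes a simplified, hypothetical attribute schema (an owner ID and a bank account) and uses union-find to cluster entity records that share any identity attribute, so one actor behind several clean-looking entities surfaces as a single cluster.

```python
from collections import defaultdict

# Hypothetical simplified records; the real attribute set was far richer.
records = [
    {"entity": "E1", "owner_id": "P-100", "bank": "B-7"},
    {"entity": "E2", "owner_id": "P-100", "bank": "B-9"},  # same owner as E1
    {"entity": "E3", "owner_id": "P-200", "bank": "B-9"},  # same bank as E2
    {"entity": "E4", "owner_id": "P-300", "bank": "B-1"},  # unconnected
]

parent = {}

def find(x):
    """Union-find root lookup with path compression."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

# Index each (attribute, value) pair, then link every entity sharing one.
index = defaultdict(list)
for r in records:
    for attr in ("owner_id", "bank"):
        index[(attr, r[attr])].append(r["entity"])
for entities in index.values():
    for e in entities[1:]:
        union(entities[0], e)

# Group entities by their cluster root.
clusters = defaultdict(set)
for r in records:
    clusters[find(r["entity"])].add(r["entity"])
print(sorted(sorted(c) for c in clusters.values()))
# → [['E1', 'E2', 'E3'], ['E4']]
```

Evaluated per entity, E1 through E3 each look clean; evaluated per actor, the shared owner and shared bank account chain them into one cluster, which is exactly the Shared Leadership pattern the database was built to surface.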

The interview

The linkage database found actors. It did not explain them. The team's working model of seller fraud rings was built from data we already had. That model was structurally limited in the same way any data-only model is. It could describe what the actors were doing, but not how they recruited, why they chose the identities they chose, or what the ring looked like from the inside.

I proposed a different approach: work backward not from more data, but from the actor.

With Amazon legal's approval, and using a held-balance context that created a legitimate incentive for cooperation, I conducted an in-person interview with the leader of an organized seller fraud ring. This was not a typical PM activity. It was uncomfortable, heavily structured, and carefully bounded. It was also the single most valuable research hour I had during my time on the program.

What I learned could not have come from our data. The ring leader described how the operation recruited hired identities through fake job postings, deliberately targeting vulnerable cohorts such as recent graduates and elderly workers. He described how the operation selected for people unlikely to ask questions about the administrative paperwork they were asked to sign. He described which of our controls mattered to him and which he routed around without a second thought.

I led the development of an automated re-verification process using the new attribute set. It distinguished hired identities from authentic sellers with over 90% accuracy.

Friction, calibrated

The third move is the one I think about most now, because it is the one that connects marketplace integrity to the broader thesis of trust.

Amazon treated risky sellers largely as a single class. A flag fired, enforcement followed. The system was tuned to catch bad actors, but the cost structure of enforcement fell disproportionately on the wrong people. Sophisticated fraud rings had operational redundancy. Losing an account was a cost of doing business. A new seller, an amateur working out of a garage, had no redundancy. An enforcement action on them was not a setback. It was a closure.

That asymmetry was a trust failure dressed up as a safety mechanism. A marketplace that cannot distinguish between a coordinated adversary and a new seller who misfiled a tax document is a marketplace that punishes the second because it is defended against the first.

I developed a risk appetite framework for Merchant Risk to quantify the financial and reputational impact of different tiers of risky sellers and match each tier to a proportional enforcement path. The critical design decision was that the verification experience looked identical to every seller on the surface, while the backend routed each seller to the right verification depth.
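The routing logic can be sketched in a few lines. The tiers, thresholds, and path names below are hypothetical, invented for illustration; the point is the shape of the design: one uniform seller-facing flow, with depth decided behind it.

```python
from dataclasses import dataclass

# Hypothetical tiers and paths; the real framework quantified financial
# and reputational impact per tier before assigning enforcement paths.
VERIFICATION_PATHS = {
    "low":    "standard_docs",         # new/amateur sellers: light-touch review
    "medium": "enhanced_docs",         # anomalous signals: deeper document checks
    "high":   "manual_investigation",  # linked-actor signals: full review
}

@dataclass
class Seller:
    seller_id: str
    risk_score: float          # assumed composite score in [0, 1]
    linked_to_known_actor: bool

def route(seller: Seller) -> str:
    """Pick a verification depth; the seller-facing experience stays identical."""
    if seller.linked_to_known_actor or seller.risk_score >= 0.8:
        tier = "high"
    elif seller.risk_score >= 0.4:
        tier = "medium"
    else:
        tier = "low"
    return VERIFICATION_PATHS[tier]

print(route(Seller("S1", 0.1, False)))  # → standard_docs
print(route(Seller("S2", 0.5, False)))  # → enhanced_docs
print(route(Seller("S3", 0.2, True)))   # → manual_investigation
```

Note that a low-score seller linked to a known actor still routes to the deepest path: linkage evidence outranks the score, because the coordinated adversary is the one with operational redundancy.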

The framework saved tens of thousands of new and amateur sellers from enforcement actions that the old uniform system would have applied, while tightening enforcement against the actors who actually warranted it.

What the three moves had in common

The three moves look like different projects. They are not. They are the same move, applied at three layers of the same system. Each one moved the program further from a controls-first posture and closer to an actor-first one. Each made the system simultaneously more effective against the people trying to break it and less punishing to the people who depended on it.

If I had to name the single move that mattered most, it would be the decision to stop asking how do we catch more of them and start asking who are they, actually?