Resources / Field guide
Safe record merging & coalesce
Every real database grows duplicates — the same vendor entered twice, two records for one client. "Merge these" sounds simple and is quietly one of the riskier operations you can build: get it wrong and you orphan child records or lose data. Here's a pattern that merges cleanly, plus the EF Core gotcha that makes the second merge fail when the first one worked.
What "merge" actually means
Merging record B (the source) into record A (the target) is three distinct operations, in order:
- Re-point every child record that references B so it references A instead (trips, payments, line items — everything related).
- Coalesce — copy any field that's filled on B but blank on A, so you don't lose detail. The target's existing values always win; you're only backfilling gaps.
- Delete the now-empty source B.
Order matters: if you delete before re-pointing, you orphan B's children (or hit a foreign-key error). Coalesce before delete, or that data is gone for good.
Coalesce: target wins, blanks get filled
The rule that keeps merges predictable is "target wins." Walk each field; copy from source only when the target is empty:
// Backfill only the blanks — never overwrite a value the target already has
target.Email ??= source.Email;
target.Website ??= source.Website;
target.Address ??= source.Address;
target.Notes ??= source.Notes;
// booleans: OR them; phone/address blocks: move as a unit if target lacks one
Spell the semantics out for the user before they commit, too: "linked records move to the target, the target keeps its own details, blanks are filled from the source, and this can't be undone." A merge is destructive; the dialog should say so plainly.
The gotcha: a soft-delete query filter causes foreign-key violations on delete. If your app uses EF Core global query filters for soft-delete, your re-point queries silently skip soft-deleted children — they're filtered out, so they keep pointing at the source. Then the hard delete fails with a foreign-key violation (MySQL error 1451). The classic symptom: the first merge works, the second one doesn't. The fix is to run every re-point and collision query with the filter off:
// Include soft-deleted children, or they keep referencing the source
var children = _db.Trips
.IgnoreQueryFilters()
.Where(t => t.VendorId == sourceId);
foreach (var c in children) c.VendorId = targetId;
Do it in a transaction
Re-point, coalesce, and delete must all succeed or all roll back — a half-finished merge is worse than no merge. Wrap the whole thing:
await using var tx = await _db.Database.BeginTransactionAsync();
RepointChildren(sourceId, targetId); // with IgnoreQueryFilters()
Coalesce(target, source);
_db.Vendors.Remove(source);
await _db.SaveChangesAsync();
await tx.CommitAsync();
Don't forget the deep relationships
The bugs that escape testing are the relationships you forgot to re-point. A vendor isn't just its profile — it may own contacts, attachments, or nested child collections. If those aren't moved, they vanish from the merged record even though the merge "succeeded." Enumerate every table with a foreign key to the entity and handle each one; audit a real merge afterward to confirm nothing went missing.
Conflict policy: "target wins" resolves both-sides-filled conflicts silently. That's the right default for speed, but if the data is high-stakes, consider a per-field "keep this one" picker so a human decides when both records disagree. Make it a conscious choice, not an accident.
When this fits
Build a real merge tool when duplicates are inevitable and matter — CRM contacts, vendors, clients, catalog entries. The investment is in correctness: transactional, filter-aware, coalescing, and exhaustive about relationships. Skip the shortcuts here; a merge that loses data erodes trust in the whole system.
Cleaning up messy, duplicated data?
I build dedupe, merge, and data-integrity tooling that won't lose your records — transactional and tested. Tell me what your data looks like and let's talk.
Work with me