MBCTD: Open Multi-Label Building Change Type Detection at the Pixel Level

We're releasing MBCTD, a deep learning model that detects demolished, new, and unchanged buildings independently per pixel — including replacement sites that single-label models can't represent. Available now.

Retgen AI
May 21, 2026
Building Change Detection, Remote Sensing, Geospatial AI, Model Release, Multi-Label Segmentation

Today we're releasing MBCTD (Multi-Label Building Change Type Detection), a deep learning model for per-pixel detection of building changes in bi-temporal aerial and satellite imagery. The model is available now on GitHub, with pre-trained weights, a programmatic inference API, and an interactive web demo.

Most change detection models reduce the problem to a binary verdict (changed or not changed) and lose the most useful signal: what changed. A demolished plot and a new construction look the same to a binary classifier. A site that's been torn down and rebuilt collapses into a single ambiguous event. For anyone using these outputs downstream — urban planners, insurers, real-estate analysts, disaster response teams — that ambiguity is the difference between an actionable insight and a noisy mask.

MBCTD treats each pixel as multi-label across three independent classes:

  • Unchanged: building present in both images
  • Demolished: building present before, absent after
  • New: building absent before, present after

Because the three classes are predicted independently rather than as mutually exclusive alternatives, a single pixel can carry more than one label at once. Replacement sites, where a building was demolished and another was built on the same footprint, are represented as exactly that: demolished and new active simultaneously. This needs no model surgery and no post-processing heuristics; the architecture supports the case natively.
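To make the multi-label idea concrete, here is a minimal NumPy sketch of how three independent per-pixel probabilities could be decoded into masks. It is an illustration, not the released inference API: the function name `decode_multilabel`, the channel order, and the 0.5 threshold are all assumptions.

```python
import numpy as np

# Hypothetical channel order; the released model may order its outputs differently.
CLASSES = ("unchanged", "demolished", "new")

def decode_multilabel(probs: np.ndarray, threshold: float = 0.5) -> dict:
    """Threshold each class channel independently.

    probs: array of shape (3, H, W) holding per-class probabilities in [0, 1].
    Returns one boolean mask per class plus a derived "replaced" mask.
    """
    if probs.ndim != 3 or probs.shape[0] != len(CLASSES):
        raise ValueError("expected probabilities of shape (3, H, W)")
    # No argmax across classes: every channel gets its own yes/no decision.
    masks = {name: probs[i] >= threshold for i, name in enumerate(CLASSES)}
    # A replacement site is a pixel where demolished and new fire together.
    masks["replaced"] = masks["demolished"] & masks["new"]
    return masks

# Toy 1x4 image: unchanged, demolished only, new only, replaced.
probs = np.array([
    [[0.9, 0.1, 0.2, 0.1]],  # unchanged channel
    [[0.1, 0.8, 0.1, 0.7]],  # demolished channel
    [[0.0, 0.2, 0.9, 0.9]],  # new channel
])
masks = decode_multilabel(probs)
print(masks["replaced"])
```

An argmax over the same three channels would force the last pixel into a single class, which is the case a single-label model cannot express.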

Architecture

MBCTD pairs a Siamese ConvNeXt-Base encoder (initialized with DINOv3 LVD1689M weights) with a full-resolution U-Net decoder. At each encoder scale, before/after features are fused as [before, after, before−after, |before−after|] and projected through 1×1 → 3×3 convolutions. The decoder uses PixelShuffle upsampling to avoid checkerboard artifacts, and high-resolution skip connections at 1/2 and full resolution inject the raw input pair directly into the final layers — preserving the fine building boundaries that low-resolution skips would blur.
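The fusion and upsampling steps can be sketched in a framework-agnostic way with NumPy. This is only a shape-level illustration under stated assumptions: the helper names, channel counts, and random weights are hypothetical, the 3×3 convolution that follows the 1×1 projection is omitted for brevity, and the real model would use learned layers in a deep learning framework.

```python
import numpy as np

def fuse_features(before: np.ndarray, after: np.ndarray) -> np.ndarray:
    """Concatenate [before, after, before - after, |before - after|] on the channel axis.

    before, after: (C, H, W) feature maps from the shared (Siamese) encoder.
    Returns a (4C, H, W) array.
    """
    if before.shape != after.shape:
        raise ValueError("before/after feature maps must have the same shape")
    diff = before - after
    return np.concatenate([before, after, diff, np.abs(diff)], axis=0)

def conv1x1(x: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """1x1 convolution as a per-pixel channel mix. weight: (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r), the usual PixelShuffle layout."""
    c_rr, h, w = x.shape
    if c_rr % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_rr // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (c, row offset, col offset)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (c, h, row offset, w, col offset)
    return x.reshape(c, h * r, w * r)

rng = np.random.default_rng(0)
before = rng.standard_normal((8, 4, 4))   # illustrative sizes only
after = rng.standard_normal((8, 4, 4))

fused = fuse_features(before, after)                         # (32, 4, 4)
projected = conv1x1(fused, rng.standard_normal((16, 32)))    # (16, 4, 4)
upsampled = pixel_shuffle(projected, r=2)                    # (4, 8, 8)
print(fused.shape, projected.shape, upsampled.shape)
```

Keeping both the signed and the absolute difference gives the decoder direction (which image had the building) as well as magnitude, which is plausibly what lets demolished and new be told apart. Pixel shuffle upsamples by moving channels into spatial positions rather than by overlapping transposed-convolution kernels.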

Results

MBCTD was trained on FOTBCD, a dataset of 220k+ before/after aerial pairs spanning 28 French departments, and evaluated at full resolution on two benchmarks:

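Benchmark  | F1    | mIoU  | Per-class IoU
-----------|-------|-------|---------------------------------------------
FOTBCD     | 0.907 | 0.909 | 0.78 unchanged · 0.82 demolished · 0.82 new
LEVIR-CD+  | 0.791 | 0.818 | binary only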
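For readers who want to score their own predictions, the sketch below computes IoU and F1 for a single binary mask pair using the standard definitions. The post does not state the exact evaluation protocol (for example, how classes are averaged into mIoU), so this is not a reproduction of the reported numbers. The `to_binary_change` helper, which collapses multi-label output into one change mask for a binary benchmark, is likewise an assumption.

```python
import numpy as np

def iou_and_f1(pred: np.ndarray, target: np.ndarray) -> tuple[float, float]:
    """IoU and F1 for one pair of binary masks of equal shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    union = tp + fp + fn
    if union == 0:
        # Class absent from both prediction and ground truth: undefined.
        return float("nan"), float("nan")
    iou = tp / union
    f1 = 2 * tp / (2 * tp + fp + fn)
    return float(iou), float(f1)

def to_binary_change(demolished: np.ndarray, new: np.ndarray) -> np.ndarray:
    """Assumed collapse for binary benchmarks: changed if demolished OR new."""
    return demolished.astype(bool) | new.astype(bool)

pred = np.array([[1, 1, 0, 0]])
target = np.array([[1, 0, 1, 0]])
iou, f1 = iou_and_f1(pred, target)   # one TP, one FP, one FN
print(iou, f1)
```

For a single class the two metrics are tied together by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU on the same mask. Averaged scores such as mIoU depend on which classes are included in the mean.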

The model transfers to LEVIR-CD+ (F1 0.791, mIoU 0.818), a benchmark whose imagery and geography were never seen during training. This suggests that multi-label supervision on a geographically diverse dataset produces representations that generalize across acquisition conditions, sensor types, and urban morphologies.

Availability and licensing

MBCTD model weights and code are released under CC BY-NC 4.0 — free for research and non-commercial use. Clone, fine-tune, benchmark, publish.

For commercial deployments — urban monitoring platforms, real-estate analytics, insurance underwriting, illegal-construction surveillance, post-disaster assessment — both the model and the FOTBCD training dataset are available under dedicated commercial licenses. If you're building a product on top of building change detection, we'd like to hear about it. Contact us to discuss licensing terms.