MBCTD: Open Multi-Label Building Change Type Detection at the Pixel Level
Today we're releasing MBCTD (Multi-Label Building Change Type Detection), a deep learning model for per-pixel detection of building changes in bi-temporal aerial and satellite imagery. The model is available now on GitHub, with pre-trained weights, a programmatic inference API, and an interactive web demo.
Most change detection models reduce the problem to a binary verdict (changed or not changed) and lose the most useful signal: what changed. A demolished plot and a new construction look the same to a binary classifier. A site that's been torn down and rebuilt collapses into a single ambiguous event. For anyone using these outputs downstream — urban planners, insurers, real-estate analysts, disaster response teams — that ambiguity is the difference between an actionable insight and a noisy mask.
MBCTD treats each pixel as multi-label across three independent classes:
- Unchanged: building present in both images
- Demolished: building present before, absent after
- New: building absent before, present after
Because the three heads are independent, a single pixel can carry more than one label at once. Replacement sites — demolished and rebuilt on the same footprint — are represented as exactly that: demolished and new active simultaneously. No model surgery, no post-processing heuristics. The architecture supports the case natively.
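To make the multi-label idea concrete, here is a minimal sketch of how three independent per-pixel probabilities could be turned into masks, with replacement sites read off as pixels where both change classes are active. The function name, the channel order (unchanged, demolished, new), and the 0.5 threshold are illustrative assumptions, not the MBCTD API.

```python
import numpy as np

# Hypothetical threshold; the post does not state the value MBCTD uses.
THRESHOLD = 0.5


def decode_labels(probs: np.ndarray, threshold: float = THRESHOLD) -> dict:
    """Turn per-class probabilities of shape (3, H, W) into boolean masks.

    The channel order (unchanged, demolished, new) is an assumption.
    """
    if probs.ndim != 3 or probs.shape[0] != 3:
        raise ValueError("expected an array of shape (3, H, W)")
    # Each class is thresholded on its own; no argmax across classes,
    # so a pixel may end up with zero, one, or several labels.
    unchanged, demolished, new = probs >= threshold
    return {
        "unchanged": unchanged,
        "demolished": demolished,
        "new": new,
        # Replacement site: both change classes fire on the same pixel.
        "replaced": demolished & new,
    }
```

The key design point is the absence of a softmax or argmax across classes: each class is decided independently, which is what allows demolished and new to coexist on one pixel.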
Architecture
MBCTD pairs a Siamese ConvNeXt-Base encoder (initialized with DINOv3 LVD1689M weights) with a full-resolution U-Net decoder. At each encoder scale, before/after features are fused as [before, after, before−after, |before−after|] and projected through 1×1 → 3×3 convolutions. The decoder uses PixelShuffle upsampling to avoid checkerboard artifacts, and high-resolution skip connections at 1/2 and full resolution inject the raw input pair directly into the final layers — preserving the fine building boundaries that low-resolution skips would blur.
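The fusion step described above can be sketched in a few lines. This is a framework-free NumPy illustration under stated assumptions: the function names are hypothetical, feature maps are laid out as (channels, height, width), and only the 1×1 projection is shown (as a per-pixel linear map); the bias terms, the 3×3 convolution, and any normalization or activation are omitted. A real implementation would use a deep learning framework's convolution layers.

```python
import numpy as np


def fuse_features(before: np.ndarray, after: np.ndarray) -> np.ndarray:
    """Concatenate [before, after, before - after, |before - after|] on the channel axis.

    Inputs have shape (C, H, W); the result has shape (4 * C, H, W).
    """
    if before.shape != after.shape:
        raise ValueError("before and after feature maps must have the same shape")
    diff = before - after
    return np.concatenate([before, after, diff, np.abs(diff)], axis=0)


def project_1x1(features: np.ndarray, weight: np.ndarray) -> np.ndarray:
    """Apply a 1x1 convolution, i.e. a linear map over channels at every pixel.

    weight has shape (C_out, C_in); features has shape (C_in, H, W).
    """
    return np.einsum("oc,chw->ohw", weight, features)
```

One plausible reading of the design: the signed difference keeps the direction of change, which is what separates demolished from new, while the absolute difference gives a symmetric "something changed here" signal.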
Results
MBCTD was trained on FOTBCD, a dataset of 220k+ before/after aerial pairs spanning 28 French departments, and evaluated at full resolution on two benchmarks:
| Benchmark | F1 | mIoU | Per-class IoU |
|---|---|---|---|
| FOTBCD | 0.907 | 0.909 | 0.78 unchanged · 0.82 demolished · 0.82 new |
| LEVIR-CD+ | 0.791 | 0.818 | binary only |
Cross-dataset transfer to LEVIR-CD+, a benchmark whose imagery and geography were never seen during training, suggests that multi-label supervision on a geographically diverse dataset produces representations that generalize across acquisition conditions, sensor types, and urban morphologies. LEVIR-CD+ provides binary change labels only, so per-class IoU is not reported for it.
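For readers who want to compute comparable numbers on their own predictions, the sketch below shows the standard definitions of IoU and F1 for a single binary mask. It is an assumption that the reported figures follow these definitions; the post does not say how scores are averaged across classes or images, and the convention for empty masks here is a choice made for illustration.

```python
import numpy as np


def iou_and_f1(pred: np.ndarray, target: np.ndarray) -> tuple[float, float]:
    """Return (IoU, F1) for one class, given two binary masks of equal shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    tp = int(np.logical_and(pred, target).sum())    # predicted and present
    fp = int(np.logical_and(pred, ~target).sum())   # predicted but absent
    fn = int(np.logical_and(~pred, target).sum())   # missed
    if tp + fp + fn == 0:
        # Both masks are empty; scoring this as perfect is an assumed convention.
        return 1.0, 1.0
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1
```

Per-class IoU for a multi-label output would apply this function once per class mask; a mean IoU is then an average of those values under whichever averaging scheme the evaluation uses.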
Availability and licensing
MBCTD model weights and code are released under CC BY-NC 4.0 — free for research and non-commercial use. Clone, fine-tune, benchmark, publish.
For commercial deployments — urban monitoring platforms, real-estate analytics, insurance underwriting, illegal-construction surveillance, post-disaster assessment — both the model and the FOTBCD training dataset are available under dedicated commercial licenses. If you're building a product on top of building change detection, we'd like to hear about it. Contact us to discuss licensing terms.
- GitHub (code, demo, weights): https://github.com/abdelpy/MBCTD
- FOTBCD dataset: Learn more
