
ALBs integrated: 6
Feature rows: 300–500M
Metadata elements: 49 (UK GEMINI 2.3)

government · London / England (national platform)

Defra National Land Data Platform — GeoEP

ACTIVE · 2025

Architected the geospatial data model underpinning Defra's national land data platform — integrating 6 Arm's Length Bodies, 300–500 million feature rows, and a 49-element UK GEMINI 2.3 metadata standard. Reduced data discovery from approximately one month to under 10 minutes.

Data discovery: under 10 min (was ~1 month)
Defra (via Informed Solutions)

The Challenge

The Problem We Were Asked to Solve

Defra's Geospatial Engagement Programme (GeoEP) needed to rationalise land data spread across six Arm's Length Bodies (RPA, EA, NE, APHA, FC, MMO) — organisations with conflicting taxonomies, inconsistent data standards, and no common discovery mechanism. Finding a specific land parcel record could take a team member up to a month due to siloed, undocumented datasets.

Key technical challenges:
- Cross-ALB taxonomy harmonisation: each body used different classification schemes, attribute names, and coordinate reference systems
- Metadata coverage: no consistent metadata existed across the 300–500 million feature rows in scope
- Standards compliance: the platform needed to satisfy UK GEMINI 2.3, ISO 19115, INSPIRE, DEFRA DAF, DEFRA Data Modelling Standards v2.3, and UK GDPR/DPIA requirements simultaneously
- Data quality: a 40% false-positive rate in SSSI boundary checks and a 14% 'white space' gap in land coverage both needed systematic resolution
- Architecture coherence: a Conceptual Data Model (CDM), Logical Data Model (LDM), and Physical Data Model (PDM) all had to be designed in alignment and version-controlled to Defra standards

Project Details

Client: Defra (via Informed Solutions)
Year: 2025
Location: London / England (national platform)
Sector: government

Tags

government · UK GEMINI 2.3 · data architecture · Defra · INSPIRE · QFAIR · metadata · CDM · LDM · cross-ALB harmonisation · Python · OGC · ISO 19115 · land data

Our Approach

What We Built

Soheil Sotoodeh joined the GeoEP programme as Geospatial Data Architect in November 2025, embedded with Informed Solutions during the Alpha phase and continuing into Beta supervision.

Data Architecture Design
- Designed CDM v4.2 (Conceptual Data Model) and LDM v3.1 (Logical Data Model) covering 9 entities with 114 fully-specified attributes
- Implemented a linkage-based architecture enabling cross-ALB record resolution without duplicating data across bodies
- Applied Third Normal Form (3NF) normalisation throughout the LDM to eliminate data redundancy and protect referential integrity
- Designed a tiered Physical Data Model (PDM) with Hot / Cold / Archive storage layers for cost-effective at-scale hosting
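The linkage-based idea can be illustrated with a small link table that maps each body's local record identifier to a shared parcel key, so source records stay with their owning ALB. This is a minimal sketch only: the table names, column names, and identifiers below are hypothetical and are not taken from the CDM or LDM.

```python
import sqlite3

# Hypothetical schema for illustration; names are not from the GeoEP LDM.
SCHEMA = """
CREATE TABLE land_parcel (
    parcel_key TEXT PRIMARY KEY
);
CREATE TABLE alb_record_link (
    alb_code   TEXT NOT NULL,
    local_id   TEXT NOT NULL,
    parcel_key TEXT NOT NULL REFERENCES land_parcel(parcel_key),
    PRIMARY KEY (alb_code, local_id)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one parcel linked from two ALBs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO land_parcel VALUES ('P-0001')")
    conn.executemany(
        "INSERT INTO alb_record_link VALUES (?, ?, ?)",
        [("RPA", "rpa-123", "P-0001"), ("NE", "ne-987", "P-0001")],
    )
    return conn


def linked_records(conn: sqlite3.Connection, parcel_key: str) -> list:
    """Return (alb_code, local_id) pairs that point at the given parcel."""
    rows = conn.execute(
        "SELECT alb_code, local_id FROM alb_record_link "
        "WHERE parcel_key = ? ORDER BY alb_code",
        (parcel_key,),
    )
    return rows.fetchall()
```

Calling `linked_records(build_demo_db(), "P-0001")` returns the two made-up ALB references for that parcel. Only identifiers are stored in the link table; attribute data is not copied, which is the property the linkage approach is meant to preserve.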

Metadata Framework
- Designed the UK GEMINI 2.3 Defra Land Profile, a 49-element metadata standard covering discovery, lineage, quality, distribution, and governance fields
- Mapped 51 functional requirements against the profile, achieving 94% requirements coverage
- Integrated the QFAIR quality scoring framework across 5 dimensions (Quality, Findability, Accessibility, Interoperability, Reusability)
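As an illustration of how a five-dimension score might be held and combined, the sketch below stores one value per QFAIR dimension and averages them. The 0 to 1 scale, the equal weighting, and the class name `QfairScore` are all assumptions made for this example; the case study does not describe the actual scoring rules.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class QfairScore:
    """One score per QFAIR dimension, on an assumed 0-1 scale."""

    quality: float
    findability: float
    accessibility: float
    interoperability: float
    reusability: float

    def __post_init__(self) -> None:
        # Reject values outside the assumed 0-1 range.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name} must be within [0, 1], got {value}")

    def overall(self) -> float:
        """Unweighted mean of the five dimensions (equal weights assumed)."""
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)
```

A dataset scored `QfairScore(1.0, 0.5, 0.5, 1.0, 0.5)` would have an overall score of about 0.7 under these assumptions. A weighted mean or a minimum-of-dimensions rule would be equally plausible designs.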

Taxonomy & Data Discovery Pipeline
- Reconciled conflicting land classification taxonomies across all 6 ALBs, including Living England Habitat Map, OS NGD Land Parcel Types, and Soilscapes
- Built a production-ready taxonomy discovery pipeline in Python using a modular, reusable architecture
- Resolved a 40% false-positive rate in SSSI checks through geometry validation and topology QC
- Identified and documented a 14% white-space land coverage gap for onward data acquisition planning
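Geometry validation of the kind mentioned above usually starts with basic ring checks before any topology rules are applied. The sketch below shows three such checks in plain Python. Which checks the programme actually ran is not stated, so the function names and the choice of checks are assumptions, and a production pipeline would more likely rely on a dedicated geometry library.

```python
from typing import List, Sequence, Tuple

Coord = Tuple[float, float]


def signed_area(ring: Sequence[Coord]) -> float:
    """Shoelace formula; positive for counter-clockwise rings."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ring_problems(ring: Sequence[Coord]) -> List[str]:
    """Return a list of basic validity problems for a polygon ring."""
    problems = []
    if len(ring) < 4:
        problems.append("fewer than 4 coordinates")
    if ring and ring[0] != ring[-1]:
        problems.append("ring not closed")
    # Area is only meaningful once the ring is closed and long enough.
    if len(ring) >= 4 and ring[0] == ring[-1] and signed_area(ring) == 0.0:
        problems.append("zero area")
    return problems
```

For example, `ring_problems([(0, 0), (1, 0), (1, 1), (0, 1)])` reports that the ring is not closed, while the same ring with `(0, 0)` repeated at the end passes all three checks.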

Standards & Governance
- Ensured full compliance with ISO 19115:2003, ISO 19107, ISO 19111, OGC API Records, OGC CSW, INSPIRE Metadata Implementing Rules, and DAMA DMBOK
- Implemented PII separation boundaries and DPIA controls in alignment with UK GDPR requirements
- Produced structured data lineage documentation covering sources, transformations, and workflows for all integrated datasets

The Results

What Changed

Platform Impact
- Reduced data discovery time from approximately 1 month to under 10 minutes, a 99%+ reduction in search time for land data professionals across government
- 94% coverage of the 51 mapped programme requirements in the Alpha phase, verified against the CDM/LDM deliverables
- Taxonomy conflicts across 6 ALBs reconciled into a single, auditable classification framework
- 40% false-positive rate in SSSI checks eliminated through systematic geometry and topology QC
- 14% white-space land coverage gap formally documented, enabling targeted data acquisition in the Beta phase
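The two headline percentages can be sanity-checked with simple arithmetic. The sketch below assumes "approximately 1 month" means 30 days of elapsed time, which is an assumption and not a figure from the programme, and it works out which whole number of requirements out of 51 rounds to 94%.

```python
MINUTES_PER_DAY = 24 * 60
ASSUMED_DAYS_PER_MONTH = 30  # assumption: "~1 month" read as 30 elapsed days


def percent_reduction(before: float, after: float) -> float:
    """Percentage decrease from `before` to `after`."""
    return (before - after) / before * 100.0


def counts_rounding_to(percent: int, total: int) -> list:
    """Whole-number counts out of `total` that round to `percent`."""
    return [n for n in range(total + 1) if round(n / total * 100) == percent]


before_minutes = ASSUMED_DAYS_PER_MONTH * MINUTES_PER_DAY  # 43,200 minutes
reduction = percent_reduction(before_minutes, 10)

print(f"Discovery time reduction: {reduction:.2f}%")
print(f"Counts out of 51 that round to 94%: {counts_rounding_to(94, 51)}")
```

Under the 30-day assumption the reduction is about 99.98%, consistent with the "99%+" claim, and 48 of 51 is the only whole-number count that rounds to 94%.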

Technical Outputs
- CDM v4.2 and LDM v3.1 with 9 entities, 114 attributes, delivered to DEFRA Data Modelling Standards v2.3
- 49-element UK GEMINI 2.3 Defra Land Profile, ready for adoption across ALBs
- QFAIR-scored quality framework for all integrated datasets
- Production-ready Python taxonomy discovery pipeline (modular, documented, handover-ready)
- Tiered PDM architecture specification for Hot/Cold/Archive storage

Data discovery time: <10 minutes (was ~1 month)
Requirements coverage: 94% across 51 requirements
ALBs harmonised: 6
Feature rows in scope: 300–500M
GEMINI 2.3 metadata elements: 49
CDM/LDM entities: 9
LDM attributes specified: 114
SSSI false-positive rate resolved: 40%
Land coverage white-space identified: 14%
