The Situation
A property services firm in Zug was spending significant staff time manually extracting data from land registry (Grundbuch) sources and entering it into internal spreadsheets. The process involved cross-referencing parcel numbers, ownership records, area dimensions, encumbrances, and easement data — all of it done by hand, page by page.
The manual process took one experienced staff member approximately 6 hours per day. Error rates from transcription were estimated at 3–5%, which created downstream problems when the firm relied on that data for valuations and client reports. The spreadsheets had grown to an unmanageable size with no consistent structure across different projects.
What I Did
Process mapping. Before writing a single line of code, I spent two days mapping the existing workflow in detail — what data was being collected, from which sources, in what order, and how it was being used downstream. This step identified several data fields that were being collected but never actually used, and several that were needed but frequently missed.
Semi-automated data collection. I built a Python-based tool that connects to the relevant cantonal data sources, extracts the required fields by parcel identifier, and structures the output into a consistent format. The tool handles the repetitive retrieval work; the staff member reviews and confirms each record rather than manually entering it. This keeps human judgment in the loop for cases that require interpretation while removing the mechanical transcription step.
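The review-and-confirm pattern can be sketched as a small loop. This is an illustrative sketch only, not the firm's actual tool: `fetch_parcel` is a hypothetical stand-in for the cantonal data source client, and the field names are assumptions.

```python
def fetch_parcel(parcel_id: str) -> dict:
    # Placeholder for the cantonal data source client (hypothetical here).
    # The real tool retrieves ownership, area, encumbrance and easement fields.
    return {"parcel_id": parcel_id, "owner": "(retrieved)", "area_m2": 0.0}

def collect(parcel_ids, confirm=input):
    """Fetch each parcel record and keep it only after human confirmation.

    `confirm` defaults to interactive input; it is injectable so the loop
    can be driven by a UI or tested without a terminal.
    """
    confirmed = []
    for pid in parcel_ids:
        record = fetch_parcel(pid)
        print(record)  # show the record for the staff member to review
        if confirm(f"Accept record for {pid}? [y/n] ").strip().lower() == "y":
            confirmed.append(record)
    return confirmed
```

Injecting the `confirm` callable is the design point: the mechanical retrieval is automated, but nothing enters the dataset without an explicit human accept.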
Data normalisation and validation. Raw cadastre data comes in inconsistent formats depending on source, canton, and record age. I wrote a normalisation layer that standardises address formats, area units, date formats, and ownership name conventions. A validation step flags records that fall outside expected parameters — for example, area values that differ significantly from neighbouring parcels — for human review before the data enters the firm’s system.
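A minimal sketch of what such a layer looks like, under simplifying assumptions: only two area unit formats are handled, and the outlier threshold is illustrative rather than the firm's actual parameter.

```python
from statistics import median

def normalise_area(value: str) -> float:
    """Convert an area string to square metres.

    Simplified assumption: only hectares and m² appear. Swiss-style
    thousands separators (1'200) and stray spaces are stripped.
    """
    v = value.strip().lower().replace("'", "").replace(",", "").replace(" ", "")
    if v.endswith("ha"):
        return float(v[:-2]) * 10_000   # hectares to square metres
    if v.endswith("m²") or v.endswith("m2"):
        return float(v[:-2])
    return float(v)                     # assume already in m²

def flag_area_outliers(areas: dict, tolerance: float = 3.0) -> list:
    """Flag parcels whose area differs from the group median by more than
    `tolerance`×. These go to human review rather than being rejected."""
    mid = median(areas.values())
    return [pid for pid, a in areas.items()
            if a > mid * tolerance or a < mid / tolerance]
```

The validation step deliberately flags rather than corrects: an unusual area might be a transcription artefact or a genuinely large parcel, and only a human can tell which.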
Database restructuring. The existing spreadsheet system was migrated to a structured SQLite database with a consistent schema. Each project has its own set of records, linked to source data and timestamped. Queries that previously required manual scrolling through a 4,000-row spreadsheet now return results in under a second.
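The shape of the migration target can be sketched with the standard library's `sqlite3` module. The schema below is illustrative; the firm's actual column set is not reproduced here, and the field names are assumptions.

```python
import sqlite3

# Illustrative schema: per-project records, linked to a source reference
# and timestamped on import.
SCHEMA = """
CREATE TABLE IF NOT EXISTS parcels (
    id          INTEGER PRIMARY KEY,
    project     TEXT NOT NULL,
    parcel_no   TEXT NOT NULL,
    owner       TEXT,
    area_m2     REAL,
    source      TEXT,
    imported_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX IF NOT EXISTS idx_parcels_project ON parcels(project);
"""

def open_db(path: str = ":memory:") -> sqlite3.Connection:
    con = sqlite3.connect(path)
    con.executescript(SCHEMA)
    return con

def parcels_for_project(con: sqlite3.Connection, project: str) -> list:
    # An indexed lookup replaces manual scrolling through thousands of rows.
    return con.execute(
        "SELECT parcel_no, owner, area_m2 FROM parcels WHERE project = ?",
        (project,),
    ).fetchall()
```

The index on `project` is what turns the old scroll-through-4,000-rows search into a sub-second query, even as the record count grows.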
Documentation and handover. I wrote operational documentation for the tool — how to run a collection, how to handle flagged records, how to export data for use in client reports. The goal was that any staff member could operate it independently without needing to understand the underlying code.
Results
- Daily data processing time: 6 hours → 1.2 hours (80% reduction)
- Transcription error rate: 3–5% → ~0% (validation catches format issues before entry)
- Data retrieval for queries: manual scrolling → under 1 second
- Consistency across projects: all records now follow a single schema, searchable and exportable
- Staff dependency: tool runs independently; no developer involvement needed for routine operations
The firm now processes significantly more parcels per day with the same staffing, and client reports are generated from clean, structured data rather than manually assembled spreadsheet exports.
What Made the Difference
The key decision was keeping the process semi-automated rather than fully automated. Cadastre data sometimes requires interpretation — a record with an ambiguous ownership entry, a parcel boundary dispute, an easement with non-standard terms. Full automation would either fail on these cases or silently produce incorrect output.
By automating the mechanical retrieval and structuring work while keeping a human confirmation step, the tool is both faster and more reliable than either a fully manual or fully automated approach would be. The staff member’s time is now spent on judgment calls rather than data entry.
The database migration had a secondary benefit: the firm now has a searchable, auditable history of every parcel they’ve processed. Previously, if a client asked about work done two years ago, the answer was in a spreadsheet somewhere. Now it’s a 5-second query.