How Crowdsourced Outage Tracking Works: The Outage.gr Methodology Explained
A detailed explanation of how Outage.gr collects, verifies, and expires community outage reports, integrates DEDDIE scheduled data, and maintains accuracy at national scale.
Crowdsourcing — the practice of gathering information from a distributed community rather than a centralised authority — is a powerful tool for real-time event tracking. But it introduces specific challenges around accuracy, spam, and the freshness of information that centralised systems do not face. This article explains in detail how Outage.gr is designed to address these challenges.
The Core Data Model: Reports, Confirmations, and Comments
Every outage report in the Outage.gr system consists of:
Required fields:
- Utility type (power, water, or internet)
- Geographic location (GPS coordinates from your device or a searched address)
- Timestamp (server-side, not user-provided, to prevent manipulation)
- Anonymous device fingerprint (the UUID that identifies your browser session without identifying you)
Optional fields:
- Provider name
- Damage type (none, electrical appliances, business loss)
What is explicitly excluded: a free-text description field on the report itself. We made this decision deliberately. Free-text fields in anonymous community platforms attract spam, misinformation, and inappropriate content that requires moderation. By limiting reports to structured fields, we maintain data quality at scale without a moderation team.
Each report can receive two types of community engagement:
- **Confirmations ("Me Too"):** A single tap from another community member to indicate they are experiencing the same outage. Each browser session can confirm each report once.
- **Comments:** Free text, submitted by community members, visible in the report's detail view on the map. Comments allow nuance that the structured report fields cannot capture, for example "the lights are back on my street but still out on the main road".
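As a rough illustration, the required and optional report fields listed above could be modelled as a small typed record. This is a minimal sketch, not the actual Outage.gr schema: the class name, field names, and enum values are assumptions chosen for readability, and the `created_at` default only stands in for a timestamp assigned on the server.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4


class UtilityType(Enum):
    POWER = "power"
    WATER = "water"
    INTERNET = "internet"


class DamageType(Enum):
    NONE = "none"
    ELECTRICAL_APPLIANCES = "electrical_appliances"
    BUSINESS_LOSS = "business_loss"


@dataclass(frozen=True)
class OutageReport:
    """Hypothetical report record; names are illustrative only."""

    # Required fields
    utility_type: UtilityType
    latitude: float
    longitude: float
    device_id: UUID  # anonymous browser-session UUID, not a user identity

    # Stand-in for the server-side timestamp: filled in at creation time
    # rather than read from anything the client submits.
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    # Optional fields
    provider_name: Optional[str] = None
    damage_type: Optional[DamageType] = None

    def __post_init__(self) -> None:
        # Reject coordinates that cannot be placed on a map.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")
```

Note that the record has no description field at all, which mirrors the decision to keep reports fully structured.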
The Verification Score and Its Meaning
Each report has a verification score, which is the number of community confirmations it has received. A report submitted by one person with zero confirmations has a score of 0. A report that ten neighbours have confirmed has a score of 10.
The verification score serves as a proxy for the confidence that the report represents a real, widespread outage rather than a local anomaly or a mistaken submission. We use the score in several ways:
Map prominence. Reports with higher verification scores are given greater visual weight on the map.
City statistics. The average restoration time and outage frequency statistics on city pages are computed from reports above a minimum verification threshold, to reduce noise from single-observer events.
Evidence quality. The Evidence page certificate shows verification scores alongside report timestamps, allowing users to present not just their own report but the community confidence level.
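The threshold filter behind the city statistics can be sketched in a few lines. The threshold value, the "at or above" comparison, and all names below are assumptions made for illustration; the article does not state the actual cut-off.

```python
from statistics import mean
from typing import NamedTuple, Optional, Sequence


class ResolvedReport(NamedTuple):
    """Hypothetical summary of a finished report."""
    verification_score: int
    restoration_minutes: float


MIN_SCORE = 2  # assumed threshold, not the real Outage.gr value


def average_restoration_minutes(
    reports: Sequence[ResolvedReport],
    min_score: int = MIN_SCORE,
) -> Optional[float]:
    """Average restoration time over sufficiently confirmed reports.

    Returns None when no report reaches the threshold, so callers can
    show "not enough data" instead of a misleading number.
    """
    durations = [
        r.restoration_minutes
        for r in reports
        if r.verification_score >= min_score
    ]
    return mean(durations) if durations else None
```

With this filter, a single unconfirmed five-minute blip does not drag a city's average down.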
One deliberate design choice: a single browser session can only confirm a report once. This prevents score inflation from a single person repeatedly confirming their own report and ensures that high scores represent genuinely distributed community agreement.
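One way to enforce the once-per-session rule is to store the set of sessions that have confirmed a report and derive the score from its size. The sketch below assumes an in-memory set and a string session identifier; in practice the same guarantee would more likely come from a uniqueness constraint in the database.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ConfirmationTracker:
    """Hypothetical per-report confirmation state."""

    confirmed_by: Set[str] = field(default_factory=set)

    def confirm(self, session_id: str) -> bool:
        """Register a "Me Too" tap.

        Returns True if the confirmation counted, False if this
        session had already confirmed the report.
        """
        if session_id in self.confirmed_by:
            return False
        self.confirmed_by.add(session_id)
        return True

    @property
    def verification_score(self) -> int:
        # The score is the number of distinct confirming sessions.
        return len(self.confirmed_by)
```

Because the score is the size of a set, repeated taps from one session can never raise it.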
The Activity Window: The Self-Cleaning Mechanism
Outage.gr reports have a one-hour activity window. The clock resets every time the report receives a new confirmation or a new comment. If no activity occurs within one hour, the report transitions from "active" to "expired" and disappears from the live map.
This self-cleaning mechanism is the most important design choice in the platform. Without it, the map would accumulate historical reports indefinitely and become unreadable. With it, the map always shows only what is happening right now — or at most in the last hour.
The tradeoff: reports for brief outages that resolve quickly (under an hour) may expire before they receive significant community confirmation. This is an acceptable loss. Our focus is on outages that are ongoing and affecting multiple people simultaneously — the events where community information genuinely helps.
The activity window also naturally handles the transition from "active outage" to "resolved": when power is restored, new confirmations and comments stop arriving, and the report expires one hour after its last activity. Reporters can also explicitly mark their own report as resolved, which moves it immediately to the history archive.
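The activity window described above amounts to a small state rule driven by the time of the last confirmation or comment. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, and treating a report as expired at exactly the one-hour mark is a choice made here, not something the article specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

ACTIVITY_WINDOW = timedelta(hours=1)


@dataclass
class ReportActivity:
    """Hypothetical activity state for one report."""

    last_activity_at: datetime
    resolved: bool = False

    def touch(self, now: datetime) -> None:
        """Record a new confirmation or comment, restarting the window."""
        self.last_activity_at = now

    def mark_resolved(self) -> None:
        """The reporter explicitly closes the report."""
        self.resolved = True

    def status(self, now: datetime) -> str:
        if self.resolved:
            return "resolved"
        if now - self.last_activity_at >= ACTIVITY_WINDOW:
            return "expired"
        return "active"
```

Passing `now` in explicitly, rather than reading the clock inside the method, keeps the rule easy to test and leaves the choice of clock to the caller.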
The DEDDIE Scheduled Data Pipeline
Beyond community reports, Outage.gr integrates official scheduled maintenance data from DEDDIE. This pipeline runs daily and involves three phases:
Discovery: The pipeline identifies all Greek prefectures from DEDDIE's scheduling system at siteapps.deddie.gr. DEDDIE organises scheduled outage data by prefecture and municipality.
Harvesting: For each prefecture, the pipeline harvests the list of municipalities and areas included in upcoming scheduled maintenance windows, including start and end times, area descriptions, and maintenance purposes.
Geocoding: DEDDIE's area descriptions are human-readable text ("Area west of the Kifissos river, including Menidi") rather than geographic coordinates. The pipeline geocodes these descriptions using Nominatim (the OpenStreetMap geocoding service), converting them to latitude/longitude pairs that can be placed on the map and linked to city and area records in our database.
Geocoding DEDDIE's descriptions is the most challenging step — DEDDIE's area naming conventions are not always consistent and sometimes use historical or colloquial place names. Our geocoding approach combines automated geocoding with a manually maintained mapping table that corrects the most common ambiguities.
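The combination of a manual mapping table with automated geocoding can be sketched as a lookup with a fallback. Everything named here is hypothetical: the function names, the normalisation rule, and the decision to consult the manual table first are assumptions. The geocoder is passed in as a plain function so that a Nominatim client, or a stub in tests, can be plugged in without changing the logic.

```python
from typing import Callable, Dict, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)
Geocoder = Callable[[str], Optional[Coordinates]]


def normalise(description: str) -> str:
    """Lower-case and collapse whitespace so table lookups are stable."""
    return " ".join(description.lower().split())


def resolve_area(
    description: str,
    overrides: Dict[str, Coordinates],
    geocode: Geocoder,
) -> Optional[Coordinates]:
    """Resolve an area description to coordinates.

    The manually maintained table wins when it has an entry (its keys
    are expected to be normalised already); otherwise the description
    is handed to the automated geocoder, which may return None.
    """
    key = normalise(description)
    if key in overrides:
        return overrides[key]
    return geocode(description)
```

Checking the table first also avoids a network request for the place names that are known to be ambiguous.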
Spatial Indexing: National Scale Performance
Outage.gr serves queries for outages within a geographic bounding box (for the map), within a radius of a given point (for My Area), and within known city boundaries (for city pages). With tens of thousands of reports in the database, these queries must execute in milliseconds to feel responsive.
We achieve this through PostGIS — the spatial extension for PostgreSQL — combined with spatial indexes on the location columns. PostGIS allows queries like "give me all active outages within 10km of this point" to execute efficiently using geographic indexes rather than scanning every report in the database.
The use of PostgreSQL as the underlying database (via Supabase) also allows us to run the verification score update logic, the one-hour expiry logic, and the statistics queries as database-level functions, which is both more efficient and more consistent than application-level logic.
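To make the radius query concrete, the sketch below computes "all points within 10 km of this point" in plain Python using the haversine great-circle distance. It is only an illustration of what the query returns: it scans every point, which is exactly the work a spatial index avoids, and in PostGIS this kind of predicate is typically expressed with `ST_DWithin`. The function names are hypothetical.

```python
import math
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(
    points: Iterable[Point], centre: Point, radius_km: float
) -> List[Point]:
    """Linear scan: keep the points no farther than radius_km from centre."""
    lat0, lon0 = centre
    return [
        p for p in points
        if haversine_km(lat0, lon0, p[0], p[1]) <= radius_km
    ]
```

The cost of this scan grows with the total number of reports, whereas an indexed query only needs to examine candidates near the search point.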
Transparency and Limitations
We publish this methodology because we believe transparency makes the data more trustworthy, not less. Knowing how the data was collected and what its limitations are allows you to use it appropriately.
Key limitations to understand:
Urban bias. Our data overrepresents urban areas where app adoption is higher and underrepresents rural areas and older demographics. Outage frequency in rural communities is likely higher than our per-capita figures suggest.
Single-reporter noise. Reports with zero confirmations may represent false positives: brief power blips, outages at a neighbour's property only, or incorrect location tagging. City statistics and analyses that filter for confirmed reports are more reliable than raw report counts.
Report type is user-selected. We cannot verify from the data alone whether a reported "power" outage is actually a power outage or a consequence of a water pump outage, for example. We rely on reporters to select the appropriate utility type.
Resolution bias. The one-hour expiry means that outages shorter than one hour may not accumulate confirmations before expiring. Very brief events are systematically underrepresented relative to longer outages.
These limitations are inherent to crowdsourced data. They do not invalidate the data — they define its appropriate uses and the appropriate level of confidence to place in specific data points.