Methodology · Open

How the data is sourced, cleaned, and scored

No black box. SkyMind unifies fragmented government statistics for 845 regions and builds one transparent, re-weightable composite index. Every source is cited and every component is inspectable. If something doesn't add up, email us; we welcome the scrutiny.

Last updated 14 May 2026 · ~7 min read

1. What we measure, and what we don't

SkyMind takes government statistics that are normally scattered across dozens of national portals, in several languages, with inconsistent schemas and sometimes broken APIs, and unifies them into one clean, comparable dataset for 845 regions across five countries. On top of that data, we compute a transparent composite index that summarises each region's economic, demographic and social profile into comparable scores.

This is a descriptive product. It shows the measured state of a region from official data. It is not a forecast, not a probability of any event, and not a rating of the future. A score that moves over time reflects a change in the underlying published statistics, nothing more and nothing less.

2. The data & sources

Every figure traces back to an official, public source. We ingest no personal data: Zero Personal Data Architecture, GDPR public-interest basis (Article 6).

| Country | Regions | Coverage | Sources |
| 🇩🇪 Germany | 401 Kreise | 2015–2026 | Eurostat, Destatis, INKAR/BBSR |
| 🇮🇱 Israel | 255 municipalities | 2002–2018 core | data.gov.il, CBS, audited financial reports |
| 🇦🇪 UAE | 109 districts | 2015–2026 | Dubai Land Department, Bayanat.ae, Bayut, World Bank |
| 🇸🇦 Saudi Arabia | 51 governorates | 2015–2025 | GASTAT, KAPSARC, RCRC, World Bank |
| 🇶🇦 Qatar | 29 municipalities | 2015–2025 | data.gov.qa, World Bank |

Total: ~1,100 metrics, ~2 million observations. Coverage and depth are not uniform, and we say so explicitly. Israel, for example, has a deep, fully-populated core for 2002–2018; later years are partial because the underlying government datasets thin out. Where the source data is thin, the data is thin. We do not paper over it.

3. How the composite index is built

The index is a standard, transparent composite, the same family of method as the UN Human Development Index or city liveability rankings. Three steps:

  1. Normalise each metric to a 0–100 scale (min–max across all regions and years for that metric).
  2. Group metrics into three axes and average the normalised metrics within each: Economic, Demographic, Social / Infrastructure.
  3. Combine into a composite: 40% Economic + 30% Demographic + 30% Social/Infrastructure.
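
The three steps can be sketched in a few lines of Python. This is a minimal illustration, not SkyMind's actual code: the function names, the axis keys and the toy numbers are all invented for the example, and raising an error on a constant metric is an assumption, since the text does not say how that case is handled.

```python
from statistics import mean

# Default weights from the text; the axis keys are illustrative names.
WEIGHTS = {"economic": 0.40, "demographic": 0.30, "social_infrastructure": 0.30}


def min_max_0_100(observations):
    """Step 1: scale one metric to 0-100 over all (region, year) observations."""
    lo, hi = min(observations.values()), max(observations.values())
    if hi == lo:
        # Assumption: a constant metric cannot be min-max scaled, so refuse it.
        raise ValueError("constant metric: min-max scaling is undefined")
    return {key: 100.0 * (value - lo) / (hi - lo) for key, value in observations.items()}


def axis_score(normalised_values):
    """Step 2: average the normalised metrics that belong to one axis."""
    return mean(normalised_values)


def composite(axis_scores, weights=WEIGHTS):
    """Step 3: weighted sum of the three axis scores."""
    return sum(weights[axis] * axis_scores[axis] for axis in weights)


# Toy data, invented for illustration only.
income = {("A", 2020): 20.0, ("A", 2021): 30.0, ("B", 2021): 40.0}
normalised_income = min_max_0_100(income)

region_a_2021 = {
    "economic": axis_score([normalised_income[("A", 2021)], 70.0]),
    "demographic": axis_score([40.0]),
    "social_infrastructure": axis_score([80.0]),
}
print(round(composite(region_a_2021), 2))
```

Note that the minimum and maximum are taken over every region and year for a metric, so a region's normalised value depends on the whole dataset, not only on its own history.
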
⚠ This is a modelling choice, and we treat it as one

Which metric belongs to which axis, and the 40/30/30 weighting, are deliberate choices, not laws of nature. Change the weights and the ranking changes. That is why the index is re-weightable: the axis scores and every underlying metric are exposed in the API, so you can apply your own weights and judge the result yourself.
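
To make the point about weights concrete, here is a small sketch with two hypothetical regions whose order flips when the weights change. The region names, axis keys and scores are invented; only the 40/30/30 default comes from the text.

```python
DEFAULT = {"economic": 0.40, "demographic": 0.30, "social_infrastructure": 0.30}
CUSTOM = {"economic": 0.20, "demographic": 0.40, "social_infrastructure": 0.40}

# Hypothetical axis scores on the 0-100 scale, for illustration only.
REGIONS = {
    "Region X": {"economic": 90.0, "demographic": 40.0, "social_infrastructure": 50.0},
    "Region Y": {"economic": 50.0, "demographic": 70.0, "social_infrastructure": 65.0},
}


def composite(axis_scores, weights):
    """Weighted sum of axis scores; weights are expected to sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[axis] * axis_scores[axis] for axis in weights)


def rank(regions, weights):
    """Region names ordered from highest to lowest composite."""
    return sorted(regions, key=lambda name: composite(regions[name], weights), reverse=True)


print(rank(REGIONS, DEFAULT))  # Region X leads under 40/30/30
print(rank(REGIONS, CUSTOM))   # Region Y leads under 20/40/40
```
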

4. Data integrity: no fabricated values

Scores are re-derived directly from the raw fact tables. Two rules make the dataset honest:

This means our coverage looks smaller than a "fill every cell" approach would, by design. An empty cell is more useful than a fabricated one.

5. Honest limitations

6. FAQ

Isn't this just aggregating CSV files?

The aggregation is the hard part, and we don't pretend otherwise. Government data for these five countries is fragmented across dozens of portals, in multiple languages, with inconsistent identifiers and frequently broken APIs. Cleaning it, geocoding it, translating metric names, reconciling schemas and keeping it current is months of unglamorous work, and it is genuinely hard to reproduce. That labour, and the coverage breadth, is the product. We are not selling a model on top of it.

Can I reproduce a single region's score?

Yes. The API exposes every normalised metric, every axis score and the composite. Pick any region from /map/data, apply the normalisation and the 40/30/30 weights by hand, and you will match within rounding. No login, no API key (rate-limited).
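
A by-hand check of the weighting step might look like the sketch below. It makes no network call, and the record layout (`axes`, `composite`) is a hypothetical stand-in, because the response schema of /map/data is not described here. The tolerance of 0.05 assumes the published composite is rounded to one decimal place, which is also an assumption.

```python
WEIGHTS = {"economic": 0.40, "demographic": 0.30, "social_infrastructure": 0.30}


def recompute_composite(axes, weights=WEIGHTS):
    """Apply the 40/30/30 weights to a region's three axis scores."""
    return sum(weights[name] * axes[name] for name in weights)


def matches_published(record, tolerance=0.05):
    """True if the recomputed composite agrees with the published one within rounding."""
    return abs(recompute_composite(record["axes"]) - record["composite"]) <= tolerance


# Hypothetical record with invented numbers; not real SkyMind data
# and not the real /map/data field names.
record = {
    "region": "Example region",
    "axes": {"economic": 61.3, "demographic": 47.8, "social_infrastructure": 55.1},
    "composite": 55.4,
}
print(matches_published(record))
```

The same pattern extends one level down: recompute each axis score from its normalised metrics and compare it with the published axis score before checking the composite.
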

Why these particular weights?

40/30/30 is a transparent default in line with how comparable composite indices balance economic and non-economic factors. It is not sacred. The API returns the axis scores separately precisely so you can re-weight, drop an axis, or build your own composite.
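
Dropping an axis can be sketched as follows. Rescaling the remaining weights so they sum to 1 is an assumption made here to keep the result on the 0–100 scale; the helper names, axis keys and scores are illustrative.

```python
DEFAULT_WEIGHTS = {"economic": 0.40, "demographic": 0.30, "social_infrastructure": 0.30}


def reweight(weights, drop=()):
    """Remove the dropped axes and rescale the remaining weights to sum to 1."""
    kept = {axis: w for axis, w in weights.items() if axis not in drop}
    total = sum(kept.values())
    if total <= 0:
        raise ValueError("at least one axis with positive weight must remain")
    return {axis: w / total for axis, w in kept.items()}


def composite(axis_scores, weights):
    """Weighted sum over whichever axes the weights cover."""
    return sum(w * axis_scores[axis] for axis, w in weights.items())


# Hypothetical axis scores for one region, for illustration only.
axes = {"economic": 70.0, "demographic": 40.0, "social_infrastructure": 56.0}

economic_social = reweight(DEFAULT_WEIGHTS, drop={"demographic"})
print(economic_social)
print(round(composite(axes, economic_social), 2))
```
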

Do you predict prices, crises or migration?

No. We made a deliberate decision not to be in the prediction business. SkyMind shows what the official data currently says about a region. Any forecasting on top of that is the user's call, with their own assumptions.


If you find an error in the data, the sourcing or the index construction, however small, email us at info@sky-mind.com. We post methodological corrections publicly with credit to whoever found them.