d4c-service-main-site/content/_index.md
2026-02-06 08:39:23 -05:00


title: Welcome to Data for Canada
toc: false

Mission

Data for Canada exists to bridge the gap between open data availability and data usability. We curate, clean, and re-engineer high-value Canadian datasets into high-performance, analysis-ready formats for researchers, developers, and systems.

The Problem

Canada produces vast amounts of open data, from foundational road networks to federal census statistics and orthoimagery. However, these datasets are often locked in legacy formats, scattered across fragmented portals, or structured in ways that take significant engineering effort to normalize. For a researcher or system developer, data preparation is often the bottleneck in "time-to-insight".

The Solution

We act as the transformation layer. We aggregate datasets with permissive licenses and process them into "digestible" standards optimized for modern downstream applications.

  • For Data Engineers, Researchers/Scientists, and Developers: Skip the cleaning phase. Access normalized, documented data ready for analysis.
  • For Systems: Standardized data structures designed to feed directly into pipelines, data warehouses, and downstream services.

Our Stewardship: Data for Canada takes ownership of the datasets we create, from start to finish. We ensure that data structures remain consistent, allowing for reliable analysis across time and space.
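
As an illustration of what keeping structures consistent could mean in practice, the sketch below compares the column layout of two releases of a dataset and reports any drift. The column names, types, and the `schema_drift` helper are hypothetical and are not taken from any Data for Canada pipeline.

```python
from typing import Dict, List

# Hypothetical column layouts for two releases of the same dataset.
# The names and types are illustrative only, not a published schema.
RELEASE_A: Dict[str, str] = {
    "geo_id": "string",
    "population": "int64",
    "geometry": "binary",
}
RELEASE_B: Dict[str, str] = {
    "geo_id": "string",
    "population": "float64",
    "area_km2": "float64",
    "geometry": "binary",
}


def schema_drift(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Report columns added, removed, or changed in type between two releases."""
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "retyped": sorted(col for col in shared if old[col] != new[col]),
    }


drift = schema_drift(RELEASE_A, RELEASE_B)
print(drift)
```

A check along these lines could run before each release, so that an unintended column change is caught before it reaches downstream pipelines.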

What Guides Us

We prioritize our work on utilitarian grounds, aiming to do the greatest good for the greatest number of people, though we remain open to making exceptions where necessary.

Our approach is informed by the following:

High-Level Overview

flowchart TD
    subgraph ds [Data Sources]
        Statistical@{ shape: lean-l}
        Foundation@{ shape: lean-l}
        Orthoimagery@{ shape: lean-l}
        FieldImagery@{ shape: lean-l, label: "Field Imagery"}
        EnvironmentClimate@{ shape: lean-l, label: "Environmental & Climate"}
        Elevation@{ shape: lean-l}
        WebCorpus@{ shape: lean-l, label: "Web Corpus"}
    end

    subgraph pp [Processing Pipeline]
        Raw@{ shape: rect, label: "Raw Data Ingestion"}
        Transform@{ shape: rect, label: "Transform and Optimize"}
    end

    subgraph df [Dissemination Formats]
        Parquet@{ shape: lean-l}
        FlatGeoBuf@{ shape: lean-l}
        MVT@{ shape: lean-l}
        MLT@{ shape: lean-l}
        PMTiles@{ shape: lean-l}
        COG@{ shape: lean-l}
        Zarr@{ shape: lean-l}
        WebP@{ shape: lean-l}
        JPEGXL@{ shape: lean-l, label: "JPEG XL"}
        AV1@{ shape: lean-l, label: "AV1"}
    end

    subgraph di [Distribution Infrastructure]
        ObjectStorage@{ shape: bow-rect, label: "Object Storage"}
        Metadata@{ shape: rect}
        HTTP@{ shape: rect, label: "Static Files"}
        DecentralizedDistribution@{ shape: rect, label: "Decentralized Distribution"}
    end

    subgraph ei [Experimental Infrastructure]
        GeoSpatialServices@{ shape: rect, label: "Geospatial Services"}
        %%Martin@{ shape: rect}
        %%GeoServer@{ shape: rect}
        %%ZOOProject@{ shape: rect, label: "ZOO-Project"}
        %%BBOXServer@{ shape: rect, label: "BBOX Server"}
        %%Panoramax@{ shape: rect}
        %%Pelias@{ shape: rect}
    end

    subgraph "Consumption"
        DataSci@{ shape: rect, label: "Researchers & Developers"}
        Systems@{ shape: rect, label: "Systems"}
    end

    %% Relationships
    Statistical a1@--> Raw
    a1@{animate: true, animation: slow}
    Foundation a2@--> Raw
    a2@{animate: true, animation: slow}
    Orthoimagery a3@--> Raw
    a3@{animate: true, animation: slow}
    FieldImagery a4@--> Raw
    a4@{animate: true, animation: fast}
    EnvironmentClimate a5@--> Raw
    a5@{animate: true, animation: fast}
    Elevation a6@--> Raw
    a6@{animate: true, animation: slow}
    WebCorpus a7@--> Raw
    a7@{animate: true, animation: fast}
    Raw a8@--> Transform
    a8@{animate: true, animation: slow}
    Transform a9@--> df
    a9@{animate: true, animation: slow}
    Parquet a10@--> FlatGeoBuf
    a10@{animate: true, animation: slow}
    FlatGeoBuf a11@--> MVT
    a11@{animate: true, animation: slow}
    FlatGeoBuf a91@--> MLT
    a91@{animate: true, animation: slow}
    MVT a90@--> PMTiles
    a90@{animate: true, animation: slow}
    MLT a92@--> PMTiles
    a92@{animate: true, animation: slow}
    Zarr a12@--> WebP
    a12@{animate: true, animation: slow}
    df a13@--> di
    a13@{animate: true, animation: slow}
    COG a14@--> WebP
    a14@{animate: true, animation: slow}
    WebP a93@--> PMTiles
    a93@{animate: true, animation: slow}
    ObjectStorage a15@--> Metadata
    a15@{animate: true, animation: slow}
    Metadata a16@--> HTTP
    a16@{animate: true, animation: slow}
    HTTP a17@--> ei
    a17@{animate: true, animation: slow}
    HTTP a18@--> DecentralizedDistribution
    a18@{animate: true, animation: slow}
    HTTP a19@--> DataSci
    a19@{animate: true, animation: slow}
    DecentralizedDistribution a20@--> Systems
    a20@{animate: true, animation: fast}
    DecentralizedDistribution a21@--> DataSci
    a21@{animate: true, animation: fast}
    Systems a22@--> DataSci
    a22@{animate: true, animation: fast}
    ei a23@--> DataSci
    a23@{animate: true, animation: slow}

    %% URLs
    click Foundation "https://github.com/dataforcanada/process-foundation-labs/" _blank
    click Statistical "https://github.com/dataforcanada/process-statistical-labs/" _blank
    click Orthoimagery "https://github.com/dataforcanada/process-orthoimagery-labs/" _blank
    click FieldImagery "https://github.com/dataforcanada/process-field-imagery-labs/" _blank
    click EnvironmentClimate "https://github.com/dataforcanada/process-environmental-climate-labs/" _blank
    click Elevation "https://github.com/dataforcanada/process-elevation-labs/" _blank
    click WebCorpus "https://github.com/dataforcanada/process-web-corpus-labs/" _blank

    click Parquet "https://github.com/apache/parquet-format/" _blank
    click FlatGeoBuf "https://flatgeobuf.org/" _blank
    click MVT "https://github.com/mapbox/vector-tile-spec/" _blank
    click MLT "https://github.com/maplibre/maplibre-tile-spec/" _blank
    click COG "https://cogeo.org/" _blank
    click Zarr "https://github.com/zarr-developers/geozarr-spec/" _blank
    click WebP "https://developers.google.com/speed/webp/" _blank
    click PMTiles "https://github.com/protomaps/PMTiles/blob/main/spec/v3/spec.md" _blank
    click JPEGXL "https://jpeg.org/jpegxl/" _blank
    click AV1 "https://aomedia.org/specifications/av1/" _blank
    click DecentralizedDistribution "https://www.dataforcanada.org/docs/dissemination/" _blank
    click Metadata "https://stac-utils.github.io/stac-geoparquet/latest/spec/stac-geoparquet-spec/" _blank
    click GeoSpatialServices "https://github.com/dataforcanada/geo-services-labs/" _blank
    click Martin "https://martin.maplibre.org/" _blank
    click GeoServer "https://geoserver.org/" _blank
    click ZOOProject "https://zoo-project.org/" _blank
    click BBOXServer "https://www.bbox.earth/" _blank
    click Panoramax "https://gitlab.com/panoramax" _blank
    click Pelias "https://pelias.io" _blank
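
The format-to-format arrows inside the Dissemination Formats group can be read as a small directed graph. The sketch below copies those edges into a dictionary and lists every conversion chain that ends in PMTiles. The `paths_to` helper is hypothetical and only illustrates how to read the diagram, not how the processing pipeline is implemented.

```python
from typing import Dict, List

# Edges copied from the format-to-format relationships in the flowchart.
CONVERSIONS: Dict[str, List[str]] = {
    "Parquet": ["FlatGeoBuf"],
    "FlatGeoBuf": ["MVT", "MLT"],
    "MVT": ["PMTiles"],
    "MLT": ["PMTiles"],
    "Zarr": ["WebP"],
    "COG": ["WebP"],
    "WebP": ["PMTiles"],
}


def paths_to(target: str, start: str, graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return every path from start to target; assumes the graph has no cycles."""
    if start == target:
        return [[start]]
    found: List[List[str]] = []
    for nxt in graph.get(start, []):
        for tail in paths_to(target, nxt, graph):
            found.append([start] + tail)
    return found


for source in ("Parquet", "Zarr", "COG"):
    for path in paths_to("PMTiles", source, CONVERSIONS):
        print(" -> ".join(path))
```

Formats that have no outgoing arrow in the diagram, such as JPEG XL and AV1, simply yield no chain here.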

Get Involved: We Are Looking for Members

We are actively looking for new members to help shape this project.

Right now, we primarily need feedback on our datasets and the underlying processes used to generate them. If you have thoughts on data quality, format optimization, or pipeline improvements, we want to hear from you.

How to Contribute

  • Discussions: Head over to #dataforcanada:matrix.org to chat, or go to the individual process GitHub repos to comment on specific issues.

License

This project is licensed under the MIT License.