14 KiB
title, weight, next, sidebar
| title | weight | next | sidebar | ||
|---|---|---|---|---|---|
| File Naming Convention | 2 | /docs/d4c-pkgs |
|
Data for Canada File Naming Convention (D4C-FNC)
{{< callout type="important">}} We are open to feedback on the current file naming convention. {{< /callout >}}
Background
{{< callout type="info">}} You will need to understand these concepts to fully grasp the file naming convention. {{< /callout >}}
See Statistics Canada's geographic hiearchy and use the Census of Population 2021 Dictionary to understand their conceptual model of representing Canada.
1. The Current Schema
All published datasets must adhere to the following structure to ensure files are machine-parsable, sortable by region, and identifiable by human readers. This file naming convention will be modified as we solidify our processes.
{{% details title="File Naming Convention" closed="true" %}}
Syntax
[iso-region]_[data-source-name]-[DGUID]_[data-pkg]_[iso-date]_[variant]_[version].[extension]
Component Breakdown
| Segment | Definition | Format / Rules | Example |
|---|---|---|---|
| 1. ISO Region | The ISO 3166-1 alpha-2, which is a two-letter country code, or ISO 3166-2 which identifies the principal subdivisions (ex. provinces, states, etc.) | Lowercase. Hyphenated. | ca, ca-ab |
| _ | Separator | Underscore | |
| 2. Data Source Name and DGUID | Data Source Name and DGUID. | Use [data-source-name] for the data source and the [DGUID] for the geographic area it covers. |
city-of-edmonton-2023A00054811061, statcan-2021A000011124 |
| _ | Separator | Underscore | |
| 3. Data Package | The package name for the dataset (see High-Level Overview). | Lowercase. d4c-datapkg-orthoimagery |
|
| _ | Separator | Underscore | |
| 4. ISO Date | The vintage of the data source. | ISO 8601. Flexible precision. | 2023, 2023-06, 2023-06-01, 2026-02-11T19:50:58 |
| _ | Separator | Underscore | |
| 5. Variant | Resolution or specific subset info. | No Projections. Alphanumeric. Units included. | 075mm, 30cm |
| _ | Separator | Underscore | |
| 6. Version | Semantic Versioning. | v[Major].[Minor].[Patch] |
v0.0.1 |
| {{% /details %}} |
2. Component Detail
{{% details title="Details" closed="true" %}}
A. Source / Location ID (Flexible)
This segment defines the "Who" of the dataset.
- Use the Data Source Name + Hyphen + DGUID.
- Example:
city-of-edmonton-2023A00054811061
- Example:
B. The DGUID (Capitalization Exception)
If using a DGUID (Dissemination Geography Unique Identifier), you must adhere to Statistics Canada standards.
- Link: Statistics Canada: DGUID Definition
- Rule: While the rest of the filename is lowercase, you must capitalize the structural type letter (e.g., 'A' for Administrative areas, 'S' for Statistical areas) within the DGUID.
- Example:
2021A0005...(Correct) vs2021a0005...(Incorrect).
C. ISO Date Flexibility
Dates follow strictly ISO 8601, but the precision can vary based on the nature of the data (Year, Month, or Day).
- Learn More: Wikipedia: ISO 8601 Date and Time Format
D. Variant
This field is strictly for resolution (e.g., 075mm, 1m) or content subsets.
- Rule: Do not include projection information (e.g.,
EPSG:3857,NAD83) in the filename. - Reasoning: Projection details are handled exclusively in the file format metadata or the accompanying FAIR Data Catalogue item.
E. Semantic Versioning
We use SemVer (vMAJOR.MINOR.PATCH) to track changes to datasets.
- Link: SemVer.org
| Component | Logic for Data | Example Scenario |
|---|---|---|
| MAJOR | Breaking Change. The schema changed, columns were renamed/removed, or the meaning of the data changed significantly. Old code will break. | v0.0.1 → v1.0.0(Renamed column geo_id to dguid) |
| MINOR | New Feature (Non-Breaking). New columns were added, or coverage was expanded, but old columns remain. Old code still runs. | v0.0.1 → v0.1.0(Added a population_density column) |
| PATCH | Bug Fix. Incorrect data values were fixed, but the schema (columns) is identical. | v0.0.1 → v0.0.2(Fixed typo in metadata or coordinate precision) |
{{% /details %}}
3. Helper Tools
{{% details title="Statistics Canada Geography Search" closed="true" %}} {{< callout type="warning" >}} In the tool below, you can click on each individual DGUID to see their associated geography. {{< /callout >}}
To accurately populate the DGUID segment of the schema, use this tool to find 2021 Census geographies and their corresponding DGUIDs.
- Tool URL: https://statcan-geography.labs.dataforcanada.org/
- Source Code: GitHub Repository
- Usage: Enter a city or region name to retrieve the correct colloquial name and DGUID pairing (e.g., searching "Ottawa" returns
2021A00053506008). {{% /details %}}