Update README files

Diego Ripley
2025-06-06 11:20:41 +00:00
parent ce999a3a4d
commit 2350d6d8d7
6 changed files with 8 additions and 64 deletions
+8 -6
@@ -12,16 +12,15 @@ All output datasets are written in GeoParquet format to support modern geospatia
This project processes the following datasets:
- **Geographic Boundaries** (2001-2021)
- **Road Network Files** (2001-2021)
- **Health Regions** (2003-2023)
- **National Address Register** (2022-2024)
- **Census of Population** (2001-2021)
- **Census of Agriculture** (2001-2021)
- **National Household Survey** (2011-2016)
- **Census of Agriculture** (2001-2021)
- **National Address Register** (2022-2024)
- **Road Network Files** (2001-2021)
## How to Run
This project uses a **Dev Container** environment for setup and execution:
This project uses a Dev Container environment for setup and execution. If you are using VS Code, all you need is Docker and the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers) installed on your system.
```shell
# Clone the repository
@@ -29,10 +28,13 @@ git clone https://github.com/dataforcanada/process-statcan-data.git
# Navigate to the project directory
cd process-statcan-data
# In Dev Container
./main.sh
```
## License
This product is distributed under an MIT license.
[Back to top](#top)
[Back to top](#top)
-46
@@ -1,46 +0,0 @@
# TODO
- Process 2023 Federal Electoral Districts
- For `load.sh`
- Finish processing 2001 data
- For `country.sql`
- Create `country_2001` from 2001 geometries. Need to finish `load.sh`
- Add English abbreviation for all years
- Add French abbreviation for all years
- For `geographic_regions_of_canada.sql`
- Add other years (2016, 2011, 2006, 2001)
- Add English GRC abbreviation
- Add French GRC abbreviation
- According to Statistics Canada, the Territories DGUID should be `2021A00016`: https://www150.statcan.gc.ca/n1/en/geo?geotext=Territories%20%5BRegion%5D&geocode=A00016
- According to the link above, British Columbia DGUID should be `2021A00015`
- For `er_2021`, split `er_name` into English and French components. Some records are separated by `/`
- South Coast--Burin Peninsula / Côte-sud--Burin Peninsula
- West Coast--Northern Peninsula--Labrador / Côte-ouest--Northern Peninsula--Labrador
- Prince Edward Island / Île-du-Prince-Édouard
- For `cma_2021`, split `cma_name` into English and French components. Some records are separated by `/`
- Greater Sudbury / Grand Sudbury
- Ottawa - Gatineau (Ontario part / partie de l'Ontario)
- For `ccs_2021`, split `ccs_name` into English and French components. Some records are separated by `/`
- West Nipissing / Nipissing Ouest
- French River / Rivière des Français
- Greater Sudbury / Grand Sudbury
- The Nation / La Nation
- For `csd_2021`, split `csd_name` into English and French components. Some records are separated by `/`
- The Nation / La Nation
- West Nipissing / Nipissing Ouest
- Greater Sudbury / Grand Sudbury
- Beaubassin East / Beaubassin-est
- For `csd_2021`, figure out which level of geography `sac_code` and `sac_type` belong to so I can name them appropriately
- For `pop_ctr_2021`, split `pop_ctr_name` into English and French components. There's one record that is separated by `/`
- Grand Falls / Grand-Sault
- For `dpl_2021`, split `dpl_name` into English and French components. Some records are separated by `/`
- Saint Irénée and Alderwood / Saint Irénée et Alderwood
- `Sainte-Anne-de-Kent part B / partie B`: this one would need to be split into `Sainte-Anne-de-Kent part B` and `Sainte-Anne-de-Kent partie B`
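
The name-splitting items above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual implementation: the function name `split_bilingual_name` and the `OVERRIDES` table are hypothetical, and the sketch assumes the separator is always ` / ` with surrounding spaces. Names where the `/` sits inside parentheses, such as `Ottawa - Gatineau (Ontario part / partie de l'Ontario)`, are not handled and would need their own rule.

```python
# Hypothetical manual overrides for names where only part of the name is
# translated, so a plain split would give an incomplete French name.
OVERRIDES = {
    "Sainte-Anne-de-Kent part B / partie B": (
        "Sainte-Anne-de-Kent part B",
        "Sainte-Anne-de-Kent partie B",
    ),
}


def split_bilingual_name(name: str) -> tuple[str, str]:
    """Return (english, french) for a name of the form 'English / French'.

    Names without ' / ' are assumed to be the same in both languages.
    """
    if name in OVERRIDES:
        return OVERRIDES[name]
    # partition splits on the first ' / ' only
    english, separator, french = name.partition(" / ")
    if not separator:
        return name, name
    return english.strip(), french.strip()


for raw in (
    "Greater Sudbury / Grand Sudbury",
    "Grand Falls / Grand-Sault",
    "Sainte-Anne-de-Kent part B / partie B",
):
    print(split_bilingual_name(raw))
```

Keeping the exceptions in a small lookup table, rather than encoding them in the splitting logic, is one way to keep the common case simple while still making the special cases explicit.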
-1
@@ -1 +0,0 @@
- See the email I sent to Statistics Canada titled "Reporting Mistakes in Census of Agriculture: Data Linked to Geographic Boundaries" for mistakes in the Census of Agriculture data
-5
@@ -1,5 +0,0 @@
# TODO
- Get download links for 2001 and 2006 Census
- For `process_2021.ipynb`
- Finish processing CMA
- Finish processing HR and Local health integration networks
-2
@@ -1,2 +0,0 @@
# TODO
- Process 2021 hydro
-4
@@ -1,4 +0,0 @@
# TODO
- Process 2006 Geographic Attribute File Road Network
- Process 2001 road network
- Refactor the loading of Census Road Network files into a function