d4c-datapkg-statistical

mirror of https://github.com/dataforcanada/d4c-datapkg-statistical.git synced 2026-06-13 14:10:55 +02:00

Author	SHA1	Message	Date
Diego Ripley	a55e1d325d	Made changes	2025-06-26 13:35:37 +00:00
Diego Ripley	4ed5fb4bbb	Add DuckDB example for duplicate column name	2025-06-25 15:38:12 +00:00
Diego Ripley	b71a7b326e	DuckDB issue with duplicate column names (e.g. 'Value' and 'VALUE' are treated the same)	2025-06-25 15:30:36 +00:00
Diego Ripley	e929850d4a	Finish comment on issue with Value and VALUE columns being treated the same by DuckDB	2025-06-21 18:03:29 +00:00
Diego Ripley	8875722d10	Made changes to processing of data tables	2025-06-21 18:01:16 +00:00
Diego Ripley	7c8211cb5f	Found some issues with the output parquet files	2025-06-21 05:26:50 +00:00
Diego Ripley	887291d2f7	Read all DGUIDs from subset parquet output (100,000 records each)	2025-06-21 00:54:26 -04:00
Diego Ripley	72ca6c87e1	Made changes	2025-06-20 17:32:01 -04:00
Diego Ripley	5a95616b3c	Calculate CSV file size by viewing inside of zip file	2025-06-20 16:01:20 -04:00
Diego Ripley	e836363cd1	Had to optimize the code. Leaving it outside of a function for now in case I need to continue working on it	2025-06-20 16:00:51 -04:00
Diego Ripley	f6d88c5fd0	Continue work on processing data tables	2025-06-19 15:58:30 -04:00
Diego Ripley	ab8f40c708	Keeping track of processed files in case processing crashes and I have to restart again	2025-06-19 11:46:31 -04:00
Diego Ripley	faa63451ab	Experiment with Jupyter notebook on downloading and processing statcan cubes	2025-06-18 21:26:51 +00:00
Diego Ripley	c0899080f4	Remove scratch files after processing. Was running out of space	2025-06-18 09:26:18 -04:00
Diego Ripley	ea603f2914	Convert statcan CSV into parquet	2025-06-17 20:46:24 +00:00

15 Commits