Data source | Dataset | Processing script |
---|---|---|
National Emissions Inventory | epa_nei_smoke_ff.RDS | data-raw/epa_nei_smoke_ff.R |
EQUATES | equates_cmas_mn_wi.RDS | data-raw/epa_equates_read.R |
Air Emissions Modeling | onroad_mn_wi.RDS | data-raw/epa_air_emissions_modeling_onroad.R |
3 Data sources
3.1 EPA emissions data
The EPA releases various emissions estimates as part of several programs and initiatives.
All datasets are compiled from Sparse Matrix Operator Kernel Emissions (SMOKE) Flat File 10 (FF10) formatted data downloaded from the EPA website. SMOKE FF10 is a standardized format regularly released by the EPA for NEI, EQUATES, and Air Emissions Modeling platforms (CMAS 2024, sec. 2.2.3).
SMOKE FF10 files were processed using read_smoke_ff10(), which reads in the raw data, records relevant metadata, filters to only include relevant counties and pollutants, and saves an intermediary dataset. These intermediary datasets are read back in, combined, and saved.
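The read, record, and filter steps can be sketched as follows. This is a minimal illustration in Python, not the project's R implementation of read_smoke_ff10(). The sample file contents, the `#KEY=value` header layout, the column names (`region_cd`, `poll`, `ann_value`), and the function name `read_ff10_sketch` are all assumptions made for the example.

```python
import csv
import io

# Tiny stand-in for a SMOKE FF10 file. The "#" header lines and the column
# names used here are assumptions for illustration only.
SAMPLE_FF10 = """#FORMAT=FF10_ONROAD
#YEAR=2020
country_cd,region_cd,scc,poll,ann_value
US,27053,2201210080,CO2,1200.5
US,27053,2201210080,NOX,3.2
US,55109,2201210080,CO2,310.0
US,06037,2201210080,CO2,9999.0
"""


def read_ff10_sketch(text, counties, pollutants):
    """Split header metadata from data rows, then keep only relevant rows."""
    metadata, data_lines = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            # Record header metadata as key/value pairs.
            key, _, value = line[1:].partition("=")
            metadata[key.strip()] = value.strip()
        elif line.strip():
            data_lines.append(line)

    # Keep only the requested counties and pollutants, and tag each row
    # with the year recorded in the header.
    rows = [
        {**row, "ann_value": float(row["ann_value"]), "year": metadata.get("YEAR")}
        for row in csv.DictReader(io.StringIO("\n".join(data_lines)))
        if row["region_cd"] in counties and row["poll"] in pollutants
    ]
    return metadata, rows


metadata, rows = read_ff10_sketch(SAMPLE_FF10, {"27053", "55109"}, {"CO2"})
```

In this sketch the filtered rows would then be saved as an intermediary dataset, one per input file, before being combined.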
SMOKE FF10 data were aggregated to include all MOVES processes for on- and off-network vehicle operation, including running, starting, and idling exhaust; tire and brake wear; evaporative permeation; fuel leaks; fuel vapor venting; and crankcase exhaust (CMAS 2024, sec. 2.7.4.9). 1
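As a rough illustration of this aggregation, the Python sketch below sums emissions over the process dimension while keeping vehicle type, fuel type, and pollutant. The records, labels, and values are hypothetical, and the project's own processing is done in R.

```python
from collections import defaultdict

# Hypothetical records:
# (vehicle_type, fuel_type, pollutant, process, emissions).
records = [
    ("Passenger", "Gasoline", "CO2", "RPD", 100.0),
    ("Passenger", "Gasoline", "CO2", "RPV", 5.0),
    ("Passenger", "Gasoline", "CO2", "RPHO", 1.5),
    ("Heavy-duty", "Diesel", "CO2", "RPD", 40.0),
    ("Heavy-duty", "Diesel", "CO2", "RPH", 2.0),
]

totals = defaultdict(float)
for vehicle, fuel, pollutant, _process, value in records:
    # Drop the process dimension by summing over it.
    totals[(vehicle, fuel, pollutant)] += value
```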
Direct URLs and download information are available in the EPA downloads guide.
Each data source and year uses a different MOVES edition. These are listed in Table 3.2.
Data source | MOVES edition | Years |
---|---|---|
Pollutant availability varies by data source and year.
Data source | Years | Pollutants |
---|---|---|
Pollutant descriptions
Pollutant | Pollutant code | Description |
---|---|---|
Vehicle types
Vehicle weight label | Vehicle types |
---|---|
Fuel types. Note that not all vehicle type/fuel type/year combinations are available.
Vehicle weight label | Fuel types |
---|---|
3.1.1 National Emissions Inventory
The National Emissions Inventory (NEI) is a comprehensive and detailed estimate of air emissions of criteria pollutants, criteria precursors, and hazardous air pollutants from air emissions sources. The county-level GHG emissions included in the NEI for this category are calculated by running the MOVES model with State-, Local-, and Tribal-submitted activity data and EPA-developed activity inputs based on data from FHWA and other sources (USEPA 2023b).
NEI data were pulled using the EnviroFacts API and processed in R scripts: epa_nei.R and epa_nei_envirofacts.R.
NEI SMOKE FF10 data are processed in epa_nei_smoke_ff.R.
NEI on-road regional summaries are processed in epa_nei_onroad_emissions.R.
Ultimately, NEI data used in the Metropolitan Council inventory were compiled from SMOKE FF10 for year 2020.
Verification and validation
NEI data were cross-verified by comparing county-level emissions totals compiled from NEI EnviroFacts, NEI regional data summaries, and compiled SMOKE FF10 data.
epa_verify_nei_envirofacts_smoke.R found that data compiled from SMOKE FF10 and regional summaries aligned exactly for year 2020 and closely for other years. Similarly, data compiled from EnviroFacts also aligned closely with SMOKE FF10 and regional summaries.
Data published on the EPA website are subject to change at any time. Every effort was taken to align versions, model runs, and other opportunities for differentiation.
3.1.2 EQUATES
EQUATES (EPA’s Air QUAlity TimE Series) is a set of modeled emissions and supporting data developed by EPA scientists spanning years 2002 to 2019. EQUATES is particularly useful in that it uses modern source classification codes (SCCs) to provide a continuous time series (K. M. Foley et al. 2023).
Between the 2008 and 2011 NEI releases, the EPA completed major changes to their source classification codes (SCCs), which made it impossible to directly compare years 2008 and earlier with years 2011 and later.
EQUATES is based on the 2017 NEI and uses MOVES3 (K. M. Foley et al. 2023).
EQUATES data are available for every year from 2002 through 2019.
EQUATES SMOKE FF10 data are processed in epa_equates_read.R.
Verification and validation
EQUATES datasets are available from both the EPA file transfer site and the CMAS Data Warehouse Google Drive. Individual file names and file contents were identical across the two locations.
Limitations
In addition to limitations described in Section 3.1.4, EQUATES has its own set of limitations.
- EQUATES does not contain emissions estimates for N2O (nitrous oxide) for years 2002-2017. N2O was added to the EPA Emissions Modeling Framework (EMF) after EQUATES was compiled. N2O does not affect air quality monitoring and so was not included in older emissions work (K. Foley, Eyth, and Allen 2024). When compared with the NEI and Air Emissions Modeling, including N2O in total CO2e resulted in a maximum difference of around 3% for some counties and years. See epa_verify_n2o_differences.R for more detail.
- EQUATES includes only on-road emission sources.
3.1.3 Air Emissions Modeling Platforms
The EPA continually works on emissions inventories for various projects.
Air Emissions Modeling data are available for several years, but only years 2021 and 2022 are used in the final inventory.
Both the 2021 and 2022 estimates are based on the 2020 NEI (USEPA 2024).
Air Emissions Modeling SMOKE FF10 data are processed in epa_air_emissions_modeling_onroad.R.
Verification and validation
Air Emissions Modeling data are only available from a single consistent website, and so verification across locations was not necessary.
Limitations
In addition to limitations described in Section 3.1.4, Air Emissions Modeling has its own set of limitations.
- Air Emissions Modeling datasets are in active development and subject to change.
3.1.4 Consistent EPA data limitations
- The NEI, EQUATES, and Air Emissions Modeling platforms are based on MOVES, which does not account for activity on local roads.
- NEI, EQUATES, and Air Emissions Modeling use different MOVES editions (see Table 3.2), which may result in discrepancies between years.
- To reduce run times, the EPA uses fuel months to represent summer and winter fuels. The month of January represents October through April (winter), while July represents May through September (summer) (USEPA 2023a, sec. 5.6.6.2). Variation within the summer and winter months is not accounted for using this method.
- The 2020 NEI had particular challenges due to the COVID-19 pandemic.
  - Minnesota did not submit custom data inputs for the 2020 NEI, meaning that inputs to MOVES were based on national default values. Wisconsin submitted custom data for VMT, vehicle population, and road type distribution. Both Minnesota and Wisconsin submitted data for 2017, 2014, and 2011 (USEPA 2015).
  - The NEI augmented vehicle miles traveled (VMT) data for Minnesota and Wisconsin in 2020 using federal and state-level datasets due to data availability issues (USEPA, Godfrey, and Eyth 2022).
- To reduce model run-time, the EPA groups counties together and only runs MOVES on a single representative county. The resulting MOVES emissions factors are multiplied by county-specific activity data (including VMT, vehicle population, hourly speed distribution, among others) to get county-specific emissions (USEPA 2023a, sec. 5.6.2.1). Effectively, emissions factors are generated on a single representative county, and are then applied to similar counties.
- Though nitrous oxide (N2O) has a high global warming potential (Section A.2), the amount of N2O released is relatively small when compared to other sectors. N2O is unavailable in EQUATES, except for years 2018 and 2019.
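The representative-county approach described in the limitations above can be illustrated with a short Python sketch. Emissions factors come from a single MOVES run per representative county and are multiplied by each member county's own activity. All county names, factors, and VMT values here are hypothetical, and only one activity measure (VMT) is shown for simplicity.

```python
# Hypothetical emissions factors (grams CO2e per mile), one per
# representative county. Values are illustrative only.
ef_by_rep_county = {"rep_A": 400.0, "rep_B": 450.0}

# Each county borrows the factors of its representative county.
rep_county_for = {"county_1": "rep_A", "county_2": "rep_A", "county_3": "rep_B"}

# County-specific activity data; here only annual VMT, in miles.
vmt_by_county = {"county_1": 1_000_000, "county_2": 250_000, "county_3": 500_000}

# County emissions = borrowed emissions factor x county-specific activity.
emissions_grams = {
    county: ef_by_rep_county[rep_county_for[county]] * vmt
    for county, vmt in vmt_by_county.items()
}
```

Counties 1 and 2 share an emissions factor in this sketch, so any difference in their emissions comes entirely from their activity data.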
3.2 State DOT data
As required by federal law, the Minnesota and Wisconsin state departments of transportation (MnDOT and WisDOT) report traffic measures for planning, forecasting, and other analyses.
3.2.1 Vehicle miles traveled
Vehicle miles traveled (VMT) is a standardized measure created by multiplying average annual daily traffic (AADT) by centerline miles. AADT is an estimate of the total vehicles on a road segment on any given day of the year in all directions of travel. VMT and AADT are common traffic measures and standardized across the United States.
MnDOT and WisDOT derive VMT using traffic counts from continuous and short-term traffic monitoring sites. These raw counts are adjusted by multiplying by seasonal, day-of-week, and axle adjustment factors (WisDOT 2023). Data are not collected for every site every year, but the data are sufficient for year-over-year comparisons.
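The arithmetic behind these definitions can be sketched as below. This is a simplified Python illustration with hypothetical counts and factor values, not either agency's actual procedure.

```python
def estimate_aadt(raw_count, seasonal, day_of_week, axle):
    """Adjust a raw count by multiplying the three adjustment factors."""
    return raw_count * seasonal * day_of_week * axle


def segment_vmt(aadt, centerline_miles):
    """VMT for one road segment: AADT times centerline miles.

    Because AADT is a daily average, this product is a daily figure.
    """
    return aadt * centerline_miles


# Hypothetical short-term count and adjustment factors.
aadt = estimate_aadt(raw_count=10_000, seasonal=0.95, day_of_week=1.02, axle=0.5)
vmt = segment_vmt(aadt, centerline_miles=2.0)
```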
County vehicle miles traveled
We consider county-level data to be of the highest quality and most reliable measure of VMT.
These data were compiled from MnDOT and WisDOT county-level reports. MnDOT provides Excel workbooks with VMT by county and route system on their website. These were downloaded, filtered to include the relevant counties, and aggregated to the county level by summing VMT across route systems. Processing code can be found in mndot_vmt_county.R.
VMT data for 2015 were interpolated at the county and year level using the midpoint method, which takes the average of the observations directly before and directly after the missing data point. MnDOT VMT for year 2015 is unavailable due to significant and fundamental changes in the underlying data structure that make directly comparing data before and after 2015 inappropriate. Our interpolation is based on the county-level summary of all VMT and is used for comparison purposes only.
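A minimal Python sketch of the midpoint method, using hypothetical VMT values for a single county:

```python
def midpoint_fill(series, missing_year):
    """Fill one missing year with the mean of its two neighboring years."""
    before = series[missing_year - 1]
    after = series[missing_year + 1]
    filled = dict(series)  # leave the input untouched
    filled[missing_year] = (before + after) / 2
    return filled


# Hypothetical annual VMT for one county, with 2015 missing.
county_vmt = {2014: 1000.0, 2016: 1100.0}
filled = midpoint_fill(county_vmt, 2015)
```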
WisDOT publishes PDF tables with county-level VMT. These were downloaded and data were extracted using {tabulapdf}, an R package interfacing with the Tabula PDF extractor library. Processing code can be found in wisdot_vmt_county.R.
City vehicle miles traveled
City VMT is available only for cities, townships, and unorganized areas (CTUs) in the core 7-county metro area.
These data were compiled from MnDOT city and route system reports available on their website. Reports were downloaded and aggregated at the CTU level by summing VMT across all route systems. Processing code can be found in mndot_vmt_ctu.R. Due to limitations in data consistency, the MnDOT CTU dataset was subset to only include CTUs with reliable data. Reliable CTUs are defined as follows:
- CTU has a complete time series from 2014-2023 with no missing years. 2015 data were interpolated in the same manner as the county VMT data.
- CTU has sampled data on local route systems (including Municipal State Aid Streets) during any year from 2017-2023. Read more about route system designations in mndot_route_system.R.
Only 145 cities were considered reliable for analysis. All other CTUs were modeled. See Section 3.4.1 for more information.
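The reliability screen can be sketched as a simple check, shown here in Python. The sketch assumes that both criteria must hold for a CTU to count as reliable and that the 2015 interpolation has already been applied; the function name and inputs are hypothetical.

```python
REQUIRED_YEARS = set(range(2014, 2024))      # 2014-2023 inclusive
LOCAL_SAMPLE_YEARS = set(range(2017, 2024))  # 2017-2023 inclusive


def is_reliable(years_with_vmt, years_with_local_samples):
    """Return True when a CTU meets both reliability criteria.

    Assumption: the criteria are combined with a logical AND.
    """
    # Criterion 1: no missing years in the 2014-2023 time series.
    complete = REQUIRED_YEARS <= set(years_with_vmt)
    # Criterion 2: sampled local route data in at least one year, 2017-2023.
    sampled = bool(LOCAL_SAMPLE_YEARS & set(years_with_local_samples))
    return complete and sampled


full_series = range(2014, 2024)
gappy_series = [y for y in range(2014, 2024) if y != 2018]

reliable = is_reliable(full_series, [2019])
missing_year = is_reliable(gappy_series, [2019])
no_recent_local_sample = is_reliable(full_series, [2016])
```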
3.2.2 Limitations
- AADT/VMT data rely on modeling, and not every site will have new observed data every year.
- AADT/VMT are generally estimated for high-use arterial roads and highways, leaving most local roads out.
- We may want to consider using non-permanent counters and/or counters from just outside the study region to increase the total number of calibration roads.
- City VMT
  - Due to geographic data source differences, MnDOT reports a small amount of VMT for invalid CTU/county combinations (e.g., Minneapolis, a Hennepin County CTU, with centerline miles and VMT reported in Anoka County). We discussed these anomalies with MnDOT staff and determined this to be a non-issue. The county designations for each CTU were corrected such that summing by CTU name gives the total VMT for each CTU. No changes to county designation were made to CTUs known to be split across multiple counties (Blaine, Chanhassen, Hastings, Saint Anthony, Shorewood, Spring Lake Park and White Bear Lake).
  - Shoreview, Blaine, and West Saint Paul are split among more than one county. For some CTU/county/year combinations, only data from 2016 onward were available. For consistency in the time series, we assigned 2016 VMT data to year 2015 for these CTU/county combinations.
3.3 Regional Travel Demand Model
VMT forecasts for counties and cities are generated from our regional travel demand model. We use the most recent available model runs, concurrent with TPP amendments.
The current regional travel demand forecast model (TourCast) is an activity-based model. It simulates transportation decisions made by individuals, ranging from long-term decisions (e.g., regular work/school location, whether to own an automobile) to day-level decisions (e.g., what activities to engage in, with whom, where, and when) to trip-level decisions (e.g., what transportation mode to use, what route to take), in order to evaluate policy and investment choices at a high level of detail.
Model inputs include:
- Current population, employment, and other demographic characteristics
- Demographics from Council long-range forecasts
- Road networks based on all projects programmed through year 2025. The projects generally include
  - any project that has a change in capacity (number of lanes) or major interchanges
  - any regionally significant project
  - long-range capital projects
The base-year model outputs best represent 2025 and forecast out to year 2050.
3.3.1 Calculating VMT from the model network
The regional travel demand model network is made up of nodes and links (segments). We use the network link-level information to calculate VMT.
The network-based approach attributes all the vehicle traffic that occurs within a given city or county to that city or county, regardless of where the trip starts or ends. VMT is calculated by multiplying link vehicle volume (vehicles) by link length (miles). Network links are attributed to cities by a spatial join. When a link crosses more than one city boundary, the link is split at the boundary. The link total volume is attributed to each sub-link, and the link length is re-calculated for each sub-link. Thus, no volume is lost. All time periods and road types are aggregated to represent average daily vehicle miles traveled. Daily VMT are expanded to annual VMT using an annualization factor of 340 (Castigliego et al. 2019).
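The link-level calculation can be sketched as follows. This Python illustration starts from links that have already been split at city boundaries and spatially joined to cities. The city names, volumes, and lengths are hypothetical, and the actual RTDM processing code is internal.

```python
ANNUALIZATION_FACTOR = 340  # daily-to-annual factor cited in the text

# Hypothetical sub-links after splitting at city boundaries:
# (city, daily vehicle volume, sub-link length in miles).
sub_links = [
    ("City A", 12_000, 0.6),
    ("City B", 12_000, 0.4),  # other part of the same split link, same volume
    ("City A", 3_000, 1.5),
]

# Daily VMT per city: sum of volume x length over that city's sub-links.
daily_vmt = {}
for city, volume, miles in sub_links:
    daily_vmt[city] = daily_vmt.get(city, 0.0) + volume * miles

# Expand daily VMT to annual VMT.
annual_vmt = {city: vmt * ANNUALIZATION_FACTOR for city, vmt in daily_vmt.items()}
```

The split link carries its full volume on both sub-links while its length is divided between them, so the two cities' VMT on that link adds up to the VMT of the unsplit link.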
The 2025 base-year includes projects programmed to be built in 2025.
Code for processing RTDM outputs relies on internal file system access and is not available publicly. Please contact us for more information and reproducible examples.
VMT forecasts were completed for all counties and for cities within the 7-county metro area.
3.3.2 Emissions forecast with MOVES
Emissions forecasts for our region were calculated using the EPA’s Motor Vehicle Emissions Simulator (MOVES) (USEPA 2023c). MOVES calculates emissions using outputs from the Regional Travel Demand Model, Minnesota Department of Vehicle Services’ county vehicle registration data, and the Minnesota Pollution Control Agency’s vehicle age distribution. Each of these inputs helps the model estimate the characteristics of vehicles on the road in our region and expected changes in the regional fleet. The model takes into account differences in fuel economy (miles per gallon) depending on a vehicle’s age and size, as well as its fuel intake (diesel or gasoline).
Gross emissions rates in grams CO2e per vehicle mile traveled are provided in Appendix A.
3.3.3 Limitations
- The regional travel demand model, by definition, is built to function at a regional level. Scaling down to geographies smaller than counties might impact model outputs.
- The model outputs a base year estimate (2025) and future year estimate (2050). All intermediary years (2026-2049) are interpolated linearly between the two points.
- Truck VMT forecasts are generally weaker than passenger VMT forecasts. The Council will improve our truck VMT forecasting methodology in coming years.
- To better homogenize data sources, we assigned the model base-year (2025) estimate to year 2023 and did not use MnDOT VMT estimates from 2023 or 2024. Further research and validation to improve the regional travel demand model is in progress.
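The linear interpolation of intermediary years noted in the limitations above can be written as a short Python sketch. The VMT values are hypothetical; only the 2025 and 2050 endpoints come from the text.

```python
def interpolate_linear(year, base_year, base_value, horizon_year, horizon_value):
    """Linearly interpolate a value for a year between two model years."""
    fraction = (year - base_year) / (horizon_year - base_year)
    return base_value + fraction * (horizon_value - base_value)


# Hypothetical VMT (in millions) at the 2025 base year and 2050 horizon year.
vmt_2030 = interpolate_linear(2030, 2025, 100.0, 2050, 150.0)
vmt_by_year = {
    year: interpolate_linear(year, 2025, 100.0, 2050, 150.0)
    for year in range(2026, 2050)
}
```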
3.4 Original data analysis
3.4.1 Gap-filling city vehicle miles traveled
VMT for unreliable CTUs was modeled. We designed our model such that the total VMT of all CTUs within a county would equal the MnDOT-reported county VMT. Cities that cross multiple counties were modeled at the county-CTU (COCTU) level.
 | Years modeled | Cities |
---|---|---|
Anoka County | ||
Township | 2010 - 2022 | Linwood |
Carver County | ||
Township | 2010 - 2022 | Benton, Camden, Dahlgren, Hancock, Hollywood, Laketown, San Francisco, Waconia, Watertown, Young America |
Dakota County | ||
City | 2010 - 2022 | Empire |
Township | 2010 - 2022 | Castle Rock, Douglas, Eureka, Greenvale, Hampton, Marshan, Nininger, Randolph, Ravenna, Sciota, Vermillion, Waterford |
Hennepin County | ||
Unorganized territory | 2010 - 2022 | Fort Snelling |
Ramsey County | ||
City | 2010 - 2014 | Blaine |
Township | 2010 - 2022 | White Bear |
Scott County | ||
City | 2010 - 2020 | Credit River |
Township | 2010 - 2022 | Belle Plaine, Blakeley, Cedar Lake, Helena, Jackson, Louisville, New Market, Saint Lawrence, Sand Creek, Spring Lake |
Washington County | ||
Township | 2010 - 2022 | Baytown, Denmark, Grey Cloud Island, May, Stillwater, West Lakeland |
For cities without reported annual MnDOT VMT (mostly townships and small cities under 5,000 people, see Table 3.7), we estimated VMT from 2010 to 2022 using a mixed-effects model incorporating historical and forecast city population, households, and employment, plus county and Imagine 2050 community designation. City-level VMT forecasts from the Regional Travel Demand Model (RTDM) were used to ensure a smooth rate of change over time and alignment with regional forecasts.
The model was trained on a subset of COCTUs, ensuring that all community types and counties were included.
Once the COCTU VMT model was finalized and observed VMT data were incorporated into the dataset, we applied a scaling factor to predicted COCTU VMT so that the total of all COCTUs within a county would sum to the county total VMT.
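One way to express this scaling step is sketched below in Python. The sketch assumes the factor is applied only to predicted values and is chosen so that observed plus scaled predicted VMT equals the county total. The function name and all values are hypothetical; the actual modeling is in mndot_vmt_ctu_gap_fill_model.R.

```python
def scale_predictions(county_total, observed, predicted):
    """Scale predicted COCTU VMT so all COCTUs sum to the county total.

    Assumption: observed values are kept as reported and only
    predicted values are rescaled.
    """
    remaining = county_total - sum(observed.values())
    factor = remaining / sum(predicted.values())
    scaled = {name: value * factor for name, value in predicted.items()}
    return factor, {**observed, **scaled}


# Hypothetical county with two reported COCTUs and two modeled COCTUs.
factor, final_vmt = scale_predictions(
    county_total=1000.0,
    observed={"City A": 700.0, "City B": 100.0},
    predicted={"Township C": 150.0, "Township D": 100.0},
)
```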
Modeling took place in mndot_vmt_ctu_gap_fill_model.R.
The final COCTU VMT estimates are a combination of modeled values and reported MnDOT values.
Model validation
To cross-check our assumptions, we can compare the relative amount of VMT not accounted for when we sum up the reported city VMT for each county.
All counties have the majority of their VMT accounted for when totaling up reported city-level VMT. Carver, Scott, and Dakota counties have the largest gaps. We refer to the difference between reported county VMT and summed city VMT as the marginal VMT.
We can then pull out our modeled CTUs and find the proportion of county marginal VMT that is accounted for by each CTU. For example, 100% of Hennepin County’s marginal VMT is attributed to Fort Snelling UT, which contains MSP Airport.
We can also check the correlation between city population and predicted VMT. The correlation is positive, with a few exceptions. Fort Snelling UT has a very low population but is a major job and activity center in our region, so it has high VMT.
As more data become available, we will re-evaluate this framework and modeling approach.
1. All six MOVES emissions processes (rate per distance (RPD), rate per vehicle (RPV), rate per hour (RPH), rate per profile (RPP), rate per start (RPS), and rate per hour for off-network idling (RPHO)) were summed for each vehicle type, fuel type, and pollutant (Beidler and Eyth 2024).