| Detailed Mode Value | Detailed Mode Label | Mode Type Value | Mode Type Label |
|---|---|---|---|
| 1 | Northstar | 1 | Rail |
| 2 | Light rail (e.g., Blue Line, Green Line) | 1 | Rail |
| 3 | Other rail | 1 | Rail |
| 4 | School bus | 2 | School Bus |
| 5 | Bus rapid transit (e.g., A Line, C Line, Red Line) | 3 | Public Bus |
| 6 | Express/commuter bus | 3 | Public Bus |
| 7 | Local bus | 3 | Public Bus |
| 8 | Dial-A-Ride (e.g., Transit Link) | 3 | Public Bus |
| 9 | Metro Mobility | 3 | Public Bus |
| 10 | SouthWest Prime or MVTA Connect | 3 | Public Bus |
| 11 | Employer-provided shuttle/bus | 4 | Other Bus |
| 12 | University/college shuttle/bus | 4 | Other Bus |
| 13 | Other private shuttle/bus (e.g., a hotel's, an airport's) | 4 | Other Bus |
| 14 | Vanpool | 4 | Other Bus |
| 15 | Other bus | 4 | Other Bus |
| 16 | Intercity rail (e.g., Amtrak) | 5 | Long distance passenger mode |
| 17 | Intercity bus (e.g., Greyhound, Jefferson Lines) | 5 | Long distance passenger mode |
| 18 | Airplane/helicopter | 5 | Long distance passenger mode |
| 19 | Uber, Lyft, or other smartphone-app ride service | 6 | Smartphone ridehailing service |
| 20 | Regular taxi (e.g., Yellow Cab) | 7 | For-Hire Vehicle |
| 21 | Other hired car service (e.g., black car, limo) | 7 | For-Hire Vehicle |
| 22 | Household vehicle 1 | 8 | Household Vehicle |
| 23 | Household vehicle 2 | 8 | Household Vehicle |
| 24 | Household vehicle 3 | 8 | Household Vehicle |
| 25 | Household vehicle 4 | 8 | Household Vehicle |
| 26 | Household vehicle 5 | 8 | Household Vehicle |
| 27 | Household vehicle 6 | 8 | Household Vehicle |
| 28 | Household vehicle 7 | 8 | Household Vehicle |
| 29 | Household vehicle 8 | 8 | Household Vehicle |
| 30 | Other vehicle in household | 8 | Household Vehicle |
| 31 | Other motorcycle in household | 9 | Other Vehicle |
| 32 | Other motorcycle (not my household's) | 9 | Other Vehicle |
| 33 | Car from work | 9 | Other Vehicle |
| 34 | Friend/relative/colleague's car | 9 | Other Vehicle |
| 35 | Rental car | 9 | Other Vehicle |
| 36 | Carpool match (e.g., Waze Carpool) | 9 | Other Vehicle |
| 37 | Carshare service (e.g., Zipcar) | 9 | Other Vehicle |
| 38 | Peer-to-peer car rental (e.g., Turo) | 9 | Other Vehicle |
| 39 | Other vehicle (not my household's) | 9 | Other Vehicle |
| 61 | Electric vehicle carshare (e.g., Evie) | 9 | Other Vehicle |
| 40 | Electric bicycle (my household's) | 10 | Micromobility |
| 41 | Standard bicycle (my household's) | 10 | Micromobility |
| 42 | Borrowed bicycle (e.g., a friend's) | 10 | Micromobility |
| 43 | Bike-share - standard bicycle | 10 | Micromobility |
| 44 | Bike-share - electric bicycle | 10 | Micromobility |
| 45 | Other rented bicycle | 10 | Micromobility |
| 46 | Personal scooter or moped (not shared) | 10 | Micromobility |
| 47 | Scooter-share (e.g., Bird, Lime) | 10 | Micromobility |
| 48 | Moped-share (e.g., Scoot) | 10 | Micromobility |
| 49 | Segway | 10 | Micromobility |
| 50 | Other scooter or moped | 10 | Micromobility |
| 51 | Skateboard or rollerblade | 10 | Micromobility |
| 52 | Other boat (e.g., kayak) | 11 | Other |
| 53 | Vehicle ferry (took vehicle on board) | 11 | Other |
| 54 | Other public ferry or water taxi | 11 | Other |
| 55 | Golf cart | 11 | Other |
| 56 | Snowmobile | 11 | Other |
| 57 | ATV | 11 | Other |
| 58 | Medical transportation service | 11 | Other |
| 59 | Other | 11 | Other |
| 60 | Walk (or jog/wheelchair) | 12 | Walk |
Note: This mode type hierarchy table contains the values from the 2023 dataset; some names for mode types have changed slightly since the 2019 and 2021 surveys. For more information, consult the combined codebook.
Dataset Preparation, Quality Assurance, and Quality Control
Overview
RSG conducted dataset preparation and quality control procedures at every stage of the study (before, during, and after data collection). These procedures were designed to validate survey logic, review participant experience, and confirm consistent data coding in the survey database. The following sections summarize the various dataset preparation and quality control steps. RSG provided a separate QAQC Plan to the Met Council for each wave of survey collection; these plans include data cleaning details for key elements.
Database Setup and Real-Time Quality Controls
Prior to each survey launch, RSG and the Met Council reviewed the survey instruments to ensure that the survey interface was clear and easy to use, questions were understandable, and variables were written to the database as expected. To reduce survey burden and improve final data quality, the survey also included real-time data checks and logic. Examples of these checks include the following:
- Validation logic to prevent skipped questions.
- Logic checks to hide irrelevant questions and answers (e.g., employment questions for children).
- Spatial and temporal checks within trip rosters to prevent overlapping trips.
These real-time data checks do not eliminate every inconsistency, but they do significantly reduce reporting errors and re-coding requirements after data collection.
Geographic Data Checks
During data collection, the survey instruments used the Bing Maps API to geocode the coordinates for reported home, work, school, and trip addresses.
Following data collection, RSG also coded home location points to block groups and broader regional definitions.
Trip Derivation for Nonparticipating Household Members
Household travel surveys require data for all household members to assess complete household travel patterns. However, some exceptions are allowed in the data collection process where travel can be reported by proxy, particularly for children.
Household adults were asked to report travel for the children in the household (under age 18). Participants could also report children of all ages as travel party members on their own trips. RSG used these records to derive diary records for children under age 18.
Completion Criteria
The last step of dataset preparation involved reviewing all data records to confirm that they met survey, travel day, and household completion criteria. “Complete” households met the following conditions:
- The household completed the online recruitment/demographic survey.
- All ABS household members provided complete travel diary information (i.e., answered all surveys and reported all trips). Online panel members provided complete travel diary information for themselves (person 1 in the household).
- The household reported a home address within the study region.
In 2023, outreach segment households were marked as incomplete because they did not meet the first two criteria: outreach participants completed the survey for themselves, but did not report complete information for their household.
Imputation
Departure Time
In some cases, the rMove™ app may have detected the start of a trip after its true start time, which can yield invalid or extreme values for trip duration and speed. In these cases, the fields depart_date, depart_hour, and depart_minute were adjusted for “late pickup” conditions using the following approach:
- Departure time was imputed using the median speed between all locations along the trip, excluding the origin point, and the distance between the origin and the next point on the trip. For trips with fewer than three recorded locations, imputed departure time is set three minutes earlier than the original departure time to compensate for rMove’s 3-5-minute ping interval. Note that some trips that are the result of split loop trips may only have three or fewer points but will use the imputed depart time from before the loop trip was split and thus may not be included in this rule.
- If the imputed departure time overlaps with the previous trip’s arrival time, the previous trip’s arrival time was instead used as the departure time. Regardless of the number of locations along a trip, if the imputed departure time was later than the initially reported departure time, the imputed departure time is set to the original departure time. User-added trips as well as long-distance passenger mode trips are also set to the original departure time: user-added trips are not subject to “late pickup” conditions, and long-distance passenger modes are often plane trips where all collected traces contain speed information from other modes and thus are less reliable (as rMove™ cannot collect locations when a phone is in “airplane mode”).
Duration and speed are calculated based on the imputed departure time.
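The departure time rules above can be sketched in a few lines of Python. This is an illustrative reading, not RSG's implementation: the function names, the `(timestamp, lat, lon)` point layout, the `keep_original` flag (standing in for user-added and long-distance passenger mode trips), and the choice to back-calculate from the second point's timestamp are all assumptions.

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt
from statistics import median


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(h))


def impute_departure(points, original_depart, prev_arrival=None, keep_original=False):
    """points: list of (timestamp, lat, lon) tuples, origin first (assumed layout)."""
    if keep_original:
        # User-added and long-distance passenger mode trips keep the original time.
        return original_depart
    if len(points) < 3:
        # Too few locations: shift three minutes earlier.
        imputed = original_depart - timedelta(minutes=3)
    else:
        # Speeds (m/s) between consecutive points, excluding the origin point.
        speeds = []
        for (t0, la0, lo0), (t1, la1, lo1) in zip(points[1:], points[2:]):
            secs = (t1 - t0).total_seconds()
            if secs > 0:
                speeds.append(haversine_m((la0, lo0), (la1, lo1)) / secs)
        med = median(speeds) if speeds else 0.0
        if med <= 0:
            imputed = original_depart
        else:
            # Time needed to cover origin -> next point at the median speed.
            gap_m = haversine_m(points[0][1:], points[1][1:])
            imputed = points[1][0] - timedelta(seconds=gap_m / med)
    # Never overlap the previous trip's arrival.
    if prev_arrival is not None and imputed < prev_arrival:
        imputed = prev_arrival
    # Never move the departure later than originally reported.
    if imputed > original_depart:
        imputed = original_depart
    return imputed
```

Trip duration and speed would then be recomputed from the returned value rather than from the originally recorded departure time.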
Purpose
Respondents report the purpose of the trip destination in each trip survey. The origin purpose is derived from the destination purpose of the previous trip, except for the first trip in the travel period or where an rMove™ trip occurs after a trip with item non-response. For the first trip in the travel period, the origin purpose can be inferred from begin_day in the day table.
When purpose was not asked because an analyst split a user-reported trip during data cleaning (creating a new destination along a trip), purpose values are derived where possible based on proximity (within 150 meters) to estimated home, work, or school locations. If the location is not proximate to home, work, or school locations, the purpose is set to other.
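As a rough illustration of the 150-meter proximity rule, the sketch below assigns a purpose to a split-trip destination. The function names, the dictionary layout for habitual locations, and the home, work, school checking order when more than one location is within range are assumptions, and the coordinates in the usage example are arbitrary.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371000 * asin(sqrt(h))


def derive_purpose(dest, habitual, max_m=150.0):
    """dest: (lat, lon); habitual: e.g. {"home": (lat, lon), "work": (lat, lon)}."""
    # Assumed tie-break order when several habitual locations are within range.
    for label in ("home", "work", "school"):
        loc = habitual.get(label)
        if loc is not None and haversine_m(dest[0], dest[1], loc[0], loc[1]) <= max_m:
            return label
    return "other"


# Arbitrary example coordinates: about 111 m and about 222 m north of "home".
habitual = {"home": (44.9778, -93.2650)}
near = derive_purpose((44.9788, -93.2650), habitual)
far = derive_purpose((44.9798, -93.2650), habitual)
```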
The purpose category variables (o_purpose_category, d_purpose_category) contain aggregated purpose values based on the type of purpose at the origin/destination of each trip. Dataset users are welcome to perform their own recoding of the purpose categories as well.
Trip purposes have been imputed in cases where a purpose reported by the user is assumed to be inaccurate based on information about that person’s reported habitual locations and other trips (primarily to home, work, and school locations). The trip purpose imputation approach was applied to all rMove™ trips in person-days with at least 1 complete trip and no more than 10 incomplete trips. (“Incomplete” trips are trips for which the respondent did not answer the trip-specific survey questions about purpose, mode, etc.)
The approach was to apply various tests in logical sequence to trips for which the stated purpose is not consistent with the location type based on the reported habitual locations. In general terms, the tests were designed to:
- Check the respondent’s reported destination purpose when it conflicts with the destination location type. (The details of the tests depend on the trip purpose, with different criteria used for change-mode trips, escort trips, linked transit trips, trips with home destinations but other reported purposes, etc.)
- Identify cases where respondents swapped the order of two or more trips when reporting their details.
- Identify cases where respondents may have omitted a trip and shifted remaining reported trip details by one trip when reporting the rest of their trips.
- Fill in missing data by sampling destination purposes from other trips made to the same locations, either by the same respondent or by other respondents.
Mode type (mode_type)
Mode_type synthesizes mode_1 to mode_3 down to a single, easier-to-use variable for analytical purposes (so that data users can avoid always referencing all modes on a multimodal trip). Table 2.1 shows the full crosswalk of which detailed modes correspond to which mode_types in the 2023 data. Higher values of mode_type are prioritized over lower mode_type values in the derivation. For example, transit trips, with mode_type 13, are prioritized over walk trips, with mode_type 1. When transit trips were unlinked using the Google API during cleaning, the non-transit legs of the trip were recoded using Google’s suggested mode (most frequently “walk” or “bike”) and do not have a reported mode_1, mode_2, or mode_3.
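The prioritization rule can be illustrated with a minimal sketch. It assumes each of mode_1 to mode_3 has already been mapped to a mode_type value, and it uses only the two example codes named in the text above (walk = 1, transit = 13); the function name is hypothetical.

```python
def derive_mode_type(leg_mode_types):
    """Return the highest-valued (highest-priority) mode_type among reported legs."""
    reported = [m for m in leg_mode_types if m is not None]
    return max(reported) if reported else None


WALK, TRANSIT = 1, 13  # example codes taken from the text above

# A walk-transit-walk trip resolves to transit.
example = derive_mode_type([WALK, TRANSIT, WALK])
```

Legs recoded from Google’s suggested mode have no reported mode_1 to mode_3, so in this sketch they would simply arrive as `None` entries and be ignored.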
iOS Trip Trace Irregularities (Wave 3 2023 Data Only)
The release of iOS 16.4 by Apple on March 27, 2023, brought about significant changes to background location tracking, affecting apps such as rMove, which rely on collecting location information. Consequently, iPhone users with iOS 16.4 or later experienced irregular trip traces within the rMove™ app, impacting data accuracy for Spring 2023.
To address this issue, RSG swiftly updated the rMove™ app and monitoring scripts to mitigate inconsistencies in future data collection. Despite these efforts, the 2023 dataset remained affected. To manage the impact on the dataset and downstream processes, RSG developed a series of criteria to identify suspect trips and flag individuals or households accordingly. Additionally, adjustments in weighting and dataset delivery were made to ensure maximum data utility.
Suspect trip trace records were identified, and the dataset was delivered with two sets of weights: one for use with the full dataset, and one for use when applying these strict criteria, so that trip analysis metrics can exclude potential trip trace irregularities.
Combined Dataset (2019, 2021 and 2023)
To facilitate analyses across waves of the survey, RSG developed a cross-wave combined dataset and codebook.
Combined Codebook
An Excel version of the combined codebook is available for download.
RSG typically delivers data in its raw form, with the numeric codes that correspond to survey entries instead of the text seen by participants. The codebook allows data users to translate survey results into human-readable format, and consists of two parts:
- A variable list, which includes attributes at the level of individual survey questions; and
- A value labels table, which corresponds to attributes of survey responses.
A combination of manual and scripted processes was used to create the combined codebook:
- Variable crosswalk (manual process in Excel). First, a variable crosswalk table was constructed by aligning variable names and survey questions across years. Where variable names differed, but survey question meaning stayed the same, a “unified” variable name was chosen. Logic, variable descriptions, location of the variable in the database, and other attributes were inspected manually and with a combination of processes in an Excel workbook (i.e., using VLOOKUP processes and other formulas). Example: The variable fuel in 2019 is renamed fuel_type in 2023. The “unified” variable name becomes fuel_type.
- Value label crosswalk (manual process in Excel). Next, a value label crosswalk was constructed in a similar manner. A crosswalk for numeric value inputs was created where value labels differed for the same numeric value entry. These “unified” values and value labels were then used to construct “upcoded” values and value labels, which consolidated disparate categories with similar meanings. Example: “Plug-in hybrid (PHEV)” and “Hybrid (HEV)” vehicle fuel types, used in 2021 and 2023 data, are upcoded to “Hybrid” (the single category used in 2019 data), as shown in the crosswalk table.
| 2019 value | 2021 value | 2023 value | Unified value | Upcoded value | 2019 label | 2021 label | 2023 label | Unified label | Upcoded label |
|---|---|---|---|---|---|---|---|---|---|
| -9998 | NA | NA | -9998 | 995 | Missing: Non-response | | | Missing | Missing |
| 1 | 1 | 1 | 1 | 1 | Gas | Gas | Gas | Gas | Gas |
| NA | 2 | 2 | 2 | 2 | | Hybrid (HEV) | Hybrid (HEV) | Hybrid (HEV) | Hybrid |
| 3 | 3 | 3 | 3 | 2 | Hybrid | Plug-in hybrid (PHEV) | Plug-in hybrid (PHEV) | Plug-in hybrid (PHEV) | Hybrid |
| 4 | 4 | 4 | 4 | 3 | Electric | Electric (EV) | Electric (EV) | Electric (EV) | Electric (EV) |
| 2 | 5 | 5 | 5 | 4 | Diesel | Diesel | Diesel | Diesel | Diesel |
| NA | 6 | 6 | 6 | 5 | | Flex fuel (FFV) | Flex fuel (FFV) | Flex fuel (FFV) | Other |
| 997 | 7 | 7 | 7 | 5 | Other | Other (e.g., natural gas, bio-diesel) | Other (e.g., natural gas, bio-diesel) | Other (e.g., natural gas, bio-diesel) | Other |
| 995 | 995 | NA | 995 | 995 | Missing: Skip logic | Missing | | Missing | Missing |
Combined Dataset
A scripted process, written in R and relying on the combined codebook, was used to create a single dataset containing all three waves of survey data. The scripted process:
1. Renamed dataset columns (variables) from their year-specific names to the “unified” names chosen in the combined codebook.
2. In each column (for each variable), replaced year-specific numeric response codes with their “unified” response codes.
3. Repeated step 2 with “upcoded” response codes, to create an “upcoded” dataset.
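The three scripted steps can be sketched as follows. The original process was written in R; this Python version is only an illustration, and the function name, dictionary layout, and `vehicle_id` column are assumptions. The code mappings are read from the 2019 columns of the fuel type crosswalk table above.

```python
# Step 1 input: year-specific -> unified variable names (2019 example).
RENAME_2019 = {"fuel": "fuel_type"}

# Step 2 input: 2019 response codes -> unified codes, per unified variable.
UNIFIED_2019 = {"fuel_type": {-9998: -9998, 1: 1, 3: 3, 4: 4, 2: 5, 997: 7, 995: 995}}

# Step 3 input: unified codes -> upcoded codes.
UPCODED = {"fuel_type": {-9998: 995, 1: 1, 2: 2, 3: 2, 4: 3, 5: 4, 6: 5, 7: 5, 995: 995}}


def harmonize(rows, rename, unified, upcoded):
    """Return (unified_rows, upcoded_rows) for a list of record dicts."""
    unified_rows, upcoded_rows = [], []
    for row in rows:
        # Step 1: rename columns to their unified names.
        renamed = {rename.get(k, k): v for k, v in row.items()}
        # Step 2: replace year-specific codes with unified codes.
        uni = {k: unified.get(k, {}).get(v, v) for k, v in renamed.items()}
        # Step 3: repeat with upcoded codes.
        up = {k: upcoded.get(k, {}).get(v, v) for k, v in uni.items()}
        unified_rows.append(uni)
        upcoded_rows.append(up)
    return unified_rows, upcoded_rows


rows_2019 = [{"vehicle_id": 1, "fuel": 2}, {"vehicle_id": 2, "fuel": 3}]
unified_rows, upcoded_rows = harmonize(rows_2019, RENAME_2019, UNIFIED_2019, UPCODED)
```

Columns without a crosswalk entry (here, the hypothetical `vehicle_id`) pass through unchanged, which keeps the sketch safe to run on tables where only some variables differ across waves.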
Special Output: Trip Purpose Table
The trip_purpose table was derived from the upcoded and unified trip tables. Its purpose is to aid data analysis of overall trip purposes. This table was developed because the origin and destination purpose categories in the trip table can contain non-intuitive classifications. For example, a summary of destination purposes will have many trips “home”, but the overall trip purpose for that trip home might actually correspond to the non-home trip end (work, school, etc.).
Removing trips with a destination of “home” from the overall analysis is one option, but this can lead to some place types being missing from the final dataset when the trip roster for a person’s day is incomplete. For example, if a person’s trip diary for a day consists of a trip from a friend’s house to their home, the place type “friend’s house” will be missing from the final summary of trip purposes.
To account for all place types – both origin and destination – in the final trip purpose summaries, the following steps were used to create the trip purpose table:
- Transit trips that had been unlinked into access, transit, and egress legs were re-linked by consolidating multiple legs of transit trips into a single record. This removes “change mode” trips from the table, except for long-distance trips. The trip weight for each linked trip was set to the maximum trip weight of its component unlinked trips.
- Trips were placed in two categories: home-based (having one trip end at home) and non-home-based trips.
- Home-based trips’ purposes were classified as the non-home end. The weight for this trip purpose record is equal to the original trip weight.
- Non-home-based trips were split into two records for each trip: one for the origin end, and a second for the destination end. The weight for each record was set to half of the original trip weight.
The tables below show a hypothetical example of this process for the travel diary corresponding to day_id 199885710201.
In the trip table, there are four records that correspond to this day. The person left home, went to work, went on an exercise trip or to the gym, picked up someone from school, and finally returned home.
| trip_id | o_purpose | d_purpose | trip_weight |
|---|---|---|---|
| 1998857102001 | Went home | Primary workplace | 233.4265 |
| 1998857102002 | Primary workplace | Exercise or recreation (e.g., gym, jog, bike, walk dog) | 363.7046 |
| 1998857102003 | Exercise or recreation (e.g., gym, jog, bike, walk dog) | Pick-up/drop-off to/from K-12 school or college | 363.7046 |
| 1998857102004 | Pick-up/drop-off to/from K-12 school or college | Went home | 363.7046 |
An analysis of trip purpose by destination place types (d_purpose) would yield an overall trip purpose share of 17.6% of trips to work, 27.5% of trips to exercise, 27.5% of trips to escort others to school, and 27.5% of trips to home:
| d_purpose | trip_weight | purpose_share |
|---|---|---|
| Primary workplace | 233.4 | 17.6% |
| Exercise or recreation (e.g., gym, jog, bike, walk dog) | 363.7 | 27.5% |
| Pick-up/drop-off to/from K-12 school or college | 363.7 | 27.5% |
| Went home | 363.7 | 27.5% |
| Total | 1,324.5 | 100.0% |
In the trip purpose table for the same day ID, there are six rows instead of four – the trip from work to exercise, and from exercise to pick-up someone from school, have both been expanded to two rows to allow trip weight to be distributed across them. For the home-based trips, the trip purpose has been assigned to the non-home end of the trip (work and escort, respectively).
| trip_purpose_id | purpose | trip_purpose_weight |
|---|---|---|
| 181704 | Primary workplace | 233.4265 |
| 181705 | Pick-up/drop-off to/from K-12 school or college | 363.7046 |
| 480995 | Primary workplace | 181.8523 |
| 480996 | Exercise or recreation (e.g., gym, jog, bike, walk dog) | 181.8523 |
| 732042 | Exercise or recreation (e.g., gym, jog, bike, walk dog) | 181.8523 |
| 732043 | Pick-up/drop-off to/from K-12 school or college | 181.8523 |
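The home-based and non-home-based rules can be checked against this hypothetical day with a short sketch. The function names and record layout are assumptions, and the transit re-linking step is omitted because none of the four example trips is a transit trip.

```python
HOME = "Went home"
WORK = "Primary workplace"
EXERCISE = "Exercise or recreation (e.g., gym, jog, bike, walk dog)"
PICKUP = "Pick-up/drop-off to/from K-12 school or college"


def build_trip_purpose(trips):
    """Expand linked trips into (purpose, weight) records."""
    records = []
    for t in trips:
        o, d, w = t["o_purpose"], t["d_purpose"], t["trip_weight"]
        if o == HOME or d == HOME:
            # Home-based: purpose is the non-home end, at full weight.
            records.append((d if o == HOME else o, w))
        else:
            # Non-home-based: one record per trip end, each at half weight.
            records.append((o, w / 2))
            records.append((d, w / 2))
    return records


def purpose_shares(records):
    """Weighted share of each purpose across all records."""
    total = sum(w for _, w in records)
    shares = {}
    for purpose, w in records:
        shares[purpose] = shares.get(purpose, 0.0) + w / total
    return shares


example_trips = [
    {"o_purpose": HOME, "d_purpose": WORK, "trip_weight": 233.4265},
    {"o_purpose": WORK, "d_purpose": EXERCISE, "trip_weight": 363.7046},
    {"o_purpose": EXERCISE, "d_purpose": PICKUP, "trip_weight": 363.7046},
    {"o_purpose": PICKUP, "d_purpose": HOME, "trip_weight": 363.7046},
]
records = build_trip_purpose(example_trips)
shares = purpose_shares(records)
```

Here `records` holds six entries whose weights sum to the original four trip weights, so the expansion redistributes weight across purposes without changing the total.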
Summarizing the trip purpose table yields 31.4% of trips for work (i.e., nearly one-third of trips are work-related), 27.5% of trips for exercise or recreation, and 41.2% of trips to pick up others from school.
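| purpose | trip_purpose_weight | purpose_share |
|---|---|---|
| Primary workplace | 233.4 | 17.6% |
| Primary workplace | 181.9 | 13.7% |
| Subtotal, Primary workplace | 415.3 | 31.4% |
| Pick-up/drop-off to/from K-12 school or college | 363.7 | 27.5% |
| Pick-up/drop-off to/from K-12 school or college | 181.9 | 13.7% |
| Subtotal, Pick-up/drop-off to/from K-12 school or college | 545.6 | 41.2% |
| Exercise or recreation (e.g., gym, jog, bike, walk dog) | 181.9 | 13.7% |
| Exercise or recreation (e.g., gym, jog, bike, walk dog) | 181.9 | 13.7% |
| Subtotal, Exercise or recreation (e.g., gym, jog, bike, walk dog) | 363.7 | 27.5% |
| Total | 1,324.5 | 100.0% |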
Alternatively, the data user could remove trips “home” and calculate purpose share using the destination purpose, using only the subset of trips that do not end at home. For this travel day, that calculation yields a greater share of trips for exercise (37.9% versus 27.5%) and smaller shares for work (24.3% versus 31.4%) and pick-up (37.9% versus 41.2%) relative to the shares in the trip purpose summary table.
| d_purpose | trip_weight | purpose_share |
|---|---|---|
| Primary workplace | 233.4 | 24.3% |
| Exercise or recreation (e.g., gym, jog, bike, walk dog) | 363.7 | 37.9% |
| Pick-up/drop-off to/from K-12 school or college | 363.7 | 37.9% |
| Total | 960.8 | 100.0% |
Table 2.8 below shows how using destination purpose in the trip table compares to using the overall trip purpose in the trip purpose table. The vast majority of trips “home” have been re-categorized (a small number remain, where both origin and destination were “home,” i.e., loop trips without an intermediate stop point). The estimate of the total number of trips differs across the two tables as well, because the trip purpose table has consolidated “change mode” trips into a broader linked trip purpose.
| purpose/d_purpose | Trip Table, Selected Categories: Share | Trip Table, Selected Categories: Total | Trip Purpose Table, All Categories: Share | Trip Purpose Table, All Categories: Total |
|---|---|---|---|---|
| Home | 32.4% | 4,339,661 | 0.1% | 10,526 |
| Shopping | 10.8% | 1,447,777 | 16.3% | 1,927,275 |
| Social/Recreation | 9.7% | 1,305,811 | 16.0% | 1,889,502 |
| Escort | 9.6% | 1,285,127 | 15.6% | 1,847,927 |
| Work | 8.2% | 1,096,092 | 13.4% | 1,587,782 |
| Meal | 6.9% | 929,880 | 10.4% | 1,229,728 |
| Errand | 5.5% | 731,313 | 9.9% | 1,167,337 |
| Work related | 5.2% | 694,438 | 7.3% | 867,563 |
| School | 3.5% | 468,666 | 5.7% | 673,190 |
| Change mode | 3.4% | 451,611 | NA | NA |
| Overnight | 2.5% | 335,107 | 3.4% | 398,291 |
| Other | 1.9% | 258,319 | 1.2% | 146,963 |
| School related | 0.5% | 65,032 | 0.7% | 82,248 |
| Total | 100.0% | 13,408,835.3 | 100.0% | 11,828,331.3 |
Dataset Composition
The final unweighted dataset includes seven distinct data tables. These tables include all user-input survey variables, certain survey metadata (e.g., survey completion mode), and variables derived to support data analysis.
| Table | Rows |
|---|---|
| Household | 19,170 complete households |
| Person | 38,691 people |
| Vehicle | 30,239 vehicles |
| Day | 157,947 days |
| Trip | 623,926 unlinked trips |
| Location | 12,345,002 points |
| Trip Purpose | 343,372 linked, single-ended trips; 251,366 unlinked, two-ended trips |
