library(tc.sensors)

Introduction

{tc.sensors} is an R package for processing data collected from Minnesota Department of Transportation (MnDOT) loop detectors installed on the Minnesota Freeway system. Data are published to a public JSON feed in 30-second interval measurements of occupancy and volume. Occupancy and volume data can be used to calculate speed and delay.

Primary terms and definitions

Raw data is accessed at the sensor level. Sensors are generally spaced in half mile intervals along a corridor.

Volume

Volume represents the number of vehicles that pass through a detector in a given time period. Flow represents the number of vehicles that pass through a detector per hour (Volume * Samples per Hour).

Occupancy

The term ‘occupancy’ does not here refer to the occupants of a vehicle but rather the occupancy of the sensor, or how long the sensor was ‘occupied’. In a 30-second time period, 1800 scans are produced (60 per second), and each scan is binary: either the sensor is occupied or not. Therefore, a sensor occupied for 1 second within the 30-second time period would have a value of 60. Raw occupancy values can be converted to percentages:

\[\frac{Occupancy}{1800}\times 100% \] The resulting percentage is the percentage of time in that 30 seconds that the sensor was ‘occupied’.

Data cleaning and interpolation

Where nulls exist (vs. a zero measurement), it is assumed the connection was disrupted and no measurement was taken. For aggregated intervals where other values exist, the nulls are interpolated with a rolling average of the other values within the interval. Impossible values are also interpolated with the mean of the interval. Impossible values are raw occupancy values greater than 1800 (given that only 1800 scans are taken in a 30-second period). If an entire interval contains only nulls, it is converted to ‘NA’ and no values within the interval are interpolated.

Note that a variable is created containing the percentage of nulls/impossible values in that interval; therefore, one can choose to exclude intervals with interpolation rates above a certain threshold of choice (e.g. if more than, say, 30% of the data is missing).

The following data cleaning steps and checks are taken in scrub_sensor()

  • Data is complete for all sensors (2,880 observations per sensor per day)
  • Make volume and occupancy values NA if
    • vehicle volume >= 20 vehicles, or 2,300 vehicles per hour
    • occupancy >= 1,800 scans, or 216,000 scans per hour [^ 60 scans per second * 60 secs per min * 60 mins per hour = 216,000 scans per hour]
  • The percentage of NA values is calculated for occupancy and volume. If the NA percentage is greater than a given threshold, all volume and occupancy values for a sensor are replaced with NA
    • If volume or occupancy are NA, speed is changed to NA

There is ongoing discussion on whether you can have valid occupancy with invalid volume, and vice versa.

Calculations

Calculations for speed, flow, headway, density and lost/spare capacity must be conducted for a given time aggregation. Common time intervals include 10, 15, and 30 minutes, hourly, peak traffic periods, and daily.

Density

Density is the number of vehicles per mile calculated from Flow and Speed (Flow / Speed). See full calculation method for additional context.

Speed

Speed must be an aggregated measure. We do not recommend going below a 15-minute interval for calculating speed.

Speed, in miles per hour, is calculated multiply the number of vehicles per hour by field length in miles divided by occupancy.

\[\frac{\text{Vehicles Per Hour} \times \text{Field Length}}{\text{Occupancy Percentage}} \]

Speed can also be calculated with Flow and Density.

\[\frac{Flow}{Density} \]

‘Vehicles Per Hour’ is calculated by summing all the vehicles over the given interval, and then multiplying that by the number of intervals in an hour (there are 4 15-minute intervals in an hour, so the multiplier is 4). ‘Vehicle Length’ is a static field in the sensor configuration dataset. ‘Occupancy Percentage’ is calculated by summing all the occupancy values over the given interval, and then dividing by the total number of scans in the interval period (54,000 = 1,800 scans in 30 seconds * 30 periods in 15-minute interval).

Lost or Spare Capacity

The average flow that a roadway is losing, either due to low traffic or high congestion, throughout the sampling period. Capacity is calculated using Flow and Density.

If flow exceeds 1,800 vehicles per hour, the sensor is considered well performing. If flow is under 1,800 vehicles per hour and vehicle density is greater than 43 vehicles per mile, the sensor is at lost capacity; there is more traffic than the road can accommodate at free-flow speed. If the flow is under 1,800 vehicles per hour and vehicle density is less than 43 vehicles per mile, the sensor has spare capacity; there is less traffic than the road is built to handle.

  • Flow > 1800: 0, roadway operating at appropriate capacity
  • Density > 43: Lost Capacity: Flow - 1800
  • Density ≤ 43: Spare Capacity: 1800 - Flow

Additional documentation

The Regional Traffic Management Center (RTMC) maintains the Intelligent Roadway Information System (IRIS) with an open-source repository on GitHub. IRIS documentation is served from the GitHub repo, as well.