Bus Data Ecosystem
1. Introduction
Scope
The dataset encompasses:
- Standard GTFS components (Stops, Routes, Trips, Stop Times, Shapes, Fare Rules, Fare Attributes).
- Additional datasets such as OD (Origin-Destination) flow for buses, operational schedules, and passenger boarding/alighting patterns.
Stakeholders
Transit agencies, GIS teams, developers, urban planners, and researchers.
Use Cases
- Enhancing transit planning and operations management.
- Supporting multi-modal transport integration and route optimization.
- Developing real-time passenger information systems and analytics tools.
2. Categories and Data Fields
Standard GTFS Categories
1. Stops (stops.txt)
- Fields:
  - stop_id: Unique identifier (mandatory).
  - stop_name: Name of the stop (mandatory).
  - stop_lat, stop_lon: Geographic coordinates (mandatory).
  - location_type: Type of stop (e.g., 0 for stop, 1 for station).
  - parent_station: Reference to the parent station, if applicable.
Example:
stop_id,stop_name,stop_lat,stop_lon,location_type,parent_station
1001,Adarsh Nagar,28.7169,77.1703,0,
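As a minimal sketch of how a consumer might read this file, the following Python uses only the standard csv module. The helper name load_stops and the constant REQUIRED_STOP_FIELDS are illustrative assumptions, not part of GTFS; the mandatory fields are the ones listed above.

```python
import csv
import io

# Fields this document marks as mandatory for stops.txt.
REQUIRED_STOP_FIELDS = ("stop_id", "stop_name", "stop_lat", "stop_lon")


def load_stops(text):
    """Parse stops.txt content into a list of dicts (hypothetical helper).

    Raises ValueError when a row lacks one of the mandatory fields.
    """
    stops = []
    # Data rows start on line 2, after the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        missing = [f for f in REQUIRED_STOP_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            raise ValueError(f"line {line_no}: missing {', '.join(missing)}")
        # Coordinates are stored as text in the file; convert them to floats.
        row["stop_lat"] = float(row["stop_lat"])
        row["stop_lon"] = float(row["stop_lon"])
        stops.append(row)
    return stops


SAMPLE = (
    "stop_id,stop_name,stop_lat,stop_lon,location_type,parent_station\n"
    "1001,Adarsh Nagar,28.7169,77.1703,0,\n"
)
stops = load_stops(SAMPLE)
print(stops[0]["stop_name"], stops[0]["stop_lat"], stops[0]["stop_lon"])
```

Identifiers are kept as strings on purpose, so that leading zeros or letters in a stop_id survive the round trip.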
2. Routes (routes.txt)
- Fields:
  - route_id: Unique identifier (mandatory).
  - agency_id: Operator identifier (mandatory).
  - route_short_name: Short name or number (mandatory).
  - route_long_name: Full route description (mandatory).
  - route_type: Transport type (e.g., 3 for bus).
Example:
route_id,agency_id,route_short_name,route_long_name,route_type
2001,DTC,620,New Delhi to Dwarka,3
3. Trips (trips.txt)
- Fields:
  - trip_id: Unique identifier (mandatory).
  - route_id: Associated route (mandatory).
  - service_id: Service schedule identifier (mandatory).
  - trip_headsign: Destination or key stops.
Example:
trip_id,route_id,service_id,trip_headsign
3001,2001,1,Dwarka Sector 21
4. Stop Times (stop_times.txt)
- Fields:
  - trip_id: Associated trip identifier (mandatory).
  - arrival_time, departure_time: Timings (mandatory).
  - stop_id: Stop identifier (mandatory).
  - stop_sequence: Stop order in the trip (mandatory).
Example:
trip_id,arrival_time,departure_time,stop_id,stop_sequence
3001,08:00:00,08:00:00,1001,1
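Working with arrival_time and departure_time usually means converting them to a number first. The sketch below shows one way to do that in Python; the function names gtfs_time_to_seconds and dwell_seconds are hypothetical. It deliberately accepts hour values above 23, since GTFS uses them for trips that continue past midnight of the service day.

```python
def gtfs_time_to_seconds(value):
    """Convert an HH:MM:SS string to seconds since the start of the service day."""
    parts = value.strip().split(":")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"not an HH:MM:SS time: {value!r}")
    hours, minutes, seconds = (int(p) for p in parts)
    # Hours may exceed 23 (e.g. 25:10:00); minutes and seconds may not exceed 59.
    if minutes > 59 or seconds > 59:
        raise ValueError(f"minutes or seconds out of range: {value!r}")
    return hours * 3600 + minutes * 60 + seconds


def dwell_seconds(row):
    """Time spent at a stop, in seconds, for one stop_times.txt row."""
    dwell = gtfs_time_to_seconds(row["departure_time"]) - gtfs_time_to_seconds(
        row["arrival_time"]
    )
    if dwell < 0:
        raise ValueError("departure_time precedes arrival_time")
    return dwell


row = {
    "trip_id": "3001",
    "arrival_time": "08:00:00",
    "departure_time": "08:00:00",
    "stop_id": "1001",
    "stop_sequence": "1",
}
print(gtfs_time_to_seconds(row["arrival_time"]), dwell_seconds(row))
```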
5. Shapes (shapes.txt)
- Fields:
  - shape_id: Identifier for the shape (mandatory).
  - shape_pt_lat, shape_pt_lon: Coordinates of shape points (mandatory).
  - shape_pt_sequence: Order of points (mandatory).
Example:
shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence
4001,28.7169,77.1703,1
6. Fare Attributes (fare_attributes.txt)
- Fields:
  - fare_id: Unique identifier for the fare (mandatory).
  - price: Fare amount (mandatory).
  - currency_type: Currency code (e.g., INR).
  - payment_method: 0 (pay onboard) or 1 (prepaid).
Example:
fare_id,price,currency_type,payment_method
5001,10.00,INR,0
7. Fare Rules (fare_rules.txt)
- Fields:
  - fare_id: Fare identifier (mandatory).
  - route_id: Associated route.
Example:
fare_id,route_id
5001,2001
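Fare rules only become useful once they are joined to fare attributes through fare_id. The following Python sketch shows one possible join that yields the fares available on each route. The function name fares_by_route is an assumption made for illustration, and the sample rows are the examples given in this document.

```python
import csv
import io
from decimal import Decimal

FARE_ATTRIBUTES = "fare_id,price,currency_type,payment_method\n5001,10.00,INR,0\n"
FARE_RULES = "fare_id,route_id\n5001,2001\n"


def fares_by_route(fare_attributes_text, fare_rules_text):
    """Map each route_id to a list of (price, currency_type) tuples."""
    attributes = {
        row["fare_id"]: row
        for row in csv.DictReader(io.StringIO(fare_attributes_text))
    }
    result = {}
    for rule in csv.DictReader(io.StringIO(fare_rules_text)):
        # A KeyError here means fare_rules.txt names a fare_id that
        # fare_attributes.txt does not define.
        fare = attributes[rule["fare_id"]]
        # Decimal avoids binary floating-point rounding on money values.
        entry = (Decimal(fare["price"]), fare["currency_type"])
        result.setdefault(rule["route_id"], []).append(entry)
    return result


route_fares = fares_by_route(FARE_ATTRIBUTES, FARE_RULES)
print(route_fares)
```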
Additional Suggested Datasets
1. OD Flow for Buses (od_flow.csv)
- Fields:
  - origin_stop_id: Origin stop identifier.
  - destination_stop_id: Destination stop identifier.
  - passenger_count: Number of passengers traveling.
Example:
origin_stop_id,destination_stop_id,passenger_count
1001,2001,350
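A common first step with OD data is to total the flows leaving each origin and arriving at each destination. The sketch below does this in Python under the assumption that od_flow.csv has exactly the three columns listed above; the function name od_totals is hypothetical.

```python
import csv
import io
from collections import Counter


def od_totals(text):
    """Return (passengers leaving each origin, passengers reaching each destination)."""
    leaving = Counter()
    arriving = Counter()
    for row in csv.DictReader(io.StringIO(text)):
        count = int(row["passenger_count"])
        if count < 0:
            raise ValueError("passenger_count must not be negative")
        leaving[row["origin_stop_id"]] += count
        arriving[row["destination_stop_id"]] += count
    return leaving, arriving


SAMPLE_OD = "origin_stop_id,destination_stop_id,passenger_count\n1001,2001,350\n"
leaving, arriving = od_totals(SAMPLE_OD)
print(dict(leaving), dict(arriving))
```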
2. Operational Schedules (schedules.txt)
- Fields:
  - route_id: Route identifier.
  - service_start_time, service_end_time: Operation hours.
  - frequency: Interval between consecutive departures, in minutes.
Example:
route_id,service_start_time,service_end_time,frequency
2001,06:00:00,23:00:00,10
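A schedule row like the one above can be expanded into individual departure times. The Python sketch below assumes that frequency is the interval between departures in minutes and that a departure exactly at service_end_time is still allowed; both are assumptions about this file, and the function names are illustrative.

```python
def to_seconds(hms):
    """Convert HH:MM:SS to seconds since the start of the service day."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Convert seconds since the start of the service day back to HH:MM:SS."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def departures(service_start_time, service_end_time, frequency_minutes):
    """List departure times from start to end (inclusive) at a fixed interval."""
    if frequency_minutes <= 0:
        raise ValueError("frequency must be a positive number of minutes")
    step = frequency_minutes * 60
    start = to_seconds(service_start_time)
    end = to_seconds(service_end_time)
    # end + 1 makes the range include a departure exactly at service_end_time.
    return [to_hms(t) for t in range(start, end + 1, step)]


times = departures("06:00:00", "23:00:00", 10)
print(len(times), times[0], times[1], times[-1])
```

For the example row, this gives 103 departures, from 06:00:00 through 23:00:00.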
3. Boarding/Alighting Patterns (boarding_alighting.csv)
- Fields:
  - stop_id: Stop identifier.
  - boarding_count: Number of passengers boarding.
  - alighting_count: Number of passengers alighting.
Example:
stop_id,boarding_count,alighting_count
1001,120,90
3. Validation Standards
Field-Specific Validation Rules
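- Coordinates: Latitudes must lie between -90 and 90 and longitudes between -180 and 180, in decimal degrees.
- Identifiers: Primary identifiers (e.g., stop_id in stops.txt) must be unique within their file and alphanumeric.
- Times: Must adhere to HH:MM:SS format. GTFS permits hour values above 23 for trips that continue past midnight of the service day.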
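These field-level rules can be sketched as small Python predicates. The function names are hypothetical. The coordinate ranges are the usual latitude and longitude bounds in decimal degrees, and the time pattern accepts any two-digit hour, including values above 23, which is an assumption about how strictly this rule should be read.

```python
import re
from collections import Counter

ID_PATTERN = re.compile(r"[A-Za-z0-9]+")
TIME_PATTERN = re.compile(r"[0-9]{2}:[0-5][0-9]:[0-5][0-9]")


def valid_coordinate(lat, lon):
    """True when latitude and longitude fall inside their geographic ranges."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def valid_id(value):
    """True when the identifier is non-empty and strictly alphanumeric."""
    return ID_PATTERN.fullmatch(value) is not None


def valid_time(value):
    """True when the value has the HH:MM:SS shape."""
    return TIME_PATTERN.fullmatch(value) is not None


def duplicate_ids(ids):
    """Return the identifiers that occur more than once, sorted."""
    return sorted(key for key, count in Counter(ids).items() if count > 1)


print(valid_coordinate(28.7169, 77.1703), valid_id("1001"), valid_time("08:00:00"))
print(duplicate_ids(["1001", "1002", "1001"]))
```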
Cross-Category Validation
- Ensure route_id in trips.txt exists in routes.txt.
- Validate stop_id in stop_times.txt against stops.txt.
- Ensure trip_id in stop_times.txt exists in trips.txt.
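The three cross-file checks amount to referential-integrity tests. A minimal Python sketch follows, assuming each file has already been parsed into a list of dicts; the names missing_references and cross_check are illustrative, and the rows are the example values used earlier in this document.

```python
def missing_references(child_rows, field, parent_rows):
    """Return values of `field` used in child_rows but absent from parent_rows."""
    parent_ids = {row[field] for row in parent_rows}
    return sorted({row[field] for row in child_rows} - parent_ids)


def cross_check(routes, trips, stops, stop_times):
    """Run the three cross-category checks; empty lists mean no violations."""
    return {
        "trips.route_id": missing_references(trips, "route_id", routes),
        "stop_times.stop_id": missing_references(stop_times, "stop_id", stops),
        "stop_times.trip_id": missing_references(stop_times, "trip_id", trips),
    }


routes = [{"route_id": "2001"}]
trips = [{"trip_id": "3001", "route_id": "2001"}]
stops = [{"stop_id": "1001"}]
stop_times = [{"trip_id": "3001", "stop_id": "1001"}]

report = cross_check(routes, trips, stops, stop_times)
print(report)
```

Returning the offending identifiers, rather than a single pass/fail flag, makes it easier to report exactly which rows need fixing.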
4. Default Values and Tolerances
- Default missing location_type in stops.txt to 0.
- Allow ±0.0001 degrees for coordinate discrepancies.
- For scheduled times, tolerate ±1 minute.
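One way to apply these defaults and tolerances is sketched below in Python. The constant and function names are assumptions made for illustration. The coordinate tolerance is applied to latitude and longitude separately, which is one possible reading of the rule.

```python
COORD_TOLERANCE = 0.0001       # coordinate tolerance, applied per axis
TIME_TOLERANCE_SECONDS = 60    # ±1 minute for scheduled times


def with_default_location_type(stop):
    """Return a copy of the stop row with a blank location_type set to "0"."""
    stop = dict(stop)
    if not (stop.get("location_type") or "").strip():
        stop["location_type"] = "0"
    return stop


def coords_match(a, b, tolerance=COORD_TOLERANCE):
    """Compare two (lat, lon) pairs within the tolerance.

    Values sitting exactly on the boundary are subject to floating-point rounding.
    """
    return abs(a[0] - b[0]) <= tolerance and abs(a[1] - b[1]) <= tolerance


def _seconds(hms):
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def times_match(scheduled, observed, tolerance=TIME_TOLERANCE_SECONDS):
    """Compare two HH:MM:SS times within the tolerance, in seconds."""
    return abs(_seconds(scheduled) - _seconds(observed)) <= tolerance


stop = with_default_location_type({"stop_id": "1001", "location_type": ""})
print(stop["location_type"])
print(coords_match((28.7169, 77.1703), (28.71695, 77.1703)))
print(times_match("08:00:00", "08:01:00"))
```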
5. Example Data Templates
Stops Template
stop_id,stop_name,stop_lat,stop_lon,location_type,parent_station
1001,Adarsh Nagar,28.7169,77.1703,0,
OD Flow Template
origin_stop_id,destination_stop_id,passenger_count
1001,2001,350