Overview
tif1 uses Pydantic for comprehensive data validation. Validation ensures data integrity, catches corruption early, and provides type safety, at a small performance cost (roughly 5-10% on data loading).
Validation is optional and disabled by default for maximum performance. Enable it when data integrity is more important than speed.
Validation Modes
tif1 supports three validation modes:
Disabled (default): No validation, maximum speed
Non-strict: Validation runs but doesn’t raise on errors
Strict: Validation raises exceptions on errors
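The three modes can be pictured with a small standalone sketch. This is not tif1's implementation: `run_validation`, `require_laps`, and the mode strings are hypothetical names chosen for illustration, and only the behaviour described in the list above is modelled.

```python
import logging

logger = logging.getLogger("validation_modes_sketch")


def run_validation(data, validator, mode="disabled"):
    """Apply `validator` according to one of three illustrative modes."""
    if mode == "disabled":
        return data  # no checks at all
    try:
        return validator(data)
    except ValueError as exc:
        if mode == "strict":
            raise  # strict: propagate the error to the caller
        # non-strict: record the problem and hand back the input unchanged
        logger.debug("validation failed (non-strict): %s", exc)
        return data


def require_laps(data):
    """Toy validator: reject payloads that contain no laps."""
    if not data.get("lap"):
        raise ValueError("at least one lap required")
    return data


bad = {"lap": []}
print(run_validation(bad, require_laps, "disabled") is bad)    # True
print(run_validation(bad, require_laps, "non-strict") is bad)  # True
```

In this sketch only the strict mode can interrupt the caller; the other two always return a payload.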
Disabling All Validation
from tif1 import get_config

config = get_config()

# Disable all validation (default, fastest)
config.set("validate_data", False)       # General data
config.set("validate_lap_times", False)  # Lap data
config.set("validate_telemetry", False)  # Telemetry data
Enabling Validation
# Enable validation (adds 5-10% overhead)
config.set("validate_data", True)
config.set("validate_lap_times", True)
config.set("validate_telemetry", True)
Or via environment variables:
export TIF1_VALIDATE_DATA=true
export TIF1_VALIDATE_LAP_TIMES=true
export TIF1_VALIDATE_TELEMETRY=true
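How tif1 parses these variables is not shown here. As a rough sketch of the usual pattern, a boolean flag can be read from the environment as below; `env_flag` is a hypothetical helper, and the accepted spellings ("1", "true", "yes", "on") are an assumption rather than tif1's documented behaviour.

```python
import os


def env_flag(name, default=False):
    """Read a boolean flag from the environment (hypothetical helper)."""
    raw = os.environ.get(name)
    if raw is None:
        return default  # variable not set: fall back to the default
    # Assumed truthy spellings; anything else counts as False
    return raw.strip().lower() in {"1", "true", "yes", "on"}


os.environ["TIF1_VALIDATE_DATA"] = "true"
print(env_flag("TIF1_VALIDATE_DATA"))  # True
```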
Validation Schemas
tif1 defines Pydantic models for each data type.
# From validation.py:103-112
class DriverInfo(BaseModel):
    driver: str = Field(..., min_length=3, max_length=3, pattern=r"^[A-Z]{3}$")
    team: str = Field(..., min_length=1, max_length=100)
    dn: str = Field(...)   # Driver number
    fn: str = Field(...)   # First name
    ln: str = Field(...)   # Last name
    tc: str = Field(...)   # Team color
    url: str = Field(...)  # Profile URL
Validation rules:
Driver code: exactly 3 uppercase letters
Team name: 1-100 characters
All fields required
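The driver-code rule can be tried out without Pydantic. The sketch below applies the same three-uppercase-letter pattern using the standard `re` module; `is_valid_driver_code` is a hypothetical helper, not part of tif1.

```python
import re

# Same character class and length as the pattern in DriverInfo
DRIVER_CODE = re.compile(r"[A-Z]{3}")


def is_valid_driver_code(code):
    """Return True when `code` is exactly three uppercase ASCII letters."""
    return isinstance(code, str) and DRIVER_CODE.fullmatch(code) is not None


print(is_valid_driver_code("VER"))   # True
print(is_valid_driver_code("ver"))   # False
print(is_valid_driver_code("VERS"))  # False
```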
Lap Data
# From validation.py:121-163
class LapData(ConsistentLengthsMixin, BaseModel):
    time: list[float | None] = Field(..., min_length=1)
    lap: list[float | None] = Field(..., min_length=1)
    compound: list[str | None] = Field(..., min_length=1)
    stint: list[int | None] = Field(..., min_length=1)
    s1: list[float | None] = Field(..., min_length=1)
    s2: list[float | None] = Field(..., min_length=1)
    s3: list[float | None] = Field(..., min_length=1)
    # ... many more fields

    @model_validator(mode="after")
    def validate_consistent_lengths(self) -> "LapData":
        """Ensure all lists have the same length."""
        return self._check_consistent_lengths("lap data")

    @field_validator("stint")
    def validate_stint(cls, v: list[int | None]) -> list[int | None]:
        """Validate stint numbers are positive."""
        for stint in v:
            if stint is not None and stint < 1:
                raise ValueError("Stint numbers must be >= 1")
        return v
Validation rules:
All list fields must have consistent lengths
Stint numbers must be >= 1
Tire life must be >= 0
At least one lap required
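The value rules above can be sketched as a plain function that collects violations instead of raising. This is an illustration only: `lap_rule_violations` is a hypothetical name, and treating the `life` key as the tire-life field is an assumption based on the lap payload example later on this page.

```python
def lap_rule_violations(data):
    """Return a list of rule violations for a column-oriented lap payload."""
    problems = []
    if not data.get("lap"):
        problems.append("at least one lap required")
    # None entries are allowed and skipped, as in the model's validators
    if any(s is not None and s < 1 for s in data.get("stint", [])):
        problems.append("stint numbers must be >= 1")
    # Assumption: "life" holds tire life in laps
    if any(t is not None and t < 0 for t in data.get("life", [])):
        problems.append("tire life must be >= 0")
    return problems


print(lap_rule_violations({"lap": [1, 2], "stint": [1, 0], "life": [0, 1]}))
# ['stint numbers must be >= 1']
```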
Telemetry Data
# From validation.py:237-301
class TelemetryData(ConsistentLengthsMixin, BaseModel):
    time: list[float | None] = Field(..., min_length=1)
    speed: list[float | None] = Field(..., min_length=1)
    rpm: list[float | None] = Field(default_factory=list)
    gear: list[int | None] = Field(default_factory=list)
    throttle: list[float | None] = Field(default_factory=list)
    brake: list[bool | None] = Field(default_factory=list)
    # ... more fields

    @model_validator(mode="after")
    def validate_consistent_lengths(self) -> "TelemetryData":
        """Ensure all non-empty lists have the same length."""
        return self._check_consistent_lengths("telemetry array")
Validation rules:
All non-empty arrays must have consistent lengths
Time and speed required
Other fields optional
Race Control Messages
# From validation.py:304-335
class RaceControlData(ConsistentLengthsMixin, BaseModel):
    time: list[float | None] = Field(..., min_length=1)
    category: list[str | None] = Field(default_factory=list, alias="cat")
    message: list[str | None] = Field(default_factory=list, alias="msg")
    status: list[str | None] = Field(default_factory=list)
    flag: list[str | None] = Field(default_factory=list)
    # ... more fields
Weather Data
# From validation.py:338-393
class WeatherData(ConsistentLengthsMixin, BaseModel):
    time: list[float | None] = Field(..., min_length=1, alias="wT")
    air_temp: list[float | None] = Field(default_factory=list, alias="wAT")
    humidity: list[float | None] = Field(default_factory=list, alias="wH")
    pressure: list[float | None] = Field(default_factory=list, alias="wP")
    # ... more fields

    @model_validator(mode="before")
    def _normalize_pascalcase_keys(cls, data: Any) -> Any:
        """Convert PascalCase keys from CDN to snake_case."""
        # Maps "AirTemp" -> "air_temp" etc.
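The body of the key normalizer is not shown above. One common way to perform such a conversion is a regular expression that inserts an underscore before each interior capital letter; the sketch below assumes that approach, and `pascal_to_snake` and `normalize_keys` are hypothetical names rather than tif1 functions.

```python
import re


def pascal_to_snake(name):
    """Convert a PascalCase name to snake_case, e.g. "AirTemp" -> "air_temp"."""
    # Insert "_" before every uppercase letter except one at the very start
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def normalize_keys(data):
    """Return a copy of `data` with every key converted to snake_case."""
    return {pascal_to_snake(key): value for key, value in data.items()}


print(normalize_keys({"AirTemp": [25.0], "Humidity": [60.0]}))
# {'air_temp': [25.0], 'humidity': [60.0]}
```

Keys that are already lowercase pass through unchanged, so the conversion is safe to apply to mixed payloads.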
Validation Constants
Physical limits for validation:
# From validation.py:42-49
MAX_RPM = 20000 # Maximum engine RPM
MAX_SPEED = 400 # km/h
MAX_GEAR = 8 # Highest gear
MAX_THROTTLE = 100 # Throttle percentage
MAX_ACCELERATION = 500 # m/s²
MIN_YEAR = 2018 # Data available from 2018
MAX_YEAR = 2100 # Far future limit
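How tif1 applies these limits is not shown on this page. As a minimal sketch, assuming each channel must lie between 0 and its maximum (the lower bound of 0 is an assumption), out-of-range samples could be located like this; `LIMITS` and `out_of_range` are hypothetical names.

```python
MAX_RPM = 20000
MAX_SPEED = 400
MAX_GEAR = 8
MAX_THROTTLE = 100

# Hypothetical lookup from channel name to its upper limit
LIMITS = {
    "rpm": MAX_RPM,
    "speed": MAX_SPEED,
    "gear": MAX_GEAR,
    "throttle": MAX_THROTTLE,
}


def out_of_range(channel, values):
    """Return indices of samples outside 0..limit; None samples are skipped."""
    limit = LIMITS[channel]
    return [
        index
        for index, value in enumerate(values)
        if value is not None and not 0 <= value <= limit
    ]


print(out_of_range("speed", [100.0, None, 450.0]))  # [2]
```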
Validation Functions
Validate Drivers
from tif1.validation import validate_drivers

data = {
    "drivers": [
        {
            "driver": "VER",
            "team": "Red Bull Racing",
            "dn": "1",
            "fn": "Max",
            "ln": "Verstappen",
            "tc": "#1E41FF",
            "url": "https://..."
        }
    ]
}

# Raises ValidationError if invalid
validated = validate_drivers(data)
print(validated.drivers[0].driver)  # "VER"
Validate Laps
from tif1.validation import validate_laps, validate_lap_data

data = {
    "time": [90.123, 89.456],
    "lap": [1, 2],
    "compound": ["SOFT", "SOFT"],
    "stint": [1, 1],
    "s1": [30.1, 29.9],
    "s2": [30.0, 29.8],
    "s3": [30.0, 29.7],
    "life": [0, 1],
    "pos": [1, 1],
    "status": ["VALID", "VALID"],
    "pb": [True, True],
}

# Strict validation (raises on error)
validated = validate_laps(data)

# Non-strict validation (returns original on error)
validated = validate_lap_data(data, strict=False)
Validate Telemetry
from tif1.validation import validate_telemetry, validate_telemetry_data

data = {
    "time": [0.0, 0.1, 0.2],
    "speed": [100.0, 120.0, 140.0],
    "rpm": [10000, 11000, 12000],
    "gear": [3, 4, 5],
    "throttle": [80.0, 90.0, 100.0],
    "brake": [False, False, False],
}

# Strict validation
validated = validate_telemetry(data)

# Non-strict validation
validated = validate_telemetry_data(data, strict=False)
Validate Race Control
from tif1.validation import validate_race_control_data

data = {
    "time": [100.0, 200.0],
    "cat": ["Flag", "Flag"],
    "msg": ["GREEN FLAG", "YELLOW FLAG"],
    "status": ["CLEAR", "CAUTION"],
    "flag": ["GREEN", "YELLOW"],
}

validated = validate_race_control_data(data, strict=False)
Validate Weather
from tif1.validation import validate_weather_data

data = {
    "wT": [0.0, 60.0],
    "wAT": [25.0, 26.0],     # Air temp
    "wH": [60.0, 58.0],      # Humidity
    "wP": [1013.0, 1014.0],  # Pressure
    "wR": [False, False],    # Rainfall
    "wTT": [35.0, 36.0],     # Track temp
}

validated = validate_weather_data(data, strict=False)
Consistent Length Validation
The ConsistentLengthsMixin ensures all arrays have the same length:
# From validation.py:18-39
class ConsistentLengthsMixin:
    """Mixin that validates all list fields have consistent lengths."""

    def _length_check_fields(self) -> tuple[str, ...]:
        raise NotImplementedError

    def _check_consistent_lengths(self: T, error_label: str) -> T:
        first_len: int | None = None
        for name in self._length_check_fields():
            values = getattr(self, name)
            if not values:  # Skip empty lists
                continue
            current_len = len(values)
            if first_len is None:
                first_len = current_len
            elif current_len != first_len:
                raise ValueError(f"Inconsistent {error_label} lengths")
        return self
This catches common data corruption issues:
from tif1.validation import validate_laps

# Invalid: arrays have different lengths
data = {
    "time": [90.1, 89.2],  # 2 items
    "lap": [1, 2, 3],      # 3 items - ERROR!
    "compound": ["SOFT"],  # 1 item - ERROR!
}

# Raises: "Inconsistent lap data lengths"
validate_laps(data)
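To see which fields are at fault rather than only that a mismatch exists, the same first-non-empty-list comparison can be written as a standalone function. This is a sketch for experimentation, not part of tif1; `length_mismatches` is a hypothetical name.

```python
def length_mismatches(data):
    """Map field name -> length for non-empty lists that differ from the first one."""
    expected = None
    mismatched = {}
    for name, values in data.items():
        if not values:  # empty lists are skipped, as in the mixin
            continue
        if expected is None:
            expected = len(values)  # first non-empty list sets the reference
        elif len(values) != expected:
            mismatched[name] = len(values)
    return mismatched


data = {
    "time": [90.1, 89.2],
    "lap": [1, 2, 3],
    "compound": ["SOFT"],
}
print(length_mismatches(data))  # {'lap': 3, 'compound': 1}
```

Unlike the mixin, which raises on the first mismatch, this variant reports every offending field in one pass.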
Anomaly Detection
Detect data quality issues:
from tif1.validation import detect_lap_anomalies

laps = [
    {"lap": 1, "time": 90.1},
    {"lap": 2, "time": 89.5},
    {"lap": 4, "time": 89.3},   # Lap 3 missing!
    {"lap": 5, "time": 200.0},  # Outlier!
]

anomalies = detect_lap_anomalies(laps)
for anomaly in anomalies:
    print(f"{anomaly.type}: {anomaly.description} (severity: {anomaly.severity})")
    print(f"Details: {anomaly.details}")
Output:
missing_laps: Missing 1 lap(s) (severity: medium)
Details: {'missing_laps': [3]}
outlier_times: 1 outlier lap time(s) detected (severity: low)
Details: {'outlier_count': 1, 'average_time': 89.725}
Anomaly Types
# From validation.py:86-91
class AnomalyType(str, Enum):
    MISSING_LAPS = "missing_laps"      # Gaps in lap sequence
    DUPLICATE_LAPS = "duplicate_laps"  # Same lap number multiple times
    OUTLIER_TIMES = "outlier_times"    # Lap times >3x average
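The first two anomaly types reduce to simple checks on the sequence of lap numbers. The sketch below shows one way to compute them with the standard library; `missing_laps` and `duplicate_laps` are hypothetical helpers and do not reproduce how `detect_lap_anomalies` is implemented.

```python
from collections import Counter


def missing_laps(lap_numbers):
    """Return lap numbers absent between the smallest and largest lap seen."""
    present = set(lap_numbers)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def duplicate_laps(lap_numbers):
    """Return lap numbers that occur more than once, in ascending order."""
    counts = Counter(lap_numbers)
    return sorted(n for n, count in counts.items() if count > 1)


print(missing_laps([1, 2, 4, 5]))    # [3]
print(duplicate_laps([1, 2, 2, 3]))  # [2]
```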
Anomaly Structure
# From validation.py:94-100
class Anomaly(BaseModel):
    type: AnomalyType
    severity: str = Field(..., pattern=r"^(low|medium|high)$")
    description: str
    details: dict[str, Any] = Field(default_factory=dict)
Enum Types
tif1 defines enums for categorical values:
Tire Compounds
# From validation.py:52-61
class TireCompound(str, Enum):
    SOFT = "SOFT"
    MEDIUM = "MEDIUM"
    HARD = "HARD"
    INTERMEDIATE = "INTERMEDIATE"
    WET = "WET"
    UNKNOWN = "UNKNOWN"
    TEST_UNKNOWN = "TEST-UNKNOWN"
Lap Status
# From validation.py:64-70
class LapStatus(str, Enum):
    VALID = "VALID"
    INVALID = "INVALID"
    OUTLAP = "OUTLAP"
    INLAP = "INLAP"
Session Type
# From validation.py:73-83
class SessionType(str, Enum):
    PRACTICE_1 = "Practice 1"
    PRACTICE_2 = "Practice 2"
    PRACTICE_3 = "Practice 3"
    QUALIFYING = "Qualifying"
    SPRINT = "Sprint"
    SPRINT_QUALIFYING = "Sprint Qualifying"
    SPRINT_SHOOTOUT = "Sprint Shootout"
    RACE = "Race"
Validation adds overhead to data loading:
| Dataset   | No Validation | With Validation | Overhead |
|-----------|---------------|-----------------|----------|
| 100 laps  | 50ms          | 53ms            | +6%      |
| 1000 laps | 150ms         | 165ms           | +10%     |
| 5000 laps | 500ms         | 550ms           | +10%     |
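The overhead figures follow directly from the two timing columns: overhead is the extra time divided by the unvalidated time. A short sketch that recomputes them from the numbers quoted above (`overhead_percent` is a hypothetical helper):

```python
def overhead_percent(baseline_ms, validated_ms):
    """Relative cost of validation as a percentage of the baseline time."""
    return (validated_ms - baseline_ms) / baseline_ms * 100


# (dataset size in laps, time without validation, time with validation)
measurements = [(100, 50, 53), (1000, 150, 165), (5000, 500, 550)]

for laps, baseline, validated in measurements:
    print(f"{laps} laps: +{overhead_percent(baseline, validated):.0f}%")
```

The same function can be applied to timings measured on your own workload to decide whether the cost is acceptable.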
Recommendation: Disable validation in production for maximum speed, enable during development to catch issues.
When to Enable Validation
Development: Enable validation while developing to catch data issues early
Testing: Run tests with validation enabled for comprehensive checks
New Data Sources: Validate when using new or untested data sources
Critical Applications: Enable for apps where data integrity is critical
When to Disable Validation
Production: Disable for maximum performance in production
Known Good Data: Skip validation for trusted, cached data
Performance Critical: Disable when every millisecond counts
Batch Processing: Skip validation for large batch operations
Error Handling
Validation errors are raised as InvalidDataError:
from tif1.exceptions import InvalidDataError
from tif1.validation import validate_laps
try:
    data = {"time": [90.1], "lap": [1, 2]}  # Length mismatch!
    validated = validate_laps(data)
except InvalidDataError as e:
    print(f"Validation failed: {e}")
    print(f"Reason: {e.reason}")
With strict=False, validation failures are logged but don’t raise:
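import logging

from tif1.validation import validate_lap_data

logging.basicConfig(level=logging.DEBUG)

data = {"time": [90.1], "lap": [1, 2]}  # Length mismatch, as in the example above

# Returns original data on validation failure
validated = validate_lap_data(data, strict=False)
# Logs: "Lap validation failed (non-strict): ..."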
Validation in Async Fetch
Validation automatically runs during data fetching:
# From async_fetch.py:28-85
def _validate_json_payload(path: str, data: dict, config) -> dict:
    """Validate fetched payloads based on path and config toggles."""
    if path == "drivers.json" and config.get("validate_data", True):
        from .validation import validate_drivers
        return validate_drivers(data).model_dump()

    if path.endswith("/laptimes.json") and config.get("validate_lap_times", True):
        from .validation import validate_lap_data
        return validate_lap_data(data, strict=False)

    if path.endswith("_tel.json") and config.get("validate_telemetry", True):
        from .validation import validate_telemetry_data
        return validate_telemetry_data(data["tel"], strict=False)

    return data
Validation runs automatically when enabled, catching issues at fetch time.
Best Practices
Use non-strict mode (strict=False) so that automatic validation logs failures without interrupting data loading
Enable during development to catch issues early
Disable in production for maximum performance
Check anomalies for data quality monitoring
Monitor debug logs to see validation failures
Configuration Example
from tif1 import get_config

DEBUG = True  # Placeholder flag: set this from your own application settings

config = get_config()

if DEBUG:
    # Development: enable all validation
    config.set("validate_data", True)
    config.set("validate_lap_times", True)
    config.set("validate_telemetry", True)
else:
    # Production: disable all validation
    config.set("validate_data", False)
    config.set("validate_lap_times", False)
    config.set("validate_telemetry", False)
Or use environment variables:
# Development
export TIF1_VALIDATE_DATA=true
export TIF1_VALIDATE_LAP_TIMES=true
export TIF1_VALIDATE_TELEMETRY=true
# Production
export TIF1_VALIDATE_DATA=false
export TIF1_VALIDATE_LAP_TIMES=false
export TIF1_VALIDATE_TELEMETRY=false
Next Steps
Performance: Optimize performance with validation off
Circuit Breaker: Handle network failures gracefully
Configuration: Full configuration reference