Three Ways to Encode an Hour — Cheat Sheet

Forecasting hourly bike-share demand in Washington, D.C. The horse race between numerical, one-hot, and cyclic sin/cos encodings — plus the October 2012 anomaly that broke the dataset.

Updated June 2026
1

The problem

UCI / Kaggle bike-share dataset: hourly rentals in Washington, D.C., 2011–2012. Target: total rentals per hour.

The challenge is that hour of day has a circular structure — 23:00 is one hour before 00:00, not 23 hours away. How you tell the model this changes the answer.
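To make the wrap-around concrete, here is a minimal sketch of distance on a 24-hour clock compared with plain subtraction. The helper name `circular_hour_distance` is illustrative and not from the project.

```python
def circular_hour_distance(a: int, b: int, period: int = 24) -> int:
    """Shortest number of hours between two clock hours, going either way round."""
    diff = abs(a - b) % period
    return min(diff, period - diff)


print(circular_hour_distance(23, 0))  # 1  (one hour apart on the clock)
print(abs(23 - 0))                    # 23 (what a raw numeric feature implies)
```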

Goal: cleanly compare three encoding strategies under time-aware validation.

2

The three encodings

| Encoding | Representation | Implies |
| --- | --- | --- |
| Numerical | hour = 0..23 | Linear distance, which is wrong (23 is not "far" from 0). |
| One-hot | 24 binary columns | No distance between hours. Big sparse columns. |
| Cyclic | sin(2πh/24), cos(2πh/24) | True circular distance. Two columns. |

Same trick works for: day-of-week, month, season, wind direction — any cyclical feature.
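The three encodings can be sketched in a few lines of plain Python. The function names are hypothetical, and the Euclidean distance check is only an illustration of what each representation implies about how far 23:00 is from 00:00. It is not the project's code.

```python
import math


def encode_numerical(hour):
    """Raw hour as a single numeric feature."""
    return [float(hour)]


def encode_one_hot(hour, n_hours=24):
    """One binary column per hour; exactly one of them is 1."""
    return [1.0 if i == hour else 0.0 for i in range(n_hours)]


def encode_cyclic(hour, period=24):
    """Project the hour onto the unit circle: (sin, cos) of its angle."""
    angle = 2 * math.pi * hour / period
    return [math.sin(angle), math.cos(angle)]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


for name, encode in [
    ("numerical", encode_numerical),
    ("one-hot", encode_one_hot),
    ("cyclic", encode_cyclic),
]:
    wrap = euclidean(encode(23), encode(0))      # across midnight
    adjacent = euclidean(encode(0), encode(1))   # ordinary neighbours
    print(f"{name:>9}: d(23, 0) = {wrap:.3f}   d(0, 1) = {adjacent:.3f}")
```

Only the cyclic encoding gives the midnight pair the same distance as any other pair of adjacent hours. Strictly, that distance is the chord length on the unit circle, which grows monotonically with circular distance rather than equalling it. Passing a different `period` (7 for day-of-week, 12 for month) covers the other cyclical features.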

3

The horse race

Three models × three encodings = nine cells. Each combination was trained, validated, and scored.

Linear models strongly preferred cyclic: they can't recover the wrap-around from the numerical encoding, and one-hot inflated dimensionality (24 columns for hour alone).

Tree-based models (random forest, gradient boosting): cyclic won slightly, even though in theory trees can handle the numerical encoding by splitting on thresholds. The win came from needing fewer splits to express "evening rush hour".

Bottom line: cyclic > one-hot > numerical, consistently.

4

Time-aware CV

Never shuffle time-series data for CV. I used:

  • Rolling-origin evaluation: train on [0, t], score on (t, t+h], slide forward.
  • Strict no-leakage: features that depend on the future of the train window are out.
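A rolling-origin splitter is short enough to sketch directly. This is a minimal version that assumes rows are already sorted by time and that the training window expands from the start of the data. The parameter names (`initial_train`, `horizon`, `step`) are illustrative, not the project's.

```python
def rolling_origin_splits(n_samples, initial_train, horizon, step=None):
    """Yield (train_indices, test_indices) pairs in time order.

    Each train window starts at row 0 and ends just before its test window,
    so no test row ever precedes or overlaps the rows used for fitting.
    """
    step = step or horizon
    t = initial_train
    while t + horizon <= n_samples:
        yield range(0, t), range(t, t + horizon)
        t += step


for train_idx, test_idx in rolling_origin_splits(n_samples=10, initial_train=4, horizon=2):
    print(f"train rows 0..{train_idx[-1]}  ->  test rows {list(test_idx)}")
```

scikit-learn's `TimeSeriesSplit` provides a comparable expanding-window splitter if you prefer not to roll your own.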

The naïve k-fold split would have inflated scores by 10% or more and produced a model that fails in production.

5

The October 2012 anomaly

A whole week of October 2012 had zero rentals: Hurricane Sandy shut down the city, a weather event the dataset's weather features didn't fully capture.

If that week falls inside the training window → no problem. If it falls in the validation window → the model looks like garbage.

Lesson: always plot residuals over time. The biggest errors cluster at the events the model couldn't see coming.
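One lightweight way to inspect residuals along time, before plotting anything, is to average the absolute error per calendar week and look at the worst bucket. The sketch below uses synthetic numbers (a flat forecast of 100 rentals per hour and a made-up zero-rental week starting 29 October 2012) purely for illustration. They are not results from the project, and the helper name is hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def mean_abs_residual_by_week(timestamps, y_true, y_pred):
    """Mean absolute residual per (ISO year, ISO week), in chronological order."""
    buckets = defaultdict(list)
    for ts, actual, predicted in zip(timestamps, y_true, y_pred):
        iso = ts.isocalendar()
        buckets[(iso[0], iso[1])].append(abs(actual - predicted))
    return {week: sum(vals) / len(vals) for week, vals in sorted(buckets.items())}


# Synthetic data: three weeks of hourly rows, the last week with zero rentals.
start = datetime(2012, 10, 15)
timestamps = [start + timedelta(hours=h) for h in range(21 * 24)]
anomaly_start = datetime(2012, 10, 29)
y_pred = [100.0] * len(timestamps)
y_true = [0.0 if ts >= anomaly_start else 100.0 for ts in timestamps]

weekly = mean_abs_residual_by_week(timestamps, y_true, y_pred)
worst_week = max(weekly, key=weekly.get)
print("worst (ISO year, week):", worst_week, "mean |residual|:", weekly[worst_week])
```

The same dictionary can be handed to any plotting library. The point is that the error is concentrated in one stretch of the calendar rather than spread evenly.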

6

What I learned

  • Cyclical features deserve cyclical encoding, even for trees.
  • Time-aware CV is non-negotiable. Shuffle once, lie about your scores forever.
  • Anomalies + small data = unstable validation. Always inspect residuals along time.
  • Domain features beat tuning. "Is rush hour" was a stronger feature than any hyperparameter change.
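A domain feature like "is rush hour" can be as small as the sketch below. The source does not say how rush hour was defined, so the windows used here (weekdays, 07:00 to 09:59 and 16:00 to 19:59) are assumptions chosen only for illustration.

```python
from datetime import datetime

# Assumed commute windows; the project's actual definition is not given.
MORNING_RUSH = range(7, 10)   # 07:00-09:59
EVENING_RUSH = range(16, 20)  # 16:00-19:59


def is_rush_hour(ts: datetime) -> int:
    """Return 1 for weekday commute hours, 0 otherwise."""
    is_weekday = ts.weekday() < 5  # Monday=0 ... Friday=4
    in_window = ts.hour in MORNING_RUSH or ts.hour in EVENING_RUSH
    return int(is_weekday and in_window)


print(is_rush_hour(datetime(2012, 10, 1, 8)))   # Monday 08:00
print(is_rush_hour(datetime(2012, 10, 6, 8)))   # Saturday 08:00
```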