Overview

Case study: How does a bike-share navigate speedy success?

Cyclistic’s Story

We’re exploring how rider duration patterns differ between casual and member types—who rides longer, when, and where. For example, who rides longer? Rider duration patterns paired by station names? Rider day of week vs weekend lengths and also ride duration broken by time of day. The company believes that maximising the number of annual members will be key to future growth. Rather than creating a marketing campaign that targets all-new customers, there is a solid opportunity to convert casual riders into members. Executives understand that casual riders are already aware of the Cyclistic program and have chosen Cyclistic for their mobility needs. I will unravel interesting rider patterns and provide a summary of my analysis at the end of the report.

Three questions will guide the future marketing program:

  1. How do annual members and casual riders use Cyclistic bikes differently?
  2. Why would casual riders buy Cyclistic annual memberships?
  3. How can Cyclistic use digital media to influence casual riders to become members?

đźš´ Rider Duration Patterns

By conducting my analysis I can see there is a clear difference in how casual and member types differ. Casual riders tend to take longer trips on average than members, often using the bikes for leisure rather than commuting. Members show shorter average durations approximating 12 minutes, which aligns with commuter behavior — quick, point-to-point travel, often during peak hours.This pattern supports targeted planning around peak weekend usage and pricing strategy.

# Average Ride Duration by Rider Group

avg_by_rider <- station_data_clean %>%
  group_by(rider_group) %>%
  summarise(avg_duration_mins = mean(ride_length_mins_2, na.rm = TRUE)) %>%
  arrange(desc(avg_duration_mins))

ggplot(avg_by_rider, aes(x = reorder(rider_group, avg_duration_mins), y = avg_duration_mins, fill = rider_group)) +
  geom_col(width = 0.3) +
  scale_fill_manual(values = c("casual" = "#D55E00", "member" = "#009E73")) +
  labs(title = "Average Ride Duration by Rider Group",
       x = "Rider Group",
       y = "Average Duration (mins)") +
  theme_minimal()

While casual riders tend to ride longer, members appear to use the service more frequently. This duality highlights the importance of tailoring engagement strategies to each group’s usage pattern.

ride_count_by_group <- station_data_clean %>%
  group_by(rider_group) %>%
  summarise(ride_count = n()) %>%
  arrange(desc(ride_count))

ggplot(ride_count_by_group, aes(x = reorder(rider_group, ride_count), y = ride_count, fill = rider_group)) +
  geom_col(width = 0.3) +
  scale_fill_manual(values = c("casual" = "#D55E00", "member" = "#009E73")) +
  geom_text(aes(label = scales::comma(ride_count)), vjust = -0.5, size = 4) +
  labs(title = "Ride Count by Rider Group",
       x = "Rider Group",
       y = "Total Rides") +
  theme_minimal()

Here’s another quick visual cue:

station_data_clean %>%
  group_by(rider_group) %>%
  summarise(
    avg_duration = round(mean(ride_length_mins_2, na.rm = TRUE), 1),
    ride_count = n()
  ) %>%
  arrange(desc(ride_count)) %>%
  knitr::kable()
rider_group avg_duration ride_count
member 13.2 720311
casual 89.5 67877

🌆 When Do We Ride? Time-of-Day Patterns Across Rider Groups

“Casual riders linger longer, especially after dark — while members keep it short and steady.

This chart reveals that members ride short and steady, while casual riders linger longer—especially after dark. To convert more casual riders, I suggest we offer them flexible membership plans especially for those who are simply visiting the city for a short period of time. We could use the following tag line to grab their attention, “Grab your 10% evening membership discount now.”

#Time of Day: average ride duration

station_data_clean <- station_data_clean %>%
  mutate(
    hour = lubridate::hour(started_at),
    time_of_day = case_when(
      hour >= 5 & hour < 12 ~ "Morning",
      hour >= 12 & hour < 17 ~ "Afternoon",
      hour >= 17 & hour < 21 ~ "Evening",
      TRUE ~ "Night"
    ),
    time_of_day = factor(time_of_day, levels = c("Morning", "Afternoon", "Evening", "Night"))
  )

duration_by_time <- station_data_clean %>%
  group_by(time_of_day, rider_group) %>%
  summarise(avg_duration = mean(ride_length_mins_2, na.rm = TRUE)) %>%
  arrange(match(time_of_day, c("Early Morning", "Mid Morning", "Afternoon", "Evening", "Night")))
## `summarise()` has grouped output by 'time_of_day'. You can override using the
## `.groups` argument.
station_data_clean <- station_data_clean %>%
  mutate(time_of_day = factor(time_of_day,
                              levels = c("Early Morning", "Mid Morning", "Afternoon", "Evening", "Late Night")))

ggplot(duration_by_time, aes(x = time_of_day, y = avg_duration, fill = rider_group)) +
  geom_col(position = "dodge") +
  scale_fill_manual(values = c("casual" = "#D55E00", "member" = "#009E73"))

  labs(title = "Average Ride Duration by Time of Day",
       subtitle = "Members ride short and steady; casual riders linger longer afte dark.\nConsider flexible evening memberships to boost conversations",
       x = "Time of Day",
       y = "Average Duration (mins)",
       fill = "Rider Group") +
  theme_minimal()
## NULL

📊 Weighted Duration Analysis

To avoid skew from high-volume rider types, we calculated weighted average ride durations by station pair:

# Weighted Average Duration by Station Pair

library(tidyverse)

# Create top_pairs from most frequent station pairings
top_pairs <- station_data_clean %>%
  count(pair, sort = TRUE) %>%
  slice_head(n = 10)


# Step 1: Compute totals per pair and rider type
length_counts_2 <- station_data_clean %>%
filter(pair %in% top_pairs$pair) %>%
group_by(pair, rider_group) %>%
summarise(total_length = sum(ride_length_mins_2, na.rm = TRUE),
rider_count = n(),
.groups = "drop")
# Step 2: Get total rides per pair for weighting
pair_totals_2 <- length_counts_2 %>%
group_by(pair) %>%
summarise(pair_total = sum(rider_count))
# Step 3: Join and calculate weighted contribution
weighted_stats_2 <- left_join(length_counts_2, pair_totals_2, by = "pair") %>%
mutate(weight = rider_count / pair_total,
weighted_length = weight * (total_length / rider_count))
# Step 4: Summarize weighted average per pair
final_weighted_2 <- weighted_stats_2 %>%
group_by(pair) %>%
summarise(weighted_avg_length = sum(weighted_length))
# Optional: Visualise 📊
ggplot(final_weighted_2, aes(x = reorder(pair, weighted_avg_length), y = weighted_avg_length)) +
geom_col(fill = "#009E73") +
coord_flip() +
labs(title = "Weighted Average Ride Duration \nby Station Pair",
x = "Station Pair",
y = "Weighted Avg Duration (min)")

In Summary

  1. How do annual members and casual riders use Cyclistic bikes differently? Members ride shorter, more frequent trips—likely for commuting. Casual riders take longer, leisure-oriented rides, especially in the evening.
  2. Why would casual riders buy Cyclistic annual memberships? Flexible membership options (e.g., short-term or evening-focused plans) could appeal to casual users, especially tourists or night riders.
  3. How can Cyclistic use digital media to influence casual riders to become members? Targeted ads at high-traffic station pairs and time-specific promotions (e.g., “10% off evening rides”) could drive conversions.