Website Traffic Analysis Report

Author

Wenting Jiang

library(dplyr)
library(bigrquery)
library(ggplot2)
library(magrittr)

Introduction

This report analyzes the website traffic and user behavior of my personal blog using data collected from Google Analytics 4 (GA4) and exported to Google BigQuery.
The purpose is to understand how users interact with my website, which types of content perform best, and how I can optimize the site for better engagement and conversions.

project <- "ic-term-project-website"
dataset <- "analytics_505865185"
table <- "events_*"

sql <- "
SELECT event_date, event_name, COUNT(*) AS events
FROM `ic-term-project-website.analytics_505865185.events_*`
GROUP BY event_date, event_name
ORDER BY event_date
"

ga_data <- bq_project_query(project, sql) %>%
  bq_table_download()

Data Source

The data used in this report comes from:

- Google Analytics 4 (GA4)
- Automatic daily export to BigQuery
- Dataset: analytics_505865185
- Table: events_*
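The wildcard table `events_*` covers every daily export table, so a query against it scans the full history. The following is a minimal sketch, assuming the standard GA4 export naming (`events_YYYYMMDD`) and BigQuery's `_TABLE_SUFFIX` pseudo-column, of how the query string could be built for a limited date range. The helper name `build_events_sql` and the example dates are hypothetical and not part of the original analysis.

```r
# Hypothetical helper: builds a daily event-count query over a date range.
# Assumes the GA4 export's daily tables are named events_YYYYMMDD.
build_events_sql <- function(project, dataset, start_date, end_date) {
  # _TABLE_SUFFIX holds the part of the table name matched by the wildcard
  start_suffix <- format(as.Date(start_date), "%Y%m%d")
  end_suffix   <- format(as.Date(end_date), "%Y%m%d")
  table_ref    <- sprintf("`%s.%s.events_*`", project, dataset)

  paste(
    "SELECT event_date, event_name, COUNT(*) AS events",
    paste("FROM", table_ref),
    sprintf("WHERE _TABLE_SUFFIX BETWEEN '%s' AND '%s'", start_suffix, end_suffix),
    "GROUP BY event_date, event_name",
    "ORDER BY event_date",
    sep = "\n"
  )
}

sql_range <- build_events_sql(
  "ic-term-project-website", "analytics_505865185",
  "2025-01-01", "2025-01-31"  # example dates, chosen only for illustration
)
cat(sql_range)
```

The resulting string could be passed to `bq_project_query()` in place of the unrestricted query, which may reduce the amount of data scanned.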

Below is the code for loading and processing the data:

library(bigrquery)
library(dplyr)

library(ggplot2)

project <- "ic-term-project-website"
dataset <- "analytics_505865185"

sql <- "
SELECT event_date, event_name, COUNT(*) AS events
FROM `ic-term-project-website.analytics_505865185.events_*`
GROUP BY event_date, event_name
ORDER BY event_date
"

ga_data <- bq_project_query(project, sql) %>% 
  bq_table_download()
## Visualization 1: Daily Events Trend
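daily_events <- ga_data %>%
    # GA4 exports event_date as a YYYYMMDD string; parse it so the x-axis is a true date scale
    mutate(event_date = as.Date(event_date, format = "%Y%m%d")) %>%
    group_by(event_date) %>%
    summarise(total_events = sum(events))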

plot(daily_events$event_date,
     daily_events$total_events,
     type="l",
     col="blue",
     xlab="Date",
     ylab="Total Events",
     main="Daily Total Events Trend")
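A line chart connects consecutive rows, so any day with no recorded events is silently bridged rather than shown as zero. The sketch below, which uses made-up toy data in place of `daily_events` and only base R, shows one possible way to fill such gaps before plotting. The names `daily_events_demo`, `all_days`, and `filled` are illustrative assumptions.

```r
# Toy data standing in for daily_events (values are made up for illustration).
daily_events_demo <- data.frame(
  event_date   = c("20250101", "20250102", "20250105"),
  total_events = c(12, 7, 20)
)

# GA4 stores event_date as a YYYYMMDD string; convert it to a Date.
daily_events_demo$event_date <- as.Date(daily_events_demo$event_date,
                                        format = "%Y%m%d")

# Build the full calendar between the first and last observed day.
all_days <- data.frame(
  event_date = seq(min(daily_events_demo$event_date),
                   max(daily_events_demo$event_date),
                   by = "day")
)

# Left-join onto the calendar and treat days without rows as zero events.
filled <- merge(all_days, daily_events_demo, by = "event_date", all.x = TRUE)
filled$total_events[is.na(filled$total_events)] <- 0

print(filled)
```

Whether a missing day really means zero traffic, rather than a gap in the export, is a judgment call that depends on the site.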

## Visualization 2: Top Events (Most Frequent Event Names)
top_events <- ga_data %>%
  group_by(event_name) %>%
  summarise(total = sum(events)) %>%
  arrange(desc(total)) %>%
  head(10)

barplot(
  top_events$total,
  names.arg = top_events$event_name,
  col = "skyblue",
  las = 2,
  main = "Top 10 Most Frequent Events",
  ylab = "Event Count"
)
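GA4 event names can be long, and vertical labels (`las = 2`) may be clipped by the default plot margins. As a hedged alternative, the sketch below draws the same kind of chart horizontally with a wider left margin. It uses made-up counts in a toy data frame (`top_events_demo`) rather than the real `top_events`, and the margin values are an arbitrary starting point.

```r
# Toy stand-in for top_events (counts are made up for illustration).
top_events_demo <- data.frame(
  event_name = c("page_view", "session_start", "user_engagement",
                 "first_visit", "scroll"),
  total      = c(120, 45, 80, 30, 25)
)

# Sort ascending so the largest bar ends up at the top of a horizontal plot.
ordered <- top_events_demo[order(top_events_demo$total), ]

# Widen the left margin so long event names are not cut off.
old_par <- par(mar = c(4, 9, 3, 1))
barplot(
  ordered$total,
  names.arg = ordered$event_name,
  horiz = TRUE,
  las = 1,
  col = "skyblue",
  main = "Top Events (toy data)",
  xlab = "Event Count"
)
par(old_par)  # restore the previous graphics settings
```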

## Visualization 3: Traffic by Device Category
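# ga_data contains only event_date, event_name and events, so the device
# category has to be queried separately (device.category in the GA4 export schema).
device_sql <- "
SELECT device.category AS device_category, COUNT(*) AS events
FROM `ic-term-project-website.analytics_505865185.events_*`
GROUP BY device_category
"

device_traffic <- bq_project_query(project, device_sql) %>%
  bq_table_download() %>%
  rename(total = events)

pie(
  device_traffic$total,
  labels = paste0(device_traffic$device_category, " (", device_traffic$total, ")"),
  col = rainbow(nrow(device_traffic)),
  main = "Traffic Distribution by Device Category"
)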