Skip to content

Conversation

@tgxworld
Copy link
Contributor

@tgxworld tgxworld commented Dec 12, 2025

This commit adds a X-Discourse-BrowserPageView-URL response header, alongside the X-Discourse-BrowserPageView response header, with the header value set to the referrer that triggered the browser page view tracking.

Do note that the HTTP_REFERER request does not provide us with this information due to the way we piggyback on existing ajax requests to track browser page views. The value of the request header will always be set to the document's location href.

@tgxworld tgxworld force-pushed the browser_page_view_url branch 11 times, most recently from 99d5da8 to e906210 Compare December 12, 2025 03:23
@tgxworld tgxworld force-pushed the browser_page_view_url branch from e906210 to 039fee0 Compare December 12, 2025 05:06
@tgxworld tgxworld force-pushed the browser_page_view_url branch 3 times, most recently from 7ef2e76 to 211dae3 Compare December 12, 2025 07:34
@tgxworld
Copy link
Contributor Author

tgxworld commented Dec 12, 2025

Reviewer notes

I tested locally in development with TRACK_REQUESTS=1.

Full page load (tracking via MessageBus)

Screenshot 2025-12-12 at 3 36 06 PM

onPageChange (tracking via first ajax request)

Screenshot 2025-12-12 at 3 37 00 PM

@tgxworld tgxworld force-pushed the browser_page_view_url branch from 211dae3 to dd89af4 Compare December 12, 2025 08:02
Comment on lines +11 to +16
@capture =
lambda do |_status, headers, _body|
browser_page_view_response_headers << headers if headers["X-Discourse-BrowserPageView"]
end

ResponseCaptureMiddleware.register_response_capture(@capture)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is abit of a hack to capture response headers but I feel that this area of the code is important enough to test since accuracy is important. Therefore, I think the trade-off is worth it here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree it's worth testing 💯

If you wanted to avoid the complexity of the middleware, an alternative approach could be to use Playwright's request/response interception. For example: https://github.com/discourse/discourse/blob/536acef425166a5980762148cf9b27bb30d0f3e3/spec/system/script_encoding_spec.rb#L10C2-L13C21

Since we don't need to manipulate the response here, maybe the even-simpler playwright_page.requests info will be enough: https://playwright.dev/docs/api/class-page#page-requests

@tgxworld tgxworld force-pushed the browser_page_view_url branch from dd89af4 to 929ebb6 Compare December 12, 2025 08:18
@tgxworld tgxworld marked this pull request as ready for review December 12, 2025 08:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants