Applying Foucault’s Archaeology of Knowledge to Google Analytics

 Introduction: A Brief Overview of Google Analytics

Google Analytics consists of two main components: Google-programmed JavaScript code embedded on each page within a website “which collects and sends visitor activity to your Google Analytics account” (“How Analytics Impacts,” 2014) and the reporting mechanism connected to the JavaScript code, where visitor activity is collected and displayed at www.google.com/analytics. The data are sent to Google’s servers for storage via the Internet, mediated by the networked hardware elements (switches, routers, fiber, etc.) of the Web.

A visit to a web page in which Google Analytics code is embedded activates the embedded snippet, generates data, and sends those data points to Analytics.

Code snippet sample (from spcs.richmond.edu)

<script type="text/javascript">
 var _gaq = _gaq || [];
 // Main Site Account
 _gaq.push(['_setAccount', 'UA-xxxxxxx-1']);
 _gaq.push(['_trackPageview']);
 // Legacy Account
 _gaq.push(['l1._setAccount', 'UA-xxxxxxx-2']);
 _gaq.push(['l1._trackPageview']);
 //rollup account 
 _gaq.push(['rup._setAccount', 'UA-xxxxxxx-1']);
 _gaq.push(['rup._setDomainName', 'richmond.edu']);
 _gaq.push(['rup._trackPageview']);

 (function() {
   // Load ga.js asynchronously, matching the page's protocol (HTTPS or HTTP).
   var ga = document.createElement('script');
   ga.type = 'text/javascript';
   ga.async = true;
   ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
   // Insert the new script element before the first script on the page.
   var s = document.getElementsByTagName('script')[0];
   s.parentNode.insertBefore(ga, s);
 })();
 </script>

These data points include hundreds of characteristics of the visit, including page visited, time on site/time on page, referral sources, links selected to exit the site, and more. A visual representation of these data points is available below in Figure 1. All data points are recorded in Analytics at the instant of the visit (delayed or rerouted as needed by network hardware). The action of the visit generates these data points; once the visit has been recorded, there is no writing of data in Analytics until another set of data points is recorded via user interaction with content on the page.

Diagram

Figure 1: Visualizing a sample Google Analytics data set—Popplet 

An active user interaction with the web page itself is required for Google Analytics to register data points. An important distinction is that web crawlers, like Google crawlers, are not recorded as visits; only visitor interactions trigger a response from the embedded code snippet. There is a direct relationship between the data being collected in Analytics and a user’s interaction. However, the user does not enter these data in a conscious or meaningful way; they are simply collected and inscribed to Analytics in a transparent fashion.

Data collected in Google Analytics are aggregated and, for the end user, not individualized. Analytics’ data privacy and security documentation notes that individual user data may be collected but are not provided to the end user (the Analytics account owner, manager, or specialist): “Google Analytics customers are prohibited from sending personally identifiable information to Google, but this principle might not apply in some instances in which Google Analytics is used to analyze how Google products and services are used by signed in account holders” (“We Use Our Own Products,” 2014). The end user is unable to reconstitute personally identifying information about visitors from the data provided.

However, the aggregated data are able to describe a nuanced portrait of our website visitors, to the point that multiple aggregate profiles are created. The data can help us answer questions about our website, like how many users access the site using a mobile device or tablet or how many pages in the site an average user visits. The answers to these questions, in turn, generate action items to customize web page content to user technologies and patterns of behavior.
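
As a rough illustration of how aggregate answers like these might be derived, the sketch below summarizes a handful of made-up session records. The record shape (`deviceCategory`, `pageviews`) and the function name `summarizeSessions` are assumptions chosen for this example; they are not the Analytics export format or API.

```javascript
// Hypothetical session records; field names and values are illustrative only.
const sessions = [
  { deviceCategory: "desktop", pageviews: 4 },
  { deviceCategory: "mobile", pageviews: 2 },
  { deviceCategory: "tablet", pageviews: 3 },
  { deviceCategory: "desktop", pageviews: 3 },
];

// Returns the share of sessions from mobile or tablet devices and the
// mean number of pages viewed per session.
function summarizeSessions(records) {
  if (records.length === 0) {
    return { mobileOrTabletShare: 0, averagePagesPerSession: 0 };
  }
  const handheld = records.filter(
    (r) => r.deviceCategory === "mobile" || r.deviceCategory === "tablet"
  ).length;
  const totalPageviews = records.reduce((sum, r) => sum + r.pageviews, 0);
  return {
    mobileOrTabletShare: handheld / records.length,
    averagePagesPerSession: totalPageviews / records.length,
  };
}

const summary = summarizeSessions(sessions);
console.log(summary);
```

The result describes the group of visits as a whole; nothing in it points back to any one visitor.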

Relationship to Foucault

Foucault sought to avoid transcribing discourse within traditional unities like genre or oeuvre; instead, he sought dynamic dispersions to describe the sum of component parts brought together for a specific exigence. “The rules of formation are conditions of existence (but also of coexistence, maintenance, modification, and disappearance) in a given discursive division” (p. 38). In Google Analytics, aggregated data contribute to a discursive monument (p. 139) that describes visits to the website, ostensibly for a specific exigence (e.g. to learn more about the School of Professional and Continuing Studies degree programs: see Figure 2).

Web page screen capture

Figure 2: Sample University of Richmond School of Professional and Continuing Studies degree programs web page

These data, when examined by the end user, help determine whether content or information architecture of the website needs to be revised (e.g. visitors are spending less time on one program page than on another – does this suggest content is more or less compelling on one page than another? See Figure 3).

google analytics screen capture

Figure 3: Exigencies that arise from reviewing Analytics data: Should we revise content in the /hr-management/ folder because average time on page was so much less than /education/ during last month?

This process describes what I might consider a double exigence. On one hand, a visitor’s exigency inscribes visit data in Google Analytics; on the other hand, the end user reviews aggregated visit data to answer questions about the content and/or structure of the website.

Discursive Formation 

There is a single moment, one that is likely measured in milliseconds, even nanoseconds, in which the result of a user’s concrete interaction on a specific web page is the inscription of data on an encrypted Google server containing our Google Analytics account. This moment describes the discursive formation of a statement. Within Analytics, there is no way to have predicted that irruptive moment would occur, as the moment involved a single independent individual having a single, concrete, specific interaction with a specific web page. There is also no way to repeat that exact irruptive moment or that exact discursive enunciation. Even if the individual were to visit that page again within 30 days, the statement would be described in terms of a repeat rather than new visit, likely resulting from selecting a local browser bookmark or conducting a different search. Its existence as an Analytics artifact would therefore differ from previous recorded visits. The statement is not a structural unity; rather, it’s a function of the user’s instantiated interaction with a web page. “This is because it [a statement] is not itself a unit, but a function that cuts across a domain of structures and possible unities, and which reveals them, with concrete contents, in time and space” (Foucault 2010/1972, p. 87). In fact, I can see the monument of that moment in time and space in Analytics (see Figure 3, above, for a visualization of aggregated moments over a month).

Nodes in the Network

Google Analytics defines nodes in its network in terms of metrics and dimensions. Metrics are “quantitative measurements of users, sessions and actions” and dimensions are “characteristics of users, their sessions and actions” (Google Analytics Academy, 2013). In the Popplet in Figure 1 (above), “Referring Source” describes metrics and “Visitor Info” describes dimensions. To generate any relationship among metrics and dimensions, a visitor actively engages with a web page that contains embedded code. The visitor to the page, in this case, would be Foucault’s subject. Foucault describes discourse as being formed in the differential relationship among speaker, site, and subject’s relationship to the object (p. 55). Within Analytics, we can see these elements working together to generate a statement. The creator(s) of the web page, both its content and its embedded Analytics code, and the host of the web page, in physical and virtual space, act together as speaker. The speaker presents the page in question (the object) to the subject. The site is described in several different ways as the visitor interacts with the page: site is captured in dimensions that define user characteristics like amount of time spent on a page, browser type, platform, time of day, IP address of the visiting computer’s physical network, approximate geographic location of the visitor’s browser, and more. The subject’s relationship to the page (object) is captured by metrics that measure activity, including referring source (the link clicked or URL entered to arrive at the website in question). Metrics and dimensions work together as a discursive formation that is collected in Analytics. Without a differentiated relationship (in which the subject is entering URLs, selecting links, or performing some other positivistic action that generates browser activity), no discursive content is collected.
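
One way to picture how a metric and a dimension combine is a report that averages the metric for each value of the dimension. The following is a minimal sketch under that assumption; the records, the field names `browser` and `pageviews`, and the function `averageByDimension` are invented for illustration and do not come from the account discussed here or from the Analytics API.

```javascript
// Hypothetical records: one dimension (browser) and one metric (pageviews).
const records = [
  { browser: "Chrome", pageviews: 5 },
  { browser: "Chrome", pageviews: 3 },
  { browser: "Firefox", pageviews: 2 },
  { browser: "Firefox", pageviews: 4 },
];

// Averages a numeric metric for each distinct value of a dimension.
function averageByDimension(rows, dimension, metric) {
  const totals = new Map();
  for (const row of rows) {
    const key = row[dimension];
    const entry = totals.get(key) || { sum: 0, count: 0 };
    entry.sum += row[metric];
    entry.count += 1;
    totals.set(key, entry);
  }
  const averages = {};
  for (const [key, { sum, count }] of totals) {
    averages[key] = sum / count;
  }
  return averages;
}

const report = averageByDimension(records, "browser", "pageviews");
console.log(report);
```

Neither field means much alone in this sketch: the dimension supplies the grouping and the metric supplies the number being summarized.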

Definition

Google Analytics is a Foucauldian archive of networked discourse. The discursive formation occurs the moment a subject follows or enters a web link. The active interaction of subject, object, and speaker/author/creator generates a discursive statement. That statement’s networked archive is inscribed as an assemblage of data points. A summary of those data points—in relationship to one another as metrics and dimensions and in relationship to subject, object, speaker, goals, and events—appears below in Figure 4.

Popplet screen capture

Figure 4: Google Analytics as networked archive of a discursive statement—Popplet

Agency and Flow

Google Analytics nodes are metrics and dimensions. These nodes have no agency of themselves. They are created and inscribed in the moment of visiting a web page.

However, Analytics requires agency at higher levels of the network hierarchy, in the differentiated relationship among speaker (page author, coder, and host), site (metrics and dimensions), and the subject’s (visitor’s) relationship to the object (web page). Among these nodes (which are tangentially part of the Analytics network because the object contains the embedded Analytics code snippet), the subject is the agent that creates and sustains the network. As the result of a concrete action on a tracked web page, visit data are generated by the embedded code snippet and transmitted, via network hardware, to Google servers. At the same moment, a small piece of data (a cookie) is written to (or updated in) the subject’s browser; it assists the tracking snippet in determining whether the visitor is new or returning to the page. User agency can erase the cookie, which may affect the dimension of new or returning visitor, and the user can determine whether to follow links, stay on the page, or follow an embedded event (like watching a video or reviewing a news feed). Agency and flow are largely “single bus” activities: they travel from the visitor to the Google server, but not directly back to the visitor. Some indirect agency can be found in the speaker (author and coder) in that results of metrics and dimensions analyses may include changes to web pages that become new again to the subject (visitor).
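
The new-or-returning distinction can be sketched as a check for a stored visitor identifier. The code below is a simplified model, not Google's tracking code: the `Map` stands in for the browser's cookie store, and the cookie name `visitorId` and the function `recordVisit` are hypothetical.

```javascript
// A Map stands in for the browser's cookie store (an assumption for this sketch).
function recordVisit(cookieStore, generateId) {
  if (cookieStore.has("visitorId")) {
    // An identifier from an earlier visit is present.
    return { visitorType: "returning", visitorId: cookieStore.get("visitorId") };
  }
  // No identifier found: treat the visitor as new and store one for next time.
  const visitorId = generateId();
  cookieStore.set("visitorId", visitorId);
  return { visitorType: "new", visitorId };
}

// Simple counter-based identifier generator, used only for this example.
let counter = 0;
const nextId = () => `visitor-${++counter}`;

const cookies = new Map();
const first = recordVisit(cookies, nextId);  // no cookie yet
const second = recordVisit(cookies, nextId); // cookie found
cookies.delete("visitorId");                 // the user erases the cookie
const third = recordVisit(cookies, nextId);  // no cookie again

console.log(first.visitorType, second.visitorType, third.visitorType);
```

In this model, erasing the cookie makes the same person indistinguishable from a first-time visitor, which is the sense in which user agency can change what the archive records.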

The Archive and the Archaeologist

As the person who has been granted administrative authority by our central website authority (Director of Web Services) to interact with data in Google Analytics, I have access to a vast (albeit potentially incomplete, given Google’s ownership of the archive itself) portion of the archive of discourse. Foucault describes an archive as the collection of discursive formations, a finite collection that does not point to some transcendent future or some ideal meaning. “The never completed, never wholly achieved uncovering of the archive forms the general horizon to which the description of discursive formations, the analysis of positivities, the mapping of the enunciative field belong” (p. 131). Analytics does not ascribe meaning to the discursive moment itself. Rather, it records the irruptive actions of the discursive formation as a collection of statements in an archive. As an administrative user, I can access that archive and recreate a visualization of discursive moments that occurred. They are inscribed in the metrics and dimensions recorded at the irruptive moment. At best, I can “dig into” the archived results to determine patterns of activity (metrics) and characteristics (dimensions). I and other users with access to some or all aspects of the Analytics account are archaeologists plumbing the depths of the archive.

Google Analytics visualizes flow by archiving the actions that generated flow, but Analytics data themselves are not in flow. They’re an archive of data generated via discourse. For lack of a better analogy, GA is a chapter book I can read that contains archived evidence of discourse. Those traces represent, but are not themselves, the discursive formations of statements.

Conclusions

Google Analytics is a networked archive and an archived network.

Networked archive: The archive is networked in that it collects interrelated data points and demonstrates the relationship among those data points using visualizations and aggregated data. Those relationships can be explored by someone with user access to the Google Analytics account. In this networked instant, my role as archive archaeologist activates the network, which otherwise represents little more than a collection of data points that, at the moment of web browsing, represented active discourse.

Archived network: The network is archived in that Google Analytics collects the network activity of subjects, objects, and creators/speakers—their discourses. The subject’s interaction with a web page results in discursive formation of statements; a sample statement is visualized in Figure 4 (above). A collection of such statements from a single subject is aggregated as a user session, which I would consider Foucault’s concept of a monument. A collection of those user sessions (monuments) in aggregate is the archive, and that’s what Analytics gives access to.
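
The nesting described here (statements gathered into sessions, sessions gathered into the archive) can be sketched as a simple grouping step. The field `sessionId` and the function `buildArchive` are assumptions for illustration; Analytics determines session boundaries by its own rules, which this sketch does not attempt to reproduce.

```javascript
// Hypothetical statements: one record per tracked interaction.
const statements = [
  { sessionId: "a", page: "/degrees/" },
  { sessionId: "a", page: "/degrees/education/" },
  { sessionId: "b", page: "/degrees/" },
];

// Groups statements into sessions keyed by sessionId, preserving order.
function buildArchive(records) {
  const archive = new Map();
  for (const record of records) {
    if (!archive.has(record.sessionId)) {
      archive.set(record.sessionId, []);
    }
    archive.get(record.sessionId).push(record.page);
  }
  return archive;
}

const archive = buildArchive(statements);
console.log(archive.size);
console.log(archive.get("a"));
```

Each `Map` entry plays the role of a session (monument), and the `Map` as a whole plays the role of the archive.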

Note

Original snippet: [...they are simply collected and inscribed to Analytics in a transparent fashion...]

“Transparent” probably isn’t the right term. If you’ve ever seen a page load delayed by a message at the bottom of the browser window that says something like “Loading analytics.google.com/ga.js,” you’ve encountered the code snippet at work, struggling through network latency to load the data to Google’s servers.

References

Foucault, M. (2010). The archaeology of knowledge and the discourse on language. (A. M. Sheridan Smith, Trans.). New York, NY: Vintage Books. (Original work published in 1972)

Google Analytics Academy. (2013, October). Key metrics and dimensions defined [Video transcript]. Digital Analytics Fundamentals. Retrieved from https://analyticsacademy.withgoogle.com/assets/pdf/DigitalAnalyticsFundamentals-Lesson3.2KeymetricsanddimensionsdefinedText.pdf

How Analytics impacts your website code. (2014). Retrieved February 10, 2014, from https://support.google.com/analytics/answer/1008009?hl=en

We use our own products. (2014). Retrieved February 10, 2014, from https://support.google.com/analytics/answer/3000986?hl=en&ref_topic=2919631

[“Rue Foucault”: Creative Commons licensed image by Flickr user sarahstarkweather]