Case Study #4: FrankenTheory

Boundaries in My Analysis of Google Analytics

I am limiting my analysis of Google Analytics as an object of study by focusing on its activities and its data model as reported in terms of dimensions and metrics.

  • Google defines Analytics activity as collection, collation, processing, and reporting.
  • Google describes its data model as consisting of user, session, and interaction.
  • Google collects and reports data in terms of dimensions (“descriptive attribute or characteristic of an object”) and metrics (“Individual elements of a dimension that can be measured as a sum or ratio”) (Google, 2014).
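
To make these terms concrete, here is a minimal Python sketch of the user/session/interaction hierarchy and of a report that breaks a metric down by a dimension. The class names, field names, and sample values are hypothetical illustrations of the vocabulary above, not Google's actual schema.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Interaction:
    """A single hit, such as a page view (hypothetical shape)."""
    page: str


@dataclass
class Session:
    """One browsing session; `browser` is a dimension attached to it."""
    browser: str
    interactions: list = field(default_factory=list)


@dataclass
class User:
    """A user as GA sees one: an identifier, not a person."""
    client_id: str
    sessions: list = field(default_factory=list)


def sessions_by_browser(users):
    """Report the 'sessions' metric broken down by the 'browser' dimension."""
    counts = Counter()
    for user in users:
        for session in user.sessions:
            counts[session.browser] += 1
    return counts


users = [
    User("a1", [Session("Chrome", [Interaction("/")]),
                Session("Chrome", [Interaction("/apply")])]),
    User("b2", [Session("Firefox", [Interaction("/")])]),
]
report = sessions_by_browser(users)
print(dict(report))
```

In this sketch the dimension is a label carried by each session, and the metric is simply a count taken over those labels.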

These limits and terms are described in detail in my earlier Re/Proposed Object of Study: Google Analytics blog post.

I chose GA as my object of study because it’s a tool with which I work on a daily basis. I proposed GA as my object of study to my boss, the director of our school’s marketing and communications team, before formally proposing it in class because I wanted approval to use our school’s GA account in my study. I also expected my study to contribute to my understanding and use of GA in web development and management. A deeper understanding of GA as a network has provided a tool for both theoretical exploration and practical application.

Here’s an example of how applied this theoretical study has become. On April 16, with little fanfare, Google announced that it was replacing the term “visit” with the term “session” in its reports. I missed the announcement entirely, so I was surprised, while measuring the results of online advertising efforts in our campus newspapers, to discover that the “unique visits” metric I had been using was no longer available; it had been replaced by the “sessions” metric, without the “unique” modifier. I was also surprised to discover that the “unique visits” numbers did not match the “sessions” numbers when I re-ran prior reports to test data accuracy; “sessions” reported higher numbers than “unique visits” had reported. As we reached the first of May, when I normally complete April reports, I realized the full extent of the terminology change: “unique visits” were no longer being measured. Two-plus years of reporting data were potentially compromised as inaccurate, since we report data for month-on-month and year-on-year comparisons (e.g., does April 2014 look better than April 2013 in terms of overall unique web visits, and does the calendar year-to-date period of January-April 2014 look better than the previous January-April 2013 period?).

As a result of my study of the structure and function of Google Analytics, I had learned how GA counts session data. Critical inquiries had questioned whether GA’s reporting of unique visits could be accurate given the browsing patterns of today’s web visitors. Visits (now sessions) are defined as individual browsing sessions on a given website on a given browser and platform. A visitor (now user) who visits the same website using two different browsers (Chrome and Firefox, for instance) would be calculated as two unique visits (when unique visits were provided) because the session is browser specific. Furthermore, a visitor who visits the same website on a desktop platform browser, then revisits the same website on a mobile device, would be calculated as two unique visits, because the session is platform specific. In short, “unique visit” is really a calculation of “individual session” without a distinction of uniqueness of the visitor. Using the term “unique visit” suggested (and my marketing team and I took it to mean) visits by unique users, a measurement we considered superior because it suggested the actual number of visitors. What we should have been measuring, however, was visits, regardless of their “uniqueness,” because there was no unique quality to the visit in terms of the visitor. The end result is that I will need to re-record our historical data in terms of sessions rather than unique visits, potentially revealing visit patterns we had not before seen or understood.
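
The counting behavior described above can be sketched in a few lines of Python. This is an illustrative model only: it assumes, as the paragraph describes, that GA ties identity to a browser on a device rather than to a person, and the `Visit` record and sample data are hypothetical.

```python
from typing import NamedTuple


class Visit(NamedTuple):
    person: str   # the actual human being; GA cannot see this field
    device: str
    browser: str


visits = [
    Visit("Ana", "desktop", "Chrome"),
    Visit("Ana", "desktop", "Chrome"),   # a return visit in the same browser
    Visit("Ana", "desktop", "Firefox"),
    Visit("Ana", "phone", "Safari"),
    Visit("Ben", "desktop", "Chrome"),
]

# Every visit is its own session, whoever made it.
sessions = len(visits)

# A browser-bound identifier distinguishes browser/device combinations,
# so one person can appear as several "unique" visitors.
tracked_identities = len(set(visits))

# The number we assumed we were reporting: distinct people.
people = len({visit.person for visit in visits})

print(sessions, tracked_identities, people)
```

With this sample data, two people produce four tracked identities and five sessions, which is the gap between what “unique” suggested to us and what was actually being counted.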

Without this study of GA as a network, I would not have understood why reporting data did not match, and I would have struggled to find documentation of the issue. There remains little documentation from Google itself about the disappearance of unique visit as a reported metric as of this date. In short, the application of my theoretical exploration directly benefited me and my team, and ultimately our school and our understanding of our data within the framework of industry benchmarks.

Theories of Networks and Google Analytics

I’m using two theories — Castells’ network society and Deleuze & Guattari’s rhizome — to flesh out my understanding of Google Analytics and sketch out my Frankentheory of a network.

First, here’s a review of some familiar territory: My application of Castells’ network society to GA from Case Study #2. I’ve brought this in as a piece rather than linking to it because I’d like to make departures from specific aspects of this application in discussing Deleuze & Guattari and in sketching out a Frankentheory.

Defining Google Analytics

Castells (2010) considers technology to be society (p. 5). As a result, GA can be considered social. As an information technology, GA creates active connections between websites (data collection), Google data centers (data configuring and processing) including aggregated tables (processing), and GA administrator accounts (configuring and reporting). These active connections collect, mediate (configure and process), and report on the three aspects of the GA data model consisting of users, sessions, and interactions. These connections represent social actions. So Castells (2010) might define GA as a global informational network (p. 77) that collects data from and reports data to local nodes (websites). Google servers where data are configured and processed might be considered mega-nodes (p. xxxviii) that, through the iterative process of increasing user visits and interaction by improving website design and content based on GA-reported results, impose global logic on the local (p. xxxix).

Nodes in Google Analytics

Google Data Center Locations: Image from Google Data Centers.

Individual websites, GA account administrators, and website visitors are local nodes in the global informational network. Google data center servers are mega-nodes in the network. Google employees who program GA and maintain Google servers and centers are localized nodes in the global network. Google’s data centers are located in a variety of locations that include North America, South America, Europe, and Asia. Several are found in Castells’ (2010) “milieux of innovation” (p. 419) including Taiwan, Singapore, and Chile. Others are found in what appear to be unlikely global spaces, including Council Bluffs, Iowa, and Mayes County, Oklahoma. These locations reiterate Castells’ insistence that local and global are not mutually exclusive polar opposites; rather, the new industrial system is neither global nor local, but a new way of constructing local and global dynamics (p. 423). Websites, administrators, visitors, servers, and employees are simultaneously localized nodes (even the mega-nodes are situated in space and time) in the global informational network.

Agency among Google Analytics Nodes

GA account administrators and website visitors have the greatest level of agency in the network, while Google employees exert limited agency within the confines of their labor relationships and conditions. Account administrators would likely be considered among Castells’ (2010) “managerial elites” (p. 445), while Google employees who maintain and program the servers might be part of Castells’ disposable labor force (p. 295). Account administrators have the authority to configure GA data, including the ability to filter out results, narrow data collection according to metrics and dimensions, and even integrate external digital metrics in GA. This authority is not, of course, the authority of Google’s corporate structure and hierarchy, but within the boundaries of the GA data model and activities, account administrators exercise authority. Website visitors may choose to visit, or not visit, any given website, once or more than once (meaning a single session or multiple sessions). This agency includes the power to intentionally separate themselves from the network, meaning that users enter the network as nodes only when they visit the tracked website. Interestingly, only the GA administrator has authority to eliminate users from the network; account configurations may filter out visitors along several dimensions.
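
As a rough illustration of that last point, an administrator’s filter can be thought of as a rule that drops collected hits matching a given dimension value before they ever reach a report. The sketch below is hypothetical Python, not GA’s configuration interface; the dimension names and sample hits are invented for the example.

```python
# Hypothetical hit records; dimension names and values are illustrative only.
hits = [
    {"network": "campus", "country": "US", "page": "/"},
    {"network": "residential", "country": "US", "page": "/apply"},
    {"network": "residential", "country": "BR", "page": "/"},
]


def exclude(hits, dimension, value):
    """Drop every hit whose value for `dimension` equals `value`."""
    return [hit for hit in hits if hit.get(dimension) != value]


# Filter out on-campus traffic so staff visits do not inflate the numbers.
reported = exclude(hits, "network", "campus")
print(len(hits), "collected;", len(reported), "reported")
```

The visitor still visited; the filter only decides whether that visit counts as part of the reported network.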

Nodal Situation and Relation

Nodes are locally situated. While simultaneously part of the global informational economy, all of the nodes in the GA network are situated in a space and time. This simultaneous here/there compression of space and time is the origin of Castells’ (2010) “space of flows” (p. 408) and “timeless time” (p. 460). Websites are simultaneously hosted on physical servers around the world and locally viewed on specific platforms and media. Users are simultaneously accessing global data in territorial space on hardware. GA administrators are situated while configuring accounts and loading reports from the cloud. Google data centers are situated in specific locations, but they collect and process global data from local spaces and times. Google employees are culturally and territorially situated in the global Google labor pool.

Data rarely travels along parallel paths in the GA data model or GA activities. Website visit data are collected in the data model (user, session, and interaction data) and sent to Google data centers for processing and configuration. Other than writing unique user identification data onto cookies on users’ browsers or apps, little data travels from GA to users. Website content is indirectly affected by GA reports configured and read by GA administrators, but within the GA activity network, websites are unaffected by GA activity on the data model. Beyond the boundaries of the OoS, of course, Google serves plenty of data, in the form of ads, back to users. But that’s now beyond the scope of this study.

Movement in the Network

Framework for Movement: Wires in The Dalles, Oregon, Google Data Center. Photo from the Google Data Center Gallery.

Data moves in GA. More specifically, data in the GA data model moves in GA. Data are initiated by users visiting tracked websites. Specific frameworks must be in place for connections to occur and data in the data model to be collected. Namely, websites must contain GA tracking code, embedded in the website code through the agency of the GA administrator. The embedded GA tracking code enables, and the web browser and hardware afford (Norman, n.d.), the user to initiate a tracking pixel (gif) and generate data to be collected in the GA data model. Once collected, the data are configured (by the account administrator and by the GA algorithms), processed (in a largely opaque manner) and collated in aggregated data tables, and reported in visual and tabular representations. In Castells’ (2010) terms, data represent flow in the GA network (p. 442). That data is both spatial and temporal (it comes from and is attached to a specific territory and represents a specific, chronological activity), but it is also entirely global and digital.

Content in the Network

Data are collected and packaged — literally, in a gif image pixel — in parameters relating to user, session, and interaction. The GA tracking code encodes data and sends it to Google data centers where the data are decoded, configured based on administrator preferences, processed and repackaged in aggregated data tables, and made available to the account administrators. The reporting function remediates the data in visual and tabular formats for ease of reading and use. While the data reported are considered authoritative and authentic, the actual processing function remains largely proprietary, with only end results available to extrapolate what processing actually occurs. This black boxed processing function seems unlikely to represent Latour’s (2005) intermediary; as Fomitchev (2010) claims, there are probably processing functions that result in highly mediated, possibly even inaccurate, results. Castells (2010) would likely measure GA performance based on “its connectedness, that is, its structural ability to facilitate noise-free communication between its components” (p. 187). I hope we will see increased academic scrutiny focused on this perceived intermediary function in GA, even as we scholars rely on its results.
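
The packaging step described at the start of this section can be sketched as building the address of an image request whose query string carries the data. In the Python below, the collector host is a placeholder, and the parameter names other than `utmul` are my assumptions about the legacy tracking request; treat the whole thing as an illustration rather than Google’s implementation.

```python
from urllib.parse import urlencode

# Placeholder collector address; not Google's real endpoint.
ENDPOINT = "https://collector.example/__utm.gif"


def build_pixel_url(account, hostname, page, language):
    """Encode tracking data as query parameters on a GIF request."""
    params = {
        "utmac": account,    # assumed name: account identifier
        "utmhn": hostname,   # assumed name: host name of the tracked site
        "utmp": page,        # assumed name: path of the page viewed
        "utmul": language,   # browser language
    }
    return f"{ENDPOINT}?{urlencode(params)}"


url = build_pixel_url("UA-XXXXX-Y", "example.edu", "/admissions", "pt-br")
print(url)
```

Requesting that address returns a one-pixel image to the browser, while the query string delivers the encoded data to the server for decoding and processing.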

Birth and Death of a Network

Killing the Network: Failed Google data hard drives to be destroyed at the St. Ghislain, Belgium, Google Data Center. Photo from the Google Data Center Gallery.

Castells (2010) indicates that global informational networks emerge within milieux of innovation. These main centers of innovation are generally the largest metropolitan areas of the industrial age (p. 66), able to “generate synergy on the basis of knowledge and information, directly related to industrial production and commercial applications” (p. 67), and combine the efforts of the state and entrepreneurs (p. 69). Nodes on the network get ignored (and therefore cease to be part of the network) when they are perceived, by either the network or by its managerial elites, to have little value to the network itself (p. 134). The GA network grows as more nodes are added, either as users or as web pages with tracking code. GA administrators have agency to kill network nodes by removing tracking code from pages, or by directing IT managers to remove poorly performing web pages. Users have agency to quit visiting a website, thereby removing its value to the person. While many other actions by agents outside the GA network may affect the growth or dissolution of the network, they are outside the boundaries of the GA activity and data model.

And Now, the Rhizome

First a note about using Deleuze & Guattari. I did not enjoy or particularly “get” this reading the first time around. I grasped the broad strokes of the argument, but this is a chapter that requires close, multiple readings. What I discovered as I re-read the chapter in light of this analysis was that it addresses a significant aspect of networks that Castells does not — namely, a rhizomatic approach to networks problematizes the very definition of GA I established during my Re/Proposal. In short, applying Castells profited from the boundaries I placed on the OoS; applying Deleuze & Guattari requires eliminating the boundaries, preferring instead a situated, chronological cross-section as a set of boundaries enabling analysis.

Second, a note about this cross-sectional approach. In my scaffolding outline, I referred to a “flattened, rhizomatic” approach to composing and networks. Placing these two concepts together elicited useful feedback and discussion during the following class, as a result of which I realized that rhizomes are not naturally flattened. While Deleuze & Guattari (1980/1987) refer to flattened multiplicities, they do so in the context of many dimensions: “All multiplicities are flat, in the sense that they fill or occupy all of their dimensions” (p. 9). In fact, rhizomes are unpredictably dimensional; connections can and must occur along all dimensions: “any point of a rhizome can be connected to anything other, and must be” (p. 7). Since the boundaries of such a “network” can’t really be established, one way to analyze the rhizome is to take a cross-sectional slice, situated in space and time, of the rhizome and examine the relationships among points in the rhizome in this “flattened” slice. The rhizome is a multidimensional assemblage, not a flattened network.

These two notes represent realizations that complicate and problematize the restrictive perspective I offered of GA as a network. Limiting the network to GA activities and data model resulted in limits to what I could discuss in my application of Castells. For example, in discussing the birth and death of the network, I cut short my analysis with this limiter: “While many other actions by agents outside the GA network may affect the growth or dissolution of the network, they are outside the boundaries of the GA activity and data model.” Similarly, when addressing nodal situation and relations, I wrote this limiting statement: “Beyond the boundaries of the OoS, of course, Google serves plenty of data, in the form of ads, back to users. But that’s now beyond the scope of this study.” These limits were real — the boundaries I established for describing GA as a network did, in fact, prevent addressing aspects of the network — but they do not reflect an accurate mapping of GA network activity. Deleuze & Guattari (1980/1987) point out that “the rhizome is altogether different, a map and not a tracing” (p. 12, emphasis original). Tracing is the role of centralized control, of perspectives limited by binaries and “tree logic”: “What distinguishes the map from the tracing is that it is entirely oriented toward an experimentation in contact with the real” (p. 12). A mapped understanding of GA must address its real complexity, its nodes and connections in terms of real experiences, not centrally-defined boundaries.

A mapped, cross-sectional perspective on GA as a network was, to my surprise, the goal of my first case study. In fact, the first visualization of the network I provided was a portion of a Popplet titled “Visualizing a Partial Google Analytics Data Set.”

Figure 1: Visualizing a sample Google Analytics data set from Case Study #1 (Popplet)

My original attempt to visualize and define GA as a network was more chaotically rhizomatic than any other depiction I’ve attempted since. In fact, for much of the rest of the semester, I’ve been struggling to trace my understanding of GA as a network, when in fact Deleuze & Guattari would have me do precisely the opposite: map the multiplicity of GA as assemblage, depicted as a cross-sectional portion of the network situated in time and space.

Mapping GA as rhizome means accepting that users, servers, computers, mobile devices, browsers, operating systems, marketers, developers, programmers, designers, GA account administrators, Google data centers, Google programmers and server maintenance personnel, homes, home offices, office buildings, network cables, routers, switches, weather conditions, satellites, trans-Atlantic communications cables, seawater, signal degradation, electrons, light energy, insulators, and theorists must be included as nodes in the GA rhizome. GA collects data on some of these dimensions; other dimensions, however, are embedded as affordances and constraints to the web technologies that enable GA to measure dimensions at all, so these affordances and constraints must also be depicted in a cross-section of GA as rhizome.

There’s a reason Deleuze & Guattari did not include a visualization of the rhizome in their chapter. It’s too complex, too multi-dimensional, to capture in a two-dimensional drawing. But I’m going to give it a shot.

Popplet mind map

Figure 2: Visualizing Google Analytics as a Rhizome—Popplet

Figure 2 depicts a rhizome cross-section of a single node, User, and the connections that exist among dimensions of the GA data model, website affordances and constraints, website creators, and Google personnel. What this depicts is that a User connects from and to most of the nodes, that the nodes connected to the User are connected to one another, and that relationships proliferate exponentially if extrapolated to the entire list of dimensions. And these dimensions are themselves necessarily limited (perhaps even cross-sectioned) by the visualization technology and my own time and patience. Were I to connect all of the non-technological aspects to the User, like location and weather conditions, the rhizome could go on forever. The point is that mapping the actual rhizome, rather than tracing the limits of the network, generates the rhizome itself. Or, as Deleuze & Guattari (1980/1987) propose, “The map does not reproduce an unconscious closed in upon itself; it constructs the unconscious. It fosters connections between fields, the removal of blockages on bodies without organs, the maximum opening of bodies without organs onto a plane of consistency. It is itself a part of the rhizome” (p. 12).
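
A quick count gives a sense of how fast such a map outgrows the page. If any node may connect to any other, the number of possible pairwise links among n nodes is n(n-1)/2, and the number of possible groupings of nodes (subsets of any size) is 2^n, which is where the growth becomes exponential in the strict sense. The Python sketch below uses a hypothetical six-node sample; the node names are only examples.

```python
from itertools import combinations
from math import comb

# A hypothetical handful of nodes from the rhizome.
nodes = ["user", "browser", "mobile device", "router",
         "data center", "administrator"]

# Every possible pairwise connection among the sample nodes.
links = list(combinations(nodes, 2))
print(len(nodes), "nodes,", len(links), "possible links")

# Pairwise links and possible groupings as the node count grows.
growth = {n: (comb(n, 2), 2 ** n) for n in (6, 12, 28)}
for n, (pairs, groupings) in growth.items():
    print(n, pairs, groupings)
```

Even before weather and seawater enter the picture, a few dozen nodes are enough to make a complete drawing impractical.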

Closing Gaps

Castells offers a remarkably cogent and highly matched means of analyzing GA as a network as defined by Google itself: in terms of GA activities and the GA data model. Castells addresses issues of localization and globalization in ways that make sense for GA defined as Google defines it. Here’s my conclusion from Case Study #2.

While Castells addresses the local, he tends to discuss localization in terms of groups rather than individuals. In this way, Castells more closely resembles ecological theories that apply to organism categories rather than to individual organisms. He regularly refers to groups of people and nodes: the managerial elites (rather than individual leaders), the technological revolution (rather than revolutionary technology pioneers), and the global and local economy (rather than the economic wellbeing of the individual small business owner). The result is that I can’t really address the individual user as a single agent in GA. Then again, this is hardly a hardship, in that GA aggregates data and anonymizes identities. GA, too, resembles an ecological theory rather than a rhetorical theory; it focuses on profiles of territorially localized users rather than individual users in a specific city. As a result, Castells and GA match rather nicely in defining the boundaries of the discussion. In fact, I’d argue that GA (and Google more broadly) represent precisely the network society Castells defined in his text. It’s interesting that he didn’t predict or recognize the rise of Google as I would have expected him to do in his 2010 preface. And Castells’ (2010) discussion of communication media clearly did not predict the popularity or ubiquity of Google’s YouTube on the network as a differentiated medium whose content is driven by user tastes and users-as-producers (p. 399).

Once we admit the possibility that GA is not just what Google says it is, but that GA represents a much wider and broader rhizome of connections, Castells no longer adequately describes the network. GA as rhizome requires additional theoretical application for understanding and visualizing.

Frankentheory

After a semester of theorizing, what’s my own theory of networks?

Rhizome illustration

What I think a rhizome looks like. “The Opte Project” by Barrett Lyon. Creative Commons licence CC BY-NC-SA. From The Accidental Technologist’s post The Way of the Rhizome #h817open

Networks are local. They are also global. This is not dualism, but convergence. Local and global converge in time and space, and we must be prepared to engage in both simultaneously. The global remains rooted in the local; local conditions and environments affect and influence connections to the global. In our efforts to understand global network activity, we should not lose sight of the affordances and constraints of local conditions, including available access to the internet, proximity to other nodes, and the politics of nodal connectivity.

Networks enable nodes. A collection of nodes does not a network make. Networks enable nodal activity; this means that network frameworks must be in place for networks to exist and start collecting nodes. This also means that the activity of collecting nodes is networked. The network can grow well beyond its framework in unexpected and unpredictable ways, and this should be expected, anticipated, and planned for to the extent possible.

Networks are rhizomes. Or at least rhizomatic. They are unlikely to require or have inherent hierarchical structures; these will have to be applied to the network. Rhizomatic structure and growth suggest unpredictability of nodal connections. As I understand rhizomes, the importance of any node being able to connect to any other node — or to anything, for that matter — cannot be overstated. It is this aspect of rhizomatic connectivity that I would consider “flat.” There are neither more nor less important nodes; there are no inherent political relationships between and among nodes. Any political power attributed to the node will either be self-contained or bestowed from outside the rhizome; within the ecology of the rhizome, all nodes are equally capable of connecting to all other nodes and to anything outside the rhizome. In this sense, I would suggest that rhizomes are politically flat.

Networks can be analyzed in cross-section; they are very difficult to analyze in real time as they exist. They are both too large to examine as a whole and too complex to analyze as active connections are “firing.” Cross-sections can be taken of specific aspects of the network or of the network as a whole. Cross-sections are frozen in time and show little activity, merely traces that can be followed and explored. Networks contain a multiplicity of simultaneous connective activity; our ability to analyze simultaneity is limited. Instead, we must follow specific threads of connectivity through time and space to analyze them. Such analysis is made possible through cross-section.
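
One way to picture a cross-section is as a time-bounded filter over a record of connections: the slice keeps only the links that “fired” within a chosen window and freezes them for inspection. The Python sketch below is a hypothetical illustration; the node names and timestamps are invented.

```python
from datetime import datetime

# Hypothetical record of connections: (from node, to node, when it fired).
connections = [
    ("user", "website", datetime(2014, 4, 16, 9, 0, 0)),
    ("website", "data center", datetime(2014, 4, 16, 9, 0, 1)),
    ("data center", "administrator", datetime(2014, 5, 1, 8, 30, 0)),
]


def cross_section(connections, start, end):
    """Return the connections that fired at or after start and before end."""
    return [(source, target)
            for source, target, when in connections
            if start <= when < end]


april_16 = cross_section(connections,
                         datetime(2014, 4, 16), datetime(2014, 4, 17))
print(april_16)
```

The slice shows traces rather than activity: what connected to what on that day, not the connecting itself.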

Google Analytics’ Contributions to English Studies

First, GA can and should be critically examined as a rhetorical technology. GA activity includes reporting. These reports are discursive and rely on visual and written rhetoric to communicate meaning. The “meaning” of a GA report can be manipulated like any other statistical data. Its meanings depend on local environment and conditions, comfort with standard and local meanings of GA terminology (like “session” or “user,” for example), and familiarity with the GA data collection model. Its visualizations can be analyzed for clarity and transparency, for cultural or sociological bias (related to colors used, default views, and other determined factors), and for its connectedness to other discursive elements (like websites whose visitor traffic it measures). Critical rhetorical analysis of GA reports could easily be an object of study by itself.

Second, GA can and should be critically approached as a black-boxed network whose data manipulation and configuration are largely hidden, lacking transparency. Google’s business model depends on its proprietary search results algorithms. It protects that algorithm carefully; while GA reporting is not directly dependent on the search algorithm, website visit data contribute to search results. Full disclosure of its data configuration and processing activities would likely reveal much about Google’s search algorithm; as a result, these processes are only partially disclosed. Google’s own Analytics help files and tutorials explain the order, purpose, and general procedures of data configuration and processing, but these files and tutorials do not reveal in-depth specifics on how collected data are processed into aggregate tables, nor how those tables are then indexed for rapid, near-instant on-the-fly reporting. Google’s market share in web search and advertising results in the formation of what Althusser (1971) called a repressive state apparatus; I suggest that GA is an ideological expression of that apparatus, or an ideological state apparatus. While neither Google nor GA is a state in a political sense, Google’s size and clout suggest an industrial state-like entity with resources and influence strong enough to manipulate or evoke responses from other political entities, as it has done recently in relations with the government of Russia (Khrennikov & Ustinova, 2014).

Third, GA results themselves can and should be critically examined. Far too many otherwise critically-written journal articles use GA results as instrumental rather than mediated. That is, GA report data are accepted as unqualified and accurate reflections of website traffic rather than mediated reports of visitor activity. Little care is given to providing GA-specific definitions of terminology like “session” and “user.” This acceptance can result in significant reporting issues — I’m experiencing a particular situation as I type in which Google has revised a reporting criterion from “visits” to “sessions.” While these two terms are being used synonymously, one implication is that GA has removed the dimension of “unique visit” from its reporting matrix. GA’s definition of session doesn’t differentiate between unique or repeat visits among sessions, as each session is considered a unique event regardless of the identity (which may not be accurately known) of the visitor. Several reports I provide my dean and marketing director were based on unique visit numbers; as a result, I’m forced to rework all of my reports to reflect sessions rather than unique visits. This has implications for perceptions of “progress” and “improvement” among senior leadership, a particularly uncomfortable reality brought to bear this week. (Google changed its reporting structure without fanfare on April 16, announced in a Google+ post.)

Finally, GA’s data collection method can and should be understood as discursive. Individual GIF calls that report data back to Google servers do so in text tags attached to tracking pixels generated through data collection. For example, every GA tag begins with “utm,” a prefix GA itself does not explain (it is generally understood to stand for “Urchin Tracking Module,” a holdover from Urchin, the analytics software Google acquired and built GA upon). Many data points are collected in abbreviations whose symbolic meanings would be interesting to explore. Again, GA offers few clues for more obscure abbreviations, although Google does provide a list of many (but not all) dimensions collected via tracking pixel calls. Some of these symbols are explained in the Google Developers (2014) Tracking Code Overview. While parameter abbreviations are obscure, the values themselves are even less clear. Consider the parameter/value pair “utmul=pt-br”: the utmul parameter represents “browser language” while the pt-br value represents “Brazilian Portuguese.” This symbolic communication system is itself fodder for rhetorical analysis and interpretation.
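
Reading such a call is a small act of translation, which the following Python sketch tries to make visible. The lookup tables contain only the one pair discussed above, and the example URL (including its host and the `utmp` parameter) is hypothetical; anything the tables do not cover is passed through undeciphered.

```python
from urllib.parse import parse_qs, urlsplit

# Tiny, hand-made glossaries; only the pair discussed in the text is included.
PARAMETER_LABELS = {"utmul": "browser language"}
VALUE_LABELS = {"utmul": {"pt-br": "Brazilian Portuguese"}}


def decode_pixel_call(url):
    """Translate parameter/value pairs in a tracking URL into plain words."""
    query = parse_qs(urlsplit(url).query)
    decoded = {}
    for name, values in query.items():
        value = values[0]
        label = PARAMETER_LABELS.get(name, name)
        decoded[label] = VALUE_LABELS.get(name, {}).get(value, value)
    return decoded


decoded = decode_pixel_call(
    "https://collector.example/__utm.gif?utmul=pt-br&utmp=%2Fadmissions"
)
print(decoded)
```

The untranslated leftovers in the output are the interesting part for a rhetorical reading: they mark where the symbol system stays opaque to anyone without the glossary.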

References

Althusser, L. (1971). Ideology and ideological state apparatuses (Notes towards an investigation). In B. Brewster (transl.) & A. Blunden (trans.), Louis Althusser archive. Retrieved from https://www.marxists.org/reference/archive/althusser/1970/ideology.htm (Original work published in Lenin and philosophy and other essays)

Castells, M. (2010). The rise of the network society [2nd edition with a new preface]. Chichester, UK: Wiley-Blackwell.

Deleuze, G., & Guattari, F. (1987). A thousand plateaus: Capitalism and schizophrenia. (B. Massumi, Trans.) Minneapolis, MN: University of Minnesota Press. (Original work published 1980)

Google. (n.d.). Algorithms. Inside Search. Retrieved 1 May 2014 from https://www.google.com/insidesearch/howsearchworks/algorithms.html

Google. (2014). Dimensions and metrics. Google Analytics Help. Retrieved from https://support.google.com/analytics/answer/1033861?hl=en

Google Analytics. (2014, April 16). Understanding user behavior in a multi-device world (Web post). Google+. Retrieved 1 May 2014 from https://plus.google.com/+GoogleAnalytics/posts/LCLgkyCn4Zi

Google Developers. (2014, April 16). Tracking code overview. Google Developers. Retrieved from https://developers.google.com/analytics/resources/concepts/gaConceptsTrackingOverview#gifParameters

Khrennikov, I., & Ustinova, A. (2014, May 1). Putin’s next invasion? The Russian web. Bloomberg Businessweek. Retrieved from http://www.businessweek.com/articles/2014-05-01/russia-moves-toward-china-style-internet-censorship

[ Feature image: Today's latte, Google Analytics. CC licensed image from Flickr user Yuko Honda ]

Case Study: Scaffolding Outline

OoS: Google Analytics

  • Activities addressed in my OoS: Collection, Collation, Processing, Reporting
  • GA Data Model: User (Visitor), Session (Visit), Interaction (Hits)
  • Data Model Collections and Reports: Dimensions (“descriptive attribute or characteristic of an object”) and Metrics (“Individual elements of a dimension that can be measured as a sum or ratio”) (Google, 2014).

Theories & Selection Rationale

  • Ecosystem Ecology (Bateson, 1972/1987; Gibson, 1979/1986; Guattari, 1989/2012; Spellman, 2007)
    • Boundaries are difficult to define: Mirrors my struggle to define GA boundaries
    • Inter-relatedness to neighboring ecosystems: GA connects and measures incoming & outgoing links
    • Limits analysis to groups of (rather than individual) living and/or nonliving things: GA only reports aggregated behaviors, even though it collects user data
  • Neurobiology (Annenberg Learner, 2013)
    • Demonstrates interconnectedness of various nodes and frameworks: GA data model reports metrics interconnected with dimensions to reflect user behaviors; GA also enables both SPCS account and UR roll-up account
    • Uses hippocampus as server metaphor: Google data center as input/output hub for GA data collation and processing
    • Affirms difference between input and output: GA collects data via data model (input) and reports results via aggregated data tables and visualizations (output)
  • Network Society (Castells, 2010)
    • Limits analysis to groups rather than individuals: GA only reports aggregated behaviors, even though it collects user data (cf. Ecosystem Ecology, above)
    • Addresses movement of data through the network: GA focuses on movement of data from website server (collection) to Google data centers (collation & processing) to administrative accounts (reporting), although this movement is entirely serial rather than parallel
    • Provides hierarchy of nodes: GA endows administrators with creative, destructive, and manipulative authority in relation to data; other nodes have far less agency
  • Social Network (Deleuze & Guattari, 1980/1987; Scott, 2000; Rainie & Wellman, 2012)
    • Recognizes value of social capital in network growth: GA enables measurement of increased or decreased engagement and provides help to increase engagement (social capital)
    • Reveals rhizomatic (and unpredictable) character of network connections: GA visualizes network connectivity in myriad visualizations, tables, and downloadable files (which can also be visualized)
    • Values growth and sustenance of weaker ties: GA sets up goals that seek to measure and value increased engagement on less-engaging content

Similarities

  • Focus on flattened network
  • Emphasis on rhizomatic rather than hierarchical connections
  • Address difficulties of establishing boundaries
  • Recognize value of grouping in discussing large-scale network systems
  • Focus on nodal groupings rather than individual nodal identities
  • Define network as mediator rather than intermediary (Latour, 2005)

Minding the Gaps

  • Localization: Neurobiology and Network Society affirm the value and influence of local conditions on global networks that Ecosystem Ecology and Social Network either undervalue or do not address.
  • Activity and Flow: Ecosystem Ecology, Neurobiology, and Network Society address movement of data and value across or through the network that Social Network does not directly address.
  • Agency: Social Network and Neurobiology ascribe local agency to nodes that Ecosystem Ecology (focused on instinct) and Network Society (focused on hierarchical relationships among managerial elites) do not accept or address.

My Position as Scholar

These theories align with the following statements of my theories of scholarship and pedagogy:

  • I embrace the flattened, rhizomatic character of the 21st-century classroom as a (possibly the most) valid model for preparing students for the world of the 21st-century networked workplace.
  • I embrace composition as social and situated within a larger global context, and I embrace and value local and global aspects of the composing experience as preparation for both academic scholarship and professional management.
  • I embrace scholarship as collaborative and networked, and revel in the breakthroughs made more likely and/or possible through collaborative, rather than individual, scholarship.
  • I embrace pedagogy as joining with a group of students in a flattened community of learners in which, to the extent possible, hierarchical teacher-student relationships are replaced by flattened learner-learner relationships.
  • I embrace and seek connections between scholarship and utility, between theory and praxis, and between academic and alt-academic pursuits and theorizing.
  • I embrace Yagelski’s (2006) “troublemaking collectivity” as a mantra for the disruptive role of my own and my collaborative scholarship and pedagogy in institutions entrenched in antiquated, outdated theoretical paradigms.
  • I embrace as vital the role of network activity in learning activities.

U.S. Atlantic Seaboard at Night: May 23, 2011. Original image from NASA Earth Observatory.

My Biases and Background

These theories align with my own biases and background in the following ways:

  • I am now, and have been since 2000, employed in an alt-academic role as a full-time marketing web manager and part-time adjunct professor of liberal arts and scholar of English studies. This role influences the value I place on connections between theory and praxis, between research and application.
  • As former director of a summer residential governor’s school for gifted and talented high school students, I value pedagogical theory and praxis that views standards-based education as little more than a starting point for true academic excellence. This experience influences my preference for network activity in learning activities, especially over standardized assessment tools and products.
  • As a professional writer and marketer, I use academic skills like research and collaborative composing in non-academic settings. This experience influences my preference for collaborative, team-based solutions to professional challenges, including audience research.
  • As a third culture kid who grew up outside of the U.S., I embrace the global nature of communications, commerce, development, employment, and growth. This experience influences my desire to place local activities and culture within global networks.
  • As a web developer, I value and prefer platform- and system-agnostic open-source software solutions over commercial, and especially proprietary, software solutions. This influences my desire to flatten hierarchical structures, especially of proprietary commercial interests, in favor of open-source and open-access models wherever feasible.
  • I am a social media marketer. As a result, I value social networks beyond their community-building application; I value them for monetization via targeted advertising. My role as a social media marketer influences my willingness to find value in globally-accessible (but not open-access or open-source) products like Google Analytics while pushing for greater openness and access to these social networking products (see the troublemaking collectivity statement, above).
  • I measure web visit data, and my job as web manager exists because I can demonstrate value through higher visit rates, greater visibility across networks, and ultimately higher admissions and enrollment figures. In a professional and continuing studies unit, the value of individual admissions and enrollments is taken very seriously. This experience requires me to work with Google Analytics, which directly influenced my choice of Google Analytics as my object of study. I enter this study with an eye towards providing my team and my administration critical theoretical approaches to data measurement that result in better, clearer communication with prospective and current students.

References

Annenberg Learner. (2013). Neurobiology. Rediscovering biology: Molecular to global perspectives [Online textbook]. Retrieved from http://www.learner.org/courses/biology/units/neuro/index.html

Bateson, G. (1987). Steps to an ecology of mind: Collected essays in anthropology, psychiatry, evolution, and epistemology. Northvale, NJ: Jason Aronson, Inc. (Original work published 1972)

Castells, M. (2010). The rise of the network society [2nd edition with a new preface]. Chichester, UK: Wiley-Blackwell.

Deleuze, G., & Guattari, F. (1987). A thousand plateaus: Capitalism and schizophrenia. (B. Massumi, Trans.) Minneapolis, MN: University of Minnesota Press. (Original work published 1980)

Gibson, J. J. (1986). The ecological approach to visual perception. Hillsdale, NJ: Lawrence Erlbaum Associates. (Original work published 1979)

Google. (2014). Dimensions and metrics. Google Analytics Help. Retrieved from https://support.google.com/analytics/answer/1033861?hl=en

Guattari, F. (2012). The three ecologies. (I. Pindar & P. Sutton, Trans.) London, UK: Continuum International Publishing Group. (Original work published 1989)

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford, UK: Oxford University Press. Clarendon Lectures in Management Studies

Rainie, L., & Wellman, B. (2012). Networked: The new social operating system. Cambridge, MA: MIT Press.

Scott, J. (2000). Social network analysis: A handbook (2nd ed.). Los Angeles, CA: Sage.

Spellman, F. R. (2007). Ecology for nonecologists. Lanham, MD: Government Institutes, 3-23; 61-84.

Yagelski, R. P. (2006). English education. In B. McComiskey (Ed.), English Studies: An Introduction to the Discipline(s) (pp. 275-319). Urbana, IL: NCTE.

[ Feature image: Bamboo Scaffolding, Cambodia. CC licensed image from Flickr user Lorna ]

Case Study #3: GA and Castells’ Network Society

Literature Review

As I noted in Case Study #2, Google Analytics (GA) appears most often in scholarship as a black-boxed application that reports (presumed accurate) visitor frequency and browsing behavior on websites. Websites are said to be “successful” in terms of reported visitor traffic to the site, number of pages viewed while on the site, length of browsing session, and additional metrics and dimensions. Few questions are asked of the application itself; its results are considered authoritative and accurate.

For this literature review, I sought scholarship that challenges the assumption of accuracy or convenience of GA data, whether in terms of collecting, configuring, processing, or reporting data. I also shifted my focus from searching in social sciences and humanities databases to searching in computer sciences-related databases. The results were mixed. On one hand, I found more scholarship that questioned Google Analytics and web/digital analytics in general; on the other hand, I found the scholarship less thorough than humanities or social sciences research.

Dhiman and Quach (2012) report briefly on the rationale and results of a workshop at CASCON ‘12 (Center for Advanced Studies on Collaborative Research) introducing Google’s Go and Dart, two programming languages under development (at the time) to enable “better analytics” and “better applications” (p. 253). The challenge Dhiman and Quach identify related to GA is that “in a world where there is an emergence of extensive use of analytics, data and fact-based decision making, spontaneous sorting of data becomes imperative…. [A]nalytics are crucial for knowledge discovery, business growth and technological improvements” (p. 253). Google Go is described as a “language that allows programmers to exploit concurrency in program by providing simple yet powerful features” that “make it an excellent language deploying application on concurrent systems” (p. 253). GA is one of many applications engaged in providing digital performance data; Go appears to provide programmers a language that gives concurrently operating applications the ability to communicate with one another and to report on data from multiple applications at the same time. GA and other data-generating tools are implicitly critiqued for reporting data in a delayed and proprietary form that requires a mediating application to collate and report data spontaneously.

Fomitchev (2010) is far more direct in his GA critique. In a two-page poster presented at the 19th International Conference on World Wide Web, Fomitchev identifies specific inaccuracies in GA’s collecting of recurring website traffic using cookies. Specifically, Fomitchev finds that “Google Analytics ‘absolute unique visitor’ measure is shown to produce a similar 6x overestimation” of unique visitors (p. 1093). Based on comparative studies that collect recurrent visitor data via multiple methods, Fomitchev elaborates that “Google’s ‘absolute unique visitors’ are not at all unique: the inflation depends on the visitation frequency and grows linearly with time” (p. 1094, emphasis original). Given the potential, even likely, inflation of unique visitor numbers in GA reporting, Fomitchev concludes that the “discrepancy between unique cookies and unique visitors eases doubts in the accuracy of published unique visitor stats used to solicit advertising money” (p. 1094). While the critique of GA collecting methods is explicit, the implicit critique of using GA unique visitor reports to solicit funds for advertising seems more damning. GA as a free service must be monetized in Google ledgers, and advertising is where Google excels. If its reported data are inaccurate, its ethical foundation on accurate reporting (accuracy that is taken for granted, as shown in most studies) becomes suspect.
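
Fomitchev’s point about inflation growing with time can be illustrated with a toy model. The sketch below is not Fomitchev’s method or data. It assumes, purely for illustration, a fixed group of people who each visit daily and clear their cookies on a fixed schedule, and it counts one “unique visitor” per distinct cookie.

```python
def count_unique_cookies(people, days, clear_every):
    """Count distinct cookies seen when every person visits daily and
    clears cookies every `clear_every` days (hypothetical, simplified behavior)."""
    cookies_seen = set()
    for person in range(people):
        for day in range(days):
            # Each clearing starts a new cookie "generation" for that person.
            generation = day // clear_every
            cookies_seen.add((person, generation))
    return len(cookies_seen)


def inflation_factor(people, days, clear_every):
    """Ratio of cookie-based 'unique visitors' to the actual number of people."""
    return count_unique_cookies(people, days, clear_every) / people


if __name__ == "__main__":
    for days in (30, 90, 180):
        factor = inflation_factor(people=100, days=days, clear_every=30)
        print(f"{days:>3} days: reported/actual = {factor:.1f}")
```

In this toy model the ratio rises in step with the reporting window, because each clearing adds one more cookie per person while the number of actual people stays the same.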

Back to the OoS

When I re-proposed Google Analytics as my object of study, I narrowed my discussion of GA to its data model and its activities. Both Dhiman and Quach (2012) and Fomitchev (2010) offer meaningful connections between GA and my theoretical lens, Castells’ (2010) social network theory. Dhiman and Quach reiterate the validity of Castells’ “space of flows” and “timeless time” in their needs assessment for a programming language that demonstrates “lightweight concurrency” in its ability to create sets of “lightweight communicating processes” between various programs running and reporting simultaneously (p. 254). Fomitchev (2010) corroborates Castells’ construction of “real virtuality” in which the local and the global function interchangeably and simultaneously, recognizing that GA, a global analytics application, is “fooled by periodic [local] cookie clearing and the multitude of [local] Internet access locations/devices…” (p. 1094).

Defining Google Analytics via Social Network Theory

Castells (2010) considers technology to be society (p. 5). While this seems extreme (I’d be more willing to accept technology as an aspect of society), the result is that GA can be considered social. As an information technology, GA creates active connections between websites (data collection), Google data centers (data configuring and processing) including aggregated tables (processing), and GA administrator accounts (configuring and reporting). These active connections collect, mediate (configure and process), and report on the three aspects of the GA data model consisting of users, sessions, and interactions. These connections represent social actions. So Castells (2010) might define GA as a global informational network (p. 77) that collects data from and reports data to local nodes (websites). Google servers where data are configured and processed might be considered mega-nodes (p. xxxviii) that, through the iterative process of increasing user visits and interaction by improving website design and content based on GA reported results, impose global logic on the local (p. xxxix).

Nodes in Google Analytics

Google Data Center Locations: Image from Google Data Centers.

Individual websites, GA account administrators, and website visitors are local nodes in the global informational network. Google data center servers are mega-nodes in the network. Google employees who program GA and maintain Google servers and centers are localized nodes in the global network. Google’s data centers are located in a variety of locations that include North America, South America, Europe, and Asia. Several are found in Castells’ (2010) “milieux of innovation” (p. 419) including Taiwan, Singapore, and Chile. Others are found in what appear to be unlikely global spaces, including Council Bluffs, Iowa, and Mayes County, Oklahoma. These locations reiterate Castells’ insistence that local and global are not mutually exclusive polar opposites; rather, the new industrial system is neither global nor local, but a new way of constructing local and global dynamics (p. 423). Websites, administrators, visitors, servers, and employees are simultaneously localized nodes (even the mega-nodes are situated in space and time) in the global informational network.

Agency among Google Analytics Nodes

GA account administrators and website visitors have the greatest level of agency in the network, while Google employees exert limited agency within the confines of their labor relationships and conditions. Account administrators would likely be considered among Castells’ (2010) “managerial elites” (p. 445), while Google employees who maintain and program the servers might be part of Castells’ disposable labor force (p. 295). Account administrators have the authority to configure GA data, including the ability to filter out results, narrow data collection according to metrics and dimensions, and even integrate external digital metrics in GA. This authority is not, of course, the authority of Google’s corporate structure and hierarchy, but within the boundaries of GA data model and activities, account administrators exercise authority. Website visitors may choose to visit, or not visit, any given website, once or more than once (meaning a single session or multiple sessions). This agency includes the power to intentionally separate themselves from the network: users enter the network as nodes only when they visit a tracked website. Interestingly, only the GA administrator has authority to eliminate users from the network; account configurations may filter out visitors along several dimensions.

Nodal Situation and Relation

Nodes are locally situated. While simultaneously part of the global informational economy, all of the nodes in the GA network are situated in a space and time. This simultaneous here/there compression of space and time is the origin of Castells’ (2010) “space of flows” (p. 408) and “timeless time” (p. 460). Websites are simultaneously hosted on physical servers around the world and locally viewed on specific platforms and media. Users are simultaneously accessing global data in territorial space on hardware. GA administrators are situated while configuring accounts and loading reports from the cloud. Google data centers are situated in specific locations, but they collect and process global data from local spaces and times. Google employees are culturally and territorially situated in the global Google labor pool.

Data rarely travels along parallel paths in the GA data model or GA activities. Website visit data are collected in the data model (user, session, and interaction data) and sent to Google data centers for processing and configuration. Other than writing unique user identification data onto cookies on users’ browsers or apps, little data travels from GA to users. Website content is indirectly affected by GA reports configured and read by GA administrators, but within the GA activity network, websites are unaffected by GA activity on the data model. Beyond the boundaries of the OoS, of course, Google serves plenty of data, in the form of ads, back to users. But that’s beyond the scope of this study.

Movement in the Network

Framework for Movement: Wires in The Dalles, Oregon, Google Data Center. Photo from the Google Data Center Gallery.

Data moves in GA. More specifically, data in the GA data model moves in GA. Data are initiated by users visiting tracked websites. Specific frameworks must be in place for connections to occur and data in the data model to be collected. Namely, websites must contain GA tracking code, embedded in the website code through the agency of the GA administrator. The embedded GA tracking code enables, and the web browser and hardware afford (Norman, n.d.), the user to initiate a tracking pixel (gif) and generate data to be collected in the GA data model. Once collected, the data are configured (by the account administrator and by the GA algorithms), processed (in a largely opaque manner) and collated in aggregated data tables, and reported in visual and tabular representations. In Castells’ (2010) terms, data represent flow in the GA network (p. 442). That data is both spatial and temporal (it comes from and is attached to a specific territory and represents a specific, chronological activity), but it is also entirely global and digital.

Content in the Network

Data are collected and packaged — literally, in a gif image pixel — in parameters relating to user, session, and interaction. The GA tracking code encodes data and sends it to Google data centers where the data are decoded, configured based on administrator preferences, processed and repackaged in aggregated data tables, and made available to the account administrators. The reporting function remediates the data in visual and tabular formats for ease of reading and use. While the data reported are considered authoritative and authentic, the actual processing function remains largely proprietary, with only end results available to extrapolate what processing actually occurs. This black boxed processing function seems unlikely to represent Latour’s (2005) intermediary; as Fomitchev (2010) claims, there are probably processing functions that result in highly mediated, possibly even inaccurate, results. Castells (2010) would likely measure GA performance based on “its connectedness, that is, its structural ability to facilitate noise-free communication between its components” (p. 187). I hope we will see increased academic scrutiny focused on this perceived intermediary function in GA, even as we scholars rely on its results.

Birth and Death of a Network

Killing the Network: Failed Google data hard drives to be destroyed at the St. Ghislain, Belgium, Google Data Center. Photo from the Google Data Center Gallery.

Castells (2010) indicates that global informational networks emerge within milieux of innovation. These main centers of innovation are generally the largest metropolitan areas of the industrial age (p. 66), able to “generate synergy on the basis of knowledge and information, directly related to industrial production and commercial applications” (p. 67), and combine the efforts of the state and entrepreneurs (p. 69). Nodes on the network get ignored (and therefore cease to be part of the network) when they are perceived, by either the network or by its managerial elites, to have little value to the network itself (p. 134). The GA network grows as more nodes are added, either as users or as web pages with tracking code. GA administrators have agency to kill network nodes by removing tracking code from pages, or by directing IT managers to remove poorly performing web pages. Users have agency to quit visiting a website, thereby removing its value to the person. While many other actions by agents outside the GA network may affect the growth or dissolution of the network, they are outside the boundaries of the GA activity and data model.

Boundaries of Discussion

Two sets of boundaries apply. The first are the boundaries I set in re-proposing my object of study, namely limiting the application of theory to GA’s activity and data model. By narrowing my object of study, I believe I’ve given myself the ability to tackle each aspect of the theory’s application to GA more specifically and directly. The result is greater clarity in describing GA function and in applying particular aspects of theory to the object.

Second, Castells sets some boundaries to the application. While Castells addresses the local, he tends to discuss localization in terms of groups rather than individuals. In this way, Castells more closely resembles ecological theories that apply to organism categories rather than to individual organisms. He regularly refers to groups of people and nodes: the managerial elites (rather than individual leaders), the technological revolution (rather than revolutionary technology pioneers), and the global and local economy (rather than the economic wellbeing of the individual small business owner). The result is that I can’t really address the individual user as a single agent in GA. Then again, this is hardly a hardship, in that GA aggregates data and anonymizes identities. GA, too, resembles an ecological theory rather than a rhetorical theory; it focuses on profiles of territorially localized users rather than individual users in a specific city. As a result, Castells and GA match rather nicely in defining the boundaries of the discussion. In fact, I’d argue that GA (and Google more broadly) represent precisely the network society Castells defined in his text. It’s interesting that he didn’t predict or recognize the rise of Google as I would have expected him to do in his 2010 preface. And Castells’ (2010) discussion of communication media clearly did not predict the popularity or ubiquity of Google’s YouTube on the network as a differentiated medium whose content is driven by user tastes and users-as-producers (p. 399).

Castells claims that his three-volume series did not try, and is not trying, to predict future evolution of the network. He also claims to avoid ethical judgments on the managerial elites’ treatment of those lacking connectivity in the global network. I found neither claim satisfactory. As GA “black boxes” processes that need to be problematized, so Castells “black boxes” prediction and judgment as processes without taking personal responsibility. In this way, too, Castells and GA are good matches.

References

Castells, M. (2010). The rise of the network society [2nd edition with a new preface]. Chichester, UK: Wiley-Blackwell.

Dhiman, K., & Quach, B. (2012). Google’s Go and Dart: Parallelism and structured web development for better analytics and applications. In Proceedings of the 2012 Conference of the Center for Advanced Studies on Collaborative Research, (pp. 253-254). Riverton, NJ: IBM Corporation.

Fomitchev, M. I. (2010, April 26). How Google Analytics and conventional cookie tracking techniques overestimate unique visitors [Poster]. In Proceedings of the 19th International Conference on World Wide Web, (pp. 1093-1094). New York, NY: Association for Computing Machinery.

Google Data Centers. (n.d.). Data center locations. Retrieved from http://www.google.com/about/datacenters/inside/locations/index.html

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Oxford, UK: Oxford University Press. Clarendon Lectures in Management Studies

Norman, D. (n.d.). Affordances and design. Don Norman Designs. Retrieved from http://www.jnd.org/dn.mss/affordances_and_desi.html

Re/Proposed Object of Study: Google Analytics

I’m sticking with Google Analytics as my object of study. I’m too invested in the object, and it remains an important part of my professional responsibilities and therefore an object that I need to study, whether for this class or for professional development. In fact, this month I earned another certificate of completion for a Google Analytics Academy program, “Google Analytics Platform Principles.” The outcomes benefit me academically and professionally: the course contributed to my understanding of the underlying data structure and collection principles for the assignment’s ongoing case study, and it provided me some intriguing ideas for importing data into GA beyond those data collected by the tracking code to help my team measure our marketing effort success.

The GA platform consists of four activities based on dimensions (user characteristics) and metrics (quantitative interaction information): collection, configuration, processing, and reporting. The Google Developers guide provides the following helpful visualization to describe the platform’s activity.

Google Analytics Platform Components. Original image on the Google Developers Guide.

Collection: User-interaction data are collected through either the embedded code snippet or through the measurement protocol, an alternative system for manually submitting user-interaction data from mobile apps and other internet-connected appliances.

Configuration: Data are configured by the GA account manager(s) through the GA web interface or management API. Configuration settings permanently delimit data collections; as a result, at least one configuration is required to be unfiltered to ensure all possible data are accessible in at least one configuration, or, as GA refers to these configurations, Views.

Processing: Based on configuration settings (filters, groupings, etc.), raw data are processed and stored in aggregated data tables and in configured raw forms. Data tables organize data in pre-determined collections for quick access, but queries can be constructed to pull data from configured raw forms. Often such queries will sample data rather than pull all values, once again to speed the presentation of results.

Reporting: Data are reported via the GA web interface or via the Core Reporting API or Multi-Channel Funnel Reporting API. Reports can be constructed that will not provide meaningful results; not all dimensions are compatible or reportable with all metrics. As a result, GA account managers must construct views carefully and develop reporting goals and practices that yield meaningful and accurate results.
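
The relationship among configuration, processing, and reporting described above can be sketched in a few lines. This is a simplified illustration, not GA’s actual implementation or API: the hit fields, view names, and functions are all hypothetical, and real processing involves far more than a filter and a count.

```python
from collections import Counter

# Hypothetical raw hits; field names are illustrative, not GA's actual schema.
RAW_HITS = [
    {"page": "/admissions/", "device": "desktop"},
    {"page": "/admissions/", "device": "mobile"},
    {"page": "/programs/", "device": "mobile"},
    {"page": "/programs/", "device": "mobile"},
]

# Configuration: each view is a named filter. One view is left unfiltered so
# that every collected hit stays reachable in at least one view.
VIEWS = {
    "unfiltered": lambda hit: True,
    "mobile_only": lambda hit: hit["device"] == "mobile",
}


def process(hits, view_filter):
    """Processing: apply the view's filter, then aggregate pageviews per page."""
    return Counter(hit["page"] for hit in hits if view_filter(hit))


def report(table):
    """Reporting: render the aggregated table as sorted text rows."""
    return [f"{page}: {count}" for page, count in sorted(table.items())]


if __name__ == "__main__":
    for name, view_filter in VIEWS.items():
        print(name, report(process(RAW_HITS, view_filter)))
```

The sketch shows why an unfiltered view matters: whatever the filtered view excludes is simply absent from its aggregated table, so it can only be recovered from a view that never dropped it.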

The GA data model consists of three levels that help collect and organize dimensions and metrics: user (visitor), session (visit), and interaction (hit). Lesson 1.3 in the GA Academy Platform Principles course offers the following visualization of this model.

Overview of the Google Analytics data model. Original image from the GA Academy Platform Principles Lesson 1.3.

User (visitor): The user is identified by the browser or mobile device the visitor used to access the site.

Session (visit): The session is defined as the time the user (browser or device) was active on the website.

Interaction (hit): Interactions are individual actions taken by a user that send hit data to GA servers. These may be pageviews (loading the page), events (clicking on a movie button), transactions (checking out of an online store), or social interactions (sharing content on a social device).

As this chart reveals, the GA data model breaks engagement into a hierarchy. Interactions occur within sessions, and sessions are associated with a user. A user may have multiple sessions, and each session may have multiple interactions and interaction types. The GA account manager must determine measurement scope using this hierarchy. Is the goal to measure and report on interaction-level activity (number of pageviews regardless of user or session); session-level activity (common entrance or exit pages for sessions regardless of user); or user-level activity (number of unique users who completed a specific task, regardless of session)? The measurement goal determines the reporting scope.
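
The three measurement scopes can be made concrete with a small sketch. The classes, sample data, and function names below are hypothetical and exist only to illustrate the hierarchy; they are not how GA stores or queries data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interaction:
    kind: str  # e.g. "pageview" or "event"
    page: str


@dataclass
class Session:
    interactions: List[Interaction] = field(default_factory=list)


@dataclass
class User:
    user_id: str
    sessions: List[Session] = field(default_factory=list)


# Hypothetical data: user "a" has two sessions, user "b" has one.
SAMPLE_USERS = [
    User("a", [
        Session([Interaction("pageview", "/"), Interaction("pageview", "/apply/")]),
        Session([Interaction("pageview", "/apply/")]),
    ]),
    User("b", [
        Session([Interaction("pageview", "/"), Interaction("event", "/")]),
    ]),
]


def count_pageviews(users):
    """Interaction-level scope: pageviews regardless of user or session."""
    return sum(1 for u in users for s in u.sessions
               for i in s.interactions if i.kind == "pageview")


def entrance_pages(users):
    """Session-level scope: the first page of each session, regardless of user."""
    return [s.interactions[0].page for u in users for s in u.sessions if s.interactions]


def users_who_viewed(users, page):
    """User-level scope: distinct users who viewed a page in any session."""
    return sum(1 for u in users
               if any(i.kind == "pageview" and i.page == page
                      for s in u.sessions for i in s.interactions))


if __name__ == "__main__":
    print(count_pageviews(SAMPLE_USERS))
    print(entrance_pages(SAMPLE_USERS))
    print(users_who_viewed(SAMPLE_USERS, "/apply/"))
```

The same underlying data yields different answers depending on the scope chosen, which is why the measurement goal has to be settled before the report is built.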

So far, I’ve struggled to define the scope of GA as I’ve applied theories to it.

I’ve described GA as the reporting “arm” of a web development and visitor ecology in which nodes include web marketers and web developers, web services technicians and coders, database managers, marketing writers, content managers, website visitors, browsers and platforms, Internet hardware and software, and GA servers. In this model, GA collects traces of the active relationships that occur among these nodes.

I’ve also described GA as a mediating technology that directly and indirectly limits and controls the data collected from website interactions. Specific, delimited data points are the target of data collection and reporting. Those data points, and only those data points, are available to GA end users who seek information about user behavior on a website.

Defining Google Analytics

While neither of these descriptions is inaccurate, neither quite achieves the focus I’d like to apply to my case studies. I propose a description that focuses more directly on the GA platform’s four activities and the GA data model. Specifically, GA is a digital tool that collects user interaction data at three levels (user, session, and interaction) in the form of dimensions and metrics. Data collected are configured based on specific, targeted, goal-oriented decisions by GA administrative users, processed in accordance with those decisions, and output through aggregated data tables to GA users, both administrative and standard or limited-access users. This description focuses specifically on the agency of GA administrators; in the case of my GA account for the University of Richmond School of Professional and Continuing Studies, that agency resides primarily in me and indirectly in our marketing team.

Application to English Studies

GA focuses on assessing outcomes. GA administrators configure data collected in GA to assess the results of specific marketing efforts. For example, in order to examine general and specific browsing patterns of external (non-UR) visitors, I need to configure our GA account with a view that filters out internal (UR-based) web traffic by IP address. Examining these browsing patterns enables our marketing team to determine whether the information we’re providing is attracting prospective students in ways that our strategic marketing plan requires or expects. In short, we are using configured data reports to assess the extent of success of our web-based marketing efforts. Such assessments offer English studies models for assessment that can and should be incorporated into writing assessment, writing program assessment, perhaps even departmental assessments. Data-driven assessments can and should include both user characteristics and metrics; that is, they should be based on user profiles intentionally constructed to include or reflect contemporary, lived experience. For English studies, this means our data collection efforts must be based in localized environments and configured to process and report on specific objectives and outcomes.
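The internal-traffic filter mentioned above can be sketched in a few lines. This is not how GA implements view filters; it is a minimal illustration under stated assumptions. The `INTERNAL_PREFIXES` list uses a reserved documentation address range as a stand-in for the campus network, and the `visits` records with an `ip` field are hypothetical.

```javascript
// Hypothetical stand-in for the campus address range (a reserved
// documentation range, not the university's real network).
const INTERNAL_PREFIXES = ['192.0.2.'];

// True when the address begins with any internal prefix.
function isInternal(ip) {
  return INTERNAL_PREFIXES.some((prefix) => ip.startsWith(prefix));
}

// Keep only visits that did not originate on the internal network.
function externalOnly(visits) {
  return visits.filter((visit) => !isInternal(visit.ip));
}

const sampleVisits = [
  { ip: '192.0.2.15', page: '/' },
  { ip: '203.0.113.7', page: '/degrees' },
  { ip: '198.51.100.23', page: '/' },
];
const externalVisits = externalOnly(sampleVisits);
console.log(externalVisits.length);
```

The point of the sketch is that the filter is a configuration decision made before reporting: whatever the prefix list excludes never appears in the assessed data.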

GA collects metrics, but its ability to collect dimensions (user characteristics) means that its reporting is verbal and numerical. As such, its reports are rhetorical. They can and should be problematized as rhetoric. Specific decisions to collect or not collect demographic data, for example, could be problematized using cultural studies or gender studies. Specific ways of reporting demographic data, including terms used to describe or define those demographic qualities, are also areas to be analyzed and problematized. From its use of colors to its data processing strategies (which remain obscure), GA is fair game for rhetorical analysis and critique, and scholars in English studies should focus more critical attention on analyzing GA rather than using GA to measure the success of web-based instructional or informational efforts.

GA as Network

GA is free and remarkably powerful. Google appears to be working to make it even more broadly applicable as a digital analytics platform, not simply a web analytics platform. The distinction is important to its role as a network. Web analytics are useful and meaningful, but they are limited in scope to websites and web interactions. Digital analytics, on the other hand, encompass a much broader category of data, like digital advertising (including web-based and localized advertising efforts, like digital billboards and online display ads), appliance function (including communications between digital devices like wifi-connected refrigerators or cell-connected washer/dryer sets), and mobile phone uses beyond calling. As GA broadens its applicability as a digital analytics platform, its reach and scope become global, both in location and function. GA can begin to measure global network functions; its ability to measure those functions is dependent on its own flexible network structure. Its collection, configuration, processing, and reporting functions are network-based and network-focused. Its internal structure, to the extent Google allows us to view it, is based on related aggregated data tables. And its objects of measurement are related digital nodes on networks. The result is that GA is both network reporter and networked reporter.

[Top image: Screen capture of Google Analytics homepage: google.com/analytics]

Case Study #2: Apply CHAT and ANT – CH[A(N)T] – to Google Analytics

Literature Review: Google Analytics, My Beloved OoS 

In general, researchers appear to use Google Analytics™ (GA) web analytics service as a tool for measuring web visits and, to an extent, visitor behavior. In discursive terms, GA collects and visualizes an archive of traces of user interactions with web pages. The discursive activity of visiting (and, presumably, reading) a web page is seldom referenced in research that uses GA for measurement; instead, the archival trace of the discursive activity gets captured, archived, and visualized.

Most research uses an enthymeme that reads something like this: GA data can help developers improve websites. For example, Kirk et al. (2012), in an article seeking to monitor user engagement in an Internet-delivered genetics education resource developed for nurses, report that GA “informs approaches to enhancing visibility of the website; provides an indicator of engagement with genetics-genomics both nationally and globally; [and] informs future expansion of the site as a global resource for health professional education” (p. 559). Similarly, Mc Guckin & Crowley (2012), in an article evaluating the impact of an online cyber-bullying training resource, the CyberTraining Project, report that GA data have “allowed for the project team to further understand how best to optimize the product (i.e., the Website and the eBook) for ease of access and navigation by unique and referred users” (p. 629). Focusing more specifically on GA reporting over time, Plaza (2009) notes that “GA tells the web owner how visitors found the site and how they interact with it. Users will be able to compare the behaviour of visitors who were referred from search engines and emails, from referring sites and direct visits, and thus gain insight into how to improve the site’s content and design” (p. 475). Missing from the enthymeme are assumptions that connect GA to improved websites, assumptions that can be phrased in questions about the relationship between GA, website visitors, and website developers: What data are provided by GA that can directly relate to specific improvements in website design? What user behaviors can and should be examined via GA to evaluate the success of the website? What benchmarks should developers set to measure success or failure? While these questions are not ignored in research that uses GA reporting, they are not directly or specifically addressed. As a result, readers miss out on key assumptions that researchers make about specific ways the data provided by GA reports can and will be used to make concrete changes to website design and structure.

Bruno Latour’s (2005) introduction to actor-network-theory (ANT) identifies transporters of meaning among connections as “mediators” or “intermediaries.” An intermediary “transports meaning or force without transformation” while mediators “transform, translate, distort, and modify the meaning or the elements they are supposed to carry” (p. 39). When researchers present GA as a means of measuring user interaction with websites, they generally describe GA as an intermediary. By describing GA as an intermediary, researchers ignore, potentially to their peril, the mediating potential of GA reports. For example, Dahmen & Sarraf (2009), reporting visitor analytics of an online art museum exhibition, claim that “through the use of Google Analytics, this research seeks to understand how the public used the Web representation of the special exhibition” (p. 2). Their report represents GA data as authoritative and unmediated; the GA interface that visualizes and reports visit data is accepted as accurate, without comment. Mc Guckin & Crowley (2012) take a step toward recognizing the potential mediating effects of GA reports by claiming to “ascertain the efficacy of GA as an effective resource for measuring the impact of the CyberTraining project” (p. 628), but they conclude, “Such information [provided by GA] proves valuable in the iterative development and dissemination of the project and has, directly, informed the planning of the new CT4P project” (p. 629). GA is considered a blackboxed intermediary for reporting web visits. In other words, current research offers little theoretical perspective on the potential mediating effects GA may have on the data it reports and visualizes. This blog post seeks to remedy that omission by applying both ANT and cultural-historical activity theory (CHAT) to Google Analytics and the data it provides on visitor interactions with the website of the University of Richmond School of Professional and Continuing Studies (SPCS).

An OoS on the LOoSe

One of the most interesting aspects of using GA as my object of study (OoS) is that it remains a product continually in production. Although Google does not address it explicitly, it’s become clear that Google is working to make GA a digital analytics platform that expands well beyond the measurement of interactions on websites. I’m working toward a certificate of completion for Google Analytics Platform Principles (2014) as a followup to a certificate of completion I received for Digital Analytics Fundamentals (2013), and both of these online learning modules address Google Analytics as a broad-based digital analytics platform that handles data from a wide array of sources, even non-Internet-connected applications and appliances. The result, as I’ve experienced it, is that the Google Analytics Platform (yes, that’s the proper noun) is expanding its reach and scope on a weekly, perhaps even daily, basis.

This makes applying activity theories like cultural-historical activity theory (CHAT) and actor-network-theory (ANT) quite comfortable. GA as OoS is itself in active flux, continually redefining (perhaps more accurately expanding) itself for a fast-changing connected world.

ChOoSing a Definition

Screen capture of Google Analytics data model

Visualization of the laminated chronotope in Google Analytics. In this overview of the Google Analytics data model, the user (a CHAT node) engages with web content in space (interaction) and time (session).

CHAT might describe GA as a representation of practices within a laminated chronotope. As a tool that measures interactions between visitors and web pages, GA collects the results of “mediated activity:… action and cognition [that] are distributed over time and space among people, artifacts, and environments and thus also laminated, as multiple frames or fields co-exist in any situated act” (Prior et al., 2007, emphasis original). The action that gets represented as a visit in GA is loading a specific web page. Cognition gets represented in the action of following a link on a specific page to load a new page or resource. This activity is collected over time in a session, defined in GA as the period during which a single visitor, identified by an anonymous, unique identifier saved in a first-party cookie (“Platform Principles,” 2014), remains engaged within a surveilled website before leaving that domain or letting the 30-minute session window expire. GA represents all of the activity within that session in an aggregated visualization. Session data are collected over time and are the result of laminated activity among people, artifacts (like web pages), and environments (like browsers, computers, mobile devices, and the like).
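The 30-minute session window can be sketched as a grouping rule over one visitor's hit times. This is a simplified illustration, not GA's processing logic: it assumes the window is measured from the visitor's most recent hit (an inactivity timeout), and the `groupIntoSessions` function and sample timestamps are hypothetical.

```javascript
const SESSION_TIMEOUT_MS = 30 * 60 * 1000; // 30 minutes, in milliseconds

// Group one visitor's hit timestamps (in milliseconds) into sessions.
// A hit joins the current session only if it arrives before the timeout
// has elapsed since the previous hit; otherwise it opens a new session.
function groupIntoSessions(timestamps, timeoutMs = SESSION_TIMEOUT_MS) {
  const sorted = [...timestamps].sort((a, b) => a - b);
  const sessions = [];
  for (const time of sorted) {
    const current = sessions[sessions.length - 1];
    if (current && time - current[current.length - 1] < timeoutMs) {
      current.push(time);
    } else {
      sessions.push([time]);
    }
  }
  return sessions;
}

const minute = 60 * 1000;
// Hits at 0, 5, and 20 minutes, then again at 70 and 75 minutes.
const visitorHits = [0, 5 * minute, 20 * minute, 70 * minute, 75 * minute];
const visitorSessions = groupIntoSessions(visitorHits);
console.log(visitorSessions.length);
```

The 50-minute gap in the sample splits one visitor's activity into two sessions, which shows how the chronotope GA reports is a product of the timeout rule as much as of the visitor's behavior.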

ANT might describe GA as traces of connections among networked actants. Actants captured in a web session might include the visitor, the technological interface (computer/mouse/monitor or mobile device), the web page content and links, the writer of the web content, the host server, the network gateways and cables, and many more too numerous to detail. ANT would likely chafe under the need to define the collecting mechanism itself, however, and suggest that GA might be an artificial data assemblage that needs to be reassembled. Specifically, since GA is a data framework that collects only preselected data points (“Tracking Code Overview,” 2012), GA might be accused of “filtering out” and “disciplining” the data collection: “Recording not filtering out, describing not disciplining, these are the Laws and the Prophets” (Latour 2005, p. 55, emphasis original). More useful might be the preprocessed data collected by Google Analytics servers; processing organizes the web session into a predefined framework, precisely the activity ANT seeks to avoid in its practice.

LOoSe the Nodes

CHAT might define nodes as literate activity “among people, artifacts, and environments” (Prior et al., 2007). Using this definition, GA includes such human nodes as website visitors, web writers (including CSS, XHTML, JavaScript, and other programmers), website designers and developers, and marketers who determine the content of the web pages and websites. In the case of the SPCS GA account, institutional nodes would include the University of Richmond and the School of Professional and Continuing Studies, each of which contributes in a meaningful way to the visual and textual rhetoric of the site. Working together as an ecology in the functional system of the website, these nodes would all be aspects of CHAT’s literate activity. Visitors might be ascribed limited agency for their roles in reading content and authoring linked narratives. Web writers, marketers, and developers would have full agency as content creators. The website itself is ascribed no agency; it’s not considered part of the natural ecology of the network. Institutional entities (UR and SPCS) have minimal agency as regulators of environment and work.

ANT defines nodes as actors, and there are myriad actors (more precisely, actor-networks) at work in GA. From the programmed codes written and interpreted, to the software and hardware mediating and displaying web pages, to the visitors and writers and programmers, to the network providers and databases: ANT accepts any and all of these actants as nodes with the potential of agency. Latour (2005) refers to these objects as “the non-social means mobilized to expand them [the basic social skills] a bit longer” (p. 67) and confesses that ANT will “accept as full-blown actors entities that were explicitly excluded from collective existence by more than one hundred years of social explanation” (p. 69). The implication is that all the technological hardware and software (the GA code, the wired and wireless networks with their cables, routers, and servers, and the Google Analytics processing server) work together to enable the web visitor to interact with this creation of the web writer, developer, coder, and marketer. This collective is incorporated at the moment of loading a web page, and its momentary connectivity is both enabled and expanded by the agency of the object actors.

Where ROoSt the Nodes?

CHAT locates nodes in hierarchical relationships with one another in the network. Prior et al. (2007) conceive of literate activity producing socialized interaction within the functional system as part of the laminated chronotope of activity in space and time (Take 2: A Cultural-Historical Remapping of Theoretical Activity). In this hierarchy, web visitors are outside the system except during literate activity, defined as interacting with the multimodal text(s) within the site. Web writers, developers, and marketers are members of the functional system where literate activity (defined as creating and instantiating the multimodal text) occurs. The website itself is the functional system; the School gives the system chronological and spatial existence while the University gives the system technological existence. GA collects traces of literate activity among nodes within the functional system of the website, visualized and reported as interactions in space (between pages) and time (within sessions).

ANT flattens the network entirely. Latour’s (2005) conception of ANT works to keep the social flat (pp. 165-172), connecting all of the actor-networks (nodes) within the activity network in a single, non-hierarchical surface. Within GA, this flatness is largely retained within the report. All actor-networks have mediated, translated experiences of web content — there are no intermediary experiences, whether visitor or writer, software or hardware. GA reports a visualization of mediated network activity in a flattened data table. The flattened data table in GA treats the visitor’s web browser or operating system as equally significant to the actor-network represented by the visitor or web writer. Relationships between actors are largely un-disciplined; they are simply reported, regardless of the inherent logic (or lack thereof) in the relationship uncovered.

FootlOoSe Nodes

CHAT stresses an ecological relationship among nodes, limiting that ecology to the natural and material world (Prior et al., 2007). Visitors enter into the functional system of the website and navigate through it. Web writers, developers, and marketers engender the navigation links through the system, giving visitors pathways for narrative production. The website functions as the system, enabling web visits in time and space. The School provides content for the system, while the University provides the localized instantiation of the content in the website. GA records the traces of interactions within the functional system, visualizing them in laminated chronotopes in time and space. GA does not clearly identify the human actors in the network, preferring to aggregate identities. However, GA enables web writers, developers, and marketers to examine the traces of aggregated literate activity by visitors and revise website content and structure accordingly. This provides the opportunity for dialogue among human actors.

ANT stresses incoming connections among interconnected nodes. Latour (2005) frames this according to what it means to be a “whole”: “to be a realistic whole is not an undisputed starting point but the provisional achievement of a composite assemblage” (p. 208). Nodes that have more incoming connections than others are considered more settled and blackboxed, meaning they shift from being merely actors to becoming conduits for the flow of mediators: “an actor-network is what is made to act by a large star-shaped web of mediators flowing in and out of it” (p. 217). Such a star-shaped web of mediators is immediately visible in GA reports: the page in a website that receives the most visits or page views is the most connected page. This page is generally the website’s home page, and its purpose is not to provide content but to allow mediators to flow through it — to allow visitors to find what they’re seeking and connect to it.

WhOoShing through the Network

In GA, visit data (encrypted bits and bytes, assemblages of sequenced zeros and ones) move from the visitor’s device to the GA server for processing and reporting. The collection process leading up to this movement differs between browsers (mobile and non-mobile) and mobile apps: browsers send collected data with every page load, but mobile apps bundle visit data and send the bundle at timed intervals to protect mobile device battery life. This description oversimplifies a very complex ecology of network and computer hardware and software that transmits data from web content creators to web visitors to GA servers, but I’m limiting this discussion of movement to data traveling from the visitor’s device to GA servers. See the Google Analytics (2014) Academy “Data Collection Overview” video presentation (below) for additional details.
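The difference between the two collection patterns can be sketched as two dispatchers. Neither class is part of any GA library; `ImmediateDispatcher`, `BatchingDispatcher`, and the `send` callback are hypothetical names used only to contrast sending on every page load with bundling hits for later delivery.

```javascript
// Browser-style collection: each hit is sent as soon as it is recorded.
class ImmediateDispatcher {
  constructor(send) {
    this.send = send;
  }
  record(hit) {
    this.send([hit]);
  }
}

// App-style collection: hits wait in a queue until flush() is called.
// A real app would call flush() on a timer; here it is called by hand
// so the sketch stays deterministic.
class BatchingDispatcher {
  constructor(send) {
    this.send = send;
    this.queue = [];
  }
  record(hit) {
    this.queue.push(hit);
  }
  flush() {
    if (this.queue.length === 0) return;
    this.send(this.queue);
    this.queue = [];
  }
}

const browserRequests = [];
const appRequests = [];
const browser = new ImmediateDispatcher((batch) => browserRequests.push(batch));
const app = new BatchingDispatcher((batch) => appRequests.push(batch));

for (const page of ['/', '/degrees', '/apply']) {
  browser.record({ page });
  app.record({ page });
}
app.flush();
console.log(browserRequests.length, appRequests.length);
```

The same three interactions produce three transmissions in the browser pattern and one in the batching pattern, so the trace that reaches the server arrives with a different rhythm depending on the collecting device.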

CHAT might describe this movement as distribution in the literate activity of viewing a web page or using a mobile app. Prior et al. (2007) define distribution as “the way particular media, technologies, and social practices disseminate a text and what a particular network signifies” (Mapping Literate Activity). In this case, two distributions occur: the distribution leading to reception (by the web page visitor) and distribution leading to the assemblage of visit data collected for interpretation on GA servers.

Screen capture from YouTube video

Visualization of Google Analytics data points. The tracking code packages visit (hit) data in an image request that looks like this. Screen capture from Google Analytics Platform Principles – Lesson 2.1 Data collection overview
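Packaging hit data in an image request amounts to writing the data points into the query string of a URL. The sketch below illustrates that idea only: the endpoint `https://collector.example/hit.gif` and the parameter names (`account`, `page`, `language`) are hypothetical and are not GA's actual endpoint or parameters.

```javascript
// Build a request URL whose query string carries the hit's data points.
function buildHitUrl(endpoint, hit) {
  const params = new URLSearchParams();
  for (const [name, value] of Object.entries(hit)) {
    params.append(name, String(value));
  }
  return `${endpoint}?${params.toString()}`;
}

// Hypothetical endpoint and parameter names, for illustration only.
const hitUrl = buildHitUrl('https://collector.example/hit.gif', {
  account: 'UA-xxxxxxx-1',
  page: '/degrees/index.html',
  language: 'en-us',
});
console.log(hitUrl);
```

In a browser, assigning such a URL to the `src` of an image element is enough to transmit it; the server reads the query string and the returned image is incidental. Only the named parameters travel, which is the "tiny conduit" quality of preselected data points.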

ANT might describe this movement as the social. The assemblage of connections from hundreds of thousands of SPCS visitor pageviews flowing into the GA server could be what Latour (2005) calls “the social — at least that part that is calibrated, stabilized, and standardized — [that] is made to circulate inside tiny conduits that can expand only through more instruments, spending, and channels” (p. 241). In this case, the conduits are standardized in the GA’s preselected data points (“Tracking Code Overview,” 2012). When and if GA adds new data points for collection, these tiny conduits would be expanded. This definition also suggests that many other connections remain unsurveyed, Latour’s “plasma.” The assemblage of all connections would be the social fabric of the network.

Meaning Released from the HOoSegow?

CHAT might describe meaning as the result of literate activity in the functional system. Prior et al. (2007) map literate activity as a multidimensional process that can include production, representation, distribution, reception, socialization, activity, and ecology (Mapping Literate Activity). The results of this literate activity are recorded and transmitted from visitors’ devices to GA servers. The meaning of these data points is processed (interpreted) and reported as visualizations. That meaning becomes the basis of analysis; analysis leads to conclusions about visitor behavior, which in turn result in changes to the web content, leading to new literate activities.

ANT, on the other hand, ascribes no meaning to the results of CHAT’s literate activity. Latour (2005) remains adamant into the conclusion of Reassembling the Social that the social is dynamic and active, not a substance: “the social is… detected through the surprising movements from one association to the next” (p. 246). As a result, what GA does in processing and visualizing the results of activity in the SPCS website is not about ascribing meaning, but about tracing associations. And because those associations (connections) are mediated by the limited data points collected, the processing done by the GA servers, and the visualizations available, the reassembled social of GA is likely too limited to trace the plasmatic connectivity of the visitor’s web browsing experience.

Networks Emerge, Networks VamOoSe

CHAT and ANT will agree on this: actors initiate, grow, and dissolve networks. Prior et al. (2007) and Latour (2005) build their arguments on the social activities of actors. CHAT engages those actors in literate activity, while ANT engages those actors as connected actor-networks. Only activity on the part of actors can cause the network to emerge. For CHAT, only the activity of web content creators, web developers, database administrators, marketers, and web visitors can generate the first packet of data to flow across the network from visitor device to GA server. For ANT, the list of actors can extend much farther into non-human actants, but the principle remains the same: actors must initiate the network. Actors can grow the network through more visitor sessions — by many measurements, adding visitor sessions and growing session length is my primary professional objective as web manager — and actors can also dissolve the network by removing a web page (authors) or no longer visiting the website (visitors).

ClOoSing Thoughts

GA itself is a fairly limited network. Its boundaries could easily be drawn around the connection between the GA code on the web page or in the mobile app and the GA server. Any other activity that either leads up to the connection or follows the connection (namely writing and viewing a web page or viewing and interpreting GA visualized data) could be seen as outside the network. Except that CHAT and ANT seek to problematize such limited perspectives of networks by addressing the activity that enlivens connectivity. So for these two theories, I found myself widening the focus to include the biological (CHAT and ANT) and non-biological (ANT) nodes in the network. This perspective turns into an ecology whose various members are only momentarily connected at the moment of accessing a web page or mobile app. But in that moment, myriad connections reveal actors and build a remarkably complex assemblage of networked components. As a result, I found few limits in CHAT or ANT to addressing GA as my OoS, other than the shortage of meaningful English words that contain the character string “-oos”.

References

Dahmen, N., & Sarraf, S. (2009). Edward Hopper Goes to the Net: Media Aesthetics and Visitor Analytics of an Online Art Museum Exhibition. Conference Papers — International Communication Association, 1-28.

Digital analytics fundamentals [Online course]. (2013, October). Retrieved from Google Analytics Academy https://analyticsacademy.withgoogle.com/explorer

Google Analytics platform principles [Online course]. (2014, March). Retrieved from Google Analytics Academy https://analyticsacademy.withgoogle.com/explorer

Kirk, M., Morgan, R., Tonkin, E., McDonald, K., & Skirton, H. (2012). An objective approach to evaluating an internet-delivered genetics education resource developed for nurses: Using Google Analytics™ to monitor global visitor engagement. Journal of Research in Nursing, 17(6), 557–579. doi:10.1177/1744987112458669

Latour, B. (2005). Reassembling the social: An introduction to actor-network-theory. Clarendon Lectures in Management Studies. Oxford, UK: Oxford University Press.

Mc Guckin, C., & Crowley, N. (2012). Using Google Analytics to evaluate the impact of the CyberTraining Project. CyberPsychology, Behavior & Social Networking, 15(11), 625-629. doi:10.1089/cyber.2011.0460

Platform principles: Website data collection [Video transcript]. (2014, March). Google Analytics Platform Principles. Retrieved from Google Analytics Academy https://analyticsacademy.withgoogle.com/course02/assets/html/GoogleAnalyticsAcademy-PlatformPrinciples-Lesson2.2-TextLesson.html

Plaza, B. (2009). Monitoring web traffic source effectiveness with Google Analytics: An experiment with time series. Aslib Proceedings, 61(5), 474-482. doi:10.1108/00012530910989625

Prior, P., Solberg, J., Berry, P., Bellwoar, H., Chewning, B., Lunsford, K. J., Rohan, L., Roozen, K., Sheridan-Rabideau, M. P., Shipka, J., Van Ittersum, D., & Walker, J. R. (2007). Re-situating and re-mediating the Canons: A cultural-historical remapping of rhetorical activity [Multimodal composition]. Kairos, 11(3). Retrieved from http://kairos.technorhetoric.net/11.3/binder.html?topoi/prior-et-al/index.html

Google Analytics. (2014, March 11). Google Analytics Platform Principles – Lesson 2.1 Data collection overview [Video file]. Retrieved from http://youtu.be/qQdPXouWeJE

Tracking code overview [Web page]. (2012, October 29). Google Analytics. Retrieved from Google Developers https://developers.google.com/analytics/resources/concepts/gaConceptsTrackingOverview#howAnalyticsGetsData

[Header image: I’m a Google Analytics Geek: Screen capture of the Google Analytics Academy]

OoS Outlines Reflection

My two case study outlines to read were Amy’s and Daniel’s, both of which proved to be extremely thorough and well thought-out.

For her case study, Amy plans to apply both genre theory and activity theory to MOOCs. While her outline is detailed, I’m concerned that she might be trying to do too much for the scope of this assignment. While I see the points of her conversation, I’m not sure how she is planning to discuss each point in relation to the different theories (but knowing Amy, I feel confident that’s something she already knows–it’s just not clear for me from the outline).

Daniel’s outline of applying both CHAT and ANT to Google Analytics is more similar to mine, which is potentially why it’s easier for me to follow. This seemed to be a thoroughly considered plan that conforms to the guidelines and questions of the assignment.

From their comments on their own outlines, I can see that Amy and Daniel were having the same difficulties I was–trying to consider how we’d outlined our rubrics for a case study against the questions. I feel like my outline is pretty bare bones compared to these two, but I was also trying to follow the instructions from Week 7 that said the outline should be an outline of applying theories, not what we would write about. I’m looking forward to reading the feedback my peers give me, but expect they will have had some of the same difficulties in providing feedback that I did.


Applying Foucault’s Archaeology of Knowledge to Google Analytics

Introduction: A Brief Overview of Google Analytics

Google Analytics consists of two main components: Google-programmed JavaScript code embedded on each page within a website, “which collects and sends visitor activity to your Google Analytics account” (“How Analytics Impacts,” 2014), and the reporting mechanism connected to the JavaScript code, where visitor activity is collected and displayed at www.google.com/analytics. The data are sent to Google’s servers for storage via the Internet, mediated by the networked hardware elements (switches, routers, fiber, etc.) of the Web.

A visit to a web page in which Google Analytics code is embedded activates the embedded snippet, generates data, and sends those data points to Analytics.

Code snippet sample (from spcs.richmond.edu)

<script type="text/javascript">
  // Command queue: commands pushed here run once ga.js has loaded.
  var _gaq = _gaq || [];

  // Main site account
  _gaq.push(['_setAccount', 'UA-xxxxxxx-1']);
  _gaq.push(['_trackPageview']);

  // Legacy account (tracker named "l1")
  _gaq.push(['l1._setAccount', 'UA-xxxxxxx-2']);
  _gaq.push(['l1._trackPageview']);

  // Rollup account (tracker named "rup"), scoped to richmond.edu
  _gaq.push(['rup._setAccount', 'UA-xxxxxxx-1']);
  _gaq.push(['rup._setDomainName', 'richmond.edu']);
  _gaq.push(['rup._trackPageview']);

  // Load ga.js asynchronously, using the SSL host on https pages.
  (function() {
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(ga, s);
  })();
</script>
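One way to read the snippet is as a command queue: `_gaq` begins as a plain array, so each `push` only stores a command until the asynchronously loaded library is ready to act on it. The following is a simplified sketch of that pattern, not the ga.js source. The `createTracker` and `execute` helpers and the `log` array are hypothetical, and the handling of tracker-name prefixes such as `l1.` is an assumption made for illustration.

```javascript
// Commands queued before the (hypothetical) library has loaded.
const queue = [];
queue.push(['_setAccount', 'UA-xxxxxxx-1']);
queue.push(['_trackPageview']);
queue.push(['l1._setAccount', 'UA-xxxxxxx-2']);
queue.push(['l1._trackPageview']);

const log = [];      // stands in for data sent to the server
const trackers = {}; // one tracker object per name prefix

// Hypothetical tracker exposing the two methods used above.
function createTracker(name) {
  return {
    account: null,
    _setAccount(id) { this.account = id; },
    _trackPageview() { log.push(`${name || 'default'} -> ${this.account}`); },
  };
}

// Split "name.method" into its parts and call the method on that tracker.
function execute([command, ...args]) {
  const parts = command.split('.');
  const method = parts.pop();
  const name = parts.join('.');
  if (!trackers[name]) trackers[name] = createTracker(name);
  trackers[name][method](...args);
}

// Once "loaded": drain the backlog, then run later pushes immediately.
queue.forEach(execute);
queue.push = (...commands) => { commands.forEach(execute); return 0; };

queue.push(['_trackPageview']);
console.log(log);
```

In the sketch, commands queued before loading and commands pushed afterward end up in the same log, which is why the page's own code never needs to know whether the library has arrived yet.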

These data points include hundreds of characteristics of the visit, including page visited, time on site/time on page, referral sources, links selected to exit the site, and more. A visual representation of these data points is available below in Figure 1. All data points are recorded in Analytics at the instant of the visit (delayed or rerouted as needed by network hardware). The action of the visit generates these data points; once the visit has been recorded, there is no writing of data in Analytics until another set of data points is recorded via user interaction with content on the page.

Diagram

Figure 1: Visualizing a sample Google Analytics data set—Popplet 

An active user interaction with the web page itself is required for Google Analytics to register data points. An important distinction is that web crawlers, like Google crawlers, are not recorded as visits; only visitor interactions trigger a response from the embedded code snippet. There is a direct relationship between the data being collected in Analytics and a user’s interaction. However, the user does not enter these data in a conscious or meaningful way; they are simply collected and inscribed to Analytics in a transparent fashion.

Data collected in Google Analytics are aggregated and, for the end user, not individualized. Analytics’ data privacy and security documentation notes that individual user data may be collected but are not provided to the end user (the Analytics account owner, manager, or specialist): “Google Analytics customers are prohibited from sending personally identifiable information to Google, but this principle might not apply in some instances in which Google Analytics is used to analyze how Google products and services are used by signed in account holders” (“We Use Our Own Products,” 2014). The end user is unable to reconstitute personally identifying information about visitors from the data provided.

However, the aggregated data are able to describe a nuanced portrait of our website visitors, to the point that multiple aggregate profiles are created. The data can help us answer questions about our website, like how many users access the site using a mobile device or tablet or how many pages in the site an average user visits. The answers to these questions, in turn, generate action items to customize web page content to user technologies and patterns of behavior.
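As a rough illustration of how aggregate questions like these get answered, the sketch below summarizes a handful of made-up session records. The records and field names (`device`, `pageviews`) are hypothetical and do not reflect the schema of an actual Analytics export.

```javascript
// Hypothetical session records; field names are illustrative only.
const sessions = [
  { device: 'desktop', pageviews: 4 },
  { device: 'mobile', pageviews: 2 },
  { device: 'tablet', pageviews: 3 },
  { device: 'mobile', pageviews: 1 },
];

// Answer two aggregate questions: what share of sessions came from a
// mobile device or tablet, and how many pages an average session viewed.
function summarize(records) {
  const total = records.length;
  const handheld = records.filter(
    (s) => s.device === 'mobile' || s.device === 'tablet'
  ).length;
  const pageviews = records.reduce((sum, s) => sum + s.pageviews, 0);
  return {
    handheldShare: total === 0 ? 0 : handheld / total,
    pagesPerSession: total === 0 ? 0 : pageviews / total,
  };
}

const summary = summarize(sessions);
console.log(summary); // { handheldShare: 0.75, pagesPerSession: 2.5 }
```

Nothing in the summary identifies an individual visitor; only the aggregate profile survives.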

Relationship to Foucault

Foucault sought to avoid transcribing discourse within traditional unities like genre or oeuvre; instead, he sought dynamic dispersions to describe the sum of component parts brought together for a specific exigence. “The rules of formation are conditions of existence (but also of coexistence, maintenance, modification, and disappearance) in a given discursive division” (p. 38). In Google Analytics, aggregated data contribute to a discursive monument (p. 139) that describes visits to the website, ostensibly for a specific exigence (e.g. to learn more about the School of Professional and Continuing Studies degree programs: see Figure 2).

Web page screen capture

Figure 2: Sample University of Richmond School of Professional and Continuing Studies degree programs web page

These data, when examined by the end user, help determine whether content or information architecture of the website needs to be revised (e.g. visitors are spending less time on one program page than on another – does this suggest content is more or less compelling on one page than another? See Figure 3).

google analytics screen capture

Figure 3: Exigencies that arise from reviewing Analytics data: Should we revise content in the /hr-management/ folder because average time on page was so much less than /education/ during last month?

This process describes what I might consider a double exigence. On one hand, a visitor’s exigency inscribes visit data in Google Analytics; on the other hand, the end user reviews aggregated visit data to answer questions about the content and/or structure of the website.

Discursive Formation 

There is a single moment, one that is likely measured in milliseconds, even nanoseconds, in which the result of a user’s concrete interaction on a specific web page is the inscription of data on an encrypted Google server containing our Google Analytics account. This moment describes the discursive formation of a statement. Within Analytics, there is no way to have predicted that irruptive moment would occur, as the moment involved a single independent individual having a single, concrete, specific interaction with a specific web page. There is also no way to repeat that exact irruptive moment or that exact discursive enunciation. Even if the individual were to visit that page again within 30 days, the statement would be described in terms of a repeat rather than new visit, likely resulting from selecting a local browser bookmark or conducting a different search. Its existence as an Analytics artifact would therefore differ from previous recorded visits. The statement is not a structural unity; rather, it’s a function of the user’s instantiated interaction with a web page. “This is because it [a statement] is not itself a unit, but a function that cuts across a domain of structures and possible unities, and which reveals them, with concrete contents, in time and space” (Foucault 2010/1972, p. 87). In fact, I can see the monument of that moment in time and space in Analytics (see Figure 3, above, for a visualization of aggregated moments over a month).

Nodes in the Network

Google Analytics defines nodes in its network in terms of metrics and dimensions. Metrics are “quantitative measurements of users, sessions and actions” and dimensions are “characteristics of users, their sessions and actions” (Google Analytics Academy, 2013). In the Popplet in Figure 1 (above), “Referring Source” describes metrics and “Visitor Info” describes dimensions. To generate any relationship among metrics and dimensions, a visitor must actively engage with a web page that contains embedded code. The visitor to the page, in this case, would be Foucault’s subject. Foucault describes discourse as being formed in the differential relationship among speaker, site, and subject’s relationship to the object (p. 55). Within Analytics, we can see these elements working together to generate a statement. The creator(s) of the web page, both its content and its embedded Analytics code, and the host of the web page, in physical and virtual space, act together as speaker. The speaker presents the page in question (the object) to the subject. The site is described in several different ways as the visitor interacts with the page: site is captured in dimensions that define user characteristics like amount of time spent on a page, browser type, platform, time of day, IP address of the visiting computer’s physical network, approximate geographic location of the visitor’s browser, and more. The subject’s relationship to the page (object) is captured by metrics that measure activity, including referring source (the link clicked or URL entered to arrive at the website in question). Metrics and dimensions work together as a discursive formation that is collected in Analytics. Without a differentiated relationship (in which the subject is entering URLs, selecting links, or taking some other positivistic action that generates browser activity), no discursive content is collected.

Definition

Google Analytics is a Foucauldian archive of networked discourse. The discursive formation occurs the moment a subject follows or enters a web link. The active interaction of subject, object, and speaker/author/creator generates a discursive statement. That statement’s networked archive is inscribed as an assemblage of data points. A summary of those data points—in relationship to one another as metrics and dimensions and in relationship to subject, object, speaker, goals, and events—appears below in Figure 4.

Popplet screen capture

Figure 4: Google Analytics as networked archive of a discursive statement—Popplet

Agency and Flow

Google Analytics nodes are metrics and dimensions. These nodes have no agency of themselves. They are created and inscribed in the moment of visiting a web page.

However, Analytics requires agency at higher levels of the network hierarchy, in the differentiated relationship among speaker (page author, coder, and host), site (metrics and dimensions), and the subject’s (visitor’s) relationship to the object (web page). Among these nodes (which are tangentially part of the Analytics network because the object contains the embedded Analytics code snippet), the subject is the agent that creates and sustains the network. As the result of a concrete action on a tracked web page, visit data are generated by the embedded code snippet and transmitted, via network hardware, to Google servers. At the same moment, a small piece of data (a cookie) is written to (or updated in) the subject’s browser; it assists the tracking snippet in determining whether the visitor is new or returning to the page. User agency can erase the cookie, which may reset the dimension of new or returning visitor, and the user can determine whether to follow links, stay on the page, or follow an embedded event (like watching a video or reviewing a news feed). Agency and flow are largely “single bus” activities: they travel from the visitor to the Google server, but not directly back to the visitor. Some indirect agency can be found in the speaker (author and coder), in that the results of metrics and dimensions analyses may include changes to web pages that become new again to the subject (visitor).
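The cookie's role in the new-versus-returning distinction can be sketched in a few lines. This is a simplified model under my own assumptions: a `Map` stands in for the browser's cookie store, and `visitorId` is a hypothetical cookie name rather than the name Analytics actually uses.

```javascript
// A Map stands in for the browser's cookie store.
// 'visitorId' is a hypothetical cookie name used only for illustration.
function recordVisit(cookieJar, now) {
  const existing = cookieJar.get('visitorId');
  if (existing) {
    // A cookie from an earlier visit is present: treat as returning.
    return { visitorType: 'returning', visitorId: existing };
  }
  // No cookie found: mint an identifier and treat the visitor as new.
  const visitorId = 'v-' + now;
  cookieJar.set('visitorId', visitorId);
  return { visitorType: 'new', visitorId };
}

const jar = new Map();
const first = recordVisit(jar, 1000);
const second = recordVisit(jar, 2000);
jar.delete('visitorId'); // the user exercises agency and clears the cookie
const third = recordVisit(jar, 3000);

console.log(first.visitorType, second.visitorType, third.visitorType);
// new returning new
```

Once the cookie is gone, the same person is inscribed in the archive as a new visitor, which is the sense in which user agency shapes the recorded dimension.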

The Archive and the Archaeologist

As the person who has been granted administrative authority by our central website authority (Director of Web Services) to interact with data in Google Analytics, I have access to a vast (albeit potentially incomplete, given Google’s ownership of the archive itself) portion of the archive of discourse. Foucault describes an archive as the collection of discursive formations, a finite collection that does not point to some transcendent future or some ideal meaning. “The never completed, never wholly achieved uncovering of the archive forms the general horizon to which the description of discursive formations, the analysis of positivities, the mapping of the enunciative field belong” (p. 131). Analytics does not ascribe meaning to the discursive moment itself. Rather, it records the irruptive actions of the discursive formation as a collection of statements in an archive. As an administrative user, I can access that archive and recreate a visualization of discursive moments that occurred. They are inscribed in the metrics and dimensions recorded at the irruptive moment. At best, I can “dig into” the archived results to determine patterns of activity (metrics) and characteristics (dimensions). I and other users with access to some or all aspects of the Analytics account are archaeologists plumbing the depths of the archive.

Google Analytics visualizes flow by archiving the actions that generated flow, but Analytics data themselves are not in flow. They’re an archive of data generated via discourse. For lack of a better analogy, GA is a chapter book I can read that contains archived evidence of discourse. Those traces represent, but are not themselves, the discursive formations of statements.

Conclusions

Google Analytics is a networked archive and an archived network.

Networked archive: The archive is networked in that it collects interrelated data points and demonstrates the relationship among those data points using visualizations and aggregated data. Those relationships can be explored by someone with user access to the Google Analytics account. In this networked instant, my role as archive archaeologist activates the network, which otherwise represents little more than a collection of data points that, at the moment of web browsing, represented active discourse.

Archived network: The network is archived in that Google Analytics collects the network activity of subjects, objects, and creators/speakers—their discourses. The subject’s interaction with a web page results in discursive formation of statements; a sample statement is visualized in Figure 4 (above). A collection of such statements from a single subject is aggregated as a user session, which I would consider Foucault’s concept of a monument. A collection of those user sessions (monuments) in aggregate is the archive, and that’s what Analytics gives access to.
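The statement-to-session-to-archive hierarchy can be sketched as a grouping step. The version below is a simplification that assumes sessions are split only by a 30-minute inactivity timeout, which is the Analytics default, and ignores the other conditions that can end a session. The hit records and page paths are hypothetical.

```javascript
const SESSION_TIMEOUT_MS = 30 * 60 * 1000; // 30 minutes of inactivity

// Group one visitor's hits ("statements") into sessions ("monuments").
// A gap longer than the timeout starts a new session.
function groupIntoSessions(hits, timeoutMs = SESSION_TIMEOUT_MS) {
  const sorted = [...hits].sort((a, b) => a.time - b.time);
  const sessions = [];
  for (const hit of sorted) {
    const current = sessions[sessions.length - 1];
    const last = current && current[current.length - 1];
    if (last && hit.time - last.time <= timeoutMs) {
      current.push(hit); // still within the active session
    } else {
      sessions.push([hit]); // first hit, or the timeout has elapsed
    }
  }
  return sessions;
}

const minute = 60 * 1000;
// Hypothetical hits from a single visitor, timed from the first pageview.
const hits = [
  { page: '/degrees/', time: 0 },
  { page: '/degrees/education/', time: 5 * minute },
  { page: '/degrees/hr-management/', time: 50 * minute },
];

const archive = groupIntoSessions(hits);
console.log(archive.map((session) => session.length)); // [ 2, 1 ]
```

Here `archive` is just an array of sessions, each of which is an array of hits, mirroring the archive, monument, and statement layers described above.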

Note

Original snippet: [...they are simply collected and inscribed to Analytics in a transparent fashion...]

“Transparent” probably isn’t the right term. If you’ve ever seen a page load delayed by a message at the bottom of the browser window that says something like “Loading analytics.google.com/ga.js,” you’ve encountered the code snippet at work, struggling through network latency to load the data to Google’s servers.

References

Foucault, M. (2010). The archaeology of knowledge and the discourse on language. (A. M. Sheridan Smith, Trans.). New York, NY: Vintage Books. (Original work published in 1972)

Google Analytics Academy. (2013, October). Key metrics and dimensions defined [Video transcript]. Digital Analytics Fundamentals. Retrieved from https://analyticsacademy.withgoogle.com/assets/pdf/DigitalAnalyticsFundamentals-Lesson3.2KeymetricsanddimensionsdefinedText.pdf

How Analytics impacts your website code. (2014). Retrieved February 10, 2014, from https://support.google.com/analytics/answer/1008009?hl=en

We use our own products. (2014). Retrieved February 10, 2014, from https://support.google.com/analytics/answer/3000986?hl=en&ref_topic=2919631

[“Rue Foucault”: Creative Commons licensed image by Flickr user sarahstarkweather]

Mindmap #1: The Rabbit Hole

At this point in the class, more questions than answers face me. On the one hand, I recognize the relative simplicity of a network: a connection of nodes. On the other hand, I quickly complicate my simple definition with questions: Are nodes relatively static? Are they predefined via framework or developed on the fly through the action of the network itself? Do the connections “define” the network, or do the nodes? Or is it the friction between the stasis of the nodes and the activity of the connections that makes the network “work”?

As I considered my object of study, Google Analytics, I also considered the object that Google Analytics studies, namely websites. I’ve created multiple websites in my career, both personal and professional. When I start a website, I start with the basic content that needs to be produced/communicated, then develop an organizational framework in which those content areas can and should appear. We call that framework the IA, or information architecture, and I enjoy creating the IA, either from a never-before-organized collection of content or from previously created content that needs to be reorganized. My strength as a web manager comes from visualizing and developing the organizational and hierarchical framework for the content. Folder and subfolder structure, relationship of subfolders to folders, pages to folders, and folders to site are among the creative activities of my professional position. In short, I develop the relationship of the nodes to one another and create the connections that visitors will make between and among the nodes, both up and down the IA and page to page in individual folders.

What I realized as I considered Google Analytics is that each “level” of a web site (each folder, subfolder, and subsubfolder; we try to have only three levels in even the largest sites) is itself a network with connections up and down the IA. A domain is a network. Subdomains within each domain are each networks. Folders within each subdomain are each networks. But they are also nodes. At the domain level, the subdomain is a node on the domain network. At the subdomain level, the folder is a node on the subdomain network. At the folder level, the subfolder is a node on the folder network. And so on down the rabbit hole.
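One way to picture this node-and-network duality is as a recursive tree, where anything that contains children is a network in its own right while also being a node in its parent. The sketch below is only an illustration; the site tree, `example.edu`, and the folder names are hypothetical.

```javascript
// Hypothetical site tree; names are illustrative only.
const site = {
  name: 'example.edu',
  children: [
    {
      name: 'degrees',
      children: [
        { name: 'education', children: [] },
        { name: 'hr-management', children: [] },
      ],
    },
    { name: 'about', children: [] },
  ],
};

// Walk the tree. Every entry is a node at its own depth, and any entry
// with children is also a network of the nodes beneath it.
function describe(node, depth = 0) {
  const own = { name: node.name, depth, isNetwork: node.children.length > 0 };
  return [own, ...node.children.flatMap((child) => describe(child, depth + 1))];
}

const rows = describe(site);
console.log(rows.filter((row) => row.isNetwork).map((row) => row.name));
// [ 'example.edu', 'degrees' ]
```

The same function describes the domain, a folder, or a subfolder, which is the iterative quality I mean: the definition of a network repeats at every level.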

Networks are iterative. My mind map addresses the iterative character of networks as it also starts asking questions that I’d like to answer through the semester. I made the connection between the questions because they are the common thread running through my mind right now. I don’t know enough to start answering yet, but I’m developing ideas and theories.

[Creative Commons licensed image by flickr user RubyGoes]