A Literature Review of Personalization Transparency and Control: Introducing the Transparency–Awareness–Control Framework

Through various online activities, individuals produce large amounts of data that are collected by companies for the purpose of providing users with personalized communication. In the light of this mass collection of personal data, the transparency and control paradigm for personalized communication has led to increased attention from legislators and academics. However, in the scientific literature no clear definition of personalization transparency and control exists, which could lead to reliability and validity issues, impeding knowledge accumulation in academic research. In a literature review, we analyzed 31 articles and observed that: 1) no clear definitions of personalization transparency or control exist; 2) they are used interchangeably in the literature; 3) collection, processing, and sharing of data are the three objects of transparency and control; and 4) increased transparency does not automatically increase control because first awareness needs to be raised in the individual. Also, the relationship between awareness and control depends on the ability and the desire to control. This study contributes to the field of algorithmic communication by creating a common understanding of the transparency and control paradigm and thus improves validity of the results. Further, it progresses research on the issue by synthesizing existing studies on the topic, presenting the transparency–awareness–control framework, and formulating propositions to guide future research.


Introduction
Through various online activities, individuals produce large amounts of data that are collected by companies and processed through algorithms for the purpose of providing users with personalized communication (Yun et al., 2020). While personalization is currently applied in many different contexts, e.g., personalized healthcare (Dzau & Ginsburg, 2016) or news recommendations (Thurman et al., 2019), it very frequently occurs in the form of personalized marketing messages (so-called personalized marketing communication). In this context, personalized communication involves interactions between companies and consumers, data collection and processing by companies, and delivery of marketing communication (Vesanen & Raulas, 2006).
In light of this mass collection and advanced algorithmic processing of personal data for the purpose of personalized communication, disclosures about data collection and processing and individual control of these processes, often called the transparency and control paradigm, have been gaining importance in practice (Deloitte, 2018; Li et al., 2019). For example, recent legal developments, such as the General Data Protection Regulation (GDPR) in the EU (adopted in 2016, enforced from 2018) and the California Consumer Privacy Act in the US, impose high transparency requirements on companies' data collection and processing practices and strengthen individuals' rights to control their personal data as the main data protection mechanisms (Strycharz et al., 2020; van Ooijen & Vrabec, 2019).
The growing importance of the transparency and control paradigm in the application of personalized marketing communication is also reflected in academic research. The effects that data collection transparency has on users have been investigated (e.g., Aguirre et al., 2015; Kim et al., 2019), as have the ways in which control over data collection impacts users' perceptions and behavior (e.g., Zarouali et al., 2018). Individual control over personal data has also been portrayed as a crucial element of privacy (Altman, 1975). However, the literature provides little consensus on how personalization transparency and control should be conceptualized. For example, while Aguirre et al. (2015) call transparency "overt data collection" and focus on consumer awareness of data collection practices, Kim et al. (2019) write about "ad transparency" in terms of the disclosure of data collection practices. Similarly, control has been conceptualized as the abilities users have to control data collection (Joo, 2018), but also as the desires that users have to exercise such control. Such substantial differences in conceptualizations impact the reliability and validity of the results and impede knowledge accumulation in the field. Therefore, the aim of the current study is to map academic research on personalization transparency and control and provide guidelines for future research on this issue.
To map academic research on transparency and control in personalized marketing communication, we conduct a systematic literature review on personalization transparency and control, provide conceptualizations, and develop a framework to facilitate future research on the topic. The research delivers a substantial contribution to the field of personalized marketing communication by creating a common understanding of the transparency and control paradigm, thus improving the validity of future results. Further, it progresses research by synthesizing existing studies on the topic and presents a framework to guide future research on the topic.

Methods
To locate the relevant literature included in this study, electronic searches were conducted in several disciplinary and multidisciplinary databases in July 2020. The primary search strategy was designed by social sciences librarians and conducted in Business Source Premier (EBSCO). It was then translated and conducted in Academic Search Premier (EBSCO), Communication and Mass Media Complete (EBSCO), and ProQuest Dissertations & Theses. Because technical limitations of their search interfaces preclude systematic searching, studies from SocArXiv and the Social Science Research Network (SSRN) were obtained via hand searches using relevant keywords and manual review of the search results.
The literature search identified published and unpublished empirical and theoretical studies in databases focused on advertising, marketing, communication sciences, and business that include conceptualizations of the personalization transparency and control paradigm. The full search query and filters for the initial database are available in the Supplementary Files.
The 643 results identified from searching the six databases were exported to EndNote, a citation manager, and were deduplicated, resulting in 589 records. An online systematic review software, Rayyan QCRI, was used to assist the two-part screening process (Ouzzani et al., 2016). Titles and abstracts of the initial 589 studies were independently screened by the first and second authors, which resulted in 36 studies. The authors had a percentage agreement of 95.8%, and full agreement (100%) was reached after discussion. An additional three studies were added based on backward citation tracking and one more was added based on the authors' professional contacts, for a total of 40 studies for full-text screening.
The first two authors screened the full text of the 40 studies, narrowing the sample to 31. The PRISMA flow diagram in Figure 1 includes the search, deduplication, screening, and data extraction totals for this study. The authors adhered to the PRISMA statement and checklist to transparently report the procedures (Moher et al., 2009). The coding protocol and the full coding scheme can be found in the Supplementary Files.
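The screening totals reported above can be checked with simple arithmetic. The following is a minimal sketch in Python; the variable names and the `percent_agreement` helper are illustrative assumptions, not part of the review's tooling, and the toy decisions at the end are invented purely to show how a percentage agreement is computed. The implied number of disagreements assumes that the reported 95.8% was calculated over all 589 screened records.

```python
# Counts as reported in the Methods section.
identified = 643            # records retrieved from all six sources
after_dedup = 589           # records left after deduplication
after_title_abstract = 36   # records kept after title/abstract screening
citation_tracking = 3       # added via backward citation tracking
professional_contacts = 1   # added via the authors' professional contacts
included = 31               # studies kept after full-text screening

duplicates_removed = identified - after_dedup
full_text_screened = after_title_abstract + citation_tracking + professional_contacts
full_text_excluded = full_text_screened - included

# Assumption: agreement was computed over all 589 title/abstract decisions.
implied_disagreements = round(after_dedup * (1 - 0.958))


def percent_agreement(decisions_a, decisions_b):
    """Percentage of records on which two screeners made the same decision."""
    if len(decisions_a) != len(decisions_b) or not decisions_a:
        raise ValueError("both screeners must rate the same non-empty set of records")
    matches = sum(a == b for a, b in zip(decisions_a, decisions_b))
    return 100.0 * matches / len(decisions_a)


# Hypothetical toy data (not from the study): four records, one disagreement.
screener_a = ["include", "exclude", "exclude", "exclude"]
screener_b = ["include", "exclude", "include", "exclude"]
toy_agreement = percent_agreement(screener_a, screener_b)  # 75.0
```

Percentage agreement is the simplest reliability measure and does not correct for agreement expected by chance; the record-level decisions would be needed to compute a chance-corrected statistic.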

Results
The selected articles were qualitatively coded by the first two authors, and an overview of the results is presented in Tables 1 and 2. In total, we included 31 studies, eight of which concern transparency, while 25 cover control (two articles mentioned both concepts). We observed that transparency and control were often used interchangeably. Although the concepts are related to each other, we believe that they are different constructs. Additionally, we observed that only a few papers included an explicit definition of transparency or control. For most papers we were able to derive the conceptualization from the text, but for others we were unable to derive any understanding of how these concepts were used.
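The study counts above overlap, so they add up to 31 only once the two articles covering both concepts are counted a single time. A minimal sketch of that inclusion-exclusion check follows; the function and variable names are illustrative assumptions.

```python
def union_size(size_a, size_b, size_both):
    """Inclusion-exclusion for two overlapping groups."""
    return size_a + size_b - size_both


transparency_studies = 8   # studies that concern transparency
control_studies = 25       # studies that cover control
covering_both = 2          # studies that mention both concepts

total_studies = union_size(transparency_studies, control_studies, covering_both)
only_transparency = transparency_studies - covering_both
only_control = control_studies - covering_both
```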

Conceptualization of Transparency
Of the eight studies on transparency, six were published and two were not (a Master's thesis and a dissertation). The majority of the papers came from marketing (n = 5) and three papers came from a communication science journal (see Table 1). Three papers were published in or after 2018, five were published in or after 2015, and all were published in the 2000s. This indicates that research interest in the issue of transparency has not decreased over time. The majority of the research was conducted in the US (n = 4), followed by Europe (Netherlands, Germany, and Sweden). We observed many different conceptualizations of transparency. This may be explained by the fact that the transparency object differs between studies. We observed three objects of transparency. In the first, transparency concerns data collection practices in general terms, or is related to specific data collection techniques, such as cookies. In the second, a few studies look into transparency of the personalization process (i.e., how data is used for the creation and delivery of personalized messages). In the third, transparency concerns sharing data with third parties.
Looking closer at these conceptualizations, we observed two further differences between how authors use the concept of transparency. First, it is used from the perspective of the data collector or the sender of the personalized message (industry perspective). This refers to the information that is disclosed by the sender to individuals about data collection, the personalization process, or data sharing. In general, transparency from the sender perspective focuses on how such information is disclosed and is closely related to choices about data collection made by the collector. Second, the concept is used in reference to the perspective of the individual whose data is collected and who is the recipient of the personalized message. In this case, it refers not to transparency but to the degree of awareness of data collection, processing, or sharing. Such awareness is also referred to as the overtness of data collection and is closely related to transparency. Looking at the link between these two uses, we argue that transparency (stemming from the industry), in fact, can lead to increased awareness among individuals.
Based on the different conceptualizations of transparency, we propose to differentiate between transparency and awareness. These constructs are both used in the reviewed literature but with substantially different conceptualizations and from different perspectives (e.g., sender vs. receiver). Building on the conceptualization provided by Kim et al. (2019) and adjusted according to other reviewed studies related to transparency (see Table 1), we propose the following definition for personalization transparency:
Personalization transparency: The degree of disclosure of the ways in which firms collect, process, or share (exchange) personal data with the purpose of generating personalized communication.
Next, building on the conceptualization provided by Aguirre et al. (2015) and adjusting it to the reviewed articles on awareness (see Table 1), we propose the following definition for personalization awareness:
Personalization awareness: The degree to which individuals are cognizant of how and when their personal data are collected, processed, or shared (exchanged) with the purpose of generating personalized communication.

Conceptualization of Data Control
Of the 25 studies on control, 20 were published and five were unpublished. The majority of the papers were from marketing (n = 12), followed by communication (n = 7), and law and ethics (n = 3; Table 2). Ten papers were published in or after 2018, 15 were published in or after 2015, and all were published in the 2000s. Similar to transparency, we observed that all articles are by different authors. The research was conducted in five different continents with the majority of the data collected in the US (n = 8), followed by Europe (n = 6), and also including data from South Africa (n = 3), South Korea (n = 1), and New Zealand (n = 1).
We observed different conceptualizations of data control, and many different terms were used to describe it (Table 2). This is not surprising given that most of the work does not build on the other studies under analysis, but rather was developed in parallel around the same time period. Looking more closely at the conceptualizations, we observe three main differences: 1) type of control, 2) concreteness, and 3) object of control.
First, the type of control differs depending on whether the authors talk about actual control (e.g., things that an individual can do) vs. perceived control (e.g., the perceptions of control by an individual). Literature on personalization has shown the importance of separating between reality and perceptions, for example, in (perceived) personalization (De Keyzer et al., 2015;Kramer et al., 2007;Maslowska et al., 2016). Maslowska et al. (2016) found that perceived personalization mediates the relationship between actual personalization and advertising responses. Therefore, we find it important to also distinguish between actual and perceived data control.
Second, the conceptualizations differ in terms of the level of concreteness. For example, some conceptualizations mention specific things that individuals can do to exert control (e.g., opt-out, decline cookies), while others are more abstract (e.g., control without explaining how). In order to increase the applicability of the conceptualizations, we decided to adopt an abstract conceptualization.
Third, similar to the literature on transparency, we observed three objects of control, namely control over the collection, processing, and sharing of personal data. Therefore, we decided to adopt all three objects into our conceptualization. Based on the conceptualizations of the studies in Table 2, we propose two definitions of data control, one for actual control and one for perceived control:
Actual data control: The extent to which individuals can start, stop, or maintain what personal data firms collect, process, or share (exchange) with the purpose of generating personalized communication.
Perceived data control: The extent to which individuals think they can start, stop, or maintain what personal data firms collect, process, or share (exchange) with the purpose of generating personalized communication.
Finally, we found two factors in the studies' conceptualizations that influence the amount of control that individuals have, namely the ability to control (i.e., the skills and knowledge one has to exert control) and the desire to control (i.e., one's motivation to exert control). The inclusion of ability and desire in the conceptualizations of control indicates their relevance to control. Because we believe they are distinct concepts, we conceptualized ability and desire to control separately. Based on the definition provided by van Ooijen and Vrabec (2019) and other reviewed literature, we propose the following definitions:
Ability to control: The extent to which an individual has the necessary knowledge and skills to start, stop or maintain firms to collect, process, or share (exchange) personal data with the purpose of generating personalized communication.
Desire to control: The extent to which an individual has the motivation to start, stop, or maintain firms to collect, process, or share (exchange) personal data with the purpose of generating personalized communication.

Framework and Future Research Agenda
Based on the conceptualizations of transparency and control, we created the transparency–awareness–control (TAC) framework (Figure 2). Based on the framework, we provide concrete propositions to guide future research (Table 3). The framework differs based on the data collection mode because, as Miyazaki (2008) noted, some data collection practices are more covert to consumers by nature (e.g., third-party cookies often facilitated by the use of "clear GIFs" that are virtually invisible to individuals) and thus require transparency for raising awareness, while others have a less covert nature and require action from the individual (e.g., the proactive sharing of personal data such as an email address). We will explain the framework by means of four examples derived from the reviewed research. Furthermore, we will give different examples for the three objects of transparency and control (i.e., collection, processing, and sharing of data).

Table 1 (excerpt)
Miyazaki (2008); Yes; Marketing; US; Content analysis; Covertness of cookies; "Another concern regarding cookie placements is the covert nature of their usage. The placement of third-party cookies is often facilitated by the use of 'clear GIFs' that are only one pixel by one pixel in size, which essentially makes them invisible to the consumer" (p. 21).
Stevenson (2016); No; Communication science; USA; Experiment; Transparency in online advertising personalization processes; "Transparency about some of the ways online ads are personalized for individuals appears" (p. 150).

Table 2 (excerpt)
"[...] information that is collected during, or as a result of, marketing transactions, as well as control over unwanted telephone, mail, or personal intrusions in the consumer's home" (p. 15).
"[...] control on their mobile phones in order to enhance their personal privacy" (p. 468).

Table 3. Propositions to guide future research.
1. Transparency about data collection, processing, or sharing is a condition for individual awareness of such practices. The higher the degree of transparency, the higher the awareness.
2. Personalization awareness is a condition for perceived control over data for personalization. The higher the degree of awareness, the more likely that individuals have perceptions of control. Individuals need to be aware of data collection, processing, and sharing to perceive control.
3. Personalization awareness among individuals is a condition for having actual control over data for personalization. The higher the degree of awareness, the more likely that individuals will have control. When individuals are not aware of their data being collected, processed, or shared, it is not possible for them to control these actions.
4. Personalization awareness is a condition for having the desire to control data collection and personalization processes. Only individuals who are aware that their data is collected, processed, or shared can have the desire to control these processes.
5. The relationship between personalization awareness and (perceived and actual) control depends on the desire to control. Only with sufficient levels of desire to control will aware individuals be able to exert some control.
6. The relationship between personalization awareness and (perceived and actual) control depends on the ability to control. Only with sufficient levels of ability (skills and knowledge) to control will aware individuals be able to exert some control.
7. Higher actual control is more likely to lead to more perceived control.

Example 1: High Transparency, High Control
High transparency involves disclosure of data collection, processing, or sharing by the sender. Examples of high transparency from the reviewed literature regarding data collection include detailed explanations of how personal data is collected and how long it will be stored (Song et al., 2016) as well as disclosures of covert data collection methods such as cookies, providing information on what they are and what data they collect (Miyazaki, 2008). Examples regarding the personalization process include a high level of disclosure on data used to personalize the message (e.g., types of behavioral data or location data used and functions such as "Why am I seeing this ad?" offered by senders; see Dogruel, 2019; Kim et al., 2019).
Regarding sharing, disclosures involve information about specific sources of data (e.g., sources of behavioral and location data used for advertising; Dogruel, 2019) and information on third parties with whom the data will be shared (as required, for example, by the GDPR).
High actual control involves the possibility for individuals to act and is usually preceded by high transparency. The reviewed literature includes opt-out functions from data sharing with websites and apps (Joo, 2018). From the individual perspective, such control can also involve providing false information to the data collector (Miltgen & Smith, 2019). Regarding the personalization process, it includes privacy control menus that allow individuals to opt out of processing for personalization (meaning not seeing personalized ads; see Zarouali et al., 2018). Finally, regarding data sharing, the literature proposes privacy settings that allow individuals to opt out of third parties accessing their personal information (Tucker, 2014).

Example 2: High Transparency, Low Control
While high transparency may contribute to higher awareness among individuals, it does not automatically imply higher control. In cases of high transparency and low actual control, the same transparency mechanisms are in place as described above, but they either do not come with the possibility for action by the user (or have very limited options) to stop data collection (Zarouali et al., 2018), or they do not have opt-out signs in the app or web interface that would allow the user to impact the processing for personalization (Joo, 2018).
While it is not common to display disclosures but provide no opt-out/privacy control features (actual control), providing such features does not imply high perceived control (Zarouali et al., 2018). Perceived control may be impeded by lack of awareness, no desire to control personalization processes, or lack of ability to exercise control. An example of high transparency and low perceived control is data collection through cookies. Such data collection has to be disclosed on websites, but this disclosure does not foster the perception of control among individuals (Miyazaki, 2008).

Example 3: Low Transparency, Low Control
Low transparency regarding data collection involves not specifying what or how data are collected (Miyazaki, 2008). Regarding processing, it involves not disclosing what data have been used for personalization (Dogruel, 2019; Kim et al., 2019), and regarding sharing, it involves not disclosing how data have been obtained from other parties or whether they will be shared with third parties. As Miyazaki (2008) argues, covert data collection techniques such as the use of third-party cookies facilitated by pixel-sized images on websites are practically invisible to individuals. Such techniques have been called non-obvious by the Federal Trade Commission (2000). For these non-obvious data collection techniques, with no transparency, individual awareness is difficult to achieve. As a result, individuals are not able to exercise control over such practices. Therefore, low transparency about non-obvious practices is often the object of regulations (such as the ePrivacy Directive, which obligates transparency about cookies in the EU).

Example 4: Low Transparency, High Control
This category does not exist, as transparency is a condition for control. When it is not transparent how data are collected, processed, or shared, individuals are not aware of these practices (e.g., the use of third-party cookies is not disclosed on a website), and therefore they are not able to stop such practices.
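The four examples and the chain from transparency through awareness to control can be summarized in a small sketch. The Python below is a deliberately simplified, boolean reading of the framework for covert data collection practices; the names `classify`, `Individual`, and `can_exert_control` are illustrative assumptions. The framework itself speaks of degrees and likelihoods, so this is a toy model rather than an operationalization.

```python
from dataclasses import dataclass

# Labels follow the four examples in the text.
EXAMPLES = {
    (True, True): "Example 1: high transparency, high control",
    (True, False): "Example 2: high transparency, low control",
    (False, False): "Example 3: low transparency, low control",
}


def classify(high_transparency: bool, high_control: bool) -> str:
    """Map a transparency/control combination to one of the four examples."""
    key = (high_transparency, high_control)
    if key not in EXAMPLES:
        # Low transparency with high control is argued not to occur.
        return "Example 4: low transparency, high control (argued not to exist)"
    return EXAMPLES[key]


@dataclass
class Individual:
    aware: bool    # cognizant of the disclosed practice
    able: bool     # has the knowledge and skills to exert control
    desires: bool  # is motivated to exert control


def can_exert_control(sender_transparent: bool, person: Individual) -> bool:
    """Boolean simplification: control requires transparency, awareness,
    ability, and desire to hold together (covert practices only)."""
    awareness = sender_transparent and person.aware
    return awareness and person.able and person.desires


motivated_expert = Individual(aware=True, able=True, desires=True)
uninterested_expert = Individual(aware=True, able=True, desires=False)
```

In this sketch, `can_exert_control(False, motivated_expert)` is false because awareness of a covert practice cannot arise without disclosure, and `can_exert_control(True, uninterested_expert)` is false because desire acts as a boundary condition.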

Conclusions
The growing importance of the transparency and control paradigm for personalized communication has led to increased attention from legislators and academics. This calls for clear definitions of the concepts involved to increase validity and facilitate future research, which was the aim of this study.
By means of a systematic literature review, we analyzed 31 articles on personalization transparency and control. The concept of transparency has been around for a longer time because it has been relevant to other communication strategies that are more covert, such as native advertising (Wojdynski & Evans, 2020). However, control seems to be a phenomenon specific to communication strategies that rely on personal data, and it has been receiving increasing attention in recent years. In our literature review we specifically focused on the conceptualization of transparency and control for personalized communication. This led us to four conclusions.
First, the literature review confirmed that there is no common definition of either transparency or control, which highlights the need for a shared understanding of these concepts. While studies included in the review have different focuses of research (different types of advertising including online advertising in general, online behavioral advertising, and mobile advertising) and different control mechanisms related to advertising (e.g., mobile phone settings, privacy protection, advertising opt-out mechanism), they all investigate different aspects of transparency and control related to personalized marketing communication. Hence, based on the reviewed literature, we formulated definitions of both personalization transparency and control. The different focuses of the studies included could have contributed to the diversity of conceptualizations found in the literature. However, even while focusing on one specific object and the papers that study that object (e.g., transparency about data collection for advertising through tracking cookies), we observe differences. Moreover, we find that many studies did not include any definitions of their terms. Our study, therefore, contributes to the literature by synthesizing different definitions, analyzing them, and proposing one definition to help research on this topic in the field of personalized marketing communication move forward. In addition, we made a distinction between actual and perceived control, which is important because previous research on personalization shows that they are different concepts. Future research could examine whether they have different predictive powers.
Second, we observed that the concepts of transparency and control were often used interchangeably in the literature. Although we believe that these concepts are related (see Figure 2), we argue that they are separate constructs. We also observed that other concepts were often entangled with understandings of transparency and control: Awareness, for example, was often integrated in the transparency conceptualization. However, we argue that transparency is about information disclosure from the sender side, while awareness concerns the extent to which individuals are conscious of the practices from the receiver's side. This is an important distinction for future research to take into consideration. Also, we found that ability and desire to control were integrated into definitions of control (Figure 2).
Third, we observed that the objects of transparency and control differed between conceptualizations. We found three objects of transparency and control, namely collection, processing, and sharing of data. We believe it is important to acknowledge the different objects because what information is disclosed or what individuals can do to exert control differs for each object.
Finally, we introduced the TAC framework to visualize the relationship between the concepts discussed, providing concrete propositions to guide future research (Table 3). Note that although we argue that transparency and control are positively related, this does not mean that more transparency automatically leads to more control. As shown in the TAC framework, transparency provided by the sender first needs to increase awareness in the receiver before it can lead to more control. In addition, we argue that the ability and desire to control are boundary conditions for the relationship between awareness and control. Future research should empirically test the propositions of the framework to validate the claims. In addition to this theoretical contribution, the TAC framework has important implications for privacy regulations, since transparency regarding data collection and processing practices is a core issue in current regulatory approaches. In fact, both the GDPR and the California Consumer Privacy Act, which aim to strengthen individuals' rights regarding control over their personal data, portray transparency as the main data protection mechanism in online data collection processes by requiring companies to be more transparent about their data collection practices (Strycharz et al., 2020; van Ooijen & Vrabec, 2019).
Furthermore, the TAC framework, while developed specifically in the context of personalized marketing communication, can be applied to and tested in other areas of personalization research. Personal data collection and algorithmic processing that enable personalization can also be used in health communication (e.g., personalized healthcare; Dzau & Ginsburg, 2016), political communication (e.g., political microtargeting; Zuiderveen Borgesius et al., 2018), or journalism (e.g., news recommendations; Thurman et al., 2019) and lead to the same questions about transparency, individual awareness, and control. The TAC framework can therefore be used to further explore consumer empowerment in these areas.
In sum, this study provided definitions of personalization transparency and control for the use of personalized communication, as well as for the related concepts of awareness, ability, and desire to control. While the concepts are not new to the literature, the increasing use and importance of data for personalized marketing communication, computational advertising, and other forms of algorithmic communication make them important concepts of interest. Increased comprehension of the transparency and control paradigm gives us a chance to better understand how data collection practices work, what effects they have on individuals, and what implications this may have for industry practices and privacy regulations.
Cody Hennesy is the journalism and digital media librarian at the University of Minnesota, Twin Cities, where he develops services and support for text and data mining research and the computational social sciences.