Persistent Client State HTTP Cookies
Anna Kemp, November 5, 1998

For a majority of Internet users, the term "cookie," when used in conjunction with the Internet, represents either a treat to be eaten or a recipe to be found while surfing the Net. In fact, in Commerce, Communication and Privacy Online, a 1997 online survey conducted by P&AB Harris, 72% of the respondents had never heard the term cookie used in reference to World Wide Web sites. In order to move beyond the chocolate chip, oatmeal raisin and macaroon cookie image, I will present the factual who, what, where, when, how and why of Internet cookies, followed by a discussion of the current uses of cookies. Finally, issues regarding cookie control will be outlined from the perspective of the Webmaster and the Websurfer.

Cookies were first discussed in July 1995 as part of a Netscape Technical Note, but they did not officially appear as a Netscape feature until the release of Netscape Navigator version 3.0 in 1996 (Clarke). Today, Netscape refers to its creation as "Persistent Client State HTTP Cookies," which allow Web sites to maintain information on a particular user across HTTP connections. HTTP is a stateless or non-persistent protocol, meaning a Web server does not store any information about a particular interaction and does not correlate previous exchanges with subsequent transactions (Bowen). Cookies were created to expand the abilities of HTTP by adding information to the HTTP header (Slayton). The cookie-containing HTTP header passes from the Web server onto the client's hard drive, allowing the creation of a persistent state. The cookie lives on the client's hard drive and helps the server to remember a user and differentiate between his/her visits to a particular site (Whalen). Netscape's cookie specifications provide the following overview:

"A server, when returning an HTTP object to a client, may also send a piece of state information, which the client will store. Included in that state object is a description of the range of URLs for which that state is valid. Any future HTTP requests made by the client which fall in that range will include a transmittal of the current value of the state object from the client back to the server. The state object is called a cookie, for no compelling reason."

With the creation of a persistent state between the server and the user, Web sites have the ability to personalize site information, offer online sales and services and track popular links and demographics, all of which will be discussed in detail below. Two types of cookies can be utilized for these interactions. The first is temporarily placed on the client's hard drive for the length of the browsing session. By quitting the browser, the user destroys the cookie. The second, a persistent cookie, is stored on the user's hard drive for a specified period of time. The user can quit the browser yet maintain any cookies for use during his/her next Internet session.
Roger Clarke of Xamax Consultancy presents a six-step sequence of events for a site utilizing persistent cookies. Each of these steps involves a more detailed interaction between the client and the server, as exemplified by the situation outlined below.
In step one, the user, Chip, asks his browser to retrieve the home page for http://www.amazon.com. The browser, using HTTP, transmits Chip's request as well as his IP address, type of browser, and operating system to amazon.com's Web server (Kessler).

In step two, the Web server determines that Chip surfs the Web using Netscape 4.0 as his browser. Most current browser versions support cookies, including Netscape .94 Beta and up and Microsoft Internet Explorer 2.0 and up. (For more information on browser/cookie compatibility, see Cookie Central's "Cookie Test.") Because his browser supports cookies, Chip receives his requested page from amazon.com's Web server along with instructions to set a cookie on his hard drive.

In order to set a cookie, the HTTP response from amazon.com's Web server will include a Set-Cookie header, typically generated by a CGI script (Netscape). CGI stands for Common Gateway Interface and acts as the communication standard between the Web server and the server-side gateway programs. These programs generate the HTTP response header that sets a cookie from amazon.com onto Chip's machine. Netscape's Set-Cookie HTTP Response Header uses the following syntax:

Set-Cookie: NAME=VALUE; expires=DATE; domain=DOMAIN_NAME; path=PATH; secure

The five attributes of the Set-Cookie header are defined as follows:

NAME=VALUE: As the only required attribute of the Set-Cookie header, the name/value pair acts as the meat of the cookie. The value represents the data the Web server receives after the cookie has been set and another transaction occurs with the client's browser.

expires=DATE (optional): In a persistent cookie, the date attribute determines how long the cookie will be stored on the client's hard drive. Once the date has been reached, the cookie will cease to be stored. If the header does not contain a date attribute, the cookie will expire at the end of the client's session and will be removed from the hard drive. The date is expressed in Greenwich Mean Time in the form Weekday, DD-Mon-YYYY HH:MM:SS GMT. For example, Monday, 16-Nov-1998 23:12:40 GMT.

domain=DOMAIN_NAME (optional): The domain attribute defaults to the domain name of the server setting and receiving the cookie. In order to coordinate multiple servers, however, this attribute can be set to a domain name tail for what Netscape refers to as "tail matching." For example, a domain attribute of ".amazon.com" would match host names "books.amazon.com" or "music.amazon.com". To prevent setting domain attributes like ".com", domain attributes within the seven top-level domains .com, .gov, .edu, .net, .org, .mil, and .int must contain at least two periods, as in .amazon.com. Domains under all other top-level domains require at least three periods. Also, a cookie can only be set by hosts within the specified domain attribute value.

path=PATH (optional): The path attribute "specifies the subset of URLs in a domain for which the cookie is valid" (Netscape). For example, the path value /~kempa would match URLs containing /~kempa/ari.html and /~kempanna. The most general path is "/," but the attribute value defaults to the same path as the file described by the header containing the cookie.

secure (optional): If the secure attribute is specified, the cookie is only transmitted if a secure channel, such as HTTPS, is used.

The use of each attribute will be discussed in step four of the cookie transaction process.
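
The attribute definitions above can be illustrated with a short Python sketch. It is not how any browser actually implements cookies; the function names are hypothetical, the sample header is invented for illustration, and the period-counting check is only a simplified reading of the domain rule described above.

```python
# Minimal sketch: split a Netscape-style Set-Cookie header into its attributes.
# All function names here are hypothetical, chosen for illustration only.

SPECIAL_TLDS = {"com", "gov", "edu", "net", "org", "mil", "int"}


def parse_set_cookie(header: str) -> dict:
    """Return the name/value pair and optional attributes of one header."""
    _, _, body = header.partition(":")          # drop the "Set-Cookie" label
    parts = [p.strip() for p in body.split(";") if p.strip()]
    name, _, value = parts[0].partition("=")    # NAME=VALUE is required
    cookie = {"name": name, "value": value, "expires": None,
              "domain": None, "path": None, "secure": False}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        key = key.strip().lower()
        if key == "secure":                      # flag attribute, no value
            cookie["secure"] = True
        elif key in ("expires", "domain", "path"):
            cookie[key] = val.strip()
    return cookie


def is_persistent(cookie: dict) -> bool:
    """A cookie without an expires attribute lasts only for the session."""
    return cookie["expires"] is not None


def domain_has_enough_periods(domain: str) -> bool:
    """Two periods for the seven special top-level domains, three otherwise."""
    tld = domain.rsplit(".", 1)[-1].lower()
    required = 2 if tld in SPECIAL_TLDS else 3
    return domain.count(".") >= required


header = ("Set-Cookie: ubid-main=002-2904428-3375661; "
          "expires=Monday, 16-Nov-1998 23:12:40 GMT; "
          "domain=.amazon.com; path=/")
cookie = parse_set_cookie(header)
print(cookie["name"], cookie["domain"], is_persistent(cookie))
print(domain_has_enough_periods(".amazon.com"), domain_has_enough_periods(".com"))
```

A header with no expires attribute would be classified as a session cookie, and a domain attribute of ".com" would be rejected by the period check.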
In step three, Chip's browser writes the amazon.com cookie to his hard drive using the specifications provided in the HTTP header received from the Web server. Four factors currently limit cookie creation (Whalen).

Once created, cookies are stored by Netscape Navigator in a file called "cookies.txt," while Microsoft Internet Explorer stores them in separate files, named with the user's name and the domain name of the site that sent the cookie, in Windows/cookies or Windows/profiles/cookies (Kessler). Since Chip uses Netscape 4.0, his cookies.txt looks like the following:

# Netscape HTTP Cookie File
# http://www.netscape.com/newsref/std/cookie_spec.html
# This is a generated file! Do not edit.
home.netscape.com FALSE / FALSE 942189502 NGUserID cc98a716-24595-900775051-1
.amazon.com TRUE / FALSE 2082787601 ubid-main 002-2904428-3375661
.doubleclick.net TRUE / FALSE 1920499140 id 490d9f0e
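
As a rough illustration, the entries above can be read programmatically. The Python sketch below assumes the usual seven-field layout of a Netscape cookies.txt line (domain, tail-matching flag, path, secure flag, expiration in seconds since January 1, 1970, name, value) and splits on whitespace because the listing above is space-separated. The class and function names are hypothetical.

```python
# Minimal sketch: read Netscape-style cookies.txt entries.
# Assumes seven whitespace-separated fields per line; names are hypothetical.
from datetime import datetime, timezone
from typing import List, NamedTuple


class CookieEntry(NamedTuple):
    domain: str
    tail_match: bool      # TRUE when any host in the domain may read the cookie
    path: str
    secure: bool          # TRUE when a secure channel is required
    expires: datetime     # stored in the file as seconds since 1 Jan 1970
    name: str
    value: str


def parse_cookie_line(line: str) -> CookieEntry:
    # maxsplit=6 keeps any whitespace inside the value intact
    domain, flag, path, secure, expiry, name, value = line.split(None, 6)
    return CookieEntry(
        domain=domain,
        tail_match=(flag == "TRUE"),
        path=path,
        secure=(secure == "TRUE"),
        expires=datetime.fromtimestamp(int(expiry), tz=timezone.utc),
        name=name,
        value=value,
    )


def parse_cookie_file(text: str) -> List[CookieEntry]:
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and header comments
            continue
        entries.append(parse_cookie_line(line))
    return entries


sample = """# Netscape HTTP Cookie File
home.netscape.com FALSE / FALSE 942189502 NGUserID cc98a716-24595-900775051-1
.amazon.com TRUE / FALSE 2082787601 ubid-main 002-2904428-3375661
.doubleclick.net TRUE / FALSE 1920499140 id 490d9f0e
"""
for entry in parse_cookie_file(sample):
    print(entry.domain, entry.name, entry.expires.year)
```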
Each line of the file describes a cookie, but the exact contents of each line vary according to the attributes supplied in the Set-Cookie HTTP Response Header. By evaluating the second cookie entry, the following information can be gleaned from this example (Rejonis): the cookie is valid for the domain .amazon.com and the path "/", it does not require a secure connection, and it carries the name ubid-main with the value 002-2904428-3375661.

In step four, Chip requests another page from amazon.com regarding a popular music CD, "C is for Cookie." Before sending the request to amazon.com's Web server, Chip's browser determines whether a cookie exists for amazon.com in his cookies.txt file (Netscape). First the browser compares the domain attribute of the cookie with the host name of the Web server from which the URL has been requested, amazon.com. Since the tail of the requested host name matches the domain attribute of the cookie, the browser compares the cookie's path attribute with the path of the requested URL. Again, the amazon.com cookie path value of "/" is valid for Chip's requested file. With these two comparisons verified, the browser proceeds with Chip's request.

In step five, because a valid cookie exists in the cookies.txt file that matches Chip's requested URL, the HTTP request to amazon.com's Web server includes a Cookie request header. Netscape's syntax for the Cookie HTTP Request Header contains the name of the valid cookie and its value (Netscape). In this case, the header would appear as follows:

Cookie: ubid-main=002-2904428-3375661
If the browser had found more than one valid cookie entry (one with matching domain and path values) for amazon.com, the additional cookie names and values would be appended to the Cookie request header, separated by semicolons.
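
The domain and path comparisons of step four and the request header of step five can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the browser's actual code: the function names are hypothetical, the "session-id" cookie is invented for the example, and expiration and the secure flag are ignored.

```python
# Minimal sketch of tail matching, path matching, and Cookie header assembly.
# Function names and the "session-id" entry are hypothetical.
from typing import List, Optional, Tuple

StoredCookie = Tuple[str, str, str, str]   # (domain, path, name, value)


def domain_matches(host: str, cookie_domain: str) -> bool:
    """Tail-match a host name against a cookie's domain attribute."""
    host, cookie_domain = host.lower(), cookie_domain.lower()
    if cookie_domain.startswith("."):
        # ".amazon.com" matches "www.amazon.com" and "amazon.com" itself
        return host.endswith(cookie_domain) or host == cookie_domain[1:]
    return host == cookie_domain           # no leading period: exact host only


def path_matches(request_path: str, cookie_path: str) -> bool:
    """The cookie path must be a prefix of the requested path."""
    return request_path.startswith(cookie_path)


def build_cookie_header(host: str, request_path: str,
                        jar: List[StoredCookie]) -> Optional[str]:
    """Return the Cookie request header, or None when nothing matches."""
    pairs = [f"{name}={value}"
             for domain, path, name, value in jar
             if domain_matches(host, domain) and path_matches(request_path, path)]
    return "Cookie: " + "; ".join(pairs) if pairs else None


jar = [
    (".amazon.com", "/", "ubid-main", "002-2904428-3375661"),
    (".doubleclick.net", "/", "id", "490d9f0e"),
    (".amazon.com", "/music", "session-id", "abc123"),   # invented entry
]
print(build_cookie_header("www.amazon.com", "/index.html", jar))
print(build_cookie_header("www.amazon.com", "/music/cd.html", jar))
```

The first request matches only the ubid-main cookie; the second also falls under the /music path, so both name/value pairs travel in one header.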
Finally, in step 6, amazon.com's web server receives Chip's second file request and the associated cookie request header. The value of the cookie becomes data for use by the server-side gateway programs. Using the CGI communication standard, these programs can access the cookie data from the Web server in order to remember something about Chip. In this case, it could be the number of times Chip has visited amazon.com or it could be adding this musical selection to his shopping cart.
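
On the server side, a gateway program can read the cookie and act on it. The sketch below uses Python's standard http.cookies module and assumes the Web server passes the request header to the program in the HTTP_COOKIE environment variable, as CGI does. The "visits" cookie and the function names are hypothetical, chosen only to mirror the visit-counting example above.

```python
# Minimal sketch of a server-side gateway program reading a cookie.
# The "visits" cookie and the function names are hypothetical.
import os
from http.cookies import SimpleCookie
from typing import Dict, Mapping


def read_cookies(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Parse the Cookie request header that CGI exposes as HTTP_COOKIE."""
    jar = SimpleCookie()
    jar.load(environ.get("HTTP_COOKIE", ""))
    return {name: morsel.value for name, morsel in jar.items()}


def next_visit_count(cookies: Dict[str, str]) -> int:
    """Increment the stored count, starting at 1 for a first visit."""
    try:
        return int(cookies.get("visits", "0")) + 1
    except ValueError:
        return 1                      # ignore a malformed value


def visit_response_header(cookies: Dict[str, str]) -> str:
    """Build the Set-Cookie header that stores the updated count."""
    out = SimpleCookie()
    out["visits"] = str(next_visit_count(cookies))
    out["visits"]["path"] = "/"
    return out.output()


fake_env = {"HTTP_COOKIE": "ubid-main=002-2904428-3375661; visits=3"}
cookies = read_cookies(fake_env)
print(cookies["ubid-main"])
print(visit_response_header(cookies))
```

Passing a plain dictionary in place of the real environment keeps the sketch runnable outside a Web server.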
To summarize the entire transaction between Chip and amazon.com:
Set-Cookie: ubid-main=002-2904428-3375661; path=/; expires=<date in GMT>
.amazon.com TRUE / FALSE 2082787601 ubid-main 002-2904428-3375661
Cookie: ubid-main=002-2904428-3375661
The persistent HTTP state created between a Web server and a Web client by utilizing cookies supports advanced site functionality such as online shopping and custom Web pages. The cookie value represents a small piece of data about the user, such as an identification number, that can be accessed by Web site programs running on the Web server to provide a variety of services.
These examples of cookie use, while beneficial, have caused a significant amount of controversy between Web users and Web developers. As stated by Simon St. Laurent in his book Cookies, "The information contained in most cookies is trivial but is still enough to make programmers' and marketers' dreams conceivable and give privacy advocates fits" (Machlis). User privacy and security must be addressed while site functionality is maintained as evidenced by the five cookie issues presented below.
1. Cookies do not gather data, but they can be used as a tracking device to assist individuals and organizations who are capturing and analyzing user information (Whalen). Although only 5% of Net users and 7% of online-service users believe they have been the victims of a privacy invasion while online, sites using cookies as tracking devices cause many users to feel that their privacy has been invaded (P&AB Harris). Michelle Slatalla, a writer for the New York Times, typifies this concern in the following account:
"Actually, those cookies were sent without my permission, and since I cannot decode the little aliens, how can I be sure that they are not whooping it up with one another, exploring my hard drive, sending my secrets back to their leader on some distant planet? (Cookie A: 'Hey, Moe, get a look at this file. No wonder she calls it a rough draft!' Cookie B: 'Yeah, did you notice she's so cheap that she's played solitaire 749 times without paying the shareware fee?')"
While users may be uncomfortable, it should be noted that "Cookies are merely a slight enhancement to a tool that developers have used since the Web's inception: log files" (Slayton). Using cookies allows Web servers and their administrators to track the same information stored in server log files: the client's Internet Service Provider, his/her operating system and browser software, and any links the client utilized to reach his/her current page (Webmaster Report Part 3). A cookie can store user information beyond that contained in a log file if the user provides the information via an online form. David Whalen states, "A cookie alone cannot read your hard drive to find out who you are, what your income is or where you live. The only way that information could end up in a cookie is if you provide it to a site and that site saves it to a cookie." This kind of detailed user data resembles traditional information given to retail stores, catalog companies, and other merchants via U.S. mail, telephone and fax every day. The use of a cookie, therefore, does not increase the types of information that a Web site can collect, but it makes the available information easier to collect, preserve and analyze.
2. If the browser's default settings have not been changed, a web user does not know when a cookie is set on his/her hard drive. Roger Clarke believes this feature of cookies makes them "surreptitious" because it takes advantage of a user without his/her knowledge. The clandestine nature of cookies has already caused ethical problems for researchers utilizing cookies to obtain information regarding user needs and preferences. Amy Bruckman of the Georgia Institute of Technology states, "The main issue here is one of informed consent. You should not do research on people without their informed consent" (Kiernan A30). However, Michael Loui of the University of Illinois, Urbana-Champaign believes that "The value of the research may outweigh the small loss of privacy" (Kiernan A32).
The conflict among researchers exemplifies the larger-scale debate of user responsibility versus site responsibility regarding cookie use. Users can take control of the cookie situation in one of two ways. First, browsers can be configured to prompt the user when a cookie is about to be set or to block all cookies from being set (CIAC). In Netscape, this is accomplished under the Edit menu, Options section. From the Options dialog box, the user can select the Advanced tab and set his/her cookie notification preferences. In Microsoft Internet Explorer, the user can select Internet Options from the View menu. Then, by clicking on the Advanced tab, the cookie preferences can be set. The second method of cookie control is obtaining software such as Cookie Pal, ZDNet Cookie Master 2, Cookie Crusher 1.5, PGP Cookie.cutter and Crumbler97. These shareware programs can be downloaded from Cookie Central's Web site and require a nominal registration fee.
Because of the cookie control given to users, Web sites may want to undertake greater responsibility for cookie use. As stated by Robert Andrews, the Netscape webmaster, "The bottom line is that users have the power to reject cookies, so if the Web industry wants to use them, they're going to use them responsibly" (Bowen). For a Web site, addressing the surreptitious nature of cookies may entail providing details regarding that site's use of cookies in a privacy statement linked to each page within the site. However, according to a survey of the 100 most popular Web sites conducted by the Electronic Privacy Information Center, 17 of the 24 sites that utilized cookies had privacy policies, but none informed their users about that cookie use (Corr).
RFC 2109: HTTP State Management Mechanism, a proposed standard for cookies written by the HTTP Working Group of the Internet Engineering Task Force (IETF), outlines a more formal solution for informing users about a site's use of cookies. David Kristol, co-author of the proposal, hopes RFC 2109 will persuade Web sites to alert computer users about cookie use. The proposal specifies the inclusion of a cookie HTTP response header with a simple explanation of the cookie's purpose and a URL where the user can access more detailed information. The proposed syntax for the Set-Cookie2 HTTP Response Header includes two additional attributes (Kristol):

Comment=value (optional): "allows an origin server to document how it intends to use the cookie"

CommentURL="http_URL" (optional): "allows an origin server to document how it intends to use the cookie. The user can inspect the information identified by the URL to decide whether to initiate or continue a session with this cookie."

Kristol continues to revise the working draft of RFC 2109 while encouraging browsers and site administrators to support the new standard. He states, "Of course, there's no Internet police, but we will try to get everyone on board" (Slatalla).

3. The information contained in a cookie can be read using a text editor or word processing program, which creates a potential security breach. Two methods of confronting this security issue are encryption and non-persistent cookies. Using encryption, any information in the cookie can only be viewed and used by the server that originally set it. Non-persistent cookies are maintained for the duration of the current session and, therefore, cannot be accessed at a later time (Webmaster Report Part 3).

4. Because a cookie lives on the hard drive of the computer to which it was originally transmitted, problems can occur for organizations that advocate computer sharing, such as a university. If a client logs on to a site and transacts business, the next client can use the browser history to return to that site and potentially use the first client's account. Password protection is a potential solution, but passwords can often be easily obtained by monitoring a user during an Internet session (Chuck).

5. Companies that use cookies for target marketing could "amass in a single cookie all the information a surfer's web surfing software makes available to all those sites" (Webmaster Report Part 3). Because targeted marketing companies (TMC) use one cookie to monitor a user's progression through multiple sites that contain the TMC's ads, one cookie could potentially contain all the data gathered from one user. It is important to note, however, that a cookie cannot be accessed by any other server than the server that originally created it. Also, the server that created the cookie cannot use it to access any information on the user's computer except data within the cookie. To reiterate, "Cookies are not viruses or agents that scan a user's hard disk. A cookie can't, on his own, determine a user's hat size. A cookie can, for example, store a user's physical dimensions if the user completes a demographic form" (Michelson & Rein).

As seen in these issues, striking a balance between the benefits of cookie use and the potential consequences of use presents some difficulty for Web users and developers. While maintaining this balance on specific issues, it is important for users and developers to keep three general concepts in mind. First, cookies do not harm the user's computer. The Computer Incident Advisory Capability of the U.S. Department of Energy issued an information bulletin that assesses the vulnerability of systems because of cookie use:

"The vulnerability of systems to damage or snooping by using web browser cookies is essentially nonexistent. Cookies can only tell a web server if you have been there before and can pass short bits of information (such as a user number) from the web server back to itself the next time you visit. . . .Information about where you come from and what web pages you visit already exists in a web server's log files and could also be used to track users browsing habits, cookies just make it easier."

Second, cookie netiquette guidelines (Webmaster Report Part 4) should be followed to further alleviate Web user concerns.

Finally, if all else fails, sit back, relax and dive into the real thing:

The Practically Perfect Chocolate Chip Cookie

1. Preheat oven to 375F.
2. Mix the flour, baking soda and salt in a bowl and set aside.
3. Use a stand-type electric mixer to mix the two sugars briefly at low speed.
4. Add the butter in small gobbets, mixing first at low speed and then at high. Beat the mixture until it's pale, light, and very fluffy.
5. Add the vanilla at the mixer's lowest speed, then beat at high speed for a few seconds.
6. Add the eggs, again at the lowest speed, switching to high speed for the final second or so. The eggs should be well beaten in, and the mix should look creamed, not curdled.
7. Add the flour mixture, a half cup at a time, mixing at low speed for about one minute, then at high speed for a few seconds.
8. Scrape down the bowl's sides with a spatula, add the chocolate chips, and mix at low speed for about 10 seconds. If need be, scrape the bowl's sides again and mix for a few more seconds.
9. Put tablespoons of the mix on an ungreased cookie sheet.
10. Bake until the cookies are pale golden brown (nine minutes in an electric oven, 10 to 11 minutes in a gas one).
11. Remove and let cool on a rack.

Makes about 40 medium cookies.

Bowen, Barry D. "How popular sites use cookie technology." NetscapeWorld. (Apr. 1997): n. pag. Online. Internet. 23 Oct. 1998.
Clarke, Roger. "Cookies." 1 June 1998. Online posting. Roger Clarke's Cookies Page. 2 Oct. 1998.
Cookie Central. Cookie Test. Online posting. 28 Oct. 1998.
Cookie Central. Cookie Software. Online posting. 28 Oct. 1998.
Corr, O. Casey. "Cybersnoops on the Loose." The Seattle Times 10 Aug. 1997: B5. SIRS Researcher. CD-ROM. Fall 1998.
Kristol, David M. "HTTP State Management Mechanism." 29 Jul. 1998. Online posting. Internet Engineering Task Force. 23 Oct. 1998.
Machlis, Sharon. "Cookies are a marketer's dream, but do they watch too closely?" Computerworld 13 Jul. 1998: 25. Infotrac. Online. Expanded Academic. 22 Oct. 1998.
Michelson, Greg and Lisa Rein. "Five reasons people find cookies objectionable & how to address them." NetscapeWorld (Feb. 1997): n. pag. Online. Internet. 23 Oct. 1998.
Netscape. "Cookies and Privacy Frequently Asked Questions." Online posting. Netscape FAQ. 28 Oct. 1998.
Netscape. "Persistent Client State HTTP Cookies." Online posting. Netscape Support Documentation. 5 Oct. 1998.
P&AB Harris. "Internet Privacy Survey." Privacy & American Business. July/Aug 1997. SIRS Researcher. CD-ROM. Fall 1998.
Rejonis, Charles. "Opening the HTTP cookies jar." NetscapeWorld (Jul. 1996): n. pag. Online. Internet. 23 Oct. 1998.
Slatalla, Michelle. "Cookies may annoy but they don't hurt." New York Times 2 Apr. 1998, late. ed.: G11. New York Times Online. Online. Nexis. 22 Oct. 1998.
Slayton, Marc. "An Introduction to Cookies (Part 1 & 2)." Online posting. Webmonkey geektalk. 5 Oct. 1998.
The Webmaster Report. "Magic Cookies - Web Delicacy or Half-Baked Blunder? Part 1" The GH Interactive Webmaster Report Issue #2. (5 May 1997): n. pag. Online. Internet. 23 Oct. 1998.
The Webmaster Report. "Magic Cookies - Web Delicacy or Half-Baked Blunder? Part 2" The GH Interactive Webmaster Report Issue #3. (12 May 1997): n. pag. Online. Internet. 23 Oct. 1998.
The Webmaster Report. "Magic Cookies - Web Delicacy or Half-Baked Blunder? Part 3" The GH Interactive Webmaster Report Issue #4. (19 May 1997): n. pag. Online. Internet. 23 Oct. 1998.
The Webmaster Report. "Magic Cookies - Web Delicacy or Half-Baked Blunder? Part 4" The GH Interactive Webmaster Report Issue #5. (26 May 1997): n. pag. Online. Internet. 23 Oct. 1998.
U.S. Department of Energy, Computer Incident Advisory Capability. "Information Bulletin I-034: Internet Cookies." 12 Mar. 1998. Online posting. CIAC Bulletins. 5 Oct. 1998.
Whalen, David. "The Unofficial Cookie FAQ." Online posting. Cookie Central. 2 Oct. 1998.