Privacy on the Public Internet

Internet Privacy - Fact or Fiction?

Thomas R. Hintz

Abstract:

The Internet has become the preferred method for communications and information access and exchange. This use, however, has come at a cost that most users do not realize: a loss of privacy. In the early stages of Internet activity, users were almost anonymous. Today things are different. Anonymity is disappearing.

Browsing the Internet leaves a trail of your web site visits. Posting to Usenet News and your web pages may increase the receipt of mail Spam. Web site vendors are sharing information to build a user profile of their visitors. Have you been profiled? It is in your best interest to understand the problem and find out what can be done to eliminate or reduce this unwanted intrusion on your privacy.

Introduction:

The Internet has become the preferred method for communications and information access and exchange. During the last few years this use has come at an increasing cost that most users do not realize: a loss of privacy. In the early days of the web, a well-known cartoon showed a dog sitting at a computer keyboard and saying "on the Internet no one knows you are a dog." Well, that has all changed. When the Internet became commercialized in the late 1990s, a new breed of user emerged: the retailer. No longer limited to government and academia, the Internet drew businesses in ever increasing numbers, all expecting big returns from e-commerce. They expected the customers to come to them and buy on-line. But something was missing: the customer.

The Internet was a new and unknown environment for both sellers and buyers. Before sellers could be successful on the Internet they needed to understand the new environment and adapt their marketing strategies. The most important component of sales has always been "location, location, location." This they had, but so did everyone else. To gain an advantage on this level playing field it was necessary to advertise: find the customers and tell them where you are located. But how? This had never been done before.

As a result of this quest, users are now visibly besieged with increasing amounts of junk mail and pop-up advertisements as they tour the Internet. But far worse and unseen, users are unknowingly giving away their personal information. This information is being collected in small bits and pieces: a question here, another one there. Your PC provides some of it, as do your browser, a freeware program and an e-mail message. The data collected is being merged and compiled into a personal profile for each user. Are you concerned? Your personal information is a valuable commodity and many users are just giving it away, for free. It should be protected.

The vast majority of users are concerned about their privacy. According to a 1999 Forrester Research privacy study, 67% of consumers were extremely concerned, 24% were somewhat concerned and only 9% had little or no concern. However, users are unaware of their vulnerability and of the information that is unknowingly being collected about their activities. They are even less aware of how to protect themselves on the Internet.

The collection of your personal information takes many forms. Today, application programs are being used like Trojan horses to stalk you on the Internet, and the variety of collection methods continues to grow. Unfortunately there are only a few privacy laws that govern the way businesses operate on the Internet.

Web Browsers:

The development of the World Wide Web and the use of web browsers was the beginning of widespread data collection on the Internet. The underlying HTTP protocol and HTML language allowed for the transfer and collection of basic information about the requesting user. To see what is transferred from your computer you can go to the Snooper web site (http://snoop.cdt.org/). The data sent with each transfer includes the return address for sending the requested web page as well as technical information intended for programmers to customize web page layout. It was also collected in log files for web site owners to evaluate their site activity. This type of information by itself is of minimal concern. But, your fingerprints in the form of an IP address (and more) are left at every site visited. Other types of data are becoming accessible from the browser. For example, Internet Explorer 5.0 web browser also informs a web site when a user bookmarks their page.
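To make the "fingerprints" concrete, the following Python sketch shows roughly what a web server can record from a single page request. It is only an illustration under stated assumptions: the raw request text, the host names, the IP address and the function name describe_visitor are all hypothetical, and a real browser sends more headers than these.

```python
# Hypothetical raw request, similar to what a browser of the period might send.
RAW_REQUEST = (
    "GET /index.html HTTP/1.0\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)\r\n"
    "Referer: http://www.example.org/links.html\r\n"
    "Accept-Language: en-us\r\n"
    "\r\n"
)


def describe_visitor(raw_request, client_ip):
    """Return the facts a server could log from one request."""
    lines = raw_request.split("\r\n")
    _method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "ip": client_ip,                      # the return address
        "path": path,                         # the page requested
        "browser": headers.get("user-agent"),
        "came_from": headers.get("referer"),  # the page that linked here
        "language": headers.get("accept-language"),
    }


if __name__ == "__main__":
    for key, value in describe_visitor(RAW_REQUEST, "192.0.2.17").items():
        print(f"{key}: {value}")
```

None of these items names the user, but logged at every site visited they form the trail described above.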

Cookies:

As HTML development continued, new features were added that enhanced the capabilities of the browser and provided new tools for the programmer. One of these tools was the cookie. In February of 1997, RFC 2109 was released (http://info.internet.isi.edu/in-notes/rfc/files/rfc2109.txt). It set out the Internet Engineering Task Force guidelines for cookie use on the Internet. When it was written, the authors saw the potential for abuse of privacy. Consequently, Section 7.1 requires the browser to give the user control over this intrusion. Browser designers incorporated the cookie concept but ignored this privacy section. Instead, browser defaults are set to accept cookies. It is up to the user to deactivate (opt out of) this intrusion.
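As an illustration of the mechanism, the Python sketch below uses the standard http.cookies module to build the header a server might send on a first visit and to read the value back when the browser returns it. The cookie name visitor_id, the one-year lifetime and the two function names are assumptions chosen for the example, not details taken from RFC 2109.

```python
from http.cookies import SimpleCookie


def issue_cookie(visitor_id):
    """Build the Set-Cookie header a server could send on a first visit."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id
    cookie["visitor_id"]["path"] = "/"
    cookie["visitor_id"]["max-age"] = 60 * 60 * 24 * 365  # keep for one year
    return cookie.output(header="Set-Cookie:")


def read_cookie(cookie_header):
    """Recover the visitor ID from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel is not None else None


if __name__ == "__main__":
    print(issue_cookie("abc123"))
    print(read_cookie("visitor_id=abc123; theme=dark"))
```

Because the browser returns the stored value on every later request to the same site, the server can recognize a returning visitor without asking for a name.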

While cookies do have value to both the user and web providers and are safe and even helpful, they can be misused. A cookie is beneficial to the user when dealing with a company that has a good privacy policy in place, but of questionable value when left open and available to the world at large. The real problem is the aggregation of data from multiple sources into a user profile. Collected personal information is now being treated as a commodity belonging to the collectors.

Few browser users know and understand the utility of cookies or how they should be managed to their advantage. Many users do not go beyond the knowledge that cookies exist. Their use (or abuse) takes advantage of the user's inexperience. Even at its best, cookie management in browsers is quite primitive. For now, the usual three options are to accept all cookies (risky), to be prompted for each cookie (tedious) or to block all cookies (which eliminates personalization). To go beyond this limited selection it is necessary to use third-party software. Many free programs are available.

Web Bugs:

One of the latest innovations being used to track Internet users is called the Web Bug. These bugs can be found in various applications including browsers, Usenet News and e-mail provided they support HTML. Impossible to see and difficult to identify, they usually go undetected. A web bug can be as small as an invisible 1x1 pixel. The request for the pixel to display is sent to an advertiser that can return a cookie. This method hides the fact that monitoring (eavesdropping) is taking place.

Advertisers use them extensively. More than 400,000 have been found on web pages for advertisers like BeFree, DoubleClick and ClickTrade. They are designed to monitor the reading of a web page or e-mail message. These web bugs can also carry information back to the advertiser or sender, including when and where the message was read and who else read it. Turning off web graphics will prevent this transfer, but most users prefer the look of a graphically enhanced web page.
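One rough way to spot likely web bugs is to look for very small images that are served from a different host than the page itself. The Python sketch below applies that heuristic with the standard html.parser module. The class name WebBugFinder, the host names and the sample page are hypothetical, and the test is only an approximation: a bug can be hidden in other ways, and some tiny images are harmless spacers.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WebBugFinder(HTMLParser):
    """Collect tiny images that are loaded from a third-party host."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # A bug is typically declared as 1x1 (or 0x0) so it cannot be seen.
        tiny = attr.get("width") in ("0", "1") and attr.get("height") in ("0", "1")
        host = urlparse(attr.get("src") or "").hostname
        third_party = host is not None and host != self.page_host
        if tiny and third_party:
            self.suspects.append(attr["src"])


if __name__ == "__main__":
    page = (
        '<p>Welcome</p>'
        '<img src="http://ads.example.net/b.gif?id=42" width=1 height=1>'
        '<img src="/logo.gif" width="120" height="60">'
    )
    finder = WebBugFinder("www.example.com")
    finder.feed(page)
    print(finder.suspects)
```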

Some providers have been truly inventive in circumventing roadblocks to collecting personal data. Even if cookies are blocked, the server may distribute web bugs with "fake dates". Whenever the browser cache contains a web bug image (including an advertisement), it asks the server if the image has been modified since the "fake date". This instantly and uniquely identifies the user to the server since every person being tracked is given a different "fake date".
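The "fake date" trick can be sketched in a few lines of Python. The sketch assumes the date travels in the standard HTTP Last-Modified and If-Modified-Since headers, which is how browsers normally revalidate cached images. The class FakeDateTracker, its starting date and the visitor numbering are hypothetical, and no real network traffic is involved.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


class FakeDateTracker:
    """Simulate a server that tags each new visitor with a unique date."""

    def __init__(self):
        self._base = datetime(1990, 1, 1, tzinfo=timezone.utc)
        self._known = {}  # fake date string -> visitor number

    def serve_image(self, if_modified_since=None):
        """Return (status, headers, visitor number) for one image request."""
        if if_modified_since in self._known:
            # The date stored with the cached copy doubles as an identifier.
            return 304, {}, self._known[if_modified_since]
        visitor = len(self._known) + 1
        # Each new visitor gets a date that differs by one second.
        fake_date = format_datetime(
            self._base + timedelta(seconds=visitor), usegmt=True
        )
        self._known[fake_date] = visitor
        return 200, {"Last-Modified": fake_date}, visitor


if __name__ == "__main__":
    server = FakeDateTracker()
    status, headers, visitor = server.serve_image()
    print(status, headers, visitor)
    # Later, the browser revalidates its cached copy and is recognized.
    print(server.serve_image(headers["Last-Modified"]))
```

No cookie is involved at any point, which is why blocking cookies does not stop this form of tracking. Clearing the browser cache does.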

Hardware:

Hardware components have been utilized to identify users. Being able to identify and verify a user is important for e-commerce. There were no unique identifiers that could not be easily changed on a system, thus no way for absolute system identification.

Intel intended to remedy this problem when it introduced the Pentium III chip. It had incorporated into the chip design a unique processor serial number (PSN). Again, the PSN was turned on by default. In January 1999 news of the PSN reached the public. The response was immediate, continuous and negative. Only after much unrest did Intel provide a program to turn the PSN off. While this appeased the public, methods were later found to programmatically retrieve the number even if it was turned off. Bowing to consumer pressure, Intel announced in April 2000 that it would not include a PSN in future chips, starting with the 1.5 GHz Pentium 4 chip.

A relatively new peripheral is being made available to the public, the CueCat barcode scanner. Marketed as the simple way to surf the web, this device is available for free at Radio Shack stores. Its intended function is to allow the user to scan a product's bar code and automatically send the web browser to the home site for the product. When curious hardware hackers inspected the device, each CueCat was found to contain a unique ID tracking number. In use, when a barcode was scanned, the code along with the ID would be sent to a central database before your requested URL was returned. Since you are required to register to use this "free" service, all your bar code requests can be logged. Because it is not a mainstream product at this time, the privacy issue with this product has not created the publicity and uproar of the Intel PSN. Interestingly, concerned programmers have dissected the cat and found ways to eliminate this identifier and convert the CueCat into a simple, local scanning device.

Software:

Commercial software has included tracking information. During March 1999, Microsoft Word, Excel and PowerPoint files were found to contain a Globally Unique Identifier (GUID) in their headers. This GUID, called a Microsoft System ID (MSID) by the manufacturer, contained the Ethernet address of the installed network card. It was sent with online registration and embedded in cookies for Microsoft access. Individuals who created an original Word, Excel or PowerPoint document would have their unique identifier included without their knowledge.
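To show how a network card address can ride along inside an identifier, the Python sketch below pulls the address out of a GUID. It assumes the GUID follows the common time-based (version 1) layout, in which the last 48 bits hold the Ethernet address; whether the Office identifiers used exactly this layout is an assumption here, and the sample GUID and the function name mac_from_guid are made up for the example.

```python
import uuid


def mac_from_guid(guid_text):
    """Return the Ethernet address embedded in a version 1 GUID, or None."""
    guid = uuid.UUID(guid_text)
    if guid.version != 1:
        return None  # other GUID versions do not carry a hardware address
    # The node field is a 48-bit integer; format it one byte at a time.
    return ":".join(
        f"{(guid.node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8)
    )


if __name__ == "__main__":
    # Hypothetical identifier of the kind found in a document header.
    print(mac_from_guid("6fa459ea-ee8a-11d2-9c6b-00105a1b2c3d"))
```

Anyone holding two documents can compare these trailing digits to judge whether both were created on the same machine.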

Spyware:

While the hardware and software intrusion is passive, a special form of software takes a more active role in sending information about the user back to the program developer or company. This special type of code is called spyware. It may come embedded in various programs that are distributed as freeware, shareware or helper applications. While the free cost and utility of the program entice you to download and use it, the program performs additional tasks without your knowledge or explicit permission: it secretly gathers information about you and relays it to advertisers or other interested parties.

The Netscape Smart Download add-on utility keeps Netscape aware of what you are downloading from the network. Every request for a download is sent with your ID to Netscape. Some spyware collects information about your activities even when you are off-line. For example, the initial release of RealNetworks' RealJukebox collected your off-line listening habits and sent them along with a unique GUID to the company when an Internet connection was reestablished. More than 800 infested software applications have been identified (http://www.infoforce.qc.ca/spyware/).

Specialized privacy probes:

With the addition of some brief JavaScript code to the web bug mentioned above, it is also possible to retrieve any comments added to an e-mail message and determine who added the comments. Discovered in 1998 by Carl Voth, a computer engineer, this technique is in essence an e-mail wiretap. It was not made public until February 2001. It is not clear if this technique is legal, but it certainly is doable with a minimum of code. Microsoft has released a downloadable patch (http://office.microsoft.com/Assistance/2000/Out2ksecFAQ.aspx) to disable all executable file types as attachments to e-mail. An immediate fix is to disable JavaScript in Outlook e-mail or your browser.

Programs are also available to determine where the machine you are using is located. Locator technology can now pinpoint your physical location down to country and city with 90% accuracy. With the help of computers distributed throughout a network, your computer can be located using computer triangulation. Although it is possible to pinpoint the location down to a zip code, developers indicated that this has not been done (yet).

Verifications:

Another method for collecting personal information is under the guise of renewal verification. Previously, a signature was required when renewing free magazine subscriptions. Publishers now accept on-line renewals with a different form of verification. They ask simple but personal questions. Taken alone, each question may not seem important. But combine the information asked for, such as eye color, mother's maiden name, city of birth, state of birth, birth month, birth day and birth year, and it begins to resemble a personnel form.

Opt-In vs. Opt-Out:

The current debate on the method for controlling on-line advertising and spam is over the implementation of an “opt-in” or “opt-out” system. Should the public be required to “opt-out” whenever an unwanted message is received? Or, should the public “opt-in” before specific information is sent? The current method to signify “opt-out” is for each company to place a special cookie on your system. With the potential for millions of on-line companies, user maintenance of an “opt-out” list will become a daunting task. You are "opting out" only as long as the special "opting out" cookie exists on your hard drive. If you were to clean up your cookie accumulation and accidentally delete an "opting out" cookie, you have just opted back into the system. You are sure to pick up another one of their "regular" cookies on your next surfing session and the process starts all over again.

Public opinion shows that 71 percent favor "opt-in" policies that place the burden of keeping information private on Web sites and not on users. Although people want “opt-in”, the industry is moving very strongly in the direction of “opt-out” policies. It is clear the online industry is moving away from public opinion on privacy because it is in their best interest.

For now, a decision about responding to an “Opt-out” option at the end of a spam message can be frustrating. Because of its misuse within spam messages, a user has no way of knowing if a response will actually be an “opt-out” or just a ploy to confirm the validity of an active e-mail address of someone who reads spam mail messages. This scheme is also known as an “opt-out” mailing list and is a terrible alternative to “opt-in”.

Spam:

The first major spam occurred in April 1994. Two lawyers in Arizona, Laurence Canter and Martha Siegel, sent an advertisement to over 6,000 newsgroups offering their services with the "Green Card Lottery." Since then, spam has continued to be the scourge of the Internet, filling mailboxes with mass distributions of unsolicited e-mail messages in an attempt to force the message on people who would not otherwise choose to receive it. Most spam is commercial advertising, often for dubious products, get-rich-quick schemes, or quasi-legal services. According to the Electronic Messaging Association, junk mail now accounts for an estimated 10% of all e-mail traffic.

If you receive spam (and who doesn't?), do not forward the message or reply to its "remove" requests. You should ignore, delete and filter such incoming e-mail. The only sure way to reduce spam is to change your e-mail address, but this is inconvenient and only a temporary solution. Interestingly, it was found that the percentage of computer users who receive spam increases the longer they use any one ISP.

Unfortunately it is not possible to develop the perfect spam filter because there is little difference between the e-mail you do want and the spam you do not want. Surprisingly, even spam itself can collect information through the use of web bugs. Because the sender of a junk e-mail message already knows your e-mail address, the sender can include that address as a web bug parameter, thus gaining a more complete picture of who and where you are.
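The Python sketch below shows how little is needed to tie an opened message to a person. It assumes the bug is an image URL with the address carried in the query string. The host tracker.example.net, the parameter names e and c, and both function names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def make_bug_url(email, campaign):
    """Build the image URL a sender could embed in one recipient's message."""
    query = urlencode({"e": email, "c": campaign})
    return "http://tracker.example.net/pixel.gif?" + query


def log_open(bug_url, client_ip):
    """What the sender's server learns when the mail reader fetches the image."""
    params = parse_qs(urlparse(bug_url).query)
    return {
        "email": params["e"][0],
        "campaign": params["c"][0],
        "ip": client_ip,  # hints at where the message was read
    }


if __name__ == "__main__":
    url = make_bug_url("smith@work.com", "spring-sale")
    print(url)
    print(log_open(url, "192.0.2.17"))
```

Simply displaying the message with images enabled is enough to trigger the request; the reader never has to click on anything.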

Rules, Regulations And Standards:

While security has been the big topic in the past and continues to be an ongoing problem, privacy has challenged for the lead. More than 450 privacy-related bills have been introduced in state legislatures within the past few years. About 50 have appeared at the national level this year. Few if any have been passed for now. Currently, only three major privacy bills are available. They are:

There is no guarantee that even watchdog groups like BBBonline and TRUSTe make a difference. Their seal is only an indication that a privacy policy is posted and allows an “opt-out” option for consumers. This is the type of solution the industry is supporting in place of legislation. As a result, it is legal to collect any information about most users on-line and many web sites are doing just that. Already, privacy rules are much stricter in Europe where it is illegal to resell consumer information without their approval. The European Union appears to be taking the lead in supporting privacy laws. Europe is ready to implement stricter protection this year while the U.S. is delaying action. Consequently, until adequate legislation is passed to protect consumer privacy, individual users must take the initiative to implement the necessary safeguards that provide the level of privacy protection desired.

Abuse:

Is this personal information being abused? Yes! One example occurred last year with Amazon.com. It was reported that various users were quoted different purchase prices for the same DVD or MP3 player. When Amazon was confronted, it emerged that the company was using what is called dynamic pricing. Prices were determined according to an estimate of how much a given consumer was willing to pay. It was not clear what profile was used to make this determination. It could have been based on previous buying habits or even on the level of affluence suggested by the customer's zip code.

Solutions:

Because current legislation regarding online privacy does not favor the public in most cases, it will be up to the users to protect themselves. The best thing to do is to maintain a low profile while surfing the web. Begin by cleaning your browser with the removal of all personal information.

Honesty is not necessarily the best policy when filling out blind forms. Overcome the tendency to be completely truthful and informative. Don't overshare by answering questions that are unnecessary for the service rendered. Remember you are communicating with someone you do not know. Create a virtual user with characteristics you can easily remember (e.g., born 6/6/66, name John Smith) and only provide accurate information when absolutely necessary. A user survey found that 24 percent of users had supplied false information to web sites.

Make sure a secure link is used when transferring personal or financial information. Check the KEY or LOCK browser icon that indicates a secure link. The URL should begin with HTTPS:// . Set up a secondary e-mail account to protect the primary address from vendors. Don't provide your e-mail address when connecting to FTP servers. Hide behind a Proxy server or Firewall or both and use DHCP when available.

Manage your cookies. Download a free cookie viewer program to delete all unwanted cookies. Removal of unwanted cookies is necessary because disabling cookie acceptance does not block use of existing cookies. If in doubt, delete them all and start with a clean cookie file. Configure your browser to not accept any cookies (unless approved). Visit some of the major advertising sites and look through their privacy statements to find the “opt-out” option. Accept their “opt-out” cookie. Other cookies that could be accepted could contain username and passwords, portal setup information, preferred e-commerce sites and personalization for acceptable sites. Remember, cookies are beneficial when used correctly.

Avoid spambots that collect e-mail addresses from web sites and Usenet News sites. Do not use the mailto: tag in HTML. Instead, disguise the address with obvious characters that need to be removed before use (smith@NOSPAMwork.com), spell out the address (smith at work dot com), encode it with HTML character entities (smith&#64;work.com) or create a graphic image of the address.
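The first three disguises are easy to automate. The Python sketch below is one possible way to do it; the function names and the NOSPAM marker are illustrative choices, and none of these tricks will stop a harvester that has been written to undo them.

```python
import html


def munge(address, marker="NOSPAM"):
    """Insert an obvious marker that a human must remove before replying."""
    user, _, domain = address.partition("@")
    return f"{user}@{marker}{domain}"


def spell_out(address):
    """Write the address in words so that it no longer looks like one."""
    return address.replace("@", " at ").replace(".", " dot ")


def entity_encode(address):
    """Encode every character as a numeric HTML character entity."""
    return "".join(f"&#{ord(ch)};" for ch in address)


if __name__ == "__main__":
    address = "smith@work.com"
    print(munge(address))
    print(spell_out(address))
    encoded = entity_encode(address)
    print(encoded)
    # A browser decodes the entities, so visitors still see the address.
    print(html.unescape(encoded))
```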

Conclusion:

As shown, many methods are being used to track and collect personal information. While the programming ingenuity is to be admired, the ethical and legal use of the collected data is yet to be determined. But the collection of personal information to create a user profile can be beneficial provided the user maintains control. A profile allows personalization that can give you a sense of belonging. It allows a site to greet you by name, offer you the services and products you prefer, and spare you from rekeying requested information. You can truly appreciate these benefits. But don't be lulled into a false sense of security. Review the privacy statement of a vendor before establishing an open relationship with their site. Observe the same precautions you would take in dealing face-to-face with a stranger.

Make no mistake; your personal information is being collected. Until the Internet, as a whole, becomes a place of trust, users should remain vigilant against unwelcome, invasive activities. Protect yourself. As with any environment there are safe and dangerous locations to visit. You will eventually learn the safe spots where you can let your guard down and share information in a cooperative exchange for your mutual benefit. Until then, keep your guard up. You are being watched.

Support Documentation:

 
 
From the notebook of Thomas Hintz
the AgriGator