Thursday, November 18, 2010

Long Tail Effect and Streisand Effect in the Web 2.0 Age

    The emergence of Web 2.0 is changing our lives dramatically. Some phenomena that were first observed in fields outside the World Wide Web turn out to have significant influence in Web 2.0. In addition, some new phenomena are taking place now, and more will appear in the near future as the Web 2.0 age unfolds. In this post, I am going to talk about two such phenomena, the Long Tail effect and the Streisand effect, in the context of Web 2.0.
Long Tail Effect
    The term Long Tail originates from the field of statistical distributions. It is a long-known feature, also known as heavy tails, fat tails, power-law tails or Pareto tails. It describes the statistical property that a larger share of the population rests within the tail of a probability distribution than is observed under a 'normal' or Gaussian distribution.[1] Below is an illustration of the Long Tail.
[Figure: Long tail.svg]
    The vertical axis of this illustration represents popularity, and the horizontal axis represents products. The so-called Long Tail is the yellow part on the right, which is low and wide and looks just like a real long tail. Notice that the areas of the left and right regions match.
    The Long Tail challenges the 80-20 rule, or Pareto principle, which suggests that a market with a high freedom of choice will create a certain degree of inequality by favoring the upper 20% of the items ("hits" or "head") against the other 80% ("non-hits" or "long tail").[1] Chris Anderson popularized the Long Tail by citing Amazon.com and Netflix as examples of companies applying a Long Tail strategy. According to him, products that are in low demand or have a low sales volume can collectively make up a market share that rivals or exceeds that of the relatively few current bestsellers and blockbusters, because the Internet provides huge distribution and sales channel opportunities. In other words, the age of Web 2.0 says goodbye to the 80-20 rule and embraces the Long Tail effect, which brings whole new business opportunities.
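    To make the head-versus-tail comparison concrete, here is a minimal sketch in Java. It rests on an assumption that is not taken from the sources above: the popularity of the product at rank r is modeled as 1/r (a Zipf-like curve). The class and method names are hypothetical and only serve the illustration.

```java
// Hypothetical illustration: popularity at rank r is ASSUMED to be 1 / r^exponent
// (a Zipf-like curve). Real sales data may follow a different shape.
final class LongTailSketch {

    /** Share of total demand captured by the top headFraction of the ranked items. */
    static double headShare(int items, double headFraction, double exponent) {
        int headSize = (int) Math.round(items * headFraction);
        double head = 0.0;
        double total = 0.0;
        for (int rank = 1; rank <= items; rank++) {
            double popularity = 1.0 / Math.pow(rank, exponent);
            total += popularity;
            if (rank <= headSize) {
                head += popularity;
            }
        }
        return head / total;
    }

    public static void main(String[] args) {
        // A shop with limited shelf space stocks only the 200 best sellers,
        // while an online catalogue can list far more items.
        int shelfSize = 200;
        for (int items : new int[] {1000, 10000, 100000}) {
            double share = headShare(items, (double) shelfSize / items, 1.0);
            System.out.printf("%d items: top %d take %.1f%% of demand, the tail takes %.1f%%%n",
                    items, shelfSize, 100 * share, 100 * (1 - share));
        }
    }
}
```

    Under these assumptions, the share taken by a fixed set of 200 best sellers shrinks as the catalogue grows, which is the sense in which the tail can collectively come to rival the head.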
    There are many companies adopting this niche strategy as part of their businesses all around the world, especially Internet companies. Examples include eBay (auctions), Yahoo! and Google (web search), Amazon (retail) and iTunes Store (music and podcasts) amongst the major companies, along with smaller Internet companies like Audible (audio books) and Netflix (video rental).[1]
Streisand Effect
    Unlike the Long Tail effect, the Streisand effect was born in the world of the WWW itself. It is a primarily online phenomenon in which an attempt to censor or remove a piece of information has the unintended consequence of causing the information to be publicized widely and to a greater extent than would have occurred if no censorship had been attempted.[2]
    The classic example is the incident that gave the effect its name, in which Barbra Streisand's unsuccessful attempt to suppress photographs of her residence unexpectedly generated further publicity.
    Other examples include the following. On 5 December 2008, the Internet Watch Foundation (IWF) added the Wikipedia article about the 1976 Scorpions album Virgin Killer to a child pornography blacklist, considering the album's cover art "a potentially illegal indecent image of a child under the age of 18". The article quickly became one of the most popular pages on the site, and the publicity surrounding the censorship resulted in the image being spread across other sites. In November 2009, Wolfgang Werlé and Manfred Lauber, convicted of the murder of Walter Sedlmayr, demanded that their names be removed from an article on the German-language Wikipedia, citing German law. The German Wikipedia complied, but the information was widely publicized as a result.[2] There have been many more similar cases all over the world in recent years.
Why so many effects in the Web 2.0 age
    Besides the two effects I mentioned above, there are many other effects at work on the Internet, some of which we may not even notice and which have no names yet. So we may ask why there are so many effects in the Web 2.0 age.
    In my opinion, it is an inevitable result of the Web 2.0 revolution. After all, Web 2.0 is commonly associated with web applications that facilitate interactive information sharing, interoperability, user-centered design, and collaboration on the World Wide Web.[3]
    In this definition of Web 2.0, I want to focus on the word "user-centered". As the technologies become more and more developed, the psychology of human beings will be reflected more and more on the Internet. Therefore, if we consider the World Wide Web to be a reflection of our real daily life, all these effects seem quite natural.
References
[1] http://en.wikipedia.org/wiki/Long_Tail
[2] http://en.wikipedia.org/wiki/Streisand_Effect
[3] http://en.wikipedia.org/wiki/Web_2.0

Thursday, September 23, 2010

StAX - Another XML Programming Library

    XML is now widely used to exchange data over the Internet, and many emerging technologies are also based on XML. Therefore, parsing an XML file, manipulating the XML data it contains and even generating an XML file are fundamental tasks in many applications.
    There are a variety of existing APIs for XML processing. To be more specific, these APIs can be categorized into four classes: (http://en.wikipedia.org/wiki/Xml)
    1. Declarative transformation languages such as XSLT and XQuery;
    2. XML data binding, which provides an automated translation between an XML document and programming-language objects (this may sound exciting to object-oriented programmers), for instance JAXB in Java;
    3. Tree-traversal APIs accessible from a programming language, for example DOM and XOM;
    4. Stream-oriented APIs accessible from a programming language, for example SAX and XNI.

    Here, I am going to talk about StAX, the Streaming API for XML, which is a standard XML processing API that allows you to stream XML data from and to your application. (http://stax.codehaus.org/Home) As the name itself indicates, StAX belongs to the category of stream-oriented APIs. But it actually differs from earlier APIs such as SAX or XNI.
    The major difference lies in the pattern they adopt. Earlier APIs such as SAX employ a push pattern, in which the parser passes the content of the XML document to the application as soon as it sees it, regardless of whether the application is ready for that data. StAX, on the other hand, adopts a pull pattern, in which the application actively asks the StAX parser for data instead of being fed data passively. In other words, in a pull API the client program drives the parser, whereas in a push API the parser drives the client. (http://www.xml.com/pub/a/2003/09/17/stax.html?page=1)
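    For contrast with the pull-style StAX demo in this post, the push pattern can be sketched with SAX as follows. This is only a minimal illustration: the class name SaxPushSketch, the method cityNames and the inline XML string are my own assumptions, while DefaultHandler, SAXParserFactory and InputSource are the standard JDK types.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Push style: the SAX parser drives and calls back into our handler for every start tag.
final class SaxPushSketch {

    /** Collects the name attribute of every City element that the parser pushes at us. */
    static List<String> cityNames(String xml) throws Exception {
        final List<String> names = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                // Called by the parser for each element, whether or not we care about it.
                if ("City".equals(qName)) {
                    names.add(attrs.getValue("name"));
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample input, a cut-down version of a weather report.
        String xml = "<WeatherReport><City name=\"Hong Kong\"/><City name=\"Macao\"/></WeatherReport>";
        System.out.println(cityNames(xml));
    }
}
```

    Note that the application never asks for the next item here: it hands over a handler and waits to be called, which is exactly what the pull pattern turns around.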
    Another big difference is that StAX is a bidirectional API, which means that StAX can not only read XML documents but also create them. SAX, by contrast, does not support writing data to XML files.
    In general, since StAX is a stream-oriented API, it processes XML faster and consumes less memory than DOM or other tree-traversal APIs. These features become more valuable when XML documents are larger than a few megabytes, and they make StAX a very good option in ubiquitous computing environments, where devices have constrained resources.

    Before the demo, I want to point out that StAX is a pure Java API, is parser independent, and is standardized as the JSR-173 specification. What's more, StAX is included in JDK 6.0 (some may prefer to call it JDK 1.6), so no extra setup is needed there. Developers who are using JDK 5.0 or below can download a jar file from http://dist.codehaus.org/stax/jars/ and use that. In the following demo I will illustrate how convenient it is to use StAX in the familiar iterator style rather than the less well-known observer style that SAX uses, and I will use JDK 6.0 to simplify the configuration and to show the JDK's built-in StAX support.

Parsing documents with StAX
    The XML document I am going to use is called weather.xml. Its content is as follows:


<?xml version="1.0" encoding="UTF-8"?>
<WeatherReport date="2007-08-12">
    <City name="Hong Kong">
        <Report time="09:00:00">
            <Weather>Cloudy</Weather>
            <Temp unit="C">34</Temp>
        </Report>
        <Report time="21:00:00">
            <Weather>Thunder</Weather>
            <Temp unit="C">26</Temp>
        </Report>
    </City>
    <City name="Macao">
        <Report time="09:00:00">
            <Weather>Cloudy</Weather>
            <Temp unit="C">31</Temp>
        </Report>
    </City>
    <City name="Beijing">
        <Report time="09:00:00">
            <Weather>Sunny</Weather>
            <Temp unit="C">34</Temp>
        </Report>
    </City>
</WeatherReport>

    XMLStreamReader is the key interface for reading documents in StAX. This interface represents a cursor that's moved across an XML document from beginning to end. At any given time, this cursor points at one thing: a text node, a start-tag, a comment, the beginning of the document, etc. The cursor always moves forward, never backward, and normally only moves one item at a time. You invoke methods such as getName and getText on the XMLStreamReader to retrieve information about the item the cursor is currently positioned at. (http://www.xml.com/pub/a/2003/09/17/stax.html?page=1) My code to display the information contained in weather.xml is pasted below:

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StAXDemo {
    public static void main(String[] args) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        try {
            XMLStreamReader parser = factory.createXMLStreamReader(new FileInputStream("src/weather.xml"));
            // Pull one event at a time until the end of the document is reached.
            for (int event = parser.next(); event != XMLStreamConstants.END_DOCUMENT; event = parser.next()) {
                switch (event) {
                case XMLStreamConstants.START_ELEMENT:
                    // A null namespace URI means the attribute's namespace is not checked.
                    if (parser.getLocalName().equals("City")) {
                        System.out.println(parser.getAttributeValue(null, "name") + ":");
                    } else if (parser.getLocalName().equals("Report")) {
                        System.out.println("\t" + parser.getAttributeValue(null, "time") + ":");
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                    // Skip the whitespace-only text between elements.
                    String text = parser.getText().trim();
                    if (!text.isEmpty()) {
                        System.out.println("\t\t" + text);
                    }
                    break;
                }
            }
            parser.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (XMLStreamException e) {
            e.printStackTrace();
        }
    }
}
The execution result of the code is:

Hong Kong:
    09:00:00:
        Cloudy
        34
    21:00:00:
        Thunder
        26
Macao:
    09:00:00:
        Cloudy
        31
Beijing:
    09:00:00:
        Sunny
        34

    The for loop makes it clear that StAX is not callback-driven as SAX is, and the code becomes simpler and plainer as a result. The loop also gives a good sense of the pull pattern: my code asks for the data it is interested in instead of being fed passively.

Writing XML documents with StAX
    Creating XML documents with StAX is also very easy. We use XMLStreamWriter instead of XMLStreamReader, and it provides a variety of methods to construct elements, attributes, text and all the other parts of an XML document. Here is my code to write a simple XML document:

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class StAXDemo2 {
    public static void main(String[] args) {
        try {
            OutputStream out = new FileOutputStream("data.xml");
            XMLOutputFactory factory = XMLOutputFactory.newInstance();
            // Name the encoding explicitly so that it matches the XML declaration below.
            XMLStreamWriter writer = factory.createXMLStreamWriter(out, "UTF-8");
            writer.writeStartDocument("UTF-8", "1.0");
            writer.writeStartElement("WeatherReport");
            writer.writeAttribute("date", "2007-08-12");
            writer.writeStartElement("City");
            writer.writeAttribute("name", "Hong Kong");
            writer.writeStartElement("Report");
            writer.writeAttribute("time", "09:00:00");
            writer.writeStartElement("Weather");
            writer.writeCharacters("Cloudy");
            writer.writeEndElement(); // </Weather>
            writer.writeStartElement("Temp");
            writer.writeAttribute("unit", "C");
            writer.writeCharacters("34");
            writer.writeEndElement(); // </Temp>
            // writeEndDocument closes the elements that are still open:
            // Report, City and WeatherReport.
            writer.writeEndDocument();
            writer.flush();
            writer.close();
            out.close();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        } catch (XMLStreamException xse) {
            xse.printStackTrace();
        }
    }
}
The generated file is as follows (indented here for readability; XMLStreamWriter itself adds no line breaks or indentation):

<?xml version="1.0" encoding="UTF-8"?>
<WeatherReport date="2007-08-12">
    <City name="Hong Kong">
        <Report time="09:00:00">
            <Weather>Cloudy</Weather>
            <Temp unit="C">34</Temp>
        </Report>
    </City>
</WeatherReport>
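    Since StAX covers both directions, the two halves can also be tried together in memory without touching the file system. The following is a minimal sketch under my own assumptions: the class StaxRoundTripSketch and its methods write and readWeather are hypothetical names, and the document is reduced to one City element with one Weather child.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

final class StaxRoundTripSketch {

    /** Writes a one-city report into memory with XMLStreamWriter. */
    static String write(String city, String weather) throws XMLStreamException {
        StringWriter buffer = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(buffer);
        writer.writeStartDocument();
        writer.writeStartElement("City");
        writer.writeAttribute("name", city);
        writer.writeStartElement("Weather");
        writer.writeCharacters(weather); // special characters such as & are escaped
        writer.writeEndElement(); // </Weather>
        writer.writeEndElement(); // </City>
        writer.writeEndDocument();
        writer.close();
        return buffer.toString();
    }

    /** Pulls events from XMLStreamReader until the Weather element is found, then returns its text. */
    static String readWeather(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "Weather".equals(reader.getLocalName())) {
                    return reader.getElementText();
                }
            }
            return null; // no Weather element in the document
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = write("Hong Kong", "Cloudy");
        System.out.println(xml);
        System.out.println(readWeather(xml));
    }
}
```

    The reader stops pulling as soon as it has what it needs, which is something a push parser cannot do as directly.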

    With this article, I just want to point out that there is another good option for processing XML, one that is stream-oriented and works in a pull pattern. So let StAX be a handy tool for Java developers, and let XML processing no longer be a big issue!

Wednesday, September 8, 2010

Personal opinions on Web 3.0

We are currently in the age of Web 2.0, but technologies are evolving day by day, so what are the prospects for Web 3.0?


First, I want to share my opinions from the perspective of where the web is used. I believe that Web 3.0 is going to be ubiquitous, which means that people will browse pages not only on computers and smartphones as we do today, but also on all kinds of devices that are yet to be devised, such as smart-home appliances, cars, robots and so on. Imagine that we can listen to newly released songs online on a washing machine while doing the laundry, check our email on a GPS navigator during a break on a long-distance trip, or, as researchers in Antarctica, watch live sports matches on a robot at night. Isn't that amazing! To achieve this, I don't think that rendering the content on all kinds of devices should be our main concern. The crucial factor is the communication between these miscellaneous devices and web servers. But I do believe that we can conquer this. On the one hand, high-speed Internet will be available, as the bandwidth of a single fiber can reach 10 Gb/s; on the other hand, new wireless and mobile networking technologies will emerge: for example, 4G is on its way into the pervasive computing age, and wireless LAN and Bluetooth keep evolving.


Second, I want to talk about Web 3.0 from the perspective of its content. My opinion is that Web 3.0 will be 3D. We know that the content of web pages has evolved from static to dynamic, and social websites like Facebook have attracted people all around the world. But the content of the web is still 2D today. Since 3D is so natural to human beings, I believe Web 3.0 will definitely be 3D. Let me describe how I imagine using a 3D social website. When I create an account on a social website, I will also create a virtual figure of myself; users can make their virtual figures look exactly like their real selves or not, according to their preferences. I no longer go to my friends' sites by clicking links. Instead I, the virtual one, can walk, drive or even fly to my friends' places according to their virtual addresses. Say I decide to visit my new friend Johnny. He lives on the other side of the city, so I drive my Porsche to his house. In his living room, there are pictures on the walls; there is a board on which he has written something to share; there is also a stereo with his favorite CDs beside it, so I just pick one and play it; a television stands in the middle, and I can watch the videos he likes. In other words, everyone lives in another, virtual world. But how can we accomplish this? I think there are two important factors here. The first is HTML5, the next HTML standard, because HTML5 has new features such as video and audio elements and can do things like video playback and drag-and-drop, which currently depend on browser plugins. HTML5 makes all of these native and much easier. Meanwhile, supporting HTML5 means that browsers have to adapt to new technologies. The second is WebGL, the 3D graphics API intended for web browsers. We know that OpenGL helps us drive the underlying hardware to render 3D graphics, and that JavaScript runs on the client side to enhance the user interface. WebGL now brings them together, so our web browsers can certainly show 3D graphics! Equipped with HTML5 and WebGL, the future Web 3.0 will be 3D without doubt!


Last but not least, Web 3.0 will be intelligent. For instance, suppose you drive into a town where you have never been before and you are running out of gas, so you want to refuel your car. Since you are not familiar with the town, you turn to the Internet. As I said before, Web 3.0 will be ubiquitous, so you can use the GPS navigator of your car to search the Internet. You just type "Where is the nearest gas station" into the search engine, and the nearest gas station appears on the map of the town together with the fastest route to it. What's more, a voice says "There is no traffic on the fastest route, and within the speed limit set by the traffic department you can reach the nearest gas station in ten minutes. Would you like to begin the voice navigation?", and if you answer yes, the voice navigation begins. There is a lot of intelligence embodied in this scenario. First, the system understands the sentence "Where is the nearest gas station" as a human being would; second, it knows your position even though you may not know where you are yourself; third, it automatically communicates with the relevant organizations and gets information about the real-time traffic situation and traffic regulations; fourth, it calculates the time you need and offers you the service you most probably want. What a great job done by Web 3.0 here! With Web 3.0, it seems that there is nothing left in our lives to worry about! But can we really achieve this? I believe the answer is yes. As far as I am concerned, this can be achieved by the emerging technology called the Semantic Web which, by the W3C definition, provides a common framework that allows data to be shared and reused across application, enterprise and community boundaries. In the view of the Semantic Web, Web 3.0 is a web of data, and all the data is interchangeable because it is based on RDF. Once data interchange is available, useful data can be retrieved and intelligent results no longer seem so hard to produce. In other words, an intelligent Web 3.0 isn't hard to achieve.
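To give a feel for why a common data shape helps, here is a minimal sketch in plain Java. It is not an RDF library and not how any real search engine works; it only assumes that RDF describes data as (subject, predicate, object) triples. The class TripleSketch, the identifiers such as "station:A" and all distances are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: data from different sources expressed in one shared shape,
// the (subject, predicate, object) triple that RDF is built on.
final class TripleSketch {

    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** Made-up sample data; in the scenario it would come from several organizations. */
    static List<Triple> sampleData() {
        return Arrays.asList(
                new Triple("station:A", "type", "GasStation"),
                new Triple("station:A", "distanceKm", "3.5"),
                new Triple("station:B", "type", "GasStation"),
                new Triple("station:B", "distanceKm", "1.2"),
                new Triple("cafe:C", "type", "Cafe"),
                new Triple("cafe:C", "distanceKm", "0.4"));
    }

    /** Returns the subject typed as GasStation with the smallest distanceKm value, or null. */
    static String nearestGasStation(List<Triple> data) {
        String best = null;
        double bestKm = Double.MAX_VALUE;
        for (Triple type : data) {
            if (!type.predicate.equals("type") || !type.object.equals("GasStation")) {
                continue;
            }
            // Join the type statement with the distance statement about the same subject.
            for (Triple dist : data) {
                if (dist.subject.equals(type.subject) && dist.predicate.equals("distanceKm")) {
                    double km = Double.parseDouble(dist.object);
                    if (km < bestKm) {
                        bestKm = km;
                        best = type.subject;
                    }
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(nearestGasStation(sampleData()));
    }
}
```

Because every statement has the same shape, the question about the nearest gas station reduces to joining statements about the same subject, no matter who published them.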


All in all, my vision of Web 3.0 is that it will be ubiquitous, relying on various kinds of wireless communication; that it will be 3D, representing the digital world just like our real one; and that it will be intelligent, assisting us whenever and wherever we are.