Tuesday, September 28, 2010

TCP--does it need to change?

In the graduate networking class today, we were discussing two papers about TCP. One was Congestion Avoidance and Control (CAC) and the other was Simulation-based Comparisons of Tahoe, Reno, and SACK TCP (SCTRS). CAC was written by Van Jacobson, who designed TCP's congestion avoidance and control algorithms. I find the history of TCP very interesting because it is a pretty elegant solution to a very hard problem. The original Tahoe implementation behaves very predictably in just about any situation. For instance, when packets are lost, you can expect it to act the same regardless of how many were lost. The same can't be said for TCP Reno. That really tells you how solid an algorithm it is. Modern improvements, such as SACK, seem to make it a little more efficient.
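
Jacobson's algorithm is simple enough to model in a few lines. Here is a minimal Python sketch of Tahoe-style slow start and congestion avoidance (window sizes in segments); it only shows the shape of the behavior, not real kernel code, and the starting ssthresh value is arbitrary.

def on_rtt(cwnd, ssthresh):
    """Advance the congestion window by one loss-free round trip."""
    if cwnd < ssthresh:
        return cwnd * 2, ssthresh      # slow start: exponential growth
    return cwnd + 1, ssthresh          # congestion avoidance: linear growth

def on_loss(cwnd, ssthresh):
    """Tahoe reacts to any loss the same way: restart from one segment."""
    return 1, max(cwnd // 2, 2)        # ssthresh becomes half the old window

cwnd, ssthresh = 1, 16                 # arbitrary starting point
for rtt in range(8):
    cwnd, ssthresh = on_rtt(cwnd, ssthresh)
    print(f"RTT {rtt}: cwnd={cwnd} ssthresh={ssthresh}")
cwnd, ssthresh = on_loss(cwnd, ssthresh)   # any drop sends us back to the start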

So, where can improvements be made? In class, we were discussing another TCP implementation called Vegas, one that compares expected and actual sending rates rather than waiting for packet loss to control congestion. It seems to me that the internet's congestion control algorithms work just fine. The main things causing congestion on the internet seem to be P2P networks and spam, but those are application layer issues. I've never heard anyone complain about TCP. It provides reliability and congestion control that seem to be working. What more can we ask that layer to do? Maybe transport is where security should go instead of the network or application layers.
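
To make the contrast with Tahoe concrete, here is a rough Python sketch of the Vegas idea as I understand it. The alpha and beta thresholds are the commonly cited defaults, not values taken from our class discussion, so treat this as illustration only.

ALPHA, BETA = 1, 3   # extra segments Vegas tolerates queued in the network

def vegas_adjust(cwnd, base_rtt, current_rtt):
    """Adjust the window from one RTT's rate measurements (a sketch)."""
    expected = cwnd / base_rtt                 # rate if the path had no queueing
    actual = cwnd / current_rtt                # rate we are actually achieving
    backlog = (expected - actual) * base_rtt   # estimated segments sitting in queues
    if backlog < ALPHA:
        return cwnd + 1                        # path looks underused, send a bit more
    if backlog > BETA:
        return cwnd - 1                        # queues are building, back off early
    return cwnd                                # right in the sweet spot

print(vegas_adjust(cwnd=20, base_rtt=0.100, current_rtt=0.130))   # -> 19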

Saturday, September 25, 2010

Online Gaming

Online gaming has become a very successful business. With millions of daily users, the server capacity required to host a system like World of Warcraft is humongous. One of the interesting applications of P2P would be in MMOGs. P2P can already deliver content in a very speedy fashion, and a node in a P2P system can take advantage of all its available bandwidth. A paper was recently presented in class that proposed a system called Donnybrook. Basically, the authors took the Quake III source code and modified it to include P2P networking and a few other techniques to keep the game play smooth. Two things they did that I thought were interesting were interest sets and doppelgangers. Interest sets are a set of equations that decide which players, out of everyone in the game, you are most interested in receiving real-time updates about. These equations use player proximity, field of vision, and other aspects of the game to choose the top 5 players you are most likely interacting with. These interest sets change frequently, but they mean you only have to receive real-time updates from a small number of players; you receive updates from everyone else once per second. At first this raised a red flag in my mind: how do they smooth out the game play? It wouldn't be acceptable to have players jerking around the map. The other thing they did, which addresses this issue, was implementing doppelgangers (i.e. bots). Basically, the game measures player behavior and predicts movement patterns, so during the time between the longer updates, bots move the players in the direction they think the players would have taken. Pretty cool concept. The authors were able to achieve a P2P game of Quake III with 900 players using these techniques. That's amazing!
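
I couldn't resist sketching the flavor of interest sets in Python. The scoring terms and weights below are made up for illustration (the paper has its own equations), but the idea is the same: score every other player by how relevant they are to you right now and keep only the top few for frequent updates.

import math

def interest(me, other, w_prox=1.0, w_aim=2.0):
    """Score how interesting `other` is to `me` (made-up terms and weights)."""
    dist = math.dist(me["pos"], other["pos"])
    proximity = 1.0 / (1.0 + dist)                      # closer players score higher
    dx = other["pos"][0] - me["pos"][0]
    dy = other["pos"][1] - me["pos"][1]
    facing = math.cos(me["aim"] - math.atan2(dy, dx))   # 1.0 means dead ahead of me
    return w_prox * proximity + w_aim * max(facing, 0.0)

def interest_set(me, others, k=5):
    """Keep the k players worth exchanging real-time updates with."""
    return sorted(others, key=lambda p: interest(me, p), reverse=True)[:k]

me = {"pos": (0.0, 0.0), "aim": 0.0}   # facing along the +x axis
others = [{"name": f"p{i}", "pos": (float(i), float(i % 7) - 3.0), "aim": 0.0}
          for i in range(1, 20)]
print([p["name"] for p in interest_set(me, others)])   # everyone else gets 1 Hz updates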

I've previously mentioned that P2P systems cause ISPs much consternation. If games move to a P2P architecture, will ISPs still be able to throttle the traffic the way they do now? I ask this question because most people will not contest an ISP's decision to throttle P2P traffic, since the users affected are likely downloading media illegally. But with games, there is nothing illegal going on. There could be some interesting business deals that come out of such a gaming system. Another interesting thought would be to employ previously mentioned peer selection techniques to make sure that inter-ISP traffic is minimized. These techniques would also improve the throughput of the game. There are lots of interesting applications for P2P systems, but politics always seem to get in the way (tongue in cheek, of course) of innovation.

Thursday, September 23, 2010

BitTorrent clients

Studies of torrent P2P networks are always interesting. I recently read two papers (TopBT: A Topology-Aware and Infrastructure-Independent BitTorrent Client and Taming the Torrent: A Practical Approach to Reducing Cross-ISP Traffic in Peer-to-Peer Systems) about improving the efficiency of torrent P2P networks. One of the biggest problems with torrent systems is that, generally, the only metric used to select peers is download speed. To a P2P user this seems good because it ensures that you are getting your content in the fastest possible manner. For ISPs, however, this can be bad because traffic that travels onto other ISPs' networks costs money. Companies like Comcast have throttled torrent traffic to minimize its effects on their networks. P2P systems are a wonderful way of sharing information (legal information, wink), and these studies are geared towards finding ways in which P2P systems like BitTorrent can be used while minimizing the cost to ISPs.

Taming the Torrent proposes a system called Ono that uses content distribution networks (CDNs) to calculate proximity values for peers in a network. The idea is that if two peers resolve to the same CDN node, they must be close to each other and possibly within the same ISP autonomous system (AS). This system relies heavily on CDNs being willing to provide such a service for free, but I think that assumption is okay considering companies like Google provide all kinds of cloud services for free around the world. By using CDNs as a reference point, Ono is able to reduce cross-ISP traffic and select peers within the same AS about a third of the time. On average, it is also able to increase download rates by 207%. Pretty impressive!
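
As I understand the approach, each peer keeps track of how often a CDN hostname sends it to each replica and compares that distribution with other peers'; if two peers are being redirected to the same replicas, they are probably close. A toy Python version of that comparison (the data and field names are mine, not Ono's) might look like this:

import math

def cosine_similarity(map_a, map_b):
    """Compare two 'ratio maps' of CDN replica usage (1.0 = identical mix)."""
    keys = set(map_a) | set(map_b)
    dot = sum(map_a.get(k, 0.0) * map_b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in map_a.values()))
    norm_b = math.sqrt(sum(v * v for v in map_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Fraction of lookups for a CDN hostname that landed on each replica (made-up data).
my_ratios = {"replica-east-1": 0.7, "replica-east-2": 0.3}
peer_ratios = {"replica-east-1": 0.6, "replica-west-9": 0.4}

if cosine_similarity(my_ratios, peer_ratios) > 0.5:
    print("peer is probably nearby -- prefer it when picking peers")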

TopBT's results aren't quite as impressive, but it doesn't rely on a third-party service either, which is good. TopBT uses various tools to measure the network between potential peers. Along with measuring download rate, it uses ping and traceroute to measure the link and to gain important information about the autonomous systems the respective peers are in. The challenge this solution faces is that many routers are configured to block ping and traceroute, which makes it difficult to figure out the proximity of a peer. Despite these roadblocks, TopBT is able to reduce inter-ISP traffic by 25% and increase download speed by about 15%.
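
This is not TopBT's actual algorithm, just a Python sketch of the general approach: probe candidate peers, then rank them by a blend of measured download rate and proximity, degrading gracefully when probes are blocked. All the numbers and weights here are invented.

def score_peer(download_rate, rtt_ms=None, same_as=None):
    """Blend raw download rate with whatever proximity data the probes returned."""
    score = float(download_rate)               # bytes/sec, the usual BitTorrent metric
    if rtt_ms is not None:
        score *= 1.0 / (1.0 + rtt_ms / 100.0)  # penalize far-away peers
    if same_as:                                # traceroute says we share an autonomous system
        score *= 2.0                           # strongly prefer intra-ISP peers
    return score

peers = [
    {"id": "A", "rate": 900_000, "rtt_ms": 12, "same_as": True},
    {"id": "B", "rate": 1_200_000, "rtt_ms": 180, "same_as": False},
    {"id": "C", "rate": 800_000, "rtt_ms": None, "same_as": None},   # probes blocked
]
peers.sort(key=lambda p: score_peer(p["rate"], p["rtt_ms"], p["same_as"]), reverse=True)
print([p["id"] for p in peers])   # -> ['A', 'C', 'B'] with these made-up numbers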

One of the interesting things both of these papers did was release their software for the world to use in order to acquire data. I mean, who wouldn't want a torrent client that increases download rates by 207%? It just rubs me a little funny that the data they got was probably a result of thousands of people obtaining media illegally. Just an interesting thought in conclusion.

Saturday, September 18, 2010

Peer-to-peer

Peer-to-peer systems are unique in that everything about them is truly distributed. One of the interesting things about peer-to-peer systems is that they scale instantly: the more peers that join a system, the more bandwidth and computing capacity become available. As a result, a peer-to-peer system can absorb a surge of demand, transfer the data quickly, and cool down rapidly, whereas a client/server architecture can take a while to work through the same load. P2P networks are a perfect medium for sharing files because many machines can share the burden of getting a file to a user. This enables a user to truly leverage all their available bandwidth because they are receiving data from multiple nodes.
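
A toy Python illustration of that last point, with placeholder peer names and a fake fetch function: different pieces of a file are requested from different peers at the same time, so the download is limited by your own link rather than by any single sender.

from concurrent.futures import ThreadPoolExecutor

def fetch_piece(peer, piece_index):
    """Placeholder: a real client would open a connection and request the piece."""
    return f"piece {piece_index} from {peer}"

peers = ["peer1.example", "peer2.example", "peer3.example"]   # hypothetical peers
pieces = range(6)

with ThreadPoolExecutor(max_workers=len(peers)) as pool:
    # Each piece is fetched from a different peer, so no single uplink limits us.
    results = list(pool.map(lambda i: fetch_piece(peers[i % len(peers)], i), pieces))

print(results)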

Distributed architectures are becoming crucial to the success of large companies on the web. Content distribution networks are used to spread out content and data so that users might be able to request it from a source closer to their location. It would be an interesting study to look at putting P2P features into a web browser so that static content could be requested from your closest neighbor. It would also be interesting to look at a P2P type active network for pushing dynamic content out onto the network.

Thursday, September 16, 2010

Packet Dynamics

I just read a paper on End-to-End Internet Packet Dynamics, which was kind of interesting as it showed data about changes in packet flow from December 1994 to December 1995. The paper details an experiment the authors performed that measured TCP bulk transfers between 35 sites running special measurement daemons. One of the interesting data points shared was that during the first experiment, out-of-order packet delivery was quite prevalent. That is interesting because reordering in TCP can cause a lot of packet retransmissions, which would have been expensive considering the internet was very small back then with considerably lower bandwidth than we have today. There are times, however, when a packet is genuinely lost and retransmission is necessary. The authors found that during the first experiment, the ratio of good retransmissions to bad ones was 22. In the second experiment (a year later), they increased the window size and the ratio increased to 300, which is much better.
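
The reason reordering causes spurious retransmissions is the duplicate-ACK rule: a receiver keeps acknowledging the highest in-order segment it has, and three duplicate ACKs make the sender retransmit (fast retransmit) even if the "missing" segment is merely late. A small Python illustration of cumulative ACKs, using toy segment numbers rather than anything from the paper:

def acks_for(arrival_order):
    """Return the cumulative ACK a toy receiver sends after each arriving segment."""
    expected, buffered, acks = 0, set(), []
    for seg in arrival_order:
        buffered.add(seg)
        while expected in buffered:
            expected += 1                  # advance past everything now in order
        acks.append(expected)              # ACK = next segment number we still need
    return acks

print(acks_for([0, 1, 2, 3, 4]))           # in order        -> [1, 2, 3, 4, 5]
reordered = acks_for([0, 2, 3, 4, 1])      # late segment 1  -> [1, 1, 1, 1, 5]
dup_acks = sum(1 for a, b in zip(reordered, reordered[1:]) if a == b)
print("spurious fast retransmit" if dup_acks >= 3 else "no retransmit needed")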

One of the other parts of this paper that I found interesting was its treatment of packet loss. One of the initial data points they gave was that packet loss increased between the first and second experiments. One of the measurements they took was the rate of ack loss. The data shows that at one point, acks flowing into the US were more likely to be lost than those flowing into Europe, and those roles switched in the subsequent experiment.

I think the current internet infrastructure is fairly stable today, and packet loss is generally low on a good connection. One of the interesting complaints I hear from a lot of people is that the internet is so much faster in other countries compared to what is available in the US. Obviously other countries don't have infrastructure in place on the scale the US does, but it would be interesting to study the dynamics of those smaller systems to learn from how those countries decided to build their networks. I think one of the great advantages that countries relatively new to building up internet connectivity have is that they can learn from the mistakes of countries like the US. This would enable their networks to be faster, in a sense, than the US's because they wouldn't have to work around earlier mistakes.

Monday, September 13, 2010

A future internet

The internet has become a crucial part of everyday life. Its creation has spurred other innovations that have worked themselves into what is now considered the norm. The internet has enabled collaboration on a scale never known before its existence. Going forward, I believe it is not only important but crucial to have an internet architecture that can easily evolve with the latest standards and the increasing demands put upon it. I wish to give my input on what I think the focus of a new internet architecture needs to be.

First, security. From the internet's conception, security was never a design goal because the architects never envisioned an internet of the magnitude it has grown to today. With eCommerce, online gaming, and social networking, among other things, exploding on the web, malicious users don't even have to leave their homes to steal someone's identity. Security is a must.

Second, protocol flexibility. By that I mean network operators should not be at the mercy of hardware vendors or large commercial organizations when they want to try out or implement new protocols. If we look at the software space, open standards have tended to push out proprietary commercial protocols because users were able to try them out freely.

Third, addressing should be name based. These names should be logically constructed and memorable. I think the postal system has a fairly nice way of assigning addresses to locations. Although some street names might be a little eccentric, generally the naming convention is logical and memorable. We shouldn't need systems like DNS to resolve a name to a number; we should just be able to give a name and know the location of the resource immediately.

Connecting the whole world in an efficient fashion is not an easy problem to solve. The current internet architecture has done extraordinarily well and I think there have been some great learning experiences. But, now that more than just scientists use it, I think we need to attack a new architecture from a "customer needs" perspective.

Wednesday, September 8, 2010

Active Internet Architecture

I find the concept of an active internet architecture very interesting. I just read a SIGCOMM paper called Towards an Active Internet Architecture. The paper outlines an architecture where the computing power of the network itself is used to route packets. To leverage that computing power, the paper proposes that instead of packets, the network switch capsules. A capsule is comprised of a custom user program, which is executed at every hop, plus the other information normally included in a packet. These custom programs are advantageous in that, when they are executed at a hop, they can customize the data for the next hop, specify where to hop next, and do a myriad of other things. The cool thing this buys is the ability for a network to evolve on its own. That means network administrators could try out new standards without waiting for hardware vendors to decide to implement them--the standard would just be encoded into the capsule.
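
Here is a purely conceptual Python sketch of a capsule, not the paper's actual encoding: the payload travels together with a small program that each active node executes to decide what happens next. The node and cost structures are invented for illustration.

def my_routing_program(capsule, node):
    """User-supplied code that an active node runs when the capsule arrives."""
    capsule["trace"].append(node["name"])   # e.g. record the path taken so far
    # Forward toward whichever neighbor advertises the cheapest path to the destination.
    return min(node["neighbors"], key=lambda n: n["cost_to"][capsule["dest"]])

capsule = {"dest": "D", "payload": b"hello", "trace": [], "program": my_routing_program}

def switch(capsule, node):
    """What an active node does instead of plain forwarding: run the capsule's program."""
    return capsule["program"](capsule, node)

node_a = {"name": "A",
          "neighbors": [{"name": "B", "cost_to": {"D": 2}},
                        {"name": "C", "cost_to": {"D": 5}}]}
print(switch(capsule, node_a)["name"])   # -> "B"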

One of the biggest challenges of releasing new internet standards is adoption. The internet has grown to such a size that pushing out new standards simultaneously is infeasible. I think in the long run an architecture such as this would actually quicken the pace of standards adoption. For instance, smaller ISPs could deploy new standards on their networks without affecting outside networks and ISPs. As soon as enough contiguous smaller ISPs had a standard implemented, the larger ISP from whom they purchase bandwidth could deploy it as well and connect them. In this way, standards could grow incrementally.

Sunday, September 5, 2010

Design Philosophy of the Internet

As many of you know, the internet grew out of a government research project, ARPANET, and the primary goal of the DARPA internet architecture was to allow independent networks to communicate with each other. There were also several secondary goals, which are discussed in The Design Philosophy of the DARPA Internet Protocols. I wish to discuss a few things I found interesting in this paper. The secondary goals are as follows:

  1. Internet communication must continue despite loss of networks or gateways
  2. The Internet must support multiple types of communication services
  3. The Internet architecture must accommodate a variety of networks
  4. The Internet architecture must permit distributed management of its resources
  5. The Internet architecture must be cost effective
  6. The Internet architecture must permit host attachment with low level of effort
  7. The resources used in the Internet architecture must be accountable

Considering this system was originally developed for military use, I find 1 & 2 very obvious and necessary goals. What is surprising to me, though, is that security is not on that list. Many government agencies have very strict security policies. For example, Agilent Technologies develops test and measurement equipment, and one of their customers is the NSA. If something goes wrong with the equipment, the engineer at the NSA is not allowed to copy and paste error messages or take screenshots of them. He must write them down by hand and email them to the support people at Agilent. Talk about a little paranoia, but you see the point. Either the government wasn't worried about the network communications being intercepted (which would surprise me) or they figured this network would be so small and somewhat protected that it would be impossible to intercept. I often wonder what the internet protocols would be like if security had been a design goal from the beginning. Security in any system can incur large amounts of overhead, which is why, I think, it usually comes last. Today we have IPsec and multiple application layer protocols, but those were afterthoughts.

Thursday, September 2, 2010

Future internet impressions

I just finished reading a conference paper entitled A Data-Oriented (and Beyond) Network Architecture. It was very interesting and I wish to share some thoughts. This paper takes a "clean-slate" look at internet naming and addressing. I find it an interesting read in light of the answer-to-all-our-addressing-problems IPv6 not taking hold. So what is the problem with what is currently being used? This paper explains that the internet naming and addressing scheme is centered around getting a user connected to a particular machine in order to request content and services. Interestingly, when people use the internet, it is not the machine or server they are connected to that's important; it's the content and/or services that machine provides. Users couldn't care less whether they are connected to a server in Texas or a server in India, as long as they get the CNN.com content they requested. This paper changes addressing in such a way that an address or name no longer refers to "where," it refers to "what."
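
As I read it, a name in this scheme looks roughly like a hash of the owner's public key plus a label, which makes names self-certifying but not exactly human-friendly. A toy Python sketch of the shape of such a name (the key is faked with random bytes, and the label is my own example):

import hashlib, os

public_key = os.urandom(32)                          # stand-in for a real public key
principal = hashlib.sha256(public_key).hexdigest()   # flat, self-certifying part
label = "cnn-frontpage"                              # my own example label

name = f"{principal[:16]}...:{label}"
print(name)   # something like "9f2c4a1be0d37c58...:cnn-frontpage" -- try memorizing that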

So this new naming system is supposed to make it easier for users to get at content and services, but I think it might run into issues when it comes to usability. One of the great weaknesses of this addressing scheme, and of IPv6 as well, in my opinion, is that the addresses are not user friendly. With IPv6 it is all wonderful and great that the address space is practically limitless, but I know many IT organizations that will never move to it because they can't memorize the addresses. The same is true for this paper--the address is comprised of a cryptographic hash and a label. When was the last time you memorized a hash? Don't get me wrong, I think there is great value in referencing content by name, but only if the name it is given makes sense.