Thursday, December 18, 2008

GPL violations close to home

I often hear about GPL violations in vendors' software, especially, it seems, in embedded routers. Two cases have hit me at home.

The first is our FIOS router, an Actiontec MI424-WR, which runs Linux inside. You can even get to a telnet prompt. The problem is that it has a crappy DHCP server that always seems to assign different IP addresses, even to the same MAC address. This breaks ssh and other services which do strong man-in-the-middle prevention. It seems the vendor hasn't fixed the problem. As a result of a GPL violation suit some source is available, but the DHCP code is not included, probably because it is BSD licensed so they don't have to release it. Given this, I'll punt and take the lazy solution: turn it into a dumb Ethernet bridge and use something better like a Vyatta V514 test box or a Linksys WR54TG, both of which are repairable.

The second is the Asus P6T motherboard, which has a SplashVM feature. This allows booting to a lightweight desktop in less than a minute (the BIOS is still slow to get its hardware set up). The desktop is based on Linux with a standard kernel and browser. It is kind of a toy, but good for checking gmail etc. Since SplashVM uses GPL code, if the vendor were following the GPL I should be able to find the source on their website. It is possible to find some pieces on the Splashtop vendor website, but it is the responsibility of the system vendor, not the subcontractor, to make available the source for the actual firmware they are shipping. In this case, it matters to me for a couple of reasons. I wrote the driver for the Marvell Yukon-2 EC Ultra NICs on this motherboard and would like to know if 1) the vendor fixed some bugs or 2) the vendor still has some bugs that other users will pester me about. As copyright holder for this driver, I may have to get nasty to find out; stay tuned.

Wednesday, October 1, 2008

Netfilter workshop day 1

At the netfilter workshop, Patrick McHardy described an exciting new implementation of netfilter firewalling called nftables. It promises to reduce the hundreds of netfilter modules to a smaller kernel footprint and to allow optimization of rulesets. Eric Leblond's blog has more information.

Friday, September 12, 2008

Open Source is alive and well in PDX thank you

I really should stop reading the Oregonian. They do such a poor job of covering high tech, and the business section is especially weak. The recent piece about OSCON moving to Silly Valley overlooked so many obvious things: the Linux Plumbers Conference next week and the Kernel Summit, not to mention the Open Source technology center, the Oracle office in Portland, Portland State, and Free Geek. So the loss of one conference, which is mostly attended by out of town people, really has no impact on the local open source infrastructure.

Wednesday, August 27, 2008

Exploring transactional filesystems

In order to implement router style semantics, Vyatta allows setting many different configuration variables and then applying them all at once with a commit command. Currently, this is implemented by a combination of shell magic and unionfs. The problem is that keeping unionfs up to date and fixing the resulting crashes is a major pain.

There must be better alternatives. Current options include:
  • Replace unionfs with aufs, which has fewer users yelling at it and more developers.
  • Use a filesystem like btrfs which has snapshots. This changes the model and makes APIs like "what changed?" hard to implement.
  • Move to a pure userspace model using git. The problem here is that git as currently written is meant for users, not transactions.
  • Use a combination of copy, bind mount, and rsync.
  • Use a database for configuration. This is easier for general queries but is the most work. Conversion from the existing format would be a pain.
Looks like a fun/hard problem. Don't expect any resolution soon.
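
To make the copy-based option a bit more concrete, here is a minimal Python sketch of a staged commit with a "what changed?" query. It is not how Vyatta does it; the function names (begin, changed, commit), the directory layout, and the sample values are all made up for illustration, and it ignores bind mounts, locking, and crash safety.

```python
import os
import shutil
import tempfile


def begin(active):
    """Start a transaction: copy the active tree to a sibling working tree."""
    work = active + ".work"
    shutil.copytree(active, work)
    return work


def snapshot(root):
    """Map relative path -> file contents for every file under root."""
    files = {}
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                files[os.path.relpath(full, root)] = f.read()
    return files


def changed(active, work):
    """Answer "what changed?": paths added, removed, or modified."""
    old, new = snapshot(active), snapshot(work)
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


def commit(active, work):
    """Swap the working tree in for the active one.

    Two renames are not atomic, so a real system would need more care.
    This only shows the shape of the operation.
    """
    old = active + ".old"
    os.rename(active, old)
    os.rename(work, active)
    shutil.rmtree(old)


# Usage with made-up paths and values.
base = tempfile.mkdtemp()
active = os.path.join(base, "config")
os.makedirs(os.path.join(active, "interfaces"))
with open(os.path.join(active, "interfaces", "eth0"), "w") as f:
    f.write("address 10.0.0.1/24\n")

work = begin(active)
with open(os.path.join(work, "interfaces", "eth0"), "w") as f:
    f.write("address 10.0.0.2/24\n")
with open(os.path.join(work, "hostname"), "w") as f:
    f.write("router1\n")

pending = changed(active, work)  # what would this commit change?
commit(active, work)
```

The full-tree copy and compare is the obvious cost here, which is presumably why a union filesystem looked attractive in the first place.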

Thursday, June 26, 2008

TCP MD5 debugging

Added CLI support for TCP MD5 (via Quagga) to the upcoming Vyatta release. It worked fine under testing (VM) but wouldn't interoperate with IOS. I narrowed the problem down by writing some useful utilities:
  • Patch for Netcat to support MD5
  • Standalone program using libpcap to validate the MD5 option in a capture file
It turned out that the sender was generating the wrong MD5 option after the initial SYN handshake. The problem shows up when data is finally sent: the data in the kernel is fragmented because the underlying device supports scatter/gather, but md5_calc doesn't do scatter/gather.
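
The class of bug can be shown in a few lines of Python. This is only a sketch, not the kernel code: the header bytes are a placeholder rather than the real pseudo-header and TCP header layout, and the "first fragment only" function is a hypothetical stand-in for a digest routine that assumes one linear buffer. The point is that a digest fed every fragment in order matches the digest of the contiguous data, while one that misses fragments only agrees when there is no payload, as in the handshake.

```python
import hashlib


def md5_sig_linear(header, payload, key):
    """Digest for a segment whose payload is one contiguous buffer."""
    return hashlib.md5(header + payload + key).digest()


def md5_sig_first_fragment_only(header, fragments, key):
    """Hypothetical bug: treat the first fragment as the whole payload."""
    first = fragments[0] if fragments else b""
    return hashlib.md5(header + first + key).digest()


def md5_sig_scatter_gather(header, fragments, key):
    """Feed every fragment to the digest in order."""
    h = hashlib.md5()
    h.update(header)
    for frag in fragments:
        h.update(frag)
    h.update(key)
    return h.digest()


header = b"placeholder-header"  # not a real TCP header
key = b"secret"                 # made-up shared key
fragments = [b"BGP OPEN ", b"message ", b"bytes"]
payload = b"".join(fragments)

expected = md5_sig_linear(header, payload, key)
good = md5_sig_scatter_gather(header, fragments, key)
bad = md5_sig_first_fragment_only(header, fragments, key)
syn_ok = md5_sig_first_fragment_only(header, [], key) == md5_sig_linear(header, b"", key)
```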

Thursday, June 19, 2008

Linux Plumbers Conference

I have high hopes for the first Linux Plumbers Conference. Unlike an academic conference with papers, or an un-conference with no agenda, the plumbers conference is using a mini-conference format to break down by topic. There is even a Call For Speakers to get speakers in topic areas.

First-time conferences have a different feel: more rough edges, but more passion and fun. So I hope it works out. There is no particular networking track, mostly because the other areas seemed to need more work.

Wednesday, January 23, 2008

FIB Trie saga

For the next release of Vyatta, I wanted to enable the Trie algorithm for routing in our kernel. Since FIB Trie is compatible with the previous hash, I expected no change. Well, the day after I enabled it, the regression test failed immediately. The regression test plays back a full BGP input stream into the router and polls for the result. The problem was that the Trie took a long time, ... a really long time, to dump the routes. For the full 163395 routes, the dump was taking 20 seconds vs 1/2 sec for FIB_HASH.

As expected the problem turned out to be an N^2 lookup. The code was basically: walk the tree to find a route and put it in the buffer; if the buffer is full, give up, then go back and walk to the last location. Since the trie has a nice fast lookup function, the change was to just record the last route dumped (rather than an offset), then use the lookup to find the location. This dropped the dump down to under 3/10 sec.
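
The difference between the two resume strategies can be sketched in Python. This is an illustration only, not the kernel code: a sorted list of integers stands in for the trie's leaves, bisect stands in for the fast trie lookup, and the batch size plays the role of the dump buffer.

```python
import bisect


def dump_by_offset(routes, batch):
    """Resume each batch by walking from the first leaf to a saved offset."""
    out, visited, offset = [], 0, 0
    while offset < len(routes):
        filled = 0
        for i, route in enumerate(routes):  # every call starts at the first leaf
            visited += 1
            if i < offset:
                continue                    # skip routes already dumped
            out.append(route)
            filled += 1
            if filled == batch:
                break                       # buffer full, give up until next call
        offset += filled
    return out, visited


def dump_by_last_key(routes, batch):
    """Resume each batch by looking up the last route dumped."""
    out, visited, last = [], 0, None
    while True:
        # bisect is the stand-in for the trie lookup; its cost is not counted
        start = 0 if last is None else bisect.bisect_right(routes, last)
        chunk = routes[start:start + batch]
        if not chunk:
            break
        visited += len(chunk)
        out.extend(chunk)
        last = chunk[-1]
    return out, visited


routes = list(range(1000))  # made-up, already sorted route keys
slow_out, slow_visited = dump_by_offset(routes, 10)
fast_out, fast_visited = dump_by_last_key(routes, 10)
```

Both produce the same dump, but the offset version visits leaves roughly in proportion to N^2 divided by the batch size, while the last-key version visits each leaf once.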

Finding this took several days, mostly because I kept looking at the profile: "nextleaf" is the hotspot, so let's look at that. The real breakthrough came when I realized there were other operations that were walking the tree, like collecting stats, but they were fast. The next diversion was figuring out all the other suboptimal behaviour (ip route flush calls fflush for each route), which although slow wasn't the real issue.