September 6, 2012

Two nights ago, open-mesh reported that the gateway router went down. I went by today, and couldn't connect to either the wirelesstoronto or wt-yds networks. I went in, and our equipment was unplugged. I asked, and apparently some people had been in to install security cameras. I plugged it back in and it came back up.

April 30, 2012

I got an open-mesh email notification on Apr 8 that the network went down. I dropped by a couple of times between then and now to ssh into and reboot the gateway router, but it didn't fix anything. (The problem was with the NS2.) Today I got an email from Laura at YDS, and she confirmed that she could let me into the TO Tix booth. I installed a timer which will reboot everything (NS2, Linksys router and DSL modem) every morning at 3:30ish. Once I installed it and powered everything back on, the network came up quickly.

February 23, 2012

I went back, and the situation was the same. I went in, powercycled the NS2, and it came back up within a minute or so. As of right now the dashboard isn't yet reporting that it's back up, but I was able to connect to wirelesstoronto easily.

February 7, 2012

Cloudtrax (the open-mesh dashboard) reported that all mesh nodes at YDS went down on Friday. I went this morning to check. Normally this is caused by the Linksys router (which runs PPPoE) going down. But today I was able to connect to the Linksys router and get online. The ticket office was closed, so I couldn't get in to reboot the NS2. I'll try to go by later to try that.

August 6, 2011

I rebooted the Linksys router, and no luck. I plugged my laptop into the modem and it couldn't find a PPPoE server. I rebooted the modem, and it started working. I intended to bring a lamp timer that would auto powercycle the whole rig every night, but couldn't find one. The modem problem is, as far as I can tell, a new one, so maybe it was a fluke.

I waited a few minutes and still couldn't connect to the wirelesstoronto network. I rebooted the NS2 and it came up pretty quick.

July 13, 2011

It's down again, and has been for four days according to the open-mesh dashboard. I went by this morning and tried the same thing I did last time. The difference was that rebooting it didn't bring the Internet connection back up. The next thing I would've tried is powercycling the modem (or maybe before that, plugging my laptop into it to see if it can get on, to try to isolate the problem to either the modem or the router). But the TO Tix booth is only open after noon.

Once it's back online, maybe it'd be a good idea to set a cronjob to reboot it every night. Who knows; it might help.
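A minimal sketch of how the nightly reboot could be added, in case it's useful. The 04:00 schedule, the /sbin/reboot path and the add_nightly_reboot helper name are all assumptions for illustration, not something that's on the router now. The helper appends the entry only if it isn't already there, so running it twice is harmless.

```shell
#!/bin/sh
# Sketch: add a nightly reboot entry to a crontab file, exactly once.
# Assumptions (not from these notes): the 04:00 schedule, the
# /sbin/reboot path, and the helper name add_nightly_reboot.

add_nightly_reboot() {
    crontab_file="$1"
    entry='0 4 * * * /sbin/reboot'

    # Create the crontab file if it doesn't exist yet.
    touch "$crontab_file"

    # -x: match the whole line, -F: literal match (the entry contains '*'),
    # -q: quiet. Append only when the entry is missing.
    if ! grep -qxF "$entry" "$crontab_file"; then
        echo "$entry" >> "$crontab_file"
    fi
}
```

On the router this would be called with the path to root's crontab file (probably something like /etc/crontabs/root, but check first), and crond would likely need a restart afterwards to pick up the change.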

July 5, 2011

Late last week I got an email from the YDS folks saying that the wifi was down, and that “someone” had tried rebooting the modem, but it was still down. I was out of town, so I went by on Tuesday afternoon. I didn't even need to go inside…

My laptop connected immediately to the wirelesstoronto network, and got a DHCP IP, 192.168.1.150. (Clearly, this came from the Linksys router and not the mesh router.) I sshed to 192.168.1.1 (the Linksys). It had 64 days uptime, 1.5GB transferred on VLAN1, but wasn't online. I used “ps -ef” to get the pppoe commandline, killed the process, and launched it again. It immediately went online, but wasn't doing the routing properly. (The router was online, but my laptop wasn't.) So I rebooted the router and it came up fine.

At some point in the process the wirelesstoronto network disappeared, so I had to connect to wt-yds. A few minutes after the reboot wirelesstoronto came back up.

It seems like the problem is that the PPPoE client on the Linksys router flakes out about once every month or two, and the router needs to be rebooted. Since we're not running wifidog on the Linksys router, we're not currently monitoring it. I tried to install the om-checkin script, but there wasn't enough space on the router.

** had to make a copy first, 'cause it was linked to the read-only
cp /etc/ipkg.conf /etc/i
mv /etc/i /etc/ipkg.conf
vi /etc/ipkg.conf 
ipkg update
ipkg install microperl curl 
cd /sbin
wget http://wirelesstoronto.ca/dist/om-checkin.pl
cd /etc
wget http://wirelesstoronto.ca/dist/om-checkin.conf.pl
crontab -e
* add line: */2 * * * * /sbin/om-checkin.pl
killall crond
/usr/sbin/crond -c /etc/crontabs

** I had to install libopenssl, but there wasn't enough room, so:
ipkg remove wifidog

Still not enough space, so I gave up. The microperl binary is 1MB! If we could port the om-checkin script from perl to bash, we'd be way better off.

October 28, 2010

The open-mesh dashboard is indicating that the network has been down for 34 days. I went to the square, was able to connect to wirelesstoronto but didn't get an IP address. I connected to wt-yds, got an IP address, but wasn't able to ping the Internet from that router. I fumbled around with udhcp for a while, but eventually just issued a 'reboot'. As soon as it came back, everything was fine.

July 23, 2010

It went down again. This time when I went by, the wirelesstoronto SSID was visible but didn't do anything (I didn't even get a DHCP address). I connected to wt-yds, and surprisingly, I could ping the Internet. So it seems like the problem was with the open-mesh router? I went into the office. It had fallen off the wall again – since the red vinyl went up, the suction cup doesn't stick. I took it off, moved the Linksys router to under the desk, and used the big clip to attach the router to the column. (This also moves a bunch of the spaghetti to below the desk, which is nice.) I had unplugged everything in the process. When I plugged it back in, it took a couple of minutes (during which totally weird stuff was happening) before it all started working ok.

July 18, 2010

It went down again at about 7am yesterday. I rebooted the wt-yds (Linksys) router (because I haven't yet figured out how to reset the pppoe connection), and it came back up.

July 16, 2010

The open-mesh dashboard reported that the gateway and security routers went down on Wednesday (and the infobooth 3 days ago). I couldn't ping the Linksys router via VPN. I went by today and could connect to the Linksys router wirelessly. It had no 'net connection. I tried ifdown ppp0; ifup ppp0 and it didn't seem to work (or maybe I got impatient). I rebooted the router (from the commandline) and when it came back up everything seemed fine. Wouldn't it be easy to put a cronjob on it to reboot (or reset pppoe) if it loses connectivity?
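A rough sketch of what such a cronjob could run, untested on the actual router. It pings a probe host; if that fails it resets ppp0 with the same ifdown/ifup pair as above, waits a bit, and only reboots if the connection still hasn't come back. The probe address (8.8.8.8), the 30-second settle time, and the function names are assumptions picked for illustration.

```shell
#!/bin/sh
# Sketch of a WAN watchdog for the PPPoE router.
# Assumptions (not from these notes): the probe host, the settle time,
# and the names is_online / watchdog.

CHECK_HOST="${CHECK_HOST:-8.8.8.8}"   # host to ping to decide if we're online
PPP_IFACE="${PPP_IFACE:-ppp0}"        # PPPoE interface to reset
SETTLE_SECS="${SETTLE_SECS:-30}"      # how long to wait after ifup

# Succeeds when the probe host answers ping.
is_online() {
    ping -c 3 "$CHECK_HOST" >/dev/null 2>&1
}

# Do nothing if online; otherwise reset PPPoE, and reboot as a last resort.
watchdog() {
    if is_online; then
        echo "online"
        return 0
    fi

    echo "offline: resetting $PPP_IFACE"
    ifdown "$PPP_IFACE"
    ifup "$PPP_IFACE"
    sleep "$SETTLE_SECS"

    if is_online; then
        echo "recovered"
        return 0
    fi

    echo "still offline: rebooting"
    reboot
}

# Only act when invoked as "<script> run", so loading the file is harmless.
if [ "${1:-}" = "run" ]; then
    watchdog
fi
```

Saved under a hypothetical path such as /sbin/wan-watchdog.sh, it could be run from cron every few minutes with the "run" argument. Trying the PPPoE reset first and waiting before the reboot is meant to cover the "maybe I got impatient" case.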

June 30, 2010

The dashboard reports that all nodes went down 8 days ago. I went to the square and tried to connect to the Linksys router (ssid wt-yds), the one that handles the PPPoE connection. I was able to connect to it, but not get online. I reset the PPPoE connection (ifdown ppp0; ifup ppp0) and the Internet connection came up.

I then tried connecting to the mesh router, and it worked, but I got an IP address from the Linksys router. Apparently the dhcp server in the mesh router wasn't working; presumably 'cause it knew that the Internet connection was down. I disconnected and reconnected a bunch of times, and eventually (3 minutes?) it started working like normal. My guess is that when it knows that it's offline it goes into wacko mode. As soon as it realized that it was back online again, it fixed everything. The security node came up a few minutes later. The infobooth node hadn't come up at the time that I left the square.

June 7, 2010

The open-mesh dashboard reports that the infobooth router went down 200 days ago, the security booth router went down 129 days ago, and that the main router went down 11 days ago. The main (Ubiquiti) router was dangling; presumably the mount doesn't stick as well to the red vinyl they installed. I re-stuck it. I ssh-ed into the Linksys router and couldn't ping the 'net. I rebooted the router and it and the Ubiquiti router came back up.

The Linksys router has wifi enabled, on a hidden but not secure network called wt-yds, on channel 1. I moved the mesh to channel 11.

I power-cycled the router in the security office and it came back up. The router in the infobooth was unplugged; I asked them to plug it back in, and it came back up too.

November 11, 2009

The security booth router never came back up, so I swapped in a new router. It's up now.

August 21, 2009

Open-mesh reported that the nodes stopped checking in yesterday morning. I assume that the problem was the bad upgrade that went out through the open-mesh updating system. I swapped in the spare NanoStation2 that I'd prepared last week, and the new router and the infobooth router came up. The security booth router hasn't yet (a couple of hours later). Maybe it got the bad upgrade too?

August 12, 2009

Gabe went by, prepared to replace each piece of equipment. But all I did was reboot the Linksys router, and the network came right back up. The only weird thing I noticed was that the spring clip that attaches the Linksys router to the column is too strong, and was squishing the router.

There's ongoing work happening at the TO Tix office, so there may be some disruptions over the next few days relating to that.

It would be good for us to swap the current DSL modem with one that's preconfigured with a PPPoE client – that way we could get rid of the Linksys router.

July 31, 2009

The open-mesh dashboard was reporting that the network was down. Alex went to reboot stuff, but couldn't get it back up. Alex went back in the afternoon to install a roach coach, and that only stayed up for a few minutes.