Two nights ago, open-mesh reported that the gateway router went down. I went by today, and couldn't connect to either the wirelesstoronto or wt-yds networks. I went in, and our equipment was unplugged. I asked, and apparently some people had been in to install security cameras. I plugged it back in and it came back up.
I got an open-mesh email notification on Apr 8 that the network went down. I dropped by a couple of times between then and now to ssh into and reboot the gateway router, but it didn't fix anything. (The problem was with the NS2.) Today I got an email from Laura at YDS, and she confirmed that she could let me into the TO Tix booth. I installed a timer which will reboot everything (NS2, Linksys router and DSL modem) every morning at 3:30ish. Once I installed it and powered everything back on, the network came up quickly.
I went back, and the situation was the same. I went in, powercycled the NS2, and it came back up within a minute or so. As of right now the dashboard isn't yet reporting that it's back up, but I was able to connect to wirelesstoronto easily.
Cloudtrax (the open-mesh dashboard) reported that all mesh nodes at YDS went down on Friday. I went this morning to check. Normally this is caused by the Linksys router (which runs PPPoE) going down. But today I was able to connect to the Linksys router and get online. The ticket office was closed, so I couldn't get in to reboot the NS2. I'll try to go by later to try that.
I rebooted the Linksys router, and no luck. I plugged my laptop into the modem and it couldn't find a PPPoE server. I rebooted the modem, and it started working. I intended to bring a lamp timer that would auto powercycle the whole rig every night, but couldn't find one. The modem problem is, as far as I can tell, a new one, so maybe it was a fluke.
I waited a few minutes and still couldn't connect to the wirelesstoronto network. I rebooted the NS2 and it came up pretty quick.
It's down again; for four days according to the open-mesh dashboard. I went by this morning and tried the same thing I did last time. The difference was that rebooting it didn't bring the Internet connection back up. The next thing I would've tried is powercycling the modem (or maybe before that, plugging my laptop into it to see if it can get on, to try to isolate the problem to either the modem or the router). But the TO Tix booth is only open after noon.
Once it's back online, maybe it'd be a good idea to set a cronjob to reboot it every night. Who knows; it might help.
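A rough sketch of what that nightly-reboot cronjob could look like. The crontab path (/etc/crontabs/root) and the 03:30 time are assumptions, not something that was set up on the router, and the `grep -qxF` flags assume a reasonably complete busybox grep.

```shell
#!/bin/sh
# Sketch: add a nightly reboot entry to root's crontab, at most once.
# Assumptions (not from the log): the crontab file path and the 03:30 time.
CRONTAB_FILE="${CRONTAB_FILE:-/etc/crontabs/root}"
REBOOT_LINE="30 3 * * * /sbin/reboot"

add_nightly_reboot() {
    # Skip if the exact line is already present, so re-running is harmless.
    if [ -f "$CRONTAB_FILE" ] && grep -qxF "$REBOOT_LINE" "$CRONTAB_FILE"; then
        return 0
    fi
    echo "$REBOOT_LINE" >> "$CRONTAB_FILE"
}
```

To use it, call `add_nightly_reboot` and then restart crond (killall crond; /usr/sbin/crond -c /etc/crontabs). If the router's clock isn't set correctly, the reboot will fire at the wrong time of day.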
Late last week I got an email from the YDS folks saying that the wifi was down, and that “someone” had tried rebooting the modem, but it was still down. I was out of town, so I went by on Tuesday afternoon. I didn't even need to go inside…
My laptop connected immediately to the wirelesstoronto network, and got a DHCP IP, 192.168.1.150. (Clearly, this came from the Linksys router and not the mesh router.) I sshed to 192.168.1.1 (the Linksys). It had 64 days uptime, 1.5GB transferred on VLAN1, but wasn't online. I used “ps -ef” to get the pppoe commandline, killed the process, and launched it again. It immediately went online, but wasn't doing the routing properly. (The router was online, but my laptop wasn't.) So I rebooted the router and it came up fine.
At some point in the process the wirelesstoronto network disappeared, so I had to connect to wt-yds. A few minutes after the reboot wirelesstoronto came back up.
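The "which router gave me this address?" check above can be written down as a tiny helper. This is only a sketch: the 192.168.1.x range for the Linksys comes from this visit, and it assumes the mesh router hands out addresses from some other range. The function name is made up.

```shell
#!/bin/sh
# Guess which DHCP server answered, based on the address the laptop got.
# From the log: the Linksys sits at 192.168.1.1 and handed out 192.168.1.150.
# Assumption: the mesh router uses a different address range.
lease_source() {
    case "$1" in
        192.168.1.*) echo "linksys" ;;   # lease from the Linksys router
        "")          echo "none" ;;      # no DHCP address at all
        *)           echo "other" ;;     # probably the mesh router
    esac
}
```

For example, `lease_source 192.168.1.150` prints `linksys`. Getting that answer while connected to wirelesstoronto would suggest the mesh router's own DHCP server isn't answering.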
It seems like the problem is that the PPPoE client on the Linksys router flakes out about once every month or two, and the router needs to be rebooted. Since we're not running wifidog on the Linksys router, we're not currently monitoring it. I tried to install the om-checkin script, but there wasn't enough space on the router.
** had to make a copy first, 'cause it was linked to the read-only
cp /etc/ipkg.conf /etc/i
mv /etc/i /etc/ipkg.conf
vi /etc/ipkg.conf
ipkg update
ipkg install microperl curl
cd /sbin
wget http://wirelesstoronto.ca/dist/om-checkin.pl
cd /etc
wget http://wirelesstoronto.ca/dist/om-checkin.conf.pl
crontab -e
* add line: */2 * * * * /sbin/om-checkin.pl
killall crond
/usr/sbin/crond -c /etc/crontabs
** I had to install libopenssl, but there wasn't enough room, so:
ipkg remove wifidog
Still not enough space, so I gave up. The microperl binary is 1MB! If we could port the om-checkin script from perl to bash, we'd be way better off.
The open-mesh dashboard is indicating that the network has been down for 34 days. I went to the square, was able to connect to wirelesstoronto but didn't get an IP address. I connected to wt-yds, got an IP address, but wasn't able to ping the Internet from that router. I fumbled around with udhcp for a while, but eventually just issued a 'reboot'. As soon as it came back, everything was fine.
It went down again. This time when I went by, the wirelesstoronto ssid was visible, but didn't do anything (not even a dhcp address). I connected to wt-yds, and surprisingly, I could ping the Internet. So it seems like the problem was with the open-mesh router? I went into the office. It had fallen off the wall again – since the red vinyl went up, the suction cup doesn't stick. I took it off, moved the Linksys router to under the desk, and used the big clip to attach the router to the column. (This also moves a bunch of the spaghetti to below the desk, which is nice.) I had unplugged everything in the process. When I plugged it back in, it took a couple of minutes (during which totally weird stuff was happening) before it all started working ok.
It went down again at about 7am yesterday. I rebooted the wt-yds (Linksys) router (because I haven't yet figured out how to reset the pppoe connection), and it came back up.
The open-mesh dashboard reported that the gateway and security routers went down on Wednesday (and the infobooth 3 days ago). I couldn't ping the Linksys router via VPN. I went by today and could connect to the Linksys router wirelessly. It had no 'net connection. I tried ifdown ppp0; ifup ppp0 and it didn't seem to work (or maybe I got impatient). I rebooted the router (from the command line) and when it came back up everything seemed fine. Wouldn't it be easy to put a cronjob on to reboot (or reset PPPoE) if it loses connectivity?
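Here's a minimal sketch of that watchdog idea: ping something on the Internet, reset PPPoE if it fails, and reboot only if the reset doesn't help. The probe address, the 60-second settle time and the script path in the usage note are all assumptions. It also assumes the router's ping accepts `-c` and exits non-zero when nothing answers.

```shell
#!/bin/sh
# Connectivity watchdog sketch for the PPPoE (Linksys) router.
# Assumptions (not from the log): PROBE_HOST, SETTLE_SECS, and that
# "ping -c" returns a non-zero exit status when no reply arrives.
PROBE_HOST="${PROBE_HOST:-8.8.8.8}"              # hypothetical probe target
PING_CMD="${PING_CMD:-ping -c 3}"
RESET_CMD="${RESET_CMD:-ifdown ppp0; ifup ppp0}" # the reset tried by hand above
REBOOT_CMD="${REBOOT_CMD:-reboot}"
SETTLE_SECS="${SETTLE_SECS:-60}"                 # wait for PPPoE to renegotiate

is_online() {
    # Deliberately unquoted so "ping -c 3" splits into command and arguments.
    $PING_CMD "$PROBE_HOST" >/dev/null 2>&1
}

watchdog() {
    is_online && return 0      # still online: nothing to do
    eval "$RESET_CMD"          # first try the gentle fix
    sleep "$SETTLE_SECS"
    is_online && return 0      # the PPPoE reset was enough
    eval "$REBOOT_CMD"         # last resort
}

# Only act when invoked as "script run", so loading the file is harmless.
if [ "${1:-}" = "run" ]; then
    watchdog
fi
```

Saved as, say, /sbin/net-watchdog.sh (hypothetical path), a crontab line like `*/10 * * * * /sbin/net-watchdog.sh run` would check every ten minutes. Trying the PPPoE reset before the reboot keeps the wt-yds network up in the cases where the reset alone works.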
The dashboard reports that all nodes went down 8 days ago. I went to the square and tried to connect to the Linksys router (ssid wt-yds), the one that handles the PPPoE connection. I was able to connect to it, but couldn't get online. I reset the PPPoE connection (ifdown ppp0; ifup ppp0) and the Internet connection came up.
I then tried connecting to the mesh router, and it worked, but I got an IP address from the Linksys router. Apparently the dhcp server in the mesh router wasn't working; presumably 'cause it knew that the Internet connection was down. I disconnected and reconnected a bunch of times, and eventually (3 minutes?) it started working like normal. My guess is that when it knows that it's offline it goes into wacko mode. As soon as it realized that it was back online again, it fixed everything. The security node came up a few minutes later. The infobooth node hadn't come up at the time that I left the square.
The open-mesh dashboard reports that the infobooth router went down 200 days ago, the security booth router went down 129 days ago, and that the main router went down 11 days ago. The main (Ubiquiti) router was dangling; presumably the mount doesn't stick as well to the red vinyl they installed. I re-stuck it. I ssh-ed into the Linksys router and couldn't ping the 'net. I rebooted the router and it and the Ubiquiti router came back up.
The Linksys router has wifi enabled, on a hidden but not secure network called wt-yds, on channel 1. I moved the mesh to channel 11.
I power-cycled the router in the security office and it came back up. The router in the infobooth was unplugged; I asked them to plug it back in, and it came back up too.
The security booth router never came back up, so I swapped in a new router. It's up now.
Open-mesh reported that the nodes stopped checking in yesterday morning. I assume that the problem was the bad upgrade that went out through the open-mesh updating system. I swapped in the spare NanoStation2 that I'd prepared last week, and the new router and the infobooth router came up. The security booth router hasn't yet (a couple of hours later). Maybe it got the bad upgrade too?
Gabe went by, prepared to replace each piece of equipment. But all I did was reboot the Linksys router, and the network came right back up. The only weird thing I noticed was that the spring clip that attaches the Linksys router to the column is too strong, and was squishing the router.
There's ongoing work happening at the TO Tix office, so there may be some disruptions over the next few days relating to that.
It would be good for us to swap the current DSL modem with one that's preconfigured with a PPPoE client – that way we could get rid of the Linksys router.
The open-mesh dashboard was reporting that the network was down. Alex went to reboot stuff, but couldn't get it back up. Alex went back in the afternoon to install a roach coach, and that only stayed up for a few minutes.
Alex and I went to Dundas Square this morning to install an open-mesh network. We put a NanoStation2 at the T.O. Tix office (where the existing Linksys router is), and an open-mesh MR3201A at each of the security office and the infobooth at the south-west corner of the square.
The NS2 doesn't do PPPoE, so the WRT54GL is still there, between the DSL modem and the NS2.
Wifidog reported the node being down for a few days. They're doing some renovating and the router had simply accidentally been unplugged. The work's not done, so it might happen again…
Wifidog reported that the spot has been down for over two weeks. Gabe went by – the router was working, but it couldn't ping the net. I restarted it and it came right back.
Gabe dropped in. It's the phone cable that was broken; possibly chewed through by someone's dog. I replaced it with another phone cable – it's way longer than it needs to be (the only one I had on me). We can/should charge YDS for the cable.
I also changed the channel to 6, since it seemed to be the least busy.
Jeff(?) from TO Tix left a voicemail message saying that one of our cables (network or phone; it's unclear from the message) “broke”; “is in two pieces”.
Michael P. swapped in a new DSL modem on March 30, and the node has been stable since then. The modem he swapped in was from Dufferin Grove Park. He has the one that he swapped out, and will test it at his house to determine if it's defective.
The router has recently gone into flake mode for no apparent reason. A reboot seems to fix it but it's not a long lasting cure.
Michael P rebooted the machine Feb 8, 2008. T.O.Tix guy said that David (blonde?) had done the same on the 7th and Gabe on the 6th. Hotspot had been down for 16 hours when it was rebooted.
On 2/8/07, Dave Robertson wrote:
Yesterday, Wednesday, Yonge & Dundas square, 5:30pm
- I power cycled the modem and it didn't seem to come back (I may have been
impatient)
- I then power cycled the router as well, and everything came back.
- It went down again around 9:00pm :(
- It was back up today (I don't know if someone helpdesk it)
- It's down again today about 5:30pm