As with the SLM network, I updated the Cloudtrax checkin URL. I could only (remotely) access the gateway router, though – the other three are down.
Network monitoring enabled through Cloudtrax: http://www.cloudtrax.com/overview2.php?id=wt-harb
Finally went down to investigate. Above the southeast corner of the gallery there's a spot where all the important wiring happens. There are two Ethernet switches there: one for a Harbourfront network, and one for Wireless Toronto. All the Ethernet cables had been unplugged from the WT switch. I plugged everything back in, and the two orphaned routers were immediately back online.
The router in the studio has weirdly long and erratic ping times, which is often a sign of wireless interference. I changed the channel from 6 to 11 and it *may* have helped things a little. This is something to investigate further.
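To judge whether a channel change actually helps, it may be useful to compare ping times before and after with a few numbers rather than by eye. The following is only a sketch: the function name `summarize_rtts` and the definition of jitter (mean absolute change between consecutive replies) are assumptions for illustration, and the round-trip times would have to be collected separately.

```python
import statistics


def summarize_rtts(rtts_ms):
    """Summarise a series of ping round-trip times, given in milliseconds."""
    if len(rtts_ms) < 2:
        raise ValueError("need at least two samples")
    # Jitter is taken here as the mean absolute change between consecutive replies.
    deltas = [abs(later - earlier) for earlier, later in zip(rtts_ms, rtts_ms[1:])]
    return {
        "min": min(rtts_ms),
        "median": statistics.median(rtts_ms),
        "max": max(rtts_ms),
        "jitter": statistics.mean(deltas),
    }
```

Running it once on channel 6 samples and once on channel 11 samples would give a rough side-by-side comparison; a large gap between median and max, or a high jitter value, is what "erratic" looks like in these terms.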
Also, there are a few routers on site with the SSID “wtoronto”. These appear to be routers installed by Harbourfront, but plugged into the Wireless Toronto network. (Not sure where; I didn't investigate.) No big deal, but worth noting. Presumably these are monitored by Harbourfront IT, which explains how they know so quickly whenever our network goes down. The only physical device I found was on the south patio (outside!).
I've gotta go back in (remotely) and set up a monitoring cronjob on the routers.
A user emailed the support address describing trouble connecting. It sounded like our coverage area has shrunk. I checked, and I can't ping routers .67 or .68. (Was this ever resolved after March 28?) I emailed Juan to find out when I can go by to investigate.
I got an email from Harbourfront this morning saying that the network has been down since before 7. Weird, because wifidog reported it up. I checked the logs, and the wifidog checkins were consistent (every minute, exactly) and from the right IP address. I couldn't ping it via OpenVPN, though – this is very weird. (OpenVPN is normally more resilient than the Wifidog gateway.)
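Checking whether checkins really arrived every minute can be done mechanically once the timestamps are pulled out of the logs. This is a minimal sketch that assumes the checkin times have already been extracted into `datetime` objects; the log format itself is not covered, and the name `find_gaps` and the 30-second slack are illustrative choices.

```python
from datetime import datetime, timedelta


def find_gaps(checkins, expected=timedelta(minutes=1), slack=timedelta(seconds=30)):
    """Return (last_seen, next_seen) pairs where checkins were further apart than expected."""
    ordered = sorted(checkins)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        # Allow some slack so ordinary scheduling drift is not reported as an outage.
        if later - earlier > expected + slack:
            gaps.append((earlier, later))
    return gaps
```

An empty result means the checkins were consistent over the period examined, which is what the logs showed here even though the network was reported down.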
I saw the wirelesstoronto network, but didn't get a DHCP address. I forced a static IP, and could then ping 192.168.1.1. But I still couldn't SSH in (my attempts timed out – maybe a sign of a high load on the box?).
For fun, I tried pinging the other routers: .67, .68 and .69. Only .69 responded.
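A check like this can be scripted so that it is quick to repeat from a laptop on the LAN or over OpenVPN. The sketch below assumes a Linux-style `ping` (the `-c` count and `-W` timeout-in-seconds flags); other systems interpret `-W` differently. The function names are hypothetical.

```python
import subprocess


def ping_once(host, timeout_s=2):
    """Return True if a single ping to host gets a reply (Linux iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_routers(hosts, ping=ping_once):
    """Map each host to True (replied) or False (no reply)."""
    return {host: ping(host) for host in hosts}
```

For example, `check_routers(["192.168.1.67", "192.168.1.68", "192.168.1.69"])` would report which of the three bridge routers answer. The `ping` argument can be swapped for another function, which also makes the sweep easy to try out without touching the network.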
I power-cycled the gateway router (which is in the storage closet next to the Studio Theatre), and it came right back up. But I still can't ping routers .67 and .68.
Wifidog reported that the router was down. I was able to ssh in, and the same thing was happening as before – the load was 0.00 (as reported by 'uptime'), but 'wdctl status' didn't return anything in the 30 seconds I waited for it. There were 10 entries in /tmp/dhcp.leases. It seems to have lost its DNS servers again. I added them to /tmp/resolv.conf, and also at the bottom of /etc/init.d/S50dnsmasq, which is the script that creates /tmp/resolv.conf – hopefully now it won't forget anymore?
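Since the lost-DNS problem keeps recurring, it may help to have a quick way to tell whether a resolv.conf still lists any real upstream servers. The sketch below works on the file's text, so it can be run on a copy pulled off the router; the function names are hypothetical, and treating loopback-only as "no upstream" is an assumption that fits this setup but not every dnsmasq configuration.

```python
import ipaddress


def parse_nameservers(resolv_text):
    """Return the addresses from 'nameserver' lines, skipping comment lines."""
    servers = []
    for line in resolv_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("#", ";")):
            continue
        parts = stripped.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def upstream_nameservers(resolv_text):
    """Return the listed nameservers that are valid, non-loopback IP addresses."""
    upstream = []
    for server in parse_nameservers(resolv_text):
        try:
            if ipaddress.ip_address(server).is_loopback:
                continue
        except ValueError:
            continue  # not a valid IP address; ignore the entry
        upstream.append(server)
    return upstream
```

If `upstream_nameservers(...)` returns an empty list, the router has either forgotten its DNS servers or is pointing only at itself.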
Wifidog reported numerous 5-minute outages this morning, and Juan emailed saying that the system is down. I connected to the main router and noticed that 'wdctl status' was taking a long time to run. Just in case, I set static DNS servers in /etc/resolv.conf and nvram, and restarted wifidog. Hopefully that fixes it.
Juan from Harbourfront emailed to say that the network was down; Wifidog had been reporting that it was down for about 12 hours. Gabe connected to the router via OpenVPN, noticed that the wifidog gateway had crashed, and restarted it. The node is now checking into the auth server.
Wifidog was reporting that the node was down. I went by; the wifi was up, but it seemed like the gateway router was using 127.0.0.1 as a nameserver – and so the login page couldn't be displayed. I manually added some other nameservers (through the web interface) and rebooted the router (through ssh) and it came right back up.
Michael and Gabe went this morning to try to figure out what's up. Simply, the mid-hallway router (.70, the one with wifi turned off) had no power. The extension cord it's plugged into must've been unplugged on the other end, and we tried but weren't able to find it. We decided that the router didn't need to be there (it was only there for some forgotten legacy reason). The router was decommissioned and Michael said he'd go back to install an F-F Ethernet connector in its place.
At some point in the past few months, the routers after the gateway router stopped working. Michael went to check it out last week, and Gabe went today. Indeed, from the gateway router you can't ping the others. And it's possible to connect to the others (setting a static IP), and they can all ping each other (.67, .68, .69), but not the gateway. It could be a bad/disconnected wire, or that the router that's mid-hallway has lost power. I didn't have time today to get a ladder to check it out.
Wifidog was reporting that Harbourfront went down sometime yesterday. I went by to check it out. I could connect to the hallway-south router, but didn't get a DHCP IP. I could ping the three 'bridge' routers, but not the primary one. I also couldn't see its wifi beacon. I got access to the old box office. Lights on the router were on (I didn't pay close attention to which), and I unplugged it and plugged it back in. It came back up right away.
I told Patrick M. that we'll keep an eye on it, and if it happens more than twice again, we'll swap in a new router.
Michael and Gabe installed the rest of the routers today, and hooked them up to the DSL line. Details:
The main router is #65 (LAN IP: 192.168.1.1), and is installed in the old box office. The DSL modem is in the server room; the two are connected through the in-wall wiring (port #2 in the old box office). The main router is connected to router #70, which is in the north hallway. Router #70's wifi is turned off because it's not needed (there's already enough wifi density); this router should simply be replaced with a female-female RJ-45 connector. Router #70 is connected to router #69, in the south hallway. Router #69 is connected to an Ethernet switch in the gallery, which is in turn connected to the router in the cafe (#67) and the one in the studio (#68). Each router except the main one has the static LAN IP 192.168.1.<#>, where # is the router number. All routers have OpenVPN clients installed and working.
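The wiring described here is a single chain with a fan-out at the gallery switch, so one dead device can cut off everything behind it. As a rough sketch (the names `LINKS`, `lan_ip` and `reachable_from_gateway` are made up for illustration), the layout and addressing rule can be written down like this:

```python
MAIN_ROUTER = 65

# Wired links as described above: each node lists what is plugged in downstream of it.
LINKS = {
    65: [70],            # main router (old box office) -> north hallway
    70: [69],            # north hallway -> south hallway
    69: ["switch"],      # south hallway -> Ethernet switch in the gallery
    "switch": [67, 68],  # switch -> cafe and studio routers
    67: [],
    68: [],
}


def lan_ip(router_number):
    """Static LAN IP for a router: the main one is .1, the rest use their number."""
    if router_number == MAIN_ROUTER:
        return "192.168.1.1"
    return f"192.168.1.{router_number}"


def reachable_from_gateway(failed):
    """Return the set of nodes the main router can still reach if `failed` nodes are down."""
    seen = set()
    stack = [MAIN_ROUTER]
    while stack:
        node = stack.pop()
        if node in failed or node in seen:
            continue
        seen.add(node)
        stack.extend(LINKS[node])
    return seen
```

For instance, `reachable_from_gateway({70})` returns only the main router: if #70 loses power, routers #69, #67 and #68 all drop off the gateway's side of the network even though they can still reach each other.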
The DSL modem was configured with the PPPoE info, so the router is getting its WAN IP over DHCP. The DSL modem's IP is 192.168.2.1.
The setup is like the one at St. Lawrence Market: only the primary router has wifidog and dnsmasq (dhcpd) installed. The rest are simply being used as bridges.
Late last night wifidog reported the node down. This morning I stopped by to check it out. The network was visible, but I didn't get a DHCP address. When I set a manual IP on my machine and tried to connect to the router via http and ssh, it was alive but unresponsive. I did get the webif splash screen, but it never made it to the real interface. Likewise, I never got an ssh password prompt. I unplugged it, plugged it back in, and it came back just fine. [Gabe]
The priority is to light up the beer tent for the June 1st opening of LuminaTO.
Michael and Gabe went to install today, but neither the DSL line nor the ethernet cables were in place. So they set up an interim solution, installing the primary router by the metal studio, attached to Michael's “wimax” modem. The coverage is good. We'll go back sometime the week of June 11th to do the full install. The official launch is in mid-July.