Auth server was returning errors on any wifidog page. (Could not allocate memory?) I restarted it and it came back fine.
Auth returning errors, unable to connect to PostgreSQL.
sudo /etc/init.d/postgresql-8.3 restart
Auth back up.
– Michael Pereira
The auth server has spun out of control at around noon two days in a row, and needed a hard reboot. I checked the logs, and it just looks like an above-normal but not extraordinary number of hits to the login page. I added “OfficeLiveConnector” to the blacklist, but basically this problem is only going to get worse.
The auth server went down yesterday afternoon. Looks like it was the same problem. I “blacklisted” “Google Updater”. There were also a ton (5000) of emails in the mailman queue, all from the past 2-3 weeks; I deleted them.
The auth server went down at about noon today, and I brought it back at about 3pm. At 12:08pm, the server was hit with about 50 login redirects from one IP in about 20 seconds. From then on the server just began spinning wildly. I added the UserAgent string to the blacklist, and installed Cache_Lite. [Actually, I can't get Cache_Lite working.]
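A rough sketch for spotting a burst like this in the access log. It assumes Apache's combined log format (client IP in field 1, request path in field 7) and that the login page path contains "/login". The function name and the log location in the usage comment are illustrative, not taken from this server.

```shell
# Count hits per client IP for requests whose path contains a given string.
# Reads Apache combined-format log lines on stdin and prints "count ip",
# busiest client first.
hits_per_ip() {
  pattern="$1"
  awk -v pat="$pattern" 'index($7, pat) { n[$1]++ }
       END { for (ip in n) print n[ip], ip }' | sort -rn
}

# Usage (log path is an assumption; adjust for the actual install):
#   hits_per_ip /login < /var/log/apache2/access.log | head
```

Narrowing the input first (for example with grep on the minute in question) makes it easier to see whether a single IP accounts for most of the redirects.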
I ended up having to upgrade to a 512MB slice in order for the server to not run out of memory.
Over the past few days users have reported certificate errors when they hit the auth server (“untrusted site” or “unknown authority”). I remembered that when installing this cert on the old auth server I had to add an extra file, the “chain file”. I added this (SSLCertificateChainFile in the apache config), and the problem seems to be fixed.
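One way to confirm a chain fix like this from outside the server is to look at the "Verify return code" line that openssl s_client prints. The helper below is a minimal sketch that only extracts that line from s_client output. Its name and the hostname in the usage comment are hypothetical.

```shell
# Pull the verification result out of `openssl s_client` output on stdin.
# Prints e.g. "0 (ok)" or "21 (unable to verify the first certificate)".
chain_status() {
  sed -n 's/^ *Verify return code: *//p' | head -n 1
}

# Usage (hostname is a placeholder for the real auth server):
#   openssl s_client -connect auth.example.org:443 </dev/null 2>/dev/null | chain_status
```

A result of "0 (ok)" means the client could build a chain to a trusted root. A non-zero code would be consistent with a missing intermediate certificate, though other causes are possible.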
Here are the changes I made to the config files to reduce memory:
###apache.conf
#found at http://wiki.vpslink.com/Low_memory_MySQL_/_Apache_configurations
StartServers 1
MinSpareServers 1
MaxSpareServers 5
ServerLimit 40
MaxClients 40
MaxRequestsPerChild 200

###postgresql.conf
max_connections = 40

###mysql/my.cnf
###used the small my.cnf found at /usr/share/doc/mysql-server-5.0/examples/my-small.cnf
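As a sanity check on a MaxClients value, a back-of-the-envelope estimate is the average resident size of an Apache child multiplied by the maximum number of children. The sketch below assumes RSS values in KB, one per line, as printed by ps on Linux. The function name and the process name "apache2" are assumptions, and RSS overstates real usage because shared pages are counted in every child.

```shell
# Estimate worst-case Apache memory in MB:
# (average child RSS in KB, read from stdin) * max clients / 1024.
worst_case_mb() {
  max_clients="$1"
  awk -v m="$max_clients" '{ sum += $1; n++ }
       END { if (n) printf "%d\n", (sum / n) * m / 1024 }'
}

# Usage (process name is an assumption; it may be httpd on other distros):
#   ps -C apache2 -o rss= | worst_case_mb 40
```

If the estimate comes out well above the RAM on the slice, lowering MaxClients or moving to a larger slice are the obvious levers.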
Our deal with TheWire is up, so I'm migrating the server to a VPS. Hard to get the memory usage down so it doesn't thrash, but seems to be running ok now. Will keep monitoring. Also, we're now up to wifidog rev 1390.
The auth server got kinda crashy on us, 'cause it was on a box with not much RAM. In particular, it would crash when an unauthenticated user would launch an RSS reader – which would slam the router with dozens of simultaneous HTTP requests, which were each redirected to the server. I wrote this little chunk of code to try to minimize this impact; I didn't do any quantitative testing, but anecdotally it seemed to reduce the problem:
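///////////// (wifidog/include/common.php, before 'require_once('init_php.php')')
// User agents of feed readers and background updaters that flood the login page.
$rssagents = array(
    'NetNewsWire', 'AppleSyndication', 'ProRSSReader', 'Apple-PubSub',
    'Transmission', 'Shasta', 'Microsoft-CryptoAPI', 'Feedfetcher',
    'AntiVir-NG-Upd', 'Windows-Update-Age'
);
// Some clients send no User-Agent header at all; treat that as an empty string.
$useragent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$keepgoing = 1;
foreach ($rssagents as $thisrssua) {
    // === 0 matches only when the user agent string starts with the listed name.
    if (strpos($useragent, $thisrssua) === 0) {
        $keepgoing = 0;
    }
}
if ($keepgoing == 0) {
    // Answer with a tiny response instead of loading the full wifidog stack.
    print "request cancelled";
    die();
}
///////////// END OF GABE ADDITION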
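To decide which agents belong in a list like this, one option is to tally requests per User-Agent string from the access log. This is a sketch assuming Apache's combined log format, where the user agent is the sixth field when a line is split on double quotes. The function name and log path are illustrative.

```shell
# Tally requests per User-Agent from combined-format log lines on stdin.
# Prints "count<TAB>user agent", most frequent first.
hits_per_agent() {
  awk -F'"' '{ n[$6]++ }
       END { for (ua in n) print n[ua] "\t" ua }' | sort -rn
}

# Usage (log path is an assumption):
#   hits_per_agent < /var/log/apache2/access.log | head -n 20
```

Splitting on quotes miscounts lines whose request or referer itself contains a double quote, so the output is best read as a rough ranking rather than an exact count.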
Wifidog sent out email notifications at around 7:15PM tonight that all the hotspots were down. Seems like apache on the server had crashed.
Gabe upgraded the auth server. It's now at revision 1305. I did:
I had to install a newer Smarty – the upgraded auth server wanted 2.6.18.
Gabe upgraded the auth server. It's now at revision 1279. I did:
Gabe upgraded the auth server. It's now at revision 1244. I did:
Gabe received an email from a user saying he was having trouble changing his password. Gabe confirmed the problem and reported the bug on the isf-wifidog list. Five hours later Benoit replied saying he'd fixed it in svn. Gabe ran “svn update” – we're now running revision 1172.
Gabe upgraded the wifidog auth server to the current version (1164).
Note: to upgrade auth server, use “svn update”, as per François Proulx's email to the wifidog email list on January 10th.