Hi, I don't follow OG development but as host operator a few things are coming up. The machine's load is gradually climbing over time and some of that is OG.
Despite a relatively low hit rate on OG it is consuming quite a bit of resource. If OG started taking off it would take the machine down.
First up: index.cgi requires 0.35s to perform a `perl -c` syntax check. Any thoughts on putting OG on a mod_perl server? I have mod_perl running here of course and we'd need to coordinate some apache.conf stuff.
Second: the supersearch.cgi gulps down CPU, often for seconds at a time. It is a frequent resident of `top` output. This isn't really acceptable. I'm going to request this feature be turned off unless an effective optimisation plan or some other way to reduce its impact here is constructed pretty soon. Sorry about this but it's encroaching on others.
Third: I wonder if there's some way to instruct robots not to spider parts of your wiki. This ought to speak for itself:

  $ grep crawl /var/log/apache/london.openguides.org-access.log | grep 'action=edit' | wc -l
  8242
  $
Finally: I posted about a DoS and was wondering what the status of a solution was. http://openguides.org/mail/openguides-dev/2004-October/000542.html
Cheers, Paul (any overbearing tone unintentional ;-)
On Thu 21 Oct 2004, Paul Makepeace openguides.org@paulm.com wrote:
Second: the supersearch.cgi gulps down CPU, often for seconds at a time.
Working on this. The slowness is regexp-related. I have a much much faster version that passes all tests but gives slightly different results. More tests and investigation needed, will keep people posted.
Third: I wonder if there's some way to instruct robots not to spider parts of your wiki.
This was fixx0red in 0.41, released on 21 September.
Kake
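For readers wondering what "instructing robots" can look like in practice: one general technique is to put a robots meta tag on every page that is not a plain view. The sketch below is illustrative Python, not OpenGuides code, and the thread does not say how 0.41 actually addresses the problem. The function name `robots_meta` and the idea that a normal page view is called "display" are assumptions made for the example.

```python
# Hypothetical helper: decide which robots meta tag a wiki page should carry.
# Only plain page views stay indexable; every other action (edit, history, ...)
# asks well-behaved crawlers not to index the page or follow its links.

INDEXABLE_ACTIONS = {"display"}  # assumption: "display" means a normal page view


def robots_meta(action: str) -> str:
    """Return the <meta name="robots"> tag for the given wiki action."""
    if action in INDEXABLE_ACTIONS:
        content = "index,follow"
    else:
        content = "noindex,nofollow"
    return f'<meta name="robots" content="{content}" />'


if __name__ == "__main__":
    for action in ("display", "edit"):
        print(action, "->", robots_meta(action))
```

Only crawlers that honour the tag are affected, so this reduces polite spidering of edit links rather than blocking abusive clients.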
On Thu 21 Oct 2004, Paul Makepeace openguides.org@paulm.com wrote:
Hi, I don't follow OG development but as host operator a few things are coming up. The machine's load is gradually climbing over time and some of that is OG.
Oh - and for anyone who was confused by Paul's message, he's the sysadmin of the machine that the London OpenGuide runs on.
Kake
On Mon, Oct 25, 2004 at 04:25:40AM +0100, Kake L Pugh wrote:
On Thu 21 Oct 2004, Paul Makepeace openguides.org@paulm.com wrote:
Hi, I don't follow OG development but as host operator a few things are coming up. The machine's load is gradually climbing over time and some of that is OG.
Oh - and for anyone who was confused by Paul's message, he's the sysadmin of the machine that the London OpenGuide runs on.
Ooh, and that reminds me, I was muttering at Ivor about this in the pub on Friday, and the URL for the article on vector searching is: http://www.perl.com/pub/a/2003/02/19/engine.html
Maybe it'll help.
Any objections to me completely removing Search::InvertedIndex support from OpenGuides and just using Plucene?
Kake
On Oct 25, 2004, at 18:46, Kake L Pugh wrote:
Any objections to me completely removing Search::InvertedIndex support from OpenGuides and just using Plucene?
Yes, Plucene doesn't work on Mac OS X. Plus, _wow_ that's a lot of deps.
Not that I use SII either, to be honest, but at least the thing passes tests.
tom
On Oct 25, 2004, at 18:46, Kake L Pugh wrote:
Any objections to me completely removing Search::InvertedIndex support from OpenGuides and just using Plucene?
On Mon 25 Oct 2004, Tom Insam tom@jerakeen.org wrote:
Yes, Plucene doesn't work on Mac OS X.
OK.
Plus, _wow_ that's a lot of deps.
Not an issue in my book.
Kake
This one time, at band camp, Kake L Pugh wrote:
Any objections to me completely removing Search::InvertedIndex support from OpenGuides and just using Plucene?
Currently unsupported in the Debian packages. Dom, happy to check out some test builds on desktop machine if you enable it.
On Oct 25, 2004, at 18:46, Kake L Pugh wrote:
Any objections to me completely removing Search::InvertedIndex support from OpenGuides and just using Plucene?
On Mon 25 Oct 2004, Tom Insam tom@jerakeen.org wrote:
Yes, Plucene doesn't work on Mac OS X.
OK.
I withdraw my opinion. I'm perfectly capable of force-installing Plucene locally if I want the tests to work, and my argument is thus based purely on 'some other people might have issues', which is a terrible reason. If anyone actually would consider this a real issue, fine, it's a real issue, but it's not for me.
tom
On Mon, Oct 25, 2004 at 06:46:06PM +0100, Kake L Pugh wrote:
Any objections to me completely removing Search::InvertedIndex support from OpenGuides and just using Plucene?
What concrete reasons are there to drop support for S::I? Speaking for myself (and for the Debian packages), I would not want S::I support to go away just yet; as others have said, Plucene introduces a heap of new module dependencies which I'll need to work on for Debian packaging. This is something I wanted to do at some point, but I was hoping to have a bit more breathing space.
Does retaining S::I support impose a significant hassle now? IMO strongarming users into using the "better" programming solution is a bad thing; OpenGuides is a fast enough moving target as it is, and at the end of the day we want to be able to run a stable, maintainable system (I don't consider CPAN.pm or CPANPLUS.pm capable of doing this, for the record). There is of course no reason not to develop better searching code, and I recognise that this is an urgent problem as regards the London server, but since we already support both search systems, is there really a need to throw the baby out with the bathwater?
I am also concerned that we might lock ourselves into Plucene and subsequently find it an abandoned project and that better things are out there (I'm not making a statement of the current situation because I haven't done a survey of the current search toolkits available). I know that Simon has stepped away from maintaining his modules, so it seems foolish to force our users to use Plucene with no alternative at this stage.
These are some initial thoughts. If I come up with anything more once I've had a chance to think it through, look through the code, and be sober, I'll follow up.
Cheers,
Dom.
On Tue 26 Oct 2004, Dominic Hargreaves dom@earth.li wrote:
What concrete reasons are there to drop support for S::I?
The supersearch needs a complete rewrite. It's tangled and murky and full of action at a distance. However it's also quite featureful, and most of its features are ones that people have specifically asked for, so I need to retain them.
Now, the way the supersearch works at the moment is that it only uses a very small amount of the functionality of the CGI::Wiki indexers. The essence of the search is done with regexes, and this is why it's so inefficient. There is no way I can see to speed the thing up except by actually using some of the features of the powerful modules that CGI::Wiki uses for indexing.
What I'm working on at the moment is something that will probably end up being called OpenGuides::Indexer. It will use Plucene directly and efficiently.
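The difference between the two approaches described above can be sketched in a few lines. This is illustrative Python over a made-up three-node corpus, not the supersearch code or the Plucene API: `regex_search` rescans every node's text for every query, while `build_index` does the text processing once and `index_search` answers an AND query by intersecting sets. All names here are assumptions for the example.

```python
import re
from collections import defaultdict

# Toy corpus standing in for wiki nodes (names and text are made up).
NODES = {
    "Red Lion": "A pub near the river with real ale",
    "Blue Cafe": "Vegetarian cafe near the station",
    "Old Bell": "Pub with real ale and a garden",
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def regex_search(nodes, words):
    """Scan approach: run a regex over every node's text for every query."""
    hits = set()
    for name, text in nodes.items():
        if all(re.search(rf"\b{re.escape(w)}\b", text, re.IGNORECASE) for w in words):
            hits.add(name)
    return hits


def build_index(nodes):
    """Index approach: map each word to the set of nodes containing it (built once)."""
    index = defaultdict(set)
    for name, text in nodes.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def index_search(index, words):
    """Answer an AND query by intersecting the per-word node sets."""
    sets = [index.get(w.lower(), set()) for w in words]
    return set.intersection(*sets) if sets else set()
```

With the index built once, a query's cost depends on the number of query words and the size of their posting sets rather than on the total amount of text in the guide, which is the kind of saving a real indexer is meant to provide.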
Does retaining S::I support impose a significant hassle now?
Yes.
I know that Simon has stepped away from maintaining his modules, so it seems foolish to force our users to use Plucene with no alternative at this stage.
As far as I know, Plucene is an important part of a lot of Kasei's code. They are the maintainers now, and I doubt they're going to junk it. Search::InvertedIndex however has essentially been abandoned by its maintainer.
Kake
On Tue, Oct 26, 2004 at 12:17:22AM +0100, Kake L Pugh wrote:
On Tue 26 Oct 2004, Dominic Hargreaves dom@earth.li wrote:
Does retaining S::I support impose a significant hassle now?
Yes.
Okay. I was assuming that it would be possible to just clone stuff, so that we end up with a new supersearch.cgi and a notverygoodsearch.cgi left over with S::I support :)
I know that Simon has stepped away from maintaining his modules, so it seems foolish to force our users to use Plucene with no alternative at this stage.
As far as I know, Plucene is an important part of a lot of Kasei's code. They are the maintainers now, and I doubt they're going to junk it. Search::InvertedIndex however has essentially been abandoned by its maintainer.
I didn't realise that; that makes me feel happier about using it.
Cheers,
Dominic.
A follow-up on a 2.5yr-old mail from your London OG admin,
OpenGuides London is frequently bursting to 100% of a 3GHz HT CPU, which is causing load problems on stix (the server):

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  11060 www-data  19   0 18940  15m 2576 R 26.5  1.5   0:00.80 index.cgi
  11040 www-data  19   0 21420  17m 2612 R  6.9  1.7   0:01.12 index.cgi
  11039 www-data  19   0 21004  17m 2584 R  6.6  1.7   0:01.04 index.cgi
  11048 www-data  19   0 21392  17m 2612 R  6.6  1.7   0:01.10 index.cgi
  11051 www-data  19   0 21248  17m 2612 R  4.6  1.7   0:01.02 index.cgi
This CPU is nearly twice as fast as the one in the server the hosting was on when I last wrote (and the I/O is faster still), and not using supersearch.cgi has obviously helped; in retrospect, all of that has bought some time. However, some plan to make use of mod_perl or another CPU-reduction measure needs to come into effect pretty soon.
What say ye?
P
On 2004-10-21 17:25:59 +0100, Paul Makepeace wrote:
Hi, I don't follow OG development but as host operator a few things are coming up. The machine's load is gradually climbing over time and some of that is OG.
Despite a relatively low hit rate on OG it is consuming quite a bit of resource. If OG started taking off it would take the machine down.
First up: index.cgi requires 0.35s to perform a `perl -c` syntax check. Any thoughts on putting OG on a mod_perl server? I have mod_perl running here of course and we'd need to coordinate some apache.conf stuff.
Second: the supersearch.cgi gulps down CPU, often for seconds at a time. It is a frequent resident of `top` output. This isn't really acceptable. I'm going to request this feature be turned off unless an effective optimisation plan or some other way to reduce its impact here is constructed pretty soon. Sorry about this but it's encroaching on others.
Third: I wonder if there's some way to instruct robots not to spider parts of your wiki. This ought to speak for itself:

  $ grep crawl /var/log/apache/london.openguides.org-access.log | grep 'action=edit' | wc -l
  8242
  $
Finally: I posted about a DoS and was wondering what the status of a solution was. http://openguides.org/mail/openguides-dev/2004-October/000542.html
Cheers, Paul (any overbearing tone unintentional ;-)
On Tue, Jun 05, 2007 at 09:58:37AM +0100, Paul Makepeace wrote:
A follow-up on a 2.5yr-old mail from your London OG admin,
OpenGuides London is frequently bursting to 100% of a 3GHz HT CPU, which is causing load problems on stix (the server):

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  11060 www-data  19   0 18940  15m 2576 R 26.5  1.5   0:00.80 index.cgi
  11040 www-data  19   0 21420  17m 2612 R  6.9  1.7   0:01.12 index.cgi
  11039 www-data  19   0 21004  17m 2584 R  6.6  1.7   0:01.04 index.cgi
  11048 www-data  19   0 21392  17m 2612 R  6.6  1.7   0:01.10 index.cgi
  11051 www-data  19   0 21248  17m 2612 R  4.6  1.7   0:01.02 index.cgi
This CPU is nearly twice as fast as the one in the server the hosting was on when I last wrote (and the I/O is faster still), and not using supersearch.cgi has obviously helped; in retrospect, all of that has bought some time. However, some plan to make use of mod_perl or another CPU-reduction measure needs to come into effect pretty soon.
What say ye?
I think this has to be directed at Earle in the first instance, as the rest of us are not party to the configuration or any statistics of london.openguides.org, or, for example, whether the relevant database indexes are configured (this, I recall, improves certain aspects of OpenGuides significantly).
The last time I tried OpenGuides under mod_perl it mostly worked, though for some reason that escaped me (possibly time) I never went as far as declaring it so - there were some niggles to clean up.
These days my own interest in persistent Perl applications lies with FastCGI (since this allows one to run the application under its own user id), and restructuring wiki.cgi et al. to use CGI::Fast should be fairly simple (especially since CGI::Fast will also work in plain CGI mode), but since I'm not currently suffering from resource problems this isn't a priority for me.
http://dev.openguides.org/ticket/150
is one ticket about this issue.
Dominic.
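The appeal of persistent setups such as mod_perl or FastCGI, as discussed above, is that start-up work (compiling the application, reading configuration) is paid once per process instead of once per request. The sketch below illustrates only that pattern in Python; it is not CGI::Fast or OpenGuides code, and `load_application`, `STATS` and the config contents are hypothetical stand-ins.

```python
import functools

STATS = {"loads": 0}  # counts how often the expensive start-up work actually runs


def load_application():
    """Stand-in for the costly part of a request: compiling code, reading config."""
    STATS["loads"] += 1
    return {"site_name": "Example Guide"}  # hypothetical configuration


@functools.lru_cache(maxsize=None)
def get_application():
    """Persistent-process style: do the start-up work once, then reuse the result."""
    return load_application()


def handle_request_cgi(path):
    """Plain-CGI style: every request pays the start-up cost again."""
    app = load_application()
    return f"{app['site_name']}: {path}"


def handle_request_persistent(path):
    """Persistent style: requests share one already-initialised application."""
    app = get_application()
    return f"{app['site_name']}: {path}"
```

Serving three requests through `handle_request_cgi` raises `STATS["loads"]` by three, whereas `handle_request_persistent` raises it by at most one for the life of the process.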
On Tue, 5 Jun 2007, Dominic Hargreaves wrote:
The last time I tried OpenGuides under mod_perl it mostly worked, though for some reason that escaped me (possibly time) I never went as far as declaring it so - there were some niggles to clean up.
I think it mostly works too. However, it will exhaust database connections (under Postgres), so I suspect we're not closing db connections properly at some point (and indeed my cow-orkers agreed with me when I asked them, though we didn't look into the code). I was going to see if I could replicate this before the hackfest this weekend.
On Tue 05 Jun 2007, Dominic Hargreaves dom@earth.li wrote:
The last time I tried OpenGuides under mod_perl it mostly worked, though for some reason that escaped me (possibly time) I never went as far as declaring it so - there were some niggles to clean up.
Bob tried it out recently on the dev install of RGL; it seemed to work apart from the problem of its occasionally spitting out a dump of its config object instead of the expected output. Obviously this is a showstopper since the config object contains admin and database passwords. I couldn't figure out why it was happening (though frustratingly I remember something else causing this problem too, and I fixed that, but I can't remember the details).
[...] since I'm not currently suffering from resource problems this isn't a priority for me.
Me neither, I'm afraid. I suspect Earle is the person to spearhead this one, since he's the one directly affected.
Kake
On Tue, Jun 05, 2007 at 09:58:37AM +0100, Paul Makepeace wrote:
This CPU is nearly twice as fast as the one in the server the hosting was on when I last wrote (and the I/O is faster still), and not using supersearch.cgi has obviously helped; in retrospect, all of that has bought some time. However, some plan to make use of mod_perl or another CPU-reduction measure needs to come into effect pretty soon.
I run the Open Guide to Boston under mod_perl without problems. I believe that I brought all the patches I made to the code back into line with the main trunk, so I am not aware of any reason that the code should not be mod_perl friendly.
Regards,
On Tue, 5 Jun 2007, Christopher Schmidt wrote:
I run the Open Guide to Boston under mod_perl without problems. I believe that I brought all the patches I made to the code back into line with the main trunk, so I am not aware of any reason that the code should not be mod_perl friendly.
Is your use of memcached hiding possible connection exhaustion issues? Would your memcached stuff help with London's problem?
On Tue, Jun 05, 2007 at 12:49:33PM +0100, Bob Walker wrote:
On Tue, 5 Jun 2007, Christopher Schmidt wrote:
I run the Open Guide to Boston under mod_perl without problems. I believe that I brought all the patches I made to the code back into line with the main trunk, so I am not aware of any reason that the code should not be mod_perl friendly.
Is your use of memcached hiding possible connection exhaustion issues? Would your memcached stuff help with London's problem?
Even when memcached is in place, I still connect to the database, even when all the data is loaded from cache. (That's a bug, but not one I care about). However, I have my database set up to allow max connections of twice the number of apache threads I have, and OpenGuides only connects once per page as far as I remember -- I've done some looking at this -- so I don't think it'd be a problem...
However, I'm using MySQL, and I know you're not. It's possible that the W::T Store module for Postgres doesn't do closing and MySQL does?
I can't tell whether memcached would help: it would depend on whether there is free memory on the box, and whether the load comes from getting the data or from styling it. In general the latter is true, but someone with access to the box would be better placed to determine that.
In general, mod_perl has helped much much much more with performance than memcached has, especially in the single-node case (rather than category listings or the like).
Regards,
On Tue, Jun 05, 2007 at 08:01:37AM -0400, Christopher Schmidt wrote:
Even when memcached is in place, I still connect to the database, even when all the data is loaded from cache. (That's a bug, but not one I care about). However, I have my database set up to allow max connections of twice the number of apache threads I have, and OpenGuides only connects once per page as far as I remember -- I've done some looking at this -- so I don't think it'd be a problem...
We ought to be keeping database handles around and reusing them; if we're neither reusing them nor destroying them, then this is obviously going to cause the number of database connections to spiral. We need to look into the code to correct this; it shouldn't be hard. One for this weekend maybe.
Dominic.
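The handle-reuse idea above can be sketched as follows. This is a minimal Python illustration, not the OpenGuides or Wiki::Toolkit store code: the standard-library `sqlite3` module stands in for Postgres or MySQL, and `get_handle`, `close_handle` and the `node` table are hypothetical names chosen for the example.

```python
import sqlite3

_HANDLE = None  # one cached connection per process (hypothetical module-level cache)


def get_handle(path=":memory:"):
    """Return the process-wide connection, opening it only on first use."""
    global _HANDLE
    if _HANDLE is None:
        _HANDLE = sqlite3.connect(path)
    return _HANDLE


def close_handle():
    """Close the cached connection explicitly, e.g. at process shutdown."""
    global _HANDLE
    if _HANDLE is not None:
        _HANDLE.close()
        _HANDLE = None


def handle_request(name):
    """Serve one 'page': reuses the cached handle rather than connecting again."""
    db = get_handle()
    db.execute("CREATE TABLE IF NOT EXISTS node (name TEXT PRIMARY KEY)")
    db.execute("INSERT OR IGNORE INTO node (name) VALUES (?)", (name,))
    return db.execute("SELECT COUNT(*) FROM node").fetchone()[0]
```

In a persistent process, each worker then holds at most one connection however many pages it serves, so the connection count is bounded by the number of workers instead of growing with traffic.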
[+earle]
On 6/5/07, Dominic Hargreaves dom@earth.li wrote:
On Tue, Jun 05, 2007 at 09:58:37AM +0100, Paul Makepeace wrote:
A follow-up on a 2.5yr-old mail from your London OG admin,
OpenGuides London is frequently bursting to 100% of a 3GHz HT CPU, which is causing load problems on stix (the server):

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  11060 www-data  19   0 18940  15m 2576 R 26.5  1.5   0:00.80 index.cgi
  11040 www-data  19   0 21420  17m 2612 R  6.9  1.7   0:01.12 index.cgi
  11039 www-data  19   0 21004  17m 2584 R  6.6  1.7   0:01.04 index.cgi
  11048 www-data  19   0 21392  17m 2612 R  6.6  1.7   0:01.10 index.cgi
  11051 www-data  19   0 21248  17m 2612 R  4.6  1.7   0:01.02 index.cgi
This CPU is nearly twice as fast as the one in the server the hosting was on when I last wrote (and the I/O is faster still), and not using supersearch.cgi has obviously helped; in retrospect, all of that has bought some time. However, some plan to make use of mod_perl or another CPU-reduction measure needs to come into effect pretty soon.
What say ye?
I think this has to be directed at Earle in the first instance, as the rest of us are not party to the configuration or any statistics of london.openguides.org, or, for example, whether the relevant database indexes are configured (this, I recall, improves certain aspects of OpenGuides significantly).
The last time I tried OpenGuides under mod_perl it mostly worked, though for some reason that escaped me (possibly time) I never went as far as declaring it so - there were some niggles to clean up.
These days my own interest in persistent Perl applications lies with FastCGI (since this allows one to run the application under its own user id), and restructuring wiki.cgi et al. to use CGI::Fast should be fairly simple (especially since CGI::Fast will also work in plain CGI mode), but since I'm not currently suffering from resource problems this isn't a priority for me.
http://dev.openguides.org/ticket/150
is one ticket about this issue.
Thanks Dom. FastCGI is another route that seems reasonable, and probably preferable, since it's somewhat easier to hive off and debug resource issues than with apache+mod_perl; specifically, restarting an fcgi instance is less disruptive in a shared environment.
Can I request that someone (Earle, with help from others if necessary) propose a deadline for ameliorating this? As usual, I'm of course happy to help from the admin end.
P
-- OpenGuides-Dev mailing list - OpenGuides-Dev@lists.openguides.org http://lists.openguides.org/cgi-bin/mailman/listinfo/openguides-dev
Hi,
I'm sorry to hear that OGL is causing such a server strain. I haven't had much time to follow list traffic recently as I've been trying to look for a new job while also revising for an engineering examination on Monday, and my mother has just had a stroke which is currently requiring the greater part of my attention.
A few days ago I made partial progress in sorting out an install of the svn code for OGL. Hopefully I will be able to finish it on Tuesday or so.
Earle
On Tue 05 Jun 2007, Paul Makepeace paulm@paulm.com wrote:
OpenGuides London is frequently bursting to 100% of a 3GHz HT CPU, which is causing load problems on stix (the server),
Did this issue get resolved or ameliorated, and if so what was the fix?
Kake