From abellmt@spsp.net Fri Apr 4 19:00:02 2003
From: abellmt@spsp.net (Martin Abell)
Date: Fri, 04 Apr 2003 14:00:02 -0500
Subject: [SearchEngine] Configuration.
Message-ID:

Hi,

I've downloaded the Douglas Thrift Search Engine and am intrigued by
the apparent simplicity of it, even while it seems to have
considerable flexibility.

I'm new to this list and am not aware of an archive, so I'll simply
ask the first dumb newbie question.

How do you configure the Douglas Thrift Search Engine?

After everything was loaded into /usr/loca/dtse I decided that
search.cgi probably needed to be in the cgi-bin directory of the site
to be searched. (This site will not allow it to execute from the html
directory.)

Even if that is correct, I'm not sure how to tell it where to find
the pages to be searched. And I assume the "/data/" directory can be
moved so that there could be multiple indexes on a machine with
multiple websites.

(I'm getting "Invalid XML version declaration" errors, but don't know
what that means.)

Are there any guidelines or examples of how to start?

Martin

--
Martin Abell
SpeedSpan
617 Vine St
Suite 1308
Cincinnati, OH 45202
513-579-1990

From douglaswth@earthlink.net Fri Apr 4 20:10:53 2003
From: douglaswth@earthlink.net (Douglas William Thrift)
Date: Fri, 4 Apr 2003 12:10:53 -0800
Subject: [SearchEngine] Re: Configuration.
References:
Message-ID: <00a801c2fae6$4df18090$0100a8c0@mshome.net>

Hello and welcome,
Yours is actually the first question asked on this list!

The search.cgi script is for displaying the results of a search.
The program that is actually used is Search in the bin subdirectory.
To index a site with it you issue the following at the terminal ($ is
the prompt):

$ Search index.xml -i http://www.site.com/ -d www.site.com

where index.xml is your index file and www.site.com is the website
you want to index. You will want to run the command from within the
data subdirectory.

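As a rough sketch of the indexing step described above, the following Python helper assembles the same command line for more than one site. It is not part of the Douglas Thrift Search Engine: the function name, the second site, and the idea of naming each index file after its host are all assumptions made for illustration. Only the index-file argument and the -i and -d options quoted in the message are used, and the commands are printed rather than run.

```python
from urllib.parse import urlsplit


def build_index_command(start_url, index_file="index.xml"):
    """Return the argument list for one indexing run.

    Mirrors the command quoted in the message:
        Search index.xml -i http://www.site.com/ -d www.site.com
    Hypothetical helper; only the -i and -d options from that
    command are used.
    """
    domain = urlsplit(start_url).hostname
    if not domain:
        raise ValueError(f"no host name in URL: {start_url!r}")
    return ["Search", index_file, "-i", start_url, "-d", domain]


if __name__ == "__main__":
    # Hypothetical sites. One index file per site is an assumption,
    # not something the thread confirms.
    for url in ("http://www.site.com/", "http://www.example.org/"):
        host = urlsplit(url).hostname
        print(" ".join(build_index_command(url, f"{host}.xml")))
```

Each printed line would then be run from within the data subdirectory, as the message advises.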
If you want to customize where files are found for searches you can
edit the search.cgi file that you copied to your web server. Also,
if you are running the search.cgi script from cgi-bin you may need to
change some of the paths in the HTML template files.

_______________________________________________________________________
Douglas William Thrift

From douglaswth@earthlink.net Fri Apr 4 20:12:46 2003
From: douglaswth@earthlink.net (Douglas William Thrift)
Date: Fri, 4 Apr 2003 12:12:46 -0800
Subject: [SearchEngine] Re: Configuration.
References: <00a801c2fae6$4df18090$0100a8c0@mshome.net>
Message-ID: <00b401c2fae6$8f24a240$0100a8c0@mshome.net>

Also to get all of the command line arguments for the Search program,
type:

$ Search -help

_______________________________________________________________________
Douglas William Thrift

From abellmt@spsp.net Fri Apr 25 20:47:07 2003
From: abellmt@spsp.net (Martin Abell)
Date: Fri, 25 Apr 2003 15:47:07 -0400
Subject: [SearchEngine] FW: Configuration.
In-Reply-To: <00a801c2fae6$4df18090$0100a8c0@mshome.net>
Message-ID:

Douglas,

Thanks for the help on this. Sorry it took so long to get back to you, but
lots of things intervened before I could try some things out. Searching
(and indexing) is very fast, and the results format is excellent. Very nice
work.

Once you unlocked the indexing step for me, everything worked great. I
guess I'd suggest you put the instruction you sent me (below) on your
website. At least I don't think anything like it was there when I looked
before.

(I realize you're following the Google model, but for a small site [not many
hits], I think the ability to enter a word "fragment" [e.g., "docum" to find
document or documents] would be helpful.)

Thanks again.

Martin
--
Martin Abell
SpeedSpan
617 Vine St
Suite 1308
Cincinnati, OH 45202
513-579-1990

------ Forwarded Message
From: "Douglas William Thrift"
Date: Fri, 4 Apr 2003 12:10:53 -0800
To:
Subject: Re: Configuration.

Hello and welcome,
Yours is actually the first question asked on this list!

The search.cgi script is for displaying the results of a search. The
program that is actually used is Search in the bin subdirectory. To index a
site with it you issue the following at the terminal ($ is the prompt):

$ Search index.xml -i http://www.site.com/ -d www.site.com

where index.xml is your index file and www.site.com is the website you want
to index. You will want to run the command from within the data
subdirectory.

If you want to customize where files are found for searches you can edit the
search.cgi file that you copied to your web server. Also, if you are
running the search.cgi script from cgi-bin you may need to change some of
the paths in the HTML template files.

_______________________________________________________________________
Douglas William Thrift

------ End of Forwarded Message

From douglaswth@earthlink.net Sat Apr 26 02:54:32 2003
From: douglaswth@earthlink.net (Douglas William Thrift)
Date: Fri, 25 Apr 2003 18:54:32 -0700
Subject: [SearchEngine] Re: Configuration.
References:
Message-ID: <00e601c30b96$d5ddd1f0$0100a8c0@mshome.net>

Hello again,
I'm glad you got it working. I just put up some getting started
instructions in the Documentation section:
http://computers.douglasthrift.net/searchengine/#docs.

_______________________________________________________________________
Douglas William Thrift

From abellmt@spsp.net Sat Apr 26 04:16:32 2003
From: abellmt@spsp.net (Martin Abell)
Date: Fri, 25 Apr 2003 23:16:32 -0400
Subject: [SearchEngine] Startup Documentation Should Really Be Helpful.
Message-ID:

I'd imagine others will appreciate that. (I had to increase the font size
on my browser to read it though.)
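
The word "fragment" idea raised earlier in the thread ("docum" to find document or documents) amounts to prefix matching against the indexed vocabulary. The sketch below shows one way that could work in general. It is an illustration only and says nothing about how the Douglas Thrift Search Engine stores or queries its index; the function name and the sample word list are assumptions.

```python
from bisect import bisect_left


def words_with_prefix(sorted_words, fragment):
    """Return every word in sorted_words that starts with fragment.

    sorted_words must be a sorted list of lowercase words. Binary
    search finds the first candidate, then the scan stops at the
    first word that no longer shares the prefix.
    """
    fragment = fragment.lower()
    start = bisect_left(sorted_words, fragment)
    matches = []
    for word in sorted_words[start:]:
        if not word.startswith(fragment):
            break
        matches.append(word)
    return matches


if __name__ == "__main__":
    # Hypothetical vocabulary standing in for the words of a small site.
    vocabulary = sorted(["document", "documents", "dock", "index", "search"])
    print(words_with_prefix(vocabulary, "docum"))
```

Keeping the vocabulary sorted means all words sharing a prefix sit next to each other, so a fragment query touches only the matching run rather than the whole list.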