Path: usenet.cise.ufl.edu!usenet.eel.ufl.edu!news-peer.gsl.net!news.gsl.net!howland.erols.net!newsxfer3.itd.umich.edu!su-news-hub1.bbnplanet.com!news.bbnplanet.com!su-news-feed4.bbnplanet.com!coop.net!csnews!not-for-mail
From: Tom Christiansen
Newsgroups: comp.lang.perl.misc,comp.lang.perl.announce
Subject: Perl FAQ part 9 of 0..9: Networking [Periodic Posting]
Supersedes: <5g4po3$g0f$1@csnews.cs.colorado.edu>
Followup-To: comp.lang.perl.misc
Date: 17 Mar 1997 23:43:21 GMT
Organization: Perl Consulting and Training
Lines: 300
Approved: tchrist@mox.perl.com (per merlyn@stonehenge.com)
Expires: Tue, 1 Apr 1997 12:00:00 GMT
Message-ID: <5gkkup$fpo$4@csnews.cs.colorado.edu>
References: <5gkhqn$bfq$2@csnews.cs.colorado.edu>
Reply-To: perlfaq-suggestions@mox.perl.com
NNTP-Posting-Host: perl.com
X-Newsposter: Pnews 4.0-test52 (20 Jan 97)
Originator: tchrist@perl.com (Tom Christiansen)
Xref: usenet.cise.ufl.edu comp.lang.perl.misc:20906 comp.lang.perl.announce:130

NAME
    perlfaq9 - Networking ($Revision: 1.13 $)

DESCRIPTION
    This section deals with questions related to networking, the
    internet, and a few on the web.

  My CGI script runs from the command line but not the browser. Can you help me fix it?

    Sure, but you probably can't afford our contracting rates :-)

    Seriously, if you can demonstrate that you've read the following
    FAQs and that your problem isn't something simple that can be
    easily answered, you'll probably receive a courteous and useful
    reply to your question if you post it on
    comp.infosystems.www.authoring.cgi (if it's something to do with
    HTTP, HTML, or the CGI protocols). Questions that appear to be Perl
    questions but are really CGI ones that are posted to
    comp.lang.perl.misc may not be so well received.

    The useful FAQs are:

        http://www.perl.com/perl/faq/idiots-guide.html
        http://www3.pair.com/webthing/docs/cgi/faqs/cgifaq.shtml
        http://www.perl.com/perl/faq/perl-cgi-faq.html
        http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html
        http://www.boutell.com/faq/

  How do I remove HTML from a string?

    The most correct way (albeit not the fastest) is to use HTML::Parse
    from CPAN (part of the libwww-perl distribution, which is a
    must-have module for all web hackers).

    Many folks attempt a simple-minded regular expression approach,
    like `s/<.*?>//g', but that fails in many cases because the tags
    may continue over line breaks, they may contain quoted
    angle-brackets, or HTML comments may be present. Plus folks forget
    to convert entities, like `&lt;' for example.

    Here's one "simple-minded" approach that works for most files:

        #!/usr/bin/perl -p0777
        s/<(?:[^>'"]*|(['"]).*?\1)*>//gs

    If you want a more complete solution, see the 3-stage striphtml
    program in
    http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz .

  How do I extract URLs?

    A quick but imperfect approach is

        #!/usr/bin/perl -n00
        # qxurl - tchrist@perl.com
        print "$2\n" while m{
            < \s*
              A \s+ HREF \s* = \s* (["']) (.*?) \1
            \s* >
        }gsix;

    This version does not adjust relative URLs, understand alternate
    bases, deal with HTML comments, or accept URLs themselves as
    arguments. It also runs about 100x faster than a more "complete"
    solution using the LWP suite of modules, such as the
    http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz
    program.

  How do I download a file from the user's machine? How do I open a file on another machine?

    In the context of an HTML form, you can use what's known as
    multipart/form-data encoding. The CGI.pm module (available from
    CPAN) supports this in the start_multipart_form() method, which
    isn't the same as the startform() method.

  How do I make a pop-up menu in HTML?

    Use the