From asmodai@nexus.ninth-circle.org  Sun Mar 23 05:41:22 2003
Return-Path: <asmodai@nexus.ninth-circle.org>
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3B3C037B404
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 23 Mar 2003 05:41:22 -0800 (PST)
Received: from 213-84-207-11.adsl.xs4all.nl (nexus.xs4all.nl [213.84.207.11])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 971A643F75
	for <FreeBSD-gnats-submit@freebsd.org>; Sun, 23 Mar 2003 05:41:20 -0800 (PST)
	(envelope-from asmodai@nexus.ninth-circle.org)
Received: by 213-84-207-11.adsl.xs4all.nl (Postfix, from userid 1000)
	id DC6D35B0; Sun, 23 Mar 2003 14:41:18 +0100 (CET)
Message-Id: <20030323134118.DC6D35B0@213-84-207-11.adsl.xs4all.nl>
Date: Sun, 23 Mar 2003 14:41:18 +0100 (CET)
From: Jeroen Ruigrok van der Werven <asmodai@nexus.ninth-circle.org>
Reply-To: Jeroen Ruigrok van der Werven <asmodai@nexus.ninth-circle.org>
To: FreeBSD-gnats-submit@freebsd.org
Cc:
Subject: [PATCH] Fix textfile creation
X-Send-Pr-Version: 3.113
X-GNATS-Notify:

>Number:         50211
>Category:       docs
>Synopsis:       [patch] doc.docbook.mk: fix textfile creation
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-doc
>State:          open
>Quarter:        
>Keywords:       all
>Date-Required:  
>Class:          change-request
>Submitter-Id:   current-users
>Arrival-Date:   Sun Mar 23 05:50:11 PST 2003
>Closed-Date:    
>Last-Modified:  Tue Aug 31 22:09:48 UTC 2010
>Originator:     Jeroen Ruigrok van der Werven
>Release:        FreeBSD 4.7-STABLE i386
>Organization:
Ninth Circle Enterprises
>Environment:

Not applicable.

>Description:

A person might want to have the latest links installed. However, this breaks
text file creation from HTML files, since the latest links version no longer
supports -dump.

>How-To-Repeat:

Install docproj, install ports/www/links.

>Fix:

Ditch links in favour of elinks; see the attached patches.


Index: doc/share/mk/doc.docbook.mk
===================================================================
RCS file: /home/ncvs/FreeBSD/doc/share/mk/doc.docbook.mk,v
retrieving revision 1.78
diff -u -r1.78 doc.docbook.mk
--- doc/share/mk/doc.docbook.mk	16 Feb 2003 14:59:30 -0000	1.78
+++ doc/share/mk/doc.docbook.mk	23 Mar 2003 13:37:43 -0000
@@ -217,7 +217,7 @@
 GROFF?=		groff
 TIDY?=		${PREFIX}/bin/tidy
 TIDYOPTS?=	-i -m -raw -preserve -f /dev/null -asxml ${TIDYFLAGS}
-HTML2TXT?=	${PREFIX}/bin/links
+HTML2TXT?=	${PREFIX}/bin/elinks
 HTML2TXTOPTS?=	-dump ${HTML2TXTFLAGS}
 HTML2PDB?=	${PREFIX}/bin/iSiloBSD
 HTML2PDBOPTS?=	-y -d0 -Idef ${HTML2PDBFLAGS}
@@ -440,6 +440,9 @@
 ${DOC}.html-text: ${DOC}.xml ${INDEX_SGML} ${HTML_INDEX}
 	${XSLTPROC} --param freebsd.output.html.images "'0'" ${XSLHTML} \
 		${DOC}.xml > ${.TARGET}
+.endif
+.if !defined(NO_TIDY)
+	-${TIDY} ${TIDYOPTS} ${.TARGET}
 .endif
 
 ${DOC}.html-split.tar: HTML.manifest ${LOCAL_IMAGES_LIB} \
Index: doc/share/mk/doc.html.mk
===================================================================
RCS file: /home/ncvs/FreeBSD/doc/share/mk/doc.html.mk,v
retrieving revision 1.15
diff -u -r1.15 doc.html.mk
--- doc/share/mk/doc.html.mk	27 Dec 2002 00:25:42 -0000	1.15
+++ doc/share/mk/doc.html.mk	23 Mar 2003 13:29:35 -0000
@@ -69,7 +69,7 @@
 
 TIDY?=		${PREFIX}/bin/tidy
 TIDYOPTS?=	-i -m -raw -preserve -f /dev/null -asxml ${TIDYFLAGS}
-HTML2TXT?=	${PREFIX}/bin/links
+HTML2TXT?=	${PREFIX}/bin/elinks
 HTML2TXTOPTS?=	-dump ${HTML2TXTFLAGS}
 HTML2PDB?=	${PREFIX}/bin/iSiloBSD
 HTML2PDBOPTS?=	-y -d0 -Idef ${HTML2PDBFLAGS}

Index: ports/textproc/docproj/Makefile
===================================================================
RCS file: /home/ncvs/FreeBSD/ports/textproc/docproj/Makefile,v
retrieving revision 1.41
diff -u -r1.41 Makefile
--- ports/textproc/docproj/Makefile	7 Mar 2003 06:11:39 -0000	1.41
+++ ports/textproc/docproj/Makefile	23 Mar 2003 13:34:02 -0000
@@ -27,7 +27,7 @@
 		${PREFIX}/share/xml/dtd/xhtml/xhtml.soc:${PORTSDIR}/textproc/xhtml \
 		${PREFIX}/bin/peps:${PORTSDIR}/graphics/peps \
 		${PREFIX}/bin/pngtopnm:${PORTSDIR}/graphics/netpbm \
-		${PREFIX}/bin/links:${PORTSDIR}/www/links1 \
+		${PREFIX}/bin/elinks:${PORTSDIR}/www/elinks \
 		${PREFIX}/bin/xsltproc:${PORTSDIR}/textproc/libxslt \
 		${PREFIX}/bin/scr2png:${PORTSDIR}/graphics/scr2png \
 		${PREFIX}/bin/scr2txt:${PORTSDIR}/textproc/scr2txt
>Release-Note:
>Audit-Trail:

From: Ceri Davies <ceri@FreeBSD.org>
To: Jeroen Ruigrok van der Werven <asmodai@nexus.ninth-circle.org>
Cc: FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: docs/50211: [PATCH] Fix textfile creation
Date: Sun, 23 Mar 2003 14:07:39 +0000

 On Sun, Mar 23, 2003 at 02:41:18PM +0100, Jeroen Ruigrok van der Werven wrote:
 
 > A person might want to have the latest links installed, however, this messes
 > up textfile creation from html files since the latest links version does not
 > support -dump anymore.
 
 We discussed this on -doc a month or so ago, and were generally thinking of
 going back to www/lynx, because this also gets localized text builds working.
 
 Would you happen to know if elinks has this advantage too ?
 
 Thanks,
 
 Ceri

From: Jeroen Ruigrok/asmodai <asmodai@wxs.nl>
To: Ceri Davies <ceri@FreeBSD.org>, FreeBSD-gnats-submit@FreeBSD.org
Cc:  
Subject: Re: docs/50211: [PATCH] Fix textfile creation
Date: Sun, 23 Mar 2003 18:03:54 +0100

 -On [20030323 15:07], Ceri Davies (ceri@FreeBSD.org) wrote:
 >We discussed this on -doc a month or so ago, and were generally thinking of
 >going back to www/lynx, because this also gets localized text builds working.
 
 The problem I had with lynx was that I was unable to make it parse
 book.html-text as text/html.
 w3m has a -T flag for this; elinks just looks at the file itself, or
 perhaps just assumes it is HTML.
 
 >Would you happen to know if elinks has this advantage too ?
 
 It does, but I don't know for certain which languages it works for:
 
 elinks -dump -dump-charset iso-8859-15 http://www.paris.fr/
 
 gives me acute accents, circumflexes, etc.
 
 I would be interested in hearing about non-Latin-based examples and how
 they work out.
 
 -- 
 Jeroen Ruigrok van der Werven <asmodai(at)wxs.nl> / asmodai / a capoeirista
 PGP fingerprint: 2D92 980E 45FE 2C28 9DB7  9D88 97E6 839B 2EAC 625B
 http://www.tendra.org/   | http://www.in-nomine.org/~asmodai/diary/
 A kiss is a lovely trick designed by nature to stop speech when words
 become superfluous...

From: Jeroen Ruigrok van der Werven <asmodai@in-nomine.org>
To: bug-followup@FreeBSD.org
Cc:  
Subject: Re: docs/50211: [PATCH] doc.docbook.mk: fix textfile creation
Date: Sun, 13 May 2007 16:59:23 +0200

 A long overdue update, I guess.
 
 Neither links nor elinks will help in the multibyte environments of Chinese,
 Japanese, Korean, and the like. They simply do not understand encodings such
 as EucJP, SJIS, GB18030, GB2312, EucKR, or UTF-8.
 
 Using www/w3m-m17n, I can at least view Japanese pages.
 Running 'w3m -dump http://website > dump.txt' on a EucJP-encoded page
 produces a UTF-8 encoded plain text file.
 
 The same also works for (X-)SJIS (Japanese), GB2312 (Chinese/PRC), EucKR
 (Korean), UTF-8, TIS-620 (Thai), Big5 (Taiwanese), VISCII (Vietnamese), and
 KOI8-U (Russian).
 
 I tried some ISO-8859 dumps as well (8859-6 and 8859-7, for example), and
 they all work fine.
 
 So my suggestion is to change HTML2TXT to use w3m from w3m-m17n.
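 
 A minimal sketch of what that change might look like in doc.docbook.mk and
 doc.html.mk. It assumes that w3m-m17n installs its binary as
 ${PREFIX}/bin/w3m (not verified against the port) and that the -T flag
 mentioned earlier in this trail is wanted to force the input type to
 text/html. It is an illustration only, not a tested patch:
 
 ```make
 # Hypothetical replacement for the links/elinks setting; the binary path
 # is an assumption about where the w3m-m17n port installs w3m.
 HTML2TXT?=	${PREFIX}/bin/w3m
 # -dump writes the rendered page to stdout; -T text/html tells w3m to
 # treat the input as HTML regardless of the file name.
 HTML2TXTOPTS?=	-dump -T text/html ${HTML2TXTFLAGS}
 ```
 
 The docproj port's dependency on links would presumably have to be swapped
 for one on www/w3m-m17n as well.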
 
 -- 
 Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
 イェルーン ラウフロック ヴァン デル ウェルヴェン
 http://www.in-nomine.org/ | http://www.rangaku.org/
 Reality is an illusion, grimmer. The dreamlands are like masks within
 masks, and Time has no dominion beyond the Shroud...
>Unformatted:
