# $Dwarf: README,v 1.20 2005/03/01 09:18:25 ward Exp $
# $Source$

Ripnews is a bulk downloader for Usenet. It's quite flexible in terms of
configuration. Some of its features are:

- basic support for multiple servers per group
- caching of article headers to speed up reading of newsgroups
- newsrc file support (one newsrc file per server)
- flexible but simple configuration

Configuration:
==============

I'll just give a commented example config, which should be pretty clear.
After that I'll list the possible options.

<== cut here ==>
# Set the default NNTPSERVER to localhost
NNTPSERVER=localhost

# Set the cachedir; this is where the subject caches are stored.
# Without this ripnews will be much slower (but should still work)
CACHEDIR=/mnt/newspace/News/.ripnews_caches

# PID lockfile, prevents multiple ripnews processes from running at the
# same time [global keyword]
LOCKFILE=/local/newspace/News/.ripnewslock

# Set the datadir; this is where a subdir for each group will be made to
# store the ripped articles
DATADIR=/mnt/newspace/News

# Set the tempdir, used to store the undecoded data. Without this ripnews
# uses a lot more memory
TEMPDIR=/mnt/newspace/News/ripnews_temp

# Set include pattern to a case insensitive "grateful.dead"
OPT_I=(?i)grateful.dead

# Set the base newsrc name. The server name will be appended.
NEWSRCNAME=/ward/src/ruby/ripnews/.newsrc

# Set the permission to create subdirs with
PERMISSION=0700

# Set the niceness of the ripnews process [global keyword]
NICE=20

# For alt.binaries.e-book and alt.binaries.e-books change from defaults...
alt.binaries.e-book| \
alt.binaries.e-books {
	# Set another include pattern
	OPT_I=(?i)(bible|dickens|shakespeare)
}

alt.binaries.e-book.flood {
	# Add to default pattern, this will not be case insensitive
	# anymore, because that's how ruby patterns work
	OPT_I+=|george.orwell
}

# For all of alt.binaries.e-book, alt.binaries.e-books and
# alt.binaries.e-book.flood change some value
alt.binaries.e-book| \
alt.binaries.e-books| \
alt.binaries.e-book.flood {
	# Sets long filenames. If this is set the subject will be used
	# as a filename instead of the name specified in the encoding.
	OPT_L = true
}

# Change default server to news.tilbu1.nb.nl.home.com. Since the config
# is parsed in order, this will be used from here on down
NNTPSERVER=news.tilbu1.nb.nl.home.com

alt.binaries.music.classical| \
alt.binaries.sounds.lossless.classical| \
alt.binaries.sounds.mp3.classical {
	# Add news4.euro.net as a second server for
	# alt.binaries.music.classical,
	# alt.binaries.sounds.lossless.classical and
	# alt.binaries.sounds.mp3.classical
	NNTPSERVER+=|news4.euro.net
}

alt.binaries.music.classical| \
alt.binaries.sounds.lossless.classical| \
alt.binaries.sounds.mp3.classical {
	OPT_L=true
	OPT_I=(?i)( \
		verdi| \
		vivaldi| \
		mozart| \
		beethoven \
	)
}
<== cut here ==>
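
The "+=" form in the example above appears to append text to the value
inherited from the defaults. A minimal Ruby sketch of that idea follows.
It assumes "+=" is plain string concatenation; the helper apply_setting
is hypothetical and is not taken from ripnews.rb.

```ruby
# Hypothetical helper, not part of ripnews: applies one "KEY=value" or
# "KEY+=value" line to a settings hash, assuming "+=" means plain
# string concatenation onto the inherited value.
def apply_setting(settings, line)
  key, op, value = line.strip.match(/\A(\w+)\s*(\+?=)\s*(.*)\z/).captures
  if op == "+="
    settings[key] = settings.fetch(key, "") + value
  else
    settings[key] = value
  end
  settings
end

# Start from the default include pattern and apply the group override.
defaults = { "OPT_I" => "(?i)grateful.dead" }
group = apply_setting(defaults.dup, "OPT_I+=|george.orwell")
puts group["OPT_I"]
```

Under that assumption the group ends up with the single pattern string
"(?i)grateful.dead|george.orwell", which is why the appended text has to
start with a "|".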

Supported commandline options:
------------------------------

"-I", "--include"	Set include pattern.
"-c", "--configfile"	Specify a different config file. Default
			.ripnewsrc
"-L", "--longname"	Sets long filenames.
"-C", "--combinedname"	Sets combined filenames.
"-M", "--multipart"	Get multipart articles
"-s"			Exit silently if already running
"-S", "--singlepart"	Get singlepart articles
"-T", "--test"		Set test mode. Newsrc files will not be written
			to.
"-X", "--exclude"	Set exclude pattern.

Supported config options:
-------------------------

OPT_I=<pattern>			Set include pattern.
OPT_IF=<pattern>		Set include from pattern. Filters on poster.
OPT_L=<bool>			Set long filenames.
OPT_C=<bool>			Sets combined filenames.
OPT_CP=<bool>			Sets poster combined filenames.
OPT_X=<pattern>			Set exclude pattern. Ripnews will read articles
				matching this pattern but it will not attempt
				to download them.
OPT_XF=<pattern>		Set exclude from pattern. Filters on posters.
OPT_MR=<pattern>		Set "mark read" pattern. Ripnews will place
				articles matching this pattern in your newsrc,
				afterwards they will never be present in memory
				again. Great for reducing memory usage when
				checking a group for the first time.
OPT_MRF=<pattern>		Set "mark read from" pattern. Filters on posters.
OPT_MRR=<bool>			Mark Remaining Read. If this is set to
				true and the article doesn't match an exclude or
				include pattern, the article will be
				marked as read. The purpose of this is
				to keep the caches of extremely large
				groups small as to make processing
				quicker.
OPT_T=<bool>			Set test mode. Newsrc files will not be written
				to.
TEMPDIR=<dir>			Set tempdir location.
NNTPSERVER=<server>[|server]	Set NNTPSERVER names
				You can also use this notation:
				<user>:<pass>@<server> for each server
				if you need to authenticate by username
				and password.
CACHEDIR=<dir>			Set cachedir location.
DATADIR=<dir>			Set output dir location.
NEWSRCNAME=<newsrcbase>		Specify newsrc basename. Server names
				will be appended.
PERMISSION=<perm>		Set permission bits for directory
				creation. Standard unix style, eg. 0755.
EXTENSIONS=<pattern>		Set extension include pattern.
OPT_M=<pattern>			Set EXTENSIONS just for multi part messages.
OPT_S=<pattern>			Set EXTENSIONS just for single part messages.
DELEXT=<pattern>		Set extension "mark read" pattern.
OPT_MD=<pattern>		Set DELEXT just for multi part messages.
OPT_SD=<pattern>		Set DELEXT just for single part messages.
INCLUDEFILE=<file>		Include another file, only works in main config.
PRIMARYTHRES=<int>		At least this percentage of the post has to be found
				on the first server.
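
To illustrate the NNTPSERVER notation described above, here is a rough
Ruby sketch that splits a value on "|" and pulls out the optional
<user>:<pass>@ prefix. The parse_servers helper, the Server struct and
the sample credentials are made up for this example, and splitting on
the last "@" is an assumption; ripnews.rb may parse the value differently.

```ruby
# Hypothetical parser, not taken from ripnews.rb: splits an NNTPSERVER
# value on "|" and extracts optional "user:pass@" credentials.
Server = Struct.new(:host, :user, :pass)

def parse_servers(value)
  value.split("|").reject(&:empty?).map do |entry|
    # Assumption: split on the last "@" so a password may contain "@".
    creds, at, host = entry.rpartition("@")
    if at.empty?
      # No "@" present: the whole entry is the host name.
      Server.new(entry, nil, nil)
    else
      user, pass = creds.split(":", 2)
      Server.new(host, user, pass)
    end
  end
end

# Sample value: one server with made-up credentials, one without.
parse_servers("alice:secret@news.example.org|news4.euro.net").each do |s|
  puts "#{s.host} user=#{s.user.inspect}"
end
```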

Ruby patterns:
--------------

Ruby patterns are a lot like Perl patterns, but there are some
differences. (?i) is the modifier to turn on case insensitivity. Unlike
in Perl, this modifier only works on the following block. Luckily you can
group multiple blocks into one by enclosing them in ()'s. So while
'OPT_I=(?i)foo|bar' would match 'foo' case insensitively and 'bar' case
sensitively, 'OPT_I=(?i)(foo|bar)' will match both 'foo' and 'bar' case
insensitively.

Caveat: if for some reason you use a | at the end of a list of patterns
(for instance: OPT_X=(?i)(foo|bar|) ) the pattern will also match an
empty string.  This can have the result that you exclude everything if
you use it with OPT_X, or include everything with OPT_I. You have been
warned.
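
The trailing "|" problem is easy to check in plain Ruby. The sketch below
does not use ripnews itself; it only builds the two patterns with
Regexp.new and tests them against an arbitrary, made-up subject line.

```ruby
# The trailing "|" adds an empty alternative, which matches at position 0
# of any string, so the pattern as a whole matches everything.
with_trailing_pipe = Regexp.new("(?i)(foo|bar|)")
without_trailing_pipe = Regexp.new("(?i)(foo|bar)")

# Made-up subject that contains neither "foo" nor "bar".
subject = "completely unrelated subject line"
puts with_trailing_pipe.match?(subject)
puts without_trailing_pipe.match?(subject)
```

The first pattern matches even though the subject contains neither word,
while the second does not. Running a new pattern through a quick check
like this before putting it in OPT_X or OPT_I can catch the mistake.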

Other features:
===============

You can make a running ripnews process reread its configuration by
sending it a SIGHUP.
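
From a shell this is normally done with "kill -HUP <pid>". For readers
curious how such a reload is commonly wired up in Ruby, here is a minimal
sketch for a Unix-like system. It is an assumption about the general
technique, not a copy of what ripnews.rb does; reload_config is a
stand-in that only counts calls.

```ruby
# Hypothetical sketch of a SIGHUP-triggered reload; ripnews.rb may do
# this differently. The handler only sets a flag, and the main loop
# performs the actual reload at a safe point.
reload_requested = false
Signal.trap("HUP") { reload_requested = true }

# Stand-in for re-reading the config file; it just counts reloads.
reloads = 0
reload_config = -> { reloads += 1 }

# Simulate an external "kill -HUP <pid>" by signalling ourselves.
Process.kill("HUP", Process.pid)

# Main loop: poll the flag between units of work.
20.times do
  if reload_requested
    reload_requested = false
    reload_config.call
    break
  end
  sleep 0.05
end
puts "config reloaded #{reloads} time(s)"
```

Keeping the handler down to setting a flag avoids doing file I/O inside
the signal handler itself.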

Where can I find newsservers:
=============================
freenews.maxbaud.net
www.newzbot.com
www.gj.net/~bhkraft

Known bugs:
===========

There are no known bugs at this moment. If you find any, please let me
know. As with all my software, if it breaks you get to keep _both_
pieces.

Credits:
========
- Stijn Hoop for adding yEnc support

Contact info:
=============

New problems can be reported directly to me at <ward@wouts.nl>. Patches
welcome ;)

Ward Wouts