
Ripnews

Ripnews is a bulk downloader for Usenet. It is quite flexible in terms of configuration. Some of its features are:

  • basic support for multiple servers per group
  • caching of article headers to speed up reading of newsgroups
  • newsrc file support (one newsrc file per server)
  • flexible but simple configuration

Configuration:

Below is a commented example config, which should be fairly self-explanatory. The possible options are listed after it.

# Set the default NNTPSERVER to localhost
NNTPSERVER=localhost
# If you need authentication
# NNTPSERVER=user:password@servername

# Set the cachedir, this is where the subject caches are stored
# without this ripnews will be much slower (but should still work)
CACHEDIR=/mnt/newspace/News/.ripnews_caches

# PID lockfile, prevents multiple ripnews processes from running at the
# same time [global keyword]
LOCKFILE=/local/newspace/News/.ripnewslock

# Set the datadir, this is where a subdir for each group will be made to
# store the ripped articles
DATADIR=/mnt/newspace/News

# Set the tempdir, used to store the undecoded data. Without this ripnews
# uses a lot more memory
TEMPDIR=/mnt/newspace/News/ripnews_temp

# Set include pattern to a case insensitive "grateful.dead"
OPT_I=(?i)grateful.dead

# Set the base newsrc name. The server name will be appended.
NEWSRCNAME=/ward/src/ruby/ripnews/.newsrc

# Set the permission to create subdirs with
PERMISSION=0700

# Set the niceness of the ripnews process [global keyword]
NICE=20

# For alt.binaries.e-book and alt.binaries.e-books change from defaults...
alt.binaries.e-book| \
alt.binaries.e-books {
	# Set another include pattern
	OPT_I=(?i)(bible|dickens|shakespeare)
}

alt.binaries.e-book.flood {
	# Add to default pattern, this will not be case insensitive
	# anymore, because that's how ruby patterns work
	OPT_I+=|george.orwell
}

# For all three of alt.binaries.e-book, alt.binaries.e-books and
# alt.binaries.e-book.flood, change some value
alt.binaries.e-book| \
alt.binaries.e-books| \
alt.binaries.e-book.flood {
	# Sets long filenames. If this is set the subject will be used
	# as a filename instead of the name specified in the encoding.
	OPT_L = true
}

# Change default server to news.tilbu1.nb.nl.home.com. Since the config
# is parsed in order, this will be used from here on down
NNTPSERVER=news.tilbu1.nb.nl.home.com

alt.binaries.music.classical| \
alt.binaries.sounds.lossless.classical| \
alt.binaries.sounds.mp3.classical {
	# Add news4.euro.net as a second server for
	# alt.binaries.music.classical,
	# alt.binaries.sounds.lossless.classical and
	# alt.binaries.sounds.mp3.classical
	NNTPSERVER+=|news4.euro.net
}

alt.binaries.music.classical| \
alt.binaries.sounds.lossless.classical| \
alt.binaries.sounds.mp3.classical {
	OPT_L=true
	OPT_I=(?i)( \
		verdi| \
		vivaldi| \
		mozart| \
		beethoven \
	)
}
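
The `+=` and backslash-continuation behaviour shown in the example config can be sketched in Ruby. This is only an illustration, not ripnews's actual parser: `join_continuations` and `apply_assignment` are hypothetical helpers, and stripping the whitespace around a trailing backslash is an assumption about how continued lines are joined.

```ruby
# Hypothetical helpers for illustration; not taken from ripnews.rb.

# Join lines that end in a backslash, dropping the surrounding whitespace
# (assumption: ripnews treats continued lines this way).
def join_continuations(text)
  text.gsub(/\s*\\\n\s*/, "")
end

# Apply one "KEY=value" or "KEY+=value" line to a settings hash.
# "+=" appends to whatever value the key already has.
def apply_assignment(settings, line)
  key, op, value = line.strip.match(/\A(\w+)\s*(\+?=)\s*(.*)\z/)&.captures
  return settings unless key

  settings[key] = op == "+=" ? settings.fetch(key, "") + value : value
  settings
end

defaults = { "OPT_I" => "(?i)grateful.dead" }
apply_assignment(defaults, "OPT_I+=|george.orwell")
puts defaults["OPT_I"]

puts join_continuations("OPT_I=(?i)( \\\n\tverdi| \\\n\tmozart \\\n)")
```

With these assumptions, the appended pattern simply becomes the default pattern followed by `|george.orwell`, which is why the comment in the example config warns about the scope of `(?i)`.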

Commandline options:

"-I", "--include"	Set include pattern.
"-c", "--configfile"	Specify a different config file. Default
			.ripnewsrc
"-g", "--group"		only rip specified group
"-h", "--help"		display this help and exit
"-l", "--list"		list configured groups and exit
"-L", "--longname"	Sets long filenames.
"-C", "--combinedname"	Sets combined filenames.
"-M", "--multipart"	Get multipart articles
"-s"			Exit silently if already running
"-S", "--singlepart"	Get singlepart articles
"-T", "--test"		Set test mode. Newsrc files will not be writen
			to.
"-X", "--exclude"	Set exclude pattern.

Supported config options:

OPT_I=<pattern>			Set include pattern.
OPT_IF=<pattern>		Set include from pattern. Filters on poster.
OPT_L=<bool>			Set long filenames.
OPT_C=<bool>			Sets combined filenames.
OPT_CP=<bool>			Sets poster combined filenames.
OPT_X=<pattern>			Set exclude pattern. Ripnews will read articles
				matching this pattern but it will not attempt
				to download them.
OPT_XF=<pattern>		Set exclude from pattern. Filters on posters.
OPT_MR=<pattern>		Set "mark read" pattern. Ripnews will place
				articles matching this pattern in your newsrc,
				afterwards they will never be present in memory
				again. Great for reducing memory usage when
				checking a group for the first time.
OPT_MRF=<pattern>		Set "mark read from" pattern. Filters on posters.
OPT_MRO=<days>			Set "mark read old". Filters posts older than the set number of days.
OPT_MRR=<bool>			Mark Remaining Read. If this is set to
				true and the article doesn't match an exclude or
				include pattern, the article will be
				marked as read. The purpose of this is
				to keep the caches of extremely large
				groups small, which makes processing
				quicker.
OPT_T=<bool>			Set test mode. Newsrc files will not be written
				to.
TEMPDIR=<dir>			Set tempdir location.
NNTPSERVER=<server>[|server]	Set NNTPSERVER names
				You can also use this notation:
				<user>:<pass>@<server> for each server
				if you need to authenticate by username
				and password.
CACHEDIR=<dir>			Set cachedir location.
DATADIR=<dir>			Set output dir location.
NEWSRCNAME=<newsrcbase>		Specify newsrc basename. Server names
				will be appended.
PERMISSION=<perm>		Set permission bits for directory
				creation. Standard Unix style, e.g. 0755.
EXTENSIONS=<pattern>		Set extension include pattern.
OPT_M=<pattern>			Set EXTENSIONS just for multi part messages.
OPT_S=<pattern>			Set EXTENSIONS just for single part messages.
DELEXT=<pattern>		Set extension "mark read" pattern.
OPT_MD=<pattern>		Set DELEXT just for multi part messages.
OPT_SD=<pattern>		Set DELEXT just for single part messages.
INCLUDEFILE=<file>		Include another file, only works in main config.
PRIMARYTHRES=<int>		At least this percentage of the post has to be found
				on the first server.
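
As an illustration of the NNTPSERVER notation listed above, the following Ruby sketch splits a value into its servers and optional credentials. `ServerSpec` and `parse_nntpserver` are hypothetical names, and splitting on the last `@` (so that a password may itself contain `@`) is an assumption; ripnews may parse the value differently.

```ruby
# Hypothetical helper for illustration; not taken from ripnews.rb.
ServerSpec = Struct.new(:host, :user, :password)

# Split "a|b|c" into servers, and "user:pass@host" into its parts.
def parse_nntpserver(value)
  value.split("|").map do |entry|
    credentials, at, host = entry.rpartition("@")
    if at.empty?
      # No "@" present: the whole entry is the server name.
      ServerSpec.new(entry, nil, nil)
    else
      user, password = credentials.split(":", 2)
      ServerSpec.new(host, user, password)
    end
  end
end

parse_nntpserver("user:password@servername|news4.euro.net").each do |server|
  puts "#{server.host} (authenticated: #{!server.user.nil?})"
end
```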

Ruby patterns:

Ruby patterns are a lot like Perl patterns, but there are some differences. (?i) is the modifier that turns on case insensitivity; unlike in Perl, this modifier only applies to the block that follows it. Luckily, you can group multiple blocks into one by enclosing them in parentheses. So while 'OPT_I=(?i)foo|bar' would match 'foo' case-insensitively and 'bar' case-sensitively, 'OPT_I=(?i)(foo|bar)' will match both 'foo' and 'bar' case-insensitively.
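
The grouped form can be checked directly in Ruby. The sketch below only exercises the parenthesised pattern, whose behaviour is unambiguous; it makes no claim about how an ungrouped (?i) is scoped in any particular Ruby version. `include_match?` is a hypothetical helper, not part of ripnews.

```ruby
# Compile a pattern as it would appear after "OPT_I=" in the config
# and test it against a subject line.
# include_match? is a hypothetical helper used only for this illustration.
def include_match?(pattern_source, subject)
  Regexp.new(pattern_source).match?(subject)
end

grouped = "(?i)(foo|bar)"
%w[foo FOO bar BAR baz].each do |subject|
  puts "#{subject}: #{include_match?(grouped, subject)}"
end
```

Every subject except `baz` matches, regardless of case.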

Caveat: if you end a list of patterns with a | (for instance OPT_X=(?i)(foo|bar|)), the pattern will also match the empty string. As a result you may exclude everything if you use it with OPT_X, or include everything with OPT_I. You have been warned.
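
The effect of a trailing | is easy to reproduce in Ruby: the empty alternative matches at the start of any string. `pattern_hits?` is a hypothetical helper for this demonstration only.

```ruby
# pattern_hits? is a hypothetical helper used only for this illustration.
def pattern_hits?(pattern_source, subject)
  Regexp.new(pattern_source).match?(subject)
end

subject = "completely unrelated subject"

# Trailing "|" adds an empty alternative, so this matches anything.
puts pattern_hits?("(?i)(foo|bar|)", subject)

# Without the trailing "|" the same subject is not matched.
puts pattern_hits?("(?i)(foo|bar)", subject)
```

The first call prints `true` and the second `false`.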

Other features:

You can make a running ripnews process reread its configuration by sending it a SIGHUP.
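
From Ruby, the signal can be sent with `Process.kill`. The sketch below assumes that the file named by LOCKFILE contains the PID of the running process as plain text; that file format is an assumption, and `read_pid` and `request_reload` are hypothetical helpers.

```ruby
# Hypothetical helpers for illustration; not taken from ripnews.rb.

# Assumption: the lockfile holds the PID as a decimal number.
def read_pid(lockfile_path)
  Integer(File.read(lockfile_path).strip, 10)
end

# Send SIGHUP to the process named in the lockfile and return its PID.
def request_reload(lockfile_path)
  pid = read_pid(lockfile_path)
  Process.kill("HUP", pid)
  pid
end

# Example call, using the LOCKFILE path from the example config:
# request_reload("/local/newspace/News/.ripnewslock")
```

From a shell, `kill -HUP <pid>` does the same thing.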

Where can I find news servers:

  • freenews.maxbaud.net
  • www.newzbot.com
  • www.gj.net/~bhkraft

Known bugs:

There are no known bugs at this moment. If you find any, please let me know. As with all my software, if it breaks you get to keep both pieces.

Credits:

  • Stijn Hoop for adding yEnc support

Contact info:

New problems can be reported directly to me at ward@wouts.nl. Patches welcome ;)

Ward Wouts