# Ripnews

Ripnews is a bulk downloader for Usenet. It is quite flexible in terms of
configuration. Some of its features are:

- basic support for multiple servers per group
- caching of article headers to speed up reading of newsgroups
- newsrc file support (one newsrc file per server)
- flexible but simple configuration

## Configuration:

I'll just give a commented example config, which should be pretty clear;
after that I'll list the possible options.

```
# Set the default NNTPSERVER to localhost
NNTPSERVER=localhost
# If you need authentication
# NNTPSERVER=user:password@servername

# Set the cachedir, this is where the subject caches are stored
# without this ripnews will be much slower (but should still work)
CACHEDIR=/mnt/newspace/News/.ripnews_caches

# PID lockfile, prevents multiple ripnews processes from running at the
# same time [global keyword]
LOCKFILE=/local/newspace/News/.ripnewslock

# Set the datadir, this is where a subdir for each group will be made to
# store the ripped articles
DATADIR=/mnt/newspace/News

# Set the tempdir, used to store the undecoded data. Without this ripnews
# uses a lot more memory
TEMPDIR=/mnt/newspace/News/ripnews_temp

# Set include pattern to a case insensitive "grateful.dead"
OPT_I=(?i)grateful.dead

# Set the base newsrc name. The server name will be appended.
NEWSRCNAME=/ward/src/ruby/ripnews/.newsrc

# Set the permission to create subdirs with
PERMISSION=0700

# Set the niceness of the ripnews process [global keyword]
NICE=20

# For alt.binaries.e-book and alt.binaries.e-books change from defaults...
alt.binaries.e-book| \
alt.binaries.e-books {
# Set another include pattern
OPT_I=(?i)(bible|dickens|shakespeare)
}

alt.binaries.e-book.flood {
# Add to default pattern, this will not be case insensitive
# anymore, because that's how ruby patterns work
OPT_I+=|george.orwell
}

# For all of alt.binaries.e-book, alt.binaries.e-books and
# alt.binaries.e-book.flood change some value
alt.binaries.e-book| \
alt.binaries.e-books| \
alt.binaries.e-book.flood {
# Sets long filenames. If this is set the subject will be used
# as a filename instead of the name specified in the encoding.
OPT_L = true
}

# Change default server to news.tilbu1.nb.nl.home.com, since the config
# is parsed in order this will be used from here on down
NNTPSERVER=news.tilbu1.nb.nl.home.com

alt.binaries.music.classical| \
alt.binaries.sounds.lossless.classical| \
alt.binaries.sounds.mp3.classical {
# Add news4.euro.net as a second server for
# alt.binaries.music.classical,
# alt.binaries.sounds.lossless.classical and
# alt.binaries.sounds.mp3.classical
NNTPSERVER+=|news4.euro.net
}

alt.binaries.music.classical| \
alt.binaries.sounds.lossless.classical| \
alt.binaries.sounds.mp3.classical {
OPT_L=true
OPT_I=(?i)( \
verdi| \
vivaldi| \
mozart| \
beethoven \
)
}
```
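The NNTPSERVER values in the example combine two notations: `|` separates servers (which is why `NNTPSERVER+=|news4.euro.net` adds a second one) and `user:password@servername` carries credentials. The sketch below shows one way such a value could be taken apart. `parse_servers`, `ServerSpec` and `news.example.org` are hypothetical names used for illustration only; they are not taken from the ripnews source, and ripnews itself may parse the value differently.

```ruby
# Hypothetical helper, not taken from the ripnews source: split an
# NNTPSERVER value into one entry per server.
ServerSpec = Struct.new(:host, :user, :password)

def parse_servers(value)
  # "|" separates servers; drop empty entries such as a leading "|".
  value.split("|").reject(&:empty?).map do |entry|
    # Split on the last "@" so a password may itself contain "@".
    credentials, at, host = entry.rpartition("@")
    if at.empty?
      # No "@" present: the whole entry is the host name.
      ServerSpec.new(entry, nil, nil)
    else
      user, password = credentials.split(":", 2)
      ServerSpec.new(host, user, password)
    end
  end
end

# The default server followed by the value appended with "+=".
combined = "news.tilbu1.nb.nl.home.com" + "|news4.euro.net"
parse_servers(combined).each do |server|
  puts "#{server.host} (auth: #{server.user ? 'yes' : 'no'})"
end

# The authenticated notation from the example config.
auth = parse_servers("user:password@news.example.org").first
puts "#{auth.host} (auth: #{auth.user ? 'yes' : 'no'})"
```

Splitting on the last `@` rather than the first is a design choice of this sketch, made so that passwords containing `@` still parse.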

## Commandline options:

```
"-I", "--include"       Set include pattern.
"-c", "--configfile"    Specify a different config file. Default:
                        .ripnewsrc
"-g", "--group"         Only rip specified group.
"-h", "--help"          Display this help and exit.
"-l", "--list"          List configured groups and exit.
"-L", "--longname"      Set long filenames.
"-C", "--combinedname"  Set combined filenames.
"-M", "--multipart"     Get multipart articles.
"-s"                    Exit silently if already running.
"-S", "--singlepart"    Get singlepart articles.
"-T", "--test"          Set test mode. Newsrc files will not be written
                        to.
"-X", "--exclude"       Set exclude pattern.
```

## Supported config options:

```
OPT_I=<pattern>               Set include pattern.
OPT_IF=<pattern>              Set include from pattern. Filters on poster.
OPT_L=<bool>                  Set long filenames.
OPT_C=<bool>                  Set combined filenames.
OPT_CP=<bool>                 Set poster combined filenames.
OPT_X=<pattern>               Set exclude pattern. Ripnews will read articles
                              matching this pattern but it will not attempt
                              to download them.
OPT_XF=<pattern>              Set exclude from pattern. Filters on poster.
OPT_MR=<pattern>              Set "mark read" pattern. Ripnews will place
                              articles matching this pattern in your newsrc;
                              afterwards they will never be present in memory
                              again. Great for reducing memory usage when
                              checking a group for the first time.
OPT_MRF=<pattern>             Set "mark read from" pattern. Filters on poster.
OPT_MRO=<days>                Set "mark read old". Filters posts older than
                              the set number of days.
OPT_MRR=<bool>                Mark Remaining Read. If this is set to true and
                              the article doesn't match an exclude or include
                              pattern, the article will be marked as read. The
                              purpose of this is to keep the caches of
                              extremely large groups small, so as to make
                              processing quicker.
OPT_T=<bool>                  Set test mode. Newsrc files will not be written
                              to.
TEMPDIR=<dir>                 Set tempdir location.
NNTPSERVER=<server>[|server]  Set NNTPSERVER names.
                              You can also use this notation:
                              <user>:<pass>@<server> for each server
                              if you need to authenticate by username
                              and password.
CACHEDIR=<dir>                Set cachedir location.
DATADIR=<dir>                 Set output dir location.
NEWSRCNAME=<newsrcbase>       Specify newsrc basename. Server names
                              will be appended.
PERMISSION=<perm>             Set permission bits for directory
                              creation. Standard unix style, e.g. 0755.
EXTENSIONS=<pattern>          Set extension include pattern.
OPT_M=<pattern>               Set EXTENSIONS just for multipart messages.
OPT_S=<pattern>               Set EXTENSIONS just for singlepart messages.
DELEXT=<pattern>              Set extension "mark read" pattern.
OPT_MD=<pattern>              Set DELEXT just for multipart messages.
OPT_SD=<pattern>              Set DELEXT just for singlepart messages.
INCLUDEFILE=<file>            Include another file; only works in main config.
PRIMARYTHRES=<int>            At least this percentage of the post has to be
                              found on the first server.
```
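PERMISSION is written in the usual octal notation. As an illustration of what "standard unix style" means in Ruby terms, the sketch below converts such a string to an integer mode and creates a directory with it. The temporary directory and the group directory name are assumptions made for the example; how ripnews itself applies the value is not shown here.

```ruby
require "tmpdir"

# PERMISSION is written as an octal string; String#oct converts it to
# the integer mode that Dir.mkdir expects.
mode = "0700".oct

Dir.mktmpdir do |datadir|
  # Hypothetical group directory below a stand-in for DATADIR.
  groupdir = File.join(datadir, "alt.binaries.e-book")
  Dir.mkdir(groupdir, mode)

  # The effective mode is the requested mode with the umask applied.
  effective = File.stat(groupdir).mode & 0o777
  printf("%s created with mode %04o\n", File.basename(groupdir), effective)
end
```

Note that the process umask can clear bits from the requested mode, so a value such as 0777 will usually not result in a world-writable directory.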

## Ruby patterns:

Ruby patterns are a lot like Perl patterns, but there are some
differences. (?i) is the modifier that turns on case insensitivity; unlike
in Perl, this modifier only works on the following block. Luckily you can
group multiple blocks into one by enclosing them in parentheses. So while
'OPT_I=(?i)foo|bar' would match 'foo' case insensitively and 'bar' case
sensitively, 'OPT_I=(?i)(foo|bar)' will match both 'foo' and 'bar' case
insensitively.

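A quick way to check how a pattern behaves on your own Ruby installation is to try it against a few subjects. The sketch below compares the grouped and ungrouped forms from the paragraph above. The grouped form is case insensitive for both words; for the ungrouped form, the result on 'BAR' is the interesting line, and since regular expression engines have changed between Ruby versions it is worth running this rather than assuming the outcome.

```ruby
# The two patterns discussed above, built from strings the same way a
# config value would be.
grouped   = Regexp.new("(?i)(foo|bar)")
ungrouped = Regexp.new("(?i)foo|bar")

# Print, for every subject, whether each pattern matches it.
%w[foo FOO bar BAR].each do |subject|
  puts format("%-4s grouped=%-5s ungrouped=%s",
              subject, grouped.match?(subject), ungrouped.match?(subject))
end
```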

Caveat: if for some reason you use a | at the end of a list of patterns
(for instance: OPT_X=(?i)(foo|bar|) ) the pattern will also match an
empty string. As a result you will exclude everything if you use it with
OPT_X, or include everything if you use it with OPT_I. You have been
warned.

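The caveat about a trailing | can be tested before a pattern goes into the config: a pattern that matches the empty string matches every subject. The helper below is a hypothetical sanity check written for this example and is not part of ripnews.

```ruby
# Hypothetical sanity check, not part of ripnews: a pattern that matches
# the empty string will match every subject line.
def matches_everything?(pattern)
  Regexp.new(pattern).match?("")
end

["(?i)(foo|bar|)", "(?i)(foo|bar)"].each do |pattern|
  verdict = matches_everything?(pattern) ? "matches everything" : "ok"
  puts "#{pattern}: #{verdict}"
end
```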

## Other features:

You can make a running ripnews process reread its configuration by
sending it a SIGHUP.

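Sending the signal from Ruby is a single call to Process.kill. The sketch below is self-contained: it installs its own HUP handler and signals itself, standing in for a long-running process. In real use the PID would be that of the running ripnews process; whether the LOCKFILE contains that PID is an assumption you would need to verify, and the handler shown here is only an illustration, not ripnews's own.

```ruby
# Stand-in for a long-running process that reloads on SIGHUP.
reload_requested = false
Signal.trap("HUP") { reload_requested = true }

# In practice the PID would belong to the running ripnews process; here
# the script signals itself so the example is self-contained.
Process.kill("HUP", Process.pid)

# Give the interpreter a moment to run the handler.
10.times do
  break if reload_requested
  sleep 0.05
end

puts "reload requested: #{reload_requested}"
```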

## Where can I find news servers:

- freenews.maxbaud.net
- www.newzbot.com
- www.gj.net/~bhkraft

## Known bugs:

There are no known bugs at this moment. If you find any, please let me
know. As with all my software, if it breaks you get to keep _both_
pieces.

## Credits:

- Stijn Hoop for adding yEnc support

## Contact info:

New problems can be reported directly to me at <ward@wouts.nl>. Patches
welcome ;)

Ward Wouts