# Ripnews
Ripnews is a bulk downloader for Usenet. It's quite flexible in terms of
configuration. Some of its features are:
- basic support for multiple servers per group
- caching of article headers to speed up reading of newsgroups
- newsrc file support (one newsrc file per server)
- flexible but simple configuration
## Configuration:
Below is a commented example config, which should be pretty clear.
After that I'll list the possible options.
```
# Set the default NNTPSERVER to localhost
NNTPSERVER=localhost
# If you need authentication
# NNTPSERVER=user:password@servername
# Set the cachedir, this is where the subject caches are stored
# without this ripnews will be much slower (but should still work)
CACHEDIR=/mnt/newspace/News/.ripnews_caches
# PID lockfile, prevents multiple ripnews processes from running at the
# same time [global keyword]
LOCKFILE=/local/newspace/News/.ripnewslock
# Set the datadir, this is where a subdir for each group will be made to
# store the ripped articles
DATADIR=/mnt/newspace/News
# Set the tempdir, used to store the undecoded data. Without this ripnews
# uses a lot more memory
TEMPDIR=/mnt/newspace/News/ripnews_temp
# Set include pattern to a case insensitive "grateful.dead"
OPT_I=(?i)grateful.dead
# Set the base newsrc name. The server name will be appended.
NEWSRCNAME=/ward/src/ruby/ripnews/.newsrc
# Set the permission to create subdirs with
PERMISSION=0700
# Set the niceness of the ripnews process [global keyword]
NICE=20
# For alt.binaries.e-book and alt.binaries.e-books change from defaults...
alt.binaries.e-book| \
alt.binaries.e-books {
# Set another include pattern
OPT_I=(?i)(bible|dickens|shakespeare)
}
alt.binaries.e-book.flood {
# Add to default pattern, this will not be case insensitive
# anymore, because that's how ruby patterns work
OPT_I+=|george.orwell
}
# For alt.binaries.e-book, alt.binaries.e-books and
# alt.binaries.e-book.flood change some value
alt.binaries.e-book| \
alt.binaries.e-books| \
alt.binaries.e-book.flood {
# Sets long filenames. If this is set the subject will be used
# as a filename instead of the name specified in the encoding.
OPT_L = true
}
# Change default server to news.tilbu1.nb.nl.home.com. Since the config
# is parsed in order, this will be used from here on down
NNTPSERVER=news.tilbu1.nb.nl.home.com
alt.binaries.music.classical| \
alt.binaries.sounds.lossless.classical| \
alt.binaries.sounds.mp3.classical {
# Add news4.euro.net as a second server for
# alt.binaries.music.classical,
# alt.binaries.sounds.lossless.classical and
# alt.binaries.sounds.mp3.classical
NNTPSERVER+=|news4.euro.net
}
alt.binaries.music.classical| \
alt.binaries.sounds.lossless.classical| \
alt.binaries.sounds.mp3.classical {
OPT_L=true
OPT_I=(?i)( \
verdi| \
vivaldi| \
mozart| \
beethoven \
)
}
```
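
To make the NNTPSERVER notation concrete, here is a minimal Ruby sketch of how such a value could be split into servers and optional credentials. It is not taken from the ripnews source: `parse_nntpserver` and the `Server` struct are hypothetical names, and the sketch assumes that `|` separates servers and that credentials sit before the last `@` of an entry.

```ruby
# Hypothetical helper, not part of ripnews: split an NNTPSERVER value
# such as "user:password@servername|otherserver" into its parts.
Server = Struct.new(:host, :user, :password)

def parse_nntpserver(value)
  value.split("|").map do |entry|
    # Credentials, if present, come before the last "@" in the entry.
    creds, at, host = entry.rpartition("@")
    if at.empty?
      # No "@" found, so the whole entry is the host name.
      Server.new(entry, nil, nil)
    else
      user, password = creds.split(":", 2)
      Server.new(host, user, password)
    end
  end
end

# The value that results from NNTPSERVER=news.tilbu1.nb.nl.home.com
# followed by NNTPSERVER+=|news4.euro.net in the example above.
servers = parse_nntpserver("news.tilbu1.nb.nl.home.com|news4.euro.net")
servers.each { |s| puts "#{s.host} (auth: #{s.user ? 'yes' : 'no'})" }
```

The servers come out in the order they were written, which matters if the first one is treated as the primary server.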
## Command-line options:
```
"-I", "--include"       Set include pattern.
"-c", "--configfile"    Specify a different config file. Default
                        .ripnewsrc
"-g", "--group"         only rip specified group
"-h", "--help"          display this help and exit
"-l", "--list"          list configured groups and exit
"-L", "--longname"      Sets long filenames.
"-C", "--combinedname"  Sets combined filenames.
"-M", "--multipart"     Get multipart articles
"-s"                    Exit silently if already running
"-S", "--singlepart"    Get singlepart articles
"-T", "--test"          Set test mode. Newsrc files will not be written
                        to.
"-X", "--exclude"       Set exclude pattern.
```
## Supported config options:
```
OPT_I=<pattern>               Set include pattern.
OPT_IF=<pattern>              Set include from pattern. Filters on poster.
OPT_L=<bool>                  Set long filenames.
OPT_C=<bool>                  Sets combined filenames.
OPT_CP=<bool>                 Sets poster combined filenames.
OPT_X=<pattern>               Set exclude pattern. Ripnews will read articles
                              matching this pattern but it will not attempt
                              to download them.
OPT_XF=<pattern>              Set exclude from pattern. Filters on posters.
OPT_MR=<pattern>              Set "mark read" pattern. Ripnews will place
                              articles matching this pattern in your newsrc,
                              afterwards they will never be present in memory
                              again. Great for reducing memory usage when
                              checking a group for the first time.
OPT_MRF=<pattern>             Set "mark read from" pattern. Filters on posters.
OPT_MRO=<days>                Set "mark read old". Filters posts older than
                              the set number of days.
OPT_MRR=<bool>                Mark Remaining Read. If this is set to
                              true and the article doesn't match an exclude or
                              include pattern, the article will be
                              marked as read. The purpose of this is
                              to keep the caches of extremely large
                              groups small so as to make processing
                              quicker.
OPT_T=<bool>                  Set test mode. Newsrc files will not be written
                              to.
TEMPDIR=<dir>                 Set tempdir location.
NNTPSERVER=<server>[|server]  Set NNTPSERVER names.
                              You can also use this notation:
                              <user>:<pass>@<server> for each server
                              if you need to authenticate by username
                              and password.
CACHEDIR=<dir>                Set cachedir location.
DATADIR=<dir>                 Set output dir location.
NEWSRCNAME=<newsrcbase>       Specify newsrc basename. Server names
                              will be appended.
PERMISSION=<perm>             Set permission bits for directory
                              creation. Standard unix style, e.g. 0755.
EXTENSIONS=<pattern>          Set extension include pattern.
OPT_M=<pattern>               Set EXTENSIONS just for multi part messages.
OPT_S=<pattern>               Set EXTENSIONS just for single part messages.
DELEXT=<pattern>              Set extension "mark read" pattern.
OPT_MD=<pattern>              Set DELEXT just for multi part messages.
OPT_SD=<pattern>              Set DELEXT just for single part messages.
INCLUDEFILE=<file>            Include another file, only works in main config.
PRIMARYTHRES=<int>            At least this percentage of the post has to be found
                              on the first server.
```
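
As an illustration of the "standard unix style" PERMISSION value, the following Ruby sketch reads such a string as octal and uses it when creating a group subdirectory. This is only a sketch under assumptions: `permission_bits` is a hypothetical helper, the directory is made in a throwaway temp dir rather than a real DATADIR, and the resulting mode is still subject to your umask.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: interpret a PERMISSION value such as "0700"
# as an octal number.
def permission_bits(value)
  Integer(value, 8)
end

# Create a per-group subdirectory with those bits inside a temp dir
# that is removed again when the block ends.
Dir.mktmpdir do |base|
  group_dir = File.join(base, "alt.binaries.e-book")
  FileUtils.mkdir_p(group_dir, mode: permission_bits("0700"))
  # Print the permission bits that were actually applied, in octal.
  printf("%o\n", File.stat(group_dir).mode & 0o777)
end
```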
## Ruby patterns:
Ruby patterns are a lot like Perl patterns, but there are some
differences. (?i) is the modifier to turn on case insensitivity; unlike
Perl, this modifier only works on the following block. Luckily you can
group multiple blocks into one by enclosing them in ()'s. So while
'OPT_I=(?i)foo|bar' would match 'foo' case insensitively and 'bar' case
sensitively, 'OPT_I=(?i)(foo|bar)' will match both 'foo' and 'bar' case
insensitively.

Caveat: if for some reason you use a | at the end of a list of patterns
(for instance: OPT_X=(?i)(foo|bar|) ) the pattern will also match an
empty string. This can have the result that you exclude everything if
you use it with OPT_X, or include everything with OPT_I. You have been
warned.
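
The grouped form and the trailing-| caveat can be checked quickly in Ruby. This is a minimal sketch that assumes ripnews hands the option value to `Regexp.new` unchanged, which I have not verified against the source. The ungrouped form is deliberately left out, because how far (?i) reaches may depend on your Ruby version.

```ruby
# Option values as they might appear after OPT_I= or OPT_X=.
grouped  = Regexp.new("(?i)(foo|bar)")
trailing = Regexp.new("(?i)(foo|bar|)")

# The grouped pattern is case insensitive for both alternatives.
puts grouped.match?("FOO")   # true
puts grouped.match?("Bar")   # true
puts grouped.match?("baz")   # false

# The trailing | adds an empty alternative, which matches any subject.
puts trailing.match?("baz")  # true
puts trailing.match?("")     # true
```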
## Other features:
You can make a running ripnews process reread its configuration by
sending it a SIGHUP.
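
If you would rather send the signal from a script than with kill, something like this Ruby sketch could work. It assumes that the LOCKFILE from your config contains the PID of the running process as plain text, which is a guess based on the "PID lockfile" description. `reload_ripnews` is a hypothetical helper, not something ripnews ships.

```ruby
# Hypothetical helper: read a PID from the lockfile and send SIGHUP to it.
# Assumes the lockfile holds nothing but the PID.
def reload_ripnews(lockfile)
  pid = Integer(File.read(lockfile).strip)
  Process.kill("HUP", pid)
  pid
end
```

With the example config above this would be called as `reload_ripnews("/local/newspace/News/.ripnewslock")`.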
## Where can I find newsservers:
- freenews.maxbaud.net
- www.newzbot.com
- www.gj.net/~bhkraft
## Known bugs:
There are no known bugs at this moment. If you find any, please let me
know. As with all my software, if it breaks you get to keep _both_
pieces.
## Credits:
- Stijn Hoop for adding yEnc support
## Contact info:
New problems can be reported directly to me at <ward@wouts.nl>. Patches
welcome ;)
Ward Wouts