- Path: senator-bedfellow.mit.edu!bloom-beacon.mit.edu!xlink.net!howland.reston.ans.net!cs.utexas.edu!uunet!looking!brad
- Message-ID: <S615.ad9@clarinet.com>
- Date: Sun, 21 Nov 93 5:10:15 EST
- Expires: Wed, 22 Dec 93 5:10:15 EST
- Newsgroups: clari.net.newusers,news.answers
- From: brad@clarinet.com (Brad Templeton)
- Reply-To: clarinet@clarinet.com
- Followup-to: poster
- Approved: brad@clarinet.com
- Subject: Decoding ClariNet special article headers (Oct/92)
- Lines: 262
- Xref: senator-bedfellow.mit.edu clari.net.newusers:123 news.answers:14929
-
- Archive-name: clarinet/headers
-
- ClariNet articles come in the USENET message interchange format. This is
- a variant of the ARPA/Internet electronic mail format. Exact details on
- that format can be found in documents known as RFC822 (Mail format) and
- RFC1036 (USENET format) which are stored for anonymous pickup on UUNET and
- a variety of other machines.
-
- ClariNet articles use the standard USENET headers, plus a variety of
- special custom ones. Here we explain how we use the standard headers and
- the meanings of our extensions.
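Because the articles follow the RFC822/RFC1036 header layout, any mail-style header parser can read both the standard and the custom headers. The following is a minimal sketch using Python's standard `email` module. The sample article is made up for illustration, and the helper name `read_headers` is an assumption, not part of any ClariNet software.

```python
from email import message_from_string

# Made-up article for illustration only; these values are not real ClariNet data.
SAMPLE = """\
From: clarinews@clarinet.com (Jane Doe, Science Reporter)
Subject: Example headline
Newsgroups: clari.sports.baseball
Slugword: example
Location: canada, france
Priority: urgent

Body text of the story.
"""

def read_headers(article_text):
    """Return a dict mapping lower-cased header names to their values."""
    msg = message_from_string(article_text)
    return {name.lower(): value for name, value in msg.items()}

headers = read_headers(SAMPLE)
print(headers["slugword"], headers["priority"])
```

Header names are lower-cased here so that lookups do not depend on how a given article capitalizes them.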
-
- Standard headers:
-
- From:
-
- The mail address found here will almost always be clarinews@clarinet.com.
- The comment, or user's full name, will be the reporter's name. In some
- cases, a title like "Science Reporter" or an affiliation will be added.
- In some cases, the From: address is an e-mail address that reaches the
- news agency. In the case of UPI stories, this is not true, and replies
- go simply to us.
-
- Subject:
-
- In most cases, this is a professional reporter/editor's headline for the story.
- In some cases, such as standing (regular) stories -- stock reports, weather,
- sports statistics, etc. -- a headline is filled in by ClariNet, possibly
- including the date of the story.
-
- UPI headlines are in mixed case. Some syndicated feature headlines are in
- upper case. Newsbytes headlines come in upper case, but are converted by
- ClariNet software to mixed case.
-
- Keywords:
-
- On this line, we translate the reporter's story coding along with our own
- keywords. A list of possible regular keywords is available. All keywords
- are human-generated by reporters and editors. Unfortunately, the coding
- system UPI uses is prone to errors. It's very terse, and a single keystroke
- error can create a ridiculous keyword. With thousands of stories moving
- every day, this is frequent enough to be annoying, but infrequent enough to
- be easily tolerated.
-
- Newsgroups:
-
- Articles are crossposted to a variety of newsgroups based on their coding
- and keywords. In addition, certain regular stories are put in special
- newsgroups based on their slugword (see below). In general, a story is
- crossposted to up to 5 groups, so that those following a topic get every
- story related to that topic. All modern news reading software makes sure
- that you never see a crossposted article more than once, no matter how many
- groups it appears in.
-
- Date:
-
- This is the time that we received the story directly from the wire, which
- reaches us via satellite. The story will typically not reach you for about
- another two hours, due to batching, propagation delays and deliberate delays
- required by contract.
-
- Message-id:
-
- We form message-ids from the slugword and an encoding of the date and time.
- Sometimes a checksum is used when the story arrives without a date and time.
-
- References:
-
- ClariNet messages contain References lines that can be used by
- thread-following newsreading tools such as trn. References are generated
- when a story is an update to an earlier story, and when a story is a
- sidebar to an earlier story. We do not list all the messages in a
- reference chain -- normally, we will list only the immediate predecessor
- of a story, and the root of the story tree. This is done for each level
- of sidebar -- though normally sidebars only go one level deep.
-
- If you use a threaded newsreader you will thus see chains of updates
- grouped together. Not all updates replace their predecessor, so you can
- see several real stories in a chain. For example, if you come in to
- clari.sports.baseball after a few days, you might see an entire series
- of ball game stories grouped together as one thread. You will also see
- related stories on a major topic grouped together.
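As a rough illustration of how a threaded reader might use these lines, the sketch below groups articles by the root of their reference chain. It assumes the root is the first id on the References line, which is the usual USENET convention but is not stated in this document. The message-ids and function names are hypothetical.

```python
from collections import defaultdict

def thread_root(message_id, references):
    """Return the id of the thread root for one article.

    `references` is the raw References value (space-separated ids), or ""
    when the article has no References line.
    Assumption: the root of the story tree is listed first.
    """
    ids = references.split()
    return ids[0] if ids else message_id

def group_by_thread(articles):
    """Group (message_id, references) pairs into {root_id: [message_ids]}."""
    threads = defaultdict(list)
    for message_id, references in articles:
        threads[thread_root(message_id, references)].append(message_id)
    return dict(threads)

# Hypothetical ids for illustration only.
articles = [
    ("<game1@example>", ""),
    ("<game1-update@example>", "<game1@example>"),
    ("<game1-sidebar@example>", "<game1@example> <game1-update@example>"),
    ("<other@example>", ""),
]
threads = group_by_thread(articles)
for root, members in threads.items():
    print(root, len(members))
```

Since only the predecessor and the root are normally listed, grouping on the root alone is enough to collect a whole chain without walking it link by link.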
-
- Supersedes:
-
- On some stories, when a story replaces an earlier version, the Supersedes
- header is used to specify the message-id of the replaced story. This
- doesn't always work, so a cancel message is also issued. In some cases
- only the cancel is issued, and we note what was replaced with an
- x-supersedes header, which is really just a comment.
-
- ClariNet Special Headers
-
- Slugword:
-
- This is a special story-specific keyword. Every story is assigned a
- slugword. If the story is updated, it goes out again with the same slugword.
- We use this to cancel the old story before issuing the update, so that only
- one version of the story exists on your machine at a given time.
-
- Most slugwords are just simple words. The main story on George Bush, for
- example, is usually slugged "bush." There is no formal pattern to this that
- you can use, however. It is a safe bet that any story slugged "bush"
- would be about him, but if some other bush became news, it might be used
- in that context as well.
-
- Sidebars to stories will often use a component slug that links them to the
- main story. For example, the Panama invasion was slugged "panama," and a
- variety of stories around it were slugged "panama-response," "panama-nuncio"
- and so on. Sometimes more levels will appear.
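A small sketch of how the component slugs could be related back to their main story follows. It assumes that levels are always separated by a hyphen and that a main slug never contains a hyphen itself; since there is no formal pattern to slugwords, treat this as a heuristic. The function names are hypothetical.

```python
def slug_root(slugword):
    """Return the main-story slug, e.g. 'panama' for 'panama-response'."""
    return slugword.split("-", 1)[0]

def slug_parent(slugword):
    """Return the slug one level up, or None for a main-story slug."""
    if "-" not in slugword:
        return None
    return slugword.rsplit("-", 1)[0]

for slug in ["panama", "panama-response", "panama-nuncio"]:
    print(slug, slug_root(slug), slug_parent(slug))
```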
-
- Slugwords can also be used to indicate standing stories -- those that
- repeat with some frequency. The daily PEOPLE column is always slugged
- "people." You can track a standing story by looking for its slug. A list
- of standing stories is available.
-
- Location:
-
- This field provides the location for the story. Sometimes a comma-delimited
- list of locations is provided. Unfortunately, quite often the reporter does
- not code the location of a story, particularly on U.S. domestic news. Most
- international news is coded for location.
-
- Possible location codes include country names such as "canada" or
- "france" and state names such as "california." Regions and continents
- are also coded, and even a few places like New York City.
-
- In general, expect a location only on an international story or a U.S.
- regional story.
-
- ACategory:
-
- This provides the ANPA story category. There are just over a dozen of these.
- They classify stories only broadly; our keywords give far more specific
- coding. Use this header when a broad classification is all you need. The
- categories are:
-
-   usa             General U.S. related news
-   special         Special section (rarely used)
-   feature         Feature article
-   food            Recipes etc. (rarely used)
-   entertainment
-   financial
-   international   Non U.S. stories
-   commentary      Editorials etc.
-   lifestyle
-   weather
-   regional        Regions of the USA
-   national        Artificial category, local version of
-                   national story.
-   political
-   scoreboard      Sports score reports
-   racing          (Not covered by ClariNet)
-   sports
-   travel
-   advisory        (For editors only -- not released)
-   washington
-   reserved        (Unknown Category)
-
-   natbriefs       Radio National Briefs
-   briefs          Radio briefs
-   headlines       Radio headlines
-   reg-headlines   Radio regional headlines
-   markets         Radio stock market reports
-   billboard
-   television      Radio reports about Television
-
- Most stories are usa, international, financial, sports or entertainment.
- Most stories in clari.local groups are regional.
-
- Priority:
-
- This is a general indication of the importance of the story.
- Priorities are:
-
-   "FLASH"             Once a decade type stories
-   "BULLETIN"          Top stories of the week
-   "urgent"            Top breaking stories of the day
-   "major"             Big non-breaking stories (artificial category)
-   "regular"           Most stories
-   "daily"             Lower priority stories
-   "deferred"          (never used)
-   "release-at-will"   Advance material for release any time
-   "advance"           Material for future release
-   "weekend"           Material for weekend newspapers
-
- Stories of the "flash," "bulletin" or "urgent" priority are what is known
- as "breaking news." Each priority has its own newsgroup so that you can
- track the biggest stories directly. We have not yet seen a posting to
- clari.news.flash. The last known flash was "space shuttle explodes."
- (Flashes are always three words, followed up by a bulletin.)
-
- You usually see 2-4 bulletins a week, although there will always be
- multiple versions of any bulletin story. You see 2-4 urgent stories per
- day as well.
-
- "major" is a priority we created. This is for stories that are, in wire
- parlance, "skedded." They have a regular priority but are rated as important
- stories by the desk editors. They go into the "top" news groups.
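The priority rules above can be sketched as a small lookup. This assumes the Priority value should be compared case-insensitively, since the list mixes upper and lower case. Using the position in the list as a rank is also only an assumption, because the later entries describe release timing rather than importance. The names are hypothetical.

```python
# Priorities in the order they are listed in this document.
PRIORITIES = [
    "flash", "bulletin", "urgent", "major", "regular",
    "daily", "deferred", "release-at-will", "advance", "weekend",
]

# The three priorities described as "breaking news".
BREAKING = {"flash", "bulletin", "urgent"}

def is_breaking(priority):
    """True if the Priority header value marks breaking news."""
    return priority.strip().lower() in BREAKING

def priority_rank(priority):
    """Position in the documented list (0 is first).

    Raises ValueError for a value that is not in the list.
    """
    return PRIORITIES.index(priority.strip().lower())

print(is_breaking("BULLETIN"), is_breaking("major"), priority_rank("urgent"))
```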
-
- Format:
-
- The format field is somewhat redundant. It describes what sort of
- story this article is. It is most useful on sports stories which come
- in a variety of formats. Formats depend on the ACategory. Some formats,
- like a "game story," are only possible on a sports story.
-
-   advisory                   For editors only -- not sent out
-   annual                     Annual summary (Sports/financial/some news)
-   audio advisory             For radio stations
-   breaking                   Urgent/bulletin/flash
-   briefs                     Short summaries of major stories
-   close                      Report at close of market trading
-   correspondent's advisory   For reporters only -- not sent out
-   daily                      Lower priority news
-   daybook                    For reporters only -- not sent out
-   feature                    Feature stories
-   game story                 Report on a game
-   glances                    Sports at a glance report
-   headlines                  Two sentence summaries of major stories
-   interim                    Report while market is open
-   linescores                 Broken down score reports
-   market wrapup              Final market report
-   open                       Market opening report
-   ratings                    Team rankings and reports
-   regular                    Most news
-   scorecard                  List of scores
-   snap scores                Quick scores for radio
-   summary                    Summaries (sports/stock market/etc.)
-   table                      Sports statistics
-   week-end                   Market reports at end of week
-
- Some stories may have multiple formats, comma-delimited. Unfortunately this
- is more often than not a coding error.
-
- ANPA:, Codes:, X-Takes:
-
- These lines mostly serve as comments, used by us to track how our
- software decodes the stories from the non-formalized wire format.
- While it is not a supported header, here are the meanings of the fields on
- the ANPA line.
-
- ANPA: Wc: 446; Id: a0723; Sel: na--i; Adate: 3-17-1235pes; Ver 2/0; V: sked ld
-
-   Wc:     Word count
-   Id:     Internal wire story ID -- unique number for the day
-   Sel:    Wire selector code
-   Adate:  Date story was written
-   Ver:    Major and minor version numbers for this story.
-   V:      Version field, sometimes indicates reason for update.
-           Many keywords are possible here, which we won't document.
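For anyone curious, the sample ANPA line above can be split into its fields roughly as follows. This is only a sketch based on that one example: it assumes fields are separated by semicolons, and it tolerates a field whose name has no colon (as with "Ver 2/0" in the sample). Since the header is unsupported, the layout may differ on other stories. The function name `parse_anpa` is hypothetical.

```python
def parse_anpa(value):
    """Split an ANPA header value into a {field name: raw value} dict."""
    fields = {}
    for part in value.split(";"):
        part = part.strip()
        if not part:
            continue
        if ":" in part:
            key, _, val = part.partition(":")
        else:
            # The sample shows "Ver 2/0" without a colon after the name.
            key, _, val = part.partition(" ")
        fields[key.strip()] = val.strip()
    return fields

# The sample value shown in this document.
sample = "Wc: 446; Id: a0723; Sel: na--i; Adate: 3-17-1235pes; Ver 2/0; V: sked ld"
fields = parse_anpa(sample)
print(fields)
```

All values are kept as strings, because the encodings of fields such as Adate and Sel are not documented here.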
-
- Codes:
-
- This comment line contains the original reporter's cryptic coding of the
- story. We have translated all this into human readable information above.
- It is there for our debugging purposes only. Not a supported header.
-
- X-Takes:
-
- If the story was sent to us in multiple parts (don't ask why), the number of
- parts received is listed here. Not a supported header.
-