From rik@netcom.com Fri Apr 14 13:11:48 2000
Date: Thu, 13 Apr 2000 18:33:24 -0400
From: Rik Kabel
To: Martin Mokrejs
Subject: Re: splitting large mail msgs. (fwd)

* Martin Mokrejs [000413 09:22]:
> Hello Rik,
> I'm sorry to bother you again, but I can't find anywhere your response to
> this message, although I know I had it. I didn't have much time to use your
> script (because I needed these examples). So now, I'm in the same problem.
> I hope you also have a copy of our sent mail. ;)

I think this is it:

Subject: Re: splitting large mail msgs.
To: mmokrejs@natur.cuni.cz (Martin Mokrejs)
Date: Thu, 9 Mar 2000 15:11:43 -0500 (EST)
In-Reply-To: from "Martin Mokrejs" at Mar 09, 2000 06:02:28 PM
Organization: none
X-Loop: rik@netcom.com
X-Mailer: ELM [version 2.5 PL3]
Content-Length: 11266

You (MM) wrote:
MM>
MM> Hello,
MM> I'm trying to use this recipe but have a few questions:
MM>
MM> On Mon, 13 Dec 1999, Rik Kabel wrote:
MM>
MM> > josh said:
MM> >
MM> > > My Cell phone allows emailed text messages, but only of size 140
MM> > > char. or less. This is great, and procmail allows me to forward
MM> > > certain emails to my phone, but it'd be nice to get the WHOLE email.
MM> > > :) splitting it into several messages would allow me to get around
MM> > > this problem.
MM> > >
MM> > > Originally I was trying to split the message with the command
MM> > > 'split' but was having problems as you can see. I'm really quite new
MM> > > to all this.
MM> >
MM> > Just because you _can_ send your mail to your phone doesn't mean it
MM> > makes sense to send complete messages to it. Among other things, it
MM> > leaves you open to some very annoying pranks by friends.
MM> >
MM> > Also consider how you will handle mime messages. More and more
MM> > mailers use mime encodings for even simple text. Sending complete,
MM> > mime-encoded binaries to your handset may not be a good thing.
MM> >
MM> > Perhaps you only need to send the first part of the message, along
MM> > with sender and subject information.
MM> >
MM> > If that is sufficient, perhaps the following will serve as a guide.
MM> > This is designed to page me during acceptable hours iff I am not
MM> > online. It handles some mime messages, and strips quoted text. Before
MM> > you can use it, you have to:
MM> >   1. Remove ## comments, including leading whitespace,
MM> >      on condition lines.
MM>
MM> Lines like:
MM>
MM> :0 c  ## make a copy, and
MM>
MM> are invalid? Do I really have to remove comments?

No, that isn't a condition line. Condition lines begin with *, or are
continuations of lines which begin with *.

MM> >   2. Figure out how to determine if you are online, or remove
MM> >      the test.
MM>
MM> May I know how you do it?

I use the last(1) command, which returns a string with 'still' in it if
I am online. Your environment may provide a more efficient way to do it,
mine probably does too, but I haven't looked too hard to find it. The
test of online in the example you quoted shows how I use it.

:0 h i w  ## are we online?
online=| last -1 $LOGNAME  ## has string "still" if we are

MM> >   3. Figure out how to specify pagehours, or remove the test.
MM>
MM> May I know how you do it?

Some pagers allow you to specify a time range in which they receive
pages but don't alert you (private time on Motorola Advisor Elite
pagers). Using this, you could have mail sent to your pager at any time
and control the annoyance at the pager. If your pager doesn't support
this feature, or you don't want to pay for receipt of these messages,
you can control it in procmail. The following definition

pagehours_=0[7-9]|1.|2[012]  ## these are the allowed paging hours

works for me. This allows paging from 0700 to 2200 in the time zone
under which your procmailrc operates. I set TZ in my .forward to the
timezone in which I currently reside.
My .forward looks like:

"|TZ=EST5EDT; export TZ && exec /u/u4/rik/bin/procmail -Ytf- MODE=prod /u/u4/rik/.procmailrc || exit 75"

Note that the MODE variable is defined here as well, and I use it to
choose different actions when I am running in production or testing.

MM> >   4. Change the length determination/extract to fit your needs.
MM>
MM> That means define $pagesms before calling the recipe?

No, it means something else, although the recipe does assume that
pagesms is already defined. pagesms is the email address of your short
message system (sms) device. It can be a list if you have more than one,
so you can send a message to both your cell phone and your pager.

The length determination and extract refers to the size of the message
body your messaging vendor allows. Some allow message bodies of 100
characters, some 200 or more. Some will reject a message body that is
too long, some will simply truncate overlong messages to size.

MM> Should I call it using INCLUDERC or SWITCHRC?

That depends on whether you have additional processing after this is
done. I do, so I use INCLUDERC. If this were the end of my processing I
might use SWITCHRC. Neither is necessary if the code is inline in your
rc.

MM> >   5. Determine if you should use print, as I do, or echo.
MM>
MM> No idea.

I use ksh, and prefer the semantics of print over echo. Both can
accomplish the same thing, and both are pretty lightweight in terms of
resources. On many systems the behavior of echo may vary depending on
your shell and on the order of binaries in your search path. Print is
better defined, and I don't have to worry about just what it will or
won't do. (For example, how does echo treat \n in the echoed string? How
does it treat -n at the beginning of the string?)

MM> A few more questions:
MM> This script extracts attachments and other MIME message parts into some
MM> files, what are their names?

The script does not extract parts or attachments. It does attempt to
locate plain text in a multipart message.
I do, however, have other recipes which handle MIME messages in various
ways, including combining parts and removing parts, but nothing which
saves parts or attachments to separate files.

MM> Does this script remove Received:, X-.* header lines?

The script does nothing to the headers of the message except to use the
f option on the sendmail invocation. This changes the From: header to
reflect the From name on the message as I received it. The sms providers
with whom I deal typically remove all headers except From: and Subject:.

MM> Why don't you use the ja-mime-kill.rc recipe to kill all attachments and
MM> just send the plaintext message after breaking with your recipe into
MM> small pieces?

One question at a time.

First, I do not kill attachments here. As I said above, I have other
recipes which deal with MIME. By the time a message is processed by
these recipes, my MIME processing is complete. It may still be a
multipart message, and if it is, there may be a plain text part. If that
is the case, I want to send that text if I can. That is what the recipe
does.

Second, I don't use Jari's MIME routines because I have an integrated
set of recipes which do much much more than his modules do. My recipes
handle nested parts, and decode, translate, and combine parts where
possible.

Third, my paging recipe does not break a message into multiple parts
and, bit by bit, send the complete message. It sends one message which
contains as much of the body of the original message as my sms provider
allows. The original message is filed on my ISP's server and I can look
at the whole thing when I connect.

MM> > It has some warts, but it works well enough for me.
MM> >
MM> > :0 c  ## make a copy, and
MM> > * pagesms ?? .  ## if an sms paging destination is defined
MM> > * $ hour ?? $pagehours_  ## and it ain't the middle of the night
MM> > * ! online ?? () still  ## and we are not online still
MM> > { pagebody="no text"  ## shrink and forward the copy like this:
MM> >   oldmetas=$SHELLMETAS SHELLMETAS  ## disable SHELLMETAS
MM> >   :0 B f b w i  ## first,
MM> >   * ^>  ## if there are any >quoted lines
MM> >   | sed -e '/^ *>/d'  ## remove them
MM> >   SHELLMETAS=$oldmetas  ## and restore SHELLMETAS
MM> >   :0 B  ## if
MM> >   * H ?? ^content-type:.+multipart  ## it is a multipart message
MM>
MM> >   * ^content-type:.+text/plain(.*$)+\  ## with at least one plain part
MM> >     [ ]*$(([ ]*$)*)*\/[^ ].*$(.*$)?(.*$)?(.*$)?(.*$)?(.*$)?(.*$)?(.*$)?(.*$)?
MM>
MM> That's a really long line, so that's why I should remove comments? It's
MM> broken by the comment into two parts?

No, you should remove the comments because if you don't, nothing will
work. Procmail syntax does not include comments in condition lines. That
means that the line above beginning '* ^content' will be read as a
condition containing a pattern, and that pattern includes '\ ## with'
within it. The next line, beginning '[ ]*$' will then be treated as an
action line, and likely lead to an error. (By the way, [ ] is meant as a
space and a tab within square brackets.)

When you do remove the comments, make sure that you remove all the
contiguous whitespace preceding the comment as well. Again, in the first
line above, the \ must be the final character in the line for procmail
to recognize that the next line is a continuation line for the
condition, and not an action line.

I use a preprocessor to automatically remove the comments (and to handle
meta-variables I use in writing recipes, and to manage multiple
includerc files). This lets me freely comment the source file, but makes
sharing source a little more difficult. The source file for my rc files
is over 3600 lines. The actual rc files in production, after stripping
of comments and folding of continuation lines, total under 1950 lines.
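
The comment-removal rule described above (drop the ## comment together
with all whitespace in front of it, so that a trailing \ ends the line)
can be sketched in a few lines of Python. This is only an illustration,
not the preprocessor mentioned above, which also handles meta-variables
and multiple includerc files. The function name strip_comments and the
rule that a comment starts at the first ## found at the start of a line
or after whitespace are assumptions made for the example; a pattern that
legitimately contains ## after a space would be cut short.

```python
import re

# Assumption: a comment begins at the first "##" that is either at the
# start of the line or preceded by spaces/tabs, and runs to end of line.
_COMMENT = re.compile(r"(^|[ \t]+)##.*$")


def strip_comments(rc_text: str) -> str:
    """Remove '##' comments and the whitespace before them, line by line."""
    cleaned = []
    for line in rc_text.splitlines():
        stripped = _COMMENT.sub("", line)
        # Drop lines that held only a comment; keep lines that were
        # already blank in the source.
        if stripped == "" and line.strip() != "":
            continue
        cleaned.append(stripped)
    return "\n".join(cleaned) + "\n"


# Hypothetical sample: a commented condition with a continuation line.
sample = (
    ":0 B   ## if\n"
    "* ^content-type:.+text/plain(.*$)+\\   ## with at least one plain part\n"
    "  [ ]*$\n"
)
print(strip_comments(sample), end="")
```

After stripping, the condition line ends in the backslash itself, which
is what procmail needs in order to treat the next line as a
continuation. Folding the continuation lines together is a separate
step that this sketch does not attempt.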
MM> >   { pagebody=$MATCH }  ## and save them here
MM> >   :0 E B  ## else
MM> >   * ^^([ ]*$)*[ ]*\/[^ ].*$(.*$)?(.*$)?(.*$)?(.*$)?(.*$)?(.*$)?(.*$)?(.*$)?(.*)?
MM> >   { pagebody=$MATCH }  ## and save them here
MM> >   :0  ## then trim off
MM> >   * MATCH ?? ^^\/((([^-]|-[^-]).*)?$)*  ## sigs and mime part slop
MM> >   * MATCH ?? ^^\/(.*$)*.*[^ ]  ## and trailing whitespace
MM> >   { pagebody=$MATCH }  ## and resave
MM> >   smsmaxmsg=200  ## longest ShortMessageService msg
MM> >   sms10=.\$*.\$*.\$*.\$*.\$*.\$*.\$*.\$*.\$*.\$*  ## 10 msg characters
MM> >   sms100=$sms10$sms10$sms10$sms10$sms10$sms10$sms10$sms10$sms10$sms10  ## 100
MM> >   :0  ## NOTE: adjust $sms100$sms100 to fit
MM> >   * $ pagebody ?? 1^1 > $smsmaxmsg  ## if pagebody is longer than allowed
MM> >   * $ pagebody ?? ^^\/$sms100$sms100  ## cut it at 200 characters
MM> >   { pagebody=$MATCH }  ## and save them here
MM> >   :0 f b w i  ## then
MM> >   | print -- "$pagebody"  ## replace body with $pagebody
MM> >   :0  ## and then
MM> >   ! -f "$efr" -- $pagesms  ## page it
MM> > }  ## done with copy
MM> >
MM> > --
MM> > Rik Kabel   Old enough to be an adult   rik@netcom.com
MM> >
MM>
MM> TIA
MM> --
MM> Martin Mokrejs - PGP 5.0i key at: finger://mail.natur.cuni.cz/mmokrejs
MM> Faculty of Science, The Charles University

--
Rik Kabel   Old enough to be an adult   rik@netcom.com