From: David W. Tamkin
To: Richard G. Ball
Cc: procmail@lists.RWTH-Aachen.DE
List-Id: discussion of the procmail program
Date: Tue, 17 Apr 2001 09:58:16 -0500 (CDT)
Subject: Re: splitting a large mailbox

Richard asked,

| I have many large mailboxes (a few thousand messages in each) and I'd like to
| be able to split them quickly (one-message-per-file style) for subsequent
| searching/processing.  I thought to use formail but it turns out that
| splitting with formail and feeding to procmail to do the file writing takes a
| *very* long time.

You didn't say exactly how you're invoking formail and procmail, so perhaps
they can be sped up, but maybe you could use csplit instead?

What seems to slow things down the most is procmail's locking attempts.
This code:

  #!/bin/sh
  export mailbox
  for mailbox in pattern
  do
    FILENO=00001 formail -ns sh -c 'cat > $mailbox.$FILENO' < "$mailbox"
  done

despite the invocations of sh and cat, ran much faster than

  #!/bin/sh
  export mailbox
  for mailbox in pattern
  do
    FILENO=00001 formail -ns \
     procmail -pm DEFAULT=$PWD/$mailbox.'$FILENO' /dev/null < "$mailbox"
  done

(and I'm still not sure why I needed to give a full path for $DEFAULT instead
of assuming $PWD as the start when procmail had the -m option).

But the fastest thing I tried was to use procmail but prevent the locking;
where .splitrc had this code,

  :0
  $mailbox.$FILENO

this ran like the wind in comparison:

  #!/bin/sh
  export mailbox
  for mailbox in pattern
  do
    FILENO=00001 formail -ns procmail -pm ./.splitrc < "$mailbox"
  done
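
For comparison with the csplit idea mentioned above, here is a minimal sketch of splitting a mailbox with no formail or procmail involved at all, so no per-message process is started and nothing is locked. It uses awk rather than csplit, and it was not part of the timings reported in this message. The function name split_mbox is hypothetical. The sketch assumes that every line beginning with "From " starts a new message, which is cruder than what formail does, and it mimics the $mailbox.$FILENO naming with a five-digit counter.

```shell
#!/bin/sh
# split_mbox MAILBOX  (hypothetical helper, not from the original post)
# Writes each message of MAILBOX to MAILBOX.00001, MAILBOX.00002, ...
# Assumption: any line starting with "From " begins a new message.
split_mbox() {
    awk -v prefix="$1" '
        /^From / {
            if (out != "") close(out)   # keep only one output file open
            out = sprintf("%s.%05d", prefix, ++n)
        }
        out != "" { print > out }       # ignore anything before the first "From " line
    ' "$1"
}
```

In the loops shown earlier, the formail line would be replaced by split_mbox "$mailbox". Whether this beats the lock-free procmail recipe would have to be measured on real mailboxes; it is offered only as a baseline to compare against.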