Known/suspected bugs/Missing Features on current ZMailer sources

Wanted done for 3.0:
  - router  aliases.cf  to have similar 'protocols' feature as
    domain routing has currently
  - WEB-based configuration/administration interface
  - scheduler to kill message/thread online without a need to
    run  "manual-expirer"
  - scheduler to be able to "reroute" given channel/host, e.g.
    push the message recipient(s) back to router
  - LICENSE state cleanup.
  - A work-over of the documentation

smtp:
	In EZMLM mode: if some recipient is bounced, the ones following it
	seem to get rather spurious-looking errors (case: hotmail.com
	giving occasional 'transaction failed').  Possibly a semi-generic
	problem, as the EZMLM-rcpt-splitting is a subcase in larger code.
	Possibly also old code running...  Updated the system, and began
	to watch for the phenomenon...

sm:	pass the address meta information also in environment
	variables, and handle multirecipient cases as well!
	(see what 'mailbox' does.)

mailbox:
	When doing delivery into a pipe, give  MAIL  variable to
	tell what would be this user's mailbox ?

sfio:
	feature-config to use simpler (s)ed facilities than GNU-stuff.

	Implement stream-specific storage of the 'errno' value from the
	last error which was not handled internally ??  That way, when
	picking up a simulated error, like a socket write-out failing
	with a select() timeout, we would get the ETIMEDOUT error out
	nicely...
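
	A minimal sketch of the idea, NOT sfio code: 'struct tstream' and
	tstream_write() are hypothetical names, and sfio's real stream
	object has no such field.  It only shows a per-stream slot that
	remembers the last unhandled error, with a select() timeout
	recorded as ETIMEDOUT.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper around a descriptor; not an sfio type. */
struct tstream {
    int fd;
    int last_errno;   /* last error not handled internally, 0 if none */
};

/* Write with a timeout; a select() timeout is stored as ETIMEDOUT. */
ssize_t tstream_write(struct tstream *s, const void *buf, size_t len,
                      int timeout_sec)
{
    fd_set wfds;
    struct timeval tv;
    ssize_t n;
    int rc;

    assert(s != NULL && s->fd >= 0);
    FD_ZERO(&wfds);
    FD_SET(s->fd, &wfds);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    rc = select(s->fd + 1, NULL, &wfds, NULL, &tv);
    if (rc == 0) {                 /* not writable in time: simulated error */
        s->last_errno = ETIMEDOUT;
        return -1;
    }
    if (rc < 0) {                  /* select() itself failed */
        s->last_errno = errno;
        return -1;
    }
    n = write(s->fd, buf, len);
    if (n < 0)
        s->last_errno = errno;
    return n;
}
```

	A caller would look at s->last_errno after a -1 return instead of
	the global errno, which may have been overwritten in between.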

check: Is router/db.c::db() now consistent at  argv1  construction ??

check: scheduler.conf: command=  $channel  expansion ??
	2.99.52-pre3  has apparently some problems!

configure.in:
	--includedir=.... not working ?? (Xosé Vázquez <xose@smi-ps.com>)

libsh/builtins.c: sh_read()
    Command:   echo "11 22" | read v1 v2 v3
    Does put:   v1="11", v2="22", v3="22"
    Should put: v1="11", v2="22", v3=""
    Command:   echo "11" | read v1 v2 v3
    Does put:   v1="11", v2="11"
    Should put: v1="11", v2="", v3=""
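
    The wanted assignment rule can be sketched like this.  It is not
    the libsh code; split_for_read() and FIELD_MAX are made-up names,
    and only plain whitespace splitting is assumed (no IFS handling).
    Each variable gets one word, the last variable gets the rest of
    the line, and variables with no word left become empty.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define FIELD_MAX 64   /* arbitrary size for this sketch */

/* Fill vars[0..nvars-1] from `line`; leftover variables are set to "". */
void split_for_read(const char *line, char vars[][FIELD_MAX], size_t nvars)
{
    const char *p = line;
    size_t i, len;

    assert(line != NULL && vars != NULL);
    for (i = 0; i < nvars; i++) {
        const char *start;

        while (*p != '\0' && isspace((unsigned char)*p))
            p++;                              /* skip separators */
        start = p;
        if (i + 1 == nvars) {                 /* last variable: rest of line */
            p += strlen(p);
            while (p > start && isspace((unsigned char)p[-1]))
                p--;                          /* trim trailing whitespace */
        } else {
            while (*p != '\0' && !isspace((unsigned char)*p))
                p++;                          /* one word */
        }
        len = (size_t)(p - start);
        if (len >= FIELD_MAX)
            len = FIELD_MAX - 1;              /* truncate, sketch only */
        memcpy(vars[i], start, len);
        vars[i][len] = '\0';                  /* "" when nothing was left */
    }
}
```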


proto/newdbprocessor.in:
	Support  '-i -t ordered,$MAILVAR/db/realdata.dat -f ... '

post-install:  Needs a way to define $(prefix) !

mailq: man-page update to match all options

getmxrr()  MX retrieve failure with its merriment:
  [root@vger mea]# /opt/mail/bin/getmxrr-test cs.lakeheadu.ca
  DNS lookup reply: len=33 rcode=0 qdcount=1 ancount=0 nscount=0 arcount=0 RD=1 TC=0 AA=0 QR=1 RA=1
  getmxrr() rc=0 EX_OK; mxcount=0
   gaih_inet('cs.lakeheadu.ca') gethostbyname() h=(nil)  h_errno=4
   g->gaih[INET]('cs.lakeheadu.ca',...) rc=261
  getaddrinfo('cs.lakeheadu.ca','smtp') -> r=-5 (no address associated with name), ai=(nil)

  For comparison, a successful access has:
    DNS lookup reply: len=237 rcode=0 qdcount=1 ancount=4 nscount=3 arcount=4 RD=1 TC=0 AA=0 QR=1 RA=1

  CAN I DO ANYTHING ?!  Lack of data *is* valid result, how can those
  rare bad lacks of data be detected ?  Flags do look the same :-/
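
  To make the "flags look the same" point concrete, here is a small
  sketch that decodes the fixed 12-byte DNS header (layout per RFC 1035)
  into the fields printed above.  The names dns_hdr_info,
  dns_decode_header() and dns_is_nodata() are invented for this note;
  this is not the getmxrr() code.

```c
#include <assert.h>
#include <stddef.h>

struct dns_hdr_info {
    unsigned qr, aa, tc, rd, ra, rcode;
    unsigned qdcount, ancount, nscount, arcount;
};

/* Decode the 12-byte DNS header; returns 0 on success, -1 if too short. */
int dns_decode_header(const unsigned char *msg, size_t len,
                      struct dns_hdr_info *h)
{
    assert(h != NULL);
    if (msg == NULL || len < 12)
        return -1;
    h->qr    = (msg[2] >> 7) & 1u;
    h->aa    = (msg[2] >> 2) & 1u;
    h->tc    = (msg[2] >> 1) & 1u;
    h->rd    =  msg[2]       & 1u;
    h->ra    = (msg[3] >> 7) & 1u;
    h->rcode =  msg[3] & 0x0fu;
    h->qdcount = ((unsigned)msg[4]  << 8) | msg[5];
    h->ancount = ((unsigned)msg[6]  << 8) | msg[7];
    h->nscount = ((unsigned)msg[8]  << 8) | msg[9];
    h->arcount = ((unsigned)msg[10] << 8) | msg[11];
    return 0;
}

/* NOERROR reply with an empty answer section, i.e. "no data". */
int dns_is_nodata(const struct dns_hdr_info *h)
{
    return h->qr == 1 && h->rcode == 0 && h->ancount == 0;
}
```

  Both a genuine "no MX here" and the rare bad reply shown above come
  out of dns_is_nodata() as true, so the header alone cannot tell them
  apart.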



- Everywhere:  Spell-check  (propably -> probably, among others)

- "Overview" -- revise at least the list subscription instructions;
		verify also other document files

- Solaris 2.6 native make --> no VPATH ?? (sfio fails)

- BUG(security sensitivity):
  - "zmailer newdb" script will run "newdb" database compilations
    without becoming the same uid+gid as the source file's owner (and
    with matching umask) -- perhaps the current PERL script must be
    turned into e.g. a C program ?
- FEATURE:
  - Also, how about enabling non-privileged users to execute "zmailer newdb"
    to recompile their own databases even though they won't be able to
    recreate all dbases and their hook .zmsh scripts ?
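
  If the script were turned into a C program, the privilege switch
  could look roughly like the sketch below.  become_owner_of() and
  umask_for_mode() are hypothetical names; supplementary groups and
  saved ids are deliberately left out, so treat it as an outline only.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* umask under which new files get at most the source file's mode bits. */
mode_t umask_for_mode(mode_t file_mode)
{
    return (mode_t)(~file_mode & 0777);
}

/* Take on the uid+gid (and a matching umask) of the file at `path`. */
int become_owner_of(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    /* Group first: once we are no longer root, setgid() would fail. */
    if (setgid(st.st_gid) != 0)
        return -1;
    if (setuid(st.st_uid) != 0)
        return -1;
    umask(umask_for_mode(st.st_mode));
    return 0;
}
```

  This would be called in a child process just before the database
  compilation is started, so the parent keeps its own identity.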

FEATURE:
  router+scheduler+TA status MIB areas in an SHM segment
  (Solaris et al. which don't allow ARGV overwrite..
   DNS lookups, connecting, protocol exchanges, ...)
  + tool(s) to read that segment

- smtp TA to ETRN type destinations (usually not reachable) work
  poorly ever since non-blocking connection was taken into use.
    (cured at 2.99.52 ??)

INSTALL:
	Badly out of date!

router:

    FEATURE:
	run_listexpand() and run_listaddresses() (latter with part of code)
	to support expanding '"%1"@domainpart'  type patterns with
	original address *local* part (a.k.a. "user" part).
	Problem being that the place where this replacing can run is
	definitely doing it in TOKEN822 objects, not on strings.

	THAT IS WRONG APPROACH!

	Better one: At DB lookups support substituting %0 (the key) thru
	%1 thru %9 (optional parameters) 
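
	A string-level sketch of that substitution, assuming the lookup
	result is a plain C string.  expand_params() is an invented
	name, and treating "%%" as a literal '%' is an assumption of
	this sketch, not something the router does today.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Expand %0..%9 in `tmpl` from args[0..nargs-1]; "%%" yields '%'.
 * References to missing parameters expand to nothing.
 * Returns 0 on success, -1 if `out` is too small.
 */
int expand_params(const char *tmpl, const char *const args[], size_t nargs,
                  char *out, size_t outsz)
{
    size_t o = 0;

    assert(tmpl != NULL && out != NULL && outsz > 0);
    while (*tmpl != '\0') {
        const char *piece = tmpl;      /* default: copy one literal byte */
        size_t plen = 1;

        if (tmpl[0] == '%' && tmpl[1] >= '0' && tmpl[1] <= '9') {
            size_t idx = (size_t)(tmpl[1] - '0');

            piece = (idx < nargs && args[idx] != NULL) ? args[idx] : "";
            plen = strlen(piece);
            tmpl += 2;
        } else if (tmpl[0] == '%' && tmpl[1] == '%') {
            tmpl += 2;                 /* piece still points at one '%' */
        } else {
            tmpl += 1;
        }
        if (o + plen >= outsz)
            return -1;                 /* keep room for the final NUL */
        memcpy(out + o, piece, plen);
        o += plen;
    }
    out[o] = '\0';
    return 0;
}
```

	With %0 as the lookup key and %1 as the local part, a database
	value of  "%1"@domainpart  expands without touching TOKEN822
	objects at all.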

    FEATURE:
	router/rfc822.c:  sequencer()
	Consider providing some data/function/whatever telling to
	the configuration script how many "Received:" headers there
	are so that the message can be trapped for a loop-prevention
	in the script, and not hardwired in the sequencer() code.
	(Now has a global variable that gets the count of "Received:"
	 headers, but the loop-count exceeded trap is done in the
	 same C-code without calling the scripts...)

    FEATURE:
	List expansions should not relay DSN data thru as is, but rather
	handle NOTIFY=SUCCESS locally ("expanded"), and then continue
	with whatever fancy the thing may want..

	BUG?  router/functions.c: $(homedirectory ..) function does
		getpwnam() from somewhere WITHOUT checking at the
		possible errno telling that there is e.g. temporary
		access error with backend database...
	   Fixed(?) on 21-Sep-2000 with HOLD/DEFER code.
	   No, pulled that code back.  It is a DIFFICULT problem..

	Error address pickup  ( router/rfc822.c: erraddress() )
	picks *bad* choices sometimes; e.g. it might consider
	'Sender:' or 'Errors-To:' headers as good picks for the
	address where to send the report to...

	RFC 822 special bug(?):  '%' is a special, and/or combination
	of   "From: foo%bar <fii@faa.fuu>"  formats causes the scanner
	to react mistakenly with 'Illegal-Object:'.

	The BIND subset might not handle the 'bigmx.zmailer.org'
	correctly -- or at all..


transports/smtp:

   VERIFY:
	That failed connection to site A won't lead to recipient
	diagnostic() calls before all possible MXes/addresses have been
	gone thru for connection attempts.

   BUGLET:
	Right now we make eminently sensible commentary diagnostics carrying
	individual recipient addresses, but how shall we do diagnostics when
	e.g. DATA/dot phase ends up on violation ?  Now we get the *last*
	recipient address before DATA, not them all..


   BUG:	PIPELINING:  If all RCPTs are rejected with 400 series code,
	and the final DATA is then rejected with 500 series code, DON'T
	abort with permanent error -- plus retry..
		Fixed ? (20000314)
	Not.  Apparently some combinations still conspire to produce
	rejections in cases which should retry.
		Fixed ? (20000619)
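
	The intended rule, as a message-level sketch.  This is not the
	smtp TA code; pipelined_verdict() and the enum are made up here,
	and per-recipient bookkeeping is ignored.

```c
#include <assert.h>
#include <stddef.h>

enum smtp_verdict { SMTP_OK, SMTP_RETRY, SMTP_FAIL };

/*
 * Decide the fate of a pipelined transaction from the RCPT reply codes
 * and the reply code seen for DATA.
 */
enum smtp_verdict pipelined_verdict(const int rcpt_codes[], size_t nrcpt,
                                    int data_code)
{
    size_t i, accepted = 0, deferred = 0;

    assert(rcpt_codes != NULL || nrcpt == 0);
    for (i = 0; i < nrcpt; i++) {
        if (rcpt_codes[i] / 100 == 2)
            accepted++;
        else if (rcpt_codes[i] / 100 == 4)
            deferred++;
    }
    if (accepted == 0) {
        /* DATA had no valid recipient to work with, so its 5xx tells
         * nothing new; the RCPT replies decide. */
        return deferred > 0 ? SMTP_RETRY : SMTP_FAIL;
    }
    switch (data_code / 100) {
    case 2:
    case 3:
        return SMTP_OK;      /* deferred recipients still need a retry */
    case 4:
        return SMTP_RETRY;
    default:
        return SMTP_FAIL;
    }
}
```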

   DRUMS/SMTPUPD:
	If MX set after the backup-prune is empty (and initially was non-
	empty), reject the message with suitable report, don't continue
	to try to use A-record -- parametrized somehow..
	E.g. go to use of A in this case only if WKS checking is enabled.
		Conditionalized on non-usage of -W option
		Works ? (20000314)
   BUG?
	Bad interaction with scheduler in some instances.  If e.g.
	message goes to retry because MAIL FROM or some RCPT TO
	addresses yield 400 series code, it should not kick queue
	processing back the same way as 'connection timed out' does.
	(But no response within timeout from remote server should be
	 treated the same way as 'conn timeout' -- I think ?)
		Fixed ? (20000314)
	Not fixed on 20000314 ...
		Fixed ? (20000619)

   BUG:	If reply waiting in smtp_sync() gets SIGALRM, program crashes..
	(syncing for BDAT, for example.)

   VERIFY: BDAT (and DATA) status processing in case the remote
	yields disk-full temporary error.

   FEATURE:
	To rewrite the MX processing in streamlined fashion:
	  In case two domains have common (sub-)set of MX servers, and
	  we have been feeding mail to domain A thru MX-server B, and
	  now we change to deliver domain C which too has MX-server B.
	  We are to reuse the already open channel to server B to feed
	  the message out even though the domain C might have had server
	  D with lower preference..
	  (Noting that all MX-hosts with the same or larger preference
	   value than ourselves are to be pre-removed from the list!)
	  sendmail does this by observing that target domains A and B
	  have same hosts in same preference order (*not* pref. value!)
	  and then considering them as one.  Some MTAs choose to use even
	  finer-grained parallelism -- if connection is active to server
	  address N, and it is MX for several domains, all those domains
	  are routed via it.
	Worthwhileness of this is *not* unambiguous.
	How many domains really are served by same system ?
	At some places a lot, but elsewhere perhaps not...
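
	The "same hosts in same preference order" test could be sketched
	as below.  struct mx_rr and mx_same_order() are names invented
	for this note, and the lists are assumed to have been pruned of
	our own and worse-preference hosts already.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <strings.h>   /* strcasecmp() */

struct mx_rr {
    unsigned pref;
    const char *host;
};

/* Order by preference value, then by host name to make ties stable. */
static int mx_cmp(const void *a, const void *b)
{
    const struct mx_rr *x = a, *y = b;

    if (x->pref != y->pref)
        return x->pref < y->pref ? -1 : 1;
    return strcasecmp(x->host, y->host);
}

/*
 * Return 1 if both MX sets list the same hosts in the same preference
 * ORDER; the preference values themselves may differ.  Sorts in place.
 */
int mx_same_order(struct mx_rr *a, size_t na, struct mx_rr *b, size_t nb)
{
    size_t i;

    assert(a != NULL && b != NULL);
    if (na != nb)
        return 0;
    qsort(a, na, sizeof a[0], mx_cmp);
    qsort(b, nb, sizeof b[0], mx_cmp);
    for (i = 0; i < na; i++) {
        if (strcasecmp(a[i].host, b[i].host) != 0)
            return 0;
        /* hosts tied in one list must be tied in the other as well */
        if (i > 0 && (a[i].pref == a[i - 1].pref) !=
                     (b[i].pref == b[i - 1].pref))
            return 0;
    }
    return 1;
}
```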

   FEATURE:
	We violate RFC 1123 part 5.2.2 which says that at MAIL FROM
	and at RCPT TO the domain parts MUST be primary DNS objects,
	*not* CNAMEs.  We never rewrite...

- Somewhere ? (smtpserver, and perhaps smtp transport agent ?)
	Implement OUTBOUND message size limit mechanism.
	Possibly a multi-value system which works against
	IP/reversed domain information, and adjust maximum
	inbound message size at the smtpserver per that
	given server.

	A further complication is the need to limit outbound
	email sizes when the inbound size is limited, so that
	our system will not send out anything for which the
	error report would be larger than our inbound limit.

	Problem being, of course, that outbound channels
	need to interrogate that same configuration mechanism
	to do "Sorry, this is too large for sending" processing.

- All transport agents:
	- Unify the content MIME processing routines;  Construct them
	  to be a common sub-layer ?
	- Do not decode Q-P or Base-64 if the resulting material exceeds
	  RFC-821/RFC-822 line length limits (or whatever limits have
	  been set).
	- Track the memory leakage on long-lifetime TA agents

	(router + TA library + TA programs):
	Router to scan message MIME structure, and to store that
	into dense concise format which doesn't need prerescan of
	the message *body* for every delivery attempt in case some
	"in-flight" data conversions are desired.
	(2.99.52patch2 prepared for it: _CF_MIMESTRUCT block defined
	 for the TA-specfile.)

	- transporters: When decoding MIME-QP the final line MAY be of
		".....=", which means there is no final NEWLINE.
	Fixed on SMTP, must fix on MAILBOX, and SM!  (4-Jan-95)
	(Prepared to do it, however only on SMTP it is fatal -- I think..)
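
	A sketch of the line-level rule, not the TA library code:
	qp_decode_line() is a made-up name, and stripping of trailing
	whitespace is left out.  The caller is expected to skip writing
	the NEWLINE whenever *soft_break comes back set, including on
	the very last line of the body.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if `c` is not a hex digit. */
static int hexval(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Decode one quoted-printable line (line terminator already removed).
 * Sets *soft_break to 1 when the line ends in "=", meaning that no
 * NEWLINE belongs after it.  Returns the decoded length, -1 on overflow.
 */
long qp_decode_line(const char *in, size_t inlen, char *out, size_t outsz,
                    int *soft_break)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL && soft_break != NULL);
    *soft_break = 0;
    while (i < inlen) {
        int c = (unsigned char)in[i];

        if (c == '=' && i + 1 == inlen) {
            *soft_break = 1;           /* ".....=" : no NEWLINE here */
            break;
        }
        if (c == '=' && i + 2 < inlen &&
            hexval(in[i + 1]) >= 0 && hexval(in[i + 2]) >= 0) {
            c = hexval(in[i + 1]) * 16 + hexval(in[i + 2]);
            i += 3;
        } else {
            i += 1;                    /* literal byte, or malformed '=' */
        }
        if (o >= outsz)
            return -1;
        out[o++] = (char)c;
    }
    return (long)o;
}
```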

- 'channel error' detection is partial at places;
	transporters: hold, errormail


- scheduler:
    BUG:
	When scheduler does resyncing, the  mtaMessageCount & global_wrkcnt
	will decrement below the real count.  "leak negative" so to speak.

    FEATURE:
	scheduler + mailq: Complete the V2 protocol authentication,
	and turn it into mandated default mode.

    FEATURE/BUG:
	When a transport-agent dies due to e.g. SIGSEGV, simulate
	a diagnostic message telling of this situation!

    BUG? (2001/6/30)
	An error message produced by  "errormail"  channel appears to
	become lost sometimes ?  Error on error on error ?

    FEATURE (2001/6/30)
	Add syslogging also into the scheduler (msgerror.c)  - list
	spoolids created (=messages) while processing earlier msg.


    BUG? (2000/3/29)
	Scheduler running with  "maxthr>1"  seems to lose sync about
	tasks already fed to the processing.

    BUG?
	How does arrival of a new job into active thread work in case
	the current activity has already reached the end of the thread ?
	When will the new job be fed to the transport agent ?
		(solved?)

    BUG?
	Sometimes will generate spurious successful-delivery reports
	when the command is retried, and the tag selector is unable
	to start the real processor.
		(quite old entry -- 1997-1998 ?)

    BUG?
	When there are NO reports to be given out, produce no report
	message either -- we still seem to do it :-/    (20000313)
		Fixed ? (20000314)

    FEATURE:
	Tool/mechanism for requeueing a message from the transport
	back to the router -- all message recipients that have not
	been processed yet.

    FEATURE:
	Tool/mechanism for the scheduler to command expiration of all
	recipients in given thread.  (alike 'manual-expiry', except
	a bit smarter..)

    FEATURE:
	- The plain-text part of delivery report messages needs a
	  format that is easier for humans to understand.

    FEATURE:
	Reports about delivery DELAY.

    BUG?
	Have two waiting threads (one message each):
	- start both of them with ETRN
	- both issue report of "retryat +nn"
	- both are rescheduled about immediately..
	- system spins (FAST) on those two...
		(Fixed ?  Haven't seen this for a while)

    FEATURE:
	- Another:  maxlife  -- maximum lifetime for a transport agent
	  for limiting the time a smtp-channel stays open for a long-
	  living connection...


- smtpserver

    BUG:
	Control somehow if lack of configuration and/or PolicyDB is
	a system security offence, or not..

    BUG?:
        Timeout control on network socket writes ???

  - VERIFY: behaviour on BDAT (and DATA) phases when the disk fills
  -	Report that if the router subprocess crashes, interactive
	processing goes into mixed state -- it reports 2XX, and
	yet claims to be in wrong state (MAIL FROM or RCPT TO ?
	I have forgotten..)

    FEATURE:
        Script-controlled policy-testing -- without having ROUTER around ?
	Suggestion: If some *users* need RBL testing, do it for them, not
	for *all* addresses.
	(Script to do test order, choices, and reactions.)

	Counter suggestion: Delay the lookup until somebody uses the data
	Refutation:         Needs storing connecting IP address somewhere

    FEATURE:
	accept email to postmaster(s), never mind what else policy testers
	have determined...

    BUG:
	"-s" option processing is entirely wrong...

    FEATURE:
	The connection source IP reversal is not verified paranoidly, we
	may accept people who claim dishonest reversals..  (But then we
	can test against their IP address..)

    FEATURE:
	Have some flag passing mechanism from the initial policy db
	back to the smtpserver proper so that it can:
	- give more meaningful explanations of rejection reason
	- run more verification subroutines on addresses per policy
	  driver control; namely  "this domain can be valid source only
				   when it is coming from that IP address"
	  type of checkup...


- Integrate transports/fuzzyalias/ into the configuration system

- Manuals leave a lot to be desired.. Especially a good users/administrators
  manual is needed
	[July 1997: technical writers are hard at work on this one]
	[April 1998: Completing the work is at Matti's hands..]


- transports/sm/sm.c:
	- Support ESMTP + DELIVERBY (RFC 2852)
	- what UID does it run programs like procmail with ?
	  What uid should it run them with ?
	  I don't think the current (up to 2.99.49p*) way of running
	  procmail/cyrus is entirely safe.
	- I have gotten comments that they are suid/sgid root/root
	  at most installations, and indeed so procmail is at my
	  workstation too...
	- DSN handling flags ?
	- "localize" the destination address properly.  That is, strip
	  quotes from around a quoted destination address according to
	  the RFC822 syntax. -- Is this always correct ?


- post-install-check
	- check existence of mailq/tcp in /etc/services
	- check MAILVAR/MAILSHARE/POSTOFFICE protections
	- check "nobody" and "daemon" accounts

- systemwide .forward checker
	- iterate all users in the system
	- check peoples directory ownerships and permissions
	- check peoples .forward ownerships and permissions

- Need a program to verify that given configuration is ok, checks
  for things like:
	- "nobody" userid
	- scheduler's resource control working to the maximum
	  extent that system supports
	- mmap() correct operation  (GNU autoconf does this)
	- sprintf() return type is autoconfigured correctly

- Need a program to run thru various file permission checks:
	- $MAILVAR/db/ -dir, and files in it
	- $MAILVAR/lists/ -dir, and files in it
	- ~/.forward -files (and home directories)

- IPv6 stuff:
	- lib/selfaddrs.c: Doesn't properly do automatic IPv6 interface address
			   picking (IPv4 is ok)
	- libc/myhostname.c: doesn't do automatic IPv6
	- transports/hold/hold.c: incomplete NS/AAAA, no IPv6 PTR
	- transports/mailbox/mailbox.c: BIFF gethostbyname() ?

- whole chain (smtpserver/router/mailbox+hold):
	- Have a NEW pseudoheader (NRCPT) in addition to the ORCPT, which
	  we now (in not completely kosher manner) create, if we don't get
	  it originally.  That is, similar entry for traversing unchanged
	  thru the router to tell what address we got as input.
	  (Thru some pseudo-alias..)
	- Retain (in some TA-header object) the original "MAIL FROM:<..>"
	  address, and be able to store it into the MAILBOX channel.
	  (That is, don't let the routing process to alter this!)

- transports/mailbox/mailbox.c:
	- Store the ORCPT information into a "X-Orcpt-To: "-header
	- Store the NRCPT information into a "X-InRcpt-To: "-header
	- Store the ENVID information into a "X-Envid: "-header
	- Store the original FROM:<..> into a "X-OrigFrom: "-header

	- Provide aforementioned "headers" also as environment variables
	  to the pipes
	- At the router:run_rfc822(): if the message has these headers,
	  purge them away.
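
	The purge test at the router side could be as simple as the
	sketch below.  is_internal_header() is an invented name, and
	the header list just repeats the four names proposed above;
	none of this exists in run_rfc822() yet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp() */

/* Header names proposed above for mailbox delivery. */
static const char *const internal_headers[] = {
    "X-Orcpt-To", "X-InRcpt-To", "X-Envid", "X-OrigFrom"
};

/* Return 1 if `line` starts with one of those names followed by ':'. */
int is_internal_header(const char *line)
{
    size_t i;

    assert(line != NULL);
    for (i = 0; i < sizeof internal_headers / sizeof internal_headers[0]; i++) {
        size_t n = strlen(internal_headers[i]);

        /* line[n] is safe to read: the first n bytes matched non-NULs */
        if (strncasecmp(line, internal_headers[i], n) == 0 && line[n] == ':')
            return 1;
    }
    return 0;
}
```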

- Autoconfig problems:
	- System mailbox locking schemes are sometimes non-obvious..
	- For the latter we must PERHAPS reintroduce host-configuring
	  for describing host-system dependencies like mailbox-lock schemes

- Got mail with a suggestion: (via zmailer-list)

 I suggest you to improve the 'crossbar' function of 'router'.
 It might return a list of header rewrite functions as well
 as a single function like now. So different rewrite methods
 could be applied for sender and recipient addresses.
 The crossbar() C function contains some comment that refers
 to some similar thing, but I found it unimplemented.
         /*    
          * We expect to see something like
          * (rewrite (fc fh fu) (tc th tu)) or
          * ((address-rewrite header-rewrite) (fc fh fu) (tc th tu))
          * back from the crossbar function.
          */

  5-Dec-1999: Reading the code, actually it *is* implemented.
	      ... or perhaps not, *usage* of that double-rewriter
		  form isn't done, existing code has always expected
		  single rewriter address, not two elt list.

- When killing previous routers/scheduler/smtpserver, should wait
  for the previous process group leader to die before writing over
  the  $POSTOFFICE/.pid.KIND -file.

- Co-ordinated shutdown for the scheduler -- send a signal to it
  (SIGQUIT), and it sends out newlines to all processes still
  receiving anything from the scheduler, and shuts the feed channels
  down from the scheduler to the transporters.  Then it will spend
  some time (infinity?) to get responses from the transporters.
  When all children are dead, it can exit.
	20-Jan-97: Sort of facility exists now..

- DSN (Delivery Status Notification) mechanism does not (yet) report
  on "DELAY". (Dec-1999: placeholders for mechanism exist:
	r+          smtp zimage.com method@zimage.com 60000
	  123456ABCD  - 1-thru-6 are TA process PID space, A-thru-D are
		        a 'delay has been reported' flag storage.)

- Sometimes incoming SMTP can be a hellish load, need to introduce
  a load-limited incoming SMTP acceptor (smtpserver/smtpserver.c)
  (Rayan wrote it, but never released it..)
	13-Dec-94:  It exists for Linux ....
	23-Dec-94:  Pulled bits from "top" -program.  Now it
		    exists also for Suns..

  A discussion on the ZMailer -list revealed that the acceptance blocking
  is a BIG can of worms, and that even the  BSD-sendmail has had a long
  journey along the rocky path to "get it right"...  (5-Jan-95)

  Hmm.. Perhaps we could do a `single process doing multi-stream reception ?'
  it would require fairly large rewrite of the SMTP-server..

  Also: What to do when there are more incoming SMTP sessions than the
  process can have open file descriptors ?  One for the SMTP socket, one
  for each spool file in active use at the input phase, and one for log.
  (The stdio is used only for SMTP responses, and for spooling out the
   accumulated message.  Thus there SHOULD be enough resources for all
   uses -- except when the system runs out of FDs per any individual
   process..)
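
  For the single-process idea, the FD budget can be estimated up
  front.  A rough sketch: the function names are invented, and the
  numbers are assumptions (5 descriptors reserved for stdio, the
  listening socket and the log; 2 per session for the SMTP socket
  and the spool file).

```c
#include <assert.h>
#include <sys/resource.h>

/* Sessions that fit into `fd_limit` descriptors. */
long max_sessions_for(long fd_limit, long reserved, long per_session)
{
    assert(per_session > 0);
    if (fd_limit <= reserved)
        return 0;
    return (fd_limit - reserved) / per_session;
}

/*
 * Estimate for the current process: 0 if the limit cannot be read,
 * -1 if there is no limit at all.
 */
long max_smtp_sessions(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return 0;
    if (rl.rlim_cur == RLIM_INFINITY)
        return -1;
    return max_sessions_for((long)rl.rlim_cur, 5, 2);
}
```

  Connections beyond that count would have to be refused or left
  waiting in the listen queue.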

- Processing of several headers is questionable or lacking:
	Generic: header wrapping within N chars (like to 80-chars
		 space of BITNET) and also infinite wide systems,
		 like News.. (transports/libta/writeheaders.c ?)

	Generic: RFC-1342 aka. non-US-ASCII chars in the headers..
		 Auto-conversion and wrapping
