
# SID   @(#) coding.txt 1.10 22/10/26 12:02:31

# Description about Program Code and Documentation
# It is not shown with +help but can be retrieved with: $0 --help=ProgramCode

HACKER's INFO

    Program Code

        First of all: the main goal is to have a tool to be simple for users.
        It's not designed to be academic code or simple for programmers.

        Also testing for various flaws in other tools and protocols could not
        be done in  a standardised generic way, using well designed software,
        but often needs highly adapted code for each individual check, and --
        more worse -- sometimes variants of the same code.
        Please keep this in mind, before trying to unitise the code.

        Note:  following descriptions mainly uses the term "sub" (the perlish
        term) when talking about functions, procedures, and/or methods.

      Syntax style

        Identation is 4 spaces. TABs would be the better solution, IMHO.
        Unfortunately some repositories have issues with TABs,  so spaces are
        used only. Sick.

        Additional spacing is used to format the code for better human reada-
        bility. There is no strict rule about this, it's just done as needed.

        Empty lines (empty, without any space!) are used to group code blocks
        logically. However, there is no strict rule about this too.

        K&R-style curly brackets for subs and conditions are used.
        K&R-style round brackets for subs and conditions are used (means that
        definitions of subs, and calls of subs  do not use  spaces before the
        opening bracket, while conditions use spaces).

        Calls of subs are written in  K&R-style using round brackets, and the
        perlisch way without round brackets. This may be unitised in future.
        Main exception is  print, which is used without brackets.

        Short subs and conditions may be written in just one line.
        Note: there is no need for each command in its one line, as debugging
        on code level is rarely done.

        subs are defined with number and type of parameters to have a minimal
        syntax check at compile time.
        This will be changed in future.

	For quoting see below.

      Code style

        Global variables are used, see X&Variables& for details below.

        Variables are declared at beginning of subs. I.g. we do not use local
        or my declarations in blocks (there may be some exceptions).

        The code tries to avoid if-else constructs as much as possible. If an
        else condition is used, it is written in one line:   } else {  .
        elsif is used in "borrowed" code only ;-)

        Early return statements in subs are prefered, rather than complicated
        and nested conditions. There are also goto statements in parsers, but
        return statements are prefered.

        Most code is seqential instead of using functions, except the code is
        used multiple times. This is currenlty changed ...
        It is not intended to have OO-code,  even perl's  OO capabilities are
        used when rational.

      Code quality

        The code is regulary analysed using "perlcritic" -a front-end command
        line tool to the Perl::Critic module. As decribed above, this code is
        not supposed to be academically correct. Nevertheless, we follow most
        recommendations given by perlcritic to make the most of it.
        For details please see:

          *  Annotations at end of o-saft.pl
          *  .perlcriticrc
          *  contrib/critic.sh .

        All tests can be done with make, see:

          *  make help.test.critic

      Quoting

	Using single quotes  '  and/or double quotes  "  is the nature of the
	processed data, the generated data or the program code.  These quotes
	are written literally in data, whereas  Perl provides various methods
	to handles different types of quotes. Coding gets complicated if both
	quotes are part of the data, in particular when printing data.

        The  code  mainly uses  'text enclosed in single quotes'  for program
        internal strings such as hash keys. It uses  "double quoted text" for
        texts being printed. However, exceptions if obviously necessary ;-)
        Strings used for  RegEx are always enclosed in single quotes.  Reason
        is mainly to make searching texts a bit easier.

	Perl's operators  'q()' and 'qq()' are used only to avoid escaping of
	character, because all types of quotes can smoothly mixed there.
	In general  "double quoted text" is used for interpolated strings. In
	some cases, it is necessary to use Perl's  'sprintf()'  function.
	For details, please see:

	    perldoc -f q
	    perldoc -f qq
	    perldoc -f sprintf
	    perldoc perlop # following sections:
	        Comma Operator
	        Quote and Quote-Like Operators
	        Gory details of parsing quoted constructs

      General

        Exceptions are not used, there is no need for them.

        In general, the code *must not* use any additional libraries. We know
        that there exist infinite marvellous libraries and frameworks (called
        modules in perl), which would make some programming simpler,  but one
        of the main goals of this tool is that it  should work on  any system
        with just the core language (i.e. perl) installed. We do not want any
        additional dependency, in particular no dependency on versions beside
        the core language.  Currently some perl modules are an exception, and
        will be removed in future.
        However, libraries part of perl5 are assumed to be "core language".

        Perl's  'die()'  is used whenever an unrecoverable error occurs.  The
        message printed will always start with '**ERROR: '.
        warnings are printed using perl's  'warn()'  function and the message
        always starts with '**WARNING: '.

        All output is written to STDOUT.  However perl's 'die()' and 'warn()'
        write on STDERR. Only debug messages inside $0 are written to
        STDERR.

        All 'print*()' functions write to STDOUT directly.  They are slightly
        prepared for using texts from  the configuration (%cfg, %checks),  so
        these texts can be adapted easily (either with  OPTIONS  or in code).

        Calling external programs uses 'qx()' rather than backticks or perl's
        'system()' function.  Also note that it uses round brackets insted of
        slashes to avoid confusion with RegEx.

        The code flow often uses postfix conditions, means the  if-conditions
        are written right of the command to be executed. This is done to make
        the code better readable (not disturbed by conditions).

        The code is most likely not thread-safe. Anyway, we don't use them.

        For debugging the code, the  --trace  option can be used. See  DEBUG
        section below for more details. Be prepared for a lot of output!
	In fact it's a mix of tracing and debugging, to be technical correct.

      LHS condition checks

        It's a common (human) error to write assignments where conditions are
        required, in paricular in  if(), while()  and  until()  statements.
        To avoid such unintended statements, which often happen when changing
        the code,  the conditions are written with the constant  LHS  and the 
        variable  RHS, for example:   if (0 < $var) { print "$var > 0"; }
        This works with constants (integers, strings) and variables only, but
        not when matching RegEx (probably with Perl 6.x).

      Comments

        Following comments are used in the code:

          # TODO:       Parts not working perfect, needs to be changed.
          # FIXME:      Program code known to be buggy, needs to be fixed.
          # NOTE:       Special brief descriptions.
          #!#           Comments not to be removed in compressed code.
          #?            Description of sub.
          #|            Code sections (documents program flow).
          ##            Comments used by third-party programs  (for example:
                        contrib/gen_standalone.sh, perlcritic).
          # sub-name    Name of sub behind the closing bracket of the sub.

        Comments usually precede the code line(s) or are placed at end of the
        code line which they belong too. If the comments are placed after the
        code line which they belong too, the lines are idented.

      Annotations

        To reduce the huge amount of comments, and to write them only once if
        needed multiple times,  an annotation section  have been added at the
        end of  $0  file. In this section a title for the comment is
        followed by the comment's text. Example:

          == Note:A Title
             Text for comment.

        Such comments are refered to in the code by:

          SEE Note:A Title

        This is experimental, to see if it helps  to make the control flow of
        code more readable by humans, but keep the full documentation.

      Modules

        I.g. using modules, Perl modules and private ones, should be avoided.
        Unfortunately maintaining huge code  without proper modularization is
        difficult. Hence Perl modules are reduced to a minimum needs, and all
        private modules are loaded when needed only.

        All private modules are in the sub-directory  OSaft. The name  O-Saft
        is prefered, but the  -  (dash) in the name is not possible in Perl's 
        module names. Sick.

        Some modules are still in the main directory. This will change soon.

        While  Net::SSLinfo  uses  Net::SSLeay(1), $0 itself uses only
        IO::Socket::SSL(1). This is done 'cause we need some special features
        there. However,  IO::Socket::SSL(1)  uses  Net::SSLeay(1)  anyways.

      Variables

        As explained above, global variables are used to avoid definitions of
        complex subs with various parameters.

        Most subs use global variables (even if they are defined in main with
        'my'). These variables are mainly: @DATA, @results, %cmd, %data, %cfg,
        %checks, %ciphers, %prot, %text.

        Variables defined with 'our' can be used in   o-saft-dbx.pm   and
        o-saft-usr.pm.

        For a detailed description of the used variables, please refer to the
        text starting at the line  '#| set defaults'  in  o-saft.pl.

      Function names

        Some rules used for sub names:

          check*        Functions which perform some checks on data.
          print*        Functions which print results.
          get_*         Functions to get values from internal data structure.
          _<function_name>    Some kind of helper (internal) function.
          _trace*
          _y*           Print information when  --trace  is in use.
          _v*print      Print information when  --v  is in use.

        Function (sub) definitions are followed by a short description, which
        is just one line right after the 'sub' line.  Such lines always start
        with  '#?'  (see below how to get an overview).

        Subs are ordered to avoid forward declarations as much as possible.

        In general all names are in lower case. Using _ instead of camel case
        is prefered. However, there are some exceptions for wrapper functions
	for  Net::SSLinfo  and  Net::SSLeay .

      Code information

        Examples to get an overview of perl functions (sub):
          egrep '^(sub|\s*#\?)' $0

        Examples to get an overview of programs workflow:
          egrep '(^#\|\s|\s\susr_)' $0

        Following to get perl's variables for checks:
          $0 +check localhost --trace-key \
          | awk -F'#' '($2~/^ /){a=$2;gsub(" ","",a);next}(NF>1){printf"%s{%s}\n",a,$2}' \
          | tr '%' '$'

        Makefile.dev contains following targets for easy use:
          make testarg-dev-grep_subs
          make testarg-dev-grep_sub
          make testarg-dev-grep_desc

      Debugging, Tracing

        Most functionality for trace, debug or verbose output is encapsulated
        in functions (see X&Function names& above). These subs are defined as
        empty stubs in $0. The real definitions are in  o-saft-dbx.pm,
        which is loaded on demand when any  --trace*  or --v  option is used.
        As long as these options are not used,  $0  works without
        o-saft-dbx.pm.

        Debug messages always start with    '#dbx#'.
        verbose messages always start with  '#o-saft.pl: '.
        Trace messages always start with    '#o-saft.pl::'.
        Following formats are used in trace messages:
          #o-saft.pl:: some data           - output from $0's main
          #o-saft.pl::subfunc(){           - inital output in subfunc
          #o-saft.pl::subfunc: some data   - some output in subfunc
          #o-saft.pl::subfunc() = result } - result output of subfunc
        However, these rules are implemented very lazy.

        The prefix for verbose and trace message can be configured like:
          $0 --cfg-init=prefix_verbose="#VERBOSE: "
          $0 --cfg-init=prefix_trace="#my-trace:: "

        When  --trace  is used,  additional trace output  with timestamps are
        are also printed, even if no  --trace-time  was given. This is useful
        because it automatically "scopes" other output with  '{'  and  '}'.

        Note: in contrast to the name of the RC-FILE, the name  o-saft-dbx.pm
        is hard-coded.

      Abstract program flow
          check special options and command (+exec, +cgi, --envlibvar)
          read RC-FILE, DEBUG-FILE and USER-FILE if necessary
          initialise internal data structure
          scan options and arguments
          perform commands without connection to target
          loop over all specified targets
              print DNS stuff
              open connection and retrive information
              print ciphers
              print protocols
              print information
              print checks

      Arguments (+commands and --options)

        As described multiple times, it should be possible to mix options and
        commands and other arguments in any order. It is also possible to use
        various formats of commands and options.
        A simple method to allow variants of  a string (command or option) is
        to match it against RegEx.  Unfortunately it is hard to use a generic
        way to parse commands and options. perl's Getopt module would be nice
        but requires a hash with fixed keys.  Using a hash,  which conatins a
        proper RegEx for each command and option, could be done, but the code
        for such a sophisticated parser may be hard to understand.  So a loop
        over all arguments with a huge "switch" statment is implemented. Each
        "switch-case" matches a RegEx,  and then assign a proper value in the
        configuration. See the  "#{ COMMANDS"  and  "#{ OPTIONS"  sections in
        $0 .

      Program flow

        As explained in the documentation (please see +help) there are mainly
        3 types of `checks':
          +info    - getting as much information as possible about the target
                     its certificate and the connection
          +cipher  - checking for supported ciphers by the target
          +check   - doing all the checks based on +info and +cipher

        Most information is collected using Net::SSLinfo and stored in %data.
        All information according ciphers is collected directly and stored in
        @cipher_results . Finally, when performing the checks, these informa-
        tions are used and compared to expected well know values. The results
        of these checks are stored in  %checks .
        Then all information from  %data and %checks is printed by just loop-
        ing through these hashes.

        Information is just collected using  Net::SSLinfo  and then printed.
        Checks are performed on provided data by  Net::SSLinfo  and specified
        conditions herein.  Most checks are done in functions  'check*',  see
        above.
        Some checks depend on other checks,  so check functions may be called
        anywhere to solve dependencies. To avoid multiple checks,  each check
        function sets and checks a flag if already called, see  $cfg{'done'}.

      Documentation

        Documentation for the project consist at least of these parts:

          u)  documentation for the user
              SEE Public User Documentation (in o-saft.pl).
          d)  documentation for development
              SEE Internal Code Documentation (in o-saft.pl).
          m)  documentation for testing in development
              This dcumentation for testing the tools is done in t/Makefile*.
              There exists  t/Makefile.pod,  which contains the "Annotations"
              for the make system.  The description of the Makefile's targets
              is done as target in each Makefile itself.

        Note that all documentation (user and development) is  actively  used
        by the program itself and by some additional tools, see for example:
            o-saft.pl --help=opt

        Initially all user documentation was written in  Perl's POD  format.
        After two years of development, it was observerd that POD was not the
        best decision, as it makes extracting information from  documentation
        complicated, sometimes. Using POD is also a huge  performance penulty
        on all platforms. Hence  POD  will now be genereated on request using
        the  --help=pod  option.

        For mor Details  SEE Note:Documentation (in o-saft.pl).


    Repository

        All code is maintained in a repository. Due to various conceptual and
        personal opinions,  the master repository used internally is based on
        SCCS/CSSC, while the public repository is git/github.  The maintainer
        keeps both in sync.  The public repository must contain workable code
        only, well nobody is perfect ...

      Rules

        * Each file has at least one string containing  the unuique id of the
          file version in the repository. This id is named  SID  or SID_<id>.
          This string looks like:
            filename ID_in_repository date_of_last_change time_of_last_change
          example:     filename 1.42 18/10/23 23:42:23 
          If this is assigned to a variable, it may look like:
            my $SID = "coding.txt 1.10 22/10/26 12:02:31";
          (Note beside: previous example is the SID of this file)

          Exceptions: some files for documentation only miss the SID string.

        * Changes are made in very small chunks,  each commited separately to
          the repository.

        * Different types of changes see  Commit Comments  below) must not be
          mixed in one commit.

      Commit Comments

        * Since 11/2018 all comments are prefixed with a 2-letter code, which
          describes the type of the committed change. The types are:

          Bd: - bugfix in internal (developer) documentation
          BD: - bugfix in public/user documentation
          BT: - bugfix in core SSL/TLS check or test functionality
          BF: - bugfix in general functionality or feature
          ED: - enhancement/improvement in documentation
          ET: - enhancement/improvement in core SSL/TLS checks or tests
          EF: - enhancement/improvement in general functionality or feature
          ND: - new documentation
          NT: - new core SSL/TLS check or test functionality
          NF: - new general functionality or feature

          Lower case b instead of B means  that the bug was  introduced in a
          recent change.

VERSION

        @(#) $VERSION

AUTHOR

        18. Jan. 2018 Achim Hoffmann
