# Bugs and Todo list


# Bugs
Empty attribute values such as `''` or `""` are translated incorrectly.
* [ ] The parser should die on obvious errors:
  * [ ] `$!abc.def.ghi`
  * [ ] `$a.b`
  * [ ] `$a b=c d f=a`. This becomes `$a b=c [] d f=a`
  * [x] One `$br` generates two of them! This was caused by improper input key selection, where the proper table for html could not be found.


# Todo

## Redesigning the program
The program was redesigned because more and more actions became entangled in the parsing phase. Several of these actions can be pulled out of the parsing phase and run after parsing has finished, which gives a better separation of concerns. The tree of node objects is built first, after which methods in external modules are called with the attributes and content as arguments.

### Benefits so far
* XML is generated by the classes themselves, so the external XML module can be dropped. It might still be needed to reverse engineer XML text into an internal node tree.
* Elements and attributes in the sxml namespace are not removed. They just do not generate XML except for the sxml:fragment element at the top when appropriate.
* Basic searching is implemented in the Node class. This searching has the same properties as xpath using codes and paths. Later a grammar might be created to simplify the calling interface. This means that XML::XPath is also not needed anymore. However, some tests still use it.
* The object tree is built before external methods are called. This means that these methods can search the tree and insert objects at places other than where the method is called from. A previously created method that mapped elements to other places afterwards is therefore no longer needed.

```plantuml


Start: SemiXML text

[*] -> Start
Start --> parsing

state "Parsing process" as parsing {
  [*] --> parse
  parse: Parse SemiXML\ntext
  parse --> error
  parse -> success
  error: Throw exception\nand finish
  error -> [*]
  success: Return AST
  success --> [*]
}

parsing -> tree
state "Build node tree" as tree {
  [*] -> mktop
  mktop: Create top
  note top of mktop : Top can be a real\nroot node or holder\nof a fragment
  rast: Read AST
  mktop -> rast
  note top of rast : Read the AST and\nbuild the node tree
  rast --> [*]
}

tree --> meth
state "Modify tree by calling methods" as meth {
  [*] -> init
  init: init module
  note top of init : Initialization of methods\nworking through\ntree top down
  init -> call
  call: call methods
  note top of call : Calling methods\nworking through\ntree bottom up
  call --> [*]
}

meth --> ntree
ntree: Finished node tree

ntree --> [*]
note right of ntree : From this tree it is easy\nto generate XML text
note left of ntree : Other modules can instead\ngenerate this tree. E.g. reverse\nengineer from XML
```
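
The two tree walks named in the diagram above (initialization top down, method calls bottom up) can be illustrated with a minimal sketch. The `Node` class and the function names here are hypothetical stand-ins, not the SemiXML classes; the sketch only shows the visiting order.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    # Hypothetical stand-in for a node in the finished tree.
    name: str
    children: List["Node"] = field(default_factory=list)


def walk_top_down(node: Node, visit: Callable[[Node], None]) -> None:
    """Visit a node before its children (order assumed for initialization)."""
    visit(node)
    for child in node.children:
        walk_top_down(child, visit)


def walk_bottom_up(node: Node, visit: Callable[[Node], None]) -> None:
    """Visit the children before the node itself (order assumed for method calls)."""
    for child in node.children:
        walk_bottom_up(child, visit)
    visit(node)


tree = Node("html", [Node("body", [Node("p"), Node("sxml:module.method")])])

init_order: List[str] = []
call_order: List[str] = []
walk_top_down(tree, lambda n: init_order.append(n.name))
walk_bottom_up(tree, lambda n: call_order.append(n.name))
```

In the bottom-up walk a method sees its children already processed, which is what allows it to use finished content.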

```plantuml


class Node <<Role>> {
  Element: parent
  enum: element-type
  Array[Node]: nodes
}

class Element {
  Array[Body]: bodies
}

Actions *-> Element
Sxml *--> Actions
Sxml *--> Grammar
Sxml *--> Document
Document *--> Node
Actions o--> Node

Node <|-- Element
Node <|-- Text

'Node -> Node
'Node --> Node

```

## Parser and actions.
* [x] An exception class X::SemiXML is created to throw parsing errors.
* [x] Throwing happens from within Grammar and Actions instead of Sxml.
  * [x] **Attributes must be followed by a content body**. An error is thrown when attributes are used but no content body follows. Previously this was allowed, but it hurts readability. E.g. which is easier to understand: `$a key=value text follows` or `$a key=value [] text follows`? For elements without attributes it is still useful to leave out the brackets when there is no content. E.g. `$p [ first line $br second line ]`
  * [x] **Cannot start a content body with '['**.
  * [x] **Unexpected content body start/close character**.

## Syntax
* An XML element name can contain any alphanumeric character. The only punctuation marks allowed in names are the hyphen '-', underscore '\_' and period '.'. XML namespace prefixes are separated from the name by one colon ':'. These characters cannot be used to start an element or to separate a module key from its method.

```
      Current syntax          Becomes             Note    Done

      $|xyz []                $xyz                        x
      $|xyz [x]               $xyz [x]                    x
      $xyz a=b                $xyz a=b []         7       x

      $*|inline [x]           $inline [x]         3       x
      $|*inline [x]           $inline [x]         3       x
      $**inline [x]           $inline [x]         3       x
                              $other =sxml:inline [x]

      $|nonnest [! x !]       $nonnest {x}                x
                              $nonnest «x»                x

      $|spcresrv [= x ]       $spcresrv [x]       3,5     x
                              $other =sxml:keep [x]

      $!key.method [x]        Remains the same    4       x
```
* Notes:
  1) The `$*|`, `$|*` and `$**` forms are all removed.

  2) Removing comments is done at a later phase after parsing.

  3) The configuration will be searched for those elements which are inline and need a special treatment of spacing around elements. Also non nestable and space preserving elements are searched for in the configuration. The inline elements must also check for some non-alphanumeric characters following the block. E.g. in case of `,` or `.` etc. no space should be placed between the block and the following character.

  4) Module methods also get a name in the sxml namespace. E.g. `$!module.method` gets the name **sxml:module.method**. Therefore these names can also be used in the F-table entries. See below for more about the sxml namespace.

  5) The space preserving '=' character at the start of a block is removed completely. One can specify in a local configuration file that some elements are space preserving. Furthermore the :keep option will keep all spacing as it was typed in.

  6) Some of the above can be changed by using boolean attributes like `sxml:inline`, `sxml:keep`, `sxml:noconv` and `sxml:close`.

  7) The content brackets are made obligatory even when there is no content.

## List of elements and attributes in sxml ns
### Namespace
* [x] reserved prefix name is **sxml**
* [x] url for the ns is **https://github.com/MARTIMM/Semi-xml**

### Attributes in ns sxml
* [x] **sxml:inline**
* [x] **sxml:keep**
* [x] **sxml:noconv**
* [x] **sxml:close**

### Elements in ns sxml
* [x] **sxml:fragment**. Top level element. Not always visible when only one top element is created.
* [x] **sxml:TN- < max-10-characters-text > - < 3-digit-hexnum >**. Text node names generated from the content and a generator e.g. **sxml:TN-thatsit-00A**.
* [x] **sxml:css-block**. Generated by one of the methods in Css module.
* [x] **sxml:var-decl**. Declare a variable.
* [x] **sxml:var-ref**. Reference to a variable declaration.
* [x] **sxml:modkey.method**. Generated name for a method node.
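
The text node naming pattern listed above (`sxml:TN-`, at most 10 characters of text, a 3-digit hex number) can be sketched as follows. How the text part is derived from the content is not specified here, so this sketch assumes lowercased alphanumerics; the function name and the counter argument are hypothetical.

```python
import re


def text_node_name(text: str, counter: int) -> str:
    """Build a name like sxml:TN-thatsit-00A.

    Assumption: the text part is the content stripped of non-alphanumerics,
    lowercased and cut to 10 characters. The counter stands in for the
    generator mentioned in the list and is formatted as 3 hex digits.
    """
    slug = re.sub(r"[^A-Za-z0-9]", "", text).lower()[:10]
    return f"sxml:TN-{slug}-{counter:03X}"


example_name = text_node_name("That's it!", 10)
```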

## Addition of several types of comments
  * [x] **# \<text> EOL**. Comments are removed and can only be used at the top level and in **\$x [ ... ]** parts. '#' characters used within **\$x { ... }** or **\$x « ... »** are left unprocessed.
  * [x] Generated XML Comments using **\$!SxmlCore.comment [ ]**. These produce `<!-- ... -->` texts.
  * [x] $!SxmlCore.drop « ... » throws away all that is enclosed.

## External modules located in SxmlLib tree
* [x] Library paths to find modules are provided using the ML table in default configuration from the resources directory.
* [ ] A module should be accessible from within another perl6 sxml module. Problem of registration.


## Attribute grammar
* [x] **key=value**. Value cannot have spaces.
* [x] **key='v a l u e'**. Value can have spaces.
* [x] **key="v a l u e"**. Value can have spaces.
* [x] **=x** and **=!x**. Boolean attributes, meaning **x=true** and **x=false** respectively.
* [x] **key=<v a l u e>**. Attributes are also given as arguments to module methods. In this case the attribute value becomes a list of values ('v','a','l','u','e'). The items are split on spaces and the characters ',', ';' or ':'. The value can therefore also be written like **key=<v, a,l,u :;e>**. Of course, choose wisely for readability! Empty items are not possible.
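
The list splitting and the boolean shorthand described above can be sketched as below. This is only an illustration of the stated rules, not the grammar used by the parser; both function names are hypothetical.

```python
import re
from typing import List, Tuple


def split_list_value(raw: str) -> List[str]:
    """Split the text between < and > on spaces and on ',', ';' or ':'.

    Runs of separators count as one, so empty items never appear.
    """
    return [item for item in re.split(r"[\s,;:]+", raw) if item]


def parse_boolean_attribute(token: str) -> Tuple[str, bool]:
    """Turn '=x' into ('x', True) and '=!x' into ('x', False)."""
    if not token.startswith("=") or token in ("=", "=!"):
        raise ValueError(f"not a boolean attribute: {token!r}")
    if token.startswith("=!"):
        return token[2:], False
    return token[1:], True
```

Both spellings from the text give the same list: `split_list_value("v a l u e")` and `split_list_value("v, a,l,u :;e")`.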


### Content body delimiters.
* [x] `[ ... ]`. The content can have other elements, which are handled automatically by the grammar. The content text can be any range of characters, of which the characters `$`, `[`, `]` and `\` must be escaped using a backslash character. E.g. `\[` or `\]`. After parsing, all comments are removed. These start with `#` and end at the end of a line or at the end of a content body. To use a `#` in text, it must also be escaped.
* [x] `{ ... }`. The content cannot have any elements, they will not be interpreted and left as text. Also, comments are not removed. Other characters can be used freely except for the `\`, `{` and `}`.
* [x] `« ... »`. The interpretation of this content is the same as for `{ ... }` except the characters needed to escape are now `\`, `«` and `»`.
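
The comment rule for `[ ... ]` bodies can be sketched with a single regular expression. This is a simplification that works on a plain string and assumes the body text has already been isolated; it does not handle an escaped backslash directly in front of `#`, and it knows nothing about nested `{ ... }` or `« ... »` bodies. The function name is hypothetical.

```python
import re

# An unescaped '#' up to (not including) the end of the line.
_COMMENT = re.compile(r"(?<!\\)#[^\n]*")


def strip_comments(body: str) -> str:
    """Remove '#' comments from the text of a [ ... ] body."""
    return _COMMENT.sub("", body)


cleaned = strip_comments("first line # a comment\nprice \\# 5")
```

The escaped `\#` on the second line survives, while the comment on the first line is dropped together with everything up to the line end.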

### F Table
The F table in the configuration is used to control the formatting of the elements and text in the content bodies. There are 4 entries all having an array of element names. The element names are checked against the current element in the translation process. For HTML, most of the elements are defined in the proper categories. Docbook5 is in progress.
* [x] `inline`. When elements are in this category, the program will check around this element to see if the spacing is done right. It also checks for punctuation characters following the element. Example elements from HTML are `b`, `strong` or `a`.
* [x] `no-conversion`. This is a category of elements whose content must not be interpreted or changed. Most content can also be controlled like this by using one of the last two body types. Examples of this type are `script` or `style`. The conversions meant here are those for characters which confuse XML and must be 'escaped' into entities. These are:
  * `&` -> `&amp;`
  * `<` -> `&lt;`
  * `>` -> `&gt;`
  * `\s` -> `&nbsp;` This entity must be defined in the DTD or doctype except for html.
  * `\<any char>` -> `<any char>`
* [x] `space-preserve`. No spaces are removed except to reduce indenting on multi line text. One example is `pre`.
* [x] `self-closing`. This category holds the elements that do not have content. Examples are `meta`, `br` and `hr`. If content is supplied, it is removed.
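
The conversions listed under `no-conversion` can be sketched as a small function. Two things are assumptions of this sketch: `\s` is read as the literal two-character sequence backslash plus `s`, and the entity escaping is done before the backslash handling. The function name is hypothetical.

```python
import re


def convert_text(text: str) -> str:
    """Apply the conversions that elements in `no-conversion` skip."""
    # Characters that confuse XML become entities; '&' must go first.
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

    def unescape(match: "re.Match[str]") -> str:
        char = match.group(1)
        # '\s' becomes a non-breaking space, any other '\x' becomes 'x'.
        return "&nbsp;" if char == "s" else char

    return re.sub(r"\\(.)", unescape, text, flags=re.DOTALL)


converted = convert_text("a \\< b & c\\sd \\$x")
```

For an element in the `no-conversion` category, or one carrying `=sxml:noconv`, the text would be passed through unchanged instead.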

### Attributes
Special attributes can be used to modify the behavior of the F-tables. These are boolean typed. An example is `=sxml:inline` to force the element in the inline category and `=!sxml:inline` to force the opposite.
* [x] `sxml:inline` controls the inline category.
* [x] `sxml:noconv` controls the no-conversion category.
* [x] `sxml:keep` controls the space-preserve category.
* [x] `sxml:close` controls the self-closing category.


## Items needed in program sxml2xml or SemiXML/Sxml.pm6
  * [x] Dependencies on other files. This is controlled by the D table in the config.
  * [ ] After having translated to, or loaded from XML sources, try to reverse engineer the XML back into sxml. The result can only be a static result but it can be helpful to get Sxml text from XML templates and then modify the code later.
  * [ ] Add a convenience method to Helper.pm6 to process %attrs for class, id, style etc. and add those to the provided element node. Then remove them from %attrs. `method std-attrs ( XML::Element $node, Hash $attributes ) { }`

## A few of the core methods are transformed to simple tags.
* [x] `$!SxmlCore.comment` is now `$sxml:comment`
* [x] `$!SxmlCore.cdata` is now `$sxml:cdata`
* [x] `$!SxmlCore.pi target=x` is now `$sxml:pi target=x`

## Configuration
The configuration is maintained in a `toml` type of config file. The user must edit this file to control the transformation process. There can be many files which are merged together using the Config::DataLang::Refined module. There are several steps to find and merge these config files;
  * [x] **Perl6 Resource Location/sha1 translated resource for SemiXML.toml**. This one is read first and holds some defaults for use with XML, HTML and Docbook.
  * [x] **user's sxml file's directory/SemiXML.toml**.
  * [x] **user's home directory/.SemiXML.toml**.
  * [x] **current directory/.SemiXML.toml**.
  * [x] **current directory/SemiXML.toml**.
  * [x] **user's sxml file's directory/sxml filename.toml**.
  * [x] **user's home directory/.sxml filename.toml**.
  * [x] **current directory/.sxml filename.toml**.
  * [x] **current directory/sxml filename.toml**.
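
The search order above can be written down as a list of candidate paths. This sketch makes several assumptions: "sxml filename" is taken to be the file name without its extension, the last entry is taken to be the non-dotted variant by analogy with `SemiXML.toml`, and the resource location is simply passed in. The function name is hypothetical and the merging itself (done by Config::DataLang::Refined) is not shown.

```python
from pathlib import PurePosixPath
from typing import List


def config_candidates(
    resource_file: PurePosixPath,
    sxml_file: PurePosixPath,
    home: PurePosixPath,
    cwd: PurePosixPath,
) -> List[PurePosixPath]:
    """Return the config files in the order they would be looked up."""
    sxml_dir = sxml_file.parent
    stem = sxml_file.stem  # assumed meaning of "sxml filename"
    return [
        resource_file,
        sxml_dir / "SemiXML.toml",
        home / ".SemiXML.toml",
        cwd / ".SemiXML.toml",
        cwd / "SemiXML.toml",
        sxml_dir / f"{stem}.toml",
        home / f".{stem}.toml",
        cwd / f".{stem}.toml",
        cwd / f"{stem}.toml",  # assumed non-dotted variant
    ]


candidates = config_candidates(
    PurePosixPath("/res/SemiXML.toml"),
    PurePosixPath("/docs/page.sxml"),
    PurePosixPath("/home/u"),
    PurePosixPath("/work"),
)
```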

The configuration file holds a few tables which can be refined using keywords. These keywords are provided via the `:refine([in,out])` attribute or the `--in=...` and `--out=...` command line options. When choosing the proper keywords, keep the following in mind. The document you edit is always written in the **sxml** language. The XML language it represents is the first keyword (by default **xml**) and the format it should become is the second keyword (also by default **xml**). An example is `--in=docbook5` and `--out=pdf` or `:refine([<docbook5 pdf>])`.

### Configuration table
    # [C] Content additions table. only used with out-key and file. Looked
    # up after parsing to prefix data to result. Used for booleans to control
    # inclusion of XML description(X table), doctype(E table) and message
    # header(H table)
    [ C ]
    [ C.out-key ]
    [ C.out-key.file ]

    # [D] Dependencies table, only with in-key. The file is used to
    # specify the array of files on which this file depends.
    # Looked up before everything is started. Used by sxml2xml program.
    [ D ]
    [ D.in-key ]
    out-key = [ 'dep-file in-key;dep-file out-key;dep-file', ...]
    out-key = 'dep-file in-key;dep-file out-key;dep-file'

    # [DN] Default namespaces and other namespaces on the root element
    [ DN.in-key ]
    [ DN.in-key.file ]
    # attributes xmlns="url1" and xmlns:svg="url2"
    default = "url1"
    svg = "url2"

    # [E] Entity table. Only with in-key and file.
    # Looked up after parsing to prefix data to result.
    [ E ]
    [ E.in-key ]
    [ E.in-key.file ]

    # [F] Formatting table. Used to control formatting of text. Used while
    # parsing and translating.
    [ F ]
    [ F.in-key ]
    [ F.in-key.file ]

    # [H] Http table, only with out-key and file. Looked up after parsing.
    [ H ]
    [ H.out-key ]
    [ H.out-key.file ]

    # [ML] Combined module and library table. Only with in-key and file.
    # Looked up just before parsing.
    [ ML ]
    [ ML.in-key ]
    [ ML.in-key.file ]
      mod-key = 'Module[;library]'

    # [R] Run table only with in-key and file. The run-key is used to select
    # the command line. Looked up after parsing. Used to send the total
    # finished document to a program for further processing instead of saving
    # it to disk.
    [ R ]
    [ R.in-key ]
    [ R.in-key.file ]
      run-key = 'command line'
      [ run-key = 'command line', target-file]

    # [S] Storage table, only with file. Looked up after parsing.
    [ S ]
    [ S.out-key ]
    [ S.out-key.file ]

    # [T] Trace table. Does not use in or out keys, only the filename
    [ T ]
    [ T.file ]

    [ U ]
    [ U.in-key ]
    [ U.in-key.out-key ]
    [ U.in-key.out-key.file ]

    # [X] xml description table
    [ X ]
    [ X.out-key ]
    [ X.out-key.file ]


  All these ideas could also replace the single `--run` option of the program, which only had a selective influence on the [output.program] table. Also, fewer files might be searched than in the list shown above.
  This is now implemented.


## Modules and ideas
Many parts of any XML-like language can be coded, so this will never be finished. Let's say that once a few things are implemented, there are examples on which to build the next methods.


### Plugin modules
* [ ] Use role Pluggable to handle plugin modules. Delivered modules in the Sxml namespace can be handled this way.
* [ ] Use the resources field from META.info to save the core Sxml pluggable modules.


### What a module must be able to do

* [x] Get hold of the primary sxml file name which is parsed. It is now stored as a filename attribute in the Globals class and is readable for every module.
* [ ] Call another sxml module.
* [x] Access to the configuration.
* [x] A module user may define entries in the configuration for the module to use. These entries could reside in the [ U ] table (or user table).


### Html
* [x] Support html
  * [x] `IN` refinement assumed to be `html`
  * [x] config.toml in resources


### css a la scss/sass
* [ ] **SxmlLib::Css**. Support css

Css can be generated using methods. Blocks can be nested as is done in sass/scss. The variables described in the Variables section can help here, for example to generate color palettes.

* [x] **\$!css.style** is used at the top and generates the \<style> element with the css content.
An example css definition:
  ```
  $!css.style [
    $!SxmlCore.colors base='red' type=single-color []
    $!css.b s='.infobox >' [
      $!css.b s=.message [
        border: 1px solid $sxml:color-four;
        $!css.b s='> .title' [
          color: $sxml:color-eight;
        ]
      ]
      $!css.b s=.user [
        border: 1px solid black;
        $!css.b s='> .title' [
          color: black;
        ]
      ]
    ]
  ]
  ```

  The code above could produce the following. (The real output is more like a one-liner, but it is pretty printed here.)

  ```
  <style>
  .infobox > .message {
    border: 1px solid #440000;
  }

  .infobox > .message > .title {
    color: #880000;
  }

  .infobox > .user {
    border: 1px solid black;
  }

  .infobox > .user > .title {
    color: black;
  }
  </style>
  ```
  * [x] block with selector spec
  * [x] nesting blocks like in sass
  * [x] reset css definitions
  * [ ] looping structures, sass like
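
The sass-like nesting shown in the example above amounts to prefixing each block's selector with the selectors of its ancestors. A minimal sketch of that flattening step follows; the `Block` class and `flatten` function are hypothetical and are not the SxmlLib::Css implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    # Hypothetical model of one `$!css.b s=... [ ... ]` block.
    selector: str
    declarations: List[str] = field(default_factory=list)
    children: List["Block"] = field(default_factory=list)


def flatten(block: Block, prefix: str = "") -> List[Tuple[str, List[str]]]:
    """Return (full selector, declarations) pairs in document order."""
    selector = f"{prefix} {block.selector}".strip()
    rules: List[Tuple[str, List[str]]] = []
    if block.declarations:  # blocks without declarations emit no rule
        rules.append((selector, block.declarations))
    for child in block.children:
        rules.extend(flatten(child, selector))
    return rules


infobox = Block(".infobox >", children=[
    Block(".message", ["border: 1px solid #440000;"],
          [Block("> .title", ["color: #880000;"])]),
    Block(".user", ["border: 1px solid black;"],
          [Block("> .title", ["color: black;"])]),
])
selectors = [selector for selector, _ in flatten(infobox)]
```

The resulting selectors match the four rules in the pretty printed output above.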


### Docbook
* [ ] Support of docbook 5
  * [x] `IN` refinement assumed to be `db5`
  * [ ] config.toml in resources


### Plain XML
* [x] Plain XML
  * [x] `IN` refinement assumed to be `xml`
  * [x] config.toml in resources


### Independent of any XML language
* [x] **SxmlLib::File**. Load or reference an external file
  * [ ] Link to page or image checking and generating.
  * [x] Load sxml file
  * [x] Load xml file

* [x] **SxmlLib::LoremIpsum**.
  * [ ] Better and longer texts, stored in resources, so that the text can be loaded when needed instead of having all texts in the module.

* [ ] **SxmlLib::Html::FixedLayout** - Content from files to be used in e.g. pre elements.
  * [ ] load-test-example


#### Variables

* [x] This is defined in the main lib SxmlCore. An example:
```
$!SxmlCore.var name=aCommonText [ $strong[Lorem ipsum dolor simet ...] ]
```
That method sets a variable in the `sxml` namespace. Any use of **\$sxml:var-ref name=aCommonText** would then be substituted by the variable value `Lorem ipsum...` instead of translating it into an XML element **\<sxml:var-ref name="aCommonText" />**. What the method generates is simple and can be written more directly as **\$sxml:var-decl name=aCommonText [ \$strong [Lorem ipsum dolor simet ...] ]** without calling the `var` method.

Scope is local except when the global attribute is set. The local scope is, however, a bit strange because a variable may be used before it is declared. This is because the declaration is searched for first and then, with that information, the uses of the variable are searched for among the child elements of the declaration's parent element. This needs some more thought to get the behavior one expects from 'local'.

* [x] User methods can also declare variables. The only thing such a method needs to do is generate an element like the one in the example above:
    ```
    # Create the declaration element; a pair is written as :name<value>
    my SemiXML::Element $var .= new(
      :name<sxml:var-decl>, :attributes(%(:name<aCommonText>))
    );

    $var.append(
      :name<strong>, :text('Lorem ipsum dolor simet ...')
    );
    ```

* [x] Mistakes in names of variables can be prevented by writing the brackets with empty content like so **pre\$sxml:var-ref name=data []_map**. If **\$sxml:var-decl name=data [ pqr ]**, this would become `prepqr_map`.

* [ ] A variable declaration with some kind of substitution. E.g. a declaration like **\$!SxmlCore.var-decl name=hello |name='World' [Hello \$|name]** has a variable in it. This is used like **\$sxml:var-ref name=hello |name=Piet []** which translates to `Hello Piet` and **\$sxml:var-ref name=hello []** translates to `Hello World` where the default is used. One might also think of **\$sxml:var-ref- name=hello [Other Planet]** which translates to `Hello Other Planet`. In this example the declaration attribute `|name` is used to define a default value for **\$|name**. With this the syntax must be extended to understand |name and $|name. Extending; **\$!SxmlCore.var-decl name=hello |planet='Earth' |moon='Moon' [Hello \$|planet and \$|moon]**. To refer to that one; **\$sxml:var-ref name=hello |planet=jupiter [io]** translates to: `Hello jupiter and io`.

* [ ] Substitution in attribute values.
* [ ] Map one variable to another
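
The `prepqr_map` example above can be reproduced with a small text-based sketch. The program itself works on the node tree and not on raw text, so this only illustrates why the empty brackets mark the end of the reference. The function name and the regular expression are assumptions of this sketch.

```python
import re
from typing import Dict

# `$sxml:var-ref name=<name> []`, where the empty brackets close the reference.
_VAR_REF = re.compile(r"\$sxml:var-ref\s+name=([\w.-]+)\s*\[\s*\]")


def substitute(text: str, variables: Dict[str, str]) -> str:
    """Replace every variable reference by the declared value."""
    return _VAR_REF.sub(lambda match: variables[match.group(1)], text)


# Corresponds to `$sxml:var-decl name=data [ pqr ]`.
result = substitute("pre$sxml:var-ref name=data []_map", {"data": "pqr"})
```

A reference to an undeclared name raises `KeyError` in this sketch; how the real program reports that case is not covered here.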


#### Calculation of color palettes
* Generating a set of colors is useful in defining several of the properties in css. Instead of coding the colors individually, the colors can be calculated using some algorithm and stored in variable declarations. When one is not satisfied, the calculations can be repeated with different values without changing the used variables.

See also [w3c color model](https://www.w3.org/TR/2011/REC-css3-color-20110607/#html4)
* Attributes for the color calculations
  * Input color.
    * [x] base-rgb; '#xxx[,op]', '#xxxxxx[,op]' or 'd,d,d[,op]' where x is a hexadecimal digit (0..f) and d=0..255 or a percentage. op (opacity) is a Num 0..1 or a percentage and is optional.
    * [x] base-hsl; 'hue,saturation,lightness[,op]' given as an angle, a percentage and a percentage. op (opacity) is optional.
  * [x] Type of calculation
  * [x] Output variables

  * Calculation of color ranges
    * [x] Blending
    * [x] Single color range
    * [ ] Primary and secondary color ranges
    * [ ] Ternary ranges
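
The three `base-rgb` notations listed above can be parsed along these lines. This is a sketch under assumptions: a missing opacity defaults to 1.0, a 3-digit hex value is expanded by doubling each digit (the usual CSS convention), and the function names are hypothetical.

```python
from typing import Tuple


def _component(value: str, scale: float) -> float:
    """Read a plain number, or a percentage of `scale`."""
    value = value.strip()
    if value.endswith("%"):
        return float(value[:-1]) / 100.0 * scale
    return float(value)


def parse_base_rgb(spec: str) -> Tuple[int, int, int, float]:
    """Parse '#xxx[,op]', '#xxxxxx[,op]' or 'd,d,d[,op]' into (r, g, b, opacity)."""
    parts = [part.strip() for part in spec.split(",")]
    if parts[0].startswith("#"):
        digits = parts[0][1:]
        if len(digits) == 3:
            digits = "".join(ch * 2 for ch in digits)  # '#f00' -> 'ff0000'
        if len(digits) != 6:
            raise ValueError(f"bad hex color: {parts[0]!r}")
        rgb = [int(digits[i:i + 2], 16) for i in (0, 2, 4)]
        rest = parts[1:]
    else:
        if len(parts) < 3:
            raise ValueError(f"expected three color components: {spec!r}")
        rgb = [round(_component(part, 255)) for part in parts[:3]]
        rest = parts[3:]
    if len(rest) > 1:
        raise ValueError(f"too many fields: {spec!r}")
    opacity = _component(rest[0], 1.0) if rest else 1.0  # default is an assumption
    return rgb[0], rgb[1], rgb[2], opacity
```

A `base-hsl` value could be read the same way, with an angle for the first field and percentages for the other two.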

### Other ideas
* [ ] Handle and generate ebooks

* Supporting perl6 module testing to generate reports
  * [x] **SxmlLib::Testing::Test**
  * [x] **SxmlLib::Testing::Summary**
  * [ ] Make benchmark reports using `Bench`
  * [ ] Make code coverage reports with `Rakudo::Perl6::Tracer`.
  * [x] Possibility to modify layout with css

* [ ] Make use of javascript to make text dynamic
* [ ] Avatar linking
* [ ] Generating tables
* [ ] Generating graphics, statistics, etc using javascript libraries
* [ ] Scalable Vector Graphics or SVG, see spec at [w3c][svg].

* [ ] Make use of XPointer, see spec at [w3c][xpoint].
* [ ] XLink, see spec at [w3c][xlink].
* [ ] XInclude, see spec at [w3c][xinclude].
* [ ] XML Binary, see spec at [w3c][xbin].
* [ ] Fragments. W3C has closed its specification of XML Fragments, but it is used by this module as a leading specification, see spec at [w3c][frag].
* [ ] XML Stylesheets (xsl), see spec at [w3c][xstyle].
* [ ] XML Schema (xsd), see spec at [w3c][xschema].
* [ ] XPath grammar for the search engine.
* [ ] XSL and XSLT

* [x] Syntax highlighter for the atom editor. It is called language-sxml and lives in a separate repository at MARTIMM's.
* [ ] Syntax checker using one of the parsers shown [here][jsparse]. Choose a parser which is written for javascript. Then it can be used in the atom editor.
* [ ] Execution and display of the result in atom, like markdown-preview-enhanced.

## And …
  * [ ] Documentation in a manual.
  * [ ] Module and program pod documentation
  * [ ] Documentation is started as a docbook 5 document. There are references to local icon files and fonts for which I don't know yet if they may be included (license issues).
  * [ ] Tutorials.

<!-- References -->
[colors1]: http://paletton.com
[colorspace]: https://martin.ankerl.com/2009/12/09/how-to-create-random-colors-programmatically/
[colors2]: http://devmag.org.za/2012/07/29/how-to-choose-colours-procedurally-algorithms/

[svg]: https://www.w3.org/TR/SVG11/
[jsparse]: https://tomassetti.me/parsing-in-javascript/

[xlink]: https://www.w3.org/TR/xlink11/
[xinclude]: https://www.w3.org/TR/xinclude/
[xpoint]: https://www.w3.org/TR/xptr-framework/
[frag]: https://www.w3.org/TR/xml-fragment.html
[xbin]: https://www.w3.org/TR/xbc-characterization/
[xstyle]: https://www.w3.org/TR/xml-stylesheet/
[xschema]: https://www.w3.org/TR/xmlschema11-1/
