(c) Wally Bass 1995, 2009.
XYENC.COM is a program which encodes XyWrite XPL programs into a form that is readable by almost any ASCII editor. XYDEC reverses the process.
XYENC reads a XyWrite XPL program file as input and writes a translated version of the file with the following properties:
XYDEC is also convenient for writing XPL code directly in the XYENC encoded format and then using XYDEC to create the actual XPL code. This can be done either for portions of an XPL program or for an entire XPL program. This technique can be particularly useful for content that involves a lot of XyWrite 3 byte encoded data characters, or a lot of special characters (e.g., chars in the 0-1Fh and 80h-0FCh ranges).
If you develop XPL programs directly in the XYENC encoded format, you will have the advantage of being able to format them with line breaks, indentation, and comments as you go, with all of these "human readable" additions being filtered out by XYDEC when you go on to produce the actual executable XPL file.
XYENC can actually be used to encode any file -- for transmission by email, for example. But it is optimized for typical XPL program files, and its results are visually most useful for them. It also does fairly well with normal XyWrite files, but probably would not be that efficient for arbitrary binary files.
In the output file, XYENC does a number of character translations. These have been chosen with some care, based on character usage (and non-usage) typical in XPL programs.
XPL programs generally contain few blanks. XYENC translates any blanks in the input to underscores.
3 byte encoded normal data characters -- those whose base character has a "code value" less than 100h, and whose encoding begins with a 0FFh byte (i.e., NOT including XyWrite IV's 654 additional characters whose 3 byte encoding starts with a 0FEh byte) -- are collapsed by XYENC to the corresponding 1 byte characters (which may then be re-encoded as required by XYENC's encoding scheme). These characters are preceded by an inserted ":" (colon) or ";" (semicolon) in XYENC's output stream, to trigger a re-expansion to 3 bytes by XYDEC on decode. (The ":" is used normally, but ";" is used for the alternate 3 byte character encoding scheme that XyWrite IV uses for certain search argument characters that have "wildcard" significance. XyWrite IV by default displays characters encoded with this alternate scheme in MD78 (yellow on brown).)
3 byte encodings for data which are not actually created by XyWrite (e.g., which are created with a hex editor or by direct manipulation of 0FFh prefixed 3 byte sequences by the user) are frequently not in "standard form." For example, bytes 2 and 3 of the XyWrite III+ alternate data byte encoding should each be '0' to '9' or uppercase 'A' to 'F', but one sometimes finds lower case 'a' to 'f' in encodings that are "produced by hand." XyWrite will actually accept *any* byte value for these digits without complaint (as long as byte 2 is <80h, which is necessary to indicate this kind of encoding) -- for any char other than '0' to '9', XyWrite simply subtracts 7 and applies a 0Fh mask, to compute a "nibble" value for the byte.
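The digit rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described in this note, not XyWrite's actual code: the names nibble and collapse are made up here, and two things are assumed rather than stated above, namely that '0' to '9' are simply masked with 0Fh and that byte 2 carries the high nibble (as the AE and AF encodings used later in this note suggest).

```python
def nibble(ch: int) -> int:
    """Nibble value of one 'hex digit' byte, following the rule described above."""
    if ord("0") <= ch <= ord("9"):
        return ch & 0x0F          # assumption: digits are just masked
    return (ch - 7) & 0x0F        # any other byte: subtract 7, then apply the 0Fh mask


def collapse(triple: bytes) -> int:
    """Byte value carried by an 0FFh,h,h triple (byte 2 must be below 80h)."""
    if len(triple) != 3 or triple[0] != 0xFF or triple[1] >= 0x80:
        raise ValueError("not an 0FFh,h,h data encoding")
    # assumption: byte 2 is the high nibble, byte 3 the low nibble
    return (nibble(triple[1]) << 4) | nibble(triple[2])
```

Under these assumptions the hand-made encoding 0FFh,'a','e' collapses to the same value (0AEh) as the standard 0FFh,'A','E'.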
As indicated, XYENC saves a XyWrite 3 byte data encoding as just the (possibly XYENC encoded) 1 byte equivalent character, plus a prefix "flag" (":" or ";") that says that re-expansion back to a 3 byte encoding is necessary on decode. XYDEC therefore would not nominally have enough information to restore 3 byte data encodings to anything other than the "standard" 3 byte encoding. To deal with non-standard encodings, XYDEC needs more information. The "!" (exclaim) prefix is defined to address this problem -- this prefix allows the inclusion of extra data in the encoded file which effectively indicates which non-standard encoding was used in the source file, so that XYDEC can ferret out the actual (non-standard) encoding that was used for the character in the original file.
For "control characters" whose code values are less than 20h, XYENC uses an encoding similar to the ^x (where x=@,A to Z,[,\,],^,_) encoding often seen for "control characters" in Unix, but uses a ˜ (tilde) rather than ^ (caret) for the first character of the encoding.
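A minimal Python sketch of this control character encoding follows. The function names are hypothetical, and the sketch assumes that the prefix is the plain ASCII tilde (7Eh) and that x is simply the code value plus 40h, as in the Unix ^x convention.

```python
TILDE = "~"  # assumption: the prefix is ASCII 7Eh


def encode_ctl(code: int) -> str:
    """Encode a control character (0h to 1Fh) as tilde plus '@', 'A'-'Z', '[', '\\', ']', '^' or '_'."""
    if not 0x00 <= code <= 0x1F:
        raise ValueError("not a control character")
    return TILDE + chr(code + 0x40)


def decode_ctl(pair: str) -> int:
    """Reverse encode_ctl: return the code value for a two character tilde encoding."""
    if len(pair) != 2 or pair[0] != TILDE or not "@" <= pair[1] <= "_":
        raise ValueError("not a control character encoding")
    return ord(pair[1]) - 0x40
```

For instance, a tab (09h) comes out as tilde followed by I, and Esc (1Bh) as tilde followed by [.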
XYENC translates 3 byte functional primitives to a right single quote (abbreviated rsquote hereafter in this note) prefixed 'xx encoding, where xx is the two letter code used by XyWrite in the display of the primitive on screen (the x's can be any of *,<,>,@,#,$,&,0 to 9,A to Z, as per XyWrite IV's choice of primitive names). XyWrite IV's primitive names are used, rather than XyWrite III's. This does not affect subsequent decoding, but does mean that XyWrite III users will see an "improper" (i.e., XyWrite IV's, rather than XyWrite III's) two character "name" for the following primitives in an XYENC encoded file:
┌─────────────────────────── XyWrite III+ (3.57) "name"
│    ┌────────────────────── XyWrite IV "name" for same encoding
│    │      ┌─────────────── XyWrite III+ function definition
│    │      │            ┌── XyWrite IV function definition
┴─   ┴───   ┴─────────   ┴───────────────────────
ML   MS     Move Left    mouse
MR   NM     Move Right   No Markers
NM   MK     No Markers   no MarKer mode
DK   MW     undef        undef
XX   many   undef        (XX appears 22 times in XY3's primitive tbl)
TX   OV     undef        undef
FT   JM     undef        Jump to Menu
LO   SG     undef        get text macro (prompts)
TG   XH     undef        clear Help
SG   FT     undef        undef
CG   BX     undef        Blind eXecute
TM   MN     undef        MeNu
αL   CB     undef        Cycle Backwards
αR   M9     undef        Italic mode
αB   MZ     undef        Bold Italic mode
αE   ZZ     undef        undef
XT   RZ     undef        undef
KF   ST     undef        Show Triange help
OV   KF     undef        Keyboard Flip
HG   JC     undef        Jump over Commands
Undefined (undef) here means neither XyWrite's documentation nor Herbert Tyson's XyWrite Revealed book has a definition. The only cases where the substitution of the XyWrite IV names for a XyWrite III "well defined" primitive might cause confusion when reading a XYENC encoded XyWrite III file seem to be the XY3 ML and MR primitives. ML seems to be the same as CL, so I would expect zero use of ML, with CL being used instead. MR has a very strange and essentially useless definition as well (according to Tyson), so I wouldn't expect significant usage of the MR primitive in XyWrite III+ files either.
All of this results in XYENC output where "_" (underscore), ":" (colon), and ";" (semicolon) have special interpretations, and where the "!" (exclaim), "˜" (tilde) and "'" (rsquote) characters are used as "prefix" characters to signal a variety of multicharacter encodings. Hence, when any of these characters ("_", ":", ";", "!", "˜", or "'") actually occur as text in the input, XYENC must and does replace them with something else, so that they won't trigger our special interpretations or trigger a prefix interpretation of the characters during decode. So XYENC also replaces all of these characters, as they occurred in the original input, with some encoding using a "˜" (tilde) or "'" (rsquote) prefix.
In choosing our translations, we intentionally made sure that blanks, tabs, CRs, LFs, % (percent) and guillemet chars are not generated as necessary information in XYENC output. In so doing, we also coded XYDEC to discard these characters (or embedded commands, in the case of guillemets) when they appear as input to the XYDEC program during decode. This means that one can freely (anywhere except within a multibyte encoding) add blank, tab, CR, and LF characters, and also XyWrite embedded commands, to an encoded file, to (drastically, even) reformat it for better readability.
The "%" character was included in this group to make the "%" character also available for arbitrary use in an XYENC encoded file. This was intended to facilitate editing "tricks" that a user may want to do in an encoded file, without the risk of "surprise" search/change hits caused by "%" characters generated by the encoding scheme. For example, when I write XPL programs, I initially give symbolic names to the Save/Gets (S/Gs) that I use. You would therefore find me (for example) writing an embedded XS command as something like
˜<XS%input,%splt,%left,%middle,%right˜>.
I would then later do a series of commands like
BC cha /%input/03/XC
BC cha /%splt/04/XC
at the "last moment," to actually assign specific S/Gs to be used in the XPL program. The availability of the "%" in such schemes, without danger of conflicting with any XYENC generated output, helps (a) ensure that I don't get false hits with text elsewhere in the XPL program when I do such changes, and (b) check that all variables have been converted. In my usage of the encoded format, I usually save a copy of the XPL program before these final substitutions, and place the required change commands, as per the above, at the top of the file as a macro, so that the assignment of variables to S/Gs can easily be redone in the case of later conflicts.
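The same "last moment" assignment can be sketched outside of XyWrite. The following Python is only an illustration, assuming ASCII "~" for the tilde; the function name and the S/G numbers 05 to 07 are hypothetical (only 03 and 04 appear in the commands above). It replaces each %name with its S/G number and then checks that no symbolic name is left over, which is the check that the reserved "%" makes possible.

```python
import re
from typing import Dict


def assign_save_gets(encoded: str, assignments: Dict[str, str]) -> str:
    """Replace %name placeholders with S/G numbers and verify none remain."""
    # Longest names first, so that a name which is a prefix of another
    # (say %in and %input) cannot clobber the longer one.
    for name in sorted(assignments, key=len, reverse=True):
        encoded = encoded.replace("%" + name, assignments[name])
    # Any remaining %word is an unassigned variable; a "%" preceded by an
    # rsquote is skipped, since rsquote,percent introduces a comment.
    leftover = re.findall(r"(?<!')%\w+", encoded)
    if leftover:
        raise ValueError("unassigned symbolic names: " + ", ".join(leftover))
    return encoded


# Hypothetical assignment table; only 03 and 04 come from the text above.
SAVE_GETS = {"input": "03", "splt": "04", "left": "05", "middle": "06", "right": "07"}
```

Keeping the assignment table in one place plays the same role as the macro kept at the top of the file: the mapping can be changed and re-applied to the saved, unsubstituted copy.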
XYDEC also has a sequence -- "'%" (rsquote,percent) -- which causes discarding of input up to the next CR or LF during XYDEC decoding. This effectively provides a commenting facility for encoded versions of XPL programs, where adding comments to the encoded version has no effect on what is generated, when an encoded file is decoded via XYDEC.
When reading a XYENC encoded copy of an XPL program, the user needs to be aware that the following translations have been made for the following simple characters:
From  To      From   To      From  To      From  To
«     ˜<      blank  _       !     '|      _     '-
»     ˜>      CRLF   '^      :     '.      ˜     '?
∈     ˜{      %      '/      ;     ',      '     '`
≡     ˜=
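For the simple characters in the table above, the encode side amounts to a single-pass lookup. The sketch below is not XYENC's actual logic: it is a Python illustration that covers only this table, assumes ASCII "~" and "'" for the tilde and rsquote, uses the glyphs as they are printed in this note, and passes everything else (control characters, 3 byte encodings, primitives) through untouched. The names SIMPLE and encode_simple are made up for the example.

```python
# Simple character translations, taken from the table above.
SIMPLE = {
    "«": "~<", "»": "~>", "∈": "~{", "≡": "~=",
    " ": "_", "%": "'/", "!": "'|", ":": "'.",
    ";": "',", "_": "'-", "~": "'?", "'": "'`",
}


def encode_simple(text: str) -> str:
    """Apply only the simple translations; all other characters pass as-is."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\r\n", i):
            out.append("'^")      # a CRLF pair is translated as one unit
            i += 2
        else:
            out.append(SIMPLE.get(text[i], text[i]))
            i += 1
    return "".join(out)
```

Doing the translation in one pass matters: the "_" that replaces a blank must not itself be turned into '- afterwards.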
Although XYENC doesn't generate any 3 byte encoded data characters in its output, XYDEC has been coded so as to recognize XyWrite III type 3 byte data character encodings (that is, 0FFh,h,h encodings, where the h's are '0'-'9','A'-'F'). XYDEC restores these back to 1 byte encodings, which are then taken "as is" (i.e., not having a special interpretation, such as "as a prefix"). Note that this means that if you put a 3 byte data byte encoding in an encoded file and you really wanted a 3 byte encoding to persist in the XYDEC translated output, you must (as usual) precede the three byte encoding with a : (colon) or ; (semicolon).
Because 3 byte data encodings are decoded to 1 byte and then passed as is by XYDEC, one can make XYENC encoded files "look more" like actual XPL programs by doing the substitutions done by the following macro on an encoded file:
BC cha $_$ 20$XC BC
BC cha $˜<$ AE$XC BC (=«)
BC cha $˜>$ AF$XC BC (=»)
BC cha $˜{$∈$XC BC
BC cha $˜=$≡$XC BC
BC cha $'-$ 5F$XC BC (=_)
BC cha $'/$ 25$XC BC (=%)
BC cha $'|$ 21$XC BC (=!)
BC cha $'.$ 3A$XC BC (=:)
BC cha $',$ 3B$XC BC (=;)
BC cha $'?$ 7E$XC BC (=˜)
BC cha $'`$ 27$XC BC (=')
These changes replace the indicated translations with 3 byte encodings of the original characters. You may want to skip the first of the changes above (i.e., you may want to preserve the blank-to-underscore substitutions), though, since 3 byte encoded blanks are easily deleted by accident. (Note that, other than those following the "cha" commands, what appear as blanks in the above macro are 3 byte encoded 0FFh characters. The stuff starting with the ˙ at the end of each line is strictly for commenting purposes. In this usage, the 3 byte encoded 0FFh at the end of each line prevents the CRLF at end of line from causing the damage that it would otherwise cause in macro execution.)
These changes, of course, cause the resulting file to once again be suitable only for XyWrite viewing, since it will now again contain 3 byte encoded data. However, if you want to pass on such a file via an email, nothing prevents you from doing a "second level" re-encoding with XYENC.
In encoded form, code can be broken into lines, indented, and commented, for documentation purposes, without affecting the decoded form. Archiving of routines in encoded form, with added comments, makes a lot of sense to me.
The XYDEC decode translations and actions are specified by the following. The specification of the decoding implies what the corresponding encoding must be.
Single char input

┌───────────────────────────────────────────────── length of encode
│     ┌─────────────────────────────────────────── trigger char
│     │     ┌───────────────────────────────────── graphic of input
│     │     │   ┌───────────────────────────────── graphic of output
│     │     │   │    ┌──────────────────────────── description of input
│     │     │   │    │           ┌──────────────── description of output
│     │     │   │    │           │       ┌──────── comment
┴  ───┴─── ─┴── ┴ ───┴─── -> ────┴────── ┴───────────────────────
1  blank          blank   -> none        discarded
1  tab     ␉      tab     -> none        discarded
1  LF      ␊      LF      -> none        discarded
1  CR      ␍      CR      -> none        discarded
1  %       %      percent -> none        discarded
1  EOF     →      EOF     -> none        discarded
1  «       ▲      l-chev  -> none        discard through matching "»"
                                         (embedded commands). note [4]
1  »       ▲      r-chev  -> none        matches "«"
1  u_score _      u_score -> blank
NA colon   :      colon      deferred    expand following char to
                                         1 byte -> 3 byte 0FFh,h,h type encode
NA semicln ;      semicln    deferred    expand following char to
                                         1 byte -> 3 byte 0FFh,0C0h,80h+x encode

Prefixed encodings

┌───────────────────────────────────────────────── length of encode
│     ┌─────────────────────────────────────────── trigger char
│     │     ┌───────────────────────────────────── graphic of input
│     │     │   ┌───────────────────────────────── graphic of output
│     │     │   │    ┌──────────────────────────── description of input
│     │     │   │    │           ┌──────────────── description of output
│     │     │   │    │           │       ┌──────── comment
┴  ───┴─── ─┴── ┴ ───┴─── -> ────┴────── ┴───────────────────────
3  exclaim !cc    !,2 chr    deferred    note [2]
3  0FFh           3 byte  -> 1 byte      0FFh,x,y, x<80h, input. note [1]
3  0FFh           3 byte  -> unchanged   0FFh,x,y, 80h<=x<0C0h, input
3  0FFh           3 byte  -> unchanged   0FFh,x,y, x>=0C0h, input.
                                         (XyWrite IV alternate form data encode)
4  tilde   ˜ddd, 256<=ddd<999 3 byte char 3 byte XY IV 0FEh pfx encoded out
4  tilde   ˜ddd, 000<=ddd<255 1 byte char note [1], note [3]
2  tilde   ˜x   y ˜x      -> 1 byte char x=@,A-Z,[,\,],^,_
                                         y=1 byte, 0h to 1Fh, note [1]
2  tilde   ˜I   ␉ ˜I      -> tab         specific case of above
2  tilde   ˜<   « ˜LT     -> l-guillemet note [1]
2  tilde   ˜>   » ˜GT     -> r-guillemet note [1]
2  tilde   ˜=   ≡ ˜equal  -> contains    note [1]
2  tilde   ˜{   ∈ ˜lbrace -> offset in   note [1]
2  tilde   ˜#     ˜pound  -> none        same as "«" -- discard to matching "»"
                                         (block comment). note [4]
2  rsquote '-   _ 'minus  -> u_score     note [1]
2  rsquote '/   % 'slash  -> percnt      note [1]
2  rsquote '^     'caret  -> CRLF
2  rsquote '|   ! 'v-line -> exclaim     note [1]
2  rsquote '.   : 'period -> colon       note [1]
2  rsquote ',   ; 'comma  -> semicln     note [1]
2  rsquote '?   ˜ 'qmark  -> tilde       note [1]
2  rsquote '`   ' 'lquote -> rsquote     note [1]
3  rsquote 'xy    'xy     -> 3 byte prim x=@,#,$,&,*,<,>,A-Z,
                                         y=those and +,-,0-9
2  rsquote '%     'percnt -> none        discard to CR or LF (comment)
2  rsquote 'dd/dd/dddd dd:dd:dd none     note [5]
2  rsquote 'dd-dd-dddd dd:dd:dd none     note [5]

Other characters

Characters not "caught" by above filters are passed as-is.
    x = ( ( b      ) & 0F0h ) | ( ( c >> 4 ) & 0Fh );
    y = ( ( b << 4 ) & 0F0h ) | ( ( c      ) & 0Fh );
    x = fix( x , ( n >> 1 ) & 1 );
    y = fix( y , ( n      ) & 1 );

where "fix" is defined

    char fix(n,flag) {
        if ( n < '0' || n > '9' || flag )
            return ( n + 7 ) mod 256;
        else
            return n;
    }
Since blank, tab, CR, LF, and % don't appear in normal encode output, the XYDEC decode program simply discards them (BUT, XYDEC currently will get tripped up and fail if these characters are added in the middle of an encoded sequence!). This means that you can add these to an encoded program (other than within encoded sequences) for formatting purposes, without changing at all what the file will decode back to. Similarly, you can add end of line content containing any characters, prefaced by "'%" (rsquote,percent), without changing the decode output, allowing you to add comments to encoded content.
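The discard and comment rules can be sketched as follows. This Python is a hypothetical subset decoder written for illustration, not XYDEC itself: it assumes ASCII "~" and "'" for the tilde and rsquote, handles only the discarded characters, the rsquote,percent comment, the underscore, and the 2 character tilde and rsquote encodings, and raises an error for anything else that starts an encoding (colon, semicolon, exclaim, guillemets, 3 digit tilde encodings, primitives, 0FFh sequences).

```python
DISCARD = set(" \t\r\n%")                      # never significant in encoded text
TILDE_MAP = {"<": "«", ">": "»", "=": "≡", "{": "∈"}
RSQUOTE_MAP = {"-": "_", "/": "%", "^": "\r\n", "|": "!",
               ".": ":", ",": ";", "?": "~", "`": "'"}
UNSUPPORTED = set(":;!«»\xff")                 # outside the scope of this sketch


def decode_subset(enc: str) -> str:
    """Decode the subset of the encoded format described in the lead-in."""
    out = []
    i, n = 0, len(enc)
    while i < n:
        ch = enc[i]
        nxt = enc[i + 1] if i + 1 < n else ""
        if ch in DISCARD:
            i += 1                             # formatting only: drop it
        elif ch == "_":
            out.append(" ")
            i += 1
        elif ch == "'" and nxt == "%":
            while i < n and enc[i] not in "\r\n":
                i += 1                         # comment: skip to the next CR or LF
        elif ch == "'" and nxt in RSQUOTE_MAP:
            out.append(RSQUOTE_MAP[nxt])
            i += 2
        elif ch == "~" and nxt in TILDE_MAP:
            out.append(TILDE_MAP[nxt])
            i += 2
        elif ch == "~" and "@" <= nxt <= "_":
            out.append(chr(ord(nxt) - 0x40))   # control character, 0h to 1Fh
            i += 2
        elif ch in "~'" or ch in UNSUPPORTED:
            raise NotImplementedError("encoding not covered by this sketch")
        else:
            out.append(ch)                     # ordinary character: passed as-is
            i += 1
    return "".join(out)
```

With this sketch, an encoded fragment that has been split over two indented lines and given a trailing comment decodes to the same text as the unformatted original.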
If the target of an encode is not for email transmission or such, but only for XyWrite viewing, you can translate some or all 2 byte tilde or rsquote prefixed encodings to 3 byte XyWrite encodings, for better readability, and the decode will not be affected. Also, most ˜x and ˜ddd translations for chars in the "control char" (0h to 1Fh) range and the 80h to 0FCh range can be "backed out," to 3 byte encoded data chars for 0h, 08h (backspace), 09h (tab), 0Ah (LF), 0Dh (CR), 11h, 1Ah (EOF), 1Bh (Esc), 0AEh («), and 0AFh (»), or to the 1 byte character for others, if that is suitable for your use of the encoded file -- doing so will not affect your ability to decode the file with XYDEC.
Translations are by no means optimum, but were instead chosen with the idea of having an encoded form which is about as readable as possible, even in its all ASCII (20h<=char<=7Fh) incarnation -- hence translations like « to ˜< , » to ˜> , ≡ to ˜= , and ∈ to ˜{ .
Two assemble time switches in XYENC are available for more compact encoding, at the cost of using additional character codes in the XYENC output.
If CTLOK was set at assemble time, XYENC leaves characters in the range 0-1Fh -- except for 0h, 08h (backspace), 09h (tab), 0Ah (LF), 0Dh (CR), 11h, 1Ah (EOF), and 1Bh (Esc) -- intact, without any translation. By convention, modules compiled with this flag have an 'L' appended to the module name.
Similarly, if HI_OK was set at assemble time, XYENC leaves the characters in the range 80h-0FCh -- except 0AEh («), 0AFh (»), 0EEh (∈), and 0F0h (≡) -- intact, without translation. By convention, modules compiled with the HI_OK flag have an 'H' appended to the module name.
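The effect of the two switches on a given byte can be summarized in a small predicate. This is a Python sketch of the rules as stated above, not code from XYENC; the name left_intact and its keyword arguments are hypothetical stand-ins for the CTLOK and HI_OK assemble time switches.

```python
# Bytes that are translated regardless of the switch settings.
CTL_ALWAYS_ENCODED = {0x00, 0x08, 0x09, 0x0A, 0x0D, 0x11, 0x1A, 0x1B}
HI_ALWAYS_ENCODED = {0xAE, 0xAF, 0xEE, 0xF0}


def left_intact(byte: int, ctlok: bool = False, hi_ok: bool = False) -> bool:
    """True if the given switch settings let this byte through untranslated."""
    if 0x00 <= byte <= 0x1F:
        return ctlok and byte not in CTL_ALWAYS_ENCODED
    if 0x80 <= byte <= 0xFC:
        return hi_ok and byte not in HI_ALWAYS_ENCODED
    raise ValueError("byte is outside the two ranges that the switches govern")
```

For example, with the CTLOK behavior a 01h byte would pass through as-is, while a tab (09h) would still be translated.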
The "help" text from when the program is invoked without args will indicate if a given XYENC module instance was assembled with either or both of these flags. If your usage of XYENC output is not hampered by the existence of these characters, you might find these modified encoded representations preferable for many or even most purposes. Decoding of such files by XYDEC is not affected.
Any input stream is valid input to XYENC, but there can be errors in an input stream to XYDEC. When such errors are found, XYDEC signals an error at the end of its processing, and simply includes the offending input text in the output, bracketed by "[[[[" and "]]]]". Searching the output for these characters should help identify the offending input.
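A search like that is easy to script. The following Python sketch assumes the decoded output has already been read into a string (for example with a single byte encoding such as latin-1, so that no byte is lost); the function name is made up for the example.

```python
import re

# Offending input is bracketed by "[[[[" and "]]]]" in XYDEC's output.
ERROR_SPAN = re.compile(r"\[\[\[\[(.*?)\]\]\]\]", re.DOTALL)


def find_decode_errors(decoded: str):
    """Return (offset, offending text) for every bracketed span in the output."""
    return [(m.start(), m.group(1)) for m in ERROR_SPAN.finditer(decoded)]
```

An empty list means that no bracketed spans were found in the text that was searched.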
Wally Bass
Last updated Jan 2009