/* version 3.2 Date: 2024-06-16
Versions of this program are designated both by the name, and by the date. This may not be
conventional but there have been so many small changes recently that it is inconvenient to just use the version numbers.
************************************************************
Ideas for developments in the next major version:
Define an IRRP as an IRR without its middle element.
An IGR is an IRRP (LHS), a char (alpha), and an array of IRRP's (the RHS's)
In the derivation of an IGR, from its LHS the pointer starts from next to alpha, reaches near the other end then returns to alpha.
Then after removing the redundant symbols, it must actually reach the other end first. Write a function to generate the set of RHS's
of an IGR from its LHS. To generate the set of IGR's from an initial IGR x, take each RHS of x, apply this function to get the set S
of IRRP's. Then the results of F have their LHS's equal to each RHS of x, and it RHS's equal to the corresponding set S.
An algorithm for finding all the IGR's might be to start with those to get IRR(2), then repeatedly take each one of these, apply F to it and add all the
results that are not already present to the same set and mark the original IGR as done. Repeat until all are done. It would be very interesting
if this gave a finite final set. I think this might not work because in general context symbols need to be added before F can be done and their number
appears to be unlimited. It might be instructive anyway if the result is not trivial.
This needs to be modified to handle RCS's correctly.
*************************************************************
The outputs (to the screen and file) have been reorganised and are much improved with
better dialogues. The IRR's are now numbered and counted correctly for all the different types.
The sorting of the IRR's works for all lengths. The boolean matrix showing which IGR's follow others
is represented more efficiently and Tarjan's algorithm is incorporated to show the strongly connected
conmponents of the network of IGR's using the previous notation (names) for the IGR's.
IGR's that can directly follow themselves are listed.
Allowing for T1 and T2 to have different lengths appears to be done and correct.
The dialogues have changed and I've done a lot of tidyng up, i.e. deleting out of date
comments and unwanted commented out code. The IGR(2) are now correct.
This is the computer program TIE (Turing machine Inference Engine) version 3.0 draft. It employs
the strategies described in the latest draft manusript of the Turing Machine analysis paper to be found
on www.bluesky-home.co.uk
It generates IRR's and IGR's (see references and latest draft paper) from a Turing Machine. This works for
non-terminating TM's, and it has not all been checked for TM's that halt. TM's that don't halt are a simpler concept
that it makes sense to analyse first, and the analysis of halting TM's could be obviously based on this.
This program must be compiled before it can be used. It was written in D and
originally had extension .d, but I prefer to keep it on the web as a .txt file and rename it before compilation. To compile
it go to dlang.org and download one of the compilers (I use dmd)
then do for example the following in a terminal program that takes text commands:
$mv tie_v2_0.txt tie.d
$dmd tie.d
and run it with the command that includes the file name of the TM to be analysed.
$./tie TM_file_name.txt
An alternative if the user is familiar with the dialogue is to ignore the screen output
and redirect the output to any file eg
$./tie TM_file_name > out.txt, and then put the answers to the dialogues.
The IRR are essentially the non-redundant set of computational short-cuts that
can be derived by running the Turing Machine(TM) on a finite length of tape.
Every computation of the TM as a sequence of single TM steps can be replaced
by an equivalent shorter sequence of applications of the short-cut rules,
(though the set of such rules may be and usually is infinite). Any computation with a TM
expressed in terms of its IRR is reduced in terms of the number of computation steps needed.
By correctly guessing the iterative formulae or method for obtaining the IRR for a particular TM
with the aid of the list of IRR generated by this program,
and verifying them by an induction argument, it is possible to obtain long-
term information on the behaviour of the TM that is not obvious initially, and
may shed light on the problem which the TM was designed to solve. Because TM's
can be set up to solve any problem that can be solved by an algorithm, the
possibilities of application of this method seem limitless.
The notation for each irreducible regular rule (IRR), including the original
Turing Machine (TM) table, is in the following order: two configuration sets (CS)'s
each defined by a quadruple (state,pointer position,length,tape symbols) representing
the input and output respectively of the rule, followed by the status of the rule
(L left-moving,R right-moving,H halt), and finally the numbers of steps involved in its
derivation (number of derivation steps from rules of length n-1, number of TM steps).
The lines of the TM machine table must be sorted first by old state increasing, and
then by symbol read in alphabetical order a to z.
The layout of this file is as follows:
(1)Copyright notice and literature references
(2)History of different versions what they do including the current one, in reverse order.
(3) The program code itself which is divided as follows
(3.1) Global variables and library imports
(3.2) Class definitions
(3.3) Write functions for the classes
(3.4) Other functions
(3.5) The main function which is where the program execution begins.
Copyright (C) 2018 John Nixon
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
A copy of the GNU General Public License can be obtained from
.
Mathematica Aeterna, (2013) Vol. 3, no. 9, 709 - 738 (Paper 1)
a second paper continuing this study has now been published
Mathematica Aeterna, (2017) Vol. 7, no. 1, 19 - 56 (Paper 2)
Program modifications
New in version 3.0:
The latest draft of the paper describes the construction and use of IGR's (IRR generating rules). These are rules that
describe the generation of new IRR's from old ones using the general procedure called F in the paper. This version of the
program generates these IGR's up to a given length. The motivation for doing this is that a small number of IGR's could be
combined to generate a large indeed infinite number of IRR's. It is unknown at this stage under what conditions on a TM
is the number of IGR's for that TM finite.
New in version 2.1:
Foreach loops over arrays of class objects do not need ref to avoid unnecessary copying.
them to exactly mimic the output from the default write functions for structs to aid in program testing, but
which could now be modifed if needed.
I did this to understand the difference between structs and classes, and to attempt to write the program
more efficiently. I have simplified the code a little and eliminated some unnecessary copying, but
probably not much. I was motivated by the elegant way in which I could implement Tarjan's algorithm with a class that was much more awkward with a struct.
I have almost always used ref parameters for functions to ensure 2-way passing of information though this is sometimes unnecessary. I think this should be the default with the use of const where possible to
aid in undertanding the program.
New in version 2.0:
The procedure for generating the IRR's has been changed to include generating the origins of the IRR's i.e. the IRR's are now
represented as triplets. This gives a more informative result providing the proof of the
reachability of the LHS's of the IRR's and the algorithm has been changed as a result and
is now much easier to describe. It also runs about twice as fast as version 1.3 which is partly due to avoiding the
unnecessary copying of IRR's through the default method D has for copying objects (pointer
copies) instead of physical copies obtained with .dup. I had concerns about doing this at
first until I realised that it is quite easy to check the program for the absence of "back doors"
i.e. other variables that point to the same memory that could be used to change such a variable. There
will often be such other variables initially and it must be checked that they go out of scope without modifying
the variable of interest. This mostly deals with my earlier comment about .dup.
New in version 1.3: the IRR's are printed with the additional "L" or "R" giving the
pointer position on the LHS of the IRR. This is just a convenience because the pointer
position is included as before, but the 'type' of the rule e.g. LL, LR, RL or RR can now be
read off quickly after the second CS e.g. to determine if the rule has type RL or LR which allows
it to be extended by an additional symbol and not be reducible.
New in version 1.2:
(1) The program has been translated from C++ into D (dlang.org).
[D is much easier to use than C++, has bounds checking making anything like Valgrind
unnecessary as far as I can see, has many more features and has similar syntax.
This otherwise excellent language does unfortunately require the programmer to set up
and use functions called "dup" that duplicate things that I think should be done
automatically in assignments and copying objects into arrays etc. If the programmer
forgets to do this correctly a new pointer will be assigned to the object instead
of making a new object copy, so if the object is subsequently changed, errors could
arise in code that appears naively to be correct.]
(2) The symbols are now in the alphabet {a,b,c,...z} and the states are numbered 1,2,3,...
allowing for Turing Machines with many more states and symbols to be analysed.
This also removes ambiguity in symbolic expressions where exponents are used and is in accord
with the notation in Paper 2.
(3) A much improved format for reading Turing machines.
(4) There was a modification to the way halting is handled: halting is now considered
to be a property of a TM instruction rather than a TM state. As such, the TM can either move right (R)
left(L) by one square, or halt (H) at each TM step. This avoids needing a distinguished state, but it means that
a configuration set (CS) reached as a result of a halt must specify this explicitly e.g. by putting (H) after it.
In any case my interest is mainly in describing non-terminating computations so I probably won't
make much use of this in future.
The original version 1.1 was based on the translation into D of TIEv1.0 (written in C++). It generates the set of all
irreducible regular rules (IRR) for any Turing Machine (TM) up to any specified
length, written by John Nixon (john.h.nixon1@gmail.com formerly john.h.nixon@spamcop.net).
Comments are welcome.
In the terminology of paper 1, "rule" is to be read as
"regular rule" i.e. RR. Likewise "irreducible_rule" and "reducible_rule" are
to be read as IRR and RRR respectively as defined in paper 1.
Comments on D:
Function parameters: "ref" is sometimes needed to ensure proper functioning of functions using 2-way information flow
even if the arguments are ref types, so use it always for 2-way flow.
I use "out" for output-only.
I don't use "in" because it is confusing because it is unfortunately almost synonymous with "const", not just copying
input variables which is what the semantics suggests. It should be for input-only.
I use const instead.
You cannot declare a variable conditionally without restricting the scope of that variable to the
code in the conditional block. Could the scope of that variable be the next outer scope instead?
*/
import std.stdio;
import std.algorithm;
import std.string;
import std.file;
import std.conv;
import std.typecons;
import std.format;
//****************************************************************************************************
//Global variables: and definitions
//Those used throughout:
File of;//output file;
int nsy;//# symbols
int nst;//# machine states
IRR [][] irrs = new IRR[][0];//The set of IRR's for the TM, used by main and apply_rules
IGR [][][][] igrs = new IGR[][][][0];//igrs[i][j1][j2] is the set of IGR's that create the set IRR(i + 2) from the set IRR(i + 1) i.e.
//the IGR(i + 2), and have parameters j1 = i1 - 1, j2 = i2 for generating new origins and new RHS's respectively (see paper pp 22-25).
//The array of IGR's (initially inclusive IGR's i.e. IGR and context strings are combined before the contexts are removed)
//igrs is split into a 3D array of sub-arrays of IGR's igrs[i][j1][j2].
//The recursive function "new_origins" was probably the most difficult part of the program to write and the most important.
//It generates all the new endpoints of the reverse computation from a given CS as a result of adding a single extra symbol
//to the finite part of the TM machine tape considered. new_origins uses and can reuse memory space that does not have to be declared
//for each call and is conveniently packaged into one structure. The Type name is Store, and the instance is called st.
alias Store = Tuple!(bool, "inside", bool, "dircheck", CS[], "L", bool, "cycle",bool, "right_at_end", bool, "left_at_end", bool, "cond_1", int, "j", int, "ind", int, "j1", CS[], "one_step_ori", bool, "right_at_start");
Store st;
alias FIGR = Tuple!(CS, "lo", CS, "lr", char, "alpha", CS, "ro", CS, "rr");//flattened IGR lo,lr,alpha,ro,rr for left hand side origin, left hand side right hand side etc.
alias LIGR = Tuple!(CS, "lo", CS, "ro");
alias RIGR = Tuple!(CS, "lr", CS, "rr");
alias int_pair = Tuple!(int, "j1", int, "j2");
alias NOT = Tuple!(CS, "new_ori", int, "j1", char, "alpha");//new origins triplet
alias NOOCO = Tuple!(CS, "cs0", NOT[], "nota");//new origins one cycle output
//Global variables used by Tarjan's Algorithm for picking out the strongly connected subsets of IGR's
int index;//Used by strongConnect. index is zero if it has not been set. Values >0 have been set
vertex w;
vertex[] S;//the stack
vertex[] SCC;//accumulates each SCC as it is found, then it is emptied to the output.
vertex[] G;//the graph
//*************************************************************************************************************************
//***************************************** class vertex (from Tarjan_v1_0.d) ****************************
class vertex {
string name;
int index;
int lowlink;
bool onStack;
vertex [] descendents;
int opCmp(vertex o) const {
// writeln("\nthis.name = ",this.name, " o.name = ",o.name," this.name > o.name = ",this.name > o.name);
return this.name > o.name;}
// void write(){
// writeln("(name = ",name, ", index = ",index," lowlink = ",lowlink," onStack = ",onStack,", # descendents = ",descendents.length,")");}
this(string name,int index,int lowlink,bool onStack,vertex [] descendents){
this.name = name;
this.index = index;
this.lowlink = lowlink;
this.onStack = onStack;
this.descendents = descendents;}
}
//************************** class CS ************************************************
//This class represents a configuration set(CS) as defined in the paper.
//I will use the convention that if p>0, symbols will be added to the left
//of the pointer, and if p=0, symbols will be added to the right.
//This distinguishes the two cases when l=0 because then p=1 and 0 respectively.
class CS{
int s = 0;//state
int p = 0;//pointer position
int l = 0;//length of tape given
char [] t = [];//part of tape of specified length
@safe final this(const int s,const int p,const int l,const char[] t){
this.s = s;
this.p = p;
this.l = l;
this.t = t.dup;}
@safe final this(){
}//needed for calling 'new' without an argument to create a default class object
@safe final CS dup() const {return new CS(s,p,l,t);}
//Order relation for sorting CS's: Compare first by pointer positions. If
//these are the same compare by machine states, if these are equal
//and the pointer is at the left ( <= 1), compare by symbol strings on the "tape", and if the pointer
//is at the right ( >= l), compare by the reversed symbol strings on the "tape". This is such that the most
//significant end of the tape is the end closest to the pointer. Then truncation at the other end does not alter the sort order.
final int opCmp(ref const CS rhs) const{
if (p != rhs.p) return p - rhs.p;
else if(s != rhs.s)return s - rhs.s;
else if (p <= 1){
if (t != rhs.t) {
return t < rhs.t ? -1 : 1;} else return 0;}
else if (p >= l){
char [] str1 = t.dup;
char [] str2 = rhs.t.dup;
if(str1 != str2) {
return reverse(str1) < reverse(str2)? -1 : 1;}else return 0;}
else {writeln("Error ocurred in opCmp"); return 0; }}
//This is needed though it seems superfluous! needed to equality test class objects by their values
@trusted override bool opEquals(const Object o) const {
auto rhs = cast(CS)o;//not allowed in safe code
if(s != rhs.s) return false;
else if(p != rhs.p) return false;
else if(l != rhs.l) return false;
else if(t != rhs.t) return false;
else return true;}
}
//******************************** class IRR **************************************
//relation between two CS's cs1 and cs2 defined by the action of the TM. In the paper
//special types of rule were defined: regular rules i.e. rules defined by a simple
//recursive definition, and the subset of these that are non-redundant i.e. derived
//in more than one step. These are the irreducible regular rules (IRR's). Later the
//array of CS's cs0 was added to the definition. These are all the CS's that lead by
//one or more TM steps to cs1.
class IRR{
CS[] cs0 = [];
CS cs1;
CS cs2;
char lsig = ' ';//pointer position in cs1 i.e. LHS: 'L' for 1 and 'R' for cs1.p > 1, blank if cs1.l=1
char sig = ' ';//L,R,H, or C i.e. left moving, right moving, halting, or cycling resp.
//See the paper for details.
int n_derivation = 0;//# derivation steps from cs1 to cs2.
int n_machine = 0;//# machine steps from cs1 to cs2.
@safe final this(ref CS[] cs0,ref CS cs1,ref CS cs2, const char lsig, const char sig, const int n_derivation,const int n_machine){
this.cs0 = cs0.dup;
this.cs1 = cs1.dup;
this.cs2 = cs2.dup;
this.lsig = lsig;
this.sig = sig;
this.n_derivation = n_derivation;
this.n_machine = n_machine;}
@safe final this(){ }
@safe final IRR dup(){ return new IRR(cs0, cs1, cs2, lsig, sig, n_derivation, n_machine);}
//Compare IRR's by their RHS CS's (cs2) then by cs1. The reasoning is that cs2 is always available in IRR's and IGR's but cs1 is often not available
//eg in IGR's but could be used to disabiguate sort orders. cs0 in general has multiple CS entries that themselves may not be sorted so should not be used.
final int opCmp(const ref IRR rhs){
if(cs2 != rhs.cs2) return cs2 < rhs.cs2? -1: 1;
else if(cs1 != rhs.cs1) return cs1 < rhs.cs1? -1: 1; else return 0;}
}
//************************************** class IRRP_set **************************************
//This class is for sets of IRR patterns (IRRP's) and the alpha value that generates them from another IRRP. see in my paper (2021 version).
//The RHS is unique but there can be many origins.
//An IRRP_set is a set of IRR's that have a common set of origins and rhs that have
//arbitrary context strings $T_1$ and $T_2$ respectively. These are just place-holders and will not be represented in the class
//because the information is coveyed by the class name itself.
//This is for convenience when using apply_rules to generate RHS's.
//The class IRRP_set is part of the class IGR below.
class IRRP_set{
char alpha = '*';//default value to permit sorting
CS[] origins = [];
CS rhs;
final this(const char alpha, CS[] origins, const ref CS rhs){
this.origins = origins.dup;
this.rhs = rhs.dup;
this.alpha = alpha;}
@safe final this(){}
final int opCmp(ref const IRRP_set irrps) const{//<,>,== should be exhaustive
if(alpha < irrps.alpha) return -1;
if(alpha > irrps.alpha) return 1;
if(rhs < irrps.rhs) return -1;
if(rhs > irrps.rhs) return 1;
if(origins[0] < irrps.origins[0]) return -1;
if(origins[0] > irrps.origins[0]) return 1;
return 0;}
final IRRP_set dup() { return new IRRP_set(alpha,origins,rhs);}
override bool opEquals(const Object o) const{//needed to compare class objects by their values
auto irrps = cast(IRRP_set)o;
if(alpha != irrps.alpha) return false;
if(origins != irrps.origins) return false;
if(rhs != irrps.rhs) return false;
return true;}
}
//****************************************** class RHE *****************************************
//right hand element. An array of these form the rhs of an IGR.
class RHE{
IRRP_set irrp_set;
int n_TM_step = 0;
//n_TM_step is the number of TM steps from the original rhs to the final rhs on the right of the IGR that contains the RHE.
int n_Sub_step = 0;
//n_Sub_step is the number of substitution steps involved in this.
@safe final this(){}
final this(ref IRRP_set irrp_set, const int n_TM_step, const int n_Sub_step){
this.irrp_set = irrp_set.dup;
this.n_TM_step = n_TM_step;
this.n_Sub_step = n_Sub_step;}
override bool opEquals(const Object o) const {
auto rhs = cast(RHE)o;//not allowed in safe code
if(n_TM_step != rhs.n_TM_step) return false;
if(n_Sub_step != rhs.n_Sub_step) return false;
if(irrp_set != rhs.irrp_set) return false;
return true;}
final RHE dup(){ return new RHE(irrp_set, n_TM_step, n_Sub_step);}
final int opCmp(ref const RHE rhe){
return irrp_set.opCmp(rhe.irrp_set);}
}
//****************************************** class Context_pair *********************************
//This is for any pair of strings. It is used only as a context for an IGR to be added on the left and right respectively of the implication.
//
class Context_pair{
char [] s1 = [];//left
char [] s2 = [];//right
final this(const ref char [] s1, const ref char [] s2){
this.s1 = s1.dup;
this.s2 = s2.dup;}
override bool opEquals(const Object o) const {
auto rhs = cast(Context_pair)o;//not allowed in safe code
if(s1 != rhs.s1) return false;
if(s2 != rhs.s2) return false;
return true;}
final this(){}
final Context_pair dup(){ return new Context_pair(s1,s2);}
}
//***************************************** class IGR **************************************
//This class is for sets of IGR's with a common lhs IRRP and with a set of pairs of strings (contexts). Each instance of
//this class will correspond to a row of Table 3 or Table 4. see my paper (2021 version).
class IGR{
int ns, j1, j2;//(ns) number of symbols in the rhs of the IGR (origin & rhs are same length),
// including context symbols, extreme positions (j1 = i1 - 1,j2 = i2) of the TM on the "tape" found when deriving the new origin and the new RHS.
//related to the length of the symbol strings. See paper.
Context_pair[] context = [];//this corresponds to a set of $T_1$ and $T_2$ pairs in the paper.
//The context is the same for each value of alpha.
IRRP_set lhs;//This IRRP_set corresponds to a set of IRR's to which when F is applied generates the set
//of IRRP_sets (combined with two ints into RHE's) on the rhs, one for each possible value of alpha (the symbol added).
RHE[] rhs = [];
final this(int ns, int j1, int j2, ref Context_pair[] context, ref IRRP_set lhs, ref RHE[] rhs){
this.ns = ns;
this.j1 = j1;
this.j2 = j2;
this.context = context.dup;
this.lhs = lhs.dup;
this.rhs = rhs.dup;}
final int opCmp(ref const IGR igr) const{
//compare IGR's by their lhs's. This is just based on logic, the set of rhs's follows from the lhs.
return lhs.opCmp(igr.lhs);}
//This is needed.
override bool opEquals(const Object o) const {
auto rhs1 = cast(IGR)o;//not allowed in safe code
if(ns != rhs1.ns) return false;
if(j1 != rhs1.j1) return false;
if(j2 != rhs1.j2) return false;
if(context != rhs1.context) return false;
if(lhs != rhs1.lhs) return false;
if(rhs != rhs1.rhs) return false;
return true;}
final this(){}
final IGR dup() { return new IGR(ns, j1, j2, context,lhs,rhs);}
}
//*********************** Several write functions ****************************
//A "write" function is not supplied by default for classes, but is for structs. I wrote
//CS_write, array_CS_write, IRR_write, array_IRR_write to mimic the results of the default struct write functions to compare
//the program outputs with classes and structs (for CS's and IRR's) and they agreed and other write functions were added later
//in the similar way.
void CS_write(File of,const ref CS cs){
if(cs !is null){
of.write("CS(");
of.write(cs.s);
of.write(", ");
of.write(cs.p);
of.write(", ");
of.write(cs.l);
of.write(", ");
of.write('"');
of.write(cs.t);
of.write('"');
of.write(")");}
else of.write("null CS");}
void Context_pair_write(File of, const ref Context_pair sp){
if(sp !is null){
of.write("(");
of.write(sp.s1);
of.write(",");
of.write(sp.s2);
of.write(")");}
else of.write("null Context_pair");}
void IRR_write(File of,const ref IRR r){
if(r !is null){
of.write("IRR(");
of.write("[");
bool first=true;
foreach(ref x;r.cs0){if(!first)of.write(", ");
CS_write(of,x);first = false;}
of.write("]");
of.write(", ");
CS_write(of,r.cs1);
of.write(", ");
CS_write(of,r.cs2);
of.write(", '");
of.write(r.lsig);
of.write("', '");
of.write(r.sig);
of.write("', ");
of.write(r.n_derivation);
of.write(", ");
of.write(r.n_machine);
of.write(")");}
else of.write("null IRR");}
void array_CS_write(File of,const ref CS[] array){
if(array !is null){
of.write("[");
foreach(ref x;array){
of.write(" ");CS_write(of,x);}
of.write(" ]");}
else of.write("null CS[]");}
void context_write(File of,const ref Context_pair[] array){
if(array !is null){
of.write("context[");
foreach(ref x;array){
of.write(" ");Context_pair_write(of,x);}
of.write(" ]");}
else of.write("null context");}
void array_IRR_write(File of,const ref IRR[] array){
if(array !is null){
bool first = true;
of.write(array.length);
foreach(ref x;array){
of.write(" ");if(!first)of.write("\n");IRR_write(of,x);first = false;}}
else of.write("null IRR[]");}
void IRRP_set_write(File of, const ref IRRP_set irrp_set){
if(irrp_set !is null){
of.write("IRRP_set(");
of.write(irrp_set.alpha);
of.write(", ");
of.array_CS_write(irrp_set.origins);
of.write(", ");
of.CS_write(irrp_set.rhs);
of.write(")");}
else of.write("null IRRP_set");}
void RHE_write(File of, const ref RHE rhe){
if(rhe !is null){
of.write("RHE(");
of.IRRP_set_write(rhe.irrp_set);
of.write(", ");
of.write(rhe.n_Sub_step);
of.write(", ");
of.write(rhe.n_TM_step);
of.write(")");}
else of.write("null RHE");
}
void array_RHE_write(File of, const ref RHE[] array){
if(array !is null){
of.write("[ ");
bool first = true;
foreach(ref x; array){
if(!first)of.write("\n");
RHE_write(of,x);
first = false;}
of.write(" ]");}
else of.write("null RHE[]");}
void array_IRRP_set_write(File of,const ref IRRP_set[] array){
if(array !is null){
bool first = true;
of.write("[ ");
foreach(ref x;array){
if(!first)of.write("\n");IRRP_set_write(of,x);first = false;}
of.write(" ]");}
else of.write("null IRRP_set[]");}
void IGR_write(File of, const ref IGR igr){
if(igr !is null){
of.write("IGR(");
of.write(igr.ns);
of.write(", ");
of.write(igr.j1);
of.write(", ");
of.write(igr.j2);
of.write(", ");
of.context_write(igr.context);
of.write(", ");
of.IRRP_set_write(igr.lhs);
of.write(", ");
of.array_RHE_write(igr.rhs);
of.write(")");}
else of.write("null IGR");}
void array_IGR_write(File of,const ref IGR[] array){
if(array !is null){
bool first = true;
of.write(array.length);
foreach(ref x;array){
of.write(" ");if(!first)of.write("\n");IGR_write(of,x);first = false;}
of.writeln();}
else of.write("null IGR[]");}
void NOOCO_write(File of, const ref NOOCO nooco){
write("\n nooco.cs0 = ");CS_write(stdout,nooco.cs0);
foreach(y; nooco.nota){//new origins triplet array
write("\n y.new_ori = ");CS_write(stdout,y.new_ori);
write("\n y.j1 = ",y.j1);
write("\n y.alpha = ",y.alpha);}}
void FIGR_write(File of, const ref FIGR figr){
of.CS_write(figr.lo);of.write(", ");
of.CS_write(figr.lr);of.write(", ");
of.write(figr.alpha);of.write(", ");
of.CS_write(figr.ro);of.write(", ");
of.CS_write(figr.rr);of.write("\n");}
//then a lot of the structure can go, and an IGR can then be of types CS, CS, char, CS, CS ie lo,lr,alpha,ro,rr for left hand side origin, left hand side right hand side etc.
void LIGR_write(File of, const ref LIGR ligr){
of.CS_write(ligr.lo);of.write(", ");
of.CS_write(ligr.ro);of.write(", ");}
void RIGR_write(File of, const ref RIGR rigr){
of.CS_write(rigr.lr);of.write(", ");
of.CS_write(rigr.rr);of.write(", ");}
void Store_write(const ref Store st){
write("\n\ninside = ",st.inside);
write("\ndircheck = ",st.dircheck);
write("\nL = ");array_CS_write(stdout,st.L);
write("\ncycle = ",st.cycle);
write("\nright_at_end = ",st.right_at_end);
write("\nleft_at_end = ",st.left_at_end);
write("\ncond_1 = ",st.cond_1);
write("\nj = ",st.j);
write("\nind = ",st.ind);
write("\nj1 = ",st.j1);//extreme pointer position counted from the alpha end of the string st. pointer starts at position 0.
write("\none_step_ori = ");array_CS_write(stdout,st.one_step_ori);
writeln("\nright_at_start = ",st.right_at_start);}
void G_write_full(){
foreach(v;G){
v.write();
foreach(x;v.descendents){
write(" ");x.write();}}}
void G_write_short(){
foreach(v;G){v.write();}}
//********************************** function can_write **************************************
//This function finds a filename to be written to for output and returns it. It ensures
//that no file is overwritten without the consent of the user. "base_name" is usually the input
//datafile name and "extension" is the default file extension for output (including the dot)
@trusted char[] can_write(const char[] base_name,const char[] extension){
char[] filename = (base_name ~ extension).dup;
char[] chars;
char[] question;
bool write1,fn,overwrite=false;
do{
fn = exists(filename);
if(fn){
writeln("The file ",filename," already exists.");
question="Do you want me to overwrite it?".dup;
overwrite=yes(question);
if(!overwrite){
writeln("The new path and file name is ",base_name,"",extension);
std.stdio.write("Enter the extra characters: ");
readln(chars);chars=chomp(chars);//not allowed in safe code
filename=base_name ~ chars ~ extension;}}
write1=(!fn) || overwrite;}
while (!write1);
writeln("Output will be to the file ",filename);
return filename;}
//************************************** function yes *****************************************
//This function asks the user the question by dispaying the text 'question' and waits for
//the response y (for yes) or n (for no)
@trusted bool yes(const char[] question){
char ans='*';
bool ret;
char[] res;
while(ans!='y' && ans!='n'){
std.stdio.write(question,"(y/n): ");
res=chomp(readln().dup);//not allowed in safe code
ans=res[0];
if(res.length==1){
if(ans=='y')ret = true;
else if(ans=='n') ret = false;}
else{ans='*';}}
return ret;
}
//*********************************** function input_int ***************************************
//The purpose of this function is to extract an int from the input stream and handle
//the error that occurs when a non-number is entered, and allow the user to enter it again.
@trusted void input_int(ref int i){//ref is needed here
bool error;
char[] line;
for(;;){
line = chomp(readln().dup);//not allowed in safe code
error = false;
try{i = to!int(line);} catch (Exception){error = true;}
if(!error) break;
std.stdio.write("An error occurred. Please try again: ");}
}
//*********************************************** function is_substring_left ************************************************
//This determines whether of not the shorter of the char arrays is identical to the portion of the other string of the same length
//taken from the left.
bool is_substring_left(ref char[] str1, ref char[] str2){
int l = to!int(min(str1.length, str2.length));
foreach(i; 0 .. l){
if(str1[i] != str2[i])return false;}
return true;}
//************************************ function strongConnect ********************************
void strongConnect(vertex v){
int count;
v.index = index;
v.lowlink = index;
index++;
S ~= v;//push in v
v.onStack = true;
foreach(w; v.descendents){
if(w.index == -1){
//writeln("calling strongConnect for ",w.name);
strongConnect(w);
// writeln("strongConnect for ",w.name," ended");
v.lowlink = min(v.lowlink,w.lowlink);}
else if(w.onStack){
v.lowlink = min(v.lowlink,w.index);}}
if(v.lowlink == v.index){
SCC = remove!("true")(SCC);//empties SCC if not already empty
do{
// writeln("S.length-1 = ",S.length-1);
w = S[S.length - 1];//last element of S
SCC ~= w;
--S.length;//s.pop()
w.onStack = false;
// writeln("w = ");w.write();writeln(" v = ");v.write();
}while(w.name!= v.name);
count = 0;
foreach(v1;SCC) ++ count;//checks that #(SCC) > 1 before outputting it.
if(count > 1){
// of.write("SCC = ");
// SCC.sort!();//this is ignored!
foreach(v1; SCC){of.write(v1.name);of.write(" ");}of.writeln("");}}
}
//************************************************* function can_follow ****************************************
//This function determines whether or not one IGR can be followed by another
//and if true, two IRR's can be derived in turn from a single IRR.
//The arguments are 6 CS's, the origin and rhs of the rhs of the first IGR, then the same for the lhs of the second IGR.
//and in addition, cs1 and cs2 at the origin of the lhs of the IGRs. This is needed to check that the type of the first
//IGR to ensure it corresponds to extendable IRR's i.e. type LRL or RLR.
//There are several conditions that must be satisfied:
//1) The direction of adding alpha must be the same
//2) The states on the left and right must match (rhs of x and lhs of y)
//3) IGR x (the first) must be of extendable type i.e. RLR of LRL
//4) The strings on the left and right must match from the end indicated in step 1.
bool can_follow(const ref FIGR figr1, const ref FIGR figr2){
bool lg_1,lg_2;
char[] str1_l,str1_r,str2_l,str2_r;
lg_1 = figr1.lo.p == 1;//figr1 is left-going (LRL or LRR)
lg_2 = figr2.lo.p == 1;//figr2 is left-going
if (lg_1 != lg_2)return false;//ensures condition 1
if(figr1.ro.s != figr2.lo.s)return false;//these ensure
if(figr1.rr.s != figr2.lr.s)return false;//condition 2
if(!((lg_1 && figr1.rr.p == 0) || (!lg_1 && figr1.rr.p == figr1.rr.l + 1)))return false;//figr1 must generate IRR's with extendable type
str1_l = figr1.ro.t.dup;
str1_r = figr1.rr.t.dup;
str2_l = figr2.lo.t.dup;
str2_r = figr2.lr.t.dup;
if(lg_1 && is_substring_left(str1_l,str2_l) && is_substring_left(str1_r,str2_r))return true;//strings match(left-going case)
reverse(str1_l);
reverse(str1_r);
reverse(str2_l);
reverse(str2_r);
if(!lg_1 && is_substring_left(str1_l,str2_l) && is_substring_left(str1_r,str2_r))return true;//strings match(right-going case)
return false;}
//***************************************** template function unique_sort_add ******************************************
//add element x to array in such a way as to maintain increasing order, and such that no elements are repeated
//in the array.
//Algorithm. Bisect the array segment repeatedly so as to locate between which pair of
//elements x is to be inserted to maintain inreasing order. When the pair of elements differ by 1 in index
//x can be inserted between them provided x is different from each of this pair.
//nova (meaning new which cannnot be used because it is a keyword) tells whether x is distinct from all the elements of array or not
void unique_sort_add(T)(ref T[] array, ref T x, out bool nova){
nova = true;
int l = to!int(array.length);//include case when l > 0 but elements are null
if(l == 0) array ~= x;//array is empty
if(l > 0){
//deal with the end cases next
if(x.opCmp(array[0]) < 0) {
if(array is null)array = [x]; else array = [x] ~ array;}//x is smaller than all array elements
else if(x.opCmp(array[l - 1]) > 0) array ~= x;//is greater than all array elements
else{//l != 0 and x is not to be inserted at an end.
//i1 and i2 are the indices which bracket the position where x is to be placed.
//and are at the lower and upper end of the range respectively.
int i3,i1 = 0;
int i2 = l - 1;
while(i2 - i1 > 1){
i3 = i1 + (i2 - i1)/2;
if(x.opCmp(array[i3]) < 0) i2 = i3;
else i1 = i3;}
if(array[i1] != x && x != array[i2]){//x should be included
array = array[0 .. i1 + 1] ~ x ~ array[i2 .. $];}
else nova = false;
//x is not different from all the elements of array, so array is not altered.
}}}
//******************************** function split_string ***************************
//Remove n_symbols from the string of a CS on the left or right opposite the pointer and put them in cont.
//The pointer is at or one space beyond the end of the string when called, and the length of the symbols strings is >= 2.
void split_string(ref CS cs,const int n_symbols, out char[] cont){
// write("\nsplit_string: cs = ");CS_write(stdout,cs);write(" n_symbols = ",n_symbols);
if(cs.t.length != cs.l){write("cs.l != cs.t.length");}
int i;
CS cs1 = cs.dup;
bool right = cs.p - cs.l == 0 || cs.p - cs.l == 1;//pointer is at right
cs.l -= n_symbols;
if(right){
cs.p -= n_symbols;
i = to!int(cs.t.length) - n_symbols;
rotate(cs.t,i);
cont = cs1.t[0 .. n_symbols];}
else cont = cs1.t[$ - n_symbols .. $].dup;
cs.t.length -= n_symbols;
}
//******************************** function add_opposite_pointer *******************
//Add symbol string (not inplemented as a D string!) s at the opposite end of the string of symbols from the pointer. formerly add_at_x
//The routine assumes that the pointer is at one end of the known
//symbols on the tape (other cases will not be needed). If the length of the string of known symbols is 1 s will be added on the right
//unless direction is set to 'L'.
@safe void add_opposite_pointer(ref CS cs,const char [] s, const char direction){
//symbols s are added at the right. Pointer is at left. ( i.e. assume cs.p == 1)
int len = to!int(s.length);
cs.t ~= s;
cs.l += len;
if(cs.p > 1 || direction == 'L'){//the symbols are added at the left, pointer is at right ( if cs.p == cs.l )
//rotate the string round so that s is now added at the left, and adjust cs.p accordingly.
cs.p += len;
rotate(cs.t,len);}}
//********************************* function add_at_ptr_or_end ****************************
//Add symbol sy to cs (1) at or (2) adjacent to the pointer position at end of symbol string.
//The routine assumes that the pointer is just off one end by one space (case 1) or at one end (case 2)
//of the known symbols on the tape so that the added symbol never clobbers previous known symbols on the tape.
//This assumes that the length of the symbol string is > 1 for case (2). If l = 1 and case (1) then
//the other parameters cannot tell whether the symbol sy should be added at the left or right. In this case
//this is determined by the parameter direction (L or R) respectively. 'R' is the default value.
@safe void add_at_ptr_or_end(ref CS cs,const char sy, const char direction){
cs.t ~= sy;
if(cs.p <= 1 && cs.l > 1 || cs.p == 0 && cs.l == 1 || (cs.p == 1 && cs.l == 1 && direction == 'L')){
rotate(cs.t,1);//swaps added symbol from right to left end.
++(cs.p);}
++(cs.l);
}
//******************************** function rotate *********************************
//Rotates a char[] array of length l by r places to the right.
@safe void rotate(ref char[] str,const int r){
char[] temp = str.dup;
int l=to!int(str.length);
temp.length = 2 * l - r;
foreach(ref i;0 .. l - r)temp[i + l] = temp[i];
foreach(ref i;0 .. l)str[i] = temp[i + l - r];
}
//******************************* function subset **********************************
//Returns whether cs1 is a subset of cs2 (includes equality)
@safe bool subset(const CS cs1,const CS cs2){
int ll;
if(cs1.s != cs2.s)return 0;//false if states differ
ll = cs1.p - cs2.p;//difference between left end of tape represented
if(ll < 0)return 0;
if((cs2.l - cs2.p) > (cs1.l - cs1.p))return 0;
char[] sub;
sub.length = cs2.l;
foreach(ref i;0 .. cs2.l)sub[i]=cs1.t[i + ll];
return sub == cs2.t;}
//******************************* function st_init ************************************
//This initialises the Store object st.
void st_init(){
st.inside = false;
st.dircheck = false;
st.L = [];
st.cycle = false;
st.right_at_end = false;
st.left_at_end = false;
st.cond_1 = 0;
st.j = 0;
st.ind = 0;
st.j1 = 0;}
//*************** why not initialise one_step_ori and right_at_start? **************
//************************* function new_origins ************************************
//new origins is a recursive algorithm (i.e. it calls itself) to perform the following tasks, but more efficiently.
//Initially the CS cs is augmented by adding an arbitrary symbol (called alpha in the papers) at the
//same end where the pointer is for the first call to new_origins. (This is the opposite end of the string to where
//the pointer is in IRR.cs1 i.e. its LHS.)
//All values of alpha are included in the search.
//Then new_origins finds all single backward TM steps from there and their
//resulting CS's. This is repeated for each branch recursively, until
//one of 4 conditions occurs in each branch. Condition 1: the pointer finishes at the opposite end from
//where it started in the LHS which is the same end where it was in cs (CS A in the paper). Condition 2: the pointer finishes at the same end
//to where it started in the LHS which is the opposite end where it started in cs.
//Condition 3: the computation can go no further because backward steps from there are impossible.
//Finally, condition 4: a CS is reached that has been entered before.
//All the CS's arrived at as a result of branches ending in condition 1 constitute the array ori on exit.
//The algorithm is based on a depth-first search of the tree. An array of the cs's obtained on
//the way (L) must be maintained in order to check for a loop (condition 4). This has to be updated at each step of
//the search. The first time new_origins is called for a CS cs, L is empty. Otherwise L is the current list of all
//previous CS's in the backward search for origins, excluding the one that has just been or being found.
//The new element of L is added first for each call to new_origins.
//To avoid repetition, alpha is not actually added to the string until the pointer reaches the end.
//The initial pointer position in the original LHS i.e. right or left end of the string, is passed down through the variable
//"right_at_start" meaning the pointer is at the right in IRR.cs1 (its LHS) which is true or false. This is used to distinguish
//case 1 and case 2 when the pointer ends up at the opposite end or same end.
//n_symbols is the number of symbols involved in this calculation for this branch. This is needed in order to find
//the number of symbols involved in apply_F to express it with no redundant symbols.
//n_symb is the array of values of n_symbols, one for each branch. n_symb is empty on first calling new_origins.
//The variables cycle, right_at_end, left_at_end, cond_1, j, ind, n_symbols and one_step_ori are in a global structure st of type Store
//of which there is just one instance to avoid creating new variables each time new_origins calls itself.
//ref is needed for all parameters (value and reference types) to guarantee 2-way information flow
//and avoid new copies being used for each call.
//The values in ori, n_symb, and alpha in nooco_object at the first call do not affect the running of new_origins except that these
//values will be prepended to any subsequent values added to these arrays by new_origins. So in a sense they are
//"out" only variables, but not quite. They should be ideally declared as "append only" array arguments but this is not possible.
//
@trusted void new_origins(ref CS cs,ref NOOCO nooco_object){
bool test;
st.cycle = canFind(st.L,cs);//not allowed in safe code
st.L ~= cs;
st.right_at_end = cs.p == cs.l;
st.left_at_end = cs.p == 1;
st.cond_1 = (st.right_at_start && st.left_at_end) || (!st.right_at_start && st.right_at_end);//condition 1
st.one_step_ori = [];
if(!st.cycle && ((!st.right_at_end && !st.left_at_end) || st.cond_1)){//conditions under which current branch is not required to stop i.e.
// non-cycling or condition 1. This eliminates cycling and condition 2.
//get the array (one_step_ori) of all CS's ncs that go to cs in 1 TM step with the pointer on the tape and all CS's in ori that
//do so from a new pointer position just outside the existing tape. The latter includes
//the extra symbol in place of the symbol alpha added.
foreach(ref k2; 1 .. nst+1){//look for a possible reverse TM step
foreach(ref k1; 0 .. nsy){//through the whole TM table
st.j = (k2 - 1) * nsy + k1;//index of TM table row
if(irrs[0][st.j].cs2.s == cs.s){//if the new TM state is the state of cs (1st condition for reverse step)
st.ind = cs.p - irrs[0][st.j].cs2.p;//array index of symbol matched. st.ind + 1 = (cs.p - 1 if TM step to be reversed goes right,cs.p + 1 otherwise)
st.inside = st.ind >= 0 && st.ind < cs.l;//pointer position after reverse TM step is within symbol string, otherwise it is at alpha
st.dircheck = (irrs[0][st.j].cs2.p == 2 && st.ind == -1) || (irrs[0][st.j].cs2.p == 0 && st.ind == cs.l);//checks that if condition 1 is true,
//the direction of the next reverse TM step takes the pointer just outside the current symbol string.
test = (st.inside && irrs[0][st.j].cs2.t[0] == cs.t[st.ind]) || (!st.inside && st.dircheck);
if(test){//if the new TM symbol is the symbol at ind (2nd condition for reverse step)
CS ncs = new CS;
ncs.s = irrs[0][st.j].cs1.s;//This would be k2 if TM table is written in order in file.
ncs.p = st.ind + 1;
ncs.t = cs.t.dup;
if(st.inside){
ncs.l = cs.l;
ncs.t[st.ind] = irrs[0][st.j].cs1.t[0];
st.one_step_ori ~= ncs;//dup not needed: cs2 goes out of scope next, so it cannot provide a "back door" to one_step_ori
}
else{//reverse TM step takes pointer outside string, so terminating this branch of the search.
//Find st.n_symbols which is the number of symbols involved i.e. the length of the character string involved in finding
//the new origin in the current branch.
//writeln("test11");
if(st.right_at_start){//The TM is at the right hand end of the symbol string in the LHS
st.j1 = 1;//find n_symbols i.e. the number of symbols used in the computation
foreach(ref x; st.L) st.j1 = max(st.j1, x.p);
st.j1 -= 1;}//because j1 = i1 - 1 is needed.
else{//The TM is at the left in the LHS
st.j1 = cs.l;
foreach(ref x; st.L) st.j1 = min(st.j1, x.p);
st.j1 = cs.l - st.j1;}//ex must be the extreme position based on starting at 0 instead of at cs.l and counting leftwards.
//find ncs when the new pointer position is just beyond the symbol string, remembering that if the
//reverse TM step goes left, then the new symbol is on the left increasing all the symbol positions by 1.
ncs.l = cs.l + 1;
ncs.t ~= irrs[0][st.j].cs1.t[0];
if(st.ind == -1) {++(ncs.p); rotate(ncs.t,1);}//add irrs[0][j].cs1.t[0] at position 0 or at right end of string otherwise
NOT noto;
noto.new_ori = ncs.dup;
noto.alpha = irrs[0][st.j].cs2.t[0];
noto.j1 = st.j1;
nooco_object.nota ~= noto;
}}}}}}
//This is the recursive bit
foreach(ref cs3;st.one_step_ori){
new_origins(cs3, nooco_object);}
if(st.L.length) --st.L.length;//remove last item in L if there are any after all other branches of the search have ended. (this CS cannot be an intermediate CS).
//It terminates the branch and is appropriate just prior to back-tracking from a branch that ends to follow another reverse TM step.}
}
//************************* function apply_rules ************************************
//The function "apply_rules" applies the previously found IRR's up to length one less than the current CS being
//processed to carry out substitution
//steps for the TM computation starting from the LHS named cs. n_steps is # substitution steps, nm = # TM steps.
//Do this repeatedly until one of the following
//results occurs: (1)pointer moves outside the length of tape with known symbols,
//to the right (R) or left(L).
//(2) a halt condition occurs (H), (3) an infinitely repeating cycle occurs(C).
//The result is a new rule with the stopping condition (LRHC) indicated and
//the numbers of derivation (i.e. substitution) and machine steps involved. In the case C there
//is no final CS. In the output there must be one, but C is
//indicated showing that it is not really final.
//The new rule is added to the IRR, because it must be irreducible
//as the result of the correct functioning of this computer program.
//Each computation will end in state L,R,H or C, meaning the pointer is
//at the left,right, the computation halts or does not terminate, respectively.
//On output, cs now is the final result of the substitutions. The new rules get
//inserted into the array of arrays of IRR's "irrs" in the main program.
//j2 is the extreme position of the pointer in this computation if it starts at position 0.
@trusted void apply_rules(ref CS cs,out int n_steps,out int nm, out char d, out char lsig, out int j2){
//d is "sig" for output cs: L,R,C, or H, lsig is sig for original cs (L or R).
if(cs.p == 1){lsig = 'L'; j2 = 0;} else {lsig = 'R'; j2 = cs.l;}//initialise j2
//write("\n1 apply_rules: j2 = ",j2);
CS[] cs_array;
cs_array ~= cs.dup;//dup is needed because cs_array is a record of the changing values of cs.
n_steps = 0;//# derivation steps already done
nm = 0;//# machine steps
bool match;
d = '0';
char sig;
int l;
IRR irr2;
for(;;){//repeatedly apply appropriate rules to continue computation
if(cs.p == 0){d = 'L';break;}
if(cs.p > cs.l){d = 'R';break;}
match=0;
for(int k = cs.l - 2;k >= 0; --k){//find the longest applicable rule
foreach(ref irr3;irrs[k]){
if(subset(cs,irr3.cs1)){match=1;
irr2 = irr3;//irr3 next goes out of scope so dup not needed provided irr2 is not altered bolow.
break;}}
if(match==1)break;}
//match now indicates if a matching rule has been found. This should be true always
if(!match){
throw new Exception("Error: no matching rule found");}
//Do substitution. The first position is element 0 as far as replace is concerned
if(cs.p == 1){//pointer at left in cs and irr2.
foreach(ref i;0 .. irr2.cs2.l){
cs.t[i] = irr2.cs2.t[i];}}
else if(cs.p > 1){//pointer at right in cs and irr2.
foreach(ref i;0 .. irr2.cs2.l){
cs.t[cs.l - irr2.cs2.l+i] = irr2.cs2.t[i];}}
cs.s = irr2.cs2.s;
cs.p += irr2.cs2.p - irr2.cs1.p;//increment of pointer position
// write("\n2 apply_rules: cs = ");CS_write(stdout,cs);
sig = irr2.sig;//keep a copy of this
l = irr2.cs2.l;//and this
//write("\n apply_rules: cs.p = ",cs.p);
//write("\n2 apply_rules: j2 = ",j2);
//cs.l is unchanged. cs is now the result of applying rule irr2 to previous cs
++ n_steps;
nm += irr2.n_machine;
if(count(cs_array,cs) > 0 || irr2.sig == 'C')d = 'C';//not allowed in safe code
cs_array ~= cs.dup;//dup needed
if(irr2.sig == 'H'){d = 'H';break;}
if(d == 'C')break;//if this CS was found before, there is an endless stationary loop so go to next case.
}
if(n_steps == 1)j2 = l - 1;
else{
if(sig == lsig) j2 = cs.l - 1; else j2 = cs.l;}
}
//********************************* main program **********************************************
void main(string[] args){
//1 Initial statement output and dialogues
writeln("This is TIEv3.2 copyright (C) 2024 John Nixon.\nThis program comes with ABSOLUTELY NO WARRANTY;\nThis is free software, and you are welcome
to redistribute it\nunder certain conditions; for more information see the source code file.\n");
if(args.length!=2){
writeln("Usage: ");
writeln("For example from the same directory as the program:");
writeln(" ./ ");
writeln("See the initial comments in the program source file for more information.");
writeln("The information in the file of the Turing Machine must be presented in the following");
writeln("order with each element separated from the next by any number (including zero) of");
writeln("whitespace characters (tabs spaces newlines etc).");
writeln("to allow great flexibility of presentation:");
writeln("Number of TM states (nst), Number of TM symbols(nsy), then for each combination of these:");
writeln("Old state,symbol read,new state,symbol printed,direction of movement (L or R) or halt (H).");
writeln("The states are 1,2,... nst, and the symbols are a,b,c,...");
writeln("The lines of the TM machine table must be sorted first by old state increasing, and then by");
writeln("symbol read in alphabetical order a to z.");
writeln("This last requirement simplifies the algorithm for searching for reverse TM steps");
writeln("For example\n3 \n2 \n1a2bR\n1b1bH\n2a3bR\n2b3aL\n3a1aL\n3b1aR\n");
writeln("For verification, the program's interpretation of the user specified TM is first shown.");
return;}
bool print_IRRs = yes("In the output file do you want the IRR's printed?");
bool print_full_IGRs = yes("In the output file do you want the full IGR's printed?");
bool print_IGRs = yes("and do you want the IGR's without contexts printed?");
int n_symbols;
char alpha;
char dir1, dir2;//direction for addition of new symbol(s) for the context strings dir1 will be used, for alpha
//dir2 will be used. ( values will be 'L' or 'R' or possibly ' ' if is not needed).
string inputfilename=args[1];
File infile = File(inputfilename,"r");
if(!exists(inputfilename)){writeln("I cannot find file ",inputfilename);return;}
char[] outputfilename=can_write(inputfilename,"_out.txt");
of.open(outputfilename.idup,"w");
infile.readf(" %s",&nst);infile.readf(" %s",&nsy);
int c = nsy*nst;
char [] question;
int q, n, opt;
stdout.write("\nEnter the maximum length of derived computation rules: ");
input_int(n);
if(print_IRRs){
stdout.write("\nEnter 1 for showing all IRR's, 2 for showing only IRR's of type LL, or 3 for showing only IRR's of type RR: ");
input_int(opt);}
irrs.length = n;
of.writeln("The Turing Machine table expressed as IRR's of length 1");
writeln("The Turing Machine table expressed as IRR's of length 1");
stdout.write("See the documentation in the program source file for the notation.\n");
of.write("See the documentation in the program source tie.d file for the notation.\n");
//2 Reading in the input Turing Machine file
foreach(ref oc;0 .. c){
IRR r1 = new IRR;//"new" is needed (a class object) to avoid a segmentation fault with r1.dup
CS[] cs0;
CS cs1 = new CS;//new is needed here
infile.readf(" %d",&cs1.s);
cs1.l=1;
cs1.t.length = 1;
infile.readf(" %s",&cs1.t[0]);
cs1.p=1;
cs0 ~= cs1;
CS cs2 = new CS;
infile.readf(" %d",&cs2.s);
cs2.l=1;
cs2.t.length = 1;
infile.readf(" %s",&cs2.t[0]);
infile.readf(" %s",&r1.sig);
if(r1.sig == 'L')cs2.p = 0;else if(r1.sig == 'R')cs2.p = 2;else cs2.p = 1;
r1.cs0 = cs0;
r1.cs1 = cs1;
r1.cs2 = cs2;
r1.n_derivation = 1;
r1.n_machine = 1;
irrs[0] ~= r1;}//dup not needed because r1 goes out of scope next.
//r1 is now out of scope so cannot be used to change irrs
array_IRR_write(stdout,irrs[0]);
array_IRR_write(of,irrs[0]);
int count, countll, countlr, countrl, countrr;
CS csi = new CS;
//3 Generate the reversed Turing Machine
CS[][CS] rtm;//reversed Turing Machine steps represented as an associative array. This structure is filled in next.
//An effect of using the associative array is that each time it is constructed the
//key-value pairs are usually in a different order! The storage is using a hash table, whose elements are stored in an unordered way.
//This is a very sophisticated data structure designed for rapid retrieval of information not really needed here.
//Do this first for the TM steps going left
int ind;
foreach(ref st;1 .. nst+1)foreach(ref sy;0 .. nsy){
CS cs1 = new CS;
cs1.s = st;
cs1.p = 0;
cs1.l = 1;
cs1.t.length = 1;
cs1.t[0] = to!char(sy+97);
rtm[cs1] = [];//declares that there are no CS(1)'s leading to this (state,symbol) combination yet found.
foreach(ref st1;1 .. nst+1)foreach(ref sy1;0 .. nsy){//search through TM table to find them all
ind=(st1 - 1) * nsy + sy1;
//IRR_write(of,irrs[0][0]);
//array_IRR_write(stdout,irrs[0]);
if(irrs[0][ind].cs2.p == 0 && irrs[0][ind].cs2.s == st && irrs[0][ind].cs2.t[0] == to!char(sy + 97)){//checks state and symbol and direction
CS cs2 = new CS;
cs2.s = irrs[0][ind].cs1.s;
cs2.p = 1;
cs2.l = 1;
cs2.t.length = 1;
cs2.t[0] = irrs[0][ind].cs1.t[0];
rtm[cs1] ~= cs2;//no dup needed because cs2 goes out of scope next.
// write("cs1 = ");CS_write(stdout,cs1);write("cs2 = ");CS_write(stdout,cs2);writeln("");
}}}
//then for the TM steps going right
foreach(ref st;1 .. nst+1)foreach(ref sy;0 .. nsy){
CS cs1 = new CS;
cs1.s = st;
cs1.p = 2;
cs1.l = 1;
cs1.t.length = 1;
cs1.t[0] = to!char(sy+97);
rtm[cs1] = [];
foreach(ref st1;1 .. nst+1)foreach(ref sy1;0 .. nsy){
ind=(st1-1)*nsy+sy1;
if(irrs[0][ind].cs2.p==2 && irrs[0][ind].cs2.s == st && irrs[0][ind].cs2.t[0] == to!char(sy+97)){
CS cs2 = new CS;
cs2.s = irrs[0][ind].cs1.s;
cs2.p = 1;
cs2.l = 1;
cs2.t.length = 1;
cs2.t[0] = irrs[0][ind].cs1.t[0];
rtm[cs1] ~= cs2;//no dup needed and it makes no difference!
// write("cs1 = ");CS_write(stdout,cs1);write("cs2 = ");CS_write(stdout,cs2);writeln("");
}}}
//4 Write out rtm the reversed Turing Machine
writeln("\n\nThe inverse Turing Machine follows:\n");
foreach(element;rtm.byKeyValue){
CS_write(stdout,element.key);write(", > ");array_CS_write(stdout,element.value);writeln("");}
//5 Generate the set IRR(2) the irreducible regular rules of length 2 using "apply rules"
//add symbols at the pointer s.t. forward TM step takes the pointer to opposite end of tape (l=2).
//alpha_n is the integer representation of the symbol called alpha in the papers: 0 for a ,1 for b etc. to give an irreducible rule
//This (section 5) uses the fact that for this special case there is a simpler procedure than the general one below in section 6.
//Here irr is the rule started with (just a TM table entry) and irr2 is the TM table entry used to progress the TM to the next step
//after the next symbol has been added.
c = 0;
bool inlist;
foreach(irr;irrs[0]){
foreach(irr2;irrs[0]){
if(irr2.cs1.s != irr.cs2.s)continue;//ignore non-matching rows of the machine table
if(irr2.sig == irr.sig)continue;//If pointer moves in same direction, no new IRR can result
IRR new_irr = new IRR;
CS cs3 = irr.cs2.dup;
add_at_ptr_or_end(cs3,irr2.cs1.t[0],'R');//LHS of new element of IRR(2)
new_irr.cs1 = cs3.dup;
CS cs4 = irr.cs1.dup;
char d, dir, lsig;
int n_steps, nm, j2;
if(irr.cs2.p == 0) dir = 'L'; else dir = 'R';
add_at_ptr_or_end(cs4,irr2.cs1.t[0],dir);//origin of new element of IRR(2)
new_irr.cs0.length = 1;
new_irr.cs0[0] = cs4;
apply_rules(cs3,n_steps,nm, d, lsig, j2);
new_irr.cs2 = cs3;
if (new_irr.cs1.p == 1) new_irr.lsig = 'L'; else new_irr.lsig = 'R';
new_irr.sig = d;
new_irr.n_derivation = n_steps;
new_irr.n_machine = nm;
inlist = false;
for(int j = 0; j < c; ++ j){
if(new_irr.cs1 == irrs[1][j].cs1) {
inlist = true;
irrs[1][j].cs0 ~= new_irr.cs0[0];
break;}}
if(!inlist){
++ c;
irrs[1] ~= new_irr;
// if(new_irr.cs1 ne all previous values) only then increment the counts
if(new_irr.sig == 'L'){ if(new_irr.lsig == 'L') ++ countll;else if(new_irr.lsig == 'R') ++ countrl;}
if(new_irr.sig == 'R'){ if(new_irr.lsig == 'L') ++ countlr;else if(new_irr.lsig == 'R') ++ countrr;}}
}}
stdout.write("\nThe numbers of irreducible regular (computation) rules (IRR's) of different lengths and types");
of.write("\nThe numbers of irreducible regular (computation) rules (IRR's) of different lengths and types");
stdout.write("\nderived from this Turing Machine are as follows:");
of.write("\nderived from this Turing Machine are as follows:\n");
stdout.writeln("\nSee the output file "~outputfilename~" for the list of IRR's obtained (if requested).");
write("\nlength: ", 2," # IRR's = total, RLL,RLR,LRL,LRR: ",c, " ", countll," ",countlr," ",countrl," ",countrr);
of.write("\nlength: ", 2," # IRR's = total, RLL,RLR,LRL,LRR: ",c, " ", countll," ",countlr," ",countrl," ",countrr);
igrs.length = n - 1;
//6 Generate the set IRR(3) and IRR's of greater length using "new origins" and "apply rules"
CS[] new_oris = new CS[0];//combined array of origins.
char[] d_alpha;
int n_derivation,n_machine;
bool new_alpha;
//carry out analysis for the required number of iterations i.e.
foreach(ref i; 2 .. n){//given all irreducible rules (IRR's) of length i, find all extensions of them to length i+1.
count = 0;//initialise count of the number of IRR's found of length i.
countll = countlr = countrl = countrr = 0;
foreach(ref r2; irrs[i - 1]){
if(!((r2.lsig == 'L' && r2.sig == 'R') || (r2.lsig == 'R' && r2.sig == 'L')))continue;//ignore IRR's that are not of type LR or RL
IRR[] new_irrs = new IRR[0];//empty array of IRR's to be filled in
NOOCO[] nooco_array;
foreach(ref cs; r2.cs0){//generate nooco_array an array of nooco_objects containing all the origins for each member of the origin of r2.
NOOCO nooco_object;
st.L = [];
st.right_at_start = cs.p == 1;//i.e. pointer is at left in origin (it is at right in the LHS).
st_init();//initialise the Store st used by new_origins.
new_origins(cs, nooco_object);
nooco_object.cs0 = cs.dup;
nooco_array ~= nooco_object;}
c = 0;//counter for the number of distinct new values of alpha
d_alpha = [];//array of distinct values of alpha in the order they are found.
foreach(ref x; nooco_array){//each old IRR gives one NOOCO array
foreach(ref noto; x.nota){//for each value of alpha
new_alpha = true;//assume at first this value of alpha has not been found before
for(int j = 0; j < c; ++j){//loop over all previously found values of alpha
if(d_alpha[j] == noto.alpha){//alpha has been found before i.e. alpha[j]
new_irrs[j].cs0 ~= noto.new_ori;//adds new origin to the IRR derived using the same value of alpha.
//increment one of the 4 counters (whichever one applies)
new_alpha = false; break;}}
if(new_alpha){
IRR new_irr = new IRR;
d_alpha ~= noto.alpha;
new_irr.cs0 ~= noto.new_ori.dup;
new_irr.cs1 = r2.cs1.dup;
add_opposite_pointer(new_irr.cs1,[noto.alpha], '*');//which side are symbols added is determined by new_irr.cs1
new_irr.cs2 = r2.cs2.dup;
add_at_ptr_or_end(new_irr.cs2, noto.alpha, dir2);//dir is not used because new_irr.cs2 has length >=2.
apply_rules(new_irr.cs2,n_derivation,n_machine,new_irr.sig,new_irr.lsig,ind);//last arg missing an int (out)
if(new_irr.cs1.p == 1)new_irr.lsig = 'L';else new_irr.lsig = 'R';
new_irr.n_derivation = n_derivation + 1;
new_irr.n_machine = n_machine + r2.n_machine;
++ count;
++ c;
new_irrs ~= new_irr;
if(new_irr.sig == 'L'){ if(new_irr.lsig == 'L') ++ countll; else if(new_irr.lsig == 'R') ++ countrl;}//in the first pass through loop
if(new_irr.sig == 'R'){ if(new_irr.lsig == 'L') ++ countlr; else if(new_irr.lsig == 'R') ++ countrr;}//determine the type of IRR
}
}}
irrs[i] ~= new_irrs;}
write("\nlength: ", i + 1," # IRR's = total, RLL,RLR,LRL,LRR: ", count, " ", countll," ",countlr," ",countrl," ",countrr);
of.write("\nlength: ", i + 1," # IRR's = total, RLL,RLR,LRL,LRR: ",count, " ", countll," ",countlr," ",countrl," ",countrr);
}
if(print_IRRs){
if(opt == 1)of.write("\nAll IRR's are shown\n\n");
if(opt == 2)of.write("\nOnly IRR's of type LL are shown\n\n");
if(opt == 3)of.write("\nOnly IRR's of type RR are shown\n\n");
foreach(ref i; 0 .. n){
if(irrs[i].length != 0){
if(i)of.writeln("IRR's of length ",i + 1,"(",irrs[i].length,")");
else of.writeln("\nThe Turing Machine table expressed as IRR's of length 1");
sort!()(irrs[i]);
foreach(j,ref it;irrs[i]){
if((opt == 1)||(opt == 2 && it.cs1.p == 1 && it.cs2.p == 0)||(opt == 3 && it.cs1.p == it.cs1.l && it.cs2.p == it.cs2.l + 1)){
of.write(j + 1," ");IRR_write(of,it);of.writeln();}}
of.writeln();}}}
//7 Generate the set IGR(2) from scratch
char[][] d2_alpha;
int n_steps, nm, j1, j2, r1 ,r2;
char[] cont_l,cont_r;
CS cs0,cs1,cs2,cs3,cs4,cs5,cs6;
char lsig,sig;
int[] d_j2;
int_pair[] d_j_pair;
bool new_j2;
bool new_j_pair;
bool next;//jump to next case because TM step takes machine in same direction so result cannot be an IRR or part of an IGR.
bool new_result, first_new_result, new_igr, new_ns, nova;
char d;//out variable for apply_rules
IGR[] igr_array1;
IGR[][] igr_array;
igrs[0].length = 1;//index 0 is out of bounds because igrs has length 0
igrs[0][0].length = 3;
//d2_alpha is here used for the array of distinct alpha values for each distinct value of the index j2 (= i2 in paper)
foreach(llc1,ref r; irrs[0]){//for each single TM step
d_j2 = [];//array of distinct values of j2 for the same r in the order they are found.
d2_alpha = [];
igr_array1 = [];
igr_array1.length = 3;
if(r.cs2.p > 1) {dir1 = 'L'; dir2 = 'R';}
if(r.cs2.p == 0) {dir1 = 'R'; dir2 = 'L';}
new_igr = false;//This indicates whether igr will actually be needed. If not igr goes out of scope without being used.
foreach(ref sy; 0 .. nsy){
alpha = to!char(sy + 97);
cs4 = r.cs1.dup;//reset these
cs5 = r.cs2.dup;//values
//add alpha to r
add_at_ptr_or_end(cs4, alpha, dir2);
add_at_ptr_or_end(cs5, alpha, dir2);
apply_rules(cs5,n_derivation,n_machine,sig,lsig,j2);//"sig" for output CS needed? No but keep sig and lsig
new_result = false;
//conditions on the LHS and TM table entry (as an IRR) needed for the current value of alpha
foreach(llc4,ref irr2;irrs[0]){//search for matching row(s) to find new origins
if((irr2.cs2.s == cs4.s) && (alpha == irr2.cs2.t[0]) && ((cs4.p == 1 && irr2.cs2.p == 0) || (cs4.p == 2 && irr2.cs2.p == 2))){
cs6 = irr2.cs1.dup;
add_at_ptr_or_end(cs6, r.cs1.t[0], dir1);
//j1 is always 0 so this variable is not needed
//All the results for this r, and possible reverse steps have been found i.e. alpha, cs6, cs5, j2.
//These must now be assembled into the IGR's (one for each value of j2) corresponding to r.
r1 = 1;
r2 = min(j2 + 1,2);//on each side of IGR)
new_j2 = true;
for(int k = 0; k < d_j2.length; ++k){
if(j2 == d_j2[k]){//(d_j2[k] is the kth distinct value of j2
new_j2 = false;
new_alpha = true;//this is assumed first to be a new value of (new)alpha.
for(int j = 0; j < d2_alpha[k].length; ++j){
if(d2_alpha[k][j] == alpha){//compare the jth distinct value of alpha for the kth distinct value of j2 with y.alpha
new_alpha = false;
split_string(cs6, 2 - r1, cont_l);//# symbols to be removed
if(cs6 !is null) unique_sort_add(igr_array1[j2].rhs[j].irrp_set.origins, cs6, nova);//adds new origin to the IRR derived using the same value of alpha.
break;}}
if(new_alpha){
RHE rhe_r = new RHE;
IRRP_set irrp_set_r = new IRRP_set;
d2_alpha[k] ~= alpha;
irrp_set_r.alpha = alpha;
split_string(cs6, 2 - r1, cont_l);
irrp_set_r.origins = [cs6];
irrp_set_r.rhs = cs5;
split_string(irrp_set_r.rhs, 2 - r2, cont_r);
rhe_r.n_TM_step = n_machine;
rhe_r.n_Sub_step = n_derivation;
rhe_r.irrp_set = irrp_set_r;
igr_array1[j2].rhs ~= rhe_r;}
break;}}
if(new_j2){
d_j2 ~= j2;
RHE rhe_r = new RHE;
IRRP_set irrp_set_r = new IRRP_set;
IRRP_set irrp_set_l = new IRRP_set;
d2_alpha ~= [alpha];//array of distinct values of the new alpha in the order they are found.
irrp_set_r.alpha = alpha;
irrp_set_r.origins = [cs6];
split_string(irrp_set_r.origins[0], 2 - r1, cont_l);
irrp_set_r.rhs = cs5;
split_string(irrp_set_r.rhs, 2 - r2, cont_r);
rhe_r.irrp_set = irrp_set_r;
rhe_r.n_TM_step = n_machine;
rhe_r.n_Sub_step = n_derivation;
cs3 = r.cs1.dup;
split_string(cs3, 2 - r1, cont_l);//should happen once only for each new IGR
irrp_set_l.alpha = '*';//This removes old alpha from the lhs of the IGR's and results in sorting them by
//their left IRRP_sets i.e. by the rhs of the left IRRP_set, then by its origin.
irrp_set_l.origins = [cs3];
if(r.sig == 'L') irrp_set_l.origins[0].p = 1; else irrp_set_l.origins[0].p = 0;//For IGR(2) the reverse search has 1 TM step
//so the direction of r determines the direction of adding the context. This is given by igr.lhs.p being 1 or not.
irrp_set_l.rhs = r.cs2.dup;
split_string(irrp_set_l.rhs, 2 - r2, cont_r);
IGR igr = new IGR;
igr.ns = 2; igr.j1 = 0; igr.j2 = j2; igr.rhs = [rhe_r]; igr.lhs = irrp_set_l;
igr.context ~= new Context_pair(cont_l,cont_r);
igr_array1[j2] = igr;}}}}
foreach(ref j2_1; 0 .. 3){
if(igr_array1[j2_1] !is null){igrs[0][0][j2_1] ~= igr_array1[j2_1];
}}}
write("\nstarting the calculation of the IGR's of length l greater than 2\n");
//8 Generate the sets of IGRs of length l greater than 2
foreach(ref i; 1 .. n - 1){
write("l = ",i + 2,"\n");
//This is the main loop that when complete generates igrs, the 4D array of IGR's that generate the IRR's up to
//length n. In cycle i these arrays of IGR's are added to such that the new IGR's added generate the IRR(i + 2) from
//the IRR(i + 1). For convenience I introduce the indices j1 and j2 in the paper so that the indices of igrs, i, j1, and j2 all
//start from 0.
//The IGR's generating the IRR(2) have already been found, these are in igrs[0][0][0], igrs[0][0][1], and igrs[0][0][2], where
//the last index j2 indicates the rightmost position of the pointer relative to alpha in the computation of the RHS for the IRR(2) of type RL.
//For the IRR(2) of type LR it is the leftmost position.
//The first cycle (i = 1) will generate the IGR's for obtaining the IRR(3) from the IRR(2) and these will be put in igrs[1][j1][j2]
// for j1 = 0, 0 <= j2 <= 3.
//Cycle i starts with the set of arrays igrs[i - 1][j1][j2] for 0 <= j1 <= i - 2 and 0 <= j2 <= i + 1
//and generates new members for the larger set of arrays igrs[i][j1][j2] for 0 <= j1 <= i - 1 and 0 <= j2 <= i + 2.
//The last cycle is for i = n - 2 generating IGR's to be added to the final set of arrays igrs[n - 2][j1][j2]
//for 0 <= j1 <= n - 3 and 0 <= j2 <= n which describe the generation of IRR(n) from IRR(n - 1).
//For each value of old alpha in the procedure called F, a set results (IRRP_sets) in the temporary array igr_array is produced,
//one for each value of new alpha. As a result of the removal of extra symbols, some are reduced in length.
//The last loop over the origins ensures that each new IGR generated is from a single origin in the RHS of the original IGR
//i.e. it has a single origin on its LHS. The same is true for the IGR(2) because they are generated from a single TM step
//that necessarily has only one origin.
igrs[i].length = max(1, i);
foreach(ref i1; 0 .. max(1, i)){
igrs[i][i1].length = i + 3;}
int l1, l2;
l1 = max(1,i - 1);l2 = i + 2;
foreach(ref i1; 0 .. l1){
foreach(ref i2; 0 .. l2){
foreach(ref igr;igrs[i - 1][i1][i2]){
if(igr.lhs.origins[0].p == 1) dir1 = 'R';else dir1 = 'L';
foreach(lc4,ref cp; igr.context){
foreach(lc5,ref o_rhe;igr.rhs){//o is for old or original. This is a loop over old alpha.
cs2 = o_rhe.irrp_set.rhs.dup;
add_opposite_pointer(cs2,cp.s2,dir1);//cs2 is now the RHS of an IRRP_set with context added
foreach(lc6,ref ori; o_rhe.irrp_set.origins){
d2_alpha.length = 0;
cs0 = ori.dup;
add_opposite_pointer(cs0,cp.s1,dir1);
cs3 = cs0.dup;//are cs0 and cs3 both needed?
if((cs3.p == 1 && cs2.p > 0) || (cs3.p > 1 && cs2.p == 0))continue;//check for the type of the IRRP. Only if it is LRL or RLR can F be applied.
// $cs3\to\to cs2$ is now the LHS of a new IGR to be found (before chopping out the context strings.
st.L = [];//#1: Two initialisations needed for each run of "new_origins"
st.right_at_start = cs0.p == 1;//#2: i.e. pointer is at left in origin (it is at right in the LHS).
st_init();//initialise the Store st used by new_origins. needed?
NOOCO nooco_object;
new_origins(cs0, nooco_object);
nooco_object.cs0 = cs0.dup;
d_j_pair = [];//(j1,j2)
igr_array = [];//empty this array
igr_array.length = i;
foreach(ref l; 0 .. i)igr_array[l].length = i + 3;
foreach(count2, ref y; nooco_object.nota){//new_origins_triplet_array
cs4 = cs2.dup;
if(cs2.p > cs2.l) dir2 = 'R';
if(cs2.p == 0) dir2 = 'L';
add_at_ptr_or_end(cs4, y.alpha, dir2);
apply_rules(cs4,n_derivation,n_machine,sig,lsig,j2);//"sig" for output CS needed? No but keep sig and lsig
j1 = y.j1;
new_j_pair = true;
if(j1 == 0)r1 = 1; else r1 = j1 + 2;//calculate the r1 and r2 parameters (min # symbols needed
r2 = min(j2 + 1, i + 2);//on each side of IGR)
foreach(ref k; 0 .. d_j_pair.length){//d_j_pair is the array of distinct (j1, j2) pairs
if(j1 == d_j_pair[k].j1 && j2 == d_j_pair[k].j2){
new_j_pair = false;
new_alpha = true;//this is assumed first to be a new value of (new)alpha.
foreach(ref j; 0 .. d2_alpha[k].length){
if(d2_alpha[k][j] == y.alpha){
new_alpha = false;
split_string(y.new_ori, i + 2 - r1, cont_l);//# symbols to be removed, so r1 symbols remain
if(y.new_ori !is null) unique_sort_add(igr_array[j1][j2].rhs[j].irrp_set.origins, y.new_ori, nova);//adds new origin to the IRR derived using the same value of alpha.
break;}}
if(new_alpha){
RHE rhe_r = new RHE;
IRRP_set irrp_set_r = new IRRP_set;
d2_alpha[k] ~= y.alpha;
irrp_set_r.alpha = y.alpha;
split_string(y.new_ori, i + 2 - r1, cont_l);
irrp_set_r.origins = [y.new_ori];
irrp_set_r.rhs = cs4;
split_string(irrp_set_r.rhs, i + 2 - r2, cont_r);
rhe_r.n_TM_step = n_machine;
rhe_r.n_Sub_step = n_derivation;
rhe_r.irrp_set = irrp_set_r;
igr_array[j1][j2].rhs ~= rhe_r;}
break;}}
if(new_j_pair){
d_j_pair ~= int_pair(j1,j2);
RHE rhe_r = new RHE;
IRRP_set irrp_set_r = new IRRP_set;
IRRP_set irrp_set_l = new IRRP_set;
d2_alpha ~= [y.alpha];//array of distinct values of the new alpha in the order they are found.
irrp_set_r.alpha = y.alpha;
irrp_set_r.origins = [y.new_ori];
split_string(irrp_set_r.origins[0], i + 2 - r1, cont_l);//r1 symbols remain
irrp_set_r.rhs = cs4;
split_string(irrp_set_r.rhs, i + 2 - r2, cont_r);//r2 symbols remain
rhe_r.irrp_set = irrp_set_r;
rhe_r.n_TM_step = n_machine;
rhe_r.n_Sub_step = n_derivation;
cs1 = cs3.dup;
split_string(cs1, i + 2 - r1, cont_l);//should happen once only for each new IGR
irrp_set_l.alpha = '*';//This removes old alpha from the lhs of the IGR's and results in sorting them by
//their left IRRP_sets i.e. by the rhs of the left IRRP_set, then by its origin.
irrp_set_l.origins = [cs1];
irrp_set_l.rhs = cs2.dup;
split_string(irrp_set_l.rhs, i + 2 - r2, cont_r);
IGR igr1 = new IGR;
igr1.ns = i + 2;
igr1.j1 = j1;
igr1.j2 = j2;
igr1.rhs = [rhe_r];
igr1.lhs = irrp_set_l;
igr1.context ~= new Context_pair(cont_l,cont_r);
igr_array[j1][j2] = igr1;
}}
//sort each element of igr_array individually into z by IRRP_sets in their rhs's using alpha.
//then add z into main arrays igrs maintaining sort order.
foreach(ref x; igr_array){
if(x !is null){
foreach(ref y; x){
if(y !is null){
IGR z = new IGR;
z.ns = y.ns;
z.j1 = y.j1;
z.j2 = y.j2;
z.lhs = y.lhs;
z.context = y.context;
foreach(ref q1; y.rhs){
if(q1 !is null) unique_sort_add(z.rhs, q1, nova);}
// unique_sort_add(igrs[i][j1][j2], z);
igrs[i][j1][j2] ~= z;}}}}}
}//ends loop over o_irrp_set or equivalently old alpha
}//ends loop over cp (context pairs) of input IGR
}//ends loop over igr of a set of IGR's.
}//ends loop over i2
}//ends loop over i1
}//ends loop over i
int[] freq;//freq[i] is the frequency of IGR's having total length i + 2
freq.length = n - 1;
foreach(i,ref x; igrs){
if(print_full_IGRs){of.write("\n\ninclusive (context included) IGR's of length: ",i + 2);}
++ freq[i];
if(print_full_IGRs){
foreach(i1,ref y; x){
foreach(i2,ref t; y){
foreach(ref w; t){
of.writeln(); IGR_write(of,w);}}}}}
of.write("\n");
if(print_IGRs) of.writeln("\nThe set of distinct IGR's (with the context deleted) needed to obtain\nthe IRR's of length 2, then the extra ones needed to obtain IRR's of each length n");
int[] fr;//number of new IGR's needed for each value of i
fr.length = n - 1;
IGR[] igrs_nc = new IGR[0];//These are IGR's that are consolidated by deleting the context
foreach(i,ref x; igrs){
if(print_IGRs) of.write("\nn = ",i + 2);
foreach(i1,ref y; x){
foreach(i2,ref z; y){
foreach(ref w; z){
w.context.length = 0;//deletes context
w.ns = 0;//this value no longer has a meaning, being dependent on context.
foreach(ref rhe; w.rhs){//these values depend on the context, so have no meaning in its absence
rhe.n_TM_step = 0;
rhe.n_Sub_step = 0;}
if(w !is null) unique_sort_add(igrs_nc, w, nova);
if(nova) {
++ fr[i];
if(print_IGRs){of.write("\n");IGR_write(of,w);}}}}}}
of.write("\n\nThe number of distinct extra IGR's needed to obtain the IRR's of length 2, 3, ... : ");
foreach(ref i; 0 .. n - 1)of.write(fr[i], " ");
//Construct the relation "can be followed by" based on the set igrs_nc, the set of IGR's with context removed.
//From this, strongly connected parts and cycles can be identified to get
//formulae for IRR's obtained iteratively.
bool [][] cbf = new bool[][0];
bool x_lg,y_lg;//x resp. y are left_going
bool cbfb;
int s;
cbf.length = igrs_nc.length;
if(print_IGRs) {of.write("\n\nThe IGR's listed in sorted order and numbered");
foreach(i,ref x; igrs_nc){
of.write("\n",i + 1," ");IGR_write(of,x);}//order for IGR's listed
of.write("\n\n");}
FIGR [] figrs = new FIGR[0];//The array of flattened IGR's i.e. all the structure is removed just the original elements remain
//and an IGR can then be a tuple of types CS, CS, char, CS, CS ie lo,lr,alpha,ro,rr for left hand side origin, left hand side right hand side etc.
//This fills in figrs
of.write("Flattened IGR's listed in order with names based on the above order\n");
int l = 0;
string name;
LIGR[] ligrs = new LIGR[0];
RIGR[] rigrs = new RIGR[0];
string[] names = new char[][0];
foreach(i, ref x;igrs_nc){
foreach(w; x.rhs){
foreach(m,k; w.irrp_set.origins){
name = (to!string(i + 1) ~ w.irrp_set.alpha ~ to!string(m + 1)).dup;
names ~= name;
FIGR figr;
figr.lo = x.lhs.origins[0].dup;
figr.lr = x.lhs.rhs.dup;
figr.alpha = w.irrp_set.alpha;
figr.ro = k.dup;
figr.rr = w.irrp_set.rhs.dup;
LIGR ligr;
ligr.lo = figr.lo;
ligr.ro = figr.ro;
RIGR rigr;
rigr.lr = figr.lr;
rigr.rr = figr.rr;
unique_sort_add(ligrs,ligr,nova);
unique_sort_add(rigrs,rigr,nova);
// unique_sort_add(of,0,ligrs,ligr,nova);
// unique_sort_add(of,0,rigrs,rigr,nova);
figrs ~= figr;
// ligrs ~= ligr;
// rigrs ~= rigr;
of.write(name); of.write(" "); of.FIGR_write(figr);
}}}
of.write("\nThe set of LIGR's follows (no repetitions and sorted):\n");
foreach(i, ref x; ligrs) {of.write(i + 1," ");of.LIGR_write(x);of.write('\n');}
// of.write("\nThe set without repetitions and sorted\n");
// foreach(ref x; ligrs) unique_sort_add(uligrs,x,nova);
// foreach(ref x; uligrs) {of.LIGR_write(x);of.write('\n');}
// of.write("\nThe set of RIGR's follows:\n");
// foreach(x; rigrs) {of.RIGR_write(x);of.write('\n');}
// foreach(ref x; rigrs) unique_sort_add(urigrs,x,nova);
of.write("\nThe set of RIGR's follows (no repetitions and sorted):\n");
foreach(i, ref x; rigrs) {of.write(i + 1," ");of.RIGR_write(x);of.write('\n');}
of.write("\nNext follows the matrix representation of the IGR's using the above sets of\n ligr's and rigr's in the same order. Each row corresponds to an ligr, and each column to an rigr.\n");
//go through all the FIGR's to create the boolean matrix specifying which RIGR's go with each LIGR
//Write the column numbers in 3 parts 100's,10's units so that the RIGR's can be identified
of.write("The next 3 rows are the column numbers for the boolean matrix.\n");
of.write(" ");
foreach(i;0 .. rigrs.length){
// if(s == 10){s = 0;of.write(" ");}
// ++ s;
of.write((i+1)/100);}
of.write('\n');
// s = 0;
of.write(" ");
foreach(i; 0 .. rigrs.length){
// if(s == 10){s = 0;of.write(" ");}
// ++ s;
of.write(((i+1)/10)%10);}
of.write('\n');
of.write(" ");
foreach(i;0 .. rigrs.length){
// if(s == 10){s = 0;of.write(" ");}
// ++ s;
of.write((i+1)%10);}
of.write('\n');
bool[][] bigr = new bool[][0];//boolean IGR representation
bigr.length = ligrs.length;
foreach(ref x; figrs){
foreach(i,ref y; ligrs){
bigr[i].length = rigrs.length;
if(x.lo == y.lo && x.ro == y.ro) {
foreach(j, ref z; rigrs) if( x.lr == z.lr && x.rr == z.rr) bigr[i][j] = 1;}}}
foreach(i; 0 .. ligrs.length){
// s = 0;
of.writef("%3s",i + 1);of.write(" ");
foreach(j; 0 .. rigrs.length){
// if(s == 10){s = 0;of.write(" ");}
// ++ s;
if(bigr[i][j]) of.write("1"); else of.write("0");}of.write('\n');}
//write out figrs to check
// of.write("Order of figrs (flattened IGR's)\n");//different order
// foreach(i,ref x;figrs){
// of.write(i + 1);of.FIGR_write(x);}
//Create and write out fcbf ie cbf for "can be followed" for the flattened IGR's together with the borders i.e. a new numbering. This is now a square matrix.
/*
bool [][] fcbf = new bool[][0];
fcbf.length = figrs.length;
of.write("figrs.length = ",figrs.length);
foreach(i,ref x; fcbf){
x.length = figrs.length;
foreach(j,y; x){
fcbf[i][j] = can_follow_2(figrs[i],figrs[j]);}}
of.write("The boolean square matrix meaning 'could be followed by'\n");
of.write("\n\n ");
s = 0;
foreach(i; 0 .. fcbf.length){
if(s == 10){s = 0;of.write(" ");}
++ s;
of.write((i+1)/10);}
s = 0;
of.write("\n ");
foreach(i;0 .. fcbf.length){
if(s == 10){s = 0;of.write(" ");}
++ s;
of.write((i+1)%10);}
of: .write("\n");
int id;
foreach(i; 0 .. figrs.length){
formattedRead(name[i][0 .. 2],"%d",id);
if(id < 10)of.write(" ");of.write(name[i]," ");
s = 0;
foreach(j; 0 .. figrs.length){
if(s == 10){s = 0;of.write(" ");}
++ s;
if(!fcbf[i][j])of.write("0");else of.write("1");}
of.write("\n");}
string[][] sp;
foreach(i,ref x;figrs){
foreach(j,ref y;figrs){
if(can_follow_2(x,y)) sp ~= [name[i],name[j]];}}
writeln("sp = ",sp);
*/
string[][] sp;
of.write("\nThe boolean matrix 'can be followed by' expressed using > and & follows.");
of.write("\nfigrs with corresponding name omitted have zero members in the 'can be followed by' list\n");
foreach(i,ref x1; figrs){
count = 0;
//This loop is done twice to that the count can be obtained and written first
foreach(j,ref x2; figrs){
if(can_follow(x1,x2)) {++ count;sp ~= [names[i],names[j]];}}
if(count > 0){
of.write(names[i], " > ", count, " ");
foreach(j,ref x2; figrs){
if(can_follow(x1,x2)) of.write(names[j]," & ");}
of.write("\n");}}
of.write("\n");
//"This is part is from program Tarjan that implements Tarjan's algorithm for finding the");
//"maximal strongly connected components of the graph G");
//create the input and output file names, and File objects, and reading in the data sp.
//need to convert between boolean matrix and set of pairs of strings. The latter is more efficient if the matrix is sparse
//as seems to be likely in general.
// string[][] sp;//Each line corresponds to a set of strings, of which element 0 is the LHS of the string pairs and the remaining
//elements are the RHS's corresponding to the LHS.
// string[] buf;//the vertex names (with repetition) this is now "name"
//read in string pairs for the edges and populate vp.
string s1,s2;
string line;
/* for(;;){
infile.readf(" %s >",s1);buf ~= s1;//the LHS
writeln("s1 = ",s1);
if(infile.eof())break;
infile.readf(" %s",n);
for(int i=0;i