(updated - more formal, removed some parts and improved general clarity)

Over years, I come again and again back to Halting Problem and it's unsolvability ..now I have created a system, which is theoretically able to find halting problems in far larger class of systems than just FSM's.

It's based on what I call topological identicalness of abstractions and abstraction of programs - it could, basically, prove somewhat different programs identical if they follow the same logic. It is not meant to be runnable on normal computer - it takes a lot of memory. It's, right now, a pure theory. When it's ready as theory, it could be optimized into something practical - but that would be rather special cases.

The main things here are:
  • To define what it is for program to hang.
  • To find a general prover, which could check any program against this definition.

There are 4 class of programs:
  1. Finish their job - they can be run and nothing more is needed to eventually know that.
  2. Go into infinite repeating cycle - those are FSM's and could be detected by normal static analysis.
  3. Go into growing cycle - those are infinite state machines and this text targets those.
  4. Fall into halting paradox - those are possible in case something claims to be a solver; this here tries to address that, but I'm not yet perfectly sure in that.

To get 3rd class of programs detected, there are two things needed:
  1. A method to prove in each case formally that they do not halt.
  2. A method to make those formal cases enumerable so that they could all be reached.
If both are addressed, we would have general solver for halting problem.

Topological identicalness of computer programs

We can order the possible states of any program into finite or infinite (on Turing machine) set. With combinatorics, we can create supersets, which contain all possible sets of these states so that no two sets contain the same element.

This means:

The following is about mathematical set symbols I am going to use. I repeat more important definitions in slightly different wordings so that one can check them against each other for understanding better or chose most convenient one; I know it could be fuzzy otherwise.

Let's call the set of all program states S(p), where p is a program or abstraction. S(p) contains all possible states of this program as a state machine; for infinitely growing programs on Turing machine it contains all distinct (different) states on that program's entire lifetime. Thus it could contain infinite number of states.

Let's create a function U(S(p)), where U contains all state groups of S(p). I call any member of U(S(p) G(S(p)) or simply G, when talking abstractly about common properties of such sets. G(S(p)) is a set of sets of states of S(p), which I call state groups. Each state group, member of G, is set of possible states of state machine p. G does not contain empty set nor two groups, which have a shared member. It might contain one state group containing all elements of S(p).

State groups of S(p) are sets of possible states of S(p). G(S(p)) contains a set of distinct state groups having no overlapping members. U(S(p)) contains all such G(S(p))'s we can form. G is a shortcut for any such set of distinct state groups.

For example, if p has three possible states: A, B, C, then G(S(p)) is any of the following: {(A), (B), (C)}, {(A, B, C)}, {(A, B), (C)}, {(A), (B, C)}, {(A, C), (B)}. In such case, U(S(p)) = { G; G = any of G(S(p)) }, which means that U(S(p)) is set of all G's for p.

I call Z(p) a set of all possible sets of states in S(p).

This repeating was meant to allow to check that it's clear as it's needed for what follows. When I speak about G, I mean any allowed set of state groups, any member of U. One of G-s will also contain state groups with one element in each (unless in some cases of infinite state machine) - this G contains each element of S as state group.

So, terms currently introduced:
  • U is an universe of all possible state groups; U(S(p)) is clearly defined, but actually U in general will contain more of them.
  • G is a set of all possible state groups for a program, which are distinct to each other. In actuality, I will use G for any states into which this state machine can perform a transition into. I will add elements to G in progress of this text. G(S(p)) is mostly needed to explain the process of abstraction.
  • S is a set of all allowed states of a state machine or abstraction. S(p) is a set of all possible states of a program - this is a very straightforward set, which contains, for finite state machines, only the actual states of that machine; S(a) of an abstraction might also contain combined states of that abstraction, but inside an abstraction, they are always straightforward.
  • Z is a set of all possible combinations of allowed states. Z(p) does contain all elements contained in any G(S(p)).

Now, for each state in set S there is a state, which will follow it when program is running. I will call this X=>Y, when this notation does not clearly be an implication, where X and Y are members of S and X will directly cause Y when program is running. If X=>Y, then state X at time moment t means that there is state Y at cycle t+1.

I also speak about transformation rules of members of Z, state groups. I say that state group X directly causes state group Y if X=>Y such that X and Y belong to Z and for every member x of X there is x=>y in effect such that y is some member of Y. This means that if X=>Y, each state in group X must cause some state in group Y for next cycle.

Let's call G an abstraction of a state machine - it is one aspect of it.

I bring an example:

There is a state machine such that:
States: A, B, C belong to S(m1).
Rules: A=>C, B=>C, C=>A.

Then there is another state machine such that:
States: D, E belong to S(m2).
Rules: D=>E, E=>D.

Now, if we make such abstraction of first state machine:
State groups: (A, B), (C) belong to Z(m1) and form some G(S(m1)).
Then we get the following rules:
Rules: (A, B)=>(C), (C)=>(A, B)
C does not produce B, but it produces a state group of (A, B) as it produces any member of it.

So, we can say that there is such abstraction of first state machine, which is topologically identical to second state machine, because both contain two state groups with identical rules between them. We could group abstractions formed by using some G(S(x)) first-order abstractions. In actual reality we would often use Z(y) in place of G(S(x)), where y is some abstract machine - this way, we can draw all possible implications about a state machine's activity on given cycle.

Now, we have a formal rules to do abstractions of state machines. As there are abstractions, there also come the communication channels - we can abstract a state machine into two state machines, which communicate with each other.

We can, for example, have two operations in iteration adding numbers one by one and when they reach 3, they will start from zero again; this iteration is part of a function, which can be called with different starting points. This state machine has 9 states in total. Abstractions of those states will divide this state machine effectively into two state machines, which do not interact with each other - one abstraction would divide them into 3 groups, which contain members each of which imply another. This means - U(S(f)) of that state machine f contains two abstractions, which both count numbers and are topologically identical. There are two such G's for which that applies. This also means that when looking to Z(f), we can find two sets of state groups for which x=>y=>z=>x.

Z(p), where p is infinitely growing machine - infinite state machine - will contain infinite number of state groups, which all are countable. For each cycle of program's work, which does not repeat some historical situation, we can add new state x to S(p) and all sets containing x and zero or more other members of S(p) into Z(p). Thus, Z(p) is enumerable for any state machine.

By abstracting infinite state machines, we will get finite state machines, which we can fully analyze.

State series

Topological identicalness discussed for now could be called "strict identicalness" of abstractions.

We can now form another set of states, which will contain several states following each other.

For example, if A=>B and B=>C, then we could form an arrangement ABC.

Now, if we have such state machine:
Rules: A=>B, B=>C, C=>D, D=>A

Then we see it also implies such state machine:
Rules: ABC=>D, D=>A

By allowing such arrangements in our state sets, we will get a new layer of abstractions. Elements of such sets must have their length as second element.

I will call this set T(Z(p)) or T - by this I mean that we have added time. All possible arrangements we can form from Z(p) are combined in T(Z(p)). G(T) would be shortcut for G containing sets of arrangements of Z(p). Elements of such G do not have to be of same length, thus we also need len(x), where x is a member of T, S or Z and contains the number of cycles such state will remain unchanged. When we have built an abstract machine of such arrangements and arrange them again, length of such arrangement of arrangements is sum of their lengths.

Now, the last and most important of them all - R. R is the combination of all groups of all elements of S and all arrangements we can form from them. R, for any finite S, is finite and for any enumerable S, enumerable. G(R(p)) or - in context - G(R), contains all possible abstractions we can make from a program (abstractions are subsets of possible G-s). R itself is a total sum of abstractions - implication rules of elements of R cover all implication rules of a program as it's elements cover all elements of a program.

State machine abstractions combined

When we abstract a state machine, we will get several state machines, which might interact with each other in several ways.

This means - for any abstraction G of R, there is one such G of R in combination with which the whole program logic could be fully reproduced. I call this a context of G or ~G. Sometimes, this does not matter to G, what happens with ~G - whatever possible states ~G happens to go into, G continues it's determined execution path. In such case G is independent. Context might also be independent from G. G might be independent as long as it stays in some specific state group. Until now, we have looked abstractions - G's - in such way that they have nothing to do with the rest of R. Anyway, R might contain a lot of implication rules (as they are there with states) and additional state arrangement groups, which directly affect some G they directly don't belong to - they can be used to draw conclusions about G. Those implication rules form some smallest G, which is independent itself (anyway, it could be a program itself) - this I call a neighborhood of G.

We can work out rules about the ways they interact with each other. Most interesting interaction rules are restrictions to interactions, which can only be found by looking several machines in combination. For example, one abstract machine G can be proven to not intervene into memory requests of another or to not send any signal before it has got a specific signal. As those interactions involve some time - one machine can, for example, wait for signals only each second tick and leave unnoticed any signals at other times or do different things with signals dependent on timing.

For analyzing any abstract machine, we must make the set of rules of interactions. For interesting abstractions, we usually find such state group that as long as our abstract machine does not leave it, it will also not be interrupted. If it will get signals so systematically that they could be as well included in it's own state groups, we might have chosen a wrong abstraction.

This is normal that in a program run, abstract machines will be created and destroyed. We can find rules covering some specific machines, which is often created, to specify the logic of our whole system.

Indefinite growth of computer programs as topologically equivalent subsequent state machines

We can now formulate:

In case some G_x produces such G_y that G_x==G_y - they are topologically identical -, we can call this a child of that G_x. In case this is a direct result of G_x - any G_x produces such G_y that it's identical to G_x we will call this cycle - in this case, infinite cycle. As we develop the internal rules of state machines, we also look them as a whole - we do implications about their relations. For a child of G_x we will have some slots connecting it to it's parent. We can specify those slots as we do more implications, but basically - getting a child will mean abstract relations, pointers. Child of G_x might have direct connections with other abstraction systems with which G_x has connections; it might have connections with children of those abstractions, G's, connected with G_x and it might have connections with G_x itself - which is a very common case. In latter case, it might have some connection to G_x in such way that it's children would also have connection with G_x, but it might have such that for it's children, the same implication rules produce connection with itself. There is finite number of classes of such connections - those can be expressed as paths in network of relatives. Some paths are absolute (as the topmost G is connected to all G's) and some paths are relative - they are implied from rules of abstraction itself about itself. You see that each context has a local R.

Now we see, what should be happening inside a solver:
  • It grows a network of logic of each G.
  • It finds all topologically identical G's.
  • It understands, which G's are guaranteed to produce which G's and which will be their connections - parent could connect a child with it's own relatives and child could produce it's own.
  • Some parts of it's activities are based on studying G's - abstractions and their direct implications - and others are based on studying the whole - drawing logical connections between abstractions based on what is known. When studying abstractions, it will look their growth in time and convert that into implication rules - it will also know, which rules are necessary and enough to produce which rules; there might be several sets of such relative axioms for any rule. When studying the whole, it will make deductions based on rules it has found, thus removing many states from different G's - showing that based on what is known, they will never be reached; it will also find any kind of cycles.

About cycles, there is such knowledge, which we can find:

  • If some abstraction has a possibility to produce a child (by the way, the parent-child word use is just for convenience, most probably the same child is looked as being child of several abstractions at once etc., the logic between abstractions will largely overlap), then which states will produce it - partially, this is a local analysis of an abstraction.
  • If some abstraction is in such relation to others, then local analysis of abstractions will be connected - if one abstraction has rule that it will not send signal x to another in case that other will not signal y and vice versa, another has the same rule, then the possibility to get signal x will be removed from both, which could simplify the world; same kind of global analysis are needed to make the cycles exact - if child is told to not go into state x unless parent goes into state x, state x will be removed.
  • There could be any kind of complex state logic. It can be analyzed that some state will be definitely reached or definitely reached in case some other state of connected abstraction will be reached; etc..

An abstract machine will born whenever some abstraction can be applied to some local R. It is destroyed when it can not anymore. It has signals and timing of destroyment. Each abstraction is finite state machine and can be fully analyzed in relation to it's connections to outer world - it's basically a random bunch of states applicable to R at any given moment. It's highly probable that most abstractions live for only one cycle at any given birth. Anyway, we will analyze all abstractions thoroughly, then analyze fully the results of this analysis, apply the results to currently active state system and then find new abstractions to analyze. New states come from the fact that our program grows in process of analysis - most notably, in 1D Cellular Automata context, we will analyze some number of cells with given length; as a result of this analysis process, we will produce some transitions, which will change the state of neighbor cell - ask for some new memory. Through this we will have a bunch of new state groups to analyze - we have added a number of elements to global R and now start analyzing the next iteration of it.

An actual logic of hanging is that some abstraction will produce independent child identical to it. Independent means that this child does not go into any state, which would result in state of parent program, which would destroy it. By analyzing an abstraction, we will find the behavior of abstraction; by analyzing the whole, we will find parent-child relations and also parent's relations to it's context. We will also make notice about which implication rulesets are needed to produce such child and if they are guaranteed to be met in child just because they are met in parent - if this is so and the program logic is waiting behind a child, we have detected a normal hanging condition.

What about halting problem unsolvability proof of Turing?

This needs proof that it detects halting problem; it also needs counterexamples about programs, which would be mistakenly detected or a proof that there are none.

Turing's proof has, as one of it's assumptions, an idea that a program, which does halt, could not be considered as not halting. If we could show, formally, that ideed, there is a way to define programs, which are all wrong even if they halt - then it's way to go. This means - we will not disprove the proof of Turing by that, but we might find an exact definition about when this happens. I think that my formulation applies and is detectable with methods described here.

If we can prove that parent produces topologically identical child because of reasons, which are there for a child to produce the next iteration; and we can prove that parent waits for child and it's actions depend on child's decision - I think that no sane program will do that. And we have the logic covered to reach such deduction.

Which Turing-complete machine can be fully described in such way?

My favorite is 1D Cellular Automata. It has been shown that one can fully model Turing machine on 1D CA - such Cellular Automata is Turing-complete.

Cellular Automata is very simple thing. 1D CA in principially infinite array of cells. Each cell has a position, which can addressed as integer on that CA (I think we can make it bounded at beginning so that it would be a natural number as non-bounded 1D CA can also modeled as topologically equivalent bounded-at-beginning CA). Each cell is finite state machine, which is connected to it's local neighbors - cells directly before and after it. For any program with finite code, it would be started with some finite number of cells having non-zero value. Programs can take more space in their activities.

We can make all possible abstractions from first condition and then start running them cycle-by-cycle. In process, new cells might go into non-zero value - thus, an initial automata is connected with one "infinity cell", which has properties of infinity. It can change it's state and this activity can be easily described as simple state change - basically, first cell of infinity is included in this first abstraction set with it's state group.

How to represent those abstractions and their relations?

As careful reading has shown, we do implications about abstractions using data about their context. This means that each abstraction, in reality, will produce restricted abstractions - such abstractions, which already contain some relevant logic of their context. Context can be infinitely growing, thus is often leaves some things open. This is critical to have good model of how this context is expressed.

For an abstract machine, as it's groups combine several groups, it is known, which groups have transitions into which groups - for example, A=>B is clear rule, but sometimes there is no such direct group that if A=>B and C=>D and groups are (A, C), (B), (D), then (A, C) has some clear transition rule into some other group. If we look at Z of this abstract machine, it also contains (B, D) - so we could say that (A, C)=>(B, D). For keeping our abstract machines simple we say that (A, C) has implication rule into two groups - for that, we need potential implication rule. Let's name it ≃> so that (A, C)≃(B) and (A, C)≃(D) and thus, (A, C)!≃(E), for example. A group can have a rule (A, C)≃(A, C) in case one of them have transition rule into member of the same group. We also need a "destroyed" state here, which is the situation after which the world of that state machine goes into some state, which does not belong into it's allowed states. Right now, anyway, we have defined our G in such way that this is not visible, how that could happen - any G has group of any allowed state of system. Anyway, things get more complex when we are allowed to detect cycles - child of a cycle carries some set of implication rules together from parent, but not all of them. Thus, if we see that child has an abstraction, which is topologically identical to abstraction of parent, we create an abstract machine, which only contains those implication rules, which are carried through indefinitely. Such abstract machine has a "destroyed" state group. Let's call this machine M as in meta. Such machines will be found by global analysis of contexts and in their context, new R will be formed - new world, which allows another set of abstractions. States of M are often partially undetermined, but even as such, all known transitions can be fully analyzed. Here we have defined cycles (C was not available) and partial abstractions; cycle is partial abstraction and partial abstractions can have exit state (this state can be reduced out by global analysis).

As global analysis reduces out some states, which are not going to happen - for example, it becomes clear that some cycle will exit only if it gets signal from parent, but it never sends signal to child nor does context send that signal, thus an exit state will be deleted. This produces also reduced abstract machines - each reduced abstract machine is normal abstract machine, each reduced partial abstract machine is normal partial abstract machine.

A notion that some event will produce a specific group of R and this group always produces itself if formal way to say that some states imply that a machine will definitely reach some state group and stay in it infinitely. As there is "destroy" state of abstraction, there are also numerous groups containing it. There is also a group, which contains all states except "destroy" state - if this group is immortal, then that machine will never be destroyed. This shows that static analyzer can have a specific formal knowledge of restricted abstract machines, which leads, for example, to knowledge of hanging.

Reduction of partial abstract machine is achieved in two ways:
  • By removing some states or transition possibilities from it - for each finite machine, there is a finite set of such partial machines, because there is finite number of states and rules you can remove. By removing one of two ≃> rules, you will get one => rule as it becomes exact. Thus, such reduction is a good thing - it might make abstract machines behavior more determined. It allows to detect new cycles - for example that some machine will stay in state group x of it's R whenever it reaches some state of this group. By that it would also produce reduced version of itself on place of itself, which also means a reduction into dependent parent etc. This is an example about how state machines could be born - in this case, it would be born at the same abstract space where it's parent is located, without destroying it's parent.
  • By adding some state transition rules. We can abstract the potential reactions of context as specific state machine connected to our own. For example, if some state transition of our state machine is related to parent in such way that if it sends signal A to parent (goes into any state of group, which we can call signal A), then parent will send signal B to it (change some shared part of state in such way that it will go into state, which is part of group B). Abstracting parent in such way we could get an abstract machine, which will have four states - waiting for signal (C), getting signal (A), doing inner work (D), sending signal (B). Doing inner work is a state arrangement - it could contain, for example, the state group of all elements in S in arrangement of so many cycles as many cycles it takes for parent to finish it's job. It could hypothetically also contain a cycle M - as cycle has undefined length, it could then take more or less time. For arrangement, things are straightforward - if arrangement length is fixed, then we could just make a deduction of our state machine (I basically use that word to differentiate from reduction as this differentiation is very important). This means - as both abstract machines are FSMs (we always locally work with FSMs), we can do things similar to this: if one state machine of two has 4-cycle length and another has 5-cycle length and they are related, we can make up 20-cycle length (LCM) machine and have combined states in it. This would not change our current state groups if we want to keep them (for example to check for topological overlapping of state logic with other machines, which start up from the same machine), but it would produce four specific cases for each state and a new set of implication rules. Anyway, if cycle length is indefinite, it would produce infinite number of states. Luckily, we can use a different practice with cycles - instead of telling that there is connection with cycle, we will tell a specific instance of cycle. In case our connected neighbor is first element of that cycle and we are first element of our cycle and the transition rules of our child are related to transition rules to it's child, we can combine them. In such combination we can add up different arrangements - for example, if the arrangements we have to wait are G and H for parent cycle in form of GH, but GHH for it's child and GHHH for it's child, we need to have state group J, which contains GH for first element of that cycle and JH for second element. In such way we can always keep the relation correct. Rules of abstract child machine can always only build things from states and arrangements of their parents - this means summing them together or using directly. In case there is some more complex logic, we simply wait until this logic has been analyzed enough to draw important conclusions - we would not allow our checker to try to form cycles with more complex mathematical structure, because all patterns (and we are seeking for patterns, remember) can be implied from them. Anyway, the structure of pointers is important - we always need to keep account of all groups of implication rules, which produce something; for non-cycle, those are the basic rules; for cycle, there are cycle-local rules, which are the rules, which children give to their children precisely because they are given to themselves in the same form. Those form a local ruleset of a child. When making abstract relations to cycles, we will keep the data about those connection structures and have a way to create pointers to how far in this structure we need - in case we are connected to M_x of some cycle because of the rules we inherit to our children unimpact, they will be connected to things in their relation to M_x - this can be that our child is connected to M_y = M_x+1 and thus it's child is connected to M_y+1; but it can be that it is connected to M_y = M_x and it's child is connected to M_y because of that. And if M_x is connected to M_x-1 and this connection has any relevance into how it interacts with us, then there must be a possibility to do global analysis in relation of our child and M_x-1. Only in case neither side is dependent of other is there no need to allow such connection or do some global analysis of them in pair.