# «Abstract. Automated reasoning about systems with inﬁnite domains requires an extension of regular automata to inﬁnite alphabets. Existing ...»

The language of A, denoted L(A), is the set of words in Σ ∗ for which there exists a witnessing pattern in L(A).

** Example 1. Let A2 = Σ, A2 where A2 is the automaton appearing in Figure 1.**

Then, L(A2 ) is the language of all words in Σ ∗ in which some letter appears at least twice.

By deleting the x1 labels from the self loops in A2, we get the language of all words in which some letter appears exactly twice.

** Example 2. Let A3 = Σ, A3 where A3 is the NFA appearing in Figure 1.**

Then L(A3 ) is the language of all words in Σ ∗ in which the last letter is different from all the other letters.

Comparison with Other Formalisms In terms of expressive power, VFAs are incomparable with FMAs – the register automata of [10], but can be simulated by NFMA [9], which extend FMAs with nondeterministic updates of the registers. Intuitively, the variables of a VFA are analogous to registers, but while a register can change its content 6 O. Grumberg, O. Kupferman, and S. Sheinvald during the run, a bounded variable cannot change the value assigned to it. VFAs are also incomparable with data automata [2], yet a VFA with no constant letters can be simulated by a data automaton. Intuitively, the transducers of data automata can be used in order to check that the restrictions imposed by the pattern automaton apply.

In the full version, we elaborate more on the relation with the existing formalisms.

As detailed there, the examples showing the expressiveness superiority of the existing formalisms are tightly related to their complexity, and we do not ﬁnd them appealing in practice. For example, it is not surprising that a formalism for which the emptiness problem can be checked in NL (in Theorem 3 we show that emptiness of a VFA can be reduced to emptiness of its pattern automaton) cannot recognize the language of all words in which all letters are different. Data automata can recognize this language since their notion of acceptance involves several runs, on different subwords of the word. Of course, for some applications such an ability is important. VFAs, however, are sufﬁciently strong to specify many natural properties, and for many applications, we rather give up the expressiveness superiority of the other formalisms for a simple and computationally easy formalism.

**3 Properties of VFAs**

This section studies closure properties of VFAs and the decidability and complexity of basic problems. We show that VFAs are closed under union and intersection, but are not closed under complementation. In the computability front, we show that while the emptiness problem for VFAs is not harder than the one for NFAs, the membership problem is harder, yet decidable, whereas the universality and containment problems are undecidable.

** Theorem 1. VFAs are closed under union and intersection.**

Consider two VFAs A1 = Σ, A1 and A2 = Σ, A2 with A1 = Σ1 ∪ X1 ∪ {y1 }, Q1, Q1, δ1, F1 and A2 = Σ2 ∪ X2 ∪ {y2 }, Q2, Q2, δ2, F2.

We start with the union construction. The standard construction for NFAs, which guesses whether to follow A1 or A2, does not work for VFAs. To see why, note that the range of the variables in a standard union construction would be Σ \ (Σ1 ∪ Σ2 ).

Accordingly, words in L(A1 ) in which variables are assigned values in Σ2 may be missed, and dually for L(A2 ). We solve this problem by deﬁning the union of A1 and A2 as a union of several copies of the underlying VFAs. In each copy, a subset of the variables is taken care of, and transitions labeled by variables from the set are labeled by constants of the other VFA.

We proceed to an intersection construction. Recall that in the product construction for NFAs A1 and A2, the state space is Q1 × Q2, and q1, q2 ∈ δ( q1, q2, a) iff q1 ∈ δ1 (q1, a) and q2 ∈ δ2 (q2, a). Since A1 and A2 are pattern automata of VFAs, the letter a may be a variable. Accordingly, there are cases in which it should be possible to intersect two differently labeled transitions: intersecting two transitions with different bounded variables, meaning they get the same assignment in A1 and in A2 ; intersecting a variable with a letter σ, meaning the variable is assigned σ; and intersecting the free variable y with a bounded variable x or with a letter σ, meaning the assignment to y in Variable Automata over Inﬁnite Alphabets 7 this transition agrees with the assignment of x or with σ. Accordingly, we would like to deﬁne δ such that for z ∈ Σ1 ∪ Σ2 ∪ X ∪ {y}, we have that q1, q2 ∈ δ( q1, q2, z) iff there exist z1 ∈ Σ1 ∪ X1 ∪ {y1 } and z2 ∈ Σ2 ∪ X2 ∪ {y2 } such that q1 ∈ δ1 (q1, z1 ) and q2 ∈ δ2 (q2, z2 ) and such that z1 and z2 can be matched according to the cases described above. Formally, we do this by taking several copies of the product construction of the pattern automata, each associated with a relation H that matches the variables and constant letters of A1 with the variables and constant letters of A2.

** Theorem 2. VFAs are not closed under complementation.**

Proof: Consider the VFA A2 of Example 1. Recall that L(A2 ) contains exactly all ˜ words in Σ ∗ in which some letter appears at least twice. The complement L of L(A2 ) then contains exactly all words all of whose letters are different. It can be shown that ˜ a VFA that recognizes L needs an unbounded number of variables, and therefore does not exist.

We now turn to study the decidability and complexity of the emptiness, membership and universality problems for VFAs. Checking nonemptiness of existing formalisms is complex and even undecidable. The fact that a bounded variable keeps its value along the run makes the nonemptiness checking of VFAs very simple. In fact, a VFA is nonempty iff its pattern automaton is nonempty. Beyond the straightforward algorithm this induces, it shows that the VFA formalism is indeed very close to the simple formalism of NFAs.

** Theorem 3. The nonemptiness problem for VFA is NL-complete.**

** Theorem 4. The membership problem for VFA is NP-complete.**

The algorithms for the universality and containment problems for the ﬁnite-alphabet case rely on the closure of NFAs under complementation, which does not hold for VFAs. Similarly to [12], for register automata, the undecidability of the universality problem for VFA is proved by a reduction from Post’s Correspondence Problem. Since we can easily deﬁne a universal VFA, undecidability of the containment problem follows too.

** Theorem 5. The universality and containment problems for VFAs are undecidable.**

Fig. 2. A nondeterministic VFA whose pattern automaton is deterministic, and a DVFA that accepts all words in which the ﬁrst letter is repeated at least twice Deﬁnition 1. A VFA A = Σ, A is deterministic (DVFA, for short), if for every word w ∈ Σ ∗, there exists exactly one run of A on w.

** Example 3. Consider the VFA D = Σ, D, where D is the DFA appearing in Figure 2.**

The language of D is the set of all words over Σ in which the ﬁrst letter is repeated at least twice. To see that it is deterministic, consider a word w = w1 w2... wn in Σ ∗.

A witnessing pattern for w is over x1 and y. Since only x1 exits the initial state, then x1 must be assigned w1, and all other occurrences of other letters must be assigned to y. Therefore, every word that has a witnessing pattern has a single witnessing pattern.

Since D is deterministic, every witnessing pattern has a single run in D. It is easy to see that every word in Σ ∗ can be read along D. It follows that D is deterministic.

Although for VFA, unlike NFA, the deﬁnition of determinization is semantic, an equivalent syntactic deﬁnition does exist, as we show below.

** Theorem 6. Deciding whether VFA is deterministic is NL-complete.**

Proof: We start with the upper bound. Consider a VFA A = Σ, A with variables X ∪ {y} and an initial state qin. We claim that A is not deterministic iff one of the following holds.

– A is nondeterministic, or – there exists a reachable state s such that there exist two bounded variables x and x that exit s, and a path from qin that reaches s and does not contain x and x, or – there exists a bounded variable x such that both x and y exit s, and a path from qin that reaches s but does not contain x, or – there exists a reachable state s such that there exists a constant letter that does not exit s, or a variable that appears along a path from qin to s that does not exit s.

Intuitively, the ﬁrst three conditions check that each word w ∈ Σ ∗ has at most one run in A. Then, the last condition checks that w has at least one run. In order to implement the above check in NL, we guess the condition that is violated, and check that it is indeed violated. Since NL is closed under complementation, we are done. The lower bound can be shown by a reduction from the reachability problem.

Note that Theorem 6 refers to the problem of deciding whether a given VFA is deterministic and not whether it has an equivalent DVFA. As we show in the sequel, the latter problem is much harder.

We now turn to study the closure properties of DVFAs. Note that closure under union and intersection does not follow from Theorem 1, as here we want to end up with a DVFA and not with a VFA. In order to study the closure properties, we introduce Variable Automata over Inﬁnite Alphabets 9 an unwinding operator for VFAs. Given a VFA over Σ with a pattern automaton A = ΣA ∪ X ∪ {y}, Q, Q0, δ, F, the unwinding of A is the VFA U = Σ, U, with U = ΣA ∪ X ∪ {y}, Q × 2X, Q0, ∅, ρ, F × 2X, where ρ is deﬁned, for every q, θ ∈ Q × 2X and z ∈ ΣA ∪ X ∪ {y} as follows.

Intuitively, the states in U keep track of the set of bounded variables that have been assigned along the paths from the initial state. A run of A corresponds to a run of U in which every state is augmented with the set of bounded variables that have appeared earlier in the run. Also, a run of U corresponds to a run of A along which the assignments have been accumulated. Therefore, we have that a VFA is equivalent to its unwinding.

We start with union and intersection. The constructions have the construction for DFAs in their basis, applied to the unwinding of the DVFA.

** Theorem 7. DVFA are closed under union and intersection.**

Proof: The constructions for union and intersection both rely on the unwinding of the DVFAs. Since there is a one-to-one correspondence between runs of a VFA and its unwinding, a VFA is deterministic iff its unwinding is deterministic. Let U1 and U2 be the unwindings of two DVFAs with pattern automata U1 and U2, respectively.

Consider two states q1 and q2 in U1 and U2, respectively. The intersection construction is based on the product construction of U1 and U2. Each state in the unwinding introduces at most one new bounded variable (Theorem 6). If new bounded variables x1 and x2 exit q1 and q2 respectively, the construction matches x1 and x2 together to form a new bounded variable. Similarly for a y1 transition and a new bounded variable x2. Transitions labeled by y1 and y2 are matched together to form a y transition. The states of the intersection construction keep track of the matchings of bounded variables.

The union of U1 and U2 is constructed on top of the intersection construction. Intuitively, a run on the union construction continues along both DVFAs as long as possible.

Once it cannot continue along U1 (w.l.o.g.), it continues along a copy of U2. As in the proof of Theorem 1, several such copies are taken, in which constants of U1 are assigned to variables of U2.

The fact that a DVFA has exactly one run on each input word makes its complementation easy: one only has to complement the pattern automaton. Formally, we have the following.

Theorem 8. Given a DVFA A = Σ, A with a set of states Q and a set of accepting ˜ states F, let A be the pattern automaton obtained from A by deﬁning its set of accepting

## ˜ ˜ ˜

states to be Q \ F, and let A = Σ, A. Then, L(A) = Σ ∗ \ L(A).We now study the computability of the DVFA model. We ﬁrst study the problems of nonemptiness and membership. As argued in the proof of Theorem 3, a VFA is empty iff its pattern automaton is empty. Since the nonemptiness problem in NL-complete also for DFAs, the NL-complete complexity there applies also for DVFAs. For the membership problem, determinism makes the problem easier.

10 O. Grumberg, O. Kupferman, and S. Sheinvald

**Theorem 9. The membership problem for DVFA is in PTIME.**

We note that the question of whether the membership problem is PTIME-hard, or in NL is still open, and we suspect that it is very difﬁcult, as it has the same ﬂavor of the long-standing open problem of the complexity of one-path LTL model checking [11].

We now turn to study the universality and containment problems and show that they are decidable.

**Theorem 10. The universality problem for DVFA is NL-complete.**

This result follows from the NL-completeness of the emptiness problem, and from the fact that complementation only involves a dualization of the acceptance condition. Since DVFA are closed under complementation and instersection, the containment problem is also decidable. In fact, we have the following.

Theorem 11. The containment problem for DVFA is in co-NP.

**4.1 Determinization**

In this section we show that not all VFAs have an equivalent DVFA, and the problem of determinizing a given VFA (or concluding that no equivalent DVFA exists) is undecidable. As good news, we point to a fragment of VFAs that can always be determinized.

One evidence that not all VFAs have an equivalent DVFA is the fact that while DVFA are closed under complementation, VFA are not. As a speciﬁc example, which also demonstrates the weakness of DVFA, consider the VFA A2 of Example 1. In the proof of Theorem 2, we showed that there is no VFA for the complement of A2. Since DVFAs are closed under complementation, it follows that there is also no DVFA equivalent to A2.

Theorem 12. The problem of determinizing a given VFA (or concluding that no equivalent DVFA exists) is undecidable.