Language Learning from Texts: Degrees of Intrinsic Complexity and Their Characterizations
This paper deals with two problems: (1) what makes languages learnable in the limit by natural strategies of varying hardness, and (2) what makes classes of languages the hardest ones to learn. To quantify hardness of learning, we use intrinsic complexity based on reductions between learning problems. Two types of reductions are considered: weak reductions, which map texts (representations of languages) to texts, and strong reductions, which map languages to languages. For both types of reductions, we obtain characterizations of complete (hardest) classes in terms of their algorithmic and topological potentials. To characterize the strong complete degree, we discovered a new and natural complete class capable of “coding” any learning problem using the density of the set of rational numbers. We have also discovered and characterized rich hierarchies of degrees of complexity based on “core” natural learning problems. The classes in these hierarchies contain “multidimensional” languages, where the information learned from one dimension aids in learning other dimensions. In one formalization of this idea, the grammars learned from dimensions 1, 2, …, k specify the “subspace” for dimension k+1, while the learning strategy for every dimension is predefined. In our other formalization, a “pattern” learned from dimension k specifies the learning strategy for dimension k+1. A number of open problems are discussed.
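To make "learning in the limit from a text" concrete, the following is a minimal sketch and is not taken from the paper. It assumes a simple illustrative class of languages L_n = {0, ..., n-1} (initial segments of the natural numbers), models a text as a sequence of elements with `None` standing for a pause, and uses the hypothetical name `init_learner`. The learner succeeds on a text if its conjectures eventually stabilize on the correct n.

```python
from typing import Iterable, Iterator, Optional


def init_learner(text: Iterable[Optional[int]]) -> Iterator[int]:
    """Yield a conjecture n (standing for L_n = {0, ..., n-1}) after each text item.

    Illustrative assumption: the target language is an initial segment of
    the natural numbers, so the largest element seen so far determines
    the smallest consistent conjecture.
    """
    conjecture = 0  # L_0 is the empty language
    for item in text:
        if item is not None:  # None models a pause in the text
            conjecture = max(conjecture, item + 1)
        yield conjecture


# A finite prefix of a text for L_5 = {0, 1, 2, 3, 4}.
text_prefix = [2, 0, None, 4, 1, 3, 4, 0]
conjectures = list(init_learner(text_prefix))
print(conjectures)  # [3, 3, 3, 5, 5, 5, 5, 5]
```

On this prefix the learner changes its mind once (from 3 to 5) and then keeps the correct conjecture. On any text for L_5 it can never exceed 5, so the conjecture is stable from the first appearance of 4 onward.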
Jain, S., E. Kinber, and R. Wiehagen. "Language Learning from Texts: Degrees of Intrinsic Complexity and Their Characterizations." Journal of Computer and System Sciences 63.3 (2001): 305-354.