Learning Multiple Languages in Groups

We consider a variant of Gold’s learning paradigm where a learner receives as input n different languages (in the form of one text where all input languages are interleaved). Our goal is to explore the situation when a more “coarse” classification of input languages is possible, whereas a more refined classification is not. More specifically, we answer the following question: under which conditions can a learner, being fed n different languages, produce m grammars covering all input languages, but not produce k grammars covering the input languages for any k > m? We also consider a variant of this task, where each of the output grammars may not cover more than r input languages. Our main results indicate that the major factor affecting classification capabilities is the difference between the number of input languages and the number of output grammars. We also explore the relationship between classification capabilities for smaller and larger groups of input languages. For the variant of our model with the upper bound on the number of languages allowed to be represented by one output grammar, for classes consisting of disjoint languages, we found a complete picture of the relationship between classification capabilities for different parameters n (the number of input languages), m (the number of output grammars), and r (the bound on the number of languages represented by each output grammar). This picture includes a combinatorial characterization of classification capabilities for parameters of certain types.


Originally published:

Jain, Sanjay, and Efim Kinber. "Learning Multiple Languages in Groups." Theoretical Computer Science 387.1 (2007): 67-76.