Publications of Francesco Carravetta

This page shows all publications that appeared in the IASI annual research reports. Authors currently affiliated with the Institute are always listed with the full name.



IASI Research Report n. 19-08


Francesco Carravetta, White L.B.

MODELLING NATURAL LANGUAGE SENTENCES THROUGH STOCHASTIC SYNTACTIC PROCESSES

ABSTRACT
We define a Syntactic Stochastic Process (SSP) as a process taking values in the set of terminal symbols of a grammar, whose realizations are terminal strings generated by some stochastic grammar. We first consider the case of a Stochastic Context-Free Grammar (SCFG) generating the process and show that any SSP generated by an SCFG (a context-free process) can be consistently indexed by a subset of the nodes of a suitably defined Graphical Random Field (GRF). We show that, in a context-free framework, if the underlying stochastic process is defined simply by assigning to every string the probability of the set of all parse trees that yield that string, then the inference problem (optimal smoothing given noisy observations) for a string of length T generated by a stochastic grammar has complexity exponential in T and polynomial in the number, N, of symbols constituting the string. In view of applications in speech processing, the issue is further complicated by the fact that this complexity already arises for SCFGs, which have the second-lowest generative power in the Chomsky hierarchy and are not adequate for natural language. In the second part of the paper we consider grammars more general than context-free ones and propose a definition of Stochastic Context-Sensitive Grammar (SCSG), also showing that the stochastic process generated by a subclass of SCSGs is well posed and admits a representation as a GRF. We show that certain mildly context-dependent grammars (CDG), known in the recent literature as Tree Adjoining Grammars (TAG), which are able to capture a large and relevant part of the features of natural language, can be given a stochastic version, namely stochastic TAG (STAG), and that strings generated by a STAG are reciprocal processes (RPs). This allows the inference problem to be solved still in polynomial time with respect to N, while reducing the complexity to linear with respect to T.
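
As an illustration of the definition used in the abstract, where a string is assigned the probability of the set of all parse trees that yield it, the following minimal Python sketch enumerates the parse trees of a toy SCFG in Chomsky normal form and sums their probabilities. The grammar, the rule probabilities, and the names `BINARY`, `LEXICAL`, `tree_probs` and `string_probability` are hypothetical and do not come from the report; the sketch is not the authors' inference procedure.

```python
# Hypothetical toy SCFG in Chomsky normal form (an assumption, not from the report):
#   S -> S S  with probability 0.4
#   S -> 'a'  with probability 0.6
BINARY = {"S": [(("S", "S"), 0.4)]}   # nonterminal -> [((left, right), prob)]
LEXICAL = {"S": [("a", 0.6)]}         # nonterminal -> [(terminal, prob)]


def tree_probs(symbol, words):
    """Yield the probability of each parse tree rooted at `symbol` that yields `words`."""
    if len(words) == 1:
        # Only a lexical rule can produce a single terminal in CNF.
        for terminal, p in LEXICAL.get(symbol, []):
            if terminal == words[0]:
                yield p
        return
    for (left, right), p in BINARY.get(symbol, []):
        # Try every way of splitting the string between the two children.
        for split in range(1, len(words)):
            for p_left in tree_probs(left, words[:split]):
                for p_right in tree_probs(right, words[split:]):
                    yield p * p_left * p_right


def string_probability(words, start="S"):
    """Return (number of parse trees, total probability) for the string `words`."""
    probs = list(tree_probs(start, tuple(words)))
    return len(probs), sum(probs)


if __name__ == "__main__":
    for length in range(1, 7):
        count, prob = string_probability(["a"] * length)
        print(length, count, prob)
```

For this particular toy grammar the string "a a a" has two parse trees, and the number of trees for a string of T symbols is the Catalan number C(T-1), so explicit enumeration grows exponentially with T. This is only meant to make the definition concrete under the stated assumptions.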