25/05/2021, 14:00 - 15:30 (Asia/Jerusalem)
Linguistics Colloquium: Reut Tsarfaty
Reut Tsarfaty, Bar-Ilan University
Title: Abstraction in Natural Language
Abstract:
Abstraction is a core construct of human communication that manifests itself in diverse domains of human activity. However, in contrast with formal languages, where the levels of abstraction are predefined and known, abstraction expressed in informal and unrestricted natural language is challenging to identify and process automatically. In this work we propose to systematically address the challenge of identifying and grounding abstract structure expressed in complex natural language instructions. A precondition for doing so is the creation of a dataset that elicits from novices (non-programmers) the expression of abstract structures, such as iterations, conditions and functions, in natural, unrestricted and non-formal language. We present the Hexagons Dataset, in which two players engage in a referential game to describe increasingly abstract figures on a two-dimensional hexagons board. In our data, we see a mix of abstraction levels in the utterances. The performance of the baseline models confirms that such abstractions are harder to parse than plain predicate-argument structure. This opens up opportunities for learning to extract and ground higher-level abstractions, leaving ample space for further research towards synthesizing procedures and algorithms from natural language.