Mechanism Design for Social Good Started as a Reading Group; Now It Has Launched an ACM Conference

Computer science research is about solving problems with computational tools: how to predict where the next flu outbreak will occur, how robots can make life easier for senior citizens, or how to fight misinformation on social media. But while computer science (CS) researchers have the technical know-how, they still need to collaborate with people on the ground who know the particular problem or situation.

A group of graduate students from various institutions and disciplines (CS, Economics, and Operations Research, to name a few) recognized this gap and the need for connections and collaboration among the different groups. And so Mechanism Design for Social Good (MD4SG) was born in 2016, co-founded by Rediet Abebe and Kira Goldner. From a 12-member reading group, the multi-institutional initiative expanded to 2,000 participants involved in working groups, colloquium series, tutorials, and workshops at the ACM Conference on Economics and Computation (EC '17 and EC '18).

Riding the wave of support from researchers, practitioners, governments, and non-profit organizations, the inaugural ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (EAAMO), co-chaired by Ana-Andreea Stoica, Rediet Abebe, and Irene Lo, was held this October.

The conference highlighted research where CS, economics, operations research, and social and humanistic sciences intersect and help improve equity and access for historically disadvantaged and underserved communities. A number of Best Paper and Poster Awards were presented at the digital conference.

Ana-Andreea Stoica

We caught up with Ana-Andreea Stoica to find out more about the conference and why it is important to develop multi-disciplinary research opportunities.

What happened to make you realize that the MD4SG workshops could be expanded into a conference? How did the EAAMO conference come about?

Our technical workshop series has grown steadily since its first iteration in 2017. In 2020, we held our first standalone workshop, which drew over 130 submissions. Given this rapid expansion, as well as the expanded scope, we decided to start a conference series that better includes all the fields relevant to our mission of bridging research and practice to improve access to opportunity for marginalized communities (e.g., Economics, Operations Research, Computer Science, Sociology, Law). Rediet Abebe, Irene Lo, and I served as Program Co-Chairs for this inaugural conference, working closely with our General Co-Chairs, Illenin Kondo and Francisco Marmolejo-Cossio, to organize the first EAAMO conference.

 

How is the conference different from the MD4SG workshops?

The conference series is a natural continuation of the MD4SG workshop series, given the growth in size and scope since its inception. The conference aims to be inclusive of all the fields that produce research related to our organization's mission, including Economics, Operations Research, Computer Science, Sociology, and Law, among others. It also serves as a publishing venue for such research: as an ACM-sponsored conference, our archival track includes papers published with proceedings in the ACM Digital Library.

 

How is the conference creating a space for publishing research that relates to your mission?

EAAMO’21 aims to open avenues for creating and sharing research at the intersection of all the fields I mentioned, through both archival and non-archival tracks. In particular, original research can be published in the ACM Digital Library, where it can be recognized and shared in the research community. We hope that EAAMO can serve our community as a space for interdisciplinary research, in particular for the unique ideas and projects that apply computational tools and humanistic methodologies to improve access to opportunity for marginalized groups.

 

Why does the group aim to connect computer scientists with other non-computational groups such as non-profits and the public sector?

EAAMO’21 aims to foster an interdisciplinary community that can bridge research and practice in tackling topics such as access to education and healthcare, interventions for poverty alleviation, fairness and privacy in labor markets or data markets, and many other topics related to underserved communities. 

To this end, working with non-profits, the public sector, and practitioners is crucial in order to understand the main issues at stake in each of these applications and to construct research-to-practice pipelines that have an impact on the communities we aim to center at the core of our research agenda. The success of our workshop series and previous and ongoing projects relies on this multi-disciplinary approach and on engaging domain experts working in non-profit organizations, municipalities, and companies. Domain-centered interdisciplinary work has always been the focus of MD4SG activities. 

Since its inception, MD4SG has organized various working groups in which students, researchers, and practitioners work on particular topics of interest. Our current working groups range from 15 to 100+ members each and hold bi-weekly meetings with talks, discussions, and publication goals. These groups have fostered cross-domain collaborations that have led to several publications. As of Fall 2020, MD4SG has also organized working groups around specific geographical regions to foster collaborations on topics relevant to mechanism design for social good.

 

How will the conference facilitate these collaborations?

EAAMO’21 featured keynote talks from leading academics and practitioners in domains related to the conference theme; presentations of submitted papers, datasets, and software demonstrations by participants; problem pitches and product demonstrations from domain experts and practitioners; and thematic policy and practice discussion panels with practitioners, focused on Latin American topics and on migration and asylum seekers.

 

Are you working on any projects that resulted from the MD4SG workshops and EAAMO? Please describe them and how they are going.

Definitely. Our working groups are continuously working on projects that stemmed from our work together in MD4SG as well as from the MD4SG workshops. A recent paper that came out of the MD4SG Working Group on Bias and Discrimination can be found here. Other ongoing projects relate to the provision of social goods (in the Inequality Working Group, for example). My co-organizers have several projects, published and ongoing, for example from the Data Economies Working Group, found on this page.

 

How can people become part of MD4SG?

We encourage people who are interested in joining MD4SG to subscribe to our (low-volume) listserv, where we post opportunities to join working groups, events, collaborations, and related activities. Our website contains a detailed description of all of our activities as well.

How Will the Infrastructure Bill Improve Internet Access for Americans?

Professor Henning Schulzrinne unpacks the infrastructure bill and how it will expand broadband access for Americans.

The $1 trillion Bipartisan Infrastructure Deal will deliver $65 billion to improve internet access. Currently, 30 million Americans live in areas where broadband internet is not available, such as rural areas and lower-income urban neighborhoods. The plan is to build broadband networks, improve internet infrastructure, and help lower the cost of internet service. We asked Henning Schulzrinne, the Julian Clarence Levi Professor of computer science and an expert on internet networks, how the bill will impact internet access.


Q: What is the current state of internet access in the US? Why is it important that the bill allots $65 billion to improve access for rural areas, low-income families, and tribal communities?

Internet access has two facets: availability and adoption. Currently, there is no precise data on how many homes have access to basic fixed internet service, defined as a download speed of 25 megabits per second. (That is much slower than most current cable or fiber offerings.) A recent effort using a variety of data sources estimates that about 93% of households could subscribe to basic broadband or faster, leaving about 14.3 million households with no access except expensive and unreliable satellite service or very slow DSL. But only about 77% of adults use the internet at home (“adoption”). Affordability is an important reason for the discrepancy between availability and adoption.

The bill is the first large-scale effort to address both availability and adoption; earlier efforts largely provided money, about $5B a year, to rural areas to build out broadband internet, but did not address affordability except for the Emergency Broadband Benefit program started in May 2021.


Q: How far behind is the US when it comes to broadband compared to the rest of the world?

Almost all large countries struggle with making high-speed broadband available to rural areas. But many other countries have lower prices and more competition for broadband service, which may explain why the United States ranks 16th out of 38 OECD countries. The United States ranks 13th worldwide in average broadband speed, but such comparisons can be difficult, and the differences are not that large among the top 20 countries.

 

Q: Why is broadband not available in most rural areas?

Most rural areas have some broadband, typically using older technology based on phone lines (DSL, or digital subscriber line). However, it can be quite slow, and connections are often overloaded and unreliable. Only about 67% of rural households have access to the higher speeds of 100 Mb/s that are typical in urban areas. The reasons are complex: cable companies provide most high-speed broadband in the United States but have largely chosen not to build out in rural areas. Telephone companies have relied on their old phone lines to provide broadband service, with limited investment in modern fiber technology. Since houses are farther apart and disposable incomes are often lower, private investment in rural broadband has not been considered sufficiently profitable; thus, much of the rural broadband deployment has been subsidized by various federal programs. Many of these programs have been supporting broadband that is now considered obsolete.

 

Q: The bill also contains $1 billion to enable the build-out of “middle mile” broadband infrastructure. What is this and how can it help?

The internet infrastructure can be roughly divided into the backbone network connecting major cities, middle mile networks going from those cities to smaller population centers such as county seats, and access or “last mile” networks that connect homes to the internet. Many smaller communities do not have good fiber connections, or have only one expensive provider. Adding more regional middle mile networks allows smaller network operators to build out access networks, since such small operators cannot afford to build their own fiber network to the next large city.

 

Q: The bill offers an additional $2.75 billion for digital equity and inclusion efforts, which could end digital redlining. What is redlining? Do you think the bill can help with the issue?

Providers in urban areas have been accused of failing to upgrade slower broadband networks in lower-income urban areas. Competitors such as fiber providers often don’t build out new networks in such areas, either. The lower speeds and higher prices for such neighborhoods are referred to as digital redlining. It is not quite clear yet what kind of projects will be funded. There are promising ideas of providing free Wi-Fi in lower-income apartment buildings, for example.

 

Q: The pandemic has revealed how inadequate broadband service is in the US. Can you talk about the key findings of your NSF Broadband Research 2020 Report and whether the infrastructure bill will actually help achieve those goals?

The NSF Broadband Research Report emphasizes the need to consider measuring and addressing both availability and adoption, including providing training and devices, so the infrastructure bill offers many of the tools envisioned in the report. However, the report is largely about research questions and recommendations for facilitating such research, not policy mechanisms. Even with new, substantial funding, we have to make sure that the programs are effective and reach the right people. For example, the report recommends that all broadband agencies gather and release data as these programs are initiated so that we can learn from successes.

 

Q: How happy are you with the infrastructure bill? Do you think that it will help fast-track the broadband situation in the US? Without the bill, how long do you think it would have taken the US to catch up?

The bill is really the first large-scale, all-in, and comprehensive attempt to finally address broadband availability and affordability. It is both a visionary and a necessary step towards digital inclusion. My main concern is implementation and coordination. For example, the bill relies on private entities, from for-profit companies to electric cooperatives, to deploy broadband, but it cannot force companies to build out everywhere or to use the best long-term technology. Grants are made to states, which may not have the institutional capacity to ensure that the most efficient organizations build out networks that will still be sufficient to meet local broadband needs 20 years from now. We want to avoid having to spend another few tens of billions of taxpayer dollars ten years from now, after all.

Since the effort is very state-centric, making it possible for researchers and public interest organizations to monitor and evaluate the build-out and digital inclusion efforts will be challenging. (My research group is currently attempting to analyze the existing, much smaller, subsidy efforts, run by two federal agencies, and finding it quite challenging to get a good picture of the impact.)

 

Q: What positive effects do you see coming out of this effort?

Broadband has become a must-have infrastructure for any community, just like clean water or reliable electricity. For education, universal broadband will make it much easier to provide the same learning experience to everyone. Right now, teachers often cannot assign projects or homework that relies on internet resources since not all students have easy access. Continuing education and training will become a bit easier for adults looking to gain new skills. Rural areas lack access to specialists and mental health resources; telemedicine can bridge at least some of these gaps. Some rural areas located within maybe a hundred miles of major cities may be able to attract younger residents who can now work from home and only drive to their office occasionally. Many small businesses need reliable, high-speed internet to offer their goods and services.

That said, I would not expect broadband to fix all societal challenges. Broadband access is necessary, and even helpful, for education, health care, and public services, but it is not a replacement for providing high-quality education, health care, and public services more generally.

 

 

Shuran Song and Huy Ha Win the Best System Paper at CoRL 2021

Research papers from the department were accepted to the 2021 Conference on Robot Learning (CoRL), and the Best System Paper Award went to Assistant Professor Shuran Song and PhD student Huy Ha.

Below are the abstracts and links to the papers: 

Best System Paper Award
FlingBot: The Unreasonable Effectiveness of Dynamic Manipulation for Cloth Unfolding
Huy Ha (Columbia University) and Shuran Song (Columbia University)

Abstract: 
High-velocity dynamic actions (e.g., fling or throw) play a crucial role in our everyday interaction with deformable objects by improving our efficiency and effectively expanding our physical reach range. Yet, most prior works have tackled cloth manipulation using exclusively single-arm quasi-static actions, which requires a large number of interactions for challenging initial cloth configurations and strictly limits the maximum cloth size by the robot’s reach range. In this work, we demonstrate the effectiveness of dynamic flinging actions for cloth unfolding with our proposed self-supervised learning framework, FlingBot. Our approach learns how to unfold a piece of fabric from arbitrary initial configurations using a pick, stretch, and fling primitive for a dual-arm setup from visual observations. The final system achieves over 80% coverage within 3 actions on novel cloths, can unfold cloths larger than the system’s reach range, and generalizes to T-shirts despite being trained on only rectangular cloths. We also finetuned FlingBot on a real-world dual-arm robot platform, where it increased the cloth coverage over 4 times more than the quasi-static baseline did. The simplicity of FlingBot combined with its superior performance over quasi-static baselines demonstrates the effectiveness of dynamic actions for deformable object manipulation.
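
The paper's headline metric is cloth coverage after a small number of actions. To make that concrete, here is a minimal sketch (not the authors' code) of how coverage could be computed from a top-down segmentation mask, together with a toy rule for choosing two grasp points from a predicted per-pixel value map; the function names, the value-map interface, and the separation heuristic are illustrative assumptions.

```python
import numpy as np

def cloth_coverage(cloth_mask: np.ndarray, flattened_area: float) -> float:
    """Fraction of the fully flattened cloth area currently visible.

    cloth_mask: HxW boolean array from a top-down segmentation.
    flattened_area: pixel area of the cloth when fully flattened.
    """
    return float(cloth_mask.sum()) / flattened_area

def greedy_pick_points(value_map: np.ndarray, min_separation: int = 20):
    """Pick the two highest-value grasp pixels that are far enough apart
    for a dual-arm stretch (illustrative selection rule, not the paper's)."""
    h, w = value_map.shape
    flat_order = np.argsort(value_map, axis=None)[::-1]   # best pixels first
    first = np.unravel_index(flat_order[0], (h, w))
    for idx in flat_order[1:]:
        second = np.unravel_index(idx, (h, w))
        if np.hypot(first[0] - second[0], first[1] - second[1]) >= min_separation:
            return first, second
    return first, first  # degenerate fallback: cloth too small to stretch

# Toy usage: a 64x64 mask covering a quarter of a 32x32-pixel flattened cloth.
mask = np.zeros((64, 64), dtype=bool)
mask[16:32, 16:32] = True
print(cloth_coverage(mask, flattened_area=32 * 32))  # -> 0.25
```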

 

Toward Robots That Learn To Summarize Their Actions In Natural Language: A Set Of Tasks
Chad DeChant (Columbia University) and Daniel Bauer (Columbia University)

Abstract:
Robots should be able to report in natural language what they have done. They should provide concise summaries, respond to questions about them, and be able to learn from the natural language responses they receive to their summaries. We propose that developing the capabilities for robots to summarize their actions is a new and necessary challenge that should be taken up by the robotic learning community. We propose an initial framework for robot action summarization, presented as a set of tasks that can serve as a target for research and a measure of progress.

 

The Boombox: Visual Reconstruction from Acoustic Vibrations
Boyuan Chen (Columbia University), Mia Chiquier (Columbia University), Hod Lipson (Columbia University), and Carl Vondrick (Columbia University)

Abstract:
Interacting with bins and containers is a fundamental task in robotics, making state estimation of the objects inside the bin critical. While robots often use cameras for state estimation, the visual modality is not always ideal due to occlusions and poor illumination. We introduce The Boombox, a container that uses sound to estimate the state of the contents inside a box. Based on the observation that the collision between objects and their containers will cause an acoustic vibration, we present a convolutional network for learning to reconstruct visual scenes. Although we use low-cost and low-power contact microphones to detect the vibrations, our results show that learning from multimodal data enables state estimation from affordable audio sensors. Due to the many ways that robots use containers, we believe the box will have a number of applications in robotics.
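
The core idea is a convolutional network that maps acoustic vibrations to a visual reconstruction. The sketch below is a generic audio-to-image encoder-decoder in PyTorch, not the paper's architecture; the four-microphone input, spectrogram size, and layer widths are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AudioToImage(nn.Module):
    """Toy audio-to-vision model: encode a multi-channel spectrogram,
    then decode a coarse top-down image of the box contents."""
    def __init__(self, n_mics: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(                 # (B, n_mics, 64, 64) spectrograms
            nn.Conv2d(n_mics, 32, 4, stride=2, padding=1), nn.ReLU(),   # -> 32x32
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),       # -> 16x16
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),      # -> 8x8
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),  # -> 16x16
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),   # -> 32x32
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(), # -> RGB 64x64
        )

    def forward(self, spectrograms: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(spectrograms))

# Toy usage: a batch of 2 four-microphone spectrograms.
model = AudioToImage()
out = model(torch.rand(2, 4, 64, 64))
print(out.shape)  # torch.Size([2, 3, 64, 64])
```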

 

11 Research Papers Accepted to EMNLP 2021

Papers from CS researchers were accepted to the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP). A paper from the Spoken Language Processing Group received the Best Short Paper Award.


Best Short Paper Award
CHoRaL: Collecting Humor Reaction Labels from Millions of Social Media Users
Zixiaofan Yang, Shayan Hooshmand and Julia Hirschberg

Abstract:

Humor detection has gained attention in recent years due to the desire to understand user-generated content with figurative language. However, substantial individual and cultural differences in humor perception make it very difficult to collect a large-scale humor dataset with reliable humor labels. We propose CHoRaL, a framework to generate perceived humor labels on Facebook posts, using the naturally available user reactions to these posts with no manual annotation needed. CHoRaL provides both binary labels and continuous scores of humor and non-humor. We present the largest dataset to date with labeled humor on 785K posts related to COVID-19. Additionally, we analyze the expression of COVID-related humor in social media by extracting lexico-semantic and affective features from the posts, and build humor detection models with performance similar to humans. CHoRaL enables the development of large-scale humor detection models on any topic and opens a new path to the study of humor on social media.
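
CHoRaL's key move is deriving humor labels from naturally occurring reactions instead of manual annotation. The paper's exact scoring function is not reproduced here; the sketch below only illustrates the general pattern with a hypothetical score (the smoothed share of "haha" reactions on a post) and a threshold for the binary label.

```python
def humor_score(reactions: dict, smoothing: float = 1.0) -> float:
    """Hypothetical continuous humor score: smoothed share of 'haha'
    reactions among all reactions on a post (not CHoRaL's exact formula)."""
    total = sum(reactions.values())
    return (reactions.get("haha", 0) + smoothing) / (total + 2 * smoothing)

def humor_label(reactions: dict, threshold: float = 0.5) -> int:
    """Binary humor / non-humor label from the continuous score."""
    return int(humor_score(reactions) >= threshold)

post = {"like": 40, "haha": 120, "sad": 2, "angry": 1}
print(round(humor_score(post), 3), humor_label(post))  # 0.733 1
```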

 

A Bag of Tricks for Dialogue Summarization
Muhammad Khalifa, Miguel Ballesteros and Kathleen McKeown

Abstract:

Dialogue summarization comes with its own peculiar challenges as opposed to news or scientific articles summarization. In this work, we explore four different challenges of the task: handling and differentiating parts of the dialogue belonging to multiple speakers, negation understanding, reasoning about the situation, and informal language understanding. Using a pretrained sequence-to-sequence language model, we explore speaker name substitution, negation scope highlighting, multi-task learning with relevant tasks, and pretraining on in-domain data. Our experiments show that our proposed techniques indeed improve summarization performance, outperforming strong baselines.
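
One of the tricks studied, speaker name substitution, normalizes the arbitrary speaker names in a dialogue before summarization. Below is a minimal preprocessing sketch under the assumption that each turn is written as "Name: utterance"; the canonical token format is an illustrative choice, not the paper's.

```python
import re

def substitute_speaker_names(dialogue: str) -> tuple[str, dict]:
    """Replace each distinct 'Name:' prefix with a canonical speaker token,
    so the summarizer sees consistent speaker markers across dialogues."""
    mapping: dict[str, str] = {}
    lines_out = []
    for line in dialogue.strip().splitlines():
        match = re.match(r"^\s*([^:]+):\s*(.*)$", line)
        if not match:
            lines_out.append(line)
            continue
        name, utterance = match.group(1).strip(), match.group(2)
        token = mapping.setdefault(name, f"<speaker{len(mapping) + 1}>")
        lines_out.append(f"{token}: {utterance}")
    return "\n".join(lines_out), mapping

dialogue = "Amanda: I baked cookies. Want some?\nJerry: Sure!\nAmanda: Great, see you tomorrow."
processed, names = substitute_speaker_names(dialogue)
print(processed)
print(names)  # {'Amanda': '<speaker1>', 'Jerry': '<speaker2>'}
```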

 

Timeline Summarization based on Event Graph Compression via Time-Aware Optimal Transport
Manling Li, Tengfei Ma, Mo Yu, Lingfei Wu, Tian Gao, Heng Ji and Kathleen McKeown

Abstract:

Timeline Summarization identifies major events from a news collection and describes them following temporal order, with key dates tagged. Previous methods generally generate summaries separately for each date after they determine the key dates of events. These methods overlook the events’ intra-structures (arguments) and inter-structures (event-event connections). Following a different route, we propose to represent the news articles as an event-graph, thus the summarization task becomes compressing the whole graph to its salient sub-graph. The key hypothesis is that the events connected through shared arguments and temporal order depict the skeleton of a timeline, containing events that are semantically related, structurally salient, and temporally coherent in the global event graph. A time-aware optimal transport distance is then introduced for learning the compression model in an unsupervised manner. We show that our approach significantly improves the state of the art on three real-world datasets, including two public standard benchmarks and our newly collected Timeline100 dataset.
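
The compression model is trained with a time-aware optimal transport distance. As a rough illustration of the optimal transport ingredient only (not the paper's event-graph model), the sketch below runs entropy-regularized Sinkhorn iterations on a cost matrix that adds a time-difference penalty to a semantic distance; the penalty weight and the toy numbers are assumptions.

```python
import numpy as np

def sinkhorn(cost: np.ndarray, a: np.ndarray, b: np.ndarray,
             reg: float = 0.1, n_iters: int = 200) -> np.ndarray:
    """Entropy-regularized optimal transport plan between distributions a and b."""
    K = np.exp(-cost / reg)
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

# Toy "events": semantic distance plus a time-difference penalty (assumed weighting).
semantic = np.array([[0.1, 0.9, 0.8],
                     [0.9, 0.2, 0.7]])
days = np.array([[0, 3, 10],
                 [3, 0, 7]])
cost = semantic + 0.05 * days          # time-aware cost: events close in time are cheaper to match
a = np.full(2, 1 / 2)                  # uniform weights over summary events
b = np.full(3, 1 / 3)                  # uniform weights over all events
plan = sinkhorn(cost, a, b)
print(np.round(plan, 3))               # transport plan; rows sum to ~1/2, columns to ~1/3
print(float((plan * cost).sum()))      # time-aware OT distance
```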

 

Don’t Go Far Off: An Empirical Study on Neural Poetry Translation
Tuhin Chakrabarty, Arkadiy Saakyan and Smaranda Muresan

Abstract:

Despite constant improvements in machine translation quality, automatic poetry translation remains a challenging problem due to the lack of open-sourced parallel poetic corpora, and to the intrinsic complexities involved in preserving the semantics, style and figurative nature of poetry. We present an empirical investigation for poetry translation along several dimensions: 1) size and style of training data (poetic vs. non-poetic), including a zero-shot setup; 2) bilingual vs. multilingual learning; and 3) language-family-specific models vs. mixed-language-family models. To accomplish this, we contribute a parallel dataset of poetry translations for several language pairs. Our results show that multilingual fine-tuning on poetic text significantly outperforms multilingual fine-tuning on non-poetic text that is 35X larger in size, both in terms of automatic metrics (BLEU, BERTScore, COMET) and human evaluation metrics such as faithfulness (meaning and poetic style). Moreover, multilingual fine-tuning on poetic data outperforms bilingual fine-tuning on poetic data.

 

Implicit Premise Generation with Discourse-aware Commonsense Knowledge Models
Tuhin Chakrabarty, Aadit Trivedi and Smaranda Muresan

Abstract:

Enthymemes are defined as arguments where a premise or conclusion is left implicit. We tackle the task of generating the implicit premise in an enthymeme, which requires not only an understanding of the stated conclusion and premise, but also additional inferences that could depend on commonsense knowledge. The largest available dataset for enthymemes (Habernal et al., 2018) consists of 1.7k samples, which is not large enough to train a neural text generation model. To address this issue, we take advantage of a similar task and dataset: Abductive reasoning in narrative text (Bhagavatula et al., 2020). However, we show that simply using a state-of-the-art seq2seq model fine-tuned on this data might not generate meaningful implicit premises associated with the given enthymemes. We demonstrate that encoding discourse-aware commonsense during fine-tuning improves the quality of the generated implicit premises and outperforms all other baselines both in automatic and human evaluations on three different datasets.

 

GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation
Derek Chen and Zhou Yu

Abstract:

Practical dialogue systems require robust methods of detecting out-of-scope (OOS) utterances to avoid conversational breakdowns and related failure modes. Directly training a model with labeled OOS examples yields reasonable performance, but obtaining such data is a resource-intensive process. To tackle this limited-data problem, previous methods focus on better modeling the distribution of in-scope (INS) examples. We introduce GOLD as an orthogonal technique that augments existing data to train better OOS detectors operating in low-data regimes. GOLD generates pseudo-labeled candidates using samples from an auxiliary dataset and keeps only the most beneficial candidates for training through a novel filtering mechanism. In experiments across three target benchmarks, the top GOLD model outperforms all existing methods on all key metrics, achieving relative gains of 52.4%, 48.9% and 50.3% against median baseline performance. We also analyze the unique properties of OOS data to identify key factors for optimally applying our proposed method.
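
GOLD's recipe is to pseudo-label candidates drawn from an auxiliary dataset and keep only the most beneficial ones. The sketch below shows that general pattern with a simple TF-IDF dissimilarity filter standing in for the paper's filtering mechanism; the threshold and example utterances are illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def mine_oos_candidates(in_scope: list, auxiliary: list,
                        max_similarity: float = 0.2) -> list:
    """Keep auxiliary utterances that look unlike any in-scope utterance,
    and pseudo-label them as out-of-scope (a stand-in for GOLD's filter)."""
    vectorizer = TfidfVectorizer().fit(in_scope + auxiliary)
    ins_vecs = vectorizer.transform(in_scope)
    aux_vecs = vectorizer.transform(auxiliary)
    sims = cosine_similarity(aux_vecs, ins_vecs).max(axis=1)  # closest in-scope match
    return [utt for utt, s in zip(auxiliary, sims) if s < max_similarity]

in_scope = ["book a table for two", "cancel my reservation", "what time do you open"]
auxiliary = ["tell me a joke about cats", "reserve a table tonight", "play some jazz music"]
print(mine_oos_candidates(in_scope, auxiliary))
# expected to keep the joke and music requests (exact output depends on TF-IDF weighting)
```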

 

Continual Learning in Task-Oriented Dialogue Systems
Andrea Madotto, Zhaojiang Lin, Zhenpeng Zhou, Seungwhan Moon, Paul Crook, Bing Liu, Zhou Yu, Eunjoon Cho, Pascale Fung, and Zhiguang Wang

Abstract:

Continual learning in task-oriented dialogue systems can allow us to add new domains and functionalities through time without incurring the high cost of a whole system retraining. In this paper, we propose a continual learning benchmark for task-oriented dialogue systems with 37 domains to be learned continuously in four settings, such as intent recognition, state tracking, natural language generation, and end-to-end. Moreover, we implement and compare multiple existing continual learning baselines, and we propose a simple yet effective architectural method based on residual adapters. Our experiments demonstrate that the proposed architectural method and a simple replay-based strategy perform comparably well but both achieve inferior performance to the multi-task learning baseline, where all the data are shown at once, showing that continual learning in task-oriented dialogue systems is a challenging task. Furthermore, we reveal several trade-offs between different continual learning methods in terms of parameter usage and memory size, which are important in the design of a task-oriented dialogue system. The proposed benchmark is released together with several baselines to promote more research in this direction.
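
The architectural method that performs well here is built on residual adapters: small bottleneck modules trained per domain while the backbone stays frozen. A generic PyTorch sketch of such an adapter follows; the hidden and bottleneck sizes are illustrative, and this is not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResidualAdapter(nn.Module):
    """Bottleneck adapter: project down, apply a nonlinearity, project back up,
    and add the result to the frozen backbone's hidden states."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# One adapter per domain; only adapters are trained when a new domain arrives.
adapters = nn.ModuleDict({"hotel": ResidualAdapter(), "restaurant": ResidualAdapter()})
h = torch.randn(2, 10, 768)                  # (batch, sequence, hidden) from a frozen LM layer
print(adapters["hotel"](h).shape)            # torch.Size([2, 10, 768])
```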

 

Zero-Shot Dialogue State Tracking via Cross-Task Transfer
Zhaojiang Lin, Bing Liu, Andrea Madotto, Seungwhan Moon, Zhenpeng Zhou, Paul Crook, Zhiguang Wang, Zhou Yu, Eunjoon Cho, Rajen Subba, and Pascale Fung

Abstract:

Zero-shot transfer learning for dialogue state tracking (DST) enables us to handle a variety of task-oriented dialogue domains without the expense of collecting in-domain data. In this work, we propose to transfer the cross-task knowledge from general question answering (QA) corpora for the zero-shot DST task. Specifically, we propose TransferQA, a transferable generative QA model that seamlessly combines extractive QA and multi-choice QA via a text-to-text transformer framework, and tracks both categorical slots and non-categorical slots in DST. In addition, we introduce two effective ways to construct unanswerable questions, namely, negative question sampling and context truncation, which enable our model to handle “none” value slots in the zero-shot DST setting. The extensive experiments show that our approaches substantially improve the existing zero-shot and few-shot results on MultiWoz. Moreover, compared to the fully trained baseline on the Schema-Guided Dialogue dataset, our approach shows better generalization ability in unseen domains.
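
TransferQA reframes each dialogue-state slot as a question for a QA model and deliberately trains on unanswerable questions so the model can predict "none". The toy sketch below illustrates that framing; the question template and the negative-sampling rule are assumptions, not the paper's exact design.

```python
import random

def slot_to_question(domain: str, slot: str) -> str:
    """Phrase a DST slot as a QA-style question (illustrative template)."""
    return f"What is the {slot.replace('_', ' ')} of the {domain} the user wants?"

def sample_negative_question(all_slots: list, active_slots: set) -> str:
    """Negative question sampling: ask about a slot the dialogue never
    mentions, so the gold answer is 'none' (teaches unanswerability)."""
    inactive = [s for s in all_slots if s not in active_slots]
    domain, slot = random.choice(inactive)
    return slot_to_question(domain, slot)

all_slots = [("hotel", "price_range"), ("hotel", "area"), ("restaurant", "food")]
active = {("hotel", "area")}

print(slot_to_question("hotel", "area"))             # answerable: value appears in the dialogue
print(sample_negative_question(all_slots, active))   # gold answer should be 'none'
```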

 

Refine and Imitate: Reducing Repetition and Inconsistency in Persuasion Dialogues via Reinforcement Learning and Human Demonstration
Weiyan Shi, Yu Li, Saurav Sahay, and Zhou Yu

Abstract:

Despite the recent success of large-scale language models on various downstream NLP tasks, the repetition and inconsistency problems still persist in dialogue response generation. Previous approaches have attempted to avoid repetition by penalizing the language model’s undesirable behaviors in the loss function. However, these methods focus on token-level information and can lead to incoherent responses and uninterpretable behaviors. To alleviate these issues, we propose to apply reinforcement learning to refine an MLE-based language model without user simulators, and distill sentence-level information about repetition, inconsistency and task relevance through rewards. In addition, to better accomplish the dialogue task, the model learns from human demonstration to imitate intellectual activities such as persuasion, and selects the most persuasive responses. Experiments show that our model outperforms previous state-of-the-art dialogue models on both automatic metrics and human evaluation results on a donation persuasion task, and generates more diverse, consistent and persuasive conversations according to the user feedback.
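
The method distills sentence-level signals such as repetition into rewards for reinforcement learning. Below is a rough sketch of one possible repetition reward based on n-gram overlap with the dialogue history; the actual reward design in the paper is not reproduced here.

```python
def ngrams(text: str, n: int = 2) -> set:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def repetition_reward(response: str, history: list, n: int = 2) -> float:
    """Reward in [0, 1]: 1 means no bigram of the response already appeared
    in the dialogue history (an illustrative sentence-level signal)."""
    resp = ngrams(response, n)
    if not resp:
        return 0.0
    seen = set().union(*(ngrams(turn, n) for turn in history)) if history else set()
    overlap = len(resp & seen) / len(resp)
    return 1.0 - overlap

history = ["would you like to donate to save the children?",
           "the charity helps children around the world."]
print(repetition_reward("would you like to donate today?", history))       # 0.2 (mostly repeated)
print(repetition_reward("even a small amount makes a difference.", history))  # 1.0 (no repetition)
```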

 

Attribute Alignment: Controlling Text Generation from Pre-trained Language Models
Dian Yu, Zhou Yu, and Kenji Sagae

Abstract:

Large language models benefit from training with a large amount of unlabeled text, which gives them increasingly fluent and diverse generation capabilities. However, using these models for text generation that takes into account target attributes, such as sentiment polarity or specific topics, remains a challenge. We propose a simple and flexible method for controlling text generation by aligning disentangled attribute representations. In contrast to recent efforts on training a discriminator to perturb the token level distribution for an attribute, we use the same data to learn an alignment function to guide the pre-trained, non-controlled language model to generate texts with the target attribute without changing the original language model parameters. We evaluate our method on sentiment- and topic-controlled generation, and show large performance gains over previous methods while retaining fluency and diversity.

 

KERS: A Knowledge-Enhanced Framework for Recommendation Dialog Systems with Multiple Subgoals
Jun Zhang, Yan Yang, Chencai Chen, Liang He, and Zhou Yu

Abstract:

Recommendation dialogs require the system to build a social bond with users to gain trust and develop affinity in order to increase the chance of a successful recommendation. It is beneficial to divide up such conversations with multiple subgoals (such as social chat, question answering, recommendation, etc.), so that the system can retrieve appropriate knowledge with better accuracy under different subgoals. In this paper, we propose a unified framework for common knowledge-based multi-subgoal dialog: the knowledge-enhanced multi-subgoal driven recommender system (KERS). We first predict a sequence of subgoals and use them to guide the dialog model to select knowledge from a subset of the existing knowledge graph. We then propose three new mechanisms to filter noisy knowledge and to enhance the inclusion of cleaned knowledge in the dialog response generation process. Experiments show that our method obtains state-of-the-art results on the DuRecDial dataset in both automatic and human evaluation.