Create a language model for new/difficult languages
(e.g. Greek, Croatian, Spanish, French)
Use a Wikipedia dump, word2vec and machine learning to create a language model. Take a pre-trained English language model and use transfer learning to adapt it to other languages, such as low-resource languages.
Apply the language model to a natural language task:
- document classification
- seq2seq conversation generation for Telekom Hilft.
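As a rough illustration of the first step, word2vec-style models in the skip-gram variant are trained on (center word, context word) pairs taken from a sliding window over the text. The sketch below only generates those pairs; it does not train embeddings. The sample sentence, the function names and the window size are assumptions for illustration, and a real pipeline would first have to extract plain text from the Wikipedia dump.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of Unicode letters only."""
    return re.findall(r"[^\W\d_]+", text.lower())


def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs within a symmetric window."""
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield center, tokens[j]


# Hypothetical sample standing in for one line of extracted Wikipedia text.
sample = "Zagreb je glavni grad Hrvatske"
tokens = tokenize(sample)
pairs = list(skipgram_pairs(tokens, window=1))

# Pair counts are the raw statistic a skip-gram model would learn from.
pair_counts = Counter(pairs)
```

A larger window captures more topical context, a smaller one more syntactic context; which works better for a given low-resource language is something to test rather than assume.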
Equipment needed: Deep learning cloud, Nvidia
- Wikipedia dump of language needed to learn
- English or German data, such as chat texts, text classification
- Test data for low-resource languages
- Machine learning and information extraction
- Python, TensorFlow, PyTorch, scikit-learn
- Experience in deep learning, machine translation, seq2seq learning
Main contact person: Fang XU
Knowledge extraction from Telekom Hilft interactions
Use chat data from "Telekom Hilft" to identify topics and keywords, find relations between keywords and ultimately show how to utilize the learned/identified information to support customer care agents or users.
Goal: Extract the most relevant topics (e.g. using the IBM ppt as a basis), extract further keywords and relations between topics/words. Show topics and relations between topics/words and how they are represented. Demonstrate in a live demo on unseen chat messages how the learned topics and keywords can be used to support users or customer care agents.
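One simple baseline for the keyword part of this goal is TF-IDF scoring: a word counts as a keyword for a message if it is frequent in that message but rare across the corpus. The following is a minimal sketch under that assumption, not the expected solution. The three messages are invented examples, not taken from the Telekom Hilft dataset, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of Unicode letters only."""
    return re.findall(r"[^\W\d_]+", text.lower())


def tfidf_keywords(docs, top_k=3):
    """Return the top_k highest-scoring TF-IDF words for each document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each word occur?
    doc_freq = Counter()
    for toks in tokenized:
        doc_freq.update(set(toks))

    results = []
    for toks in tokenized:
        tf = Counter(toks)
        scores = {
            w: (c / len(toks)) * math.log(n_docs / doc_freq[w])
            for w, c in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        results.append(ranked[:top_k])
    return results


# Invented messages for illustration only.
messages = [
    "Mein Router startet nicht mehr",
    "Mein Router blinkt rot",
    "Meine Rechnung ist zu hoch",
]
keywords = tfidf_keywords(messages, top_k=2)
```

Words shared by several messages ("Mein", "Router") score lower than words unique to one message, so this baseline surfaces what distinguishes a message rather than its general topic. Relations between keywords would need an additional step, such as co-occurrence counting.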
Participants bring their own device & tools + Access to Open Telekom Cloud and/or DGX
Chat messages from "Telekom Hilft" (dataset with Twitter & Facebook chat messages, approx. 2.0 million messages, 1.2 GB of data in .csv format, covering Tweets and Facebook messages from 01/2016 until 06/2017)
- AI/Machine Learning expertise,
- Natural Language Processing,
- Knowledge representation, Ontologies,
- Python, R, Scala, Java, ...
- Experience in dialogue systems might be helpful
- Potentially Web-Development & UX/UI skills when going for a non-command line demo.
Main contact person: Mathias Kirsten
Create a customer support chatbot
Build a chatbot to simulate support agent interactions. The bot should be able to reply to questions such as “My Internet connection is not working, what should I do?”. You can use any existing framework such as Dialogflow (api.ai) or wit.ai.
The evaluation consists of a small non-public set of questions of varying complexity that will be tested against the system. You can choose German or English as the language for the bot. Some example questions:
- "Welche Handys haben mindestens 256 GB Speicher?"
- "Do you sell the iPhone 7?"
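For participants who do want to write code, a question like the first example can be handled by extracting the constraint (a minimum storage size) and filtering a product list. This is a hedged sketch of that idea only. The catalog entries, the regular expression and the `answer` function are hypothetical, and a framework such as Dialogflow would handle intent and entity recognition differently.

```python
import re

# Hypothetical catalog; real product data would have to be collected,
# for example from the telekom.de website.
CATALOG = [
    {"name": "Phone A", "storage_gb": 64},
    {"name": "Phone B", "storage_gb": 256},
    {"name": "Phone C", "storage_gb": 512},
]

# Matches "mindestens 256 GB" (German) or "at least 256 GB" (English).
STORAGE_PATTERN = re.compile(
    r"(?:mindestens|at least)\s+(\d+)\s*GB", re.IGNORECASE
)


def answer(question, catalog=CATALOG):
    """Answer a minimum-storage question from the catalog."""
    match = STORAGE_PATTERN.search(question)
    if match is None:
        return "Sorry, I did not understand the question."
    min_gb = int(match.group(1))
    names = [p["name"] for p in catalog if p["storage_gb"] >= min_gb]
    if not names:
        return f"No phones with at least {min_gb} GB found."
    return f"Phones with at least {min_gb} GB: " + ", ".join(names)


reply = answer("Welche Handys haben mindestens 256 GB Speicher?")
```

A single pattern like this covers only one question type; the hidden evaluation set is described as varying in complexity, so a real system would need many more intents or a learned model.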
Equipment needed: Own laptop
You can use any data you might find appropriate for the challenge. Some suggestions include the Telekom Hilft online forum and data dump with millions of customer/support agent interaction examples, any social media data you can get and the telekom.de website for product entities, etc.
No programming skills are required but information extraction skills might be useful in order to process and prepare data.
Rules: We kindly ask the participants to document the approach taken to solve the challenge so that we can make a better evaluation.
Main contact person: Yaser Martinez Palenzuela
Autonomously Target Users with Emotional Responses for Extra Customer Care
We are interested in the automatic evaluation of the customer's emotional state during textual man-machine interaction, for example in chatbot dialogs.
Mainly two questions arise:
- How (and what exactly) to measure?
- How to deal with the findings?
In preparation for this challenge we collected two data sets from German/Austrian human/machine chatbot dialogs which are described in the next sections.
- Classify emotions based on text input
- For set II: how to unify five annotations
- Deal with sparse data, develop strategies to detect emotional data
- How should an automated agent react to emotions? Build a GUI
- For set I: Search for correlations between emotion and topic and report
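For the question of how to unify five annotations in set II, one common baseline is a majority vote with a minimum agreement threshold, where items without sufficient agreement are left unlabeled. The sketch below assumes each item carries five categorical labels. The label names and the threshold of three are assumptions, since the annotation scheme is not described here.

```python
from collections import Counter


def unify_annotations(labels, min_agreement=3):
    """Return the majority label, or None if agreement is too low."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agreement else None


# Hypothetical annotations from five annotators for two utterances.
clear_case = ["anger", "anger", "neutral", "anger", "joy"]
unclear_case = ["anger", "joy", "neutral", "anger", "joy"]

unified_clear = unify_annotations(clear_case)
unified_unclear = unify_annotations(unclear_case)
```

Dropping low-agreement items makes sparse emotional data even sparser, so it may be worth comparing this against keeping the full label distribution as a soft target.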
Equipment needed: Own laptop
People with some programming experience, ideally in natural language processing. Creative designers are also welcome.
Main contact person: Felix Burkhardt
The task is to generate suitable sentences to describe each image from a given set of images. The training dataset will consist of a set of images and natural language annotations in English and German for each image. The models and concepts will later be tested against a separate set of images to evaluate and compare the performance of the results.
Expected outcomes of the challenge:
The developed and trained models should be able to create captions for an unknown set of images.
Equipment needed: GPU hardware will be needed for training and evaluation
- Natural Language Processing
- Deep Learning (transfer learning)
- Experience using Deep Learning frameworks (preferred: Keras / TensorFlow)
- Preferred programming language: Python
For each developed model there needs to be documentation explaining:
- The approach taken to solve this challenge
- (The code for data preprocessing, building the model, training the model and evaluation)
- The results of the tests
Main contact person: Christian Beckmann
Create and implement a dialog system that uses the Telekom ontology as its domain model. The system should be capable of dialogs similar to the examples below. Your implementation should take German language text as input and return a textual response.
Example dialog 1:
- Mein mobiles Internet funktioniert nicht.
- Was für ein Telefon nutzen Sie?
- Ein iPhone.
- Können Sie bestätigen, dass Sie das mobile Internet eingeschaltet haben? Gehen Sie dazu in “Einstellungen” und dann “Mobile Daten”.
- Ja, es ist eingeschaltet!
- Was für einen Mobilfunkvertrag nutzen Sie?
Example dialog 2:
- Welche Größe hat das iPhone?
- Das iPhone 8 hat die Maße A x B x Z, es ist das bisher größte iPhone.
Example dialog 3:
- Ich würde gerne einen Vertrag mit der Telekom abschließen.
- OK, wollen Sie einen Festnetzvertrag oder einen Mobilfunkvertrag oder beides?
- Wir können Ihnen die folgenden Verträge anbieten: A, B, C. C können Sie für zusätzliche 10 Euro auch mit Entertain bekommen.
- OK, ich nehme C!
- Alles klar! Wie groß soll Ihr mobiles Datenpaket sein? 1 GB, 4 GB oder 8 GB?
Expected outcomes of the challenge:
Dialog models to automatically derive dialog actions from the ontology.
Equipment needed: Participants bring their own device & tools.
- Dialog models
- The dialog system can choose its actions in whatever way (statistical or rule-based, plan-based or seq2seq,…) but the direction of the dialog should be driven by the ontology, i.e., any goal-directed behavior should be derived from the ontology in a principled way.
- The ontology can be extended or modified based on content on the Deutsche Telekom website, chat protocols from the “Telekom Hilft” corpus, and other official information about Telekom products and associated products (e.g. phones available with contracts)
Language generation is not judged. Just make sure that the system responses are clear enough for a user to know how to respond.
Before the challenge starts, we will define three tasks consisting of an initial query and user constraints, as well as a corresponding desired outcome of the conversation. Two judges will mimic those users in a conversation with each candidate system. The system coming closest to the desired solution will win the challenge.
Telekom Ontology in XML-RDF, SKOS structured, latest version; language is German.
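To illustrate what deriving a dialog action from the ontology could look like, the sketch below reads a SKOS hierarchy from RDF/XML and turns the narrower concepts of a term into a clarifying question, in the spirit of example dialog 3. The miniature ontology, its `example.org` URIs and the function names are hypothetical; the real Telekom ontology will differ in structure and size. The parser only handles concepts written as `skos:Concept` elements with `skos:broader` links, which is an assumption about the serialization.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_URI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS_URI = "http://www.w3.org/2004/02/skos/core#"
NS = {"rdf": RDF_URI, "skos": SKOS_URI}
RDF_ABOUT = "{" + RDF_URI + "}about"
RDF_RESOURCE = "{" + RDF_URI + "}resource"

# Hypothetical miniature ontology, not the real Telekom ontology.
SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/Vertrag">
    <skos:prefLabel xml:lang="de">Vertrag</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/Festnetzvertrag">
    <skos:prefLabel xml:lang="de">Festnetzvertrag</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/Vertrag"/>
  </skos:Concept>
  <skos:Concept rdf:about="http://example.org/Mobilfunkvertrag">
    <skos:prefLabel xml:lang="de">Mobilfunkvertrag</skos:prefLabel>
    <skos:broader rdf:resource="http://example.org/Vertrag"/>
  </skos:Concept>
</rdf:RDF>
"""


def load_skos(xml_text):
    """Return (labels by URI, narrower URIs by broader URI)."""
    root = ET.fromstring(xml_text)
    labels = {}
    narrower = defaultdict(list)
    for concept in root.findall("skos:Concept", NS):
        uri = concept.get(RDF_ABOUT)
        label = concept.find("skos:prefLabel", NS)
        labels[uri] = label.text if label is not None else uri
        for broader in concept.findall("skos:broader", NS):
            narrower[broader.get(RDF_RESOURCE)].append(uri)
    return labels, narrower


def clarifying_question(uri, labels, narrower):
    """Ask which narrower concept is meant, or return None for a leaf."""
    options = [labels[c] for c in narrower.get(uri, [])]
    if not options:
        return None
    return f"Welche Art von {labels.get(uri, uri)}: " + " oder ".join(options) + "?"


labels, narrower = load_skos(SAMPLE)
question = clarifying_question("http://example.org/Vertrag", labels, narrower)
```

Because the question is generated from the hierarchy rather than hard-coded, extending the ontology with a new contract type changes the dialog without touching the code, which is one reading of "driven by the ontology".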