The Artificial Intelligence Meetup in Quebec City drew over 600 participants this year and highlighted research from Google Brain, Siemens, and many other companies and universities. This second edition took place at the Port of Quebec on April 8 and included two conference tracks with 20 presentations.

This year's focus was mostly on reinforcement learning, creativity, healthcare, smart cities, and IoT. Here is a summary of my five favorite talks from the meetup, followed by a concluding remark on ethics.

Reinforcement Learning and Deep Neural Networks

Marc G. Bellemare, Senior Researcher at Google Brain and Adjunct Professor at McGill University

Marc introduced the concepts of reinforcement learning by showing how to make a "Pâté Chinois" recipe. The reinforcement learning agent starts in the first state: an empty plate. Then, the agent must choose an action to transition to a new state. In this case, an action consists of choosing an ingredient (e.g. chopped steak, corn, or potato), and the new state could be a plate with chopped steak. The goal of the agent is to choose a sequence of actions that maximizes a long-term reward, such as successfully completing the recipe.
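
The recipe example can be turned into a small program. The sketch below is my own illustration rather than anything shown in the talk: it assumes a reward of 1 only when the three layers are stacked in the order steak, corn, potato, and it uses tabular Q-learning, a standard reinforcement learning algorithm. The names `step`, `train`, and `greedy_recipe` and all hyperparameter values are hypothetical choices.

```python
import random

INGREDIENTS = ["steak", "corn", "potato"]  # the actions available to the agent
RECIPE = ("steak", "corn", "potato")       # assumed target layering for this sketch


def step(state, action):
    """Add one ingredient to the plate and return (next_state, reward, done)."""
    next_state = state + (action,)
    done = len(next_state) == len(RECIPE)
    # The reward only arrives at the end, when the whole recipe is correct.
    reward = 1.0 if done and next_state == RECIPE else 0.0
    return next_state, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy action selection."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) to the estimated long-term reward
    for _ in range(episodes):
        state, done = (), False  # every episode starts with an empty plate
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(INGREDIENTS)  # try something at random
            else:
                best = max(q.get((state, a), 0.0) for a in INGREDIENTS)
                ties = [a for a in INGREDIENTS if q.get((state, a), 0.0) == best]
                action = rng.choice(ties)  # best known action, ties broken randomly
            next_state, reward, done = step(state, action)
            future = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in INGREDIENTS
            )
            old = q.get((state, action), 0.0)
            # Move the estimate toward reward + discounted best future value.
            q[(state, action)] = old + alpha * (reward + gamma * future - old)
            state = next_state
    return q


def greedy_recipe(q):
    """Follow the highest-valued action from the empty plate to a full plate."""
    state, done = (), False
    while not done:
        action = max(INGREDIENTS, key=lambda a: q.get((state, a), 0.0))
        state, _, done = step(state, action)
    return list(state)
```

Calling `greedy_recipe(train())` follows the learned values from the empty plate. With these settings the agent should end up layering steak, then corn, then potato, even though it is only rewarded once the plate is complete.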

Marc contributed to the creation of the Arcade Learning Environment¹, which serves as an interface to hundreds of Atari 2600 games. The goal of the agent in this environment is to choose the sequence of actions that wins the game and obtains the highest score. The agent receives the current image of the game as input (state) and chooses an action that simulates pressing a button on a joystick controller. To assess the performance of these agents in new situations, a subset of games is used to train the agents and a distinct set is used for evaluation. Since these Atari games are highly diverse and were created by independent groups, they provide an interesting way to benchmark reinforcement learning agents and assess their general competency.

In 2015, Marc and his co-authors introduced the Deep Q-Network (DQN)², an agent that combines reinforcement learning with Convolutional Neural Networks (CNNs). Previous reinforcement learning approaches were mostly based on linear representations and required manually defining how to transform an image into a relevant set of features (i.e. feature engineering). In contrast, CNNs are known to perform especially well on image classification tasks and can derive their own feature representation of the image. Combining a reinforcement learning agent with a CNN produced an end-to-end agent that was more robust to changes across different games.

Duckietown

Liam Paull, Assistant Professor, Department of Computer Science and Operations Research, University of Montreal

As a researcher working on autonomous cars, Liam found himself spending more time on engineering problems than on algorithmic ones. He therefore looked for a way to test his research in a simpler environment. Thus Duckietown was born: a miniature model of a town whose citizens are ducks. The autonomous car, which consists of a Raspberry Pi with a camera, must navigate the roads of Duckietown while respecting the traffic lights and avoiding pedestrians. The environment is open-source and reproducible at home, and competition benchmarks are available. Live Duckiebot competitions will be held at ICRA and NeurIPS this year.

The Construction of Generative Musical Models and How to Use them Creatively

Pablo Samuel Castro, Researcher, Google Brain

Pablo is a researcher and musician who uses machine learning models to generate music. He created the LyricAI system that helped the musician David Usher write new lyrics.

While the previous version of LyricAI was based on a Recurrent Neural Network (RNN), the new version³ relies on Transformer models⁴. Transformers replace the recurrent layers used by RNNs with a self-attention mechanism. This mechanism lets the model attend to relevant words far back in the sequence, which helps keep the next generated words coherent (i.e. it captures long-term dependencies).
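
To make self-attention more concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is a simplification under stated assumptions: a real Transformer first applies learned linear projections to obtain the queries, keys, and values, uses several attention heads, and masks future positions during generation. Here the toy 2-dimensional embeddings in `EMBEDDINGS` are invented for illustration and are used directly as queries, keys, and values.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for query in queries:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(query, key)) / math.sqrt(dim)
            for key in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all value vectors.
        output = [
            sum(w * value[j] for w, value in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical embeddings: "love" and "heart" point in a similar direction.
TOKENS = ["love", "the", "heart"]
EMBEDDINGS = [[2.0, 0.0], [0.0, 2.0], [1.8, 0.2]]
```

Running `self_attention(EMBEDDINGS, EMBEDDINGS, EMBEDDINGS)` returns one weight list per token. In this toy setup, "heart" puts more weight on "love" than on "the", regardless of how far apart the two words are in the sequence, which is the property that helps with long-term dependencies.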

LyricAI uses two Transformer models. The first one generates the grammatical structure of the lyrics. This model is trained on a lyric dataset where a lyric line is fed as input, and the model must predict the Part-of-Speech (PoS) tags of the next line. The second model generates the words of the lyrics. It is trained on a book dataset where sentences are divided into two parts. Given the words in the first part of the sentence and the PoS tags of the second part, the model must predict the sequence of words appearing in the second part of the sentence.
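
As a rough illustration of the second model's training data, the sketch below builds one (input, target) pair from a sentence that has already been PoS-tagged. The function name `make_training_pair`, the `<sep>` marker, the midpoint split, and the hand-written tags are all assumptions made for this example; the talk did not detail how the actual system encodes or splits its inputs.

```python
def make_training_pair(tagged_sentence, split=None):
    """Build (model_input, target) from a list of (word, pos_tag) tuples.

    The input holds the words of the first part followed by the PoS tags of
    the second part; the target holds the words of the second part.
    """
    if split is None:
        split = len(tagged_sentence) // 2  # assumed: cut the sentence in half
    first, second = tagged_sentence[:split], tagged_sentence[split:]
    model_input = (
        [word for word, _ in first]
        + ["<sep>"]  # hypothetical marker between words and tags
        + [tag for _, tag in second]
    )
    target = [word for word, _ in second]
    return model_input, target


# A hand-tagged example sentence (tags written by hand for illustration).
SENTENCE = [
    ("the", "DET"), ("night", "NOUN"), ("is", "AUX"),
    ("falling", "VERB"), ("on", "ADP"), ("me", "PRON"),
]
```

Here `make_training_pair(SENTENCE)` yields the input `["the", "night", "is", "<sep>", "VERB", "ADP", "PRON"]` and the target `["falling", "on", "me"]`. At generation time, the tags would come from the first model instead of from a tagged sentence.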

Pablo ended his talk with a live demo of creative machine learning models. He used an RNN model to generate drum beats, started a piano improvisation, and let another model generate the rest of the piano melody.

AI (ML) and Healthcare Now and the Future

Anna Goldenberg, Senior Scientist at the Hospital for Sick Children and Assistant Professor at the University of Toronto

Machine learning has many applications in healthcare. Anna and her team have used machine learning approaches to tailor treatments to similar groups of individuals⁵, predict the age of cancer onset for people at high risk⁶, and predict malignancy of thyroid cancer to reduce unnecessary surgeries by 20%.

Despite these advances, several problems still hinder the use of machine learning models in healthcare. One is that clinical policies already in place can bias the data. For example, patients with asthma start with more aggressive treatments for pneumonia because of their higher risk. A model trained on patient outcomes could then wrongly infer that asthma leads to a higher chance of survival. Studies are also required to compare the predictions of a model with those of experts, to prove that the model causes no harm before it is deployed. Once the model is deployed, updates can be harder to implement because of FDA regulations, leading to lower performance over time.
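
The asthma example can be reproduced with a few lines of arithmetic. The cohort below is entirely invented for illustration (the counts do not come from the talk or from any study): every asthma patient receives the aggressive treatment and every other patient receives the standard one, so the two variables cannot be told apart in the data.

```python
from collections import defaultdict


def build_cohort():
    """Hypothetical records of (has_asthma, treatment, survived)."""
    records = []
    records += [(True, "aggressive", True)] * 95    # asthma, survived
    records += [(True, "aggressive", False)] * 5    # asthma, died
    records += [(False, "standard", True)] * 810    # no asthma, survived
    records += [(False, "standard", False)] * 90    # no asthma, died
    return records


def survival_rate_by(records, key_index):
    """Survival rate grouped by one field (0 = has_asthma, 1 = treatment)."""
    totals, survivors = defaultdict(int), defaultdict(int)
    for record in records:
        key = record[key_index]
        totals[key] += 1
        survivors[key] += record[2]  # True counts as 1, False as 0
    return {key: survivors[key] / totals[key] for key in totals}
```

Grouping by asthma gives a survival rate of 0.95 for asthma patients versus 0.90 for the others, and grouping by treatment gives exactly the same two numbers. A model that only sees the diagnosis would credit asthma for a benefit that, in this made-up cohort, could just as well come from the more aggressive treatment.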

Bandit Algorithms and their Application to Adaptive Clinical Trials

Audrey Durand, Postdoctoral Student, McGill University

Audrey used an interactive reinforcement learning agent to select effective treatments for mice with skin cancer⁷. The treatments in this clinical trial were assigned sequentially rather than at the beginning of the study. Hence, the effects of a treatment on earlier patients were used to help the agent choose a treatment for the following patients.

The agent assigned treatments by estimating the probability of success of the different options and balancing exploration and exploitation. The agent exploits when it assigns the treatment that looks best according to current information. In contrast, the agent explores when it assigns a treatment in order to acquire additional information.
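
The trade-off between exploration and exploitation can be sketched with Thompson sampling on a simple Bernoulli bandit. This is a deliberately simplified stand-in and not the method from the study, which uses contextual bandits⁷: here each treatment is assumed to have a fixed, hidden success probability, and the function name, the success probabilities, and the number of rounds are all hypothetical.

```python
import random


def thompson_sampling(success_probs, rounds=2000, seed=0):
    """Assign treatments sequentially and return how often each was chosen.

    success_probs holds the hidden success probability of each treatment;
    the agent never reads it directly, it only observes outcomes.
    """
    rng = random.Random(seed)
    n_arms = len(success_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Draw a plausible success rate for each treatment from its
        # Beta(successes + 1, failures + 1) posterior. Uncertain treatments
        # sometimes draw high values, which is what drives exploration.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        arm = samples.index(max(samples))  # act on the most promising draw
        pulls[arm] += 1
        # Observe the outcome for this patient and update the counts.
        if rng.random() < success_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls
```

With something like `thompson_sampling([0.2, 0.5, 0.8])`, the counts should concentrate on the last treatment as evidence accumulates: early rounds are spread across all options, while later rounds mostly exploit the one that has worked best so far.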

During the trial, the life expectancy of the mice increased over time and the variance diminished. This behavior is expected because the time spent exploring decreases as the study progresses. The agent also learned to alternate between two treatments over time, which helped the mice recover from high doses of chemotherapy.

A Closing Note on Ethics

The meetup ended on a subject that is becoming increasingly popular in the AI community: the ethical impacts of AI systems. François Laviolette, Director of the Big Data Research Center at Laval University, presented different ethical concerns surrounding AI systems. One example is the Cambridge Analytica scandal, in which the personal data of millions of Facebook users was used without their consent to influence U.S. voters. Other concerns include the fairness, privacy, and accountability of AI systems, as well as their impact on job security.

To address these concerns, François and Lyse Langlois announced the creation of the Observatory on the Societal Impacts of AI. The observatory brings together 18 educational institutions and 160 researchers who will work to minimize the negative impacts of AI systems. If you or your company are interested in developing responsible AI systems, you are invited to sign the Montreal Declaration for a Responsible Development of Artificial Intelligence.

References

1. Bellemare, Marc G, Yavar Naddaf, Joel Veness, and Michael Bowling. 2013. “The Arcade Learning Environment: An Evaluation Platform for General Agents.” Journal of Artificial Intelligence Research 47: 253–79. https://www.jair.org/index.php/jair/article/view/10819.

2. Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, et al. 2015. “Human-Level Control Through Deep Reinforcement Learning.” Nature 518 (7540): 529. Nature Publishing Group. https://www.nature.com/articles/nature14236.

3. Castro, Pablo Samuel, and Maria Attarian. 2018. “Combining Learned Lyrical Structures and Vocabulary for Improved Lyric Generation.” arXiv preprint arXiv:1811.04651. https://arxiv.org/abs/1811.04651.

4. Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. “Attention Is All You Need.” In Advances in Neural Information Processing Systems, 5998–6008. http://papers.nips.cc/paper/7181-attention-is-all-you-need.

5. Saria, Suchi, and Anna Goldenberg. 2015. “Subtyping: What It Is and Its Role in Precision Medicine.” IEEE Intelligent Systems 30 (4): 70–75. IEEE. https://ieeexplore.ieee.org/abstract/document/7156005.

6. Erdman, Lauren, Ben Brew, Jason Berman, Adam Shlien, Andrea Doria, David Malkin, and Anna Goldenberg. 2017. “Age of Cancer Onset Differentiated by Sex and TP53 Codon Change in Li-Fraumeni Syndrome Patient Population.” AACR. http://cancerres.aacrjournals.org/content/77/13_Supplement/3409.short.

7. Durand, Audrey, Charis Achilleos, Demetris Iacovides, Katerina Strati, Georgios D Mitsis, and Joelle Pineau. 2018. “Contextual Bandits for Adapting Treatment in a Mouse Model of de Novo Carcinogenesis.” In Machine Learning for Healthcare Conference, 67–82. http://proceedings.mlr.press/v85/durand18a.html.