Reinforcement Learning: why it matters for your business.

Reinforcement learning is a type of machine learning that could modernise the way businesses handle complex workflows.

What is Reinforcement Learning? The intuition behind it

Reinforcement learning (RL) is an “ancient” area of machine learning that has recently gained a lot of attention thanks to new discoveries by Google.

The objective of Reinforcement Learning is to make sequential decisions in an optimal way. More specifically, an RL algorithm makes short-term decisions while optimising for a longer-term goal, and it learns which decisions work through trial and error.
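To make the idea of a longer-term goal concrete, here is a minimal sketch of the discounted return that many RL methods try to maximise. The function name, the discount factor of 0.9 and the two reward sequences are illustrative assumptions, not figures from a real project.

```python
def discounted_return(rewards, gamma=0.9):
    """Add up a sequence of rewards, discounting each one by gamma per step of delay."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (gamma ** step) * reward
    return total

# Two hypothetical plans (made-up numbers, for illustration only).
quick_win = [5, 0, 0, 0]       # take a small reward immediately
patient_plan = [0, 0, 0, 10]   # give up short-term reward for a larger payoff later

print(discounted_return(quick_win))
print(discounted_return(patient_plan))
```

Under these assumptions the patient plan scores higher even though it earns nothing for the first three steps, which is the kind of trade-off an RL algorithm learns to make.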

Reinforcement learning is one of three basic machine learning paradigms, alongside supervised learning and unsupervised learning.

In this guide you will learn:

  • What are the applications of RL
  • Why RL matters to your business
  • What is the future of Reinforcement Learning

What are the applications?

Reinforcement Learning has recently made the headlines as the computer programs AlphaGo and AlphaZero proved to be breakthrough technologies. Developed by DeepMind Technologies, which is now owned by Google, AlphaGo made history as the first computer program to defeat a professional human player at the most challenging board game for AI: the abstract strategy game ‘Go’.

However, the area isn’t limited to merely beating you in games like chess. Many aspects of our lives have already been revolutionised by the concept. These include:

  • Supply chain optimisation – helping supply chains to cut down on waste, excess storage, unneeded transportation costs, and lost time.
  • Robotics delivering your goods – Amazon warehouse robots learn to do the human tasks of moving an object along the logistics chain.
  • Autonomous vehicles – Tesla’s Autopilot self-driving technology relies on reinforcement learning to train deep neural networks using diverse real-world scenarios from Tesla’s fleet of cars.
  • Determining next best actions online to keep customers engaged – a variety of systems help reveal and quantify customer preferences in order to make marketing messages, ads, offers and recommendations more relevant and engaging.
  • Smart grid optimisation – RL can capture energy demand and supply and learn the optimal allocation of energy load.
  • Cooling computers at data centres – reducing and optimising energy usage.
  • Personalising e-commerce experiences – crafting a personalised e-commerce experience for individuals based on how they, and customers similar to them, behave.
  • Recommendation systems for news feeds – personalising a feed of news specifically tailored to an individual’s interests.
  • Dynamic pricing estimates – how the price of some goods and services, like an Uber, changes depending on how a factor like weather or time of year affects demand.
  • Algorithmic trading – predicting how companies will perform on the stock exchange.

Why Reinforcement Learning matters to your business

Reinforcement Learning can create value for organisations that deal with complex problems.

In particular, organisations managing complex workflows involving people, machinery and other variables that cannot be controlled (e.g. markets, weather, road traffic, etc.) are the ones that can benefit the most from it.

The reason is simple and it relates to the very nature of Reinforcement Learning: it continuously learns over time by receiving rewards and punishments for every action taken.

This interesting property enables Reinforcement Learning to react to events or environments that it has never seen before.

More specifically, it is particularly suited to:

  • Constantly changing and evolving problems
  • Processes where decisions are to be made at every stage
  • Problems for which you don’t have data ready at hand to train Machine Learning models
  • Situations where you have to correct the errors “on-the-go”
  • Workflows where humans want to achieve long-term results
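The reward-and-punishment loop described above can be sketched with tabular Q-learning, one common RL algorithm, on a toy problem. Everything here is a hypothetical illustration: the five-position corridor, the reward values and the hyperparameters are assumptions chosen to keep the example small, not a recipe for a production system.

```python
import random

N_STATES = 5        # hypothetical corridor: positions 0..4, position 4 is the goal
ACTIONS = [-1, +1]  # step left or step right

def step(state, action):
    """Toy environment: +1 reward at the goal, a -0.1 punishment for every other move."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else -0.1), done

def greedy(q_row):
    """Index of the action with the highest estimated value."""
    return max(range(len(q_row)), key=lambda i: q_row[i])

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Mostly act on what has been learned, occasionally try something else.
            a = rng.randrange(len(ACTIONS)) if rng.random() < epsilon else greedy(q[state])
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate towards the reward plus the discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()
policy = [ACTIONS[greedy(row)] for row in q_table[:-1]]
print(policy)
```

The agent is never told that the goal is on the right. It starts with no data, is punished for wasted moves and rewarded on arrival, and after enough episodes its preferred action in every position is to step right.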

What are the benefits of Reinforcement Learning?

Reinforcement learning brings several upsides when implemented within an organisation. The three most important are:

  1. Flexibility. You can add requirements that emerge “on-the-go” whilst using your Reinforcement Learning set-up. Such requirements can be implemented at every stage of the framework: you can modify the rewards, actions and constraints. For instance, if you are a manufacturer and you are adding a step to your production line, you can model it and add it with minor impact.
  2. Extensibility. Reinforcement learning is a powerful computational framework that enables agents to learn from their experiences in order to optimise performance. This framework combines predictive analytics with combinatorial optimisation and active exploration of a dynamic environment in a way that allows you to solve many of your most difficult data science problems.
  3. Exploration and exploitation balance. When there is too much exploration, we try things out randomly and don’t learn from the experience. When there is too much exploitation, we learn nothing new!

Reinforcement learning can maintain a balance between the two. Exploitation delivers results faster, but it may settle for a mediocre option. Exploration takes longer, but because it tries options that would otherwise never be chosen, it tends to uncover better results.

By balancing exploration and exploitation, reinforcement learning can learn quickly without locking into a poor strategy. This can result in better performance and efficiency than approaches that rely on only one of the two.
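The exploration and exploitation trade-off can be illustrated with an epsilon-greedy agent on a three-option "bandit" problem. This is a minimal sketch: the payout probabilities, the step count and the function name are assumptions made for illustration, and epsilon-greedy is only one of several ways to strike the balance.

```python
import random

ARM_PAYOUT_PROBS = [0.2, 0.5, 0.8]  # hypothetical success rates, unknown to the agent

def run_bandit(epsilon, steps=5000, seed=0):
    """Return (average reward, pulls per option) for an epsilon-greedy agent."""
    rng = random.Random(seed)
    n_arms = len(ARM_PAYOUT_PROBS)
    estimates = [0.0] * n_arms   # running average reward seen for each option
    pulls = [0] * n_arms
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                            # explore at random
        else:
            arm = max(range(n_arms), key=lambda i: estimates[i])   # exploit best estimate
        reward = 1.0 if rng.random() < ARM_PAYOUT_PROBS[arm] else 0.0
        pulls[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total_reward += reward
    return total_reward / steps, pulls

for eps in (0.0, 0.1, 1.0):
    avg, pulls = run_bandit(eps)
    print(f"epsilon={eps}: average reward {avg:.2f}, pulls per option {pulls}")
```

With epsilon set to 0 the agent only exploits and, in this sketch, never leaves the first option it tries. With epsilon set to 1 it only explores and ignores what it has learned. A small value such as 0.1 lets it find the best option and then spend most of its time using it.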

Final thoughts

Many believe deep learning to be one of the biggest breakthroughs in the history of AI. That doesn’t mean it’s going to solve all our problems. We still have countless challenges ahead of us.

However, we must admit that the basic idea is fascinating: a computer program iteratively improves its performance on a task by trying different actions and learning from the resulting feedback.

Reinforcement learning is the cherry on the AI cake. It’s something that we haven’t fully had yet, but it’s been a long time coming and it will forever change the way we build software.
