Compound Effects

Why do we have a poor intuition for processes that unfold non-linearly? How can we leverage compound effects in order to spiral ourselves upwards in terms of health, wealth and knowledge? Let’s explore.

Physical World

Many of our intuitions are rooted in the physical world. When we roll a ball gently across the floor and it disappears behind a couch, we know it will reappear on the other side. It may roll a bit more slowly and eventually come to a halt. However, we certainly don’t expect that ball to keep accelerating and shoot through the outer wall, across the yard and into the neighbor’s house.

It’s not that we never expect something to accelerate: if you jump out of a plane you expect to accelerate initially, falling faster and faster. However, that only lasts for about a dozen seconds. After that you reach terminal velocity, and from then on you keep hurtling towards the earth’s surface at a constant speed.

Miracles and Catastrophes

We don’t possess a good intuition for things that keep accelerating, for the simple reason that this does not happen in the physical world in a way we can easily observe or experience directly. When we do observe the outcome of such an accelerating process, we often label it either a miracle or a catastrophe.

Consider our main building block: a single cell. It divides, and after twenty generations of dividing it gives rise to about a million cells. Roughly twenty-five more generations after that and there are enough cells, tens of trillions, to make a human being like yourself. Although this process is, to an extent, scientifically explainable, many who observe it still label it a miracle.
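For the skeptical reader, here is a quick check of the doubling arithmetic; keep in mind that the human cell count of tens of trillions is itself an estimate.

# Doubling arithmetic behind the cell example.
for generations in (20, 45):
    print(generations, "generations of doubling ->", f"{2 ** generations:,}", "cells")
# 20 generations -> 1,048,576 cells (about a million)
# 45 generations -> 35,184,372,088,832 cells (tens of trillions,
# in the same ballpark as the number of cells in an adult human body)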

In contrast: we can create nuclear energy by a controlled chain reaction of splitting the nuclei of atoms. When left uncontrolled this can lead to a nuclear explosion, which we label a catastrophe.

In both cases the outcome, a human being or an explosion, is something we can observe and reason about. Nevertheless, even knowing the facts, it still feels surprising. It does not feel intuitive.

A Rational Example

To explore this a bit further, let’s look at a practical example to test your intuition. Say I give you a choice between three options: (a) I give you ten euros now, which I guarantee will grow by five percent every day for the next four months, (b) I give you ten euros now and another ten euros for every day of the next four months, or (c) I give you a thousand euros now, and that’s it. Think about it for a moment: what would you do? Read it back, reason and pick option a, b or c.

Now that you have picked, let’s take a look at what your best option really was. Starting with the last option (c), the 1000 euros in your hand: that really is the best choice for a little more than the first three months. It is then surpassed by the middle option (b), ten euros for every day, which tops out after four months at about 1200 euros. Spectacularly, a few days before that crossover, the linear growth of (b) is itself overtaken by option (a), which grows exponentially at five percent every day. In fact the first option explodes and balloons to nearly 3500 euros by the time the four months have passed! Was this in line with your expectations?
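If you want to check these numbers yourself, here is a minimal sketch, taking four months as roughly 120 days:

# Compare the three options over roughly four months (120 days).
days = 120
option_a = 10.0     # ten euros growing five percent per day
option_b = 10.0     # ten euros now, plus ten euros every day
option_c = 1000.0   # a thousand euros, once

for day in range(days):
    option_a *= 1.05
    option_b += 10

print(round(option_a), round(option_b), round(option_c))
# -> roughly 3489, 1210 and 1000 after 120 days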

Interestingly, the option that ends up best performs quite poorly during the first three months; in fact, quite a bit worse than both other options. It is only after considerable time that the exponential approach really starts to pay off, and when it does, it pays off big time.

Seeing exponential effects plotted this way can help foster a more intuitive grasp of them, something that is much harder to get from a description alone. Let’s dive a bit deeper into the applied implications of this exponential curve.

Money

The effect of making more money with money is called compounding. The idea is that you start with some initial amount, called the principal, and receive interest on it at the end of a time period. When you add that interest back to the principal and keep repeating this cycle, as we did in the graph above, it is called compound interest.
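In code, compounding is nothing more than applying the growth factor once per period; a minimal sketch:

def compound(principal, rate, periods):
    # The interest is added back to the principal every period,
    # so after n periods the principal has grown by a factor (1 + rate) ** n.
    return principal * (1 + rate) ** periods

print(round(compound(10, 0.05, 120)))   # the roughly 3500 euros from the example above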

Whether you know it or not, you are heavily relying on this effect if you take part in any sort of pension scheme, have money in a savings account or are holding onto investments. The idea is that if you have money you can lend it out to others. For this you get compensated: either by interest paid on the loan you provide, or with dividends or increased stock value in case of investments. Either way: you are making money with money.

It is important to realize that the flip side also holds: if you borrow money, you pay interest to whoever provides it. In turn that means you can spend less. Thus, the exponential curve can bend upwards, but it can similarly bend downwards. This also explains why people who have a lot of debt more easily spiral down into even more debt.

So far we have covered familiar territory, but now I ask you to consider that the same thing that applies to money, also applies to your habits and skills.

The Direction of Habits and Choices

Let’s look at two simple habits. Firstly, brushing your teeth. Spiraling downwards: if you forget to brush your teeth for a day, you’re probably okay. However, if you don’t do it for a year, the exponential effect of bacteria feasting in your mouth will likely cause significant decay of your teeth. In addition to that direct negative effect, there are other collateral ones too. Just think of the social implications of not brushing your teeth for such a long time. A downward spiral thus pulls down other things in its wake.

Spiraling upwards: if you read from a book every day, you’re likely to finish quite a few books in a year. You acquire knowledge even if you remember only a fraction of what you read. The real impact, however, comes later: if you keep doing this consistently, you start making connections between concepts you learned previously, yielding non-linear gains in knowledge.

Skill Acquisition: A brief diversion into learning

Interpreting a post like this requires the skill of reading. While you probably don’t remember it, reading was incredibly difficult at first. The foundations for this skill were created from the very first time you heard anything. Further growth relied heavily on your environment. You only later learned to link sounds to symbols. Learning to do this consistently and growing a vocabulary large enough to read a text like this took many years. However, currently you are probably not exerting conscious effort to read the letters, or to understand the sentences.

Most adults find learning something new very challenging. One reason for this is that initial progress is usually slow, which can be quite discouraging. However, this slow growth is entirely to be expected: like the ten euros growing very slowly during the first months in our money example.

Unfortunately, many people simply give up too early, perhaps thrown off course by their linear expectation of returns. This happens especially with the effort they put in early in the process, when the return on the time spent is still fairly low. After all, when picking up something new you first need to master the basics. Getting through that stage can be tough. There is no quick fix for this.

The Shape of Learning

When you are learning something new, you should not expect linear gains for the time you put in. Rather, when you are consistent and stick with it, you will see a large jump in competence every so often. This is somewhat similar to the compound interest effect. Let’s look at an example learning curve.

As I alluded to, learning curves share some similarity with the compound effect, but they are certainly not identical. Learning is not a smooth process that continues forever for a specific skill. Rather, as shown in the example graph above, it is full of plateaus and regressions, and it tapers off at a certain point. Beyond this point more and more time needs to be invested to lift the skill to a higher level. We can see this if we ignore the details and look at the curve from a distance. This reveals an S-shaped curve. The learning plateaus form smaller S-curves inside a larger S-curve.
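To make the S-shape concrete, here is a minimal sketch that evaluates a logistic curve, one common way to model such a curve; the parameters are purely illustrative:

import math

def skill_level(hours, ceiling=100, midpoint=50, steepness=0.15):
    # Logistic curve: a slow start, acceleration around the midpoint,
    # and a plateau as the ceiling is approached.
    return ceiling / (1 + math.exp(-steepness * (hours - midpoint)))

for hours in (0, 25, 50, 75, 100):
    print(hours, round(skill_level(hours), 1))
# Competence crawls at first, jumps in the middle, then levels off.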

Looking at learning through these curves is useful, as they make both accelerating and decelerating effects visible. However, they are limited to specific skills that have a clear path towards mastery. The interplay between different skills and the fluidity of many fields limits their real-world application. After all, even if you can identify every letter on a page flawlessly, that does not mean you actually understand what you read. And even if you do understand a text at some level, you may not understand it at all the levels the author intended to convey.

After this brief diversion into learning, let’s return to the topic of habits and choices and how we can more practically apply compound effects there.

Choices

We are continuously presented with the challenge of making decisions. The cumulative effect of all the small choices we make can cause us to remain level, spiral upwards or spiral downwards.

Consider three areas: things that you are, things that you have and things that you know. In all these areas you can make daily choices that snowball you in a direction with a positive or negative outcome. An important precondition is that you own the outcome itself. After all, outcomes are the result of many small decisions that you yourself made in the past.

These choices really are moment-to-moment things. Do you go to the gym or stay at home? Will you impulsively buy something you see in an ad, or stick to your financial plan and budget? Do you stay at the mediocre job you have or find a better one where you can develop new skills?

What is right for you is what aligns with your personal goals. These goals can be in the areas of physical and mental health, financial and job security, and acquisition of knowledge and skills.

Goals

Once you decide on some area to improve, it is time to make a plan and stick with it. Here is where most people are too ambitious about what they can achieve in the short term. They choose process goals that are hard to keep up, like going to the gym every day, living in an unusually spartan way, or overloading their brains with information.

Instead it may be better to choose very modest process goals that you can keep up over a long time: go to the gym twice a week and eat a hundred calories less per day, transfer five percent of your income to a savings account automatically, or choose one online course and spend at most an evening per week completing it. As we have seen, many exponential effects are the result of doing something consistently over a long period of time.

Conclusion

If you start doing something consistently, your progress at first will be slow. However, after some time the effect of whatever you do will start compounding. This process is not intuitive. The outcome will surprise you, even if you understand the concept of compounding rationally.

The effect of compounding is that you will start to spiral either up or down in any given area, based on the many small choices that you make. Setting modest process goals helps create long-term consistency, which will in turn lead to a noticeable compound effect in your outcomes.

If you are not satisfied with where you are in some specific areas of your life: find places where you can leverage compound effects by making small, consistent process changes. Take ownership of the result by setting clear goals and tracking your progress. Then enjoy watching your outcomes spiral upwards.

References

  1. Hardy, D. (2010). The Compound Effect.


Chronoception

I once took part in a class where the instructor performed an interesting experiment. He asked us all to close our eyes, and then to raise our hands and open our eyes when each of us thought a minute had passed. Afterwards he would tell us how far we were off. To my amazement there was quite some variation, with some people raising their hands quite early, some quite late, and some nearly spot on. This was not a test of aptitude at timing; it was a test of a specific type of perception: chronoception.

I remember being quite bored at times as a child. Many mundane things seemed to take very long. Yet the older I have become, the faster time seems to pass. Asking around, I found out that I am not the only one with that experience. During that class I raised my hand slightly later than the one-minute mark. Now, many years later, I am convinced that if I took the test again, I would raise my hand quite a bit later still.

Time of course passes at a steady rate for everyone, that is: time in the physical world. However, that is not the same rate at which time appears to pass: our chronoception. How do physical and perceived time relate? Let’s dive deeper.

Fraction of Life Argument

When you were one, that one year represented one hundred percent of your life. When you turned two, the first year constituted half of your life, and so did the second. Following this logic, by the time you turn eighteen, that eighteenth year adds only about five and a half percent to your life up to that point.

Going further ahead in time, the hundredth year of your life would add only one percent. The basic idea of this fractional argument is that each additional year you live is a smaller fraction of your life. If we discount anything that happens before age five, as most people have little recollection of that period, and look at this strictly numerically, we get the graph shown below.

Life in Quarters: relative age as we get older [5]

Let’s interpret: according to this graph, your teenage years are roughly as long as your twenties and thirties combined.
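A minimal sketch of this fractional argument, assuming each year is felt in proportion to one over your age and loosely treating the teenage years as ages ten through nineteen:

def felt_duration(start_age, end_age):
    # Each year of age contributes 1/age to perceived time.
    return sum(1 / age for age in range(start_age, end_age))

teens = felt_duration(10, 20)               # ages 10 through 19
twenties_thirties = felt_duration(20, 40)   # ages 20 through 39
print(round(teens, 2), round(twenties_thirties, 2))
# Both come out at roughly 0.7 'perceived units', which is where the claim
# that the teenage years feel as long as the twenties and thirties combined comes from.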

Although mathematically attractive, there are some problems with this perspective. Consider that this theory implies that time at age ten seems to pass five times more slowly than time at age fifty, and that is not quite what seems to happen. A ten-year-old does not see her fifty-year-old uncle respond in slow motion, and conversely, the fifty-year-old uncle does not see his ten-year-old niece dart around five times more quickly. Of course there are differences in time perception, but a fivefold difference seems like a stretch.

In addition to this, there is one other major problem with the fractional argument: it does not accurately represent perceived time, because perceived time does not pass at a constant rate. Our chronoception is variable, as we’ll see next.

Flow Control

Waiting in line at the supermarket, particularly when you are in a hurry, seemingly takes forever. You notice the old lady fidgeting with her hands to get the cash out of her wallet, then the kid that just cannot seem to stop screaming, followed by someone nervously tapping his foot next to you. However, when you finally exit the supermarket and drive home, taking that quieter route you know all too well, time passes very quickly.

Gears of Time by Majentta: https://www.deviantart.com/majentta

This example already shows that the perception of time is relative to what occurs around us. When we are bored or blocked, time seems to slow down. Contrast this with when we perform routine tasks or are deeply engaged in something: time seems to fly by. So it is easy to disprove the fractional argument on a moment-to-moment basis, and in fact this holds for longer spans of time as well.

Novelty

There is a difference between how we experience time in the moment and how we remember it when we look back. Waiting in line seemingly takes forever in the moment, but after a day or two, in hindsight, it was really just a very small part of that day.

In a similar vein: holidays always seem to go by very quickly. At least: that is what many conclude as soon as theirs are over. However, during your holiday, time actually seems to slow down. There is a good reason for that: new experiences.

In your daily life you see many of the same things every day, you do many of the same routine tasks every day, and if you enjoy your work you are likely quite engaged in it. In this day-to-day life you have become highly skilled at filtering out distractions. Contrast this with your vacation, where you have to do all kinds of non-routine tasks just to get to your destination, and then have entire days to fill by yourself.

If on those days you do all kinds of activities you do not usually do, that’s all novelty for your brain. These novel things take more mental processing power and occupy more mental space. Your filters don’t work there, and hence everything seems to last longer. This is noticeable in the moment, but also later, when you recount your novel experiences to others.

The reverse is also true: if you do not do anything on your holiday, you will experience boredom, which also makes time appear to pass more slowly, at least in the moment, though perhaps not in the retelling. Hence the benefit of holidays for altering your perception of time: whether you do something or just sit there, either way it slows down time as you experience it in the moment.

Excitement

This same phenomenon of things seeming to take much longer than they actually do also occurs when something exciting is physically happening. People can wildly overestimate the actual time such an event took.

I once had the genius idea to step into a wooden roller coaster, after not having been in one for many years and not remembering how much I actually disliked such experiences. While the cars were being pulled up, I started to remember that roller coasters were not a pleasant experience for me, but by then it was too late. As the cars were released at the apex and my stomach started to turn, I had no option but to endure it. That ride probably took only a minute or two, but it seemed way longer than that.

The Brain

Like anything in the reality you experience, time perception is a construct of your brain. And as your body becomes less agile with age, so does your brain. In fact your brain spends the most energy on perceiving new things when you are about five years old, and this tapers off from that point onward.

Consider that as you get older, you have had more opportunity to learn. The more you learn, the more complex the networks in your brain that represent what you have soaked up. The size and complexity of these webs of connected neurons increase, which leads to longer paths that signals need to traverse.

When these paths themselves start to age, they degrade, giving more resistance to the flow of signals. This causes the rate at which mental images are acquired and processed to decrease as you get older: chronoception changes. Since your brain perceives fewer new images in the same amount of time, it seems as though time is passing more quickly, while in fact it is your own brain slowing down. This is an interesting form of perceptual relativity: the world around you is not going faster, you are going slower relative to it.

Your brain also becomes better at filtering out signals irrelevant to whatever you are doing. This shows, for example, when something small changes in an environment you have been in for a long time. It is very common not to notice the change for a while, since you have tuned out certain details of your surroundings. The net effect is not only that you see fewer images, but that you also see less detail in those images. A complete change of environment can of course work wonders here.

Conclusion

We know that the older we get, the faster time seems to pass, but the question is: by how much? We know that for people in their early twenties, physical time and chronoception are almost equal: they experience time approximately as it passes in physical reality. Seniors between sixty and eighty are off in their estimates by approximately twenty to twenty-five percent.

This leads me to a rough rule of thumb: what on average feels like a week to a twenty-year-old feels like about five and a half days to a senior. However, that is an average; it fluctuates strongly with the moment-to-moment experience.

Like anything you experience, chronoception too is a construct of your brain. It seems that as we get older we literally, gradually, lose track of time. One of the few ways to mitigate this to some extent is to expose yourself to novelty in any form; in short: go to new places, learn new things and meet new people. But most of all: enjoy your time.

References

  1. Kingery, K. (2019). It’s Spring Already? Physics Explains Why Time Flies as We Age.
  2. Muller, D. (2016). Why Life Seems to Speed Up as We Age.
  3. Livni, E. (2019). Physics explains why time passes faster as you age.
  4. Haden, J. (2017). Science Says Time Really Does Seem to Fly as We Get Older.
  5. Bonwit, H. (2012). Time Dilation & Back to the Future.
  6. Kiener, M. (2015). Why Time Flies.
  7. Spencer, B. (2017). Time Perception.


The Machine Learning Myth

“I was recently at a demonstration of walking robots, and you know what? The organizers had spent a good deal of time preparing everything the day before. However, in the evening the cleaners had come to polish the floor. When the robots started, during the real demonstration the next morning, they had a lot of difficulty getting up. Amazingly, after half a minute or so, they walked perfectly on the polished surface. The robots could really think and reason, I am absolutely sure of it!”

Somehow the conversation had ended up here. I stared with a blank look at the lady who was trying to convince me of the robot’s self-awareness. I was trying to figure out how to tell her that she was a little ‘off’ in her assessment.

As science-fiction writer Arthur C. Clarke said: any sufficiently advanced technology is indistinguishable from magic. However, conveying this to someone with little knowledge of ‘the magic’, other than that gleaned from popular fiction, is hard. Despite trying several angles, I was unable to convince this lady that what she had seen the robots do had nothing to do with actual consciousness.

Machine learning is all the rage these days, demand for data scientists has risen to similar levels as that for software engineers in the late nineties. Jobs in this field are among the best paying relative to the number of years of working experience. Is machine learning magic, mundane or perhaps somewhere in between? Let’s take a look.

A Thinking Machine

Austria, 1770. The court of Maria Theresa buzzes. The chandeliers mounted on the ceiling of the throne room cast long shadows on the carpeted floor. Flocks of nobles arrive in anticipation of the demonstration about to take place. After a long wait, a large cabinet is wheeled in, topped by something attached to it: what looks like a human-sized doll. Its arms lean over the top of the cabinet, and in between those puppet arms is a chess board.

The Mechanical Turk by Wolfgang von Kempelen

The cabinet is accompanied by its maker, Wolfgang von Kempelen. He opens the cabinet doors, which reveal cogs, levers and other machinery. After a dramatic pause he reveals the true name of this device: the Turk. He explains it is an automated chess player and invites Empress Maria Theresa to play. The crowd sneers and giggles. However, scorn turns to fascination as Maria Theresa’s opening chess move is answered by the Turk’s mechanical hand with a clever counter move.

To anyone in the audience the Turk looked like an actual thinking machine. It would move its arm just like people do, it took time between moves to think just like people do, it even corrected invalid moves of the opponent by reverting the faulty moves, just like people do. Was the Turk, so far ahead of its time, really the first thinking machine?

Unfortunately: no, the Turk was a hoax, an elaborate illusion. Inside the cabinet a real person was hidden. A skilled chess player, who controlled the arms of the doll-like figure. The Turk shows that people can see and have an understanding of an effect, but fail to correctly infer its cause. The Turk’s chess skills were not due to its mechanics. Instead they were the result of clever concealment of ‘mundane’ human intellect.

The Birth of Artificial Intelligence

It would take until the mid-1950s before a group of researchers at Dartmouth started the field of artificial intelligence. They believed that if the process of learning something can be described precisely enough, in sufficiently small steps, a machine should be able to execute those steps as well as a human can. Building on ideas that had emerged in the preceding years, the group set out to lay some of the groundwork for breakthroughs to come in the ensuing two decades. Those breakthroughs did indeed come, in the form of path finding, natural language understanding and even mechanical robots.

Arthur Samuel

Around the same time, Arthur Samuel of IBM wrote a computer program that could play checkers. Computer-based checkers opponents had been developed before, but Samuel’s program could do something unique: adapt based on what it had seen before. Roughly, it stored the moves that had led to won games in the past and replayed those moves in appropriate situations in the current game. Samuel referred to this self-adapting process as machine learning.

What, then, really is the difference between artificial intelligence and machine learning? Machine learning is best thought of as a more practically oriented subfield of artificial intelligence. With mathematics at its core, it can be viewed as a form of automated applied statistics. At the heart of machine learning is finding patterns in data and exploiting them to automate specific tasks: finding people in a picture, recognizing your voice or predicting the weather in your neighborhood.

In contrast, at the core of artificial intelligence is the question of how to make entities that can perceive the environment, plan and take action in that environment to reach goals, and learn from the relation between actions and outcomes. These entities need not be physical or sentient, though that is often the way they are portrayed in (science) fiction. Artificial intelligence intersects with other fields like psychology and philosophy, as discussed next.

Philosophical Intermezzo: Turing and the Chinese room

Say a machine can convince a real person that it – the machine – is a human being. By this very act of persuasion, you could say the machine is at least as cognitively able as that human. This famous claim was made by mathematician Alan Turing in the early fifties.

Turing’s claim did not sit well with philosopher John Searle. He proposed the following thought experiment: imagine Robert, an average Joe who speaks only English. Let’s put Robert in a lit room with a book and two small openings for a couple of hours. Through the first opening, people pass in slips of paper with questions written in Chinese: the input. Through the second opening, Robert deposits the answers to these questions on new slips, also in Chinese: the output.

Searle’s Chinese Room

Robert does not know Chinese at all. To help him, there is a huge book in this ‘Chinese’ room. In it he can look up which symbols he needs to write on the output slips, given the symbols he sees on the input slips. Searle argued that no matter how many slips Robert processes, he will never truly understand Chinese. After all, he is only transforming input questions into output answers without understanding the meaning of either. The book does not ‘understand’ anything either, as it contains just a set of rules for Robert to follow. So this Chinese room as a whole can answer questions, yet none of its components actually understands or can reason about Chinese!

Replace Robert in the story with a computer and you get a feeling for what Searle is trying to point out. While a translation algorithm may be able to translate one language into another, being able to do that is insufficient for really understanding the languages themselves.
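In programming terms, the room is nothing more than a lookup table being followed mechanically; a contrived sketch, where the ‘rule book’ entries are of course made up for illustration:

# A toy 'rule book': it maps input symbols directly to output symbols,
# with no notion of what either side means.
rule_book = {
    "你好吗？": "我很好，谢谢。",      # "How are you?" -> "I am fine, thank you."
    "你会说中文吗？": "会。",          # "Do you speak Chinese?" -> "Yes."
}

def chinese_room(question):
    # Robert, or a computer, just follows the rules mechanically;
    # nothing in here 'understands' Chinese.
    return rule_book.get(question, "请再说一遍。")   # "Please say that again."

print(chinese_room("你好吗？"))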

The root of Searle’s argument is that knowing how to transform information is not the same as actually understanding it. Taking that one step further: in contrast with Turing, Searle’s claim is that merely being able to function on the outside like a human being is not enough for actually being one.

The lady I talked with about the walking robots had a belief. Namely, that the robots were conscious based on their adaptive response to the polished floor. We could say the robots were able to ‘fool’ her into this. Her reasoning is valid under Turing’s claim: from her perspective the robots functioned like a human. However, it is invalid under Searle’s, as his claim implies ‘being fooled’ is simply not enough to prove consciousness.

As you let this sink in, let’s get back to something more practical that shows the strength of machine learning.

Getting Practical with Spam

In the early years of this century spam emails were on the rise, and there seemed to be no defense against the onslaught. That was also the experience of Paul Graham, a computer scientist whose mailbox was overflowing like everyone else’s. He approached the problem like an engineer: by writing a program. His program looked at an email’s text and filtered out messages that met certain criteria. This is similar to making filter rules to sort emails into different folders, something you have likely set up for your own mailbox.

Graham spent half a year manually coding rules for detecting spam. He found himself in an addictive arms race with the spammers, each trying to outsmart the other. One day he figured he should look at the problem in a different way: using statistics. Instead of coding rules by hand, he simply labeled each email as spam or not spam. This resulted in two labeled sets of emails, one consisting of genuine mails, the other only of spam.

He analyzed the sets and found that spam emails contained many typical words, like ‘promotion’, ‘republic’, ‘investment’, and so forth. Graham no longer had to write rules manually. Instead he approached this with machine learning. Let’s get some intuition for how he did that.

Identifying Spam Automagically

Imagine that you want to train someone to recognize spam emails. Your first step is to show the person many examples of emails, some labeled genuine and some labeled spam. That is, for each of these emails the person is told explicitly whether it is genuine or spam. After this training phase, you put the person to work classifying new, unlabeled emails. The person assigns each new email a label, spam or genuine, based on its resemblance to the emails seen during the training phase.

Replace the person in the previous paragraph with a machine learning model and you have a conceptual understanding of how machine learning works. Graham took both email sets he had created: one consisting of spam emails, the other of genuine ones. He used these to train a machine learning model that looks at how often words occur in text. After training, he used the model to classify new incoming mail as either spam or genuine. The model would, for example, mark an email as spam if the word ‘promotion’ appeared in it often, a relation it ‘learned’ from the labeled emails. This approach made his hand-crafted rules obsolete.
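To get a feel for what such a model might look like, here is a deliberately crude sketch. Graham’s actual filter computed per-word probabilities; this version just counts how often each word occurs in spam versus genuine training mails, and the tiny training set is made up purely for illustration:

from collections import Counter

# Made-up training data: (text, label) pairs.
training = [
    ("exclusive promotion claim your investment now", "spam"),
    ("promotion act now limited investment offer", "spam"),
    ("lunch tomorrow to discuss the project", "genuine"),
    ("meeting notes attached see project plan", "genuine"),
]

word_counts = {"spam": Counter(), "genuine": Counter()}
for text, label in training:
    word_counts[label].update(text.split())

def classify(text):
    # Score each label by how often its training mails contained the words of the
    # new mail, with +1 smoothing so unseen words do not zero out the score.
    scores = {}
    for label, counts in word_counts.items():
        total = sum(counts.values())
        score = 1.0
        for word in text.split():
            score *= (counts[word] + 1) / (total + 1)
        scores[label] = score
    return max(scores, key=scores.get)

print(classify("limited promotion on a unique investment"))   # -> spam
print(classify("updated project plan for the meeting"))       # -> genuine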

Your mailbox still works with this basic approach. By explicitly labeling spam messages as spam you are effectively participating in training a machine learning model that can recognize spam. The more examples it has, the better it will become. This type of application goes to the core of what machine learning can do: find patterns in data and bind some kind of consequence to the presence, or absence, of a pattern.

This example also reveals the difference between software engineering and data science. A software engineer builds a computer program explicitly by coding rules of the form: if I see this then do that. Much like Graham tried to initially combat spam. In contrast, a data scientist collects a large amount of things to see, and a large amount of subsequent things to do, and then tries to infer the rules using a machine learning method. This results in a model: essentially an automatically written computer program.

Software Engineering versus Machine Learning

If you would like a deeper conceptual understanding and don’t shy away from something a bit more abstract, let’s dive a little deeper into the difference between software engineering and machine learning. If not, feel free to skip ahead to the conclusion.

As a simple intuition you can think of the difference between software engineering and machine learning as the difference between writing a function explicitly and inferring a function from data implicitly. As a minimal contrived example: imagine you have to write a function f that adds two numbers. You could write it like this in an imperative language:

function f(a, b):
    c = a + b
    return c

This is the case where you explicitly code a rule. Contrast this with the case where you no longer write f yourself. Instead you train a machine learning model that produces the function f based on many examples of inputs a and b and output c.

train([(a, b, c), (a, b, c), (...)]) -> f

That is effectively what machine learning boils down to: training a model is like writing a function implicitly by inferring it from data. Afterwards, f can be used on new, previously unseen (a, b) inputs. If the model has seen enough examples, f will perform addition on those unseen inputs, which is exactly what we want. However, consider what happens if the model is fed only one training sample: the input (2, 2, 4). Since 2 * 2 = 4 and 2 + 2 = 4, training might just as well yield a function that multiplies its inputs instead of adding them!
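To make that concrete, here is a minimal sketch of such a train step, assuming numpy is available and restricting ourselves to functions of the form f(a, b) = w1*a + w2*b, so that ‘training’ is just an ordinary least-squares fit:

import numpy as np

def train(examples):
    # examples is a list of (a, b, c) triples; fit c ≈ w1*a + w2*b.
    inputs = np.array([(a, b) for a, b, _ in examples], dtype=float)
    outputs = np.array([c for _, _, c in examples], dtype=float)
    weights, *_ = np.linalg.lstsq(inputs, outputs, rcond=None)
    return lambda a, b: weights[0] * a + weights[1] * b

f = train([(1, 2, 3), (4, 5, 9), (2, 7, 9), (3, 3, 6)])
print(round(f(10, 32)))   # -> 42: the inferred function behaves like addition

# With too few examples the fit is underdetermined, which is the
# same pitfall as the single (2, 2, 4) sample described above.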

There are roughly two types of functions you can generate, corresponding to two types of tasks. The first, called regression, returns a continuous value, as in the addition example above. The second, called classification, returns a value from a limited set of options, as in the spam example; the simplest set of options being ‘true’ or ‘false’.

Effectively we are giving up precise control over the function’s exact definition. Instead we move to the level of specifying what should come out given what we put in. What we gain from this is the ability to infer much more complex functions than we could reasonably write ourselves. This is what makes machine learning a powerful approach for many problems. Nevertheless, every convenience comes at a cost and machine learning is no exception.

Limitations

The main challenge for using a machine learning approach is that large amounts of data are required to train models, which can be costly to obtain. Luckily, recent years have seen an explosion in available data. More people are producing content than ever, and wearable electronics yield vast streams of data. All this available data makes it easier to train useful models, at least for some cases.

A second caveat is that training models on large amounts of data requires a lot of processing power. Fortunately, rapid advances in graphics hardware have led to orders of magnitude faster training of complex models. Additionally, these hardware resources can be easily accessed through cloud services, making things much easier.

A third downside is that, depending on the machine learning method used, what happens inside the resulting model can be opaque. What does the generated function do to get from input to output? It is important to understand the resulting model, to ensure it behaves sanely on a wide range of inputs. Tweaking a model is more an art than a science, and the risk of harmful hidden biases in models is certainly real.

Applying machine learning is no replacement for software engineering, but rather an augmentation for specific challenges. Many problems are far more cost-effective to tackle with a simple set of explicitly written rules than by endlessly tinkering with a machine learning model. Machine learning practitioners are best off knowing not only the mechanics of each specific method, but also whether machine learning is appropriate to use at all.

Conclusion

Many modern inventions would look like magic to people from one or two centuries ago. Yet knowing how they really work shatters the illusion. The last decade has seen a huge increase in the use of machine learning to solve a wide range of problems. People unfamiliar with the underlying techniques often both under- and overestimate its potential and its limitations. This leads to unrealistic expectations, fears and confusion. From apocalyptic prophecies to shining utopias: machine learning myths abound where we are better served staying grounded in reality.

This is not helped by the naming. Many people associate the term ‘learning’ with the way humans learn, and ‘intelligence’ with the way people think. Though sometimes a useful conceptual analogy, that is quite different from what these methods currently do. A better name would have been data-based function generation, although that admittedly sounds much less sexy.

Nevertheless, machine learning at its core is not much more than generating functions based on input and, usually, output data. With this approach it can deliver fantastic results on narrowly defined problems. This makes it an important and evolving tool, but like a hammer for a carpenter: it is really just a tool. A tool that augments, rather than replaces, software engineering. Like a hammer is limited by laws of physics, machine learning is fundamentally limited by laws of mathematics. It is no magic black box, nor can it solve all problems. However, it does offer a way forward for creating solutions to specific real-world challenges that were previously elusive. In the end it is neither magic nor mundane, but is grounded in ways that you now have a better understanding of.

Resources

  1. Singh, S. (2002). Turk’s Gambit.
  2. Russell, S. & Norvig, P. (2003). Artificial Intelligence.
  3. Halevy, A. & Norvig, P. & Pereira, F. (2010). The Unreasonable Effectiveness of Data.
  4. Lewis, T. G. & Denning, P. J. (2018). Learning Machine Learning, CACM 61:12.
  5. Graham, P. (2002). A Plan for Spam.
  6. O’Neil, C. (2016). Weapons of Math Destruction.
  7. Stack Overflow Developer Survey (2019).
