You’ve probably heard the term “neural networks” somewhere before.

Even if you’re a complete newbie to machine learning.

The recent explosion of popular AI tools like ChatGPT has made the term a popular buzzword in articles and discussions. So what are they?

If you’ve read my introductory post on machine learning, you’ll know that machine learning is all about computers learning by themselves.

Well, neural networks are like the digital brains that make that happen.

Heck, they even have “neural” in their name.

Neural Networks are a super important topic in ML, and something you need to understand.

Let’s conduct a bit of digital brain surgery.

🤖 What Are Neural Networks, Actually?

What is this gorilla missing?

Some see a gorilla. I see machine learning without neural networks.

The blank, derpy stare and conspicuous loading icon are dead giveaways.

Obviously, it’s missing its brain.

That poor gorilla is actually a decent representation of what machine learning would be without neural networks.

At its core, a neural network is a computer system that learns patterns by passing data through layers of interconnected nodes.

Kinda sounds like a brain, right?

Sure enough, the human brain is the real source of inspiration behind neural networks.

Neural Networks: “We’re not so different, you and I.”

🧠 Human Brain vs Neural Networks

Our brains pass electrical signals between billions of interconnected cells called neurons. Learning happens when these connections (synapses) get stronger with repeated use, allowing our brains to find patterns.

A Neural Network (let’s call it NN) is a simplified model of this process.

It uses artificial neurons (called nodes) that are also interconnected. However, instead of strengthening synapses, the network learns by adjusting the importance of its connections based on the data it’s given.

This allows it to learn, much like an actual human brain!

⚙️ How Neural Networks Work

Here’s an illustration of a typical NN:

Those circles are the neurons, called nodes. They take in data and spit out new information based on that data.

The lines connecting the nodes are weights. They control how strongly one node influences the next.

A NN is typically made up of three types of layers:

  • Input Layer

    • This is the layer where data comes in (e.g., pixels from an image or words in a sentence). There is always just one input layer in a NN.

  • Hidden Layer(s)

    • This is where all the real “thinking” happens. There can be as many hidden layers as you want! Each hidden layer transforms the inputs into something more meaningful — like going from pixels → edges → shapes → “mean cat face”.

  • Output Layer

    • This is the final decision-maker. Like the input layer, there’s only one output layer. After all the hidden layers are done with the data, the output layer gives you the result!

Example: Let’s say your alarm clock says 6:00 am. That’s your input layer. Hidden layers then determine whether you’re actually getting out of bed. The output layer decides: snooze button.
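To make those layers concrete, here’s a minimal sketch in Python with NumPy. The sizes (4 inputs, 3 hidden nodes, 1 output) and the random numbers are made up purely for illustration; a real network would learn its connection strengths from data.

```python
import numpy as np

# Toy network: 4 inputs -> 3 hidden nodes -> 1 output (sizes are arbitrary)
rng = np.random.default_rng(0)

x = rng.random(4)        # input layer: 4 raw values coming in
W1 = rng.random((3, 4))  # connections (weights) from input to hidden layer
W2 = rng.random((1, 3))  # connections from hidden layer to output layer

hidden = W1 @ x          # hidden layer transforms the inputs
output = W2 @ hidden     # output layer produces the final result

print(output)
```

Notice two ingredients are still missing here: biases and an activation function. Those come next.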

🔗 Weights & Biases

As I said earlier, the weights are the connections between the nodes. They gauge how important each piece of data is: “Should I take this piece of info seriously, or meh?”

Then there are biases.

Biases are like the NN’s hunches: little intuitions that shift the outcome. They’re additional learnable parameters that help the NN adjust its output, giving it the flexibility to fit the data.
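Here’s what a single node computes, as a tiny sketch (the numbers are invented for the example): it multiplies each input by its weight, adds everything up, then nudges the total with its bias.

```python
import numpy as np

# One hypothetical node with three incoming connections
inputs = np.array([0.5, -1.2, 3.0])   # data arriving at the node
weights = np.array([0.8, 0.1, -0.4])  # how seriously to take each input
bias = 2.0                            # the node's "hunch", nudging the total

# The node boils everything down to one number: (weights . inputs) + bias
z = np.dot(weights, inputs) + bias
print(z)  # ~1.08 -- the raw signal, before any activation function
```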

🗝️ Activation Function

The activation function plays an important role in NNs. It’s kind of like the bouncer at a club.

After a node adds up all the weighted inputs + bias, the activation function decides: “Should this signal get through or not? And if yes, how strong?”

More formally, an activation function is a mathematical function applied to a node’s weighted sum of inputs plus its bias. Its job is to introduce non-linearity into the network, which lets it learn complex, curvy relationships in data that can’t be represented by a straight line.

Basically, it turns raw numbers into meaningful signals the network can actually learn from.

Here’s how an activation function looks in a node:

Image Credit: GeeksforGeeks

Now, there are different types of activation functions available, such as ReLU, Sigmoid, etc. (all of which I’ll cover in the future 😉).
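To show what the bouncer actually does, here’s a small sketch of those two common activation functions, applied to the raw signal z from the weights-and-biases example above:

```python
import numpy as np

def relu(z):
    # ReLU: block negative signals, let positive ones through unchanged
    return np.maximum(0, z)

def sigmoid(z):
    # Sigmoid: squash any signal into a value between 0 and 1
    return 1 / (1 + np.exp(-z))

z = 1.08               # the weighted sum + bias from earlier
print(relu(z))         # 1.08  -> let through as-is
print(sigmoid(z))      # ~0.75 -> squashed into a confidence-like score
```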

⭐️ Conclusion

Ultimately, a NN is just a bunch of fancy math trying to mimic how our own brains learn.

It's not magic, just a clever way of having a computer play a massive game of "hot and cold," adjusting its internal knobs and dials until it gets the right answer.
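To make that hot-and-cold game concrete, here’s a toy sketch in plain Python (all the numbers are made up): a single knob, one weight, gets nudged over and over until the guess lands on the right answer.

```python
weight = 0.0           # the knob the network can turn
x, target = 2.0, 10.0  # one input and the answer we want

for step in range(20):
    guess = weight * x         # the network's current answer
    error = guess - target     # how far off (how "cold") we are
    weight -= 0.1 * error * x  # turn the knob toward "hot"

print(weight * x)  # ~10.0 -- the knob has settled near the right answer
```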

While we’re still far from true AI, it's pretty incredible to watch what these things can do.
