Mathematics

February 28, 2025
in TLU, Biology, Mathematics, Python, Artificial Intelligence
3 min read

Introduction to Threshold Logic Unit (TLU): Biology, Mathematics, and Python Implementation

Introduction

Artificial neural networks are inspired by biological neurons. One of the simplest artificial neuron models is the Threshold Logic Unit (TLU), which forms the foundation of perceptrons and modern deep learning architectures. In this post, we explore its biological origins, its mathematical formulation, and a practical Python implementation. Don't be put off by the apparent complexity: the TLU is a good place to start.

Biological Inspiration

The TLU is inspired by the behavior of biological neurons. A neuron in the human brain receives inputs from other neurons through synapses. If the combined signal exceeds a certain threshold, the neuron fires and sends a signal to the next neuron.

In simple terms:

Neurons receive input signals.
Each input is weighted based on its importance.
If the weighted sum of inputs exceeds a threshold, the neuron activates.

This is a binary activation system, where the neuron either fires (1) or remains inactive (0).

Figure: TLU. (a) Biological neuron. (b) Artificial neuron.

Mathematical Model

A Threshold Logic Unit (TLU) is mathematically defined as:

\[ y = f(w_1 x_1 + w_2 x_2 + \dots + w_n x_n - \theta) \]

where:

\( x_i \) are the input values.
\( w_i \) are the corresponding weights.
\( \theta \) is the threshold.
\( f(z) \) is the step activation function:

\[ f(z) = \begin{cases} 1 & \text{if } z \geq 0 \\ 0 & \text{otherwise} \end{cases} \]

This function determines whether the neuron fires (1) or stays inactive (0).
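
As a small worked example (the numbers are chosen here for illustration and are not from the original post), take two inputs with weights \( w = (2, -1) \) and threshold \( \theta = 0.5 \):

\[ x = (1, 1): \quad z = 2 \cdot 1 + (-1) \cdot 1 - 0.5 = 0.5 \geq 0 \;\Rightarrow\; y = 1 \]

\[ x = (0, 1): \quad z = 2 \cdot 0 + (-1) \cdot 1 - 0.5 = -1.5 < 0 \;\Rightarrow\; y = 0 \]

Equivalently, the unit fires exactly when \( \sum_i w_i x_i \geq \theta \). A negative weight such as \( w_2 = -1 \) acts as an inhibitory input.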

Python Implementation

Let's implement a simple Threshold Logic Unit (TLU) in Python.

import numpy as np

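def step_function(x):
    # Heaviside step: fire (1) when the net input is non-negative
    return 1 if x >= 0 else 0

def TLU(inputs, weights, threshold):
    # Net input: weighted sum of the inputs minus the threshold
    weighted_sum = np.dot(inputs, weights) - threshold
    return step_function(weighted_sum)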

# Example usage:
inputs = np.array([1, 0, 1])   # Binary inputs
weights = np.array([0.5, 0.5, 0.5])  # Weight vector
threshold = 0.7

output = TLU(inputs, weights, threshold)
print(f"TLU Output: {output}")

Explanation:

The step_function(x) returns 1 if x is greater than or equal to zero, otherwise 0.
The TLU(inputs, weights, threshold) computes the weighted sum, subtracts the threshold, and applies the step function.
We test it with a simple example where inputs = [1, 0, 1], weights = [0.5, 0.5, 0.5], and threshold = 0.7. The weighted sum is 1.0, which is above the threshold, so the script prints TLU Output: 1.
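
To see how the same unit behaves across all possible inputs, the sketch below evaluates every binary input combination at once. The helper name `tlu_batch` is hypothetical and not part of the original code. The weights and threshold are those of the example above.

```python
import itertools
import numpy as np

def tlu_batch(X, weights, threshold):
    """Apply a TLU to every row of X (hypothetical helper, not from the post)."""
    # Net input per row: weighted sum minus threshold
    z = X @ weights - threshold
    # Step activation: 1 where z >= 0, else 0
    return (z >= 0).astype(int)

# All 8 combinations of three binary inputs
X = np.array(list(itertools.product([0, 1], repeat=3)))
weights = np.array([0.5, 0.5, 0.5])
threshold = 0.7

outputs = tlu_batch(X, weights, threshold)
for row, out in zip(X, outputs):
    print(row, "->", out)
```

With these values the unit fires only when at least two inputs are active, since a single weight of 0.5 stays below the threshold of 0.7.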

Visualization of Decision Boundary

A TLU is a linear classifier, meaning it can separate data with a straight decision boundary. Let's visualize this boundary using Matplotlib.

import matplotlib.pyplot as plt

def plot_tlu_decision_boundary(weight, threshold):
    x = np.linspace(-2, 2, 100)
    y = (-weight[0] * x + threshold) / weight[1]
    plt.plot(x, y, label="Decision Boundary")
    plt.axhline(0, color='gray', linestyle='--')
    plt.axvline(0, color='gray', linestyle='--')
    plt.xlim(-2, 2)
    plt.ylim(-2, 2)
    plt.xlabel("x1")
    plt.ylabel("x2")
    plt.legend()
    plt.title("TLU Decision Boundary")
    plt.show()

# Example decision boundary
weights = np.array([0.5, -0.5])
threshold = 0.2
plot_tlu_decision_boundary(weights, threshold)
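
As a quick numerical check of the plotted line, the sketch below samples points on the boundary and one point on each side of it. The function `net_input` is a hypothetical helper introduced here. The weights and threshold match the example above.

```python
import numpy as np

def net_input(point, weights, threshold):
    """Weighted sum minus threshold (hypothetical helper, not from the post)."""
    return float(np.dot(point, weights) - threshold)

weights = np.array([0.5, -0.5])
threshold = 0.2

# Points on the boundary satisfy w1*x1 + w2*x2 = threshold
x1 = np.linspace(-2, 2, 5)
x2 = (-weights[0] * x1 + threshold) / weights[1]
on_boundary = [net_input(p, weights, threshold) for p in zip(x1, x2)]

# One point on each side of the line
fires = net_input((1.0, 0.0), weights, threshold) >= 0   # 0.5 - 0.2 = 0.3
silent = net_input((0.0, 1.0), weights, threshold) >= 0  # -0.5 - 0.2 = -0.7

print(np.allclose(on_boundary, 0.0), fires, silent)
```

Points on the line have a net input of zero (up to floating-point error), and the sign of the net input tells which side of the line a point falls on.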

Limitations of TLU

While the Threshold Logic Unit is a fundamental concept, it has limitations:

It can only solve linearly separable problems (like AND and OR logic gates).
It fails for problems like XOR, which are not linearly separable.
More advanced models extend this idea: the perceptron adds a learning rule for the weights, and multi-layer neural networks add hidden layers and differentiable activation functions.
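
The separability limitation can be illustrated with a brute-force search. The sketch below tries every combination of two weights and a threshold on a coarse grid and reports the first one that reproduces each truth table. The function `find_tlu` and the grid are assumptions made for this illustration. A failed search on a finite grid is not a proof, only a demonstration.

```python
import itertools
import numpy as np

# Truth-table inputs for two binary variables
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
TARGETS = {
    "AND": [0, 0, 0, 1],
    "OR":  [0, 1, 1, 1],
    "XOR": [0, 1, 1, 0],
}

def find_tlu(target, grid):
    """Return the first (w1, w2, threshold) on the grid that reproduces target, or None."""
    for w1, w2, theta in itertools.product(grid, repeat=3):
        out = (X @ np.array([w1, w2]) - theta >= 0).astype(int)
        if out.tolist() == target:
            return (w1, w2, theta)
    return None

# Coarse grid of candidate values from -2 to 2 (an arbitrary choice for illustration)
grid = np.arange(-2, 2.5, 0.5).tolist()
solutions = {name: find_tlu(t, grid) for name, t in TARGETS.items()}
for name, sol in solutions.items():
    print(name, "->", sol)
```

The search finds parameters for AND and OR but none for XOR. The underlying reason is algebraic: XOR would require \( \theta > 0 \), \( w_1 \geq \theta \), \( w_2 \geq \theta \), and \( w_1 + w_2 < \theta \) at the same time, which is contradictory.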

Conclusion

The TLU is a great starting point to understand artificial neurons and the foundation of deep learning. Even though modern neural networks replace the hard step with activation functions such as sigmoid, ReLU, and tanh, which can be trained with gradient-based methods, understanding the TLU provides fundamental insights into how neurons process information.

In future posts, we will extend this idea to Perceptrons and Neural Networks.


August 5, 2024
in LLM, Python, Chatbot, Artificial Intelligence, Mathematics
4 min read

ChatBot Memory

Here we present an approach to short-term memory management in chatbots, combining storage techniques with automatic summarization to optimize the conversational context. The method relies on a dynamic memory structure that limits the size of the stored data while preserving essential information through summaries. This approach improves the fluidity of interactions and maintains contextual continuity during long dialogue sessions. In addition, asynchronous techniques keep memory-management operations from interfering with the chatbot's responsiveness.

Mathematical Model of Conversation Management

In this section, we formalize the management of conversation memory in the chatbot. The memory is structured as a list of pairs representing the exchanges between the user and the bot.

Structure of the Conversation Memory

The conversation memory can be defined as an ordered list of pairs \((u_i, d_i)\), where \(u_i\) is the user input and \(d_i\) the bot's response for the \(i\)-th exchange. This list is denoted \(\mathcal{C}\):

\[ \mathcal{C} = [(u_1, d_1), (u_2, d_2), \ldots, (u_n, d_n)] \]

where \(n\) is the total number of exchanges in the current history.

Updating the Memory

When a new exchange occurs, a new pair \((u_{n+1}, d_{n+1})\) is appended to the memory. If adding it would make the size of \(\mathcal{C}\) exceed a predefined maximum \(M_{\text{max}}\), the oldest exchange is removed:

\[ \mathcal{C} = \begin{cases} \mathcal{C} \cup \{(u_{n+1}, d_{n+1})\}, & \text{if } |\mathcal{C}| < M_{\text{max}} \\ (\mathcal{C} \setminus \{(u_1, d_1)\}) \cup \{(u_{n+1}, d_{n+1})\}, & \text{if } |\mathcal{C}| = M_{\text{max}} \end{cases} \]
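
The update rule above behaves like a fixed-size queue. As a minimal sketch, assuming the pairs are stored as Python tuples, `collections.deque` with `maxlen` gives the same eviction behavior. The names `M_MAX` and `add_exchange` are hypothetical, and the limit of 3 is chosen only to keep the output short.

```python
from collections import deque

M_MAX = 3  # illustrative limit, far smaller than a realistic setting

# A deque with maxlen drops the oldest item automatically when full
memory = deque(maxlen=M_MAX)

def add_exchange(memory, user_input, bot_response):
    """Append the pair (u, d); the oldest pair is evicted once the limit is reached."""
    memory.append((user_input, bot_response))

for i in range(1, 6):
    add_exchange(memory, f"question {i}", f"answer {i}")

print(list(memory))
```

After five exchanges only the three most recent pairs remain in memory.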

Word Count

To manage memory space and decide when compression is needed, we compute the total number of words \(W(\mathcal{C})\) in the memory:

\[ W(\mathcal{C}) = \sum_{(u_i, d_i) \in \mathcal{C}} (|u_i| + |d_i|) \]

where \(|u_i|\) and \(|d_i|\) are the number of words in \(u_i\) and \(d_i\), respectively.
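
A direct transcription of \(W(\mathcal{C})\), assuming words are separated by whitespace and the memory is a list of `(u, d)` string pairs, might look like the following. The function name `word_count` and the sample exchanges are made up for illustration.

```python
def word_count(memory):
    """Total number of whitespace-separated words over all (u_i, d_i) pairs."""
    return sum(len(u.split()) + len(d.split()) for u, d in memory)

memory = [
    ("Hello there", "Hi, how can I help?"),
    ("What is a TLU?", "A threshold logic unit."),
]
total = word_count(memory)
print(total)  # 2 + 5 + 4 + 4 = 15
```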

Memory Compression

When \(W(\mathcal{C})\) exceeds a threshold \(W_{\text{max}}\), the memory is compressed to keep the context relevant. Compression is performed by a summarization model \(\mathcal{S}\), such as BART:

\[ \mathcal{C}_{\text{compressed}} = \mathcal{S}(\mathcal{C}) \]

where \(\mathcal{C}_{\text{compressed}}\) is the summarized version of the memory, which reduces the total word count while preserving the essence of past interactions.
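
The threshold-triggered compression can be sketched independently of any particular model by passing the summarizer \(\mathcal{S}\) in as a function. In the sketch below, `maybe_compress` and `first_words` are hypothetical names. `first_words` is a trivial stand-in that simply truncates the text and is not a real summarizer. Storing the summary as a single pair with an empty user turn is also an assumption.

```python
def word_count(memory):
    """W(C): total words over all (u, d) pairs."""
    return sum(len(u.split()) + len(d.split()) for u, d in memory)

def maybe_compress(memory, w_max, summarize):
    """Return memory unchanged if W(C) <= w_max, else a single summarized entry."""
    if word_count(memory) <= w_max:
        return memory
    transcript = " ".join(f"user: {u} bot: {d}" for u, d in memory)
    # Assumption: the summary is stored as one pair with an empty user turn
    return [("", summarize(transcript))]

def first_words(text, n=8):
    """Stand-in summarizer for illustration only: keeps the first n words."""
    return " ".join(text.split()[:n])

memory = [("Tell me about neurons",
           "Neurons receive weighted inputs and fire above a threshold")] * 3
compressed = maybe_compress(memory, w_max=20, summarize=first_words)
print(word_count(memory), "->", word_count(compressed))
```

In practice `summarize` would be replaced by a call to a summarization model. The control flow around it stays the same.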

Integration into the Language Model

The language model uses the compressed context to generate relevant responses. The prompt \(P\) passed to the model is built as follows:

\[ P = f(\mathcal{C}_{\text{compressed}}, \text{context}) \]

where \(\text{context}\) is additional context retrieved from a RAG pipeline, and \(f\) is a concatenation function that prepares the text for the model.
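
One possible shape for the concatenation function \(f\) is sketched below. The post does not specify the prompt layout, so the section labels, the function name `build_prompt`, and the sample strings are all assumptions made for illustration.

```python
def build_prompt(compressed_memory, context):
    """Concatenate the compressed memory and the retrieved context into one prompt.

    The section labels and layout are illustrative assumptions.
    """
    parts = [
        "Conversation summary:",
        "\n".join(compressed_memory),
        "Retrieved context:",
        context,
    ]
    return "\n".join(parts)

prompt = build_prompt(
    compressed_memory=["The user asked what a TLU is; the bot explained the step function."],
    context="A TLU fires when the weighted sum of its inputs reaches the threshold.",
)
print(prompt)
```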

This approach keeps the chatbot's conversational context up to date, allowing more natural and engaging interactions with the user.

Code Implementation of Chatbot Memory Management

In this section, we examine a Python example that illustrates memory management in a chatbot. The code uses PyTorch and Hugging Face transformers to manage and compress the conversation history.

Setting Up the Environment

We start by checking whether a GPU is available, which speeds up processing when needed.

import torch
from transformers import pipeline
import logging

if torch.cuda.is_available():
    device: int = 0
else:
    device: int = -1

MAX_MEMORY_SIZE: int = 2000

Defining the ChatbotMemory Class

The ChatbotMemory class manages the conversation history and handles updates and compression. Each time update_memory is called, the words in memory are counted and the history is compressed or trimmed if needed.

class ChatbotMemory:
    def __init__(self, conv: list = None):
        # Avoid a mutable default argument: each instance gets its own list
        self.conversation_history = conv if conv is not None else []

    def update_memory(self, user_input: str, bot_response: str) -> None:
        self.conversation_history.append(f"'user': {user_input}, 'bot': {bot_response}")

        # Compress once the history exceeds 1000 words
        if memory_counter(self.conversation_history) > 1000:
            self.conversation_history = compressed_memory(self.conversation_history)
            logging.info("Memory compressed.")

        # Drop the oldest entry once the history exceeds MAX_MEMORY_SIZE entries
        if len(self.conversation_history) > MAX_MEMORY_SIZE:
            self.conversation_history.pop(0)
            logging.info("Memory trimmed.")

    def get_memory(self):
        return self.conversation_history

Compressing and Counting the Memory

The _get_compressed_memory function uses the BART model to summarize the conversation history.

def _get_compressed_memory(sentence: str) -> str:
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn", device=device)
    summary = summarizer(sentence, max_length=50, min_length=5, do_sample=False)
    return summary[0]['summary_text']

The compressed_memory function applies _get_compressed_memory to each segment of the conversation history. To streamline the procedure, entries are processed in batches of five. This function is kept separate from _get_compressed_memory so that new compression methods can be introduced.

def compressed_memory(conv_hist: list) -> list:
    return [_get_compressed_memory(' '.join(conv_hist[i:i+5])) for i in range(0, len(conv_hist), 5)]
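
The separation between chunking and compression can be made explicit by passing the compression function as a parameter. The sketch below is a hypothetical variant and not the original code. The names `compress_in_chunks` and `keep_first_words` are assumptions, and `keep_first_words` is a trivial stand-in for BART so the example runs without downloading a model.

```python
from typing import Callable, List

def compress_in_chunks(
    conv_hist: List[str],
    compress_fn: Callable[[str], str],
    chunk_size: int = 5,
) -> List[str]:
    """Join every chunk_size consecutive entries and compress each joined chunk."""
    return [
        compress_fn(" ".join(conv_hist[i:i + chunk_size]))
        for i in range(0, len(conv_hist), chunk_size)
    ]

def keep_first_words(text: str, n: int = 6) -> str:
    """Stand-in compressor for illustration only (not BART): keeps the first n words."""
    return " ".join(text.split()[:n])

history = [f"'user': question {i}, 'bot': answer {i}" for i in range(12)]
compressed = compress_in_chunks(history, keep_first_words)
print(len(history), "->", len(compressed))
```

Twelve entries are grouped into chunks of 5, 5, and 2, so the history shrinks to three compressed entries. Swapping in a different compression method only requires passing another function.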

The memory_counter function counts the total number of words in the history. (It could be worthwhile to count tokens rather than words at this step.)

def memory_counter(conv_hist: list) -> int:
    # Join with a space so words at entry boundaries are not merged
    st = ' '.join(conv_hist)
    return len(st.split())
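
To count tokens rather than words, one option is to make the tokenizer a parameter. The sketch below is an assumption about how that could look. `memory_counter_tokens` is a hypothetical name, and the character-level tokenizer is used only to show the plug-in point. The `tokenize` method of a Hugging Face tokenizer could be passed in the same way.

```python
from typing import Callable, List

def memory_counter_tokens(
    conv_hist: List[str],
    tokenize: Callable[[str], list] = str.split,
) -> int:
    """Count units produced by `tokenize` over the whole history.

    With the default (str.split) this counts words; passing a real
    tokenizer function would count tokens instead.
    """
    return sum(len(tokenize(entry)) for entry in conv_hist)

history = ["'user': hello, 'bot': hi there", "'user': bye, 'bot': goodbye"]
words = memory_counter_tokens(history)
# Crude character-level tokenizer, used here only to show the plug-in point
chars = memory_counter_tokens(history, tokenize=list)
print(words, chars)
```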

Conclusion

This code provides a framework for managing memory in a chatbot, using compression techniques to maintain a relevant context and improve overall system performance. Summarization models such as BART help preserve the essential context when the memory is compressed.

Code on GitHub